Displaying a parameterized string in SQL Server

One of the things that I do frequently is to put together a message built from a template filled in with variables. For example, I want my error messages to fit a certain pattern, or I want to write a meaningful log message. In SQL Server, I repeat the following ugliness in my code:

SET @message = N'This is my message with ' 
  + Cast(@NumVariable as nvarchar(100))
  + N' as a number. Additional information:  '
  + @StringVariable;

Look at how I could express this using the string.Format method in .NET:

message = string.Format("This is my message with {0} as a number.  Additional information:  {1}",
   numVariable, stringVariable);

The code is much cleaner. I can see right away the template part and the variables. I don’t have to worry about casting numbers to a string data type.

I wish I could do that in SQL Server.

But you can! Transact-SQL has long had a FORMATMESSAGE function, and starting with SQL Server 2016 it accepts an ad hoc message string instead of just a message number from sys.messages. The function takes two groups of parameters. The first is the template text. The second is the list of values that fill the parameter slots in the template. I can replace my code above with the cleaner:

SET @message = FORMATMESSAGE(N'This is my message with %i as a number.  Additional information:  %s'
  , @NumVariable,  @StringVariable);

This is much easier to read. It handles type casting very naturally. It has simplified my TSQL code considerably.
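
Here is a minimal, self-contained sketch of the pattern (my own example; the ad hoc message string form of FORMATMESSAGE requires SQL Server 2016 or later):

DECLARE @NumVariable int = 42;
DECLARE @StringVariable nvarchar(200) = N'connection retried three times';
DECLARE @message nvarchar(2048);

SET @message = FORMATMESSAGE(
    N'This is my message with %i as a number.  Additional information:  %s',
    @NumVariable, @StringVariable);

PRINT @message;
-- This is my message with 42 as a number.  Additional information:  connection retried three times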

Posted in TSQL

Solution appropriate to user

I am working on a small project for a small non-profit company. It has been an interesting project that could involve a variety of interesting technologies. But if it is going to be useful, that company needs to be able to support the solution going forward. What are the factors that might make the project work for this client?

The first factor (not the most important) is technology. They want to be able to query their NeonCRM system. NeonCRM is a donor system that provides only rudimentary querying capability. But it also provides a REST API that returns JSON. I found that using PowerQuery I can populate a PowerPivot model. Once the data is in the model, they can write sophisticated queries against it.

The problem is that most of the computers in their shop are Macs. Neither PowerQuery nor PowerPivot can be installed on the Mac. They would need to have dedicated Windows PCs to allow them to use the solution. They will need to become comfortable with what is to them a foreign O/S.

Power BI provides the same functionality, and Macs can consume the data from a published Power BI model. But developing reports against that model requires installing the Power BI Desktop application, which again would require a PC.

The second factor, the more important one, is the skill needed to work with the technology. I live in another country from this non-profit. At some point I need to hand the solution over to them to manage and expand. Creating most reports from a PowerPivot model is not much more difficult than creating Excel pivot tables. This is well within the competence of the staff of this non-profit.

But what happens when a more complex need comes up? One of the early questions we had was the date of the last donation in each year. This is do-able, but it required some DAX expressions, which are clearly outside the expertise of the staff.

And if they need to customize the PowerQuery code, those changes challenge even me. There is no way to hand that off to the non-profit.

By the way, I don’t want to suggest that they do not have the intelligence to do this. I think that they could learn these things. But they need to be putting their time and energy and thinking into running their non-profit, something that I would be incompetent at.

I am now looking at a solution closer to their technology and skills. It won’t be as sexy as the previous solution, but I expect them to be happier with it.

Posted in Consulting, Solutions

SSIS Script Component – Last Row

One problem that you might encounter is data where you are gathering information from multiple rows and want to output only one row. If you simply want to aggregate the results, you can use the Aggregate transformation. But what do you do if you want to append one row’s value to another’s?
For example, what if I want to take the following and output one row per state:

State City Population
Texas Houston 2239558
Texas San Antonio 1436697
Texas Dallas 1281057
Oklahoma Oklahoma City 610613
Oklahoma Tulsa 398121
Oklahoma Norman 118197

Let’s say that my output is simply:

State Cities
Texas Houston=2239558;San Antonio=1436697;Dallas=1281057
Oklahoma Oklahoma City=610613;Tulsa=398121;Norman=118197

The way that I handle this is to have one variable to store the key (in this case the state) that I am grouping by and another that I use to gather the text. In SSIS I usually override the ProcessInputRow method for the input buffer.

public override void Input0_ProcessInputRow(Input0Buffer Row)

Basically the way that this works is that I have a variable that identifies the key for the output. In this case, I need to store a string: state. Then I have a second variable to store the text.
The program is structured as follows:
If Row.key != key, then output the data that was in the variables. Set the variables equal to the values from the Row.
If Row.key == key then append the text to the text variable.
This is the code:

string state = string.Empty;   // the key (state) of the group currently being built
bool firstRow = true;
string cities = string.Empty;  // the accumulated text for that group

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    if (Row.State == state)
    {
        // Same state as the previous row: append this city to the text.
        cities += string.Format(";{0}={1}", Row.City, Row.Population);
    }
    else
    {
        // New state: write out the group we were building (unless this is the very first row).
        if (!firstRow)
        {
            CitiesBuffer.AddRow();
            CitiesBuffer.State = state;
            CitiesBuffer.Cities = cities;
        }
        state = Row.State;
        cities = string.Format("{0}={1}", Row.City, Row.Population);
    }
    firstRow = false;
}

But here’s the problem. When I run this, it outputs the data for every state except the last one. The problem is that I need to output the final group once I have read the last row, but the Row doesn’t tell you when you have hit the last row.
I tried using the PostExecute method, but it appears that the buffers are all closed down when the PostExecute fires.
I found that I needed to override the ProcessInput method associated with my input. In ProcessInput my code walks through all of the rows in the input itself. Buffer.NextRow() advances to the next row and returns false when there are no more rows.


public override void Input0_ProcessInput(Input0Buffer Buffer)
{
    // Do not call base.Input0_ProcessInput(Buffer); we walk the rows ourselves.
    // Note: this assumes the whole input arrives in a single pipeline buffer. For larger
    // inputs, move these variables to class level and only flush when Buffer.EndOfRowset()
    // returns true, since ProcessInput is called once per buffer.
    string state = string.Empty;
    bool firstRow = true;
    string cities = string.Empty;

    while (Buffer.NextRow())
    {
        if (Buffer.State == state)
        {
            cities += string.Format(";{0}={1}", Buffer.City, Buffer.Population);
        }
        else
        {
            if (!firstRow)
            {
                CitiesBuffer.AddRow();
                CitiesBuffer.State = state;
                CitiesBuffer.Cities = cities;
            }
            state = Buffer.State;
            cities = string.Format("{0}={1}", Buffer.City, Buffer.Population);
        }
        firstRow = false;
    }

    // No more rows: write out the last group that was being built.
    if (!firstRow)
    {
        CitiesBuffer.AddRow();
        CitiesBuffer.State = state;
        CitiesBuffer.Cities = cities;
    }
}

As soon as the routine runs out of rows from the input, I can then output what is in the buffer that I had been building.

Posted in ETL, SSIS Script

SSIS Pivot Transformation with Multiple Set Keys

When you pivot data, there are three dimensions that you need to set: the Set Keys, the Pivot Keys and the Pivot Values (to use the SSIS terms). In Excel it is very easy to select multiple Set Keys (ROWS), multiple Pivot Keys (COLUMNS) and even multiple Pivot Values (Values). Using the SSIS Pivot Transformation editor, however, one can only choose one Set Key, one Pivot Key and one Pivot Value. Fortunately, you can have multiple Set Key columns by using the Advanced Editor.

The Pivot key in the SSIS Pivot Transformation is the column that contains the values that are going to define the new columns. The Pivot Value is the value that is going to be inserted into these new columns. Basically, you map a value from the Pivot Key column to a new output column. And SSIS will put the value from the Pivot Value column into those output columns. The Set Key identifies the column combination that defines the rows.

Why do we need multiple Set Keys? In many situations, what defines a unique row is not a single column. For example, one might be loading data from a source keyed by state and county, such as the following data set:

SUMLEV STNAME CTYNAME Year Population
50 Alabama Washington County 2010 17610
50 Arkansas Washington County 2010 204026
50 Colorado Washington County 2010 4801
50 Alabama Washington County 2011 17336
50 Arkansas Washington County 2011 207882
50 Colorado Washington County 2011 4809

If I were to use the county name as the Set Key, then I would have duplicates that could cause the Pivot transformation to fail. For this small set, you need the STNAME and the CTYNAME to distinguish the rows.

The District of Columbia introduces a different twist. District of Columbia is the name of the county and the name of the state. The SUMLEV distinguishes this usage for each row (50 means County level, 40 means State level).

SUMLEV STNAME CTYNAME Year Population
50 District of Columbia District of Columbia 2010 605126
40 District of Columbia District of Columbia 2011 620472
50 District of Columbia District of Columbia 2011 620472
40 District of Columbia District of Columbia 2010 605126

Never fear, it is possible to have multiple columns in your Set key. You just have to use the Advanced editor. Let’s start.

First create the package (I have a finished example at the end). Add the flat file source to get the data. Then add a Sort transformation to sort by SUMLEV, STNAME, and CTYNAME. Finally, add the Pivot Transformation and configure it like the following:
[Screenshot: the Pivot transformation editor after generating the output columns]

You will need to identify the Pivot Key values that you need to pivot on and paste them into the "Generate pivot output columns from values" textbox. Then click the "Generate Columns Now" button. This will generate the column names in the "Existing pivoted output columns" textbox (you can rename these columns in the advanced editor).

The package will run at this point. However, it doesn’t show whether the row is about the STATE or about a COUNTY in the STATE (SUMLEV). Also, it doesn’t identify the STATE that each COUNTY is associated with. I want to see the SUMLEV and STNAME for each row and I want to guarantee that SSIS groups by SUMLEV, STNAME and CTYNAME, not just CTYNAME.
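
To make the goal concrete, the output I am after looks roughly like this (my own illustration built from the sample rows above, with the generated year columns renamed):

SUMLEV  STNAME    CTYNAME            2010     2011
50      Alabama   Washington County  17610    17336
50      Arkansas  Washington County  204026   207882
50      Colorado  Washington County  4801     4809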

To add those columns, right-click the Pivot transformation and select Advanced Editor:

[Screenshot: the Advanced Editor]
Then go to the Input Columns tab.
[Screenshot: the Advanced Editor, Input Columns tab, with the additional columns selected]

Select all of the columns that you want to output, including those you don’t want to include in the Set Key. In this example, I have only selected the Set Key columns.
Next, go to the Input and Output Properties tab. You will need to do a couple things here.

[Screenshot: setting the PivotUsage property on an input column]

Note the LineageID property. You will need that property value when you create a Pivot Default Output column corresponding to this column. Next, change PivotUsage from 0 to 1 (leave this value 0 if you want to simply pass the column through, without including it in the Set key).

Next, go to the Pivot Default Output and add a new Output Column. Give it a name (it can be the same name as the input column). Set its SourceColumn property to the LineageID value from the corresponding input column. This will also set the DataType property for the output.

[Screenshot: adding the SUMLEV output column]

If you simply want to pass a column through, leave its PivotUsage equal to 0 in the Input Columns. Then create the corresponding column in the Output Columns and set its SourceColumn equal to the LineageID of the input column.

SSIS allows you to do a lot more than shows on the surface. The Advanced editor provides some interesting modifications.

An example of using the Pivot transformation can be downloaded from here: Pivot transformation Example.

Posted in ETL, Pivot, SSIS

SSIS Flexible UnPivot

One of the problems I have encountered is trying to unpivot a file that has a variable number of columns. There is a set of fixed columns that represent the row keys, and then a variable number of additional columns. This is especially common when the columns are based on dates. This sample is based on a question that was raised on the MSDN forums (https://social.technet.microsoft.com/Forums/sqlserver/en-US/cdd2cbd6-bed1-482a-be1c-f4cf434ed1ba/how-to-create-a-ssis-package-dynamically?forum=sqlintegrationservices&prof=required).

The UnPivot transformation, like most SSIS transformations, is rigid about the expected metadata. It expects the columns to have the same types, the same count, and the same names from run to run. A file like the following causes havoc with that assumption:

Product,2015Jan,2015Feb,2015Mar,2016Jul,2016Aug
0000001 - Product 0000001,52.31,48.97,47.94,48.47,49.52

I want to output this as:

Product,Year,Month,Margin
0000001 - Product 0000001,2015,Jan,52.31
0000001 - Product 0000001,2015,Feb,48.97

The file might not start with the 2015Jan column; it might suddenly start with 2016Jan. And columns are likely to be added at the end.

The Script Component provides the flexibility to handle this. First, create a File Connection Manager that points to the file you are reading. Next, create a Data Flow and add a Script Component, setting its type to Source. Then configure the Script Component. You add the Connection Manager to the script:

[Screenshot: the SSIS Script Component Connection Manager]

You add the output columns:
[Screenshot: the SSIS Script Component output columns]

Then you can edit the script:

  // Requires "using System.IO;" at the top of the script for File and StreamReader.
  int rowNum = 0;

  public override void CreateNewOutputRows()
  {
    // Open the file for reading (SourceFile is the connection manager added to the component).
    using (StreamReader rdr = File.OpenText(this.Connections.SourceFile.ConnectionString))
    {
      string[] months = new string[1]; // Placeholders for the months and years parsed from the header
      int[] years = new int[1];

      while (rdr.EndOfStream == false)
      {
        string line = rdr.ReadLine();
        string[] cols = line.Split(',');
        if (cols.Length > 1)
        {
          rowNum++; // Count the rows
          if (rowNum > 1) // Not the header row: emit one output row per value column
          {
            for (int i = 1; i < cols.Length; i++)
            {
              Output0Buffer.AddRow();
              Output0Buffer.Product = cols[0];
              Output0Buffer.Year = years[i - 1];
              Output0Buffer.Month = months[i - 1];
              Output0Buffer.Margin = decimal.Parse(cols[i]);
            }
          }
          else // The header row: remember the year and month each value column represents
          {
            months = new string[cols.Length - 1];
            years = new int[cols.Length - 1];

            for (int i = 1; i < cols.Length; i++)
            {
              months[i - 1] = cols[i].Substring(4);              // "2015Jan" -> "Jan"
              years[i - 1] = int.Parse(cols[i].Substring(0, 4)); // "2015Jan" -> 2015
            }
          }
        }
      }
    }
  }

The script works like this. I track the row number because the first row (the header) is treated differently from the rest. From the header I gather the identifiers that I am going to use; that is the else clause in the rowNum test, which simply puts the values into the months and years arrays. For the other rows, I take the product from the first column (cols[0]) and then create one output row for each column after it.

I have not included error handling. You can download the source at FlexibleUnpivot.

Posted in ETL, Pivot, SSIS, SSIS Script

SSIS Removing Duplicate Rows Using Conditions

Sometimes your data source has duplicates. SSIS has an out-of-the-box tool to remove them: the Sort transformation has a "Remove rows with duplicate sort values" property (https://www.mssqltips.com/sqlservertip/3036/removing-duplicates-rows-with-ssis-sort-transformation/). But it has a problem: it is essentially random about which of the duplicate rows it removes. How do you specify which row to keep?

One scenario you might encounter is bringing in data from multiple sources: if a row occurs in the first source, use it; otherwise use the row from the second source. I have built a simple SSIS package that you can download from Sort Remove Duplicates Package. It has the following data flow:

[Screenshot: the SSIS delete-duplicates data flow]

The data sources are both Script components.

    public override void CreateNewOutputRows()
    {
        /*
          Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer".
          For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput".
        */
        // Primary data set: seven rows, IDs 1 through 7, all marked Set = 1 with Val = 100.
        int i = 0;
        int s = 1;
        for (int row = 0; row < 7; row++)
        {
            Output0Buffer.AddRow();
            Output0Buffer.Set = s;
            Output0Buffer.ID = ++i;
            Output0Buffer.Val = 100;
        }
    }

The script for the Secondary Data Set is roughly the same. I set the s variable to 2 so that, after the sort, a secondary row always follows the corresponding Primary Data Set row when both exist. I start the variable i at 4 so that the first IDs (1 – 4) have only set 1 rows, the next three IDs have rows from both sets, and the last IDs are exclusively set 2. I also assign a different value to the set 2 output, as sketched below.
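
Based on that description, the Secondary Data Set script would look roughly like this (my reconstruction; the actual script is in the downloadable package, and the Val of 200 is just an assumption so the two sets are distinguishable):

    public override void CreateNewOutputRows()
    {
        // Secondary data set: rows marked Set = 2, starting at ID 5 so IDs 5-7 duplicate
        // the primary set and the remaining IDs exist only here.
        int i = 4;
        int s = 2;
        for (int row = 0; row < 7; row++)   // seven rows here, purely for illustration
        {
            Output0Buffer.AddRow();
            Output0Buffer.Set = s;
            Output0Buffer.ID = ++i;
            Output0Buffer.Val = 200;        // assumed value; anything different from the primary set's 100 works
        }
    }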

The next transformation is a Sort. The sort key is ID, then Set. I want one row for each ID, and within an ID the Set value determines which row comes first and is therefore the one kept.

Finally, I have a Script component to only output the first row encountered for each ID. I have to configure a few elements. The first thing that I do is to select the ID and Set columns as Input columns. That is so that I can refer to these columns in my Script component.

I also have to provide some way to exclude and include rows. To do that I use Exclusion Groups.

[Screenshot: the SSIS Script Component Exclusion Group setting]

Set the ExclusionGroup to some number other than 0. This allows the script to direct a particular row to that exclusion group's output. Also note the SynchronousInputID; it should point to the component's only input.

The script is rather simple. You override the Input0_ProcessInputRow method:

    int lastID = -1;

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        if (Row.ID != lastID)
        {
            Row.DirectRowToOutput0();
            lastID = Row.ID;
        }
    }

Note that I have added a class level variable lastID. Every row is going to execute this code. If the ID column value (Row.ID) is not the same as lastID, then simply DirectRowToOutput0(). Then, don’t forget to set lastID equal to Row.ID for the next pass. What happens is that if a new ID is encountered, then the row will be sent on. Otherwise the Row disappears.

Please let me know if there is a simpler way to do this. Also, let me know how I can improve this script.

Posted in ETL, SSIS, SSIS Script

Reordering Columns Using PowerShell

One of the regular issues discussed on the SSIS forum is what to do when a source file changes the order of its columns. You build your SSIS package to handle Col1, Col2, Col3 and then you receive files with the columns in the order Col2, Col3, Col1. Or worse, you receive only Col2, Col1. How do you handle this case? It usually involves some nasty .NET code.

But why not preprocess the file using PowerShell? In previous posts (listed at the end), I showed that one can use PowerShell to preprocess a comma delimited file: sorting the result, removing columns and filtering rows. The same technique can reorder the columns, and even add a column.

Here is my base file.

ID,Name,State,Country
1,Russ,Sandys,Bermuda
2,Katie,Texas,USA
3,Gail,Sandys,Bermuda
4,Melanie,Oregon,USA
5,Stephanie,Arizona,USA

I expect to find columns in the following order: ID, Name, State, Country.

But sometimes the process that produces the file provides you with one of the following formats:

Country,ID,Name,State
Bermuda,1,Russ,Sandys
USA,2,Katie,Texas
Bermuda,3,Gail,Sandys
USA,4,Melanie,Oregon
USA,5,Stephanie,Arizona

Or worse:

ID,Name,State
1,Russ,Sandys
2,Katie,Texas
3,Gail,Sandys
4,Melanie,Oregon
5,Stephanie,Arizona

Either of these will break the import.

Using PowerShell, I can very easily reformat the data to the expected format:

cd C:\MyFolder\
Import-Csv -Path SampleData.csv `
  | SELECT  ID, Name, State, Country `
  | ConvertTo-CSV -NoTypeInformation `
  | % {$_ -replace  `
  '\G(?<start>^|,)(("(?<output>[^,"]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
  ,'${start}${output}'} `
  | Out-File SampleDataSorted.csv -fo -en ascii ; 

In Removing quotes from CSV created by PowerShell I explain why I use the regular expression replace.

The SELECT line of the PowerShell script is what determines the order of the columns. Interestingly, if a column does not exist in the input, it simply adds the column with an empty value.

This very simple script will always put the columns in the expected order, producing an empty column where the input does not have a value. By the way, while this may prevent the ETL process from failing, it can leave you with invalid data.

Let me know if you have questions about this process. Can you see any use for this in your environment?

Previous posts:

Using PowerShell to shape comma delimited file
Shaping a comma delimited file with PowerShell
Removing quotes from CSV created by PowerShell

Posted in ETL, PowerShell, SSIS

Removing quotes from CSV created by PowerShell

In a previous post, I demonstrated how to reshape a comma delimited file (CSV) using PowerShell. I noted that the result puts all of the column values in double quotes:

"ID","Name","State"
"5","Stephanie","Arizona"
"4","Melanie","Oregon"
"2","Katie","Texas"

I used this code to produce this result:

cd c:\MyFolder
Import-Csv -Path SampleData.csv|Where {$_.Country -eq "USA"} `
  | SELECT ID, Name, State `
  |Sort-Object State `
  |Export-Csv SampleDataSorted.csv `
  -NoTypeInformation; 

In this post, I remove the double quotes.

A Microsoft blog suggests one way to remove these double quotes: use the ConvertTo-CSV cmdlet rather than Export-CSV and do a string replace of the double quotes with an empty string (code from the article):

dir c:\fso -Filter *.csv | ? {$_.basename -like 'users?'} `
 | Import-Csv `
 |  sort lname,fname `
 | convertto-csv -NoTypeInformation `
 | % { $_ -replace '"', ""} `
 | out-file c:\fso\usersconsolidated.csv -fo -en ascii

I convert my command to:

cd c:\MyFolder
Import-Csv -Path SampleData.csv `
  | Where {$_.Country -eq "USA"} `
  | SELECT ID, Name, State `
  | Sort-Object State `
  | ConvertTo-CSV -NoTypeInformation `
  | % {$_ -replace '"',""} `
  | Out-File SampleDataSorted.csv -fo -en ascii ; 

Breaking that pipeline down:

  Import-Csv -Path SampleData.csv               Read the source CSV
  Where {$_.Country -eq "USA"}                  Filter the rows
  SELECT ID, Name, State                        Set the output columns
  ConvertTo-CSV -NoTypeInformation              Create the CSV output
  % {$_ -replace '"',""}                        For each row, replace double quotes with an empty string
  Out-File SampleDataSorted.csv -fo -en ascii   Write the output to a file

This produces the results I expect.

ID,Name,State
5,Stephanie,Arizona
4,Melanie,Oregon
2,Katie,Texas

But here is the problem. ETL scenarios can be messy, and CSV files frequently need the double quotes: when a column value contains a comma, you need to put quotes around it. For example, a file like this is problematic:

ID,Name,State,Country
1,Russ,Sandys,Bermuda
3,Gail,Sandys,Bermuda
4,Melanie,Oregon,USA
5,Stephanie,Arizona,USA
6,"Katie, ""Scott"", Arianna",Texas,USA

I want to keep the quotes around the second column in that last row.

Regular expressions to the rescue. The -replace operator does regular expression replacement without any additional work. That isn’t to say that building the regular expression isn’t painful.

cd c:\MyFolder
Import-Csv -Path SampleData.csv `
  | Where {$_.Country -eq "USA"} `
  | SELECT ID, Name, State `
  | Sort-Object State `
  | ConvertTo-CSV -NoTypeInformation `
  | % {$_ -replace  `
  '\G(?<start>^|,)(("(?<output>[^,"]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
  ,'${start}${output}'} `
  | Out-File SampleDataSorted.csv -fo -en ascii ; 

Let’s break down the expression. The first argument (which I will explain in a moment) is the expression to be matched. The second ('${start}${output}') is what each match is replaced with. I use named captures (start, output) to identify what I want the match to be replaced with. One way to view named captures is as variables: if the pattern is matched, the matched text is assigned to that variable. Note that the single quotes around the replacement expression are important (see http://www.johndcook.com/blog/powershell_perl_regex/#capture). So every match that is found will be replaced with the capture called "start" followed by the capture called "output."
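
For a quick feel for named captures, here is a toy example of my own (not part of the original expression):

'abc123' -replace '(?<digits>\d+)', '[${digits}]'
# Output: abc[123]  (the matched digits are assigned to the "digits" capture and reused in the replacement)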

The first expression is interpreted as follows.

\G
Must occur where the previous match ended. That means the expression won’t match unless there is an uninterrupted sequence of matches, one right after the other.

(?<start>^|,)
This defines the "start" capture, which is either the beginning of the string (^) or a comma.

((…)|(…))
Try to match first expression and if it fails, try the second. The pipe | between the two expressions in parentheses means to try the first pattern and as soon as it fails, try the second pattern.

("(?<output>[^,"]*?)"(?=,|$))
This is the first alternative, piece by piece:

"…"
The value starts and ends with a double quote.

(?=,|$)
After the second double quote, a comma must be present or it must be the end of the string. The ?= construction is a lookahead, so it is not part of the match. That is important since I need that comma available for the next match.

(?<output>…)
This is what will be put in the "output" capture. The way I understand this is that there is a variable called "output": if this pattern matches, the "output" variable is assigned the text that is found. The quotes will not appear in the replacement because they are outside of the "output" parentheses.

[^,"]*?
A sequence of zero or more characters that are neither a comma nor a double quote. The * means zero or more; the ? after the * keeps the match from running to the end of the string. This is what is assigned to the "output" capture (variable) if the pattern is matched.

(?<output>".*?(?<!")("")*?"(?=,|$))
This is the pattern that is matched if the first alternative fails. Note that the whole pattern, including the surrounding double quotes, is inside the "output" capture and thus will be kept by the replace statement. Piece by piece:

.*?
Match any characters, lazily. It stops when it finds a double quote followed by a comma or the end of the string.

("")*?
This handles the case where there are two double quotes before a comma. I want to leave those alone: if there is an even number of double quotes before the comma, continue looking for the end of the column.

(?<!")
A double quote cannot precede the next part of the pattern. What I am trying to avoid is interpreting exactly two (or any even number of) double quotes preceding a comma as the end of the match. If there are two quotes, treat the comma as part of the current column; if there is one quote, treat the comma as the break between two columns.

""
This represents a sequence of two double quotes. Basically, if there is an even number of double quotes before a comma, treat that comma as part of the current column and continue looking for the next double quote followed by a comma to end the column.
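
As a quick sanity check (my own test, not from the original post), you can run the replace against a single tricky row in a PowerShell console:

$pattern = '\G(?<start>^|,)(("(?<output>[^,"]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))'
'"6","Katie, ""Scott"", Arianna","Texas","USA"' -replace $pattern, '${start}${output}'
# Output: 6,"Katie, ""Scott"", Arianna",Texas,USA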

Let me know if I can improve my regular expression. And definitely post any questions.

Posted in ETL, PowerShell

Shaping a comma delimited file with PowerShell

Here is the use case I am thinking of. You have a comma delimited file that you need to load into SQL Server, a very large file that you need to join to an equally large table in a database. For example, at one place I worked we had a set of account balances that we needed to join to a customer table with 15 million rows. We used a Lookup, but it took about 20 minutes to load the customer table into cache. My thought at the time was whether we could use two data sources, both sorted on the customer id, and join them. Could that have sped up the process?

There would be a problem if you are using a comma delimited file: sorting the file would have to be done inside the SSIS process. But what if the comma delimited file had already been sorted before it reached the data source? You could simply set the IsSorted flag on the source output and use the Merge Join.

But to avoid putting that load on the SSIS server, why not off-load the sort to a less busy server? Sort the file, then transfer it to the SSIS server (or read it from a shared drive). But if that server is only going to sort a file, why install SSIS on it? Why not use a lighter tool?

Import-Csv -LiteralPath SampleData.csv|Sort-Object Country|Export-Csv SampleDataSorted.csv

Posted in ETL, PowerShell

Using PowerShell to shape comma delimited file

PowerShell has the power to handle many ETL tasks (see PowerShell as an ETL tool). In this post I illustrate how you can use PowerShell to shape a comma delimited file: how to remove columns, filter rows and sort the output.

Let’s start with a simple file (SampleData.csv):

ID,Name,State,Country
1,Russ,Sandys,Bermuda
2,Katie,Texas,USA
3,Gail,Sandys,Bermuda
4,Melanie,Oregon,USA
5,Stephanie,Arizona,USA

I want to reduce my set to only people in the USA. I want to remove the Country column and I want to sort by State. This is my intended output (SampleDataSorted.csv).

ID,Name,State
5,Stephanie,Arizona
4,Melanie,Oregon
2,Katie,Texas

This would be very easy to do using SSIS. I would need a Flat File Source to read my csv file. I would need a Conditional Split transformation to remove the rows where the Country is not USA. And I would require a Sort transformation. Finally, I would use a Flat File Destination to output only the three columns.

I can do this with PowerShell as well (for reasons why, see PowerShell as an ETL tool).

So here is my ETL package:

cd c:\MyFolder
Import-Csv -Path SampleData.csv|Where {$_.Country -eq "USA"} `
  | SELECT ID, Name, State `
  |Sort-Object State `
  |Export-Csv SampleDataSorted.csv ; 

SSIS Equivalent         Task                  PowerShell
Flat File Source        Read the source       Import-Csv -Path SampleData.csv
Conditional Split       Filter the rows       Where {$_.Country -eq "USA"}
Flat File Destination   Reduce the columns    SELECT ID, Name, State
Sort Transformation     Sort                  Sort-Object State
Flat File Destination   Output the CSV        Export-Csv SampleDataSorted.csv

But it has a problem. This is the output.

#TYPE System.Management.Automation.PSCustomObject
"ID","Name","State"
"5","Stephanie","Arizona"
"4","Melanie","Oregon"
"2","Katie","Texas"

It would be a major pain to handle the #TYPE header. How do I get rid of this? And how do I remove the double quotes? I type this to learn more about the Export-CSV command:

help Export-CSV

That gives a little information, but the response suggests that I try this:

Get-help Export-Csv -Online

This brings me to a web page with all kinds of information. I learn that the command has a -NoTypeInformation switch that will remove the #TYPE line.

Unfortunately, you can’t remove the quotes from the Export-CSV output. This is a minor problem; I will examine solutions for it in future posts.

So the final command and its output are:

cd c:\MyFolder
Import-Csv -Path SampleData.csv|Where {$_.Country -eq "USA"} `
  | SELECT ID, Name, State `
  |Sort-Object State `
  |Export-Csv SampleDataSorted.csv `
  -NoTypeInformation; 

"ID","Name","State"
"5","Stephanie","Arizona"
"4","Melanie","Oregon"
"2","Katie","Texas"

Posted in ETL, PowerShell