Yesterday I had a simple enough task at hand. An existing EXTRACT that loads a result set to a DATA file needed to have a corresponding COUNT flat file that contained the number of rows in the extract’s data file. Not being the most savvy at SSIS I did what anyone else would do get to started: I Googled my request.
I found several examples of using a script to capture the row count and write it out to a file. After a couple of hours of many false starts I consulted with my resident SSIS Expert and all around go to guy, Matt Austin, who gave me a couple of ideas to work with.
Today’s post will demonstrate how NOT to do it and will explain why this solution does not work. I don’t know about you but I usually have 2 – 3 wrong ways before I get a good one.
Thankfully, this only applies to SSIS. T-SQL is usually more or less successful on the 1st try, possibly with a slight modification from my first idea.
How Not To Do It #1: Use the SSIS ROW COUNT Transformation
SSIS has a built-in, ready to go, out of the box transformation called ROW COUNT. Basically what it does is set the value of a variable to the number of records passing through it. This was my first approach and it seemed like such a simple task to write the value out to a flat file, until I got to the point of writing the file.
So here’s how I did it:
1. Create an OLE DB query to pull some records out of a database. In this case I’ll use AdventureWorks. I’ll grab about 2 percent of the 19,000+ records in the Sales.Customer table using TABLESAMPLE.
2. Create an INTEGER variable called RowCount.
3. Add a ROWCOUNT transform after the OLE DB. Configure the ROWCOUNT transform to store the results to a variable.
3. Add a DERIVED TABLE transformation and attach it after the ROWCOUNT transformation.
4. Configure the DERIVED TABLE transform to use the RowCount variable by dragging the USER:RowCount variable down to the “Derived Column Name”.
5. Finally, add the flat file destination and configure the flat file destination to wherever you want the file to be created. Configure the mappings to include only the RowCount; exclude the other columns:
6. Run the package. The final product will look like this:
7. Check the output file. Oh know!!!! There are 188 zeros in it!
So what went wrong?
1. We still have 188 rows. To get this to display just one row with the value of the RowCount variable we could add the Row Sampling transform, which is equivalent to the T-SQL TOP command. Set the row sample size to 1.
2. But we still have a problem. The row count is still zero! The problem is that the variable that gets set by the ROWCOUNT transformation does not get set until the last row passes through the data flow. In other words, the user variable is not usable until the data flow exits. Please reference the MSDN page for the ROW COUNT transform for details.
One of those 2 issues alone might seem like a show stopper. But not quite and the learning process is working in all its glory.
Tomorrow we will modify this package to give us the row count to the flat file that we are looking for.
Lastly, we will modify the package to output any SSIS variable to a flat file.
Happy SQL-ing!
Leave a Reply
Using Gravatars in the comments - get your own and be recognized!
XHTML: These are some of the tags you can use:
<a href=""> <b> <blockquote> <code> <em> <i> <strike> <strong>