Practical Use of the OUTPUT Clause

March 13th, 2010 Dave No comments

SQL Server 2005 introduced many new switches and features in T-SQL.  Often times new features are met with “Why would I need to do that?”.  The reality is that the SQL Server development team are not adding features for nothing.

One of my favorites is the OUTPUT clause.  The OUTPUT clause can be used in conjunction with INSERT, UPDATE and DELETE statements.  This post will show how I used the OUTPUT clause in a real world scenario with UPDATE.

Situation:  While working on my data warehouse it was discovered that certain records were not being loaded into the warehouse because one of the lookup tables did not include one or more records that were required for an INNER JOIN.  Since this has been going on over a period of several years it was considered important to reload those records.

Fortunately the source for the incoming data has been stored in a “staging” table so we have all of the original incoming data.  One of the columns in the staging table is the load_date, which stores the date that the data was extracted from the source system.

Only a small number of records are affected each time the warehouse is loaded, but they are scattered over a period of several years of data.  The load routine utilizes the load_date column to identify the records that need to be inserted or updated on the warehouse.

In order to utilize the existing load routine, i.e.., not have to craft a custom loader to identify the missing records, we would like to set the load_date of all of the missing records to the current date and run the load routine for the existing date.

Using the OUTPUT clause we will be able to update the missing stage records with a common load_date value and at the same time save the original load dates for the records in a separate table.  We can then run the ETL to load the missing records.  After the load successfully completes, we will restore the stage table to the original load dates.

Implementation Details: The first step is to create an output table where we will store 3 things: the primary key of the table, the original value of the column that is going to be updated, and the newly updated value of the updated column (optional but can be handy).

We want to save the primary key so we can conveniently do the update later after we’ve verified the results of the ETL load.

CREATE TABLE dbo.tmp_stg_output (stg_pkey_id         int,
                                 load_date_before    datetime,
                                 load_date_after     datetime );

CREATE UNIQUE CLUSTERED INDEX idx_uc_output_clause
    ON   dbo.tmp_stg_output (stg_pkey_id);

 

Next we’ll update the stage table with our desired load_date.  We’ll also include the OUTPUT clause to save the data to our temp table.

 

UPDATE stg
SET    load_date = '03/12/2010'
     OUTPUT inserted.stg_pkey_id,
            deleted.load_date,
            inserted.load_date
     INTO tmp_stg_output
FROM  STG_table stg
WHERE stg.section_code = 2213
AND  stg.district_code = 197l;

 

This is just your run of the mill UPDATE statement except we have the OUTPUT clause that indicates the 3 columns to save to the tmp table.  If you’re not familiar with the special tables INSERTED and DELETED check out BOL, Using the inserted and deleted TablesMy first encounter with these special tables were when I was creating triggers. 

For the purpose of the use of the OUTPUT clause all we need to understand is that the INSERTED table contains the data that is changing.  So in this example, the INSERTED.load_date will save the new load_date value.  Also note, the INSERTED.stg_pkey_id could have been DELETED.stg_pkey_id.  The primary key of the table doesn’t change so the INSERTED and DELETED tables will contain the same value for this column.

The DELETED.load_date will contain what was originally in the table before the update.

At this point we can run the load ETL to load our missing records.  The details of the ETL are not relevant to this post…

Once we’ve validated we have a good load of our missing records we can restore our stage table to it’s original state by using our handy little temp table.

-- Run the load package using the load_date of 20100312

-- Verify the results are correct

-- Set the stg table back to it's original state
UPDATE stg
SET    load_date = t.load_date_before
FROM  dbo.STG_table stg
       INNER JOIN dbo.tmp_stg_output t
         ON   t.stg_pkey_id = stg.stg_pkey_id;

Conclusion:  Using the OUTPUT clause in conjunction with the UPDATE statement we were able to capture the state of our  staging table, update one of it’s columns to allow us to easily perform an ETL load.  Once the ETL finishes and is verified we restore the staging table to it’s original state using the temp table we created and populated with the OUTPUT clause.

Similar capabilities exist for INSERT and DELETE statements.

In our example we used an actual table that was created with the CREATE TABLE statement.  We also could have just as easily used a temp table (#tmp_stg_output, ##tmp_stg_output), or a table variable (@tmp_stg_output).  Obviously the table variable would not have given us the luxury of validating the data because the table is no longer available after the batch completes.

The risk of using #temp tables is that they are dropped when the session ends.  If your session disconnects as a result of a network glitch the data in your temp output table would be lost.

Stay curious – keep learning…

Dave

SQL Saturday #33, Charlotte, NC from the Rear View Mirror

March 8th, 2010 Dave No comments

I just wrapped up my first ever attendance at a SQL Saturday.  I was not disappointed.  But not only was I an attendee, but I was also a presenter (speaker) too.

The event was held at the Microsoft campus in Charlotte.  The facility is first class and the principal organizers, Peter Shire and the management and staff at SQL Sentry, could not have found a better venue.

There were 54 scheduled sessions presented across 9 tracks that were delivered by 25-30 speakers.  Most of the speakers were SQL Server MVPs that flew in from all over the country.  And then there was me…

When I signed up for the event I suggested 3 “proposed” sessions: Query Plan Analysis, Implement a Sliding Window Partition, and a session on Profiler / Trace.  I figured one might get selected, maybe.  Fortunately for me the query plan session was picked.

Also, at the time of my “proposed” sessions, there was very few speakers registered.  So I assumed that they were going to have a tough time getting people to run the sessions.  Fast forward 2 weeks later and the speaker roster looked like a Who’s Who of SQL Server MVPs, both local and out of state.  Out-of-towners included Andy Kelly, Andy Leonard, Steve Jones, Andy Warren, Kevin Kline, Jessica Moss and more.  Local MVP talent included John Welch, Alejandro Mesa, Rafael Salas and several top SQL talents from Microsoft itself.  Honestly, had I known the speaker list was going to be of that caliber I would not have had the audacity to inject my name on such a list.  But sometimes Providence has a way of getting us do go outside our comfort zone to take us to the next level.

There were several unexpected benefits of being a speaker at SQL Saturday…  First of all you get to wear a really cool button down shirt at the event.  It was navy blue with the SQL Saturday logo and event number embroidered above the pocket.  It will be a nice keepsake of the event. 

The night before the event there was a “speakers dinner”.  Apparently the speakers dinner is standard protocol…  the main sponsor of the events hosts a dinner for all the speakers at a nice restaurant.  In our case we all went to Red Rocks in Birkdale Village.  Anything and everything on the menu or the bar, was fair game.

The second biggest benefit of being a speaker was the opportunity to meet, in a social setting, the many MVPs of the group.  Like they say about the PGA golf pros… Those guys are good.

Finally, at the top of the list of speaking at a SQL Saturday, is the lessons learned by preparing and delivering a technical presentation to a technology savvy audience.  For me there were many…  One of my favorite sayings is “If you want to learn it, teach it”.  The prep work of organizing your thoughts for the presentation, thinking about how to simplify complex concepts, trying to anticipate questions that may be presented, and then figuring out the answers, all contribute.

As expected, the attendees of the experience did ask some insightful questions that challenged my own knowledge of the subject.  Fortunately I had some help from the attendees who helped explain the answer to the question… To them I will repeat myself…  Thank you.

At first I was a little disappointed that my presentation was in the last time slot of the day… 4-5.  I didn’t expect anyone to stick around to see it, after all, it was one of the first decent Saturdays of the year.  Was I ever wrong…  Although my session was in one of the smaller rooms, it was packed.  My guess is there was between 40 and 60 people in it.  All the seats where taken.  People were sitting on a table that flanked the length of one of the walls, the back wall was lined with folks standing, and a few people were straddled on the floor.

Obviously I had a popular topic… because there weren’t coming to see me, a rookie presenter.  After seeing several sessions though out the day that were closely related to my session, in particular the session by Andy Warren on column and index statistics and Kevin Kline’s session on internals (basically how queries are processed) I was concerned that my session would be too much of a repeat of their messages.

Instead, I do believe, partly because one of the attendees told my so after the event, that my hands on demo filled the gap between the theory and concepts presented by others and the real world.  For that comment I am grateful…

I now have more work to do to improve my own knowledge and the overall presentation.  That only confirms the Learn it – Teach it theory.  But overall the experience was a 10!

Since Saturday I have had several LinkedIn invites from attendees, asked to consider presenting at the SQL Server UG in Columbia, SC, asked to consider presenting a webcast for SQLLunch, and had a couple folks indicate that the presentation has inspired them to look deeper at their own code and at the general topic of query plan analysis.  I know I struggle regularly with trying to get my arms around this 800 pound gorilla we call SQL server.  If 1 person in that room was moved to improve I consider it a homerun.  2 or more… well, that’s a grand slam.

I can ask no more from this event.  The rewards I have personally and professionally received from it far exceeded my expectations.  To all of the participants that stayed late, to those that provided constructive and supportive feedback, and to those that didn’t walk out…   I Thank You!

Stay curious; Keep learning…

Dave

Categories: T-SQL Tags: ,

Database Properties

February 27th, 2010 Dave No comments

A couple of weeks ago a colleague asked me how to determine the recovery model of a database using T-SQL.  In other words, not using SSMS.  For the purpose of this post I’m going to show how to determine the recovery model for a database using SSMS, and then answer the presented question.

The following screen can be viewed in SSMS by RIGHT clicking on the database in Object Explorer ==> Properties ==> Options.

image

From this we can see that the AdventureWorks database is Simple (the other options are Full and Bulk Logged).

But back to the original question…  How can this be determined using T-SQL?  At the time we were at lunch and I suggested using sp_helpdb.  I wasn’t positive this property was included…  but it turned out to be a good guess:

 

image

If you prefer to identify only the recovery model using T-SQL, such as to be able to save the information in a variable of temp table you can use the following approach:

image

There is an older version of the DATABASEPROPERTYEX called DATABASEPROPERTY.  According to BOL, it is not recommend using DATABASEPROPERTY as it has been scheduled for deprecation.

We can validate that DATABASEPROPERTY is deprecated by tracing deprecation in Profiler:

image

After running  the same query using DATABASEPROPERTY instead of DATABASEPROPERTYEX we observe the following in Profiler:

image

This only confirms what BOL was telling us.

OK, last approach (for today, I’m sure there are more methods) to determine the recovery model for a database:

image

 

Lessons Learned: As is always the case with SQL Server there are always multiple ways to arrive at the same result.  In this case we learned that we can capture a database’s recovery model by using:

  • SSMS
  • sp_helpdb
  • DATABASEPROPERTYEX
  • sys.databases view
    We also learned and confirmed that DATABASEPROPERTY (no EX) will not be supported in versions beyond SQL Server 2008. 
    Finally, we showed how to verify a deprecated command using SQL Profiler.
    Although this post was focused on the recovery model of the database any of these methods will work with any other database properties.  For more information on the entire set of database properties see BOL for DATABASEPROPERTY .

    Stay curious; keep learning…

    Dave

    Finding Uncommitted Transactions

    February 11th, 2010 Dave No comments

    One way to find out if a transaction is causing a block is with the sp_who2 system stored procedure.  The scope of this procedure is server-wide.  It identifies all connections to the database, the login name, the application that is running, and in the context of locks and blocks, any SPIDs that are causing a block.  The offending SPID will be in the BlkBy column.

    I often find myself using transactions when doing DELETE or UPDATE operations to allow myself the option of ROLLING back the transaction in case something goes awry.  As equally often, I forget to COMMIT the transaction and move on the the next order of business, resulting in a lock on the table and essentially blocking everyone out of it’s use (unless they use the with (nolock) hint in their query).

    If I ran the following query and forgot to commit the transaction what would happen?

    USE AdventureWorks;
    GO
    
    BEGIN TRAN
    INSERT INTO person.contact (FirstName, LastName, EmailAddress, PasswordHash, PasswordSalt)
    VALUES ('Dave', 'Turpin', 'dave.turpin@acm.org', 'goblygook', 'Salt');

     

    The table would lock up until either a commit or rollback happens.  But how would I know if there are any open transactions, either mine or somebody else?  sp_who2 would not indicate any blocks because there aren’t any.  There’s only an open transaction.  But as long as no other processes are tying to use that table there will not be any blocking/locking issues.

    DBCC OPENTRAN will list the earliest open transaction.  In our simple example, the DBCC will return:

    image

    In this case we can see that SPID 54 had a transaction open since 5:08AM.  Now we can take a look at SPID 54, see who it is, and initiate either a rollback or commit.  Chances are if an uncommitted transaction is hanging out there there’s either a major bug in someone’s code, we have a legitimate locking/blocking problem, or its an ad hoc batch.

    Since DBCC OPENTRAN only identifies the earliest open transaction, you may have to resolve someone else’s transaction before actually getting to yours.  Hopefully there will not be any open transactions for very long and the point is moot.  In practice, however, when we need to look for open transactions it’s because something else isn’t working or…  The red phones are lighting up with customer complaints.

    Stay curious; keep learning…

    Dave