I recently discovered an undesirable behavior of SQL Server on a brand new SQL 2008R2SP2 cluster. If the cluster failed over once it was fine. If it was immediately failed back to the original node, it would hang when taking SQL Agent offline.
The cluster had two physical nodes (01P and 02P) and three (3) SQL Server instances (AP, BP & CP). It turns out that neither the hardware or the OS were contributing factors, so there is no point in describing their configurations.
The instance that was presenting the most problems was for a new client. In fact, when the issue was first discovered when the cluster was not even in PRODUCTION yet. This afforded us the luxury, at least early on, of being able to fail the instance over, almost on an as needed basis. Eventually the cluster did go into production so running tests and collecting logs involved a more rigorous process.
The databases on the instance were fairly small, by our standards. On the order of 100GB. The databases are not transactional. For the most part they are load-and-report databases, the classic OLAP.
I worked with Microsoft support over a four month period. The SQL Server support engineer, Pradeep M. M., and I had sort of a doctor-patient relationship. He would prescribe the test, I would submit the test results to him.
Most of the tests were in the form of setting trace flags to enhance logging to the SQL error log, failing the cluster over, and providing the logs for analysis.
Over the course of the four months we did upgrade from cumulative update 3 (CU3) to CU8, and eventually CU12.
While nothing in the logs pointed directly to the root cause, it was discovered fairly early on that the startup of tempdb was taking upwards of 3 minutes to come online. We checked to insure Instant File Initialization was correctly setup (which only applies to the data files, not the log file). With two 40GB tempdb data files at initialization, plus an 8GB log file, we are dealing with a small (by our standards) tempdb.
This would explain why an instance would fail when taking SQL Agent offline. Agent couldn’t go offline until the instance is fully online. Without a tempdb, the instance is still coming up.
We checked other instances that we manage and discovered that most instances of SQL Server were able to initialize tempdb in 5 seconds or less, at service startup, even for instances where tempdb was up to 800GB.
Of course, any time we start seeing IO related issues, we always blame the SAN, right? Nothing was discovered on the SAN side.
What about the hardware? Could drivers be out of date? Sure, but we have 3 instances running on 2 nodes. Both nodes are brand new blade servers. While the AP and CP instances work fine, regardless of the physical node they were on. While the BP instance consistently hang, regardless of the physical node it is on.
At that point it is fair to conclude that we are not dealing with an external resource constraint, either at the SAN or physical hardware level. Most likely, the root cause is at the instance level.
The root cause of the problem is that the MODEL database was configured to have a 1GB initial file. The short version of the explanation is that, even though tempdb is configured with it’s own attributes (number of data files, log file, autogrowth, growth increments, etc.) at service startup it is initially created by MODEL.
Rather than re-invent the wheel, I’ll provide Pradeep’s detailed explanation:
When SQL Server starts it re-creates tempdb every time; If it has to automatically create the database it has to first set some default properties and to achieve this it uses model database as a template and then once the initial template for the tempdb is ready it goes ahead and reads the actual configuration that has been kept from the master database for the tempdb files and then tweaks the same. Now when it copies the model database as a template and carves out tempdb from it doesn’t mean that it uses a windows API call to copy the file and paste the file and then move ahead. It cannot use this because SQL server internally maintains file header ( System Pages like PFS , GAM ,SGAM ) and other integral structures and it has to modify the file header and change other internal pages to remove model reference and add tempdb references to it. Due to this it actually reads all the allocate and un-allocated extents from model in 64kb chunks and then copies them to the tempdb database. When compared against normal file copy operation this operation will be pretty slow. But in the current product this is how it has been designed. The bigger the model database the number of extents available will increase and it will take more time to copy the extents to the new database in 64kb chunks.
The above was the design which exists till SQL Server 2008 R2; In SQL Server 2012 some significant design changes have been made which avoids the problem completely and fastens the data copy of model database extents to tempdb as well as in SQL Server agent there is some code changes as well which avoids the waiting situation. At this point of time the same architecture changes cannot be ported back to SQL Server 2008 R2 as the product has already reached Extend Support Phase where in only security fixes will be made and any other new issues/design change requests will be denied.
Be mindful that this is only relevant through versions SQL Server 2008R2. As Predeep has indicated above, it is not an issue in SQL Server 2012.
If find it curious that this issue has not been raised in the 6 years that this product has been in the field. At this point in time, there may not ever be a KB article on this, since the product is past the maintenance support end date (July 8, 2014).
After resizing model to the default 2mb data and 1mb log, the delays in instance failovers disappeared. That 80GB tempdb database came on line in less than 2 seconds.
How do you know if your server has the issue?
- Are you running SQL Server 2008R2 or earlier? And…
- Has your model database been changed to have a data file larger than 2MB? And…
- Capture the @startTime = current_timestamp
- Create DATABASE myTest (ie., a plain copy of the model database)
- Capture the @endTime = current_timestamp
- The delta between @startTime and @endTime is more than a second, such as 3 minutes, (ie., the time to create that database) you probably have an issue with model.
Lesson Learned: Contrary to popular beliefs, for SQL Server 2008R2 (and earlier), leave the model data file size at the default 2MB. Increasing the size of model will create delays in the creation of tempdb at service startup.
Stay curious, keep learning…