Michael J. Swart

September 11, 2013

Three More Tricky Tempdb Lessons

Filed under: SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 12:00 pm

Takeaway: Tempdb contention is a common DBA problem. I want to share three lesser-known surprises about everyone’s favorite bottleneck.

If my professional life were made into a series, tempdb contention would have a recurring role as the antagonist. It provides challenges on a regular basis, and I finally believe I understand its behavior well enough to have contention licked for good. But even recently, tempdb still holds some surprises for me.

You Can Still Get SGAM Contention After TF1118

You may know what trace flag TF1118 does. It affects the allocation of pages in all databases, but it’s most often used for its effect on tempdb. The idea is that TF1118 causes SQL Server to store all table data in dedicated extents rather than mixed extents. Without mixed extents, the need to access SGAM pages is reduced, and that leads to less contention, as described in Microsoft’s KB article on the subject.

But the use of SGAM isn’t reduced to zero. What?! How can that be? Who’s still using mixed extents? It turns out that IAM pages are always stored on mixed extents. So every time an object is created (or destroyed), at least one latch is taken on an SGAM page. On an extremely busy system, you can still see contention here.
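One way to check whether you’re hitting this residual contention is to look for PAGELATCH waits on tempdb allocation pages. A sketch (the page-number arithmetic assumes the standard layout: in each data file, page 1 is the PFS page, page 2 the GAM, and page 3 the SGAM, repeating at fixed intervals after that):

```sql
-- Tasks currently waiting on page latches in tempdb (database_id = 2).
-- resource_description is formatted as database:file:page;
-- page 3 (and every 511,232nd page after it) in a file is an SGAM page.
SELECT
    session_id,
    wait_type,                 -- PAGELATCH_UP, PAGELATCH_EX, etc.
    wait_duration_ms,
    resource_description       -- e.g. '2:1:3' = tempdb, file 1, SGAM
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE 'PAGELATCH%'
  AND resource_description LIKE '2:%';
```

If the page numbers you see are 1, 2, or 3 (or 1 plus a multiple of 8,088 for PFS, 2 or 3 plus a multiple of 511,232 for GAM/SGAM), the waits are on allocation pages rather than ordinary data pages.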

Equally Sized Data Files Can Become Unbalanced After Autogrow

That’s not the surprising thing. The surprising thing is that it can reduce the effectiveness of having multiple data files, and it needs to be kept in mind. Once again from Microsoft:

If data files [have] unequal sizes, the proportional fill algorithm tries to use the largest file more for GAM allocations instead of spreading the allocations between all the files.

This makes multiple data files less effective than they could be at tackling contention. But don’t disable auto-grow; just size your tempdb data files large enough. (Tip: google “instant file initialization”.)

Before we move on, if your tempdb data files have auto-grown, do you know what the initial sizes were?
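You can answer that question from the system catalogs. For tempdb, sys.master_files records the configured size (the size the files will be re-created at on the next restart), while tempdb’s own sys.database_files reports the size right now, so any difference between the two means the file has auto-grown. A sketch:

```sql
-- Compare tempdb's configured (startup) size to its current size.
-- Sizes are stored in 8 KB pages; convert to MB for readability.
SELECT
    mf.name,
    mf.size * 8 / 1024 AS configured_size_mb,  -- size at next restart
    df.size * 8 / 1024 AS current_size_mb      -- size right now
FROM sys.master_files AS mf
JOIN tempdb.sys.database_files AS df
    ON mf.file_id = df.file_id
WHERE mf.database_id = DB_ID('tempdb')
  AND mf.type_desc = 'ROWS';
```

Rows where current_size_mb exceeds configured_size_mb have auto-grown since startup, and if the files grew by different amounts, they’re no longer equally sized.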

Tempdb Contention Can Be Sudden and Catastrophic

I hadn’t seen this before. Most contention I’m familiar with looks like a “sluggish” database: the response times on particular queries just aren’t what they ought to be, getting gradually worse until the impact is felt hard. But just recently I saw tempdb contention show up all of a sudden, making the database nearly useless, especially after excessive blocking kicks in. Here’s how that can happen:

  • Imagine a super-frequent procedure that creates a temp table.
  • Several of these temp tables are cached. (There can be several execution contexts for the procedure, and so several temp tables.)
  • A table used in the procedure has enough modified rows to have its statistics auto-updated.
  • The procedure’s query plan is invalidated and is dropped from the cache.
  • The cached temp tables that were used for the procedure are marked to be destroyed.
  • The procedure is recompiled and new temp tables are created.
  • A system process destroys the marked temp tables that need to be cleaned up.

On a very busy system, these things can happen within a few seconds. In fact, the last two items describe actions where hundreds of threads can get into a fight over allocation pages.

The following query may give a list of query plans for frequently called procedures as described:

-- Frequently executed procedures tend to have many cached execution
-- contexts, and each context has its own memory object. Count the
-- memory objects per plan to find the busiest procedures.
with frequentSprocs as
(
    select top 10
        count(1) as [memory objects],
        cp.plan_handle
    from sys.dm_exec_cached_plans cp
    cross apply sys.dm_exec_cached_plan_dependent_objects(cp.plan_handle) do
    join sys.dm_os_memory_objects mo
        on do.memory_object_address = mo.memory_object_address
    where cp.objtype = 'Proc'
    group by cp.plan_handle
    order by count(1) desc
)
select fs.*, qp.query_plan
from frequentSprocs fs
cross apply sys.dm_exec_query_plan(fs.plan_handle) qp
option (recompile)

What Next?

I’m lucky to work with software that runs on systems that are crazy powerful and crazy busy. It’s unlikely that you’ll run into tempdb problems in the exact same way that we did. So the typical advice for PFS/GAM/SGAM contention is still valid:

  • Enable trace flag TF 1118
  • Create multiple (equally sized) data files for tempdb
  • Reduce how frequently you use temp tables and table variables


4 Comments »

  1. A few additional thoughts.
    Trace flag 1117 can help to keep database files equally sized over their lifetime, by triggering autogrow for all database files in a filegroup at the same time.

    In the context of multiple tempdb data files, I like to lobby for keeping each tempdb data file on its own LUN when possible.
    Removing the bottleneck from allocation page access without increasing IO concurrency might stress the LUN queue depth enough for it to become the next bottleneck. So increase allocation page concurrency AND disk IO concurrency. (Also make sure that LUN queue depth and maximum transfer size are appropriately set.)

    Another reason to stick to one datafile per LUN – if autoexpand occurs (or manual expand) with multiple files on a LUN, filesystem space allocation to the multiple files will result in filesystem fragmentation. One data file per LUN lowers that risk.

    Final reason for one tempdb per LUN: avoiding IO weaving when multiple tempdb files are receiving heavy query spill writes or otherwise sustaining sequential IO. Query spill happens, and on busy, powerful systems it can happen in a big way to multiple queries simultaneously. Query spill tends to be sequential write. Sustain multiple large sequential writes to tracks remote from each other on the same physical HDD disks and you can tease out what Microsoft calls “io weaving”, and what most folks just call ‘head thrashing’. Even without any other activity, each RAID level has a number of sequential IO streams it can support before sequential IO latency becomes worse than random IO latency… and that number is surprisingly small in comparison to the number of concurrent users or queries that most database systems support.

    Comment by Lonny Niederstadt — September 12, 2013 @ 3:57 pm

  2. Hi Lonny!

    Those are some great points. Trace flag 1117 will keep the datafiles growing in-step. It’s not right for all environments because the flag applies to all databases, not just tempdb. For my case, we just set our tempdb data files large enough.

    You also bring up an interesting point about the configuration of tempdb files. The configuration of tempdb data files is one of the places that impacts two kinds of bottlenecks: I/O bottlenecks and logical contention (pagelatch contention). I’m lucky to work with a kick-ass server hardware team. They’ve made sure that we’ve got monster servers that can handle anything we throw at them. In fact I get the feeling that the only reason I encountered the tempdb contention this week is because my hardware counterparts have already successfully tackled I/O bottlenecks (< 2 ms response times even during the infrequent query spill you mentioned).

    Our environment is a little different than most shops anyway. We actively avoid heavy-hitting queries that use tempdb. For us, it’s lightweight queries or nothing, and that lets us cruise along quite nicely at 8K transactions per second (with peaks up to 10K).

    BTW, this comment is some of the best feedback I’ve received recently. Thanks!

    Comment by Michael J. Swart — September 12, 2013 @ 5:05 pm

  3. It should be noted that trace flag 1117 was introduced in SQL Server 2008. It is ignored by earlier versions.

    Comment by Mark Freeman — September 16, 2013 @ 10:18 am

  4. Interesting Mark! That’s something that earlier versions and I have in common.

    Comment by Michael J Swart — September 16, 2013 @ 10:20 am
