Michael J. Swart

January 22, 2013

PAGELATCH_EX Contention on 2:1:103

Filed under: SQLServerPedia Syndication — Michael J. Swart @ 12:00 pm
Update February 28, 2013: Please don’t forget to look at my follow up post here Follow up on Ad Hoc TVP Contention for some important notes on some promising fixes/changes found in later SQL Server versions.

This blog post is meant to help people troubleshoot page latch contention on 2:1:103. If that’s what brought you to this page, then hopefully you find it useful. But first…

A textbook entitled "Don't Panic"

Initial Investigation

Last week I was asked to help tackle a production outage. Queries were slow enough that the system was considered unavailable. And just like any other performance problem I started by having SQL Server tell me what was wrong with itself.

  • I first started with sp_whoisactive but it took about 30 seconds for it to return. Several queries were blocked by others and some of the lead blockers were waiting for a PAGELATCH_EX on 2:1:103
  • Then I checked on the most common waits using a query found at Paul Randal’s article: Wait statistics, or please tell me where it hurts. (I don’t have that page bookmarked, every time I’m in trouble, I google “Tell me where it hurts Paul Randal”). I learned that PAGELATCH_EX contention was our most serious wait type. Paul mentions that he sees this kind of contention most commonly on tempdb.
  • I followed a link from Paul’s article to Robert Davis’s Breaking Down TempDB Contention. A script there helped me to discover that while there was lots of tempdb contention, it was all on page 2:1:103, which is not PFS, GAM or SGAM.

Finding More Info

So I looked for more help.

  • Web Search: My work colleague found and then pointed me to Latch waits on 2:1:103? You are probably creating too many temp tables in SQL Server by Microsoft’s Matt Wrock who faced the same problem. If you found this page because you’re facing the same issue then stop reading this article and go read Matt’s first. He explains what’s going on better than I could. Think of this article as a kind of sequel to his article.
  • StackOverflow: (serverfault in this case) has TempDB contention on sysmultiobjrefs SQL 2005. The best answer there is Matt pointing back to his blog article.
  • Microsoft Support who helped us out, but maybe not as quick as we’d like. To be fair, the turnaround time that we would have been happy with was measured in nanoseconds.

With that information it became clear to us that we were creating temp tables too often. And we were creating them in a way that made it impossible for SQL Server to cache. Did you know that? That SQL Server can cache temp tables? When a query is done with a temp table, SQL Server can truncate that table and give it to the next query to avoid having to create it again. Cached temp tables reduce tempdb contention including contention on this page. But as it turns out, SQL Server cannot cache temp tables from ad hoc queries.

But who creates temp tables that often? We did, just by using a table valued parameter in a parameterized query. Since SQL Server began supporting table valued parameters (introduced in SQL Server 2008), we have been gradually moving towards this practice in lieu of sending xml to be shredded.

… And More Info Including Some Other Links

I can’t help including some extra articles I found on 2:1:103 contention.

Robert Davis blogged about 2:1:103 contention in Tempdb Contention That Can’t Be Soothed. His advice is to remove statements in code like “SELECT … INTO”.  However, I believe that such statements contribute to PAGELATCH contention only if the statement is not part of a stored procedure (i.e. can’t be cached). I also think there are more common causes of 2:1:103 PAGELATCH contention than SELECT … INTO statements.

But Robert did link to a demo by Paul Randal (an absolutely amazing 5 minute demo linked from a 2011 issue of his newsletter). Paul tells us

  • That “the SQL Team knows about this. It’s a known issue. Hopefully something will be done about it in one of the future releases.”
  • But unlike SGAM or GAM contention, there’s absolutely nothing you can do about this to spread the contention around.

From a DBA point of view there’s not much that can be done. So as DB Developers, we have to find a workaround. But before I get to that, I want to mention some things that were not so useful.

… And Some Bad Info

There was some red herrings out there…

  • Add more tempdb files (nope, wrong kind of tempdb contention).
  • Many resources suggested enabling trace flag T1118. It’s a trace flag that eases tempdb contention, but not this kind.
  • Get faster tempdb disks? No, PAGELATCH contention is for in-memory copies of pages, not disk (that’s PAGEIOLATCH).
  • Tune the queries in question? Not a bad idea, but this problem is about the number of problem queries, not the performance of each.
  • A fix from Microsoft Support: Fix: Poor performance in SQL Server 2008 R2 when table-valued functions use many table variables. Oooh… so close, but we weren’t using table-valued functions. And the workarounds they list (disabling AUTO_UPDATE_STATISTICS) did not help. But maybe it might help you?
  • The Object:Created event. Whether traced with Profiler or collected with an Extended Events session, this event can report on created temp tables. Maybe I can use this event to tell me which queries are creating the tempdb tables. Nope, not this time! This event  has two drawbacks which make it useless for troubleshooting 2:1:103 contention:
    • The Object:Created event reports the creation of temp tables even when they’re cached (which don’t need a latch on 2:1:103).
    • The Object:Created event doesn’t report the declaration of any table variables (which may need a latch on 2:1:103).

Strategy: So What Do We Do?

Knowing is half the battle right? But that means we still have a lot of work to do. I’m going to recap what we know so far. We know that we have trouble when there are queries that:

  • are executed frequently
  • create temp tables (either explicitly or by declaring table valued parameters).
  • are not cached (Microsoft explains when temp tables are not cached. In my case, it was because they were ad-hoc queries)
  • require a page latch on 2:1:103.

There is a performance counter that can track the all of the above (except maybe for that last bullet). It’s called Temp Tables Creation Rate and it’s found in the perf counter category “General Statistics”. Now this is a metric you can trust. We found that a high temp table creation rate was tightly correlated to the trouble we were seeing. So when troubleshooting, look at this performance counter (and leave the “Object:Create” event alone).

So now what do we do? First, we must find the ad hoc queries that create these temp tables. Then, we have to put them in stored procedures so that the temp tables can be cached. Alternatively, we could reduce the need for creating them. It’s a workaround, but it’s what we’ve got.

Finding Such Queries Is Difficult

But here’s the hard part. In a high volume system, it’s difficult to identify exactly which queries are causing the most trouble. Microsoft support can go through tons of collected trace and performance data to try to find such queries, but it’s a long process. On our side, we looked at sys.dm_os_waiting_tasks:

Select *
From sys.dm_os_waiting_tasks

And we saw all the contention on 2:1:103, but when we tried to look up the SQL text for it

Select wt.*, st.text
From sys.dm_os_waiting_tasks wt
left join sys.dm_exec_requests er
    on er.session_id = wt.session_id
outer apply fn_get_sql(er.sql_handle) st
where wt.resource_description is not null

The text was often unavailable. Basically, I’m guessing maybe the dmv’s I was using weren’t quick enough to tell me which queries were suffering from (or causing) contention on 2:1:103. So I decided to look through the cache for query candidates that might create temp tables. here’s what I came up with. It’s not a comprehensive list and there might be false positives, but it might be enough to go on. If you know your applications well, you can tailor the filters below to something more relevant for you.

select cp.plan_handle, sql_handle, text, refcounts, usecounts
from sys.dm_exec_cached_plans cp
join sys.dm_exec_query_stats qs
    on cp.plan_handle = qs.plan_handle
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
where (
       (
        st.text like N'%READONLY%'  -- possible table valued parameter (prepared queries)
        or st.text like N'%TABLE %' -- possible table creation
        or st.text like '%#%'       -- possible use of temp table
        or st.text like '%INTO%'    -- possible table creation
       )
       and objtype in ('prepared', 'ad hoc') -- temp tables not cachable with these
)
order by usecounts desc

Going Forward

Personally after helping identify and implement the workarounds. I’m doing a couple things:

  • I’m recommending that developers not create temp tables or declare table variables that cannot be cached. For now, this means we use stored procedures for any query that uses table variables or temp tables.
  • We now have thresholds on our performance tests which look at the performance counter Temp Tables Creation Rate.
  • I created a Microsoft Connect item. If you’re troubleshooting the same problem, head over there and let Microsoft know you’re having trouble too.

18 Comments »

  1. Yeesh, not a fun one! I’ve had to deal with tempdb contention a few times but lucky for me all the cases were on inherited servers and it was simple SGAM contention remedied via additional (and properly sized) tempdb files.

    Nice catch and great write-up! I’m sure this post will really help others who run into this!

    For anyone else interested, there is a great post about temp table caching by Paul White here: http://sqlblog.com/blogs/paul_white/archive/2012/08/17/temporary-object-caching-explained.aspx

    Comment by Andrew Lockwood — January 23, 2013 @ 2:13 pm

  2. Great post Michael. I really enjoyed the way you stepped through your process of understanding and resolving the problem. Very well explained, thanks for sharing!

    Erin

    Comment by Erin Stellato — January 24, 2013 @ 11:50 am

  3. […] PAGELATCH_EX Contention on 2:1:103 – A tremendous write up from Michael J. Swart (Blog|Twitter) on a recent performance troubleshooting effort. Great stuff! […]

    Pingback by Something for the Weekend - SQL Server Links 25/01/13 — January 25, 2013 @ 6:03 am

  4. @Andrew Thanks so much. “Not a fun one” is right. I wouldn’t wish this trouble on my enemies.

    @Erin Thank you too Erin! I wrote it by putting myself in the shoes of a reader encountering this problem really green. Which wasn’t hard. I mean, that person was me earlier this month.

    But I sincerely hope neither of you ever have to use this article.

    Cheers

    Comment by Michael J. Swart — January 25, 2013 @ 8:54 am

  5. Are you on a version that supports extended events?

    create event session foobar on server
    add event sqlos.wait_info (
    	action (
    		sqlserver.tsql_stack,
    		sqlserver.plan_handle
    	)
    	where (wait_type = 52 --PAGELATCH_EX
    		and duration > 10 --might need to dial this up or down
    	)
    )
    add target package0.ring_buffer;
    alter event session foobar on server state = start;
     
    --get the data
    select cast(xst.target_data as xml) as x
    from sys.dm_xe_session_targets as xst
    inner join sys.dm_xe_sessions as xs
        on xst.event_session_address = xs.address
    where xs.name = 'xe_JumpToDefault'

    I leave slicing as an exercise for the reader.

    Comment by Ben Thul — February 15, 2013 @ 1:38 pm

  6. Of course, you should change the session name in the target query to the actual session name. ;)

    Comment by Ben Thul — February 15, 2013 @ 1:40 pm

  7. That’s awesome Ben…
    I’ll give it a shot (I took the liberty of formatting your SQL)

    Comment by Michael J. Swart — February 15, 2013 @ 1:41 pm

  8. Great effort Ben, but it looks like the extended event is suffering from the same issue as dmv query that uses “outer apply fn_get_sql”
    And I get a lot of this:

        <action name="tsql_stack" package="sqlserver">
          <type name="unicode_string" package="package0" />
          <value>Unable to retrieve T-SQL stack</value>
          <text />
        </action>

    Comment by Michael J. Swart — February 15, 2013 @ 2:12 pm

  9. Is the plan handle non-empty? If you get something back for that, use it to at least get a query plan. One thing that might be killing you is the optimize for ad hoc server setting. If you truly suspect that it’s ad hoc queries, then those plans aren’t going to be in cache.

    Comment by Ben Thul — February 15, 2013 @ 3:54 pm

  10. Hmmm…
    “optimize for ad hoc workloads” = 0
    and

        <action name="plan_handle" package="sqlserver">
          <type name="unicode_string" package="package0" />
          <value>&lt;plan handle='0x000000000000000000000000000000000000000000000000'/&gt;</value>
          <text />
        </action>

    I don’t think that it’s a coincidence that finding a query (or even a plan) here is just as difficult as the efforts I describe in the section “Finding Such Queries Is Difficult”. I don’t know what the common cause is, I just know I have to look elsewhere.

    Comment by Michael J. Swart — February 15, 2013 @ 4:05 pm

  11. Here’s a crazy thought: what if it’s a system thread? There’s a sqlserver.is_system action. Heck, there’s a bunch of interesting actions you could add here (sqlserver.client_hostname could be instructive). The game is afoot!

    Comment by Ben Thul — February 15, 2013 @ 6:21 pm

  12. […] month I posted PAGELATCH Contention on 2:1:103. It described a troubleshooting experience I had. I was troubleshooting a performance problem that […]

    Pingback by Follow up on Ad Hoc TVP contention | Michael J. Swart — February 28, 2013 @ 12:41 pm

  13. The site ate some of my earlier post, likely because of my use of angle brackets so I am trying again…

    Thanks for the article. I am huge fan of passing table valued parameters to stored procedures, but am concerned about the contention issues and the lack of tools we have to combat it but I need some clarification on a couple of items from the article.

    ** But who creates temp tables that often? We did, just by using a table valued parameter in a parameterized query. Since SQL Server began supporting table valued parameters (introduced in SQL Server 2008), we have been gradually moving towards this practice in lieu of sending xml to be shredded. **

    Can you please provide a sample of the code you are passing to the database that is representative of “a table valued parameter in a parameterized query”?

    ** I’m recommending that developers not create temp tables or declare table variables that cannot be cached. For now, this means we use stored procedures for any query that uses table variables or temp tables. **

    Are you using “table variable” and “table type” interchangeably here? How would one call a procedure that accepts a table valued parameter without first declaring it in the batch, populating it, and then passing it in as a parameter in a similar fashion to what the code sample on Books Online shows here? http://msdn.microsoft.com/en-us/library/bb510489(v=sql.105).aspx#sectionToggle2

    Comment by opc.three — March 5, 2013 @ 12:00 pm

  14. Hey there opc.three. Thanks for coming by.

    It’s entirely possible that I’ve been sloppy with the terms I’ve used to talk about temp table variables and table valued parameters.

    Your first question about using a TVP in a parameterized query. What I mean is code similar to the example at the bottom of Temp table caching improvement for table valued parameters in SQL Server 2012. I have a reproduction at the follow-up (linked at the top of this page) which matches even more closely what I mean by TVPs in parameterized queries. That reproduction is likely what you’re asking for.

    For your second question, “table variable” and “table type” are not quite interchangeable in my head, but in my examples each table variable that is declared is a user defined type. So when I’ve defined a table type like:

    CREATE TYPE IntSet AS TABLE (Value INT)

    And create a variable of that type:

    DECLARE @MyListOfInts IntSet

    Then I consider @MyListOfInts a table variable the same way that I consider @Tablename a string variable in:

    DECLARE @Tablename sysname

    So when using stored procedures, I don’t declare and populate a table during the batch, but I do something like the following instead. Table Valued Parameters, A Short, Complete Example

    I hope I’ve answered your questions. There’s a chance that I didn’t address the thing you’re wondering about.

    Comment by Michael J. Swart — March 5, 2013 @ 1:24 pm

  15. Thanks for the reply.

    ** Then I consider @MyListOfInts a table variable the same way that I consider @Tablename a string variable in:

    DECLARE @Tablename sysname**

    OK, I do not consider them the same thing and here is why…we can say this:

    DECLARE @tt AS IntSet; -- Table Type
     
    EXEC dbo.SampleDataTypeProcedure 
        @SampleDataType = @tt;

    But this:

    DECLARE @tt AS TABLE (Value INT); -- Table Variable with the same definition as the Table Type
     
    EXEC dbo.SampleDataTypeProcedure 
        @SampleDataType = @tt;

    Results in error:

    Msg 206, Level 16, State 2, Procedure SampleDataTypeProcedure, Line 0
    Operand type clash: table is incompatible with SampleDataType

    ** So when using stored procedures, I don’t declare and populate a table during the batch, but I do something like the following instead. Table Valued Parameters, A Short, Complete Example **

    I think I know where we have a disconnect. The code sample in the article uses the SqlCommand object to map a ADO.NET DataTable to a stored procedure parameter. While there is nothing explicit in the C# code that says “declare a variable of Type BigIntList and populate it then call the proc” under the covers this is what is actually being generated by the ADO.NET object and shipped via RPC to the server when ExecuteReader() is called:

    declare @p1 dbo.BigIntList
    insert into @p1 values(1)
    insert into @p1 values(2)
    insert into @p1 values(3)
    insert into @p1 values(4)
    insert into @p1 values(5)
    insert into @p1 values(6)
    insert into @p1 values(7)
     
    exec ReturnEvenNumbers @list=@p1

    Comment by opc.three — March 5, 2013 @ 4:16 pm

  16. Right. I understand you now.

    So ADO.net is the one that declares and populates the variable before calling the stored procedure.
    For what it’s worth, I do see that SQL Server is able to re-use cached temp tables in this case. And I expect that invoking the stored procedure explicitly (without ado.net’s help) works the same.

    Now for what I call ad hoc queries or parameterized queries (where the ADO.net SqlCommand object uses CommandType.Text) ADO.net issues something like the following under the covers:

    declare @p3 dbo.BigIntList
    insert into @p3 values(1)
    insert into @p3 values(2)
    insert into @p3 values(3)
    insert into @p3 values(4)
    insert into @p3 values(5)
    insert into @p3 values(6)
    insert into @p3 values(7)
     
    exec sp_executesql N'SELECT i FROM @list WHERE i % 2 = 0',N'@list [dbo].[BigIntList] READONLY',@list=@p3

    And it’s that example that caused us grief and it’s also that example that Microsoft has been able to demonstrate a performance improvement in 2012 because SQL Server 2012 is able to cache and re-use temp tables where SQL Server 2008 and R2 could not.

    So now that that’s cleared up. From what you’ve said earlier, I believe that you’re using TVPs in a way where you won’t have to worry about tempdb contention.

    Comment by Michael J. Swart — March 5, 2013 @ 4:53 pm

  17. Thanks again for the reply. The parameterized query you showed being executed using sp_executesql in your sample code was very helpful towards knowing a specific use of table types that can cause the types of contention issues you were tracking.

    Comment by opc.three — March 5, 2013 @ 5:37 pm

  18. […] http://michaeljswart.com/2013/01/pagelatch-contention-on-21103/ […]

    Pingback by TempDB Contention 2.1.% | Vinay Thakur – Sql Server DBA — October 16, 2013 @ 1:10 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress