Michael J. Swart

July 20, 2017

SQL Server UPSERT Patterns and Antipatterns

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 11:34 am
Upsert Patterns and Antipatterns
Good ways and bad ways to update or insert rows

In 2015, Postgres added support for ON CONFLICT DO UPDATE to their INSERT statements. Their docs say that using this syntax guarantees an atomic INSERT or UPDATE outcome; one of those two outcomes is guaranteed, even under high concurrency.

Now, that is one amazing feature isn’t it? Postgres worked really hard to include that feature and kudos to them. I would trade ten features like stretch db for one feature like this. It makes a very common use case simple. And it takes away the awkwardness from the developer.

But if you work with SQL Server, the awkwardness remains and you have to take care of doing UPSERT correctly under high concurrency.

Easy things should be easy

I wrote a post in 2011 called Mythbusting: Concurrent Update/Insert Solutions. But since then, I learned new things, and people have suggested new UPSERT methods. I wanted to bring all those ideas together on one page. Something definitive that I can link to in the future.

The Setup

I’m going consider multiple definitions of a procedure called s_AccountDetails_Upsert. The procedure will attempt to perform an UPDATE or INSERT to this table:

CREATE TABLE dbo.AccountDetails

Then I’m going to test each definition by using a tool like SQL Query Stress to run this code

declare @rand float = rand() + DATEPART( second, getdate() );
declare @Email nvarchar(4000) = cast( @rand * 100 as nvarchar(100) ) + '@somewhere.com';
declare @Etc nvarchar(max) = cast( newid() as nvarchar(max) );
exec dbo.s_AccountDetails_Upsert @Email, @Etc;

Antipattern: Vanilla Solution

If you don’t care about concurrency, this is a simple implementation. Nothing fancy here:

CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
  IF EXISTS ( SELECT * FROM dbo.AccountDetails WHERE Email = @Email )
    UPDATE dbo.AccountDetails
       SET Etc = @Etc
     WHERE Email = @Email;
    INSERT dbo.AccountDetails ( Email, Etc )
    VALUES ( @Email, @Etc );

Unfortunately, under high concurrency, this procedure fails with primary key violations.

Antipattern: MERGE Statement

The MERGE statement still suffers from the same concurrency issues as the vanilla solution.

CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
    MERGE dbo.AccountDetails AS myTarget
    USING (SELECT @Email Email, @Etc etc) AS mySource
        ON mySource.Email = myTarget.Email
        SET etc = mySource.etc
        INSERT (Email, Etc) 
        VALUES (@Email, @Etc);

Primary Key violations are still generated under high concurrency. Even though it’s a single statement, it’s not isolated enough.

Antipattern: Inside a Transaction

If you try putting the vanilla solution inside a transaction, it makes the whole thing atomic, but still not isolated. You still get primary key violations:

CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
    IF EXISTS ( SELECT * FROM dbo.AccountDetails WHERE Email = @Email )
      UPDATE dbo.AccountDetails
         SET Etc = @Etc
       WHERE Email = @Email;
      INSERT dbo.AccountDetails ( Email, Etc )
      VALUES ( @Email, @Etc );

Antipattern: Inside a Serializable Transaction

So this should be isolated enough, but now it’s vulnerable to deadlocks:

CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
    IF EXISTS ( SELECT * FROM dbo.AccountDetails WHERE Email = @Email )
      UPDATE dbo.AccountDetails
         SET Etc = @Etc
       WHERE Email = @Email;
      INSERT dbo.AccountDetails ( Email, Etc )
      VALUES ( @Email, @Etc );

Antipattern: Using IGNORE_DUP_KEY

I want to mention one more bad solution before I move on to the good solutions. It works and it’s interesting, but it’s a bad idea.

ALTER TABLE dbo.AccountDetails 
  DROP CONSTRAINT PK_AccountDetails;
ALTER TABLE dbo.AccountDetails
  ADD CONSTRAINT PK_AccountDetails 
CREATE PROCEDURE dbo.s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
  -- This statement will not insert if there's already a duplicate row,
  -- But it won't raise an error either
  INSERT dbo.AccountDetails ( Email, Etc ) 
  VALUES ( @Email, @Etc );
  UPDATE dbo.AccountDetails 
     SET @Etc = Etc
   WHERE @Email = Email;

So what are some better solutions?

Pattern: Inside a Serializable Transaction With Lock Hints

Still my prefered solution.

CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
    IF EXISTS ( SELECT * FROM dbo.AccountDetails WITH (UPDLOCK) WHERE Email = @Email )
      UPDATE dbo.AccountDetails
         SET Etc = @Etc
       WHERE Email = @Email;
      INSERT dbo.AccountDetails ( Email, Etc )
      VALUES ( @Email, @Etc );

This is a solid solution, but every implementation is different so every time you use this pattern, test for concurrency.

Pattern: MERGE Statement With Serializable Isolation

Nice, but be careful with MERGE statements.

CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
    MERGE dbo.AccountDetails AS myTarget
    USING (SELECT @Email Email, @Etc etc) AS mySource
        ON mySource.Email = myTarget.Email
        SET etc = mySource.etc
        INSERT (Email, Etc) 
        VALUES (@Email, @Etc);

For alternative syntax, skip the SET TRANSACTION ISOLATION LEVEL SERIALIZABLE and include a lock hint: MERGE dbo.AccountDetails WITH (HOLDLOCK) AS myTarget.

Pattern: Just Do It

Just try it and catch and swallow any exception

CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
  INSERT INTO dbo.AccountDetails (Email, Etc) VALUES (@Email, @Etc);  
  -- ignore duplicate key errors, throw the rest.
  IF ERROR_NUMBER() IN (2601, 2627) 
    UPDATE dbo.AccountDetails
       SET Etc = @Etc
     WHERE Email = @Email;

I wrote a post about it in Ugly Pragmatism For The Win.
And I wrote about a rare problem with it in Case study: Troubleshooting Doomed Transactions.

I have no concurrency concerns here but because of this issue – not to mention performance concerns – it’s no longer my first preference.

Performance and Other Concerns

When describing each pattern so far, I haven’t paid attention to performance, just concurrency. There are many opportunities to improve the performance of these solutions. In particular, pay very close attention to your average use case. If your workload UPDATEs a row 99% of the time and INSERTs a row only 1% of the time, then optimal implementation will look different than if the UPDATE and INSERT frequency is reversed.

Check out these links (thanks Aaron Bertrand)

July 16, 2017

10 Things I Learned While Working At D2L

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication — Michael J. Swart @ 3:49 pm

Today marks my tenth anniversary working for D2L and it also marks ten years since I specialized in databases full time.
I’ve never worked anywhere this long. I always told myself that I’d stick around as long as there were new and interesting challenges to tackle and there’s no end in sight.

So I thought I’d compile a list of things I didn’t know in 2007.

1. Database Performance Improvements Are Often Dramatic

A poorly performing database query can run orders of magnitude slower than a tuned query. Compare that to poorly performing application code. Improvements there are usually more modest.

But this can work out. It makes you look like a genius. You get to brag how that report that once took 32 hours now takes 10 minutes. Even though that missing index suggestion in the query plan turned out to be right this time, take the credit, you deserve it.

2. The Community Has Your Back

Speaking of appearing like a genius, here are three resources I still use when solving problems.

3. People Love Pictures With Their Content

I’ve been writing here for nearly ten years. One day, I decided to include an illustration with each blog post. For some reason it makes each post just a little more compelling than a wall of text. Drawing has been a very rewarding hobby for me. I highly recommend it.

4. There Are Dozens of Walkable Lunch Options In Downtown Kitchener

Here are a few of my favorites

  • Kava Bean: Excellent all day breakfast
  • City Cafe: Awesome bagels in the morning.
  • Darlise: You have to hunt for it, but it’s great food
  • Holy Guacamole: Especially if you’re a fan of cilantro

If you’re in town, send me an email. I’ll introduce you to one of these places.

5. Don’t Overestimate Normal Forms

Knowing normal forms is a lot less useful than I thought it would be. Normal forms are an academic way of describing a data model and it’s ability to avoid redundant data. But knowing the difference between the different normal forms has never helped me avoid trouble.

Just use this rule of thumb: Avoid storing values in more than one place. That will get you most of the way. Do it right and you’ll never hear anyone say things like

  • We need a trigger to keep these values in sync
  • We need to write a script to correct bad data

6. “How Do You Know That?”

This is one of my favorite lines. I use it when troubleshooting a technical problem. It’s a line that helps me avoid red-herrings or to maybe find out if that info is coming via telephone game. It’s kind of my version of Missouri’s “you’ve got to show me”. It might go something like this:

Say a colleague says “I/O is slow lately.”

So I’d ask “How do you know?”

They might say something like:

  • “Hinky McUnreliable thought that’s what it was. He told me in the hallway just now”
  • The error log reports 3 instances of I/O taking longer than 15 seconds
  • I measured it on the QA box.

7. Reduce Query Frequency

The impact of a particular procedure or query on a server is simply the frequency of a query multiplied by the cost of that query. To reduce the impact on the server it’s often more valuable to focus on reducing the number of calls to that query in addition to tuning it.

  • Look for duplicate queries. You’d be surprised how often a web page asks for the same information.
  • Cache the results of queries that are cache-able. I rely on this one heavily.
  • Consider techniques like lazy loading. Only request values the moment they’re needed.

8. Schema Drift Is Real

If you have multiple instances of a database then schema will drift over time. In other words, the definition of objects like tables, views and procedures will be out of alignment with what’s expected. The drift is inevitable unless you have an automated process to check for misalignment.

We wrote our own but tools like Redgate’s Schema Compare can help with this kind of effort.

9. Get Ahead Of Trouble

Software will change over time and it requires deliberate attention and maintenance to remain stable. Watch production and look for expensive queries before they impact end users.

When I type it out like that, it seems obvious. But the question really then becomes, how much time do you want to invest in preventative maintenance? My advice would be more.

Not enough time means that your team gets really good practice firefighting urgent problems. I work directly with a dream team of troubleshooters. They’re smart, they’re cool during a crisis and they’re awesome at what they do. Fixing issues can be thrilling at times, but it pulls us away from other projects.

And what do they say about people who can’t remember the past? Sometimes we have to relearn this lesson every few years.

10. D2L Is A Great Employer

D2L values learning of course and that includes employee professional development. They have always supported my blogging and speaking activities and they recognize the benefits of it.

The people I work with are great. I couldn’t ask for a better team.

And we’ve got a decent mission. I’m really glad to be part of a team that works so hard to transform the way the world learns. Thanks D2L!

May 5, 2017

Drawing (again) with SQL Server

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Tongue In Cheek — Michael J. Swart @ 10:41 am

You have my permission to skip this post. This one’s just for me. So I’ve been drawing again with SQL Server’s spatial results tab, the first time I posted something was with Botticelli’s Birth of Venus in More images from the Spatial Results Tab.

Why Michael??

Because it’s a stupid challenge and I wanted to see what I could do with it. The SSMS spatial tab is a lousy crummy medium for images. It really is quite terrible and using SSMS to draw imposes restrictions and rules. It’s fun to see what I can do by staying within that framework. It’s something to push against just because it’s challenging. Others do crosswords, This week, I did this.

Why Now?

I realized a couple things lately.

The Colors Seem Dull … But Don’t Have to Be
I used to think the spatial results tab uses lousy colors, pastel and dull. I realized that they’re not dull, they’re just transparent. I can overlap polygons inside a geometry collection to get more solid colors. Here are the top 100 colors without transparency.


The Colors Seem Arbitrary … But Don’t Have to Be
The palette that SSMS uses is terrible. It’s almost as if the nth color is chosen using something like Color.FromArgb(new Random(n).Next()); Notice that color 6 and 7 (the beige colors on the left side of the grid) are almost indistinguishable from each-other. But I can use that. I can overlap different colors to get the color I need. And I can write a program to pick the best combination of overlaps. Here’s a nice red and blue:


But black remains difficult.

Curves Are Supported Now
I can use arc segments called CIRCULARSTRING. SVG files mostly use Bézier curves which cannot be translated easily to arc segments.
Here’s a logo that I rebuilt using arcs instead of Bézier curves:

For some reason, if you begin to use CIRCULARSTRING, then the transparent colors won’t blend with itself (just other colors).
Also arc segments are rendered as several small line segments anyway, so for my purposes, it’s not a super feature.


One last picture/query of a scarlet macaw. Click on it or any other picture in this post to get the query that generated it.

April 18, 2017

Find Your Dark Queries

Filed under: Miscelleaneous SQL,SQL Scripts,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 11:36 am

My friend Josh came up with the term “dark queries”. Just like “dark matter”, dark queries can’t be detected directly, but their effect can still be observed. He’s talking about queries that aren’t stored in cache. If your monitoring solution relies heavily on the statistics of cached queries, then you may not be capturing all the activity on your server.

Obi-Wan never told you what happened to those queries

Some of my favorite monitoring solutions rely on the cached queries:

but some queries will fall out of cache or don’t ever make it into cache. Those are the dark queries I’m interested in today. Today let’s look at query recompiles to shed light on some of those dark queries that maybe we’re not measuring.

By the way, if you’re using SQL Server 2016’s query store then this post isn’t for you because Query Store is awesome. Query Store doesn’t rely on the cache. It captures all activity and stores queries separately – Truth in advertising!

High Recompile Rate?

If you work with a high-frequency transactional workload like I do, then you can’t afford the CPU required for frequent recompiles. If you have sustained recompiles larger than a few hundred per second, that’s probably too much. It’s easy to check. Use the performance monitor to take a look at the SQL Re-Compilations/sec counter which is found in SQLServer:SQL Statistics/sec.

Way too many recompiles

Drill Into Recompile Causes
You can drill into this a little further with an extended event session stored to a histogram like this:

  ADD EVENT sqlserver.sql_statement_recompile
  ADD TARGET package0.histogram (
      SET filtering_event_name=N'sqlserver.sql_statement_recompile',
          source_type=(0) );

Query it like this

SELECT sv.subclass_name as recompile_cause, 
  FROM sys.dm_xe_session_targets AS xet  
  JOIN sys.dm_xe_sessions AS xe  
       ON (xe.address = xet.event_session_address)  
 CROSS APPLY ( SELECT CAST(xet.target_data as xml) ) as target_data_xml ([xml])
 CROSS APPLY target_data_xml.[xml].nodes('/HistogramTarget/Slot') AS nodes (slot_data)
         SELECT nodes.slot_data.value('(value)[1]', 'int') AS recompile_cause,
                nodes.slot_data.value('(@count)[1]', 'int') AS recompile_count
       ) as shredded
  JOIN sys.trace_subclass_values AS sv
       ON shredded.recompile_cause = sv.subclass_value
 WHERE xe.name = 'Recompile_Histogram' 
   AND sv.trace_event_id = 37 -- SP:Recompile

To get results like this:

Infrequent Recompiles?

Even if you don’t have frequent recompiles, it’s important to know what’s going on in the server. On your OLTP database, I bet you’re reluctant to let those BI folks run their analytical queries whenever they want. They may try to hide their shenanigans with an OPTION(RECOMPILE) hint. This extended events captures those query recompiles. Be sure to only capture a small sample by having the session run for a small amount of time. Or you can use the histogram above to make sure that the frequency is low.

  ADD EVENT sqlserver.sql_statement_recompile(
    WHERE ([recompile_cause]=(11))) -- Option (RECOMPILE) Requested
  ADD TARGET package0.event_file(SET filename=N'DarkQueries');

Take a look at the dark queries by executing this:

SELECT DarkQueryData.eventDate,
       DB_NAME(DarkQueryData.database_id) as DatabaseName,
                OBJECT_NAME(DarkQueryData.object_id, DarkQueryData.database_id)) command,
  FROM sys.fn_xe_file_target_read_file ( 'DarkQueries*xel', null, null, null) event_file_value
 CROSS APPLY ( SELECT CAST(event_file_value.[event_data] as xml) ) event_file_value_xml ([xml])
         SELECT event_file_value_xml.[xml].value('(event/@timestamp)[1]', 'datetime') as eventDate,
                event_file_value_xml.[xml].value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') as sql_text,
                event_file_value_xml.[xml].value('(event/data[@name="object_type"]/text)[1]', 'nvarchar(100)') as object_type,
                event_file_value_xml.[xml].value('(event/data[@name="object_id"]/value)[1]', 'bigint') as object_id,
                event_file_value_xml.[xml].value('(event/data[@name="source_database_id"]/value)[1]', 'bigint') as database_id,
                event_file_value_xml.[xml].value('(event/data[@name="recompile_cause"]/text)[1]', 'nvarchar(100)') as recompile_cause
       ) as DarkQueryData
 ORDER BY eventDate DESC

This gives you results that look something like the following contrived example:

February 22, 2017

A Program to Find INSERT Statements That Don’t Specify Columns

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am

I’ve got a program here that finds SQL in procedures with missing column specifications.

Table Columns Are Ordered

Even though they shouldn’t be.

Unlike mathematical relations, SQL tables have ordered columns, but please don’t depend on it.
In other words, try to treat these tables as the same because it’s super-awkward to turn one into the other:

    LastName varchar(200),
    FirstName varchar(200)
    FirstName varchar(200),
    LastName varchar(200)

Don’t Omit Column Specification

And don’t forget to specify the columns in your INSERT statement. No excuses.

All the columns!

You’re depending on the column ordering if you write INSERT statements like this:

INSERT PEOPLE /* no column spec */
VALUES ('Rob', 'Farley'),
       ('Angela', 'Henry'),
       ('Andy', 'Leonard'),
       ('Richard', 'Douglas'),
       ('David', 'Maxwell'),
       ('Aaron', 'Nelson'),
       ('Paul', 'Randal'),
       ('Buck', 'Woody');

We recently got burned by something like this 🙁

Find Missing Column Specifications

Thomas LaRock recently encouraged DBAs to branch out horizontally. In that spirit, don’t be too afraid of the C#. I’ve got a program here that finds procedures with missing column specifications.

  • If for some reason, you don’t care about enforcing this rule for temp tables and table variables, then uncomment the line // visitor.TolerateTempTables = true;
  • It uses ScriptDom which you can get from Microsoft as a nuget package.
  • The performance is terrible in Visual Studio because ScriptDom uses Antlr which uses exceptions for flow control and this leads to lots of “first chance exceptions” which slows down debugging. Outside of Visual Studio, it’s just fine.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Microsoft.SqlServer.TransactSql.ScriptDom;
class Program {
    static void Main(string[] args) {
        SqlConnectionStringBuilder builder = new SqlConnectionStringBuilder {
            DataSource = ".",
            InitialCatalog = "test_database",
            IntegratedSecurity = true
        using (SqlConnection conn = new SqlConnection(builder.ToString())) {
            SqlCommand command = new SqlCommand(@"
                SELECT OBJECT_SCHEMA_NAME(object_id) [schema], 
                       OBJECT_NAME(object_id)        [procedure], 
                       OBJECT_DEFINITION(object_id)  [sql]
                  FROM sys.procedures 
                 ORDER BY OBJECT_SCHEMA_NAME(object_id), OBJECT_NAME(object_id) ;", conn);
            SqlDataReader reader = command.ExecuteReader();
            while (reader.Read()) {
                string schema = reader["schema"].ToString();
                string procedure = reader["procedure"].ToString();
                string sql = reader["sql"].ToString();
                if (SqlHasInsertWithoutColumnList(sql)) {
                    Console.WriteLine( $"{schema}.{procedure}" );
    static bool SqlHasInsertWithoutColumnList(string SQL) {
        SQLVisitor visitor = new SQLVisitor();
        // visitor.TolerateTempTables = true;
        TSql130Parser parser = new TSql130Parser(true);
        IList<ParseError> errors;
        var fragment = parser.Parse(new System.IO.StringReader(SQL), out errors);
        return visitor.HasInsertWithoutColumnSpecification;
internal class SQLVisitor : TSqlFragmentVisitor {
    public bool HasInsertWithoutColumnSpecification { get; set; }
    public bool TolerateTempTables { get; set; }
    public override void ExplicitVisit(InsertStatement node) {
        if (node.InsertSpecification.Columns.Any())
        var source = node.InsertSpecification.InsertSource as ValuesInsertSource;
        if (source != null && source.IsDefaultValues)
        if (TolerateTempTables) {
            var target = node.InsertSpecification.Target as NamedTableReference;
            if (target != null && !target.SchemaObject.BaseIdentifier.Value.StartsWith("#")) {
                HasInsertWithoutColumnSpecification = true;
        } else {
            HasInsertWithoutColumnSpecification = true;

In my environment, I found twelve examples which I’ll be fixing soon.

February 14, 2017

Generate Permutations Fast using SQL

Filed under: Miscelleaneous SQL — Michael J. Swart @ 9:49 am

If you google “generating permutations using SQL”, you get thousands of hits. It’s an interesting problem if not very useful.
I wrote a solution recently and thought I’d share it. If you’re keen, try tackling it yourself before moving on.

My Solution

Notice the use of recursive CTEs as well as bitmasks and the exclusive or operator (^).

with Letters as 
  select letter 
    from ( values ('a'), ('b'), ('c'), ('d'), ('e'), ('f'), ('g'), ('h'), ('i') ) l(letter) 
Bitmasks as 
  select cast(letter as varchar(max)) as letter, 
         cast(power(2, row_number() over (order by letter) - 1) as int) as bitmask 
    from Letters 
Permutations as
  select letter as permutation,
    from Bitmasks
  union all
  select p.permutation + b.letter,
         p.bitmask ^ b.bitmask
    from Permutations p
    join Bitmasks b
         on p.bitmask ^ b.bitmask > p.bitmask
select permutation
  from Permutations
 where bitmask = power(2, (select count(*) from Letters)) - 1

362880 rows (9!) in less than ten seconds. Let me know what you come up with.

January 16, 2017

Case study: Troubleshooting Doomed Transactions

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 8:00 am
Part of this post discusses the topic of UPSERT. For a recent, comprehensive post on UPSERT, see SQL Server UPSERT Patterns and Antipatterns

Another recap of a problem we recently tackled where we had to find out why a doomed transaction was still trying to do work. A doomed transaction is one that is about to roll back. In our case, it was caused by a rare set of circumstances:

If you don’t feel like reading ahead, here are the lessons. Don’t program in T-SQL. Avoid triggers. Extended events are awesome.

Getting Error 3930

Our app was getting an error coming from SQL Server. Error 3930 gives this message:

Msg 3930, Level 16, State 1, Line 14
The current transaction cannot be committed and cannot support operations that
write to the log file. Roll back the transaction.

Second most pessimistic robot in the universe

Google pointed me to this Stackoverflow question where Remus Rusanu talks about doomed transactions. You’re not allowed to do work in a catch block if your transaction is doomed. He then points to more information about uncommittable transactions in the TRY/CATCH docs. According to the TRY/CATCH docs, you can get this kind of error when XACT_ABORT is turned ON. This is something we do and I wrote about why in Don’t Abandon Your Transactions.

So there’s a CATCH block catching some error in a TRY block. Now I wonder what that error is. The SqlException that got raised to the client only had the 3930 error above. It had nothing about the original error. I did have the procedure name though. I can look there.

Complicated Procedure

So I took a look at the procedure getting called. I saw that it was calling another procedure. And that procedure called other procedures. It was a big complicated mess. The calling tree looked like this:

Something Complicated

Something Complicated

So the procedure was complicated and it used explicit transactions, but I couldn’t find any TRY/CATCH blocks anywhere! What I needed was a stack trace, but for T-SQL. People don’t talk about T-SQL stack traces very often. Probably because they don’t program like this in T-SQL. We can’t get a T-SQL stack trace from the SQLException (the error given to the client), so we have to get it from the server.

More Info Needed From the Server

So luckily, extended events do have T-SQL stack traces. I wanted to look at those stack traces whenever there was an error on the server. My session looked something like this:

ADD EVENT sqlserver.error_reported(
ADD TARGET package0.event_file(SET filename=N'errors')

Then I waited

Eventually It Failed Again

First I looked for the error. The one where error_code = 3930. I did it using Management Studio to view the session’s data.
Then I looked at the error immediately before it. The one where is_intercepted = true. That looks something like this


The T-SQL stack trace for that error is formatted as xml

  <frame level="1" handle="0x03001C021AD32B677F977801C8A6000001000000000000000000000000000000000000000000000000000000" line="17" offsetStart="634" offsetEnd="808" />
  <frame level="2" handle="0x03001C02A9D0D86D90977801C8A6000000000000000000000000000000000000000000000000000000000000" line="8" offsetStart="342" offsetEnd="582" />
  <frame level="3" handle="0x03001C02202218770CA27801C8A6000001000000000000000000000000000000000000000000000000000000" line="56" offsetStart="2822" offsetEnd="3218" />

It doesn’t look too useful at first glance but you can get the actual SQL query using sys.dm_exec_sql_text.

select * from sys.dm_exec_sql_text(0x03001C021AD32B677F977801C8A6000001000000000000000000000000000000000000000000000000000000)

The Original Error

Here’s what I found. The original error was a PK violation in a procedure using the Just Do It (JDI) pattern. It’s a technique which tackles upsert concurrency problems by inserting a row and catching and suppressing any PK errors that might be thrown.
So that’s where the TRY/CATCH block was.

Also, the stack trace told me that the procedure was called from a trigger. So my complicated procedure was actually more complicated than I thought. It actually looked like this

Something More Complicated

Something More Complicated

That’s why i didn’t see it before. I hate the way triggers can hide complexity. They’re sneaky.

The CATCH block catches a real error and the XACT_ABORT setting dooms the transaction. Because I didn’t do anything with the error, the transaction was allowed to continue. It was actually some query in s_ProcM that attempted to do work inside the doomed transaction.

To fix, we adjusted the text of the query to be concurrent without using TRY/CATCH for flow control. For the extra curious, we used method 4 from Mythbusting: Concurrent Update/Insert Solutions.


  • Don’t use the Just Do It (JDI) upsert pattern inside triggers
  • In fact don’t use TRY/CATCH for flow control
  • And also don’t use triggers
  • In fact don’t program inside SQL Server if you can help it
  • Oh, and Extended Events include a T-SQL stack trace if God forbid you need it

December 22, 2016

That Time We Fixed Prod Without Admin Credentials

Filed under: Miscelleaneous SQL,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 9:00 am

Merry Christmas readers! It’s story time. This is about a problem I encountered a few weeks ago. We were looking at a production site using sp_whoisactive and we noticed a lot of blocking on one particular procedure. I’m going to explain how we tackled it.

In this case, I think it’s interesting that we were able to mitigate the problem without requiring sysadmin access.

The Symptoms

Using sp_whoisactive and other tools, we noticed several symptoms.

  • SQLException timeout errors were reported by the app when calling one procedure in particular.
  • Many sessions were executing that procedure concurrently. Or at least they were attempting to.
  • There was excessive blocking and the lead blocker was running the same procedure.
  • The lead blocker had been running the longest (about 29 seconds)
  • The blocking was caused by processes waiting on Sch-M locks for a table used by that query

Here’s what was going on:
SQL Server was struggling to compile the procedure in time and the application wouldn’t let it catch its breath. The query optimizer was attempting to create statistics automatically that it needed for optimizing the query, but after thirty seconds, the application got impatient and cancelled the query.

So the compilation of the procedure was cancelled and this caused two things to happen. First, the creation of the statistics was cancelled. Second, the next session in line was allowed to run. But the problem was that the next session had already spent 28 seconds blocked by the first session and only had two seconds to try to compile a query before getting cancelled itself.

The frequent calls to the procedure meant that nobody had time to compile this query. And we were stuck in an endless cycle of sessions that wanted to compile a procedure, but could never get enough time to do it.

Why was SQL Server taking so long to compile anyway?
After a bunch of digging, we found out that a SQL Server bug was biting us. This bug involved

  • SQL Server 2014
  • Trace flag 2389 and 2390
  • Filtered Indexes on very large base tables

Kind of a perfect storm of factors that exposed a SQL Server quirk that caused long compilation times, timeouts and pain.

What We Did About It

Well, in this case, I think that the traceflag 2389, 2390 kind of outlived its usefulness (the cure is worse than the disease and all that). So the main fix for this problem is to get rid of those traceflags. But it would be some time before we could get that rolled out.

So for the short term, we worked at getting that procedure compiled and into SQL Server’s cache.

We called the procedure ourselves in Management Studio. Our call waited about thirty seconds before it got its turn to run. Then it spent a little while to compile and run the procedure. Presto! The plan is in the cache now! And everything’s all better right? Nope. Not quite. The timeouts continued.

If you’ve read Erland Sommarskog’s Slow in the Application, Fast in SSMS you may have guessed what’s going on. When we executed the procedure in SSMS, it was using different settings. So the query plan we compiled couldn’t be reused by the application. Remember, all settings (including ARITHABORT) need to match before cached plans can be reused by different sessions. We turned ARITHABORT off in SSMS and called the procedure again.

After a minute, the query completed and all blocking immediately stopped. Whew! The patient was stable.

The whole experience was a pain. And an outage is an outage. Though the count of the snags for the year had increased …

the Grinch had warded off downtime at least

November 21, 2016

Do You Have Check Constraints On Your Start and End Dates?

Filed under: Miscelleaneous SQL,SQL Scripts,SQLServerPedia Syndication,Technical Articles — Michael J. Swart @ 10:13 am

Adam Machanic tweeted this advice last week:

Are you missing any of these check constraints? Run this query to check.
This query looks for any columns in the same table that begin with “Start” and “End”. It then looks for check constraints that reference both these columns. If it doesn’t find them, it suggests a check constraint.

WITH StartColumnNames AS
    SELECT object_id,
           name AS column_name
      FROM sys.columns
     WHERE name like 'start%'
EndColumnNames AS
    SELECT object_id,
           name AS column_name
      FROM sys.columns
     WHERE name like 'end%'
SELECT t.object_id,            
       OBJECT_SCHEMA_NAME(t.object_id) AS [schema_name], 
       t.[name] AS table_name,
       s.column_name AS start_column,
       e.column_name AS end_column,
       N'ALTER TABLE ' + QUOTENAME(OBJECT_SCHEMA_NAME(t.object_id)) + N'.' + QUOTENAME(t.name) + 
           N' ADD CONSTRAINT ' + 
           QUOTENAME(N'CK_' + t.name + N'_' + s.column_name + N'_' + e.column_name) + 
           N' CHECK (' + QUOTENAME(s.column_name) + N' <= ' + QUOTENAME(e.column_name) + N');' as check_suggestion
  FROM StartColumnNames s
  JOIN EndColumnNames e
       ON s.object_id = e.object_id
       AND s.column_id <> e.column_id
       AND REPLACE(s.column_name, 'start', 'end') = e.column_name        
  JOIN sys.tables t
       ON t.object_id = s.object_id
           SELECT *
             FROM sys.check_constraints c
             JOIN sys.sql_expression_dependencies start_dependency
                  ON start_dependency.referencing_id = c.object_id
                  AND start_dependency.referenced_id = t.object_id
                  AND start_dependency.referenced_minor_id = s.column_id
             JOIN sys.sql_expression_dependencies end_dependency
                  ON end_dependency.referencing_id = c.object_id
                  AND end_dependency.referenced_id = t.object_id
                  AND end_dependency.referenced_minor_id = e.column_id
            WHERE c.parent_object_id = t.object_id


Don’t blindly run scripts that you got from some random guy’s blog. Even if that someone is me. That’s terribly irresponsible.

But this query may be useful if you do want to look for a very specific, simple kind of constraint that may match your business specs. These constraints are just suggestions and may not match your business rules. For example, when I run this query on Adventureworks, I get one “missing” check constraint for HumanResources.Shift(StartTime, EndTime) and when I look at the contents of the Shift table, I get this data:
Notice that I can’t create a constraint on this table because of the night shift. The constraint doesn’t make sense here.

Creating constraints on existing tables may take time if the table is huge. Locks may be held on that table for an uncomfortably long time.

Of course if your table has data that would violate the constraint, you can’t create it. But now you have to make some other choices. You can correct or delete the offending data or you can add the constraint with NOCHECK.

September 23, 2016

Build Your Own Tools

There are so many ways to look inside SQL Server. New extended events and dynamic management views are introduced every version. But if you want to collect something that’s unavailable, with a little bit of creativity, you can create your own tools.

Inspired by Richie Rump’s latest project, I’m going to describe a tool I wrote and the process I took to get there.

The Problem To Tackle

I had a deadlock graph I wanted to tackle, but I was having trouble reproducing it. I needed to know more about the queries involved. But the query plans were no longer in cache. So here’s the problem

Can I collect the execution plans that were used for the queries involved in a deadlock graph?

I want to use that information to reproduce – and ultimately fix – the deadlock.

The Right Tool For the Job

If you don’t have enough data to get to the root cause of an issue, put something in place for next time.

Can I Use Out-Of-The-Box Extended Events?
I’m getting used to extended events and so my first thought was “Is there a query plan field I can collect with the deadlock extended event? There is not. Which isn’t too surprising. Deadlock detection is independent of any single query.

So How Do I Get To The Execution Plans?
So when I look at a deadlock graph, I can see there are sql_handles. Given that, I can grab the plan_handle and then the query plan from the cache, but I’m going to need to collect it automatically at the time the deadlock is generated. So I’m going to need

  • XML shredding skills
  • Ability to navigate DMVs to get at the cached query plans
  • A way to programatically respond to deadlock graph events (like an XE handler or a trigger)

Responding Automatically to Extended Events
This is when I turned to #sqlhelp. And sure enough, six minutes later, Dave Mason helped me out:

I had never heard of Event Notifications, so after some googling, I discovered two things. The first thing is that I can only use Event Notifications with DDL or SQLTrace events rather than the larger set of extended events. Luckily deadlock graphs are available in both. The second thing is that Event Notifications aren’t quite notifications the way alerts are. They’re a way to push event info into a Service Broker queue. If I want automatic actions taken on Service Broker messages, I have to define and configure an activation procedure to process each message. In pictures, here’s my plan so far:

My Plan

Will It Work?
I think so, I’ve had a lot of success creating my own tools in the past such as

Also, looking at the different pieces, they’re probably robust enough to do what I need them to do. One surprise involved security concerns for activated procedures that need to access system views

The Deadlock Logging Script

Here’s the tool!

Create the Database

USE master;
IF (DB_ID('DeadlockLogging') IS NOT NULL)
    DROP DATABASE DeadlockLogging;

Create the Service Broker Objects
I’ve never used Service Broker before, so a lot of this comes from examples found in Books Online.

use DeadlockLogging;
CREATE QUEUE dbo.LogDeadlocksQueue;
CREATE SERVICE LogDeadlocksService
    ON QUEUE dbo.LogDeadlocksQueue 
CREATE ROUTE LogDeadlocksRoute
    WITH SERVICE_NAME = 'LogDeadlocksService',
-- add server level notification
IF EXISTS (SELECT * FROM sys.server_event_notifications WHERE [name] = 'LogDeadlocks')
    FOR deadlock_graph -- name of SQLTrace event type
    TO SERVICE ''LogDeadlocksService'', ''' + CAST(service_broker_guid as nvarchar(max))+ ''';'
FROM sys.databases 
WHERE [name] = DB_NAME();
EXEC sp_executesql @SQL;

The dynamic SQL is used to fetch the database guid of the newly created database.

a Place to Store Deadlocks

-- Create a place to store the deadlock graphs along with query plan information
CREATE SEQUENCE dbo.DeadlockIdentity START WITH 1;
CREATE TABLE dbo.ExtendedDeadlocks 
    DeadlockId bigint not null,
    DeadlockTime datetime not null,
    SqlHandle varbinary(64),
    StatementStart int,
    [Statement] nvarchar(max) null,
    Deadlock XML not null,
    FirstQueryPlan XML
    ON dbo.ExtendedDeadlocks(DeadlockTime, DeadlockId);

The Procedure That Processes Queue Messages

CREATE PROCEDURE dbo.ProcessDeadlockMessage
    WAITFOR ( 
        RECEIVE TOP(1)
            @RecvMsgTime = message_enqueue_time,
            @RecvMsg = message_body
        FROM dbo.LogDeadlocksQueue
    ), TIMEOUT 5000;
    IF (@@ROWCOUNT = 0)
    DECLARE @DeadlockId BIGINT = NEXT VALUE FOR dbo.DeadlockIdentity;
    DECLARE @RecsvMsgXML XML = CAST(@RecvMsg AS XML);
    DECLARE @DeadlockGraph XML = @RecsvMsgXML.query('/EVENT_INSTANCE/TextData/deadlock-list/deadlock');
    WITH DistinctSqlHandles AS
        SELECT DISTINCT node.value('@sqlhandle', 'varchar(max)') as SqlHandle
        FROM @RecsvMsgXML.nodes('//frame') AS frames(node)            
    INSERT ExtendedDeadlocks (DeadlockId, DeadlockTime, SqlHandle, StatementStart, [Statement], Deadlock, FirstQueryPlan)
    SELECT @DeadlockId,
    FROM DistinctSqlHandles s
    LEFT JOIN sys.dm_exec_query_stats qs 
        on qs.sql_handle = CONVERT(VARBINARY(64), SqlHandle, 1) 
    OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
    OUTER APPLY sys.dm_exec_sql_text (CONVERT(VARBINARY(64), SqlHandle, 1)) st
      SELECT SUBSTRING(st.[text],(qs.statement_start_offset + 2) / 2,
             WHEN qs.statement_end_offset = -1  THEN LEN(CONVERT(NVARCHAR(MAX), st.text)) * 2
             ELSE qs.statement_end_offset + 2
             END - qs.statement_start_offset) / 2)) as sqlStatement([statement]);
    -- clean up old deadlocks
    SELECT DISTINCT TOP (500) @limit = DeadlockId 
    FROM ExtendedDeadlocks 
    ORDER BY DeadlockId DESC;
    DELETE ExtendedDeadlocks 
    WHERE DeadlockId < @limit;

Activating the Procedure

ALTER QUEUE dbo.LogDeadlocksQueue
    ( STATUS = ON,
      PROCEDURE_NAME = dbo.ProcessDeadlockMessage,

Clean Up
And when you’re all done, this code will clean up this whole experiment.

use master;
if (db_id('DeadlockLogging') is not null)
    alter database DeadlockLogging set single_user with rollback immediate 
    drop database DeadlockLogging
if exists (select * from sys.server_event_notifications where name = 'DeadlockLogging')
Older Posts »

Powered by WordPress