Michael J. Swart

February 11, 2016

Ugly Pragmatism For The Win

Filed under: SQLServerPedia Syndication — Michael J. Swart @ 11:37 am

Yesterday I had my mind changed about the best way to do concurrency. I describe several methods in Mythbusting: Concurrent Update/Insert Solutions; my preferred method there was to increase the isolation level and fine-tune locks.

At least that was my preference. I recently changed my approach to use a method that gbn suggested in the comments. He describes his method as the “TRY CATCH JFDI pattern”. Normally I avoid solutions like that. There’s a rule of thumb that says developers should not rely on catching errors or exceptions for control flow. But I broke that rule of thumb yesterday.

By the way, I love gbn's name for the pattern, "JFDI". It reminds me of Shia LaBeouf's motivational video.

Okay, I’ll tell you the story.

The Original Defect

So there’s this table. It’s defined something like:

CREATE TABLE dbo.AccountDetails
(
  Email NVARCHAR(400) NOT NULL
    CONSTRAINT PK_AccountDetails PRIMARY KEY (Email),
  Created DATETIME NOT NULL
    CONSTRAINT DF_AccountDetails_Created DEFAULT GETUTCDATE(),
  Etc NVARCHAR(MAX)
)

And there’s a procedure defined something like:

CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
  @Email NVARCHAR(400),
  @Etc NVARCHAR(MAX)
)
AS
 
  DECLARE @Created DATETIME;
  DECLARE @EtcDetails NVARCHAR(MAX);
 
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
  SET XACT_ABORT ON
  BEGIN TRAN
 
    SELECT
      @Created = Created,
      @EtcDetails = Etc
    FROM dbo.AccountDetails
    WHERE Email = @Email
 
    IF @Created IS NULL
    BEGIN
      SET @Created = GETUTCDATE();
      SET @EtcDetails = @Etc;
      INSERT INTO dbo.AccountDetails (Email, Created, Etc)
      VALUES (@Email, @Created, @EtcDetails);
    END
 
  COMMIT
 
  SELECT @Email as Email, @Created as Created, @EtcDetails as Etc

Applications executing this procedure were deadlocking with each other. If you’re keen, try to figure out why before reading ahead. It’s pretty close to the problem described in the Mythbusting post. Specifically this was method 3: increased isolation level.
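Here's my reading of why the original procedure deadlocks. Under SERIALIZABLE, a SELECT for a row that doesn't exist takes a shared key-range lock on the gap where the row would go, and shared range locks are compatible with each other. (The lock names below are my interpretation of SQL Server's key-range locking, not captured from an actual deadlock graph.)

-- Two sessions, A and B, both run the procedure for the same missing @Email:
--
--   A: SELECT ... WHERE Email = @Email  -- takes a RangeS-S (shared range) lock
--   B: SELECT ... WHERE Email = @Email  -- RangeS-S is compatible with RangeS-S, so B gets one too
--   A: INSERT ...                       -- needs a RangeI-N (insert range) lock; blocked by B's RangeS-S
--   B: INSERT ...                       -- needs RangeI-N; blocked by A's RangeS-S -> deadlock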

Initial Fix

So I decided to fine tune locks. I added an UPDLOCK hint:

CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
  @Email NVARCHAR(400),
  @Etc NVARCHAR(MAX)
)
AS
 
  DECLARE @Created DATETIME;
  DECLARE @EtcDetails NVARCHAR(MAX);
 
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
  SET XACT_ABORT ON
  BEGIN TRAN
 
    SELECT
      @Created = Created,
      @EtcDetails = Etc
    FROM dbo.AccountDetails WITH (UPDLOCK)
    WHERE Email = @Email
 
    IF @Created IS NULL
    BEGIN
      SET @Created = GETUTCDATE();
      SET @EtcDetails = @Etc;
      INSERT INTO dbo.AccountDetails (Email, Created, Etc)
      VALUES (@Email, @Created, @EtcDetails);
    END
 
  COMMIT
 
  SELECT @Email as Email, @Created as Created, @EtcDetails as Etc

Bail Early If Possible

Okay, so this solution works. It's concurrent and it performs just fine: update locks are incompatible with each other, so two sessions can no longer both read the missing row and then race to insert it; one simply waits for the other. I realized, though, that I could improve this further by avoiding the transaction and locks when the row already exists. Basically, select the row and if it exists, bail early:

CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
  @Email NVARCHAR(400),
  @Etc NVARCHAR(MAX)
)
AS
 
  DECLARE @Created DATETIME;
  DECLARE @EtcDetails NVARCHAR(MAX);
 
  SELECT
    @Created = Created,
    @EtcDetails = Etc
  FROM dbo.AccountDetails
  WHERE Email = @Email;
 
  IF (@Created IS NOT NULL)
  BEGIN
    SELECT @Email as Email, @Created as Created, @EtcDetails as Etc;
    RETURN;
  END
 
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
  SET XACT_ABORT ON
  BEGIN TRAN
 
    SELECT
      @Created = Created,
      @EtcDetails = Etc
    FROM dbo.AccountDetails WITH (UPDLOCK)
    WHERE Email = @Email;
 
    IF @Created IS NULL
    BEGIN
      SET @Created = GETUTCDATE();
      SET @EtcDetails = @Etc;
      INSERT INTO dbo.AccountDetails (Email, Created, Etc)
      VALUES (@Email, @Created, @EtcDetails);
    END
 
  COMMIT
 
  SELECT @Email as Email, @Created as Created, @EtcDetails as Etc;

Take A Step Back

Okay, this is getting out of hand. The query shouldn’t have to be this complicated.
Luckily I work with a guy named Chris. He's amazing at what he does. He questions everything without being a nitpicker (there's a difference). He read through the Mythbusting post and followed all the links in the comments. He asked whether gbn's JFDI pattern wasn't better here. So I implemented it just to see what that looked like:

CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
  @Email NVARCHAR(400),
  @Etc NVARCHAR(MAX)
)
AS
  BEGIN TRY
    INSERT INTO dbo.AccountDetails (Email, Etc)
    SELECT @Email, @Etc
    WHERE NOT EXISTS 
      ( SELECT * FROM dbo.AccountDetails WHERE Email = @Email );
  END TRY
  BEGIN CATCH
    -- Ignore duplicate key errors (2601: unique index, 2627: PK / unique constraint); rethrow the rest.
    IF ERROR_NUMBER() NOT IN (2601, 2627) THROW;
  END CATCH
 
  SELECT Email, Created, Etc
  FROM dbo.AccountDetails 
  WHERE Email = @Email;

Look at how much better that looks! No elevated transaction isolation levels. No query hints. The procedure itself is half as long as it used to be. The SQL is so much simpler and for that reason, I prefer this approach. I am happy in this case to use error handling for control flow.
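For what it's worth, here's a quick call sketch (the address is just a placeholder): any number of sessions can race on the same email, at most one INSERT succeeds, and everyone gets the same row back.

-- Hypothetical usage; safe to run from many sessions at once:
EXEC dbo.s_GetAccountDetails_CreateIfMissing
    @Email = N'someone@example.com',
    @Etc   = N'some details';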

So I checked in the change and updated my pull request. Chris’s last comment before he approved the pull request was “Looks good. Ugly pragmatism FTW.”

10 Comments

  1. Sometimes, though, JFDI leads to worse performance overall, depending on what % of calls fail. Raising exceptions has substantial overhead. I showed this in a couple of posts:

    http://sqlperformance.com/2012/08/t-sql-queries/error-handling

    https://www.mssqltips.com/sqlservertip/2632/checking-for-potential-constraint-violations-before-entering-sql-server-try-and-catch-logic/

    Comment by Aaron Bertrand — February 11, 2016 @ 11:49 am

  2. You’re right Aaron, and we did test it.

    It turns out that in our case, the percent of calls that failed was 0 (when rounded to the nearest percent).
    I think you illustrate the point that as much as possible, evaluate things on a case-by-case basis over following rules-of-thumb.

    It’s also why we added the not-strictly-necessary WHERE NOT EXISTS clause.

    Comment by Michael J. Swart — February 11, 2016 @ 11:57 am

  3. i read this and the mythbusting article with great interest. thanks. question, though: in the mythbusting article, the problem at hand had an UPDATE component to it which is not present in this example. was the UPDATE component eliminated for simplicity? the reason i ask is because in the example as provided i’m wondering why you would catch and then literally ignore duplicate key errors rather than simply define the PRIMARY KEY with IGNORE_DUP_KEY=ON? Using IGNORE_DUP_KEY you could still adopt the JFDI ethos and simplify even further by removing even the TRY…CATCH. i remember reading about overhead associated with (ab)use of IGNORE_DUP_KEY, but seem to remember that being associated with lots of duplicate keys in a single large insert and not so much with singletons (unfortunately, i can’t seem to find the source).

    thanks for the article and for a very enjoyable and informative blog all around.

    Comment by mmiike — February 11, 2016 @ 8:14 pm

  4. Thanks for the comment Mike,

    I have to be honest, I come across the IGNORE_DUP_KEY feature so infrequently that I forgot it existed. It's true that in this case IGNORE_DUP_KEY would have worked like you describe.

    I actually would (slightly) prefer the TRY/CATCH method over the IGNORE_DUP_KEY method because it keeps the behavior with the code where it seems to belong. But that’s a personal preference.

    Thanks for the feedback!

    Comment by Michael J. Swart — February 11, 2016 @ 8:25 pm

  5. […] Sometimes, Michael J. Swart says, it’s better to just do it: […]

    Pingback by JDFI – Curated SQL — February 12, 2016 @ 8:01 am

  6. Nice Blog, well done.

    Comment by Paul Nielsen — February 15, 2016 @ 11:34 pm

  7. Michael,

    Nice post. I’m curious if a merge (without try catch) would perform as well and also avoid locking. Thanks.

    Comment by Drew — February 16, 2016 @ 10:35 am

  8. Hi Drew,

    Check out the original post I wrote: Mythbusting: Concurrent Update/Insert Solutions. In it I test out the MERGE statement and its concurrency.

    Although the MERGE statement is a single statement, that isn’t enough to avoid concurrency issues.
    In terms of performance, I’ve never heard of MERGE performing better than alternatives. I’m sure examples exist, but it hasn’t been my experience.

    Comment by Michael J. Swart — February 16, 2016 @ 11:22 am

  9. @Drew, not to steal any of Michael’s polite, Canadian thunder, but it is a common misconception that because MERGE is a single statement that it must be a single operation. Behind the scenes, it actually performs the same independent operations that we have traditionally written as separate statements ourselves. There are other concerns you should at least be aware of, I wrote about them here:

    http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/

    Comment by Aaron Bertrand — February 16, 2016 @ 11:22 am

  10. Hey Drew, listen to Aaron, he knows what he’s talking about. 🙂

    Comment by Michael J. Swart — February 16, 2016 @ 11:24 am
