Update: This is an archived post on the topic of UPSERT. See my more recent and comprehensive post SQL Server UPSERT Patterns and Antipatterns
Yesterday I had my mind changed about the best way to do concurrency. I describe several methods in Mythbusting: Concurrent Update/Insert Solutions. My preferred method is to increase the isolation level and fine tune locks.
At least that was my preference. I recently changed my approach to use a method that gbn suggested in the comments. He describes his method as the “TRY CATCH JFDI pattern”. Normally I avoid solutions like that. There’s a rule of thumb that says developers should not rely on catching errors or exceptions for control flow. But I broke that rule of thumb yesterday.
By the way, I love the gbn’s description for the pattern “JFDI”. It reminds me of Shia Labeouf’s motivational video.
Okay, I’ll tell you the story.
The Original Defect
So there’s this table. It’s defined something like:
CREATE TABLE dbo.AccountDetails
(
Email NVARCHAR(400) NOT NULL
CONSTRAINT PK_AccountDetails PRIMARY KEY (Email),
Created DATETIME NOT NULL
CONSTRAINT DF_AccountDetails_Created DEFAULT GETUTCDATE(),
Etc NVARCHAR(MAX)
) |
CREATE TABLE dbo.AccountDetails
(
Email NVARCHAR(400) NOT NULL
CONSTRAINT PK_AccountDetails PRIMARY KEY (Email),
Created DATETIME NOT NULL
CONSTRAINT DF_AccountDetails_Created DEFAULT GETUTCDATE(),
Etc NVARCHAR(MAX)
)
And there’s a procedure defined something like:
CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
@Email NVARCHAR(400),
@Etc NVARCHAR(MAX)
)
AS
DECLARE @Created DATETIME;
DECLARE @EtcDetails NVARCHAR(MAX);
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SET XACT_ABORT ON
BEGIN TRAN
SELECT
@Created = Created,
@EtcDetails = Etc
FROM dbo.AccountDetails
WHERE Email = @Email
IF @Created IS NULL
BEGIN
SET @Created = GETUTCDATE();
SET @EtcDetails = @Etc;
INSERT INTO dbo.AccountDetails (Email, Created, Etc)
VALUES (@Email, @Created, @EtcDetails);
END
COMMIT
SELECT @Email as Email, @Created as Created, @EtcDetails as Etc |
CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
@Email NVARCHAR(400),
@Etc NVARCHAR(MAX)
)
AS
DECLARE @Created DATETIME;
DECLARE @EtcDetails NVARCHAR(MAX);
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SET XACT_ABORT ON
BEGIN TRAN
SELECT
@Created = Created,
@EtcDetails = Etc
FROM dbo.AccountDetails
WHERE Email = @Email
IF @Created IS NULL
BEGIN
SET @Created = GETUTCDATE();
SET @EtcDetails = @Etc;
INSERT INTO dbo.AccountDetails (Email, Created, Etc)
VALUES (@Email, @Created, @EtcDetails);
END
COMMIT
SELECT @Email as Email, @Created as Created, @EtcDetails as Etc
Applications executing this procedure were deadlocking with each other. If you’re keen, try to figure out why before reading ahead. It’s pretty close to the problem described in the Mythbusting post. Specifically this was method 3: increased isolation level.
Initial Fix
So I decided to fine tune locks. I added an UPDLOCK hint:
CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
@Email NVARCHAR(400),
@Etc NVARCHAR(MAX)
)
AS
DECLARE @Created DATETIME;
DECLARE @EtcDetails NVARCHAR(MAX);
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SET XACT_ABORT ON
BEGIN TRAN
SELECT
@Created = Created,
@EtcDetails = Etc
FROM dbo.AccountDetails WITH (UPDLOCK) WHERE Email = @Email
IF @Created IS NULL
BEGIN
SET @Created = GETUTCDATE();
SET @EtcDetails = @Etc;
INSERT INTO dbo.AccountDetails (Email, Created, Etc)
VALUES (@Email, @Created, @EtcDetails);
END
COMMIT
SELECT @Email as Email, @Created as Created, @EtcDetails as Etc |
CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
@Email NVARCHAR(400),
@Etc NVARCHAR(MAX)
)
AS
DECLARE @Created DATETIME;
DECLARE @EtcDetails NVARCHAR(MAX);
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SET XACT_ABORT ON
BEGIN TRAN
SELECT
@Created = Created,
@EtcDetails = Etc
FROM dbo.AccountDetails WITH (UPDLOCK)
WHERE Email = @Email
IF @Created IS NULL
BEGIN
SET @Created = GETUTCDATE();
SET @EtcDetails = @Etc;
INSERT INTO dbo.AccountDetails (Email, Created, Etc)
VALUES (@Email, @Created, @EtcDetails);
END
COMMIT
SELECT @Email as Email, @Created as Created, @EtcDetails as Etc
Bail Early If Possible
Okay, so this solution works. It’s concurrent and it performs just fine. I realized though that I can improve this further by avoiding the transaction and locks. Basically select the row and if it exists, bail early:
CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
@Email NVARCHAR(400),
@Etc NVARCHAR(MAX)
)
AS
DECLARE @Created DATETIME;
DECLARE @EtcDetails NVARCHAR(MAX);
SELECT @Created = Created, @EtcDetails = Etc FROM dbo.AccountDetails WHERE Email = @Email; IF (@Created IS NOT NULL) BEGIN SELECT @Email as Email, @Created as Created, @EtcDetails as Etc; RETURN; END
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SET XACT_ABORT ON
BEGIN TRAN
SELECT
@Created = Created,
@EtcDetails = Etc
FROM dbo.AccountDetails WITH (UPDLOCK)
WHERE Email = @Email;
IF @Created IS NULL
BEGIN
SET @Created = GETUTCDATE();
SET @EtcDetails = @Etc;
INSERT INTO dbo.AccountDetails (Email, Created, Etc)
VALUES (@Email, @Created, @EtcDetails);
END
COMMIT
SELECT @Email as Email, @Created as Created, @EtcDetails as Etc; |
CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
@Email NVARCHAR(400),
@Etc NVARCHAR(MAX)
)
AS
DECLARE @Created DATETIME;
DECLARE @EtcDetails NVARCHAR(MAX);
SELECT
@Created = Created,
@EtcDetails = Etc
FROM dbo.AccountDetails
WHERE Email = @Email;
IF (@Created IS NOT NULL)
BEGIN
SELECT @Email as Email, @Created as Created, @EtcDetails as Etc;
RETURN;
END
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
SET XACT_ABORT ON
BEGIN TRAN
SELECT
@Created = Created,
@EtcDetails = Etc
FROM dbo.AccountDetails WITH (UPDLOCK)
WHERE Email = @Email;
IF @Created IS NULL
BEGIN
SET @Created = GETUTCDATE();
SET @EtcDetails = @Etc;
INSERT INTO dbo.AccountDetails (Email, Created, Etc)
VALUES (@Email, @Created, @EtcDetails);
END
COMMIT
SELECT @Email as Email, @Created as Created, @EtcDetails as Etc;
Take A Step Back
Okay, this is getting out of hand. The query shouldn’t have to be this complicated.
Luckily I work with a guy named Chris. He’s amazing at what he does. He questions everything without being a nitpicker (there’s a difference). He read through the Mythbusters post and followed all the links in the comments. He asked whether gbn’s JFDI pattern wasn’t better here. So I implemented it just to see what that looked like:
CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
@Email NVARCHAR(400),
@Etc NVARCHAR(MAX)
)
AS
BEGIN TRY
INSERT INTO dbo.AccountDetails Email, Etc
SELECT @Email, @Etc
WHERE NOT EXISTS
( SELECT * FROM dbo.AccountDetails WHERE Email = @Email );
END TRY
BEGIN CATCH
-- ignore duplicate key errors, throw the rest.
IF ERROR_NUMBER() NOT IN (2601, 2627) THROW;
END CATCH
SELECT Email, Created, Etc
FROM dbo.AccountDetails
WHERE Email = @Email; |
CREATE PROCEDURE dbo.s_GetAccountDetails_CreateIfMissing
(
@Email NVARCHAR(400),
@Etc NVARCHAR(MAX)
)
AS
BEGIN TRY
INSERT INTO dbo.AccountDetails Email, Etc
SELECT @Email, @Etc
WHERE NOT EXISTS
( SELECT * FROM dbo.AccountDetails WHERE Email = @Email );
END TRY
BEGIN CATCH
-- ignore duplicate key errors, throw the rest.
IF ERROR_NUMBER() NOT IN (2601, 2627) THROW;
END CATCH
SELECT Email, Created, Etc
FROM dbo.AccountDetails
WHERE Email = @Email;
Look at how much better that looks! No elevated transaction isolation levels. No query hints. The procedure itself is half as long as it used to be. The SQL is so much simpler and for that reason, I prefer this approach. I am happy in this case to use error handling for control flow.
So I checked in the change and updated my pull request. Chris’s last comment before he approved the pull request was “Looks good. Ugly pragmatism FTW.”