What To Avoid If You Want To Use MERGE

August 4, 2021

Filed under: SQLServerPedia Syndication — Michael J. Swart @ 1:06 pm

Aaron Bertrand has a post called Use Caution with SQL Server’s MERGE Statement. It’s a pretty thorough compilation of all the problems and defects associated with the MERGE statement that folks have reported in the past. But it’s been a few years since that post and in the spirit of giving Microsoft the benefit of the doubt, I revisited each of the issues Aaron brought up.

Some of the items can be dismissed based on circumstances. I noticed that:

Some of the issues are fixed in recent versions (2016+).
Some of the issues that have been marked as won’t fix have been fixed anyway (the repro script associated with the issue no longer fails).
Some of the items are complaints about confusing documentation.
Some are complaints are about issues that are not limited to the MERGE statement (e.g. concurrency and constraint checks).

So what about the rest? In what circumstances might I decide to use a MERGE statement? What do I still need to avoid? In 2019 and later, if I’m using MERGE, I want to avoid:

It’s a shorter list than Aaron’s but there’s another gotcha. The same way that some of the items get addressed with time, new issues continue to pop up. For example, temporal tables are a relatively new feature that weren’t a thing when Aaron’s first posted his list. And so I also want to avoid:

Temporal tables

If MERGE has trouble with old and new features , then it becomes clear that MERGE is a very complicated beast to implement. It’s not an isolated feature and it multiplies the number of defects that are possible.

Severity of Issues

There’s a large number of issues with a large variety of severity. Some of the issues are minor annoyances, or easily avoidable. Some of them are serious performance issues that are harder to deal with. But a few of the issues can be worse than that! If I ask SQL Server to UPDATE something, and SQL Server responds with (1 row affected) , then it better have affected that row! If it didn’t, then that’s a whole new level of severity.

That leads me to what I think is the worst, unfixed bug. To be safe from it, avoid MERGEs with:

DELETE operations on tables used by indexed views

The defect is called Merge statement Delete does not update indexed view in all cases. It’s still active as of SQL Server 2019 CU9 and it leaves your indexed view inconsistent with the underlying tables. For my case, I can’t predict what indexed views would be created in the future, so I would shorten my advice to say avoid MERGEs with:

DELETE operations

Conclusion

When I first started writing this post, I thought the gist was going to be “MERGE isn’t so bad if you just avoid these things”. But it still is bad. Aaron’s list is a little out of date but his advice is not. If Aaron updated that post today, the list of defects and issues with MERGE would be different and just as concerning to me as the day he wrote it.

So just to simplify my advice to you. Avoid:

MERGE

-- Comments (32)

32 Comments »

I laughed out loud at the bullet point at the end. Nicely done.

Comment by Brent Ozar — August 5, 2021 @ 12:50 am
I had another issue with MERGE last week. We recently changed the database to RCSI and also had the HOLDLOCK hint in our procedures. It look like this was causing DEADLOCKS (no other database activity at the time of deadlocks so it should be the merge). I removed the HOLDLOCK hint and changed the MERGE to an explicit transaction. This fixed the deadlocks, but I don’t want to jump into the conclusion this locking hit was het root cause. Do you think we need HOLDLOCK in case of an RCSI database?

Comment by Wilfred van Dijk — August 5, 2021 @ 4:35 am
Great evaluation !
btw: I just received an urgent call from one of our ETL responsibles, their system goes haywire and it is getting worse. Turns out they use MERGE in EVERY single process, because it is so “convenient”. We take it one beat at a time …

Comment by alzdba — August 5, 2021 @ 4:40 am
The solution is simple, don’t use indexed views. /s

Comment by Tom — August 5, 2021 @ 5:14 am
Dang, I was under the impression that almost everything was fixed and what was left was very minor, but this is still quite a lot of stuff. Thanks for the write up.

Comment by Alex Friedman — August 5, 2021 @ 5:34 am
[…] Michael J. Swart has important bullet points: […]

Pingback by Ways to avoid the MERGE Operator – Curated SQL — August 5, 2021 @ 8:00 am
@Wilfred
I’ve had a lot of experience with the features you’ve just mentioned: RCSI (on or off), Working with HOLDLOCKs to avoid deadlocks and the MERGE statement.
In my experience RCSI has only had the effect of reducing deadlocks.
Deadlocks are an interesting thing to troubleshoot and I can’t give you advice without knowing a *lot* more details. It sounds like a good question for DBA.stackexchange.com.

Comment by Michael J. Swart — August 5, 2021 @ 11:16 am
@alzdba
Databases that use MERGE statements can go haywire. But be careful not to turn it into a scapegoat.
What I’m saying is, I’d be wary of giving advice that sounds like “Avoid MERGE and your problems will go away”

Comment by Michael J. Swart — August 5, 2021 @ 11:22 am
@Michael J. Swart : great post.

Comment by Mitchell Wheat — August 9, 2021 @ 10:25 pm
To be honest, I use MERGE often, even with partitioned columnstore tables in a simple recovey DWH database and had no problems (SQL 2016 SP2 CU ).

If I look onto the bug regarding columnstore (https://feedback.azure.com/forums/908035-sql-server/suggestions/34791778-sql-server-2016-sp2-cui-bug-merge-and-insert-wi), I wonder, whyever someone would use such a silly syntax.

He does an INSERT INTO TWO SELECT … FROM (MERGE TWO…. OUTPUT)
So he puts the MERGE into a FROM clause (didn’t know that this is possible) AND is inserting the output of the MERGE into the same table which he was just MERGING. To be honest, I understand, that such a “concept” could cause errors, but since I’d never do something as this, I see no reason not to use MERGE on columnstores.
BTW: a simple MERGE INTO TWO … OUTPUT … INTO TWO would what I’d have done and – if this still causes error (because output target = merge target), I’d insert the output into a #temptable first

The simple dupicate key with filtered indexes is only for < SQL 2016 and the partitioning bug for partitioned tables with only a single partition (not splitted yet) and can't be tested, because it is only available in the webarchiv (where you can't access the attachments).

So I see no reason to not use MERGE when / where it would saves me time (depends as always) because I have to do insert / updates / deletes with the same (big) set of data and don't want to join the same tables three times.

Comment by Thomas Franz — August 10, 2021 @ 5:54 am
@Thomas Franz
Okay, granted, the partition bug has been resolved (I missed that link). And the syntax of the columnstore bug is bizarre and falls under the category of easily avoidable.
The other issues still concern me (like I said, especially that last one).

You’ve been using MERGE in your data warehouse without issues. Chances are you’re going to continue to see no further issues. For my own experience, about 10 years ago, I was affected by incorrect results found in a join in a MERGE statement. The specific issue I encountered has since been fixed but I’m still not convinced that MERGE is worth reconsidering, especially when there alternatives.

Comment by Michael J. Swart — August 10, 2021 @ 9:59 am
“In the spirit of giving Microsoft the benefit of the doubt”? I’m doing that less and less lately. In fact I’ve been considering moving our flagship product to PostgreSQL in a future release.

Comment by Dave Boltman — August 10, 2021 @ 7:32 am
If we stopped using features of SQL server that somebody somewhere thought was bad, we wouldn’t use it at all.

Comment by Todd Chittenden — August 10, 2021 @ 12:47 pm
@Todd, I expect you’re being a bit facetious, but what you said is not true at all. I mean no-one complains about SELECT/INSERT/UPDATE/DELETE.
Besides, features aren’t separated into good/bad, dangerous/safe. There’s varying degrees of value and varying degrees of risk.
For me I’ve been burned before by the MERGE statement before and I’m sharing my thoughts. I’m not by nature a pessimist or a grouch and the criticism of the MERGE statement is valid.

Comment by Michael J. Swart — August 10, 2021 @ 12:53 pm
Michael, thanks for the update. I’d like to offer another reason to avoid MERGE that is more managerial than technical. Many shops allow SQL to be modified by app developers or others who are not hard-core SQL gurus. Selecting an obscure feature to save a few lines of code is tempting, but may be counter-productive if it’s not easily understood by all who may subsequently touch it. I ignored that principle once to use MERGE in a stored procedure, only to regret it after a junior developer modified an input that caused an insert/delete recursion storm inside the MERGE in production. Keeping code simple and understandable is often more valuable than an elegant technical solution.

Comment by L. P. Hineman — August 11, 2021 @ 9:46 am
@ L.P. Hineman
Oh absolutely, I like to think of everything you wrote as “maintainability” and it’s an often overlooked value.
Simplicity is good for maintainability, it’s good for SQL Server’s query optimizer.
Also I would argue that simple and understandable code is exactly what makes a technical solution elegant. 🙂

Comment by Michael J. Swart — August 11, 2021 @ 11:05 am
My case *for* MERGE is that, for follow-on devs maintaining my original code, the intent is clear. But then again, I’ve only been bitten once (if the target needs filtering, do it upstream – use say a CTE – don’t do it in the join… plan is death). I’ll probably keep using MERGE, and then one day have horrible regret. 😉

Comment by Paul Holmes — August 11, 2021 @ 8:04 pm
[…] has a number of known bugs, and I’d generally advise to steer clear, but I am not aware of any alternative that would […]

Pingback by refactoring sql while loop to regular inserts – SQL — March 21, 2022 @ 6:35 am
All well and good but is someone going to recommend a pattern to use when presented with a data feed where incoming increments involve changes to existing rows?

Comment by Peter Nightingale — January 18, 2023 @ 6:14 am
[…] What To Avoid If You Want To Use MERGE […]

Pingback by UPDATE if exists else INSERT in SQL Server 2008 [duplicate] – Row Coding — August 1, 2023 @ 1:55 am
Not only do I make use of the merge statement, but I will point out the EF Core (.Net) is built on top of the merge statement. If you ever have it generate the SQL it will use, you will see it does all of its work with merge statements.

I use the merge statements without issue on databases in simple recovery mode and on partitioned tables (who would create a partitioned table with ONE partition? That is a reason not to use it?). A text column involved in a fulltext index updated via a MERGE statement on a partitioned table with a nonclustered primary key? Do you really create tables without a clustered PK? The majority of these issues seem to be very edge case based.

The only use case to me that is valid to avoid is in conjunction with indexed views.

and on a side note, I 100% agree with not using Temporal tables.

Comment by Keith MacDonald — August 8, 2023 @ 4:39 pm
Many of them are edge cases. Some are not (temporal tables which we use, indexed views which we use).

I’ve been burned by the merge statement in the past (a defect that has since been patched) I’m glad to hear you haven’t yet.

Comment by Michael J. Swart — August 8, 2023 @ 5:48 pm
[…] on documentation. And he dismissed items that were not actually about MERGE. And then he still had quite a list. Enough for him to end hist post with the advice to better just avoid MERGE […]

Pingback by An update on MERGE - SQLServerFast — September 29, 2023 @ 9:57 am
The only time we use MERGE is when we need to output values from 2 different tables when inserting data. So if we have:

INSERT INTO TableI
SELECT whatever
FROM TableA a
JOIN TableB b on whatever

And I need the new IDENT value from TableI and a corresponding value from TableB in a temp table for later use. I am not aware of any other way to accomplish this. Does anyone have an idea on how to make this happen?

Comment by Marc — October 5, 2023 @ 9:48 am

Hi Marc,
Doesn’t this work?

select 1 as Id, 'a' as Value
into #A;
 
select 1 as Id, 'b' as Value 
into #B
 
create table #C (
	Id int identity,
	AValue varchar(10), 
	BValue varchar(10)
)
 
create table #results
(
	CId int,
	AValue varchar(10), 
	BValue varchar(10)
)
 
insert into #C (AValue, BValue)
output inserted.Id, inserted.AValue, inserted.BValue
	into #results(CId, AValue, BValue)
select #A.Value, #B.Value
from #A
join #B on #A.Id = #B.Id
 
select * from #results

Comment by Michael J. Swart — October 5, 2023 @ 10:06 am

[…] of the MERGE operator in T-SQL. Hugo builds this on top of work that Aaron Bertrand pioneered and Michael J. Swart walked through a couple years […]

Pingback by Shop Talk: 2024-01-15 – Shop Talk with TriPASS — January 18, 2024 @ 6:57 am
[…] will leave the finer points of the problems with MERGE statements to the Michaels and Aarons of the SQL Server […]

Pingback by A Mild Annoyance With MERGE Statements And Triggers – Darling Data — May 20, 2024 @ 9:07 am
[…] will leave the finer points of the problems with MERGE statements to the Michaels and Aarons of the SQL Server […]

Pingback by MERGE Statements and Triggers – Curated SQL — May 22, 2024 @ 8:00 am
The issue is that one of the values we need to return is not a value that is being inserted.

Comment by Marc — January 3, 2025 @ 9:37 am
Is MERGE still a “proceed with caution” thing with 2022?

Comment by Marc — January 3, 2025 @ 9:38 am
Potentially you can also include data masking as another feature with possible issues https://stackoverflow.com/q/79483487/73226

Comment by Martin Smith — March 4, 2025 @ 10:23 am
Turns out there was no issue with data masking and all working as intended!

Comment by Martin Smith — March 4, 2025 @ 2:50 pm

RSS feed for comments on this post. TrackBack URL

Michael J. Swart

August 4, 2021