Investigating an optimizer transformation that exposes a bug in SQL Server’s MERGE
implementation.
About This Blog
Wednesday 4 August 2010
An Interesting MERGE Bug
Wednesday 6 February 2013
Incorrect Results with Indexed Views
If you use MERGE
, indexed views and foreign keys, your queries might return incorrect results. Microsoft have released a fix for incorrect results returned when querying an indexed view. The problem applies to:
- SQL Server 2012
- SQL Server 2008 R2
- SQL Server 2008
The Knowledge Base article does not go into detail, or provide a reproduction script, but this blog post does.
Monday 10 December 2012
MERGE Bug with Filtered Indexes
A MERGE
statement can fail, and incorrectly report a unique key violation when:
- The target table uses a unique filtered index; and
- No key column of the filtered index is updated; and
- A column from the filtering condition is updated; and
- Transient key violations are possible
Monday 13 November 2023
Why Batch Mode Sort Spills Are So Slow
Batch mode sorting was added to SQL Server in the 2016 release under compatibility level 130. Most of the time, a batch mode sort will be much faster than the row mode equivalent.
This post is about an important exception to this rule, as recently reported by Erik Darling (video).
No doubt you’ll visit both links before reading on, but to summarize, the issue is that batch mode sorts are very slow when they spill—much slower than an equivalent row mode sort.
This also seems like a good opportunity to write down some sorting details I haven’t really covered before. If you’re not interested in those details and background to the current issue, you can skip down to the section titled, “Erik’s Demo”.
Monday 8 July 2013
Working Around Missed Optimizations
In my last post, we saw how a query featuring a scalar aggregate could be transformed by the optimizer to a more efficient form. As a reminder, here’s the schema again:
Wednesday 24 July 2013
Two Partitioning Peculiarities
Table partitioning in SQL Server is essentially a way of making multiple physical tables (row sets) look like a single table. This abstraction is performed entirely by the query processor, a design that makes things simpler for users, but which makes complex demands of the query optimizer.
This post looks at two examples which exceed the optimizer’s abilities in SQL Server 2008 onward.
Wednesday 21 August 2013
Incorrect Results Caused By Adding an Index
Say you have the following two tables, one partitioned and one not:
CREATE PARTITION FUNCTION PF (integer)
AS RANGE RIGHT
FOR VALUES (1000, 2000, 3000, 4000, 5000);
CREATE PARTITION SCHEME PS
AS PARTITION PF
ALL TO ([PRIMARY]);
-- Partitioned
CREATE TABLE dbo.T1
(
T1ID integer NOT NULL,
SomeID integer NOT NULL,
CONSTRAINT [PK dbo.T1 T1ID]
PRIMARY KEY CLUSTERED (T1ID)
ON PS (T1ID)
);
-- Not partitioned
CREATE TABLE dbo.T2
(
T2ID integer IDENTITY (1,1) NOT NULL,
T1ID integer NOT NULL,
CONSTRAINT [PK dbo.T2 T2ID]
PRIMARY KEY CLUSTERED (T2ID)
ON [PRIMARY]
);
Tuesday 17 September 2024
Why a Self-Join Requires Halloween Protection
This article was originally published on 𝕏.
I was asked recently why Halloween Protection was needed for data modification statements that include a self-join of the target table. This gives me a chance to explain, while also covering some interesting product bug history from the SQL Server 7 and 2000 days.
If you already know all there is to know about the Halloween Problem as it applies to SQL Server, you can skip the background section.
Tuesday 6 December 2011
SQL Server Optimizer Bug with JOIN and GROUP BY
I came across a SQL Server optimizer bug recently that made me wonder how on earth I never noticed it before.
As the title of this post suggests, the bug occurs in common JOIN
and GROUP BY
queries. While it does not cause incorrect results to be returned, it will often cause a poor query plan to be selected by the optimizer.
If you are just interested in the bug itself, you will find a description in the section headed “the bug revealed”. It relates to cardinality estimation for serial partial aggregates.
As the regular reader will be expecting though, I am going to work up to it with a bit of background. The lasting value of this post (once the bug is fixed) is in the background details anyway.
Sunday 26 July 2020
A bug with Halloween Protection and the OUTPUT Clause
Background
The OUTPUT
clause can be used to return results from an INSERT
, UPDATE
, DELETE
, or MERGE
statement. The data can be returned to the client, inserted to a table, or both.
There are two ways to add OUTPUT
data to a table:
- Using
OUTPUT INTO
- With an outer
INSERT
statement.
For example:
-- Test table
DECLARE @Target table
(
id integer IDENTITY (1, 1) NOT NULL,
c1 integer NULL
);
-- Holds rows from the OUTPUT clause
DECLARE @Output table
(
id integer NOT NULL,
c1 integer NULL
);
Wednesday 18 August 2010
Inside the Optimizer: Row Goals In Depth
Background
One of the core assumptions made by the SQL Server query optimizer cost model is that clients will eventually consume all the rows produced by a query.
This results in plans that favour the overall execution cost, though it may take longer to begin producing rows.
Wednesday 24 March 2021
Incorrect Results with Parallel Eager Spools and Batch Mode
You might have noticed a warning at the top of the release notes for SQL Server 2016 SP2 CU 16:
Note: After you apply CU 16 for SQL Server 2016 SP2, you might encounter an issue in which DML (insert/update/delete) queries that use parallel plans cannot complete any execution and encounter HP_SPOOL_BARRIER waits. You can use the trace flag 13116 or MAXDOP=1 hint to work around this issue. This issue is related to the introduction of fix for 13685819 and it will be fixed in the next Cumulative Update.
That warning links to bug reference 13685819 on the same page. There isn’t a separate KB article, only the description:
Fixes an issue with insert query in SQL Server 2016 that reads the data from the same table and uses a parallel execution plan may produce duplicate rows
Friday 13 September 2024
A Small Sample of SQL Server Chaos
This article was originally published on 𝕏.
Background
Since SQL Server indexed views don’t allow MIN
or MAX
aggregates, I recently found myself writing a trigger instead. The trigger’s job was to keep a summary table in sync with a source query (which featured a MAX
aggregate).
There’s a cost to running a trigger after every insert, update, or delete (with up to three trigger invocations per merge statement) but fast access to the summary data was worth it in this case. Though a trigger is a bit more expensive than the inline materialised view maintenance automatically added to the source statement’s execution plan by SQL Server, efficient trigger code and good indexing can help with the performance aspect (as always).