Wednesday, 21 August 2013

Incorrect Results Caused By Adding an Index

Introduction

Say you have the following two tables, one partitioned and one not:

CREATE PARTITION FUNCTION PF (integer)
AS RANGE RIGHT
FOR VALUES (1000, 2000, 3000, 4000, 5000);

CREATE PARTITION SCHEME PS
AS PARTITION PF
ALL TO ([PRIMARY]);

-- Partitioned
CREATE TABLE dbo.T1
(
    T1ID    integer NOT NULL,
    SomeID  integer NOT NULL,

    CONSTRAINT [PK dbo.T1 T1ID]
        PRIMARY KEY CLUSTERED (T1ID)
        ON PS (T1ID)
);

-- Not partitioned
CREATE TABLE dbo.T2
(
    T2ID    integer IDENTITY (1,1) NOT NULL,
    T1ID    integer NOT NULL,

    CONSTRAINT [PK dbo.T2 T2ID]
        PRIMARY KEY CLUSTERED (T2ID)
        ON [PRIMARY]
);

Adding sample data will use a table of numbers. In case you do not already have one of these, the following script will efficiently create one, with integers from 1 to 10,000,000:

IF OBJECT_ID(N'dbo.Numbers', N'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.Numbers;
END;
GO
WITH Ten(N) AS 
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1
)   
SELECT
	n = IDENTITY(int, 1, 1)
INTO dbo.Numbers
FROM Ten AS T10
CROSS JOIN Ten AS T100
CROSS JOIN Ten AS T1000
CROSS JOIN Ten AS T10000
CROSS JOIN Ten AS T100000
CROSS JOIN Ten AS T1000000
CROSS JOIN Ten AS T10000000
ORDER BY n
    OFFSET 0 ROWS
    FETCH FIRST 10 * 1000 * 1000 ROWS ONLY
OPTION
    (MAXDOP 1);
GO
ALTER TABLE dbo.Numbers
ADD CONSTRAINT [PK dbo.Numbers n]
PRIMARY KEY CLUSTERED (n)
WITH
(
    SORT_IN_TEMPDB = ON, 
    MAXDOP = 1, 
    FILLFACTOR = 100
);

We load table T1 with 4,999 rows. All of the rows have a SomeID value of 1234, and the T1ID primary key is sequentially numbered from 1 to 4,999:

INSERT dbo.T1
    (T1ID, SomeID)
SELECT
    N.n, 1234
FROM dbo.Numbers AS N
WHERE
    N.n BETWEEN 1 AND 4999;

Table T2 gets 999 rows, generated by adding only T1ID values from T1 that divide exactly by 5:

INSERT dbo.T2
    (T1ID)
SELECT
    T1ID
FROM dbo.T1
WHERE
    T1ID % 5 = 0;

Visually, T1 looks like this (T1ID goes up to 4,999):

Table T1

And T2 looks like this (T2ID goes up to 999):

Table T2

The test query counts rows that match between the two tables when joined on T1ID:

SELECT
    COUNT_BIG(*)
FROM dbo.T1 AS T1
JOIN dbo.T2 AS T2
    ON T2.T1ID = T1.T1ID
WHERE
    T1.SomeID = 1234;

The execution plan features a merge join:

Merge join execution plan

The correct result (999) is returned and everyone is happy:

Correct result

Enter the Index

Now someone comes along and adds a new index to table T1:

CREATE NONCLUSTERED INDEX
    [dbo.T1 SomeID]
ON dbo.T1
    (SomeID DESC);

This is a perfectly reasonable index, apparently essential for some crucial query or other. Let’s run our COUNT_BIG(*) query again:

SELECT
    COUNT_BIG(*)
FROM dbo.T1 AS T1
JOIN dbo.T2 AS T2
    ON T2.T1ID = T1.T1ID
WHERE
    T1.SomeID = 1234;

The execution plan looks similar:

Execution plan with index

But the result is wrong! There are still 999 matches in the underlying data, but this execution counts only 200.

Wrong result

An Even Simpler Query

With the new index still in place, we run this query:

SELECT
    T1ID
FROM dbo.T1 
WHERE
    SomeID = 1234
ORDER BY
    T1ID ASC;

This query should return all the T1IDs from 1 to 4,999 in ascending order. Instead, we get this:

Query output

The list starts at 4000 not 1! Also, out-of-order rows are found further down:

Out-of-order rows

The results are not ordered by T1ID despite the ORDER BY T1TD ASC clause. Quite shocking!

Cause

Both problems are caused by a bug in the query optimizer, which is present in all versions of SQL Server from 2008 to 2012 inclusive regardless of patch level. For example, the bug reproduces on:

Microsoft SQL Server 2012 (SP4-GDR) - 11.0.7462 (X64) 
Developer Edition (64-bit) on Windows NT 6.3 x64

The bug produces a query plan that does not provide the ordering guarantees the optimizer thinks it does, leading to incorrect results. A partitioned table is required to reproduce the bug.

The sneaky aspect to it is that the index which causes the problem could be added at any time, without the original query-writer’s knowledge.

Equally, data changes could mean that a query plan that used to use a hash or nested loops join suddenly recompiles to choose a merge join. Since a merge join requires sorted input, the opportunity for suddenly incorrect (incomplete!) results is obvious (and an example was shown above).

I opened a Connect item for this bug (archive link) and wrote more about the detailed explanation in a separate article.

Resolution

The fix for this issue is now available and documented in a Knowledge Base article:

FIX: Query that you run against a partitioned table returns incorrect results in SQL Server 2008, SQL Server 2008 R2 or SQL Server 2012

Please note the fix requires trace flag 4199, which enables a range of other query processor changes. It is unusual for an incorrect results bug to be fixed under 4199. I asked for clarification on that and the response was:

Even though this problem involves incorrect results like other hotfixes involving the Query Processor we have only enabled this fix under trace flag 4199 for SQL Server 2008, 2008 R2, and 2012. However, this fix is “on” by default without the trace flag in SQL Server 2014 RTM.

Be sure to clear the procedure cache after enabling trace flag 4199 to ensure fresh plans are generated.

Thanks for reading.