Thursday, 23 September 2010

A Tale of Two Index Hints

Introduction

If you look up Table Hints in the official documentation, you’ll find the following statements:

If a clustered index exists, INDEX(0) forces a clustered index scan and INDEX(1) forces a clustered index scan or seek.

If no clustered index exists, INDEX(0) forces a table scan and INDEX(1) is interpreted as an error.

The interesting thing there is that both hints can result in a scan. If that is the case, you might wonder if there is any effective difference between the two.

This article explores that question and highlights an optimizer quirk that can result in a much less efficient query plan when using INDEX(0). I’ll also cover some things about ordering guarantees.

Test Environment

As usual, we need to create a test table:

USE tempdb;
SET NOCOUNT ON;
GO
IF OBJECT_ID(N'Test.Orders', N'U') IS NOT NULL
    DROP TABLE Test.Orders;
GO
IF SCHEMA_ID(N'Test') IS NOT NULL
    DROP SCHEMA Test;
GO
CREATE SCHEMA Test
    CREATE TABLE Orders
    (
        order_id integer NOT NULL,
        product_id integer NOT NULL,
        quantity integer NOT NULL,
        padding char(980) NOT NULL DEFAULT (SPACE(0)),
                
        CONSTRAINT  [PK Test.Orders order_id]
            PRIMARY KEY CLUSTERED (order_id ASC)
    );
GO
CREATE NONCLUSTERED INDEX  [IX Test.Orders quantity]
ON Test.Orders (quantity ASC)
INCLUDE (padding);
GO
DECLARE
    @Counter integer = 1,
    @RowCount integer = 64 * 8;
        
WHILE @Counter <= @RowCount
BEGIN
    INSERT Test.Orders
    (
        order_id,
        product_id,
        quantity
    )
    VALUES
    (
        @RowCount - @Counter,
        @RowCount - @Counter,
        @RowCount - @Counter
    );
                
    SET     @Counter += 1;
END;

The test table has 512 rows, padded so that eight rows fit on each 8KB data page.

The table is clustered on the order_id column, and there is a nonclustered index on the quantity column, with the padding column INCLUDEd to again ensure that eight rows fit on each nonclustered index page.

Sample data

The script also deliberately generates a very high level of logical fragmentation, which we can see with the following query:

SELECT
    SI.name,
    IPS.index_type_desc,
    IPS.avg_fragmentation_in_percent,
    IPS.fragment_count,
    IPS.page_count
FROM sys.dm_db_index_physical_stats
(
    DB_ID(),
    OBJECT_ID(N'Test.Orders', N'U'),
    DEFAULT,
    DEFAULT,
    DEFAULT
) AS IPS
JOIN sys.indexes AS SI
    ON SI.[object_id] = IPS.[object_id]
    AND SI.index_id = IPS.index_id
ORDER BY
    IPS.index_id;

The reason for the fragmentation will become apparent later on. These are the results I saw on my machine:

Fragmentation report

Background Information on Scans

A scan of a table or index processes every row, whereas a seek returns one or more rows from one or more ranges of the index.

Scan Strategies

When the SQL Server optimizer produces a plan containing a scan of an index or clustered table, the Storage Engine may be able to choose from two different strategies: scanning the pages in logical order, or the order in which they were allocated.

In the first case, it follows the doubly-linked list at the leaf level of the index. In the second case, it can use the Index Allocation Map (IAM) pages to drive the scan.

A scan of a heap always uses the IAM pages since there is no clustered index to provide a logical page order, but this post is concerned mainly with indexes and clustered tables, so I won’t talk much more about heaps today.

The theoretical advantage of an IAM-driven scan is that pages will tend to be read in physical order on the disk, which generally results in sequential I/O, and promotes effective read-ahead. How efficient scanning the pages in logical order is depends on the level of logical fragmentation. If fragmentation is high, the scan will tend to result in a high proportion of random I/Os, and read-ahead may be much less effective.

The downside to the IAM-driven scan is that it is not guaranteed to return results in logical key order (in fact it usually won’t), and there is an extra start-up cost in accessing the IAM pages and planning the scan.

IAM-driven scans

The Storage Engine may choose an IAM-driven scan if:

The index contains 64 or more pages (giving the method a chance to recover the start-up cost);
The optimizer indicates that an ordered scan is not required for correct results; and
Either the data cannot change during the scan, or the query indicates that incorrect results are acceptable.

Hopefully, the first point makes sense on it’s own. The second point refers to a flag that the optimizer can set on a scanning iterator in a query plan.

This flag shows up as an Ordered:True or Ordered:False property of the iterator. When set to true, it indicates that the Storage Engine must perform a scan in logical key order for the plan to produce correct results. When set to false, it means that the order does not matter. In the latter case, the Storage Engine has the choice of performing either type of scan. The execution plan does not show which type of scan was chosen at runtime.

The third point comes in two parts, but both relate to the fact that an IAM-driven scan does not take the locks necessary to prevent other concurrent operations from changing the data in the table, or moving the pages around (perhaps as the result of a page split).

The simplest way to satisfy the first part (to ensure that the data cannot be changed or moved) is to take a shared table lock.

The second part (acceptable incorrect results) refers to transaction isolation level. If the scan runs at an effective isolation level of READ UNCOMMITTED (either explicitly or through the use of a NOLOCK hint) we are effectively saying that we don’t care about the consistency of the data at all.

The Reason for the fragmentation

Our test table has been created with a high level of fragmentation. This allows us to see whether the Storage Engine used an IAM-driven scan or not.

If the results come back in logical key order, we can be sure (in these specific examples) that the scan was performed by following the doubly-linked list at the leaf level of the index. If the results come back in allocation order, they will look very different, and we can infer that an IAM-driven scan was performed.

Important

Please don’t take any of that to mean that you can rely on query results to come back in a particular order in more general cases.
The very simple tests shown here are carefully crafted to encourage the ordering behaviour I have just described, but there are no guarantees.
Never rely on plan details to provide ordering––always use an ORDER BY clause if you require ordered results.

Scanning Examples

Let’s look first at two queries that return all columns and all rows from our test table:

-- Query 1
SELECT T.*
FROM Test.Orders T;
 
-- Query 2
SELECT T.*
FROM Test.Orders T 
ORDER BY T.order_id ASC;

Both queries produce very similar execution plans, but the first specifies Ordered:False while the second specifies Ordered:True.

The second example avoids an expensive explicit Sort iterator by requiring the Storage Engine to return rows from the scan in logical key order (the clustered index is on order_id, remember).

Execution plan Ordered properties

Both queries return records ordered by the clustering key: Query 1 because the Storage Engine cannot perform an IAM-driven scan (we haven’t met all the conditions) and Query 2 because an ordered scan was requested by the optimizer.

Query 1 is not guaranteed to return records in that order, but Query 2 (almost) is.

In case you are wondering what might cause Query 1 to return records in a different order, consider the effect of a parallel plan, or Enterprise Edition’s Advanced Scanning feature. For anyone that follows that second link, I should mention that Advanced Scan is only possible when the Ordered flag is set to False.

An IAM-driven scan

Query 1 above meets two of the conditions for an IAM-driven scan: the index contains at least 64 pages, and the optimizer did not specify Ordered:True.

We can meet the third condition by adding a WITH (TABLOCK) hint:

SELECT T.*
FROM Test.Orders AS T WITH (TABLOCK);

We receive the following result set:

Results with TABLOCK

Each group of eight records (each 8KB page) comes back ordered by the clustering key, but the overall order of pages is in the order the pages were written.

This is the result of an IAM-driven scan—pages in allocation order, records on those pages in clustered key order. As you may have guessed, neither of these two observed orderings can be relied on either. I mention them purely for interest’s sake.

Adding the TABLOCK hint to Query 2 does not result in an IAM-driven scan, because the optimizer continues to specify Ordered:True on the scan (to avoid the sort that would be necessary to honour the ORDER BY clause).

For Query 2, the optimizer does still consider a plan with an unordered scan followed by a sort, but rejects it as being more expensive than an ordered scan. For very large tables, the optimizer may calculate that an IAM-driven scan might save more time than would be consumed by the extra sort, and a plan featuring the unordered scan + sort would be chosen. This is a heuristic optimization: the optimizer knows nothing about the fragmentation level of an index (though it does know how many pages it occupies).

Index Hints

Let’s now run the Query 1 test again, this time specifying INDEX(0) or INDEX(1) hints, as well as choosing to specify TABLOCK or not:

SELECT T.* FROM Test.Orders AS T WITH (INDEX(0));
SELECT T.* FROM Test.Orders AS T WITH (INDEX(1));
SELECT T.* FROM Test.Orders AS T WITH (INDEX(0), TABLOCK);
SELECT T.* FROM Test.Orders AS T WITH (INDEX(1), TABLOCK);

In all cases, the results are the same:

If we specify TABLOCK, we meet the conditions for an IAM-driven scan and the records come back in physical order.
If we omit TABLOCK, we happen to get results in clustered-key order—but we know that is just coincidental.

All four query plans look exactly the same as the simple clustered index scans shown earlier, and all feature Ordered:False.

Let’s try the same thing with the ORDER BY clause (Query 2):

SELECT T.*
FROM Test.Orders AS T WITH (INDEX(0))
ORDER BY T.order_id;

SELECT T.*
FROM Test.Orders AS T WITH (INDEX(0), TABLOCK)
ORDER BY T.order_id;

SELECT T.*
FROM Test.Orders AS T WITH (INDEX(1))
ORDER BY T.order_id;

SELECT T.*
FROM Test.Orders AS T WITH (INDEX(1), TABLOCK)
ORDER BY T.order_id;

All four queries produce the same results, in the same order, and are guaranteed to do so thanks to the ORDER BY clause. The execution plans are rather different though:

Different execution plans

Both INDEX(1) plans produce the plan shown on the left with the scan showing Ordered:True” and a cost of 0.05.
Both INDEX(0) plans produce the plan shown on the right with the scan showing Ordered:False, and a cost of 0.07.

Optimizer Bug, Limitation, or “By Design”?

It seems that INDEX(0) forces Ordered:False on the scan. As a consequence, we get an explicit Sort iterator when the ORDER BY clause is present. Why should INDEX(0) force an unordered scan?

One can imagine a line of reasoning that says we would normally specify INDEX(0) only on a heap table, where an ordered scan is not possible. The flaw in this argument is that INDEX(1) does not enable us to force a scan of a clustered table––it will result in a seek if at all possible.

`INDEX(0)` and `Ordered:True`

Nevertheless, it is possible to produce Ordered:True when INDEX(0) is specified in an INSERT, UPDATE, or DELETE query.

Let’s look at three alternative formulations of a trivial UPDATE statement on our test table:

Update plan

All three plans feature the Ordered:True flag, but the one with INDEX(0) includes a sort on order_id.

That sort is probably redundant, since the Clustered Index Scan (with Ordered:True) ought to be guaranteed to produce rows in that same order. Not only will that sort slow down your query, it will require a memory grant and might even spill to tempdb.

A data modification optimization

When performing our UPDATE query, the optimizer takes into account the effect of ordering. If rows arrive at the update iterator in clustered index order, the pattern of updates will likely result in sequential I/O. If rows are in a different order, the updates will likely cause random I/O as the index is updated row by row.

The optimizer may choose to pre-sort the records on the clustered index key to optimize for sequential I/O. On other occasions, it may decide that the cost of the sort would exceed the savings made––this is seen when the number of rows to be updated is very small, and the rows can be located most efficiently from a nonclustered index.

When the rows are sourced from the clustered index, the optimizer always adds the Ordered:True flag to maximize the chances for sequential I/O, regardless of the estimated number of rows.

This explains how the Clustered Index Scan above manages to acquire the Ordered:True flag when INDEX(0) was specified. It seems that the optimizer misses the fact that by adding the ordering flag, the sort is now redundant.

Final Thoughts

Although I have concentrated very much on clustered indexes in this entry, I want to stress that the two scanning strategies apply to both clustered and non-clustered indexes. I created a nonclustered index on the test table so you can verify for yourself that IAM-driven scans are possible with nonclustered indexes, and the rules are exactly the same.

It is also not necessary to specify TABLOCK explicitly to get an IAM-driven scan. If the index in question happens to have row and page locks disabled (using ALTER INDEX (…) SET ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = OFF) the Storage Engine will recognise that a table lock has to be taken. Other circumstances like the database being read-only will also meet the underlying requirement that the data cannot change during the scan.

Beware of using NOLOCK when the other conditions for an IAM-driven scan are met. Not only might you read data that has not been committed yet, your query might also fail completely with the error:

Could not continue scan with NOLOCK due to data movement.

Finally, watch out for unnecessary sorts in plans that use the INDEX(0) table hint. Consider using INDEX(1) instead or rewriting the query to avoid the hint altogether.