About This Blog

A copy of my content originally hosted on SQLBlog.com
Showing posts with label Indexes. Show all posts
Showing posts with label Indexes. Show all posts

Wednesday, 21 August 2013

Incorrect Results Caused By Adding an Index

Incorrect Results Caused By Adding an Index

Say you have the following two tables, one partitioned and one not:

CREATE PARTITION FUNCTION PF (integer)
AS RANGE RIGHT
FOR VALUES (1000, 2000, 3000, 4000, 5000);

CREATE PARTITION SCHEME PS
AS PARTITION PF
ALL TO ([PRIMARY]);

-- Partitioned
CREATE TABLE dbo.T1
(
    T1ID    integer NOT NULL,
    SomeID  integer NOT NULL,

    CONSTRAINT [PK dbo.T1 T1ID]
        PRIMARY KEY CLUSTERED (T1ID)
        ON PS (T1ID)
);

-- Not partitioned
CREATE TABLE dbo.T2
(
    T2ID    integer IDENTITY (1,1) NOT NULL,
    T1ID    integer NOT NULL,

    CONSTRAINT [PK dbo.T2 T2ID]
        PRIMARY KEY CLUSTERED (T2ID)
        ON [PRIMARY]
);

Friday, 1 February 2013

A creative use of IGNORE_DUP_KEY

A creative use of IGNORE_DUP_KEY

Let’s say you have a big table with a clustered primary key, and an application that inserts batches of rows into it. The nature of the business is that the batch will inevitably sometimes contain rows that already exist in the table.

The default SQL Server INSERT behaviour for such a batch is to throw error 2627 (primary key violation), terminate the statement, roll back all the inserts (not just the rows that conflicted) and keep any active transaction open:

Tuesday, 9 August 2011

SQL Server, Seeks, and Binary Search

SQL Server, Seeks, and Binary Search

The following table summarizes the results from my last two articles, Enforcing Uniqueness for Performance and Avoiding Uniqueness for Performance. It shows the CPU time used when performing 5 million clustered index seeks into a unique or non-unique index:

Test summary

In test 1, making the clustered index unique improved performance by around 40%.

In test 2, making the same change reduced performance by around 70% (on 64-bit systems – more on that later).

Thursday, 4 August 2011

Avoiding Uniqueness for Performance

Avoiding Uniqueness for Performance

In my last post, Enforcing Uniqueness for Performance, I showed how using a unique index could speed up equality seeks by around 40%.

Friday, 29 July 2011

Enforcing Uniqueness for Performance

Enforcing Uniqueness for Performance

A little while back, I posted a short series on seeks and scans:

One of the things I highlighted in the middle post was the difference between a singleton seek and a range scan:

  • A singleton equality seek always retrieves exactly one row, and is guaranteed to do so because a unique index exists to enforce it.

  • A range scan seeks down the B-tree to a starting (or ending) point, and scans forward (or backward) from that point using the next or previous page pointers.

Today’s short post shows how much faster a singleton seek is, compared with a range scan, even when both return exactly the same number of records.

Thursday, 17 February 2011

So…is it a Seek or a Scan?

So…is it a Seek or a Scan?

You might be most familiar with the terms ‘Seek’ and ‘Scan’ from the graphical plans produced by SQL Server Management Studio (SSMS). You might look to the SSMS tool-tip descriptions to explain the differences between them:

Scan and Seek tooltips

Both mention scans and ranges (nothing about seeks) and the Index Seek description maybe implies that it will not scan the index entirely (which isn’t necessarily true). Not massively helpful.

Wednesday, 16 February 2011

When is a Seek not a Seek?

When is a Seek not a Seek?

The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive.

IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL
BEGIN
    DROP TABLE #Test
END;
GO
CREATE TABLE #Test
(
    id integer PRIMARY KEY CLUSTERED
);
INSERT #Test
    (id)
SELECT
    V.number
FROM master.dbo.spt_values AS V
WHERE
    V.[type] = N'P'
    AND V.number BETWEEN 1 AND 1000;

Let’s say we are given the following task:

Find the rows with values from 100 to 170, excluding any values that divide exactly by 10.

Thursday, 23 September 2010

A Tale of Two Index Hints

A Tale of Two Index Hints

If you look up Table Hints in the official documentation, you’ll find the following statements:

If a clustered index exists, INDEX(0) forces a clustered index scan and INDEX(1) forces a clustered index scan or seek.

If no clustered index exists, INDEX(0) forces a table scan and INDEX(1) is interpreted as an error.

The interesting thing there is that both hints can result in a scan. If that is the case, you might wonder if there is any effective difference between the two.

This blog entry explores that question, and highlights an optimizer quirk that can result in a much less efficient query plan when using INDEX(0). I’ll also cover some stuff about ordering guarantees.

Wednesday, 11 August 2010

The Impact of Non-Updating Updates

The Impact of Non-Updating Updates

From time to time, I encounter a system design that always issues an UPDATE against the database after a user has finished working with a record — without checking to see if any of the data was in fact altered.

The prevailing wisdom seems to be “the database will sort it out”. This raises an interesting question: How smart is SQL Server in these circumstances?

In this post, I’ll look at a generalisation of this problem: What is the impact of updating a column to the value it already contains?

The specific questions I want to answer are:

  • Does this kind of UPDATE generate any log activity?
  • Do data pages get marked as dirty (and so eventually get written out to disk)?
  • Does SQL Server bother doing the update at all?