About This Blog

SQL Server internals information including my content from SQLBlog.com and SQLPerformance.com.

Saturday, 31 August 2013

Nested Loops Prefetching

Nested Loops Prefetching

Nested loops join query plans can be a lot more interesting (and complicated) than is commonly realized.

One query plan area I get asked about a lot is prefetching. It is not documented in full detail anywhere, so this seems like a good topic to address in a blog post.

The examples used in this article are based on questions asked by Adam Machanic.

Wednesday, 28 August 2013

Parameter Sniffing, Embedding, and the RECOMPILE Options

Parameter Sniffing, Embedding, and the RECOMPILE Options

Parameter Sniffing

Query parameterization promotes the reuse of cached execution plans, thereby avoiding unnecessary compilations, and reducing the number of ad-hoc queries in the plan cache.

These are all good things, provided the query being parameterized really ought to use the same cached execution plan for different parameter values. An execution plan that is efficient for one parameter value may not be a good choice for other possible parameter values.

When parameter sniffing is enabled (the default), SQL Server chooses an execution plan based on the particular parameter values that exist at compilation time. The implicit assumption is that parameterized statements are most commonly executed with the most common parameter values. This sounds reasonable enough (even obvious) and indeed it often works well.

A problem can occur when an automatic recompilation of the cached plan occurs. A recompilation may be triggered for all sorts of reasons, for example because an index used by the cached plan has been dropped (a correctness recompilation) or because statistical information has changed (an optimality recompile).

Whatever the exact cause of the plan recompilation, there is a chance that an atypical value is being passed as a parameter at the time the new plan is generated. This can result in a new cached plan (based on the sniffed atypical parameter value) that is not good for the majority of executions for which it will be reused.

It is not easy to predict when a particular execution plan will be recompiled (for example, because statistics have changed sufficiently) resulting in a situation where a good-quality reusable plan can be suddenly replaced by a quite different plan optimized for atypical parameter values.

One such scenario occurs when the atypical value is highly selective, resulting in a plan optimized for a small number of rows. Such plans will often use single-threaded execution, nested loops joins, and lookups. Serious performance issues can arise when this plan is reused for different parameter values that generate a much larger number of rows.

Wednesday, 21 August 2013

Incorrect Results Caused By Adding an Index

Incorrect Results Caused By Adding an Index

Say you have the following two tables, one partitioned and one not:

CREATE PARTITION FUNCTION PF (integer)
AS RANGE RIGHT
FOR VALUES (1000, 2000, 3000, 4000, 5000);

CREATE PARTITION SCHEME PS
AS PARTITION PF
ALL TO ([PRIMARY]);

-- Partitioned
CREATE TABLE dbo.T1
(
    T1ID    integer NOT NULL,
    SomeID  integer NOT NULL,

    CONSTRAINT [PK dbo.T1 T1ID]
        PRIMARY KEY CLUSTERED (T1ID)
        ON PS (T1ID)
);

-- Not partitioned
CREATE TABLE dbo.T2
(
    T2ID    integer IDENTITY (1,1) NOT NULL,
    T1ID    integer NOT NULL,

    CONSTRAINT [PK dbo.T2 T2ID]
        PRIMARY KEY CLUSTERED (T2ID)
        ON [PRIMARY]
);

Wednesday, 24 July 2013

Two Partitioning Peculiarities

Two Partitioning Peculiarities

Table partitioning in SQL Server is essentially a way of making multiple physical tables (row sets) look like a single table. This abstraction is performed entirely by the query processor, a design that makes things simpler for users, but which makes complex demands of the query optimizer.

This post looks at two examples which exceed the optimizer’s abilities in SQL Server 2008 onward.

Thursday, 18 July 2013

Aggregates and Partitioning

Aggregates and Partitioning

The changes in the internal representation of partitioned tables between SQL Server 2005 and SQL Server 2008 resulted in improved query plans and performance in the majority of cases (especially when parallel execution is involved).

Unfortunately, the same changes caused some things that worked well in SQL Server 2005 to suddenly not work so well in SQL Server 2008 and later.

This post looks at a one example where the SQL Server 2005 query optimizer produced a superior execution plan compared with later versions.

Monday, 8 July 2013

Working Around Missed Optimizations

Working Around Missed Optimizations

In my last post, we saw how a query featuring a scalar aggregate could be transformed by the optimizer to a more efficient form. As a reminder, here’s the schema again:

Wednesday, 26 June 2013

Optimization Phases and Missed Opportunities

Optimization Phases and Missed Opportunities

There are two complementary skills that are very useful in query tuning. One is the ability to read and interpret execution plans. The second is knowing a bit about how the query optimizer works to translate SQL text into an execution plan.

Putting the two things together can help us spot times when an expected optimization was not applied, resulting in an execution plan that is not as efficient as it could be.

The lack of documentation around exactly which optimizations SQL Server can apply (and in what circumstances) means that a lot of this comes down to experience, however.

Monday, 17 June 2013

Improving Partitioned Table Join Performance

Improving Partitioned Table Join Performance

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example of that, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing.

Tuesday, 11 June 2013

Hello Operator, My Switch Is Bored

Hello Operator, My Switch Is Bored

This post is in two parts. The first part looks at the Switch execution plan operator. The second part is about an invisible plan operator and cardinality estimates on filtered indexes.

Thursday, 4 April 2013

Optimizer Limitations with Filtered Indexes

Optimizer Limitations with Filtered Indexes

One of the filtered index use cases mentioned in the product documentation concerns a column that contains mostly NULL values. The idea is to create a filtered index that excludes the NULLs, resulting in a smaller nonclustered index that requires less maintenance than the equivalent unfiltered index.

Another popular use of filtered indexes is to filter NULLs from a UNIQUE index, giving the behaviour users of other database engines might expect from a default UNIQUE index or constraint: Uniqueness enforced only for non-NULL values.

Unfortunately, the query optimizer has limitations where filtered indexes are concerned. This post looks at a couple of less well-known examples.

Wednesday, 20 March 2013

The Problem with Window Functions and Views

The Problem with Window Functions and Views

Introduction

Since their introduction in SQL Server 2005, window functions like ROW_NUMBER and RANK have proven to be extremely useful in solving a wide variety of common T-SQL problems. In an attempt to generalize such solutions, database designers often look to incorporate them into views to promote code encapsulation and reuse.

Unfortunately, a limitation in the SQL Server query optimizer often means that views1 containing window functions do not perform as well as expected. This post works through an illustrative example of the problem, details the reasons, and provides a number of workarounds.

Friday, 8 March 2013

Execution Plan Analysis: The Mystery Work Table

Execution Plan Analysis: The Mystery Work Table

I love SQL Server execution plans. It is often easy to spot the cause of a performance problem just by looking at one closely. That task is considerably easier if the plan includes run-time information (a so-called ‘actual’ execution plan), but even a compiled plan can be very useful.

Nevertheless, there are still times when the execution plan does not tell the whole story, and we need to think more deeply about query execution to really understand a problem. This post looks at one such example, based on a question I answered.

Thursday, 21 February 2013

Halloween Protection – The Complete Series

Halloween Protection – The Complete Series

I have written a four-part series on the Halloween Problem.

Some of you will never have heard about this issue. Those that have might associate it only with T-SQL UPDATE queries. In fact, the Halloween Problem affects execution plans for INSERT, UPDATE, DELETE and MERGE statements.

This is a topic I have been meaning to write about properly for years, ever since I read Craig Freedman’s 2008 blog post on the topic, which ended with the cryptic comment:

“…although I’ve used update statements for all of the examples in this post, some insert and delete statements also require Halloween protection, but I’ll save that topic for a future post.”

That future post never materialized, so I thought I would have a go. The four parts of the series are summarized and linked below, I hope you find the material interesting.

Wednesday, 20 February 2013

The Halloween Problem – Part 4

The Halloween Problem – Part 4

The Halloween Problem can have a number of important effects on execution plans. In this final part of the series, we look at the tricks the optimizer can employ to avoid the Halloween Problem when compiling plans for queries that add, change or delete data.

Monday, 18 February 2013

The Halloween Problem – Part 3

The Halloween Problem – Part 3

The MERGE statement (introduced in SQL Server 2008) allows us to perform a mixture of INSERT, UPDATE, and DELETE operations using a single statement.

The Halloween Protection issues for MERGE are mostly a combination of the requirements of the individual operations, but there are some important differences and a couple of interesting optimizations that apply only to MERGE.

Friday, 15 February 2013

The Halloween Problem – Part 2

The Halloween Problem – Part 2

In the first part of this series, we saw how the Halloween Problem applies to UPDATE queries. To recap briefly, the problem was that an index used to locate records to update had its keys modified by the update operation itself (another good reason to use included columns in an index rather than extending the keys). The query optimizer introduced an Eager Table Spool operator to separate the reading and writing sides of the execution plan to avoid the problem. In this post, we will see how the same underlying issue can affect INSERT and DELETE statements.

Wednesday, 13 February 2013

The Halloween Problem – Part 1

The Halloween Problem – Part 1

Much has been written over the years about understanding and optimizing SELECT queries, but rather less about data modification. This series looks at an issue that is specific to INSERT, UPDATE, DELETE and MERGE queries – the Halloween Problem.

The phrase “Halloween Problem” was originally coined with reference to a SQL UPDATE query that was supposed to give a 10% raise to every employee who earned less than $25,000. The problem was that the query kept giving 10% raises until everyone earned at least $25,000.

We will see later on in this series that the underlying issue also applies to INSERT, DELETE and MERGE queries, but for this first entry, it will be helpful to examine the UPDATE problem in a bit of detail.

Wednesday, 6 February 2013

Incorrect Results with Indexed Views

Incorrect Results with Indexed Views

If you use MERGE, indexed views and foreign keys, your queries might return incorrect results. Microsoft have released a fix for incorrect results returned when querying an indexed view. The problem applies to:

  • SQL Server 2012
  • SQL Server 2008 R2
  • SQL Server 2008

The Knowledge Base article does not go into detail, or provide a reproduction script, but this blog post does.

Friday, 1 February 2013

A creative use of IGNORE_DUP_KEY

A creative use of IGNORE_DUP_KEY

Let’s say you have a big table with a clustered primary key, and an application that inserts batches of rows into it. The nature of the business is that the batch will inevitably sometimes contain rows that already exist in the table.

The default SQL Server INSERT behaviour for such a batch is to throw error 2627 (primary key violation), terminate the statement, roll back all the inserts (not just the rows that conflicted) and keep any active transaction open:

Saturday, 26 January 2013

Optimizing T-SQL queries that change data

Optimizing T-SQL queries that change data

Most tuning efforts for data-changing operations concentrate on the SELECT side of the query plan. Sometimes people will also look at storage engine considerations (like locking or transaction log throughput) that can have dramatic effects. A number of common practices have emerged, such as avoiding large numbers of row locks and lock escalation, splitting large changes into smaller batches of a few thousand rows, and combining a number of small changes into a single transaction in order to optimize log flushes.

This is all good, but what about the data-changing side of the query plan — the INSERT, UPDATE, DELETE, or MERGE operation itself — are there any query processor considerations we should take into account? The short answer is yes.

The query optimizer considers different plan options for the write-side of an execution plan, though there isn’t a huge amount of T-SQL language support that allows us to affect these choices directly. Nevertheless, there are things to be aware of, and things we can look to change.