2010

Tuesday 14 December 2010

Beware Sneaky Reads with Unique Indexes

I saw a question asked recently on the #sqlhelp hash tag:

Might SQL Server retrieve (out-of-row) LOB data from a table, even if the column isn’t referenced in the query?

Leaving aside trivial cases like selecting a computed column that does reference the LOB data, one might be tempted to say that no, SQL Server does not read data you haven’t asked for.

In general, that is correct; however, there are cases where SQL Server might sneakily read a LOB column.

Read the full article »

Friday 10 December 2010

Heaps of Trouble?

Brad Schulz recently wrote about optimizing a query run against tables with no indexes at all. The problem was, predictably, that performance was not very good. The catch was that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts.

In this post, I’m going to look at the problem from a different angle, and present an alternative solution to the one Brad found.

Read the full article »

Thursday 4 November 2010

Myth: SQL Server Caches a Serial Plan with every Parallel Plan

Many people believe that whenever SQL Server creates an execution plan that uses parallelism, an alternative serial plan is also cached.

The idea seems to be that the execution engine then decides between the parallel and serial alternatives at runtime. I’ve seen this on forums, in blogs, and even in books.

In fairness, a lot of the official documentation is not as clear as it might be on the subject. In this post I will show that only a single (parallel) plan is cached. I will also show that SQL Server can execute a parallel plan on a single thread.

Read the full article »

Monday 1 November 2010

The Case of the Missing Shared Locks

This post covers a little-known locking optimization that provides a surprising answer to the question:

If I hold an exclusive lock on a row, can another transaction running at the default read committed isolation level read it?

Most people would answer ‘no’, on the basis that the read would block when it tried to acquire a shared lock. Others might respond that it depends on whether the READ_COMMITTED_SNAPSHOT database option was in effect, but let’s assume that is not the case, and we are dealing simply with the default (locking) read committed isolation level.

Read the full article »

Tuesday 19 October 2010

Sequence Tables

It is frequently useful to generate sequences of values within SQL Server, perhaps for use as surrogate keys. Using the IDENTITY property on a column is the easiest way to automatically generate such sequences:

CREATE TABLE dbo.SomeTable
(
    row_id integer IDENTITY PRIMARY KEY,
    [data] sql_variant NOT NULL,
);

Sometimes though, the database designer needs a more flexible scheme than is provided by the IDENTITY property. One alternative is to use a Sequence Table.

Read the full article »

Thursday 23 September 2010

A Tale of Two Index Hints

If you look up Table Hints in the official documentation, you’ll find the following statements:

If a clustered index exists, INDEX(0) forces a clustered index scan and INDEX(1) forces a clustered index scan or seek.

If no clustered index exists, INDEX(0) forces a table scan and INDEX(1) is interpreted as an error.

The interesting thing there is that both hints can result in a scan. If that is the case, you might wonder if there is any effective difference between the two.

This blog entry explores that question, and highlights an optimizer quirk that can result in a much less efficient query plan when using INDEX(0). I’ll also cover some stuff about ordering guarantees.

Read the full article »

Wednesday 1 September 2010

Inside the Optimizer: Plan Costing

A detailed look at costing, and more undocumented optimizer fun.

The SQL Server query optimizer generates a number of physical plan alternatives from a logical requirement expressed in T-SQL. If full cost-based optimization is required, a cost is assigned to each iterator in each alternative plan, and the plan with the lowest overall cost is ultimately selected for execution.

Read the full article »

Friday 27 August 2010

Sorting, Row Goals, and the TOP 100 Problem

When you write a query to return the first few rows from a potential result set, you’ll often use the TOP clause.

To give a precise meaning to the TOP operation, it will normally be accompanied by an ORDER BY clause. Together, the TOP…ORDER BY construction can be used to precisely identify which top ‘n’ rows should be returned.

Read the full article »

Sunday 22 August 2010

Row Goals and Grouping

You might recall from Inside the Optimizer: Row Goals In Depth that query plans containing a row goal tend to favour nested loops or sort-free merge join over hashing.

This is because a hash join has to fully process its build input (to populate its hash table) before it can start probing for matches on its other input. Hash join therefore has a high start-up cost, balanced by a lower per-row cost once probing begins.

In this post, we will take a look at how row goals affect grouping operations.

Read the full article »

Wednesday 18 August 2010

Inside the Optimizer: Row Goals In Depth

Background

One of the core assumptions made by the SQL Server query optimizer cost model is that clients will eventually consume all the rows produced by a query.

This results in plans that favour the overall execution cost, though it may take longer to begin producing rows.

Read the full article »

Saturday 14 August 2010

Viewing Another Session’s Temporary Table

Is it possible to view the contents of a local temporary table, created on a different connection?

Say we run this code on connection 1:

CREATE TABLE #Private
(
    [data] nvarchar(30) NOT NULL
);
GO
INSERT #Private
(
    [data]
)
VALUES
(
    N'Only I can see this'
);

Is there a way to see the contents of the #Private table from connection 2?

It isn’t particularly easy, but a user with sysadmin permissions, a bit of internal knowledge (and some patience) can do it.

Read the full article »

Wednesday 11 August 2010

The Impact of Non-Updating Updates

From time to time, I encounter a system design that always issues an UPDATE against the database after a user has finished working with a record — without checking to see if any of the data was in fact altered.

The prevailing wisdom seems to be “the database will sort it out”. This raises an interesting question: How smart is SQL Server in these circumstances?

In this post, I’ll look at a generalisation of this problem: What is the impact of updating a column to the value it already contains?

The specific questions I want to answer are:

Does this kind of UPDATE generate any log activity?
Do data pages get marked as dirty (and so eventually get written out to disk)?
Does SQL Server bother doing the update at all?

Read the full article »

Thursday 5 August 2010

Iterators, Query Plans, and Why They Run Backwards

Iterators

SQL Server uses an extensible architecture for query optimization and execution, using iterators as the basic building blocks.

Iterators are probably most familiar in their graphical showplan representation, where each icon represents a single iterator. They also show up in XML query plan output as RelOp nodes:

Each iterator performs a single simple function, such as applying a filtering condition, or performing an aggregation. It can represent a logical operation, a physical operation, or (most often) both.

Read the full article »

Wednesday 4 August 2010

An Interesting MERGE Bug

Investigating an optimizer transformation that exposes a bug in SQL Server’s MERGE implementation.

Read the full article »

Saturday 31 July 2010

Inside the Optimizer: Constructing a Plan – Part 4

More undocumented ways to explore how the query optimizer works.

Read the full article »

Inside the Optimizer: Constructing a Plan – Part 3

Presenting an undocumented Dynamic Management View we can use to identify the optimization rules involved in producing an executable plan.

Read the full article »

Thursday 29 July 2010

Inside the Optimizer: Constructing a Plan - Part 2

Continuing the series of posts looking at how the optimizer matches and applies internal rules to refine a query plan.

The last post ended with this query plan:

The optimizer has pushed the predicate ProductNumber LIKE 'T%' down from a Filter to the Index Scan on the Product table, but it remains as a residual predicate.

Read the full article »

Inside the Optimizer: Constructing a Plan - Part 1

For today’s entry, I thought we might take a look at how the optimizer builds an executable plan using rules. To illustrate the process performed by the optimizer, we will configure it to produce incrementally better plans by progressively applying the necessary rules.

Read the full article »

Wednesday 28 July 2010

Ranking Function Optimizer Transformations

In my last post I showed how SQL Server 2005 and later can use a Segment Spool to implement aggregate window functions and the NTILE ranking function.

The query optimizer is also smart enough to recognise that some queries are logically equivalent to a window function, even if they are written using different syntax.

Read the full article »

Partitioning and the Common Subexpression Spool

SQL Server 2005 introduced the OVER clause to enable partitioning of rowsets before applying a window function. This post looks at how this feature may require a query plan containing a ‘common subexpression spool’. This query plan construction is required whenever an aggregate window function or the NTILE ranking window function is used.

Read the full article »

The Segment and Sequence Project Iterators

In my last post I promised to cover the Segment iterator in more detail, so here we go.

Segment

The Segment iterator partitions rows into groups as they flow through a query plan, checking whether the current row belongs in the same group as the previous row. For this to work, the incoming rows must be presented in an order which guarantees that all members of a group are received sequentially.

Segment has a “Group By” argument to specify how it should partition its input. The iterator adds an additional column to the rows that flow through it. This new column is used to communicate with its parent iterator, and is named something like [Segment1003].

The new column is visible in the graphical query plan properties window, or by hovering your mouse cursor over the Segment operator. The attribute name in both cases is Segment Column).

Read the full article »

The “Segment Top” Query Optimization

The Segment Top Query Optimization

A question that often comes up on the forums is how to get the first or last row from each group of records in a table. This post describes a clever query plan optimisation that SQL Server can use for these types of query.

Read the full article »

About This Blog

Tuesday 14 December 2010

Friday 10 December 2010

Thursday 4 November 2010

Monday 1 November 2010

Tuesday 19 October 2010

Thursday 23 September 2010

Wednesday 1 September 2010

Friday 27 August 2010

Sunday 22 August 2010

Wednesday 18 August 2010

Background

Saturday 14 August 2010

Wednesday 11 August 2010

Thursday 5 August 2010

Iterators

Wednesday 4 August 2010

Saturday 31 July 2010

Thursday 29 July 2010

Wednesday 28 July 2010

Segment