About This Blog

Including my content from SQLBlog.com and some from SQLPerformance.com

Monday, 12 March 2012

Fun with Scalar and Vector Aggregates

Fun with Scalar and Vector Aggregates

There are interesting things to be learned from even the simplest queries.

For example, imagine you are asked to write a query that lists AdventureWorks product names, where the product has at least one entry in the transaction history table, but fewer than ten.

Wednesday, 18 January 2012

Dynamic Seeks and Hidden Implicit Conversions

Dynamic Seeks and Hidden Implicit Conversions

A LIKE predicate with only a trailing wildcard can usually use an index seek, as the following AdventureWorks sample database query shows:

SELECT 
    P.[Name]
FROM Production.Product AS P
WHERE 
    P.[Name] LIKE N'D%';

Friday, 23 December 2011

Forcing a Parallel Query Execution Plan

Forcing a Parallel Query Execution Plan

This article is for SQL Server developers who have experienced the special kind of frustration that only comes from spending hours trying to convince the query optimizer to generate a parallel execution plan.

This situation often occurs when making an apparently innocuous change to the text of a moderately complex query — a change which somehow manages to turn a parallel plan that executes in ten seconds, into a five-minute serially-executing monster.

Tuesday, 6 December 2011

SQL Server Optimizer Bug with JOIN and GROUP BY

SQL Server Optimizer Bug with JOIN and GROUP BY

I came across a SQL Server optimizer bug recently that made me wonder how on earth I never noticed it before.

As the title of this post suggests, the bug occurs in common JOIN and GROUP BY queries. While it does not cause incorrect results to be returned, it will often cause a poor query plan to be selected by the optimizer.

If you are just interested in the bug itself, you will find a description in the section headed “the bug revealed”. It relates to cardinality estimation for serial partial aggregates.

As the regular reader will be expecting though, I am going to work up to it with a bit of background. The lasting value of this post (once the bug is fixed) is in the background details anyway.

Sunday, 4 December 2011

Is Distinct Aggregation Still Considered Harmful?

Is Distinct Aggregation Still Considered Harmful?

Back in 2008, Marc Friedman of the SQL Server Query Processor Team wrote a blog entry entitled “Distinct Aggregation Considered Harmful”.

Marc shows a way to work around the poor performance that often results simply from adding the keyword DISTINCT to an otherwise perfectly reasonable aggregate function in a query.

This post is an update to that work, presenting a query optimizer enhancement in SQL Server 2012 that reduces the need to perform the suggested rewrite manually.

Wednesday, 21 September 2011

Finding the Statistics Used to Compile an Execution Plan

Finding the Statistics Used to Compile an Execution Plan

In this post, I show you how to determine the statistics objects used by the query optimizer in producing an execution plan.

Note: This technique only applies to queries compiled using the original (70) cardinality estimation model.

Tuesday, 30 August 2011

Can a SELECT query cause page splits?

Can a SELECT query cause page splits?

The SQL Server documentation has this to say about page splits:

When a new row is added to a full index page, the Database Engine moves approximately half the rows to a new page to make room for the new row. This reorganization is known as a page split. A page split makes room for new records, but can take time to perform and is a resource intensive operation. Also, it can cause fragmentation that causes increased I/O operations.

Given that, how can a SELECT statement be responsible for page splits?

Well, I suppose we could SELECT from a function that adds rows to a table variable as part of its internal implementation, but that would clearly be cheating, and no fun at all from a blogging point of view.

Tuesday, 9 August 2011

SQL Server, Seeks, and Binary Search

SQL Server, Seeks, and Binary Search

The following table summarizes the results from my last two articles, Enforcing Uniqueness for Performance and Avoiding Uniqueness for Performance. It shows the CPU time used when performing 5 million clustered index seeks into a unique or non-unique index:

Test summary

In test 1, making the clustered index unique improved performance by around 40%.

In test 2, making the same change reduced performance by around 70% (on 64-bit systems – more on that later).

Thursday, 4 August 2011

Avoiding Uniqueness for Performance

Avoiding Uniqueness for Performance

In my last post, Enforcing Uniqueness for Performance, I showed how using a unique index could speed up equality seeks by around 40%.

Friday, 29 July 2011

Enforcing Uniqueness for Performance

Enforcing Uniqueness for Performance

A little while back, I posted a short series on seeks and scans:

One of the things I highlighted in the middle post was the difference between a singleton seek and a range scan:

  • A singleton equality seek always retrieves exactly one row, and is guaranteed to do so because a unique index exists to enforce it.

  • A range scan seeks down the B-tree to a starting (or ending) point, and scans forward (or backward) from that point using the next or previous page pointers.

Today’s short post shows how much faster a singleton seek is, compared with a range scan, even when both return exactly the same number of records.