About This Blog

Including my content from SQLBlog.com and some from SQLPerformance.com

Sunday, 29 April 2012

Query Optimizer Deep Dive – Part 3

Query Optimizer Deep Dive – Part 3

This is the third in a series of posts based on the content of the Query Optimizer Deep Dive presentations I have given over the last month or so at the Auckland SQL Users’ Group, and SQL Saturday events in Wellington, New Zealand and Adelaide, Australia.

Links to other parts of this series: Part 1 Part 2 Part 4

Storage of Alternative Plans

We saw in part 2 how optimizer rules are used to explore logical alternatives for parts of the query tree, and how implementation rules are used to find physical operations to perform each logical steps.

To keep track of all these options, the cost-based part of the SQL Server query optimizer uses a structure called the Memo. This structure is part of the Cascades general optimization framework developed by Goetz Graefe.

Saturday, 28 April 2012

Query Optimizer Deep Dive – Part 2

Query Optimizer Deep Dive – Part 2

This is the second in a series of posts based on the content of the Query Optimizer Deep Dive presentations I have given over the last month or so at the Auckland SQL Users’ Group, and SQL Saturday events in Wellington, New Zealand and Adelaide, Australia.

Links to other parts of this series: Part 1 Part 3 Part 4

Cost-Based Optimization Overview

The input to cost-based optimization is a tree of logical operations produced by the previous optimization stages discussed in part one.

Cost-based optimization takes this logical tree, explores logical alternatives (different logical tree shapes that will always produce the same results), generates physical implementations, assigns an estimated cost to each, and finally chooses the cheapest physical option overall.

The goal of cost-based optimization is not to find the best possible physical execution plan by exploring every possible alternative. Rather, the goal is to find a good plan quickly.

Query Optimizer Deep Dive - Part 1

Query Optimizer Deep Dive - Part 1

This is the first in a series of posts based on the content of the Query Optimizer Deep Dive presentations I have given over the last month or so at the Auckland SQL Users’ Group, and SQL Saturday events in Wellington, New Zealand and Adelaide, Australia.

Links to other parts of this series: Part 2 Part 3 Part 4

Introduction

The motivation behind writing these sessions is finding that relatively few people have a good intuition for the way the optimizer works. This is partly because the official documentation is rather sparse, and partly because what information is available is dispersed across many books and blog posts.

The content presented here is very much geared to my preferred way of learning. It shows the concepts in what seems to me to be a reasonably logical sequence, and then provides tools to enable the interested reader to explore further, if desired.

Monday, 12 March 2012

Fun with Scalar and Vector Aggregates

Fun with Scalar and Vector Aggregates

There are interesting things to be learned from even the simplest queries.

For example, imagine you are asked to write a query that lists AdventureWorks product names, where the product has at least one entry in the transaction history table, but fewer than ten.

Wednesday, 18 January 2012

Dynamic Seeks and Hidden Implicit Conversions

Dynamic Seeks and Hidden Implicit Conversions

A LIKE predicate with only a trailing wildcard can usually use an index seek, as the following AdventureWorks sample database query shows:

SELECT 
    P.[Name]
FROM Production.Product AS P
WHERE 
    P.[Name] LIKE N'D%';

Friday, 23 December 2011

Forcing a Parallel Query Execution Plan

Forcing a Parallel Query Execution Plan

This article is for SQL Server developers who have experienced the special kind of frustration that only comes from spending hours trying to convince the query optimizer to generate a parallel execution plan.

This situation often occurs when making an apparently innocuous change to the text of a moderately complex query — a change which somehow manages to turn a parallel plan that executes in ten seconds, into a five-minute serially-executing monster.

Tuesday, 6 December 2011

SQL Server Optimizer Bug with JOIN and GROUP BY

SQL Server Optimizer Bug with JOIN and GROUP BY

I came across a SQL Server optimizer bug recently that made me wonder how on earth I never noticed it before.

As the title of this post suggests, the bug occurs in common JOIN and GROUP BY queries. While it does not cause incorrect results to be returned, it will often cause a poor query plan to be selected by the optimizer.

If you are just interested in the bug itself, you will find a description in the section headed “the bug revealed”. It relates to cardinality estimation for serial partial aggregates.

As the regular reader will be expecting though, I am going to work up to it with a bit of background. The lasting value of this post (once the bug is fixed) is in the background details anyway.

Sunday, 4 December 2011

Is Distinct Aggregation Still Considered Harmful?

Is Distinct Aggregation Still Considered Harmful?

Back in 2008, Marc Friedman of the SQL Server Query Processor Team wrote a blog entry entitled “Distinct Aggregation Considered Harmful”.

Marc shows a way to work around the poor performance that often results simply from adding the keyword DISTINCT to an otherwise perfectly reasonable aggregate function in a query.

This post is an update to that work, presenting a query optimizer enhancement in SQL Server 2012 that reduces the need to perform the suggested rewrite manually.

Wednesday, 21 September 2011

Finding the Statistics Used to Compile an Execution Plan

Finding the Statistics Used to Compile an Execution Plan

In this post, I show you how to determine the statistics objects used by the query optimizer in producing an execution plan.

Note: This technique only applies to queries compiled using the original (70) cardinality estimation model.

Tuesday, 30 August 2011

Can a SELECT query cause page splits?

Can a SELECT query cause page splits?

The SQL Server documentation has this to say about page splits:

When a new row is added to a full index page, the Database Engine moves approximately half the rows to a new page to make room for the new row. This reorganization is known as a page split. A page split makes room for new records, but can take time to perform and is a resource intensive operation. Also, it can cause fragmentation that causes increased I/O operations.

Given that, how can a SELECT statement be responsible for page splits?

Well, I suppose we could SELECT from a function that adds rows to a table variable as part of its internal implementation, but that would clearly be cheating, and no fun at all from a blogging point of view.