About This Blog

A copy of my content from SQLBlog.com and SQLPerformance.com, plus occasional new content.
Showing posts with label Execution Plan.

Sunday, 31 May 2020

Pulling Group By Above a Join

One of the transformations available to the SQL Server query optimizer is pulling a logical Group By (and any associated aggregates) above a Join.

Visually, this means transforming a tree of logical operations from:

Group By Below Join

…to this:

Group By Above Join

The above diagrams are logical representations. They need to be implemented as physical operators to appear in an execution plan. The options are:

  • Group By
    • Hash Match Aggregate
    • Stream Aggregate
    • Distinct Sort
  • Join
    • Nested Loops Join
    • Nested Loops Apply
    • Hash Match Join
    • Merge Join

When the optimizer moves a Group By above a Join it has to preserve the semantics. The new sequence of operations must be guaranteed to return the same results as the original in all possible circumstances.

One cannot just pick up a Group By and arbitrarily move it around the query tree without risking incorrect results.
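
As a hypothetical illustration of the kind of query involved (the table and column names below are invented; they are not taken from the original article), consider totalling order lines per order after joining to a header table:

-- Illustrative schema: one row per SalesOrderID in dbo.OrderHeader,
-- many rows per SalesOrderID in dbo.OrderLine.
SELECT
    OH.SalesOrderID,
    SUM(OL.LineTotal) AS OrderTotal
FROM dbo.OrderHeader AS OH
JOIN dbo.OrderLine AS OL
    ON OL.SalesOrderID = OH.SalesOrderID
GROUP BY
    OH.SalesOrderID;

-- Because the grouping column is also the join column, and the header table
-- contributes one row per SalesOrderID, the optimizer can consider plans that
-- aggregate dbo.OrderLine before the join (Group By below Join) as well as
-- plans that join first and aggregate afterwards (Group By above Join).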

Saturday, 24 August 2019

Batch Mode Bitmap Demos

This is a companion post to my main article Batch Mode Bitmaps in SQL Server. This post provides demos and illustrations to supplement the technical article.

The scripts presented here were run on SQL Server 2017 CU 16.

Sunday, 9 June 2019

Apply versus Nested Loops Join

SQL is a declarative language. We use SQL to write a logical query specification that defines the results we want. For example, we might write a query using either APPLY or JOIN that logically describes exactly the same results.

It is up to the query optimizer to find an efficient physical implementation of that logical requirement. SQL Server is free to choose any plan it likes, so long as the results are guaranteed to be the same as specified in the original SQL.

The optimizer is capable of transforming an apply to a join and vice versa. It generally tries to rewrite apply to join during initial compilation to maximize the searchable plan space during cost-based optimization. Having transformed an apply to a join early on, it may also consider transforming back to an apply shape later on, to assess the merits of an index loops join, for example.
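
To make the equivalence concrete (a sketch using AdventureWorks names; these queries are not taken from the original post), the following two statements specify exactly the same results, one written as an apply and one as a join:

-- Written as an apply:
SELECT
    P.[Name],
    TH.TransactionID,
    TH.Quantity
FROM Production.Product AS P
CROSS APPLY
(
    SELECT T.TransactionID, T.Quantity
    FROM Production.TransactionHistory AS T
    WHERE T.ProductID = P.ProductID
) AS TH;

-- Written as a join:
SELECT
    P.[Name],
    TH.TransactionID,
    TH.Quantity
FROM Production.Product AS P
JOIN Production.TransactionHistory AS TH
    ON TH.ProductID = P.ProductID;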

Wednesday, 28 August 2013

Parameter Sniffing, Embedding, and the RECOMPILE Options

Parameter Sniffing

Query parameterization promotes the reuse of cached execution plans, thereby avoiding unnecessary compilations, and reducing the number of ad-hoc queries in the plan cache.

These are all good things, provided the query being parameterized really ought to use the same cached execution plan for different parameter values. An execution plan that is efficient for one parameter value may not be a good choice for other possible parameter values.

When parameter sniffing is enabled (the default), SQL Server chooses an execution plan based on the particular parameter values that exist at compilation time. The implicit assumption is that parameterized statements are most commonly executed with the most common parameter values. This sounds reasonable enough (even obvious) and indeed it often works well.

A problem can arise when the cached plan is automatically recompiled. A recompilation may be triggered for all sorts of reasons, for example because an index used by the cached plan has been dropped (a correctness recompilation) or because statistical information has changed (an optimality recompilation).

Whatever the exact cause of the plan recompilation, there is a chance that an atypical value is being passed as a parameter at the time the new plan is generated. This can result in a new cached plan (based on the sniffed atypical parameter value) that is not good for the majority of executions for which it will be reused.

It is not easy to predict when a particular execution plan will be recompiled (for example, because statistics have changed sufficiently), so a good-quality reusable plan can suddenly be replaced by a quite different plan optimized for atypical parameter values.

One such scenario occurs when the atypical value is highly selective, resulting in a plan optimized for a small number of rows. Such plans will often use single-threaded execution, nested loops joins, and lookups. Serious performance issues can arise when this plan is reused for different parameter values that generate a much larger number of rows.
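
As a minimal sketch of that scenario (the procedure below uses AdventureWorks names but is illustrative; it is not taken from the original article), consider a procedure whose ideal plan shape depends entirely on how selective the sniffed parameter value happens to be:

CREATE PROCEDURE dbo.GetTransactionHistory
    @ProductID integer
AS
BEGIN
    SET NOCOUNT ON;

    -- The cached plan is compiled for the sniffed value of @ProductID.
    -- A plan built for a rare product (serial, nested loops, lookups) may be
    -- reused for a very common one, and vice versa.
    SELECT
        TH.TransactionID,
        TH.TransactionDate,
        TH.Quantity
    FROM Production.TransactionHistory AS TH
    WHERE TH.ProductID = @ProductID;
    -- Adding OPTION (RECOMPILE) to the SELECT is one of the options the full
    -- post explores: a fresh plan (with parameter embedding) per execution.
END;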

Thursday, 18 July 2013

Aggregates and Partitioning

The changes in the internal representation of partitioned tables between SQL Server 2005 and SQL Server 2008 resulted in improved query plans and performance in the majority of cases (especially when parallel execution is involved).

Unfortunately, the same changes caused some things that worked well in SQL Server 2005 to suddenly not work so well in SQL Server 2008 and later.

This post looks at one example where the SQL Server 2005 query optimizer produced a superior execution plan compared with later versions.

Monday, 8 July 2013

Working Around Missed Optimizations

In my last post, we saw how a query featuring a scalar aggregate could be transformed by the optimizer to a more efficient form. As a reminder, here’s the schema again:

Wednesday, 26 June 2013

Optimization Phases and Missed Opportunities

There are two complementary skills that are very useful in query tuning. The first is the ability to read and interpret execution plans. The second is knowing a bit about how the query optimizer translates SQL text into an execution plan.

Putting the two things together can help us spot times when an expected optimization was not applied, resulting in an execution plan that is not as efficient as it could be.

However, the lack of documentation about exactly which optimizations SQL Server can apply (and in what circumstances) means that a lot of this comes down to experience.
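
One small way to build that experience (a sketch added here; it is not part of the original post) is to compare snapshots of the optimizer phase counters in sys.dm_exec_query_optimizer_info taken before and after compiling a query of interest:

-- Requires VIEW SERVER STATE. The counters are instance-wide and cumulative,
-- so the difference between two snapshots shows which optimization phases
-- (trivial plan, search 0, search 1, search 2) a fresh compilation used.
SELECT
    QOI.[counter],
    QOI.occurrence,
    QOI.value
FROM sys.dm_exec_query_optimizer_info AS QOI
WHERE QOI.[counter] IN
    (N'trivial plan', N'search 0', N'search 1', N'search 2', N'timeout');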

Monday, 17 June 2013

Improving Partitioned Table Join Performance

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example of that, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing.
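
The rewrite in the full post is specific to its demo schema, but the general shape (a rough, hypothetical sketch only; every object name below is invented) is to drive the join one partition at a time using APPLY:

-- Assumes dbo.T1 and dbo.T2 are both partitioned on column PtnKey
-- using partition function PF.
SELECT
    R.col1,
    R.col2
FROM
(
    -- One row per partition of dbo.T1.
    SELECT DISTINCT P.partition_number
    FROM sys.partitions AS P
    WHERE P.[object_id] = OBJECT_ID(N'dbo.T1', N'U')
    AND P.index_id IN (0, 1)
) AS PN
CROSS APPLY
(
    -- Join only the rows in the matching partition of each table.
    SELECT T1.col1, T2.col2
    FROM dbo.T1 AS T1
    JOIN dbo.T2 AS T2
        ON T2.PtnKey = T1.PtnKey
    WHERE $PARTITION.PF(T1.PtnKey) = PN.partition_number
    AND $PARTITION.PF(T2.PtnKey) = PN.partition_number
) AS R;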

Tuesday, 11 June 2013

Hello Operator, My Switch Is Bored

This post is in two parts. The first part looks at the Switch execution plan operator. The second part is about an invisible plan operator and cardinality estimates on filtered indexes.

Thursday, 4 April 2013

Optimizer Limitations with Filtered Indexes

One of the filtered index use cases mentioned in the product documentation concerns a column that contains mostly NULL values. The idea is to create a filtered index that excludes the NULLs, resulting in a smaller nonclustered index that requires less maintenance than the equivalent unfiltered index.

Another popular use of filtered indexes is to filter NULLs from a UNIQUE index, giving the behaviour users of other database engines might expect from a default UNIQUE index or constraint: Uniqueness enforced only for non-NULL values.

Unfortunately, the query optimizer has limitations where filtered indexes are concerned. This post looks at a couple of less well-known examples.
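
For reference (a sketch, not taken from the original post; the table and index names are invented), the two use cases mentioned above look like this:

-- Illustrative table: most rows are expected to have a NULL alt_key.
CREATE TABLE dbo.FilteredIndexDemo
(
    pk integer IDENTITY NOT NULL PRIMARY KEY,
    alt_key integer NULL
);

-- Use case 1: a smaller nonclustered index that excludes the many NULLs.
CREATE NONCLUSTERED INDEX IX_FilteredIndexDemo_alt_key
ON dbo.FilteredIndexDemo (alt_key)
WHERE alt_key IS NOT NULL;

-- Use case 2: uniqueness enforced only for non-NULL values
-- (in practice you would choose one index or the other, not both).
CREATE UNIQUE NONCLUSTERED INDEX UQ_FilteredIndexDemo_alt_key
ON dbo.FilteredIndexDemo (alt_key)
WHERE alt_key IS NOT NULL;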

Friday, 1 February 2013

A creative use of IGNORE_DUP_KEY

Let’s say you have a big table with a clustered primary key, and an application that inserts batches of rows into it. The nature of the business is that the batch will inevitably sometimes contain rows that already exist in the table.

The default SQL Server INSERT behaviour for such a batch is to throw error 2627 (primary key violation), terminate the statement, roll back all the inserts (not just the rows that conflicted), and keep any active transaction open.
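
A minimal sketch of that default behaviour (the table, key values, and constraint name below are invented; this is not the original demo script):

-- Target table with a clustered primary key.
CREATE TABLE dbo.BigTable
(
    pk integer NOT NULL,
    CONSTRAINT PK_BigTable PRIMARY KEY CLUSTERED (pk)
);

INSERT dbo.BigTable (pk) VALUES (1), (2), (3);

-- This batch contains one existing key (2). The whole statement fails with
-- error 2627; none of the rows (including 4 and 5) are inserted.
INSERT dbo.BigTable (pk) VALUES (2), (4), (5);

-- Recreating the primary key WITH (IGNORE_DUP_KEY = ON) changes the behaviour:
-- the duplicate row is discarded with a warning, and rows 4 and 5 are inserted.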

Wednesday, 18 January 2012

Dynamic Seeks and Hidden Implicit Conversions

A LIKE predicate with only a trailing wildcard can usually use an index seek, as the following AdventureWorks sample database query shows:

SELECT 
    P.[Name]
FROM Production.Product AS P
WHERE 
    P.[Name] LIKE N'D%';
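
By contrast (a variation added here; it is not part of the original excerpt), when the pattern arrives in a variable or parameter the seek range cannot be hard-coded into the plan, so SQL Server builds a dynamic seek whose range boundaries are computed at execution time:

DECLARE @Pattern nvarchar(50) = N'D%';

SELECT 
    P.[Name]
FROM Production.Product AS P
WHERE 
    P.[Name] LIKE @Pattern;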

Thursday, 4 August 2011

Avoiding Uniqueness for Performance

In my last post, Enforcing Uniqueness for Performance, I showed how using a unique index could speed up equality seeks by around 40%.

Friday, 29 July 2011

Enforcing Uniqueness for Performance

A little while back, I posted a short series on seeks and scans:

One of the things I highlighted in the middle post was the difference between a singleton seek and a range scan:

  • A singleton equality seek always retrieves exactly one row, and is guaranteed to do so because a unique index exists to enforce it.

  • A range scan seeks down the B-tree to a starting (or ending) point, and scans forward (or backward) from that point using the next or previous page pointers.

Today’s short post shows how much faster a singleton seek is, compared with a range scan, even when both return exactly the same number of records.
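
As a sketch of the kind of comparison involved (the table and index names below are invented; this is not the original test rig), the same equality predicate becomes a singleton seek when the index is unique, and a range scan when it is not:

CREATE TABLE dbo.SeekDemo
(
    pk integer IDENTITY NOT NULL PRIMARY KEY,
    val integer NOT NULL
);

-- With this unique index, the query below qualifies for a singleton seek.
CREATE UNIQUE NONCLUSTERED INDEX UQ_SeekDemo_val
ON dbo.SeekDemo (val);

-- With a non-unique index instead, for example:
--     CREATE NONCLUSTERED INDEX IX_SeekDemo_val ON dbo.SeekDemo (val);
-- the same query performs a range scan that must check for further matches.
SELECT D.pk
FROM dbo.SeekDemo AS D
WHERE D.val = 123;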

Thursday, 7 July 2011

Bitmap Magic (or… how SQL Server uses bitmap filters)

Question

Can a parallel query use less CPU than the same serial query, while executing faster?

The answer is yes! To demonstrate, I’ll use the following two (heap) tables, each containing a single column typed as integer:

Tables #BuiltInt and #Probe
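
A sketch of that setup (the table names follow the caption above; the column name and the test query are assumptions, not the original script):

CREATE TABLE #BuiltInt (col1 integer NOT NULL);
CREATE TABLE #Probe (col1 integer NOT NULL);

-- With enough rows loaded into each heap, a query like this can receive a
-- parallel hash join plan in which a bitmap built from the #BuiltInt rows
-- filters #Probe rows during the scan, before they ever reach the join.
SELECT COUNT_BIG(*)
FROM #BuiltInt AS B
JOIN #Probe AS P
    ON P.col1 = B.col1;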

Wednesday, 28 July 2010

The “Segment Top” Query Optimization

A question that often comes up on the forums is how to get the first or last row from each group of records in a table. This post describes a clever query plan optimisation that SQL Server can use for these types of query.
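
A common way to write such a query (a sketch using AdventureWorks names; it is not necessarily the exact form used in the post) asks for the latest transaction per product with ROW_NUMBER:

WITH Ranked AS
(
    SELECT
        TH.ProductID,
        TH.TransactionID,
        TH.TransactionDate,
        rn = ROW_NUMBER() OVER (
            PARTITION BY TH.ProductID
            ORDER BY TH.TransactionDate DESC)
    FROM Production.TransactionHistory AS TH
)
SELECT
    R.ProductID,
    R.TransactionID,
    R.TransactionDate
FROM Ranked AS R
WHERE R.rn = 1;

-- The full post shows how SQL Server can satisfy this kind of per-group
-- requirement with a Segment and Top combination when a suitable index
-- provides the required order.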