More Consistent Execution Plan Timings in SQL Server 2022
The updated showplan schema shipped with SSMS 19 preview 2 contains an interesting comment:
ExclusiveProfileTimeActive: true if the actual elapsed time (ActualElapsedms attribute) and the actual CPU time (ActualCPUms attribute) represent the time interval spent exclusively within the relational iterator.
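Those counters live on the RunTimeCountersPerThread elements of an actual plan. As a minimal sketch (how you capture the plan XML into @plan is up to you), they can be extracted like this:

-- Extract per-operator, per-thread timing counters from an
-- actual execution plan captured as XML.
DECLARE @plan xml;  -- assign a captured actual-plan XML here

WITH XMLNAMESPACES
    (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
    node_id    = op.value('@NodeId', 'int'),
    thread     = rt.value('@Thread', 'int'),
    elapsed_ms = rt.value('@ActualElapsedms', 'bigint'),
    cpu_ms     = rt.value('@ActualCPUms', 'bigint')
FROM @plan.nodes('//RelOp') AS q(op)
CROSS APPLY op.nodes(
    'RunTimeInformation/RunTimeCountersPerThread') AS r(rt);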
It sometimes makes sense to add OPTION (RECOMPILE) to a query. Typically this will be when:
A good enough plan for the query is very sensitive to one or more parameters
No good single value exists for the parameter to use in a hint
OPTIMIZE FOR UNKNOWN doesn’t give a good result
The plan might be expected to change over time
The cost of recompiling the statement is much less than the expected execution time
Recompiling every time is very likely to save more time and resources than it costs overall
All that is fairly well-known. The point of this short post is to draw your attention to another side-effect of adding OPTION (RECOMPILE) — the parameter embedding optimization (PEO).
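As a sketch of that effect (object names here are illustrative, not from any particular schema): with OPTION (RECOMPILE), the optimizer can embed the runtime parameter values as literals before optimization, often simplifying "catch-all" predicates away entirely.

-- Hypothetical catch-all search procedure
CREATE OR ALTER PROCEDURE dbo.SearchPeople
    @City nvarchar(50) = NULL
AS
BEGIN
    SELECT P.PersonID, P.FullName
    FROM dbo.People AS P
    WHERE (@City IS NULL OR P.City = @City)
    OPTION (RECOMPILE);
    -- PEO replaces @City with its runtime value at compile time,
    -- so for a non-NULL @City the (@City IS NULL OR ...) test
    -- collapses to a simple seekable predicate on P.City.
END;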
Note: After you apply CU 16 for SQL Server 2016 SP2, you might encounter an issue in which DML (insert/update/delete) queries that use parallel plans cannot complete any execution and encounter HP_SPOOL_BARRIER waits. You can use the trace flag 13116 or MAXDOP=1 hint to work around this issue. This issue is related to the introduction of fix for 13685819 and it will be fixed in the next Cumulative Update.
That warning links to bug reference 13685819 on the same page. There isn’t a separate KB article, only the description:
Fixes an issue where an insert query in SQL Server 2016 that reads data from the same table and uses a parallel execution plan may produce duplicate rows
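Per the KB wording, the workarounds look something like the following sketch (the statement is a placeholder; the trace flag could equally be set at startup):

-- Workaround 1: force a serial plan for the affected DML
UPDATE dbo.SomeTable          -- placeholder statement
SET SomeColumn = SomeColumn
OPTION (MAXDOP 1);

-- Workaround 2: enable trace flag 13116 (shown here globally)
DBCC TRACEON (13116, -1);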
In When Do SQL Server Sorts Rewind? I described how most sorts can only rewind when they contain at most one row. The exception is in-memory sorts, which can rewind at most 500 rows and 16KB of data.
These are certainly tight restrictions, but we can still make use of them on occasion.
To illustrate, I am going to reuse a demo Itzik Ben-Gan provided in part one of his Closest Match series, specifically solution 2 (modified value range and indexing).
As Itzik’s title suggests, the task is to find, for each value in one table, the closest matching value in a second table.
As Itzik describes it:
The challenge is to match to each row from T1 the row from T2 where the absolute difference between T2.val and T1.val is the lowest. In case of ties (multiple matching rows in T2), match the top row based on val ascending, keycol ascending order.
That is, the row with the lowest value in the val column, and if you still have ties, the row with the lowest keycol value. The tiebreaker is used to guarantee determinism.
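Before getting to the rewind angle, a plain statement of the requirement in T-SQL might look like this (a naive sketch, not one of Itzik’s optimized solutions):

SELECT
    T1.keycol, T1.val,
    A.keycol AS keycol2, A.val AS val2
FROM dbo.T1 AS T1
CROSS APPLY
(
    -- Closest T2 row; ties broken by val, then keycol
    SELECT TOP (1) T2.keycol, T2.val
    FROM dbo.T2 AS T2
    ORDER BY ABS(T2.val - T1.val), T2.val, T2.keycol
) AS A;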
The SQL Server 2019 query optimizer has a new trick available to improve the performance of large aggregations. The new exploration abilities are encoded in two new closely-related optimizer rules:
GbAggSplitToRanges
SelOnGbAggSplitToRanges
The extended event query_optimizer_batch_mode_agg_split is provided to track when this new optimization is considered. The description of this event is:
Occurs when the query optimizer detects batch mode aggregation is likely to spill and tries to split it into multiple smaller aggregations.
Other than that, this new feature hasn’t been documented yet. This article is intended to help fill that gap.
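One way to watch for the optimization being considered is an event session along these lines (the session name and target are arbitrary choices):

CREATE EVENT SESSION AggSplit ON SERVER
ADD EVENT sqlserver.query_optimizer_batch_mode_agg_split
ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION AggSplit ON SERVER
    STATE = START;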
A bug with Halloween Protection and the OUTPUT Clause
Background
The OUTPUT clause can be used to return results from an INSERT, UPDATE, DELETE, or MERGE statement. The data can be returned to the client, inserted into a table, or both.
There are two ways to add OUTPUT data to a table:
Using OUTPUT INTO
With an outer INSERT statement
For example:
-- Test table
DECLARE @Target table
(
    id integer IDENTITY (1, 1) NOT NULL,
    c1 integer NULL
);

-- Holds rows from the OUTPUT clause
DECLARE @Output table
(
    id integer NOT NULL,
    c1 integer NULL
);
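Using those variables, the two forms look like this (a sketch; the inserted values are arbitrary):

-- 1. Using OUTPUT INTO
INSERT @Target (c1)
OUTPUT inserted.id, inserted.c1
    INTO @Output (id, c1)
VALUES (100);

-- 2. Using an outer INSERT statement (composable DML)
INSERT @Output (id, c1)
SELECT X.id, X.c1
FROM
(
    INSERT @Target (c1)
    OUTPUT inserted.id, inserted.c1
    VALUES (200)
) AS X;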
One of the transformations available to the SQL Server query optimizer is pulling a logical Group By (and any associated aggregates) above a Join.
Visually, this means transforming a logical tree where the Group By sits below the Join into one where the Group By sits above the Join.
The above diagrams are logical representations. They need to be implemented as physical operators to appear in an execution plan. The options are:
Group By:
    Hash Match Aggregate
    Stream Aggregate
    Distinct Sort

Join:
    Nested Loops Join
    Nested Loops Apply
    Hash Match Join
    Merge Join
When the optimizer moves a Group By above a Join it has to preserve the semantics. The new sequence of operations must be guaranteed to return the same results as the original in all possible circumstances.
One cannot just pick up a Group By and arbitrarily move it around the query tree without risking incorrect results.
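To make the idea concrete, here is the rewrite expressed in SQL against hypothetical tables (the equivalent query shapes, not the optimizer's internal mechanism):

-- Group By below the Join
SELECT H.SalesID, H.CustomerID, T.TotalDue
FROM dbo.SalesHeader AS H
JOIN
(
    SELECT L.SalesID, SUM(L.Amount) AS TotalDue
    FROM dbo.SalesLine AS L
    GROUP BY L.SalesID
) AS T
    ON T.SalesID = H.SalesID;

-- Group By above the Join. Equivalent here only because SalesID is
-- a key of SalesHeader: the join cannot duplicate any SalesLine row,
-- and a group survives the join in both shapes exactly when a
-- matching header row exists.
SELECT H.SalesID, H.CustomerID, SUM(L.Amount) AS TotalDue
FROM dbo.SalesHeader AS H
JOIN dbo.SalesLine AS L
    ON L.SalesID = H.SalesID
GROUP BY H.SalesID, H.CustomerID;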
This is a companion post to my main article Batch Mode Bitmaps in SQL Server. This post provides demos and illustrations to supplement the technical article.
The scripts presented here were run on SQL Server 2017 CU 16.
SQL is a declarative language. We use SQL to write a logical query specification that defines the results we want. For example, we might write a query using either APPLY or JOIN that logically describes exactly the same results.
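Concretely, the following pair (against hypothetical tables; the names are invented) are two spellings of the same logical specification:

-- JOIN form
SELECT P.ProductID, TH.Quantity
FROM dbo.Product AS P
JOIN dbo.TransactionHistory AS TH
    ON TH.ProductID = P.ProductID;

-- APPLY form
SELECT P.ProductID, TH.Quantity
FROM dbo.Product AS P
CROSS APPLY
(
    SELECT T.Quantity
    FROM dbo.TransactionHistory AS T
    WHERE T.ProductID = P.ProductID
) AS TH;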
It is up to the query optimizer to find an efficient physical implementation of that logical requirement. SQL Server is free to choose any plan it likes, so long as the results are guaranteed to be the same as specified in the original SQL.
The optimizer is capable of transforming an apply to a join and vice versa. It generally tries to rewrite apply to join during initial compilation, to maximize the searchable plan space during cost-based optimization. Having transformed an apply to a join early on, it may also consider a transformation back to an apply shape later, to assess the merits of, for example, an index loops join.
Cardinality Estimation for Disjunctive Predicates in SQL Server 2014
Introduction
Back in January 2014, I wrote an article called Cardinality Estimation for Multiple Predicates that described the cardinality estimation process for queries with multiple predicates, from the point of view of the old and new cardinality estimators.
The article describes the various behaviours and formulas involved, along with the usual sprinkling of documented and undocumented trace flags. I described the formula SQL Server 2014 uses to calculate a cardinality estimate for multiple predicates connected with AND (conjunctive predicates), which was already relatively well-known.
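For reference, that conjunctive formula sorts the individual predicate selectivities from most to least selective and applies exponential backoff to at most the first four:

    S = s1 * s2^(1/2) * s3^(1/4) * s4^(1/8)

The final cardinality estimate is S multiplied by the table's row count.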
Despite some fairly energetic research, and basic-to-intermediate skills with Excel, I was unable to deduce a similar formula for the disjunctive case, where predicates are connected by OR. The trace flag output I describe in the other article clearly showed that exponential backoff was used in the new 2014 cardinality estimator, but the precise formula eluded me.
Nested loops join query plans can be a lot more interesting (and complicated) than is commonly realized.
One query plan area I get asked about a lot is prefetching. It is not documented in full detail anywhere, so this seems like a good topic to address in a blog post.
The examples used in this article are based on questions asked by Adam Machanic.
In my last post, we saw how a query featuring a scalar aggregate could be transformed by the optimizer to a more efficient form. As a reminder, here’s the schema again:
There are two complementary skills that are very useful in query tuning. One is the ability to read and interpret execution plans. The second is knowing a bit about how the query optimizer works to translate SQL text into an execution plan.
Putting the two things together can help us spot times when an expected optimization was not applied, resulting in an execution plan that is not as efficient as it could be.
The lack of documentation around exactly which optimizations SQL Server can apply (and in what circumstances) means that a lot of this comes down to experience, however.
This post is in two parts. The first part looks at the Switch execution plan operator. The second part is about an invisible plan operator and cardinality estimates on filtered indexes.
One of the filtered index use cases mentioned in the product documentation concerns a column that contains mostly NULL values. The idea is to create a filtered index that excludes the NULLs, resulting in a smaller nonclustered index that requires less maintenance than the equivalent unfiltered index.
Another popular use of filtered indexes is to filter NULLs from a UNIQUE index, giving the behaviour users of other database engines might expect from a default UNIQUE index or constraint: Uniqueness enforced only for non-NULL values.
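Both patterns look something like this (the table and column names are invented for illustration):

-- 1. Exclude the mostly-NULL rows to keep the index small
CREATE NONCLUSTERED INDEX IX_Orders_RefNumber
ON dbo.Orders (RefNumber)
WHERE RefNumber IS NOT NULL;

-- 2. Enforce uniqueness among non-NULL values only
CREATE UNIQUE NONCLUSTERED INDEX UQ_Orders_RefNumber
ON dbo.Orders (RefNumber)
WHERE RefNumber IS NOT NULL;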
Unfortunately, the query optimizer has limitations where filtered indexes are concerned. This post looks at a couple of less well-known examples.
Since their introduction in SQL Server 2005, window functions like ROW_NUMBER and RANK have proven to be extremely useful in solving a wide variety of common T-SQL problems. In an attempt to generalize such solutions, database designers often look to incorporate them into views to promote code encapsulation and reuse.
Unfortunately, a limitation in the SQL Server query optimizer often means that views containing window functions do not perform as well as expected. This post works through an illustrative example of the problem, details the reasons, and provides a number of workarounds.
Note: The limitation described here was first fixed in SQL Server 2017 CU 30. Optimizer fixes must be enabled using trace flag 4199 or the QUERY_OPTIMIZER_HOTFIXES database scoped configuration option.
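A sketch of the pattern in question (schema invented for illustration): a view computes a per-customer ranking, and a query then filters on the partitioning column.

CREATE OR ALTER VIEW dbo.RankedOrders
AS
SELECT
    O.CustomerID,
    O.OrderID,
    O.Amount,
    rn = ROW_NUMBER() OVER (
        PARTITION BY O.CustomerID
        ORDER BY O.Amount DESC)
FROM dbo.Orders AS O;
GO
-- Ideally the CustomerID predicate would be pushed below the
-- window function; without the fix, the ranking may be computed
-- for every customer before the filter is applied.
SELECT R.OrderID, R.Amount
FROM dbo.RankedOrders AS R
WHERE R.CustomerID = 42
    AND R.rn = 1;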