About This Blog

A copy of my content originally hosted on SQLBlog.com
Showing posts with label Aggregates.

Monday, 12 March 2012

Fun with Scalar and Vector Aggregates

There are interesting things to be learned from even the simplest queries.

For example, imagine you are asked to write a query that lists AdventureWorks product names, where the product has at least one entry in the transaction history table, but fewer than ten.
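
One way the request above might be written is sketched here. It assumes the standard AdventureWorks tables `Production.Product` and `Production.TransactionHistory`, joined on `ProductID`; it is an illustration only, and not necessarily the form the full post goes on to analyse.

```sql
-- A minimal sketch, assuming the AdventureWorks sample database.
-- The inner join keeps only products with at least one history row;
-- the HAVING clause then keeps groups with fewer than ten rows.
SELECT
    p.Name
FROM Production.Product AS p
JOIN Production.TransactionHistory AS th
    ON th.ProductID = p.ProductID
GROUP BY
    p.ProductID,
    p.Name
HAVING
    COUNT_BIG(*) < 10;
```

Grouping on `ProductID` as well as `Name` keeps one output row per product even if two products were to share a name.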

Tuesday, 6 December 2011

SQL Server Optimizer Bug with JOIN and GROUP BY

I came across a SQL Server optimizer bug recently that made me wonder how on earth I never noticed it before.

As the title of this post suggests, the bug occurs in common JOIN and GROUP BY queries. While it does not cause incorrect results to be returned, it will often cause a poor query plan to be selected by the optimizer.

If you are just interested in the bug itself, you will find a description in the section headed “the bug revealed”. It relates to cardinality estimation for serial partial aggregates.

As the regular reader will be expecting though, I am going to work up to it with a bit of background. The lasting value of this post (once the bug is fixed) is in the background details anyway.

Sunday, 4 December 2011

Is Distinct Aggregation Still Considered Harmful?

Back in 2008, Marc Friedman of the SQL Server Query Processor Team wrote a blog entry entitled “Distinct Aggregation Considered Harmful”.

Marc shows a way to work around the poor performance that often results simply from adding the keyword DISTINCT to an otherwise perfectly reasonable aggregate function in a query.

This post is an update to that work, presenting a query optimizer enhancement in SQL Server 2012 that reduces the need to perform the suggested rewrite manually.
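
To give a feel for the kind of manual rewrite being discussed, here is a small self-contained sketch. The table variable `@Sales` and its columns are hypothetical, and the two-stage form shown is the general idea of pre-aggregating on the distinct column, not necessarily the exact rewrite in Marc's entry.

```sql
-- Hypothetical sample data (multi-row VALUES needs SQL Server 2008 or later)
DECLARE @Sales table
(
    region integer NOT NULL,
    customer integer NOT NULL,
    amount integer NOT NULL
);

INSERT @Sales (region, customer, amount)
VALUES (1, 100, 5), (1, 100, 7), (1, 200, 3), (2, 300, 4);

-- Original form: an ordinary aggregate mixed with a DISTINCT aggregate
SELECT
    s.region,
    SUM(s.amount) AS total,
    COUNT(DISTINCT s.customer) AS customers
FROM @Sales AS s
GROUP BY
    s.region;

-- Manual rewrite: group by (region, customer) first, so each customer
-- appears once per region, then aggregate again without DISTINCT
SELECT
    pre.region,
    SUM(pre.partial_amount) AS total,
    COUNT(*) AS customers
FROM
(
    SELECT
        s.region,
        s.customer,
        SUM(s.amount) AS partial_amount
    FROM @Sales AS s
    GROUP BY
        s.region,
        s.customer
) AS pre
GROUP BY
    pre.region;
```

Both queries return a total of 15 with 2 customers for region 1, and a total of 4 with 1 customer for region 2. The rewrite relies on `customer` being declared NOT NULL; with a nullable column, `COUNT(*)` would need to become `COUNT(pre.customer)`.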

Saturday, 2 July 2011

Undocumented Query Plans: The ANY Aggregate

As usual, here’s a sample table:

CREATE TABLE #Example
(
    pk numeric IDENTITY PRIMARY KEY NONCLUSTERED,
    col1 sql_variant NULL,
    col2 sql_variant NULL,
    thing sql_variant NOT NULL,
);

Some sample data:

[Image: sample data]

And an index that will be useful shortly:

CREATE INDEX nc1 
ON #Example
    (col1, col2, thing);

There’s a complete script to create the table and add the data at the end of this post. There’s nothing special about the table or the data (except that I wanted to have some fun with values and data types).

Sunday, 27 February 2011

SQL Server Bug: Slow T-SQL Sums and Averages

It’s a curious thing about SQL that the SUM or AVG of no items (an empty set) is not zero; it’s NULL.

In this post, you’ll see how this means your SUM and AVG calculations might run at half speed, or worse. As usual though, this entry is not so much about the result, but the journey we take to get there.
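
The empty-set behaviour is easy to confirm with a quick test. The table variable `@Empty` below is a hypothetical name used only for this sketch.

```sql
-- An empty table variable: no rows are ever inserted
DECLARE @Empty table (val integer NOT NULL);

-- A scalar aggregate over no rows still returns one row.
-- SUM and AVG are NULL; only the count is zero.
SELECT
    SUM(e.val) AS sum_val,
    AVG(e.val) AS avg_val,
    COUNT_BIG(*) AS row_count
FROM @Empty AS e;

-- If zero is wanted instead, the NULL has to be replaced explicitly
SELECT
    ISNULL(SUM(e.val), 0) AS sum_or_zero
FROM @Empty AS e;
```

The first query returns a single row containing NULL, NULL, and 0; the second returns a single row containing 0.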

Sunday, 22 August 2010

Row Goals and Grouping

You might recall from Inside the Optimizer: Row Goals In Depth that query plans containing a row goal tend to favour nested loops or sort-free merge join over hashing.

This is because a hash join has to fully process its build input (to populate its hash table) before it can start probing for matches on its other input. Hash join therefore has a high start-up cost, balanced by a lower per-row cost once probing begins.

In this post, we will take a look at how row goals affect grouping operations.

Wednesday, 28 July 2010

Ranking Function Optimizer Transformations

In my last post I showed how SQL Server 2005 and later can use a Segment Spool to implement aggregate window functions and the NTILE ranking function.

The query optimizer is also smart enough to recognise that some queries are logically equivalent to a window function, even if they are written using different syntax.

Partitioning and the Common Subexpression Spool

SQL Server 2005 introduced the OVER clause to enable partitioning of rowsets before applying a window function. This post looks at how this feature may require a query plan containing a ‘common subexpression spool’. This query plan construction is required whenever an aggregate window function or the NTILE ranking window function is used.
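
For readers who have not used the OVER clause, the sketch below shows the two kinds of function mentioned above applied to partitioned rows. The table variable `@T` and its data are hypothetical, and the example illustrates the syntax only; the query plan it produces is not shown here.

```sql
-- Hypothetical sample data (multi-row VALUES needs SQL Server 2008 or later)
DECLARE @T table
(
    grp integer NOT NULL,
    val integer NOT NULL
);

INSERT @T (grp, val)
VALUES (1, 10), (1, 20), (2, 30);

SELECT
    t.grp,
    t.val,
    -- Aggregate window function: each row sees its partition's total
    SUM(t.val) OVER (PARTITION BY t.grp) AS grp_total,
    -- NTILE needs the number of rows in the partition to assign buckets
    NTILE(2) OVER (PARTITION BY t.grp ORDER BY t.val) AS half
FROM @T AS t;
```

Both functions need information about the whole partition (its total, or its row count) before any row in that partition can be returned, which is why the partition's rows have to be available more than once.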