Thursday, 05 August 2010

Iterators, Query Plans, and Why They Run Backwards

Iterators

SQL Server uses an extensible architecture for query optimization and execution, using iterators as the basic building blocks. Iterators are probably most familiar in their graphical showplan representation, where each icon represents a single iterator. They also show up in XML query plan output as RelOp nodes:

Iterators

Each iterator performs a single simple function, such as applying a filtering condition, or performing an aggregation. It can represent a logical operation, a physical operation, or (most often) both. For example, Aggregate is a logical operation. Stream Aggregate and Hash Aggregate are physical operations. Similarly, Inner Join is a logical operation, and Nested Loops is a physical one.

Both graphical plans and XML showplan output show these properties:

Physical and Logical Operations

Logical iterators can be combined in complex ways to satisfy any requirement that can be expressed in SQL. The physical iterators combine to produce an executable tree used by the execution engine to produce results matching the logical query specification. The logical iterators represent the original query translated into relational algebra. The physical iterators represent a physical algebra; the concrete algorithms used by the SQL Server storage engine and query processor to perform a logical task.

Those of you that followed my series of posts on optimizer internals may remember that implementation rules in the query optimizer convert a logical requirement to one of the possible physical implementations. Each round of exploration rules expand the search space by applying logically-equivalent transformations, before implementation rules run to introduce physical iterators. Once that’s done, cost-based decisions can be made to select a final plan for execution.

A Common Interface and Design

Iterators are code objects, which share a common interface of methods and properties. The most frequently used of these are the methods used to process rows, set and retrieve property information, and produce a cost estimate for the optimizer. These common features (and the abstractness of the design) enable easier extensibility as new database and language features are introduced. It also allows iterators to be connected in a tree structure without any of them needing to know much about the other iterator(s) it might be connected to.

The executable code that forms the core of the iterator implementation is also written to avoid hard-wired dependencies on the parent and child iterators in the plan tree where possible. This avoids the need to recompile the iterator base code each time it is incorporated in a new plan. Those of you that have some programming experience will recognise this arrangement as one that uses virtual function pointers.

All iterators require a small fixed amount of memory to store state information (think local variables) while they are executing. This memory forms part of the compiled plan, so a formal memory grant is not required each time a plan is executed. The cached size of a plan includes this memory allocation.

The common methods most relevant to the present discussion are those used to process a set of rows. These are the Open or Init, GetNext, and Close methods. These names might give you a clue as to why query plan nodes are known as iterators.

Plan Execution

Let’s look at a simple plan (the query itself isn’t important):

Simple plan

Have you ever wondered how a plan like that is executed? Where does it start, and how does it know when it has finished?

Judging from the lines and arrows that connect the iterators in a graphical plan, we might assume that execution starts with the data-producing iterators (the scans) and continues until they run out of rows. The arrow heads on the connecting lines do indicate the direction of data flow, but the iterator execution order is quite different. Query execution always begins with the root iterator—the hash join in the diagram above (though technically execution starts with the SELECT icon).

The execution engine only directly interacts with this root iterator. It calls its Open method to prepare it to return data, and then repeatedly calls its GetNext method to retrieve one row of query output at a time. As soon as the GetNext method call fails to return a new row, the engine calls the Close method, and query execution ends. Simple as that.

The Ultimate Cursors

Ok, so there’s a bit more to it than that. Let’s see what happens when the Open method is called on the hash join iterator:

The hash join calls the Open method on its build input.
The Open method of the Scan iterator prepares the storage engine to start producing rows from the Product table. Control then returns to the hash join (which is still running its Open method, remember).
The hash join repeatedly calls the GetRow method on the Scan iterator.
Each GetRow call returns a single row from the Scan to the hash join. These rows are used to build the join hash table.
As soon as a call to GetRow on the Product table Scan reports that there are no more rows, the hash join calls the Close method on the Scan.
The Scan iterator performs its clean-up tasks, and control again returns to the hash join.
The hash join now calls Open on its inner input (the probe input).
The Inventory table Scan iterator prepares its connection to the storage engine, and returns control to the hash join.
The hash join returns from its Open method call (finally!) and control returns to the execution engine.

The execution engine now repeatedly calls GetRow on the hash join, to return the query results:

The hash join calls GetRow on its probe input.
The Scan’s GetRow method returns a single row from the Inventory table.
The hash join performs the join using its hash table.
If the row does not join, go back to step 1.
If the row does join, return it to the engine.

This process continues until the probe input Scan reports that no more rows are available. The hash join then also returns from its GetRow method without producing a row. When that happens, the execution engine knows that the query is complete, and calls the Close method on the hash join, to allow it to clean up:

The hash join calls Close on its probe input.
The Scan performs its clean-up tasks and returns control to the hash join.
The hash join deallocates its hash table, and returns from the Close method.

This row-by-row processing—calling the GetNext method repeatedly—is why query plan nodes are referred to as iterators: They iterate over a set of input rows, returning one row at a time to its parent. Execution of a query plan starts at the root, and flows down the plan via a series of nested method calls. Data is pulled up the plan in the reverse direction: from the leaves toward the root, a row at a time. Query plans do run backwards 🙂

Inside an Iterator

Exactly what is done in the Open, GetNext, and Close methods depends on the type of iterator, and the mode in which it is running.

The following table gives a broad outline of the work performed by different iterators when these methods are called:

Iterator activity

There is a lot more to say about query plan execution and iterators, so I will return to this topic in a future post.