Ranking Function Optimizer Transformations
Window Function Transforms
In my last post, I showed how SQL Server 2005 and later can use a Segment Spool to implement aggregate window functions and the NTILE
ranking function.
The query optimizer is also smart enough to recognise that some queries are logically equivalent to a window function, even if they are written using different syntax.
As a reminder, hereā€™s the sample data defining what the AdventureWorks report output should look like:
Hereā€™s a query written to meet the same requirements, but without the SUM...OVER
construction used last time:
SELECT
INV.Bin,
INV.Quantity,
INV.ProductID,
INV.Quantity,
SubAgg.total_in_bin
FROM Production.ProductInventory INV
JOIN
(
-- Calculates the total quantity
-- in each bin
SELECT
INV2.Shelf,
INV2.Bin,
total_in_bin = SUM(INV2.Quantity)
FROM Production.ProductInventory AS INV2
GROUP BY
INV2.Shelf,
INV2.Bin
) AS SubAgg
ON SubAgg.Shelf = INV.Shelf
AND SubAgg.Bin = INV.Bin
WHERE
INV.Shelf >= 'A'
AND INV.Shelf <= 'C'
ORDER BY
INV.Shelf,
INV.Bin;
This query is simplified by the optimizer as if it had been written:
SELECT
INV.Shelf,
INV.Bin,
INV.ProductID,
INV.Quantity,
total_in_bin =
SUM(INV.Quantity) OVER (
PARTITION BY INV.Shelf, INV.Bin)
FROM Production.ProductInventory AS INV
WHERE
INV.Shelf BETWEEN 'A' AND 'C'
ORDER BY
INV.Shelf,
INV.Bin;
Both query forms produce the same Segment Spool query plan:
There are many limitations on the query forms that can be simplified to the logically-equivalent window function representation.
For example, changing the predicates on INV.Shelf
to IN (N'A', N'B', N'C')
in the first query breaks the transformation, producing a quite different plan where the aggregation is pushed below a merge join:
This is still a good plan, but it has a somewhat higher estimated cost (0.013) compared to the Segment Spool plan (0.0074). Itā€™s not a huge absolute difference, but it does show that optimizer support for this transformation is relatively shallow at present.
Another example of this lack of depth is that the transformation is not performed in reverse: A query written to use an aggregate window function (and therefore the Segment Spool) cannot be transformed to a join equivalent (as in the Merge Join plan shown above).
This is not the result a cost-based decision; there are examples where the Merge Join plan has a lower estimated cost, but the optimizer still does not consider it. In such cases, we would need to rewrite the original query to help the optimizer out.
In fact, if we disable the optimizer rule that transforms a partitioned windowed aggregate into a Segment Spool, the SUMā€¦OVER
form of the query shown above fails to compile at all:
Internal Query Processor Error: The query processor could not produce a query plan. For more information, contact Customer Support Services.
More details on that sort of dangerous mucking about with optimizer internals will be in future posts. Thanks for reading.