The Eager Index Spool and The Optimizer
Introduction
An Eager Index Spool reads all rows from its child operator into an indexed worktable before it starts returning rows to its parent operator. In some respects, an eager index spool is the ultimate missing index suggestion, but it is not reported as such.
Cost assessment
Inserting rows into an indexed worktable is relatively low-cost, but not free. The optimizer must consider that the work involved saves more than it costs. For that to work out in the spool’s favour, the plan must be estimated to consume rows from the spool more than once. Otherwise, it might as well skip the spool, and just do the underlying operation that one time.
- To be accessed more than once, the spool must appear on the inner side of a nested loops join operator.
- Each iteration of the loop should seek to a particular index spool key value provided by the outer side of the loop.
That means that the join needs to be an apply, not a nested loops join. For the difference between the two, please see my article Apply versus Nested Loops Join.
Notable features
While an eager index spool may only appear on the inner side of an nested loops apply, it is not a “performance spool”. An eager index spool cannot be disabled with trace flag 8690 or the NO_PERFORMANCE_SPOOL
query hint.
Rows inserted into the index spool are not normally pre-sorted into index key order, which can result in index page splits. Undocumented trace flag 9260 can be used to generate a Sort operator before the index spool to avoid this. The downside is the extra sorting cost may dissuade the optimizer from choosing the spool option at all.
SQL Server does not support parallel inserts to a b-tree index. This means that everything below a parallel eager index spool runs on a single thread. The operators below the spool are still (misleadingly) marked with the parallelism icon. One thread is chosen to write to the spool. The other threads wait on EXECSYNC
while that completes. Once the spool is populated, it may be read from by parallel threads.
Index spools do not tell the optimizer they support output ordered by the spool’s index keys. If sorted output from the spool is required, you may see an unnecessary Sort operator. Eager index spools should often be replaced by a permanent index anyway, so this is a minor concern much of the time.
There are five optimizer rules that can generate an Eager Index Spool option (known internally as an index on-the-fly). We will look at three of these in detail to understand where eager index spools come from.
SelToIndexOnTheFly
This is the most common one. It matches one or more relational selections (a.k.a filters or predicates) just above a data-access operator. The SelToIndexOnTheFly
rule replaces the predicates with a seek predicate on an eager index spool.
Demo
An AdventureWorks sample database example is shown below:
SELECT
P.ProductID,
P.[Name],
P.SafetyStockLevel,
TH.Quantity
FROM Production.Product AS P
CROSS APPLY
(
SELECT MAX(TH.Quantity)
FROM Production.TransactionHistory AS TH
WHERE
TH.ProductID = P.ProductID
AND TH.Quantity < P.SafetyStockLevel
GROUP BY ()
) AS TH (Quantity)
WHERE
P.[Name] LIKE N'A%';
This execution plan has an estimated cost of 3.0881 units. Some points of interest:
- The Nested Loops Inner Join operator is an apply, with
ProductID
andSafetyStockLevel
from theProduct
table as outer references. - On the first iteration of the apply, the Eager Index Spool is fully populated from the Clustered Index Scan of the
TransactionHistory
table. - The spool’s worktable has a clustered index keyed on
(ProductID, Quantity)
. - Rows matching the predicates
TH.ProductID = P.ProductID
andTH.Quantity < P.SafetyStockLevel
are answered by the spool using its index. This is true for every iteration of the apply, including the first one. - The
TransactionHistory
table is only scanned once.
Sorted input to the spool
It is possible to enforce sorted input to the eager index spool, but this does affect estimated cost, as noted in the introduction. For the example above, enabling the undocumented trace flag produces a plan without a spool:
SELECT
P.ProductID,
P.[Name],
P.SafetyStockLevel,
TH.Quantity
FROM Production.Product AS P
CROSS APPLY
(
SELECT
MAX(TH.Quantity)
FROM Production.TransactionHistory AS TH
WHERE
TH.ProductID = P.ProductID
AND TH.Quantity < P.SafetyStockLevel
GROUP BY ()
) AS TH (Quantity)
WHERE
P.[Name] LIKE N'A%'
OPTION (QUERYTRACEON 9260);
The estimated cost of this Index Seek and Key Lookup plan is 3.11631 units. This is more than the cost of the plan with an index spool alone, but less than the plan with an index spool and sorted input.
To see a plan with sorted input to the spool, we need to increase the expected number of loop iterations. This gives the spool a chance to repay the extra cost of the Sort. One way to expand the number of rows expected from the Product
table is to make the Name
predicate less restrictive:
SELECT
P.ProductID,
P.[Name],
P.SafetyStockLevel,
TH.Quantity
FROM Production.Product AS P
CROSS APPLY
(
SELECT
MAX(TH.Quantity)
FROM Production.TransactionHistory AS TH
WHERE
TH.ProductID = P.ProductID
AND TH.Quantity < P.SafetyStockLevel
GROUP BY ()
) AS TH (Quantity)
WHERE
P.[Name] LIKE N'[A-P]%'
OPTION (QUERYTRACEON 9260);
This gives us an execution plan with sorted input to the spool:
Warning: Do not use this undocumented trace flag for anything except educational purposes. It is not supported and appears not to be very well tested. I have experienced more than a few retail assertions and SQL Server stack dumps when generating plans with this flag enabled.
JoinToIndexOnTheFly
This rule transforms an inner join to an apply, with an eager index spool on the inner side. At least one of the join predicates must be an inequality for this rule to be matched. This is a much more specialized rule than SelToIndexOnTheFly
, but the idea is much the same. In this case, the selection (predicate) being transformed to an index spool seek is associated with the join. The transformation from join to apply allows the join predicate to be moved from the join itself to the inner side of the apply.
Demo
SELECT
P.ProductID,
P.[Name],
P.SafetyStockLevel,
Quantity = MAX(TH.Quantity)
FROM Production.Product AS P
JOIN Production.TransactionHistory AS TH
ON TH.ProductID = P.ProductID
AND TH.Quantity < P.SafetyStockLevel
WHERE
P.[Name] LIKE N'[A-P]%'
GROUP BY
P.ProductID,
P.[Name],
P.SafetyStockLevel
OPTION (LOOP JOIN);
As before, we can request sorted input to the spool:
SELECT
P.ProductID,
P.[Name],
P.SafetyStockLevel,
Quantity = MAX(TH.Quantity)
FROM Production.Product AS P
JOIN Production.TransactionHistory AS TH
ON TH.ProductID = P.ProductID
AND TH.Quantity < P.SafetyStockLevel
WHERE
P.[Name] LIKE N'[A-P]%'
GROUP BY
P.ProductID,
P.[Name],
P.SafetyStockLevel
OPTION (LOOP JOIN, QUERYTRACEON 9260);
This time, the extra cost of sorting has encouraged the optimizer to choose a parallel plan.
An unwelcome side-effect is the Sort operator spills to tempdb. The total memory grant available for sorting is sufficient, but it is evenly split between parallel threads (as usual). As noted in the introduction, SQL Server does not support parallel inserts to a b-tree index, so the operators below the eager index spool run on a single thread. This single thread only gets a fraction of the memory grant, so the Sort spills to tempdb.
This side-effect is perhaps another reason the trace flag is undocumented and unsupported.
SelSTVFToIdxOnFly
This rule does the same thing as SelToIndexOnTheFly
, but for a streaming table-valued function (sTVF) row source. These sTVFs are used extensively internally to implement DMVs and DMFs among other things. They appear in modern execution plans as Table Valued Function operators (originally as remote table scans).
In the past, many of these sTVFs could not accept correlated parameters from an apply. They could accept literals, variables, and module parameters, just not apply outer references. There are still warnings about this in the documentation, but they are somewhat out of date now.
Anyway, the point is that sometimes it is not possible for SQL Server to pass an apply outer reference as a parameter to an sTVF. In that situation, it can make sense to materialize part of the sTVF result in an eager index spool. The present rule provides that ability.
Demo
The next code example shows a DMV query that is successfully converted from a join to an apply. Outer references are passed as parameters to the second DMV:
-- Transformed to an apply
-- Outer reference passed as a parameter
SELECT
DES.session_id,
DES.login_time,
DESWS.waiting_tasks_count
FROM sys.dm_exec_sessions AS DES
JOIN sys.dm_exec_session_wait_stats AS DESWS
ON DESWS.session_id = DES.session_id
OPTION (FORCE ORDER);
The plan properties of the wait stats TVF show the input parameters. The second parameter value is provided as an outer reference from the sessions DMV:
It is a shame that sys.dm_exec_session_wait_stats
is a view, not a function, because that prevents us writing an apply directly.
The rewrite below is enough to defeat the internal conversion:
-- Rewrite to avoid TVF parameter trickery
SELECT
DES.session_id,
DES.login_time,
DESWS.waiting_tasks_count
FROM sys.dm_exec_sessions AS DES
JOIN sys.dm_exec_session_wait_stats AS DESWS
ON DESWS.session_id >= DES.session_id
AND DESWS.session_id <= DES.session_id
OPTION (FORCE ORDER);
With the session_id
predicates now not consumed as parameters, the SelSTVFToIdxOnFly
rule is free to convert them to an eager index spool:
I don’t want to leave you with the impression that tricky rewrites are needed to get an eager index spool over a DMV source—it just makes for an easier demo. If you do happen to encounter a query with DMV joins that produces a plan with an eager spool, at least you know how it got there.
You can’t create indexes on DMVs, so you may need to use a hash or merge join if the execution plan does not perform well enough.
Recursive CTEs
The remaining two rules are SelIterToIdxOnFly
and JoinIterToIdxOnFly
. They are direct counterparts to SelToIndexOnTheFly
and JoinToIndexOnTheFly
for recursive CTE data sources. These are extremely rare in my experience, so I am not going to provide demos for them. (Just so the Iter
part of the rule name makes sense: It comes from the fact that SQL Server implements tail recursion as nested iteration.)
When a recursive CTE is referenced multiple times on the inside of an apply, a different rule (SpoolOnIterator
) can cache the result of the CTE:
WITH R AS
(
SELECT 1 AS n
UNION ALL
SELECT R.n + 1
FROM R
WHERE R.n < 10
)
SELECT
R1.n
FROM R AS R1
CROSS JOIN R AS R2;
The execution plan features a rare Eager Row Count Spool:
Final Thoughts
Eager index spools are often a sign that a useful permanent index is missing from the database schema. This is not always the case, as the streaming table-valued function examples show. Thank you for reading.