What Was the TOP PERCENT Bug and How Was It Fixed?

Bug Demo
Letâs create a table with six rows, two in group A and four in group B:
CREATE TABLE dbo.T
(
TID integer NOT NULL PRIMARY KEY,
GroupID char(1) NOT NULL
);
INSERT dbo.T
(TID, GroupID)
VALUES
(1, 'A'),
(2, 'A'),
(3, 'B'),
(4, 'B'),
(5, 'B'),
(6, 'B');
The task is to return half the rows in each group. Any rows will do, so a deterministic query is not required.
A suitable T-SQL expression of the requirement is shown below:
SELECT
Groups.GroupID,
CA.TID,
CA.GroupID
FROM
(
-- Group list
VALUES
('A'),
('B')
) AS Groups (GroupID)
CROSS APPLY
(
-- Half the rows per group
SELECT TOP (50) PERCENT
T.TID,
T.GroupID
FROM dbo.T AS T
WHERE
T.GroupID = Groups.GroupID
) AS CA;
-- Tidy up
DROP TABLE dbo.T;
The correct result is to return one of the two rows from group A and two of the four rows from group B.
Results
If you run the demo on SQL Server 2019 or later, you will get correct results.
On SQL Server 2017 and earlier, the output will be wrong.
In both cases, the execution plan is the same in all respects:
An example of incorrect results obtained on SQL Server 2017 is below:
Notice that two rows (not one) have been returned from group A and three rows (not two) from group B.
An example of correct results (using SQL Server 2022) is:
The problem
The fault lies with the highlighted Eager Index Spool operator.
The spool performs its normal nonclustered index building function of course, but it is also required to count the total number of rows in the source set so the Top operator can calculate how many rows 50 PERCENT
represents.
The issue is that the Eager Index Spool counts six rows (correctly) and the Top operator deduces (also correctly) that three rows is 50% of that count. Which would be fine, but the set is split into two groups (A & B) and the 50% should be computed separately for each.
For correct results, the Top and Eager Index Spool combination should recalculate what 50 PERCENT
means whenever the APPLY
correlated parameter (Groups.GroupID
) changes (a ârebindâ).
The current plan does not do that, so it tries to produce 50% of six rows (three rows) for each group. That explains why we see two rows from group A (there are only two present) and three rows from group B.
The spoolâs single count of six rows was applied uniformly, ignoring the group-specific logic required by the CROSS APPLY
.
Side note: The query could easily be made deterministic by the addition of an ORDER BY
clause. The demo doesnât do this to avoid the optimizer choosing a Sort instead of an Eager Index Spool. This cost based decision depends on the small number of rows in the table.
A larger demo table with a deterministic query would also produce an Eager Index Spool and reproduce the issue.
Potential solutions
At first glance, itâs not easy to see how this issue could be fixed.
The Eager Index Spool only ever reads its input once. Unlike an Eager Table Spool, its input can never depend on correlated parameter values. Even if we were to relax that restriction, thereâs no guarantee the spoolâs subtree could efficiently deliver rows just for the current groupârepeated subtree executions with a filter would rather defeat the main purpose of the spool, which is to cache results.
Another idea would be for the spool to remember the count of rows per group, rather than just the overall total. It would return the per group count for the current value(s) of any correlated parameters. This information would have to be stored somewhere, since we generally donât know ahead of time how many groups there will be at runtime. Storing it in memory isnât really a sensible option without also making the spool an operator that can request a memory grant, which would be quite a bit of engineering effort.
The other option is to store the row count in the spoolâs worktable, but this breaks the general layout of spool worktables and reduces the maximum row size available for user data. Itâs also conceptually a little clunky to store a per-group subtotal in a worktable designed to hold detail rows. This type of solution would require extensive testing.
While these options posed challenges, Microsoft ultimately took a simpler approach.
Chosen solution
Microsoft chose to physically count the rows in the worktable that match the current value of any correlated parameters whenever the Top operator requests row count statistics.
While perhaps not the last word in overall efficiency, this does massively simplify the implementation and testing effort. Instead of returning the single stored total row count, the spool simply counts the matching rows. It knows it can do this relatively cheaply because the spool is indexed on the correlated parameter keys.
For those who enjoy a call stack, here is SQL Server 2022 seeking to the first matching row in the worktable in response to a request for statistics from the Top operator:
The spool then calls sqlmin!RowsetNewSS::GetNextRows
to count each matching row individually:
So there you have it. Microsoft chose to fix this bug by having the spool count rows in its worktable whenever there are correlated parameters and it is asked for statistics.
I couldnât find any reference to this fix in the Cumulative Update documentation, so perhaps it was quietly addressed for SQL Server 2019 RTM.
Note: This bug remains unfixed prior to SQL Server 2019.
Related articles:
- The Eager Index Spool and The Optimizer
- How a Lazy Index Spool Works
- Nested Loops Joins and Performance Spools
- The SQL Server âSegment Applyâ Optimizer Transformations
Thanks for reading.