Reordering Concatenation Operator Inputs
Introduction
The concatenation of two or more data sets is most commonly expressed in T-SQL using the UNION ALL
clause. Given that the SQL Server optimizer can often reorder things like joins and aggregates to improve performance, it is quite reasonable to expect that SQL Server would also consider reordering concatenation inputs where this would provide an advantage. For example, the optimizer could consider the benefits of rewriting A UNION ALL B
as B UNION ALL A
.
In fact, the SQL Server optimizer does not do this. More precisely, there was some limited support for concatenation input reordering in SQL Server releases up to 2008 R2, but this was removed in SQL Server 2012.
SQL Server 2008 R2
Intuitively, the order of concatenation inputs only matters if there is a row goal. By default, SQL Server optimizes execution plans on the basis that all qualifying rows will be returned to the client. When a row goal is in effect, the optimizer tries to find an execution plan that will produce the first few rows quickly.
Row goals can be set in a number of ways, for example using TOP
, a FAST n
query hint, or by using EXISTS
. Where there is no row goal (i.e. the client requires all rows), it does not generally matter in which order the concatenation inputs are read: Each input will be fully processed eventually in any case.
The limited support in versions up to SQL Server 2008 R2 applies where there is a goal of exactly one row. In this specific circumstance only, SQL Server will reorder concatenation inputs on the basis of expected cost. This is not done during cost-based optimization (as one might expect), but rather as a last-minute post-optimization rewrite of the normal optimizer output. This arrangement has the advantage of not increasing the cost-based plan search space (potentially one alternative for each possible reordering), while still producing a plan that is optimized to return the first row quickly.
Examples
The following examples use two tables with identical contents: A million rows of integers from 1 to a 1,000,000. One table is a heap with no nonclustered indexes; the other has a unique clustered index:
CREATE TABLE dbo.Expensive
(
Val bigint NOT NULL
);
CREATE TABLE dbo.Cheap
(
Val bigint NOT NULL,
CONSTRAINT [PK dbo.Cheap Val]
UNIQUE CLUSTERED (Val)
);
GO
INSERT dbo.Cheap WITH (TABLOCKX)
(Val)
SELECT TOP (1000000)
Val = ROW_NUMBER() OVER (ORDER BY SV1.number)
FROM master.dbo.spt_values AS SV1
CROSS JOIN master.dbo.spt_values AS SV2
ORDER BY
Val
OPTION (MAXDOP 1);
GO
INSERT dbo.Expensive WITH (TABLOCKX)
(Val)
SELECT
C.Val
FROM dbo.Cheap AS C
OPTION (MAXDOP 1);
No row goal
The following query looks for the same rows in each table, and returns the concatenation of the two sets:
SELECT E.Val
FROM dbo.Expensive AS E
WHERE
E.Val BETWEEN 751000 AND 751005
UNION ALL
SELECT C.Val
FROM dbo.Cheap AS C
WHERE
C.Val BETWEEN 751000 AND 751005;
The execution plan produced by the query optimizer is:
The warning on the root SELECT
operator is alerting us to the obvious missing index on the heap table. The warning on the Table Scan operator is added by Plan Explorer. It is drawing our attention to the I/O cost of the residual predicate hidden within the scan.
The order of the inputs to the Concatenation does not matter here, because we have not set a row goal. Both inputs will be fully read to return all result rows. Of interest (though this is not guaranteed) notice that the order of the inputs follows the textual order of the original query. Observe also that the order of the final result rows is not specified either, since we did not use a top-level ORDER BY
clause. We will assume that is deliberate and final ordering is inconsequential to the task at hand.
If we reverse the written order of the tables in the query like so:
SELECT C.Val
FROM dbo.Cheap AS C
WHERE
C.Val BETWEEN 751000 AND 751005
UNION ALL
SELECT E.Val
FROM dbo.Expensive AS E
WHERE
E.Val BETWEEN 751000 AND 751005;
The execution plan follows the change, accessing the clustered table first (again, this is not guaranteed):
Both queries may be expected to have the same performance characteristics, as they perform the same operations, just in a different order.
With a row goal
Clearly, the lack of indexing on the heap table will normally make finding specific rows more expensive, compared with the same operation on the clustered table. If we ask the optimizer for a plan that returns the first row quickly, we would expect SQL Server to reorder the concatenation inputs so the cheap clustered table is consulted first.
Using the query that mentions the heap table first, and using a FAST 1
query hint to specify the row goal:
SELECT E.Val
FROM dbo.Expensive AS E
WHERE
E.Val BETWEEN 751000 AND 751005
UNION ALL
SELECT C.Val
FROM dbo.Cheap AS C
WHERE
C.Val BETWEEN 751000 AND 751005
OPTION (FAST 1);
The estimated execution plan produced on an instance of SQL Server 2008 R2 is:
Notice that the concatenation inputs have been reordered to reduce the estimated cost of returning the first row. Note also that the missing index and residual I/O warnings have disappeared. Neither issue is of consequence with this plan shape when the goal is to return a single row as quickly as possible.
The same query executed on SQL Server 2016 (using either cardinality estimation model) is:
SQL Server 2016 has not reordered the concatenation inputs. The Plan Explorer I/O warning has returned, but sadly the optimizer has not produced a missing index warning this time (though it is relevant).
General reordering
As mentioned, the post-optimization rewrite that reorders concatenation inputs is only effective for:
- SQL Server 2008 R2 and earlier
- A row goal of exactly one
If we genuinely only want one row returned, rather than a plan optimized to return the first row quickly (but which will ultimately still return all rows), we can use a TOP
clause with a derived table or common table expression (CTE):
SELECT TOP (1)
UA.Val
FROM
(
SELECT E.Val
FROM dbo.Expensive AS E
WHERE
E.Val BETWEEN 751000 AND 751005
UNION ALL
SELECT C.Val
FROM dbo.Cheap AS C
WHERE
C.Val BETWEEN 751000 AND 751005
) AS UA;
On SQL Server 2008 R2 or earlier, this produces the optimal reordered-input plan:
On SQL Server 2012, 2014, and 2016 no post-optimization reordering occurs:
If we want more than one row returned, for example using TOP (2)
, the desired rewrite will not be applied on SQL Server 2008 R2 even if a FAST 1
hint is also used. In that situation, we need to resort to tricks like using TOP
with a variable and an OPTIMIZE FOR
hint:
DECLARE @TopRows bigint = 2; -- Number of rows actually needed
SELECT TOP (@TopRows)
UA.Val
FROM
(
SELECT E.Val
FROM dbo.Expensive AS E
WHERE
E.Val BETWEEN 751000 AND 751005
UNION ALL
SELECT C.Val
FROM dbo.Cheap AS C
WHERE
C.Val BETWEEN 751000 AND 751005
) AS UA
OPTION (OPTIMIZE FOR (@TopRows = 1)); -- Just a hint
The query hint is sufficient to set a row goal of one, while the runtime value of the variable ensures the desired number of rows (2) is returned.
The actual execution plan on SQL Server 2008 R2 is:
Both rows returned come from the reordered seek input, and the Table Scan is not executed at all. Plan Explorer shows the row counts in red because the estimate was for one row (due to the hint) whereas two rows were encountered at run time.
Without UNION ALL
This issue is also not limited to queries written explicitly with UNION ALL
. Other constructions such as EXISTS
and OR
can also result in the optimizer introducing a concatenation operator, which may suffer from the lack of input reordering. There was a recent question on Database Administrators Stack Exchange with exactly this issue. Transforming the query from that question to use our example tables:
SELECT
CASE
WHEN
EXISTS
(
SELECT 1
FROM dbo.Expensive AS E
WHERE E.Val BETWEEN 751000 AND 751005
)
OR EXISTS
(
SELECT 1
FROM dbo.Cheap AS C
WHERE C.Val BETWEEN 751000 AND 751005
)
THEN 1
ELSE 0
END;
The execution plan on SQL Server 2016 has the heap table on the first input:
On SQL Server 2008 R2 the order of the inputs is optimized to reflect the single row goal of the semi join:
In the more optimal plan, the heap scan is never executed.
Workarounds
In some cases, it will be apparent to the query writer that one of the concatenation inputs will always be cheaper to run than the others. If that is true, it is quite valid to rewrite the query so that the cheaper concatenation inputs appear first in written order. Of course this means the query writer needs to be aware of this optimizer limitation, and prepared to rely on undocumented behaviour.
A more difficult issue arises when the cost of the concatenation inputs varies with the circumstances, perhaps depending on parameter values. Using OPTION (RECOMPILE)
will not help on SQL Server 2012 or later. That option may assist on SQL Server 2008 R2 or earlier, but only if the single row goal requirement is also met.
If there are concerns about relying on observed behaviour (query plan concatenation inputs matching the query textual order) a plan guide can be used to force the plan shape. Where different input orders are optimal for different circumstances, multiple plan guides may be used, where the conditions can be accurately coded in advance. This is hardly ideal though.
Final Thoughts
The SQL Server query optimizer does in fact contain a cost-based exploration rule, UNIAReorderInputs
, which is capable of generating concatenation input order variations and exploring alternatives during cost-based optimization (not as a single-shot post-optimization rewrite).
This rule is not currently enabled for general use. As far as I can tell, it is only activated when a plan guide or USE PLAN
hint is present. This allows the engine to successfully force a plan that was generated for a query that qualified for the input-reordering rewrite, even when the current query does not qualify.
My sense is that this exploration rule is deliberately limited to this use, because queries that would benefit from concatenation input reordering as part of cost-based optimization are considered not sufficiently common, or perhaps because there is a concern that the extra effort would not pay off. My own view is that Concatenation operator input reordering should always be explored when a row goal is in effect.
It is also a shame that the (more limited) post-optimization rewrite is not effective in SQL Server 2012 or later. This might have been due to a subtle bug, but I could not find anything about this in the documentation, knowledge base, or on Connect. I added a new Connect item, but it was lost when Microsoft retired that site.
Update 9 August 2017
This issue was addressed under trace flag 4199 for SQL Server 2014 and 2016, see KB 4023419:
Input reordering is still limited to cases where the row goal is exactly one row.
The trace flag is not required from SQL Server 2017 to SQL Server 2022. The single-row limitation remaims.