Wednesday, 01 April 2015

An Indexed View Bug with Scalar Aggregates

Note: The bug described in this article has been fixed in:

Introduction

The general strategy the SQL Server database engine uses to keep an indexed view synchronized with its base tables—which I described in more detail in my last post—is to perform incremental maintenance of the view whenever a data-changing operation occurs against one of the tables referenced in the view. In broad terms, the idea is to:

Collect information about the base table changes.
Apply the projections, filters, and joins defined in the view.
Aggregate the changes per indexed view clustered key.
Decide whether each change should result in an insert, update, or delete against the view.
Compute the values to change, add, or remove in the view.
Apply the view changes.

Or, even more succinctly (albeit at the risk of gross simplification):

Compute the incremental view effects of the original data modifications.
Apply those changes to the view.

This is usually a much more efficient strategy than rebuilding the whole view after every underlying data change (the safe but slow option), but it does rely on the incremental update logic being correct for every conceivable data change, against every possible indexed view definition.

As the title suggests, this article is concerned with an interesting case where this incremental update logic breaks down, resulting in a corrupt indexed view that no longer matches the underlying data. Before we get to the bug itself, we need to quickly review scalar and vector aggregates.

Scalar and Vector Aggregates

In case you are not familiar with the term, there are two types of aggregate in SQL Server. An aggregate that is associated with a GROUP BY clause (even if the group by list is empty) is known as a vector aggregate. An aggregate without a GROUP BY clause is known as a scalar aggregate. Whereas a vector aggregate is guaranteed to produce a single output row for each group present in the data set, scalar aggregates always produce a single output row, even if the input set is empty.

Vector aggregate example

The following AdventureWorks example computes two vector aggregates (a sum and a count) on an empty input set:

-- There are no TransactionHistory records for ProductID 848
-- Vector aggregate produces no output rows
SELECT COUNT_BIG(*) 
FROM Production.TransactionHistory AS TH
WHERE TH.ProductID = 848
GROUP BY TH.ProductID;

SELECT SUM(TH.Quantity)
FROM Production.TransactionHistory AS TH
WHERE TH.ProductID = 848
GROUP BY TH.ProductID;

These queries produce the following output (no rows):

Empty result set with vector aggregation

The result is the same if we replace the GROUP BY clause with an empty set (requires SQL Server 2008 or later):

-- Equivalent vector aggregate queries with
-- an empty GROUP BY column list
-- (SQL Server 2008 and later required)
-- Still no output rows
SELECT COUNT_BIG(*) 
FROM Production.TransactionHistory AS TH
WHERE TH.ProductID = 848
GROUP BY ();

SELECT SUM(TH.Quantity)
FROM Production.TransactionHistory AS TH
WHERE TH.ProductID = 848
GROUP BY ();

The execution plans are identical in both cases as well. This is the execution plan for the count query:

Vector count aggregate plan

Zero rows input to the Stream Aggregate, and zero rows out. The sum execution plan looks like this:

Vector sum aggregate plan

Again, zero rows into the aggregate, and zero rows out.

Scalar aggregates

Now look what happens if we remove the GROUP BY clause from the queries:

-- Scalar aggregate (no GROUP BY clause)
-- Returns a single output row from an empty input
SELECT COUNT_BIG(*) 
FROM Production.TransactionHistory AS TH
WHERE TH.ProductID = 848;

SELECT SUM(TH.Quantity)
FROM Production.TransactionHistory AS TH
WHERE TH.ProductID = 848;

Instead of an empty result, the COUNT aggregate produces a zero and the SUM aggregate returns NULL:

Scalar aggregation results

The count execution plan confirms that zero input rows produce a single row of output from the Stream Aggregate:

Scalar count aggregation plan

The sum execution plan is even more interesting:

Scalar sum aggregation plan

The Stream Aggregate properties show a count aggregate being computed in addition to the sum we asked for:

Stream Aggregate Properties

The new Compute Scalar operator is used to return NULL if the count of rows received by the Stream Aggregate is zero, otherwise it returns the sum of the data encountered:

Compute Scalar Properties

This might all seem strange, but this is how it works:

A vector aggregate of zero rows returns zero rows.
A scalar aggregate always produces exactly one row of output, even for an empty input.
The scalar count of zero rows is zero.
The scalar sum of zero rows is NULL (not zero).

The important point for our present purposes is that scalar aggregates always produce a single row of output, even if it means creating one out of nothing. Also, the scalar sum of zero rows is NULL, not zero.

These behaviours are all “correct” by the way. Things are the way they are because the SQL Standard originally did not define the behaviour of scalar aggregates, leaving it up to the implementation. SQL Server preserves its original implementation for backward compatibility reasons. Vector aggregates have always had well-defined behaviours.

Indexed Views and Vector Aggregation

Now consider a simple indexed view incorporating a couple of (vector) aggregates:

CREATE TABLE dbo.T1 
(
    GroupID integer NOT NULL, 
    Value   integer NOT NULL
);
GO
INSERT dbo.T1
    (GroupID, Value)
VALUES
    (1, 1),
    (1, 2),
    (2, 3),
    (2, 4),
    (2, 5),
    (3, 6);
GO
CREATE VIEW dbo.IV
WITH SCHEMABINDING
AS
SELECT
    T1.GroupID,
    GroupSum = SUM(T1.Value),
    RowsInGroup = COUNT_BIG(*)
FROM dbo.T1 AS T1
GROUP BY
    T1.GroupID;
GO
CREATE UNIQUE CLUSTERED INDEX cuq
ON dbo.IV (GroupID);

The following queries show the content of the base table, the result of querying the indexed view, and the result of running the view query on the table underlying the view:

-- Sample data
SELECT * FROM dbo.T1 AS T1;
-- Indexed view contents
SELECT * FROM dbo.IV AS IV WITH (NOEXPAND);
-- Underlying view query results
SELECT * FROM dbo.IV AS IV OPTION (EXPAND VIEWS);

The results are:

Sample data and indexed view contents

As expected, the indexed view and underlying query return exactly the same results. The results will continue to remain synchronized after any and all possible changes to the base table T1. To remind ourselves how this all works, consider the simple case of adding a single new row to the base table:

INSERT dbo.T1
    (GroupID, Value)
VALUES
    (4, 100);

The execution plan for this insert contains all the logic needed to keep the indexed view synchronized:

Insert indexed view maintenance

The major activities in the plan are:

The Stream Aggregate computes the changes per indexed view key.
The Outer Join to the view links the change summary to the target view row, if any.
The Compute Scalar decides whether each change will require an insert, update, or deletion against the view, and computes the necessary values.
The view update operator physically performs each change to the view clustered index.

There are some plan differences for different change operations against the base table (e.g. updates and deletions), but the broad idea behind keeping the view synchronized remains the same: Aggregate the changes per view key, find the view row if it exists, then perform a combination of insert, update, and delete operations on the view index as necessary.

No matter what changes you make to the base table in this example, the indexed view will remain correctly synchronized—the NOEXPAND and EXPAND VIEWS versions will always return the same result set. This is how things should always work.

Indexed views and scalar aggregation

Now try this example, where the indexed view uses scalar aggregation (no GROUP BY clause in the view):

DROP VIEW dbo.IV;
DROP TABLE dbo.T1;
GO
CREATE TABLE dbo.T1 
(
    GroupID integer NOT NULL, 
    Value   integer NOT NULL
);
GO
INSERT dbo.T1
    (GroupID, Value)
VALUES
    (1, 1),
    (1, 2),
    (2, 3),
    (2, 4),
    (2, 5),
    (3, 6);
GO
CREATE VIEW dbo.IV
WITH SCHEMABINDING
AS
SELECT
    TotalSum = SUM(T1.Value),
    NumRows = COUNT_BIG(*)
FROM dbo.T1 AS T1;
GO
CREATE UNIQUE CLUSTERED INDEX cuq
ON dbo.IV (NumRows);

This is a perfectly legal indexed view; no errors are encountered when creating it. There is one clue that we might be doing something a little strange, though: When it comes time to materialize the view by creating the required unique clustered index, there isn’t an obvious column to choose as the key. Normally, we would choose the grouping columns from the view’s GROUP BY clause.

The script above arbitrarily chooses the NumRows column. That choice isn’t important. Feel free to create the unique clustered index how ever you choose. The view will always contain exactly one row because of the scalar aggregates, so there is no chance of a unique key violation. In that sense, the choice of view index key is redundant, but nevertheless required by SQL Server.

Reusing the test queries from the previous example, we can see that the indexed view works correctly:

SELECT * FROM dbo.T1 AS T1;
SELECT * FROM dbo.IV AS IV OPTION (EXPAND VIEWS);
SELECT * FROM dbo.IV AS IV WITH (NOEXPAND);

Results

Inserting a new row to the base table (as we did with the vector aggregate indexed view) continues to work correctly as well:

INSERT dbo.T1
    (GroupID, Value)
VALUES
    (4, 100);

The execution plan is similar, but not quite identical:

Insert execution plan

The main differences are:

The new Compute Scalar is there for the same reasons as when we compared vector and scalar aggregation results earlier: it ensures a NULL sum is returned (instead of zero) if the aggregate operates on an empty set. This is the required behaviour for a scalar sum of no rows.
The Outer Join seen previously has been replaced by an Inner Join. There will always be exactly one row in the indexed view (due to the scalar aggregation) so there is no question of needing an outer join to test if a view row matches or not. The one row present in the view always represents the entire set of data. This Inner Join has no predicate, so it is technically a cross join (to a table with a guaranteed single row).
The Sort and Collapse operators are present for technical reasons covered in my previous article on indexed view maintenance. They do not affect the correct operation of the indexed view maintenance here.

Many different types of data-changing operations can be performed successfully against the base table T1 in this example—the effects will be correctly reflected in the indexed view. The following change operations against the base table can all be performed while keeping the indexed view correct:

Delete existing rows.
Update existing rows.
Insert new rows.

This might seem like a comprehensive list, but it isn’t.

The bug revealed

The issue is rather subtle and relates (as you might be expecting) to the different behaviours of vector and scalar aggregates. The key points are that a scalar aggregate will always produce an output row, even if it receives no rows on its input, and the scalar sum of an empty set is NULL, not zero.

To cause a problem, all we need do is insert or delete no rows in the base table.

That statement is not as crazy as it might at first sound. The point is that an insert or delete query that affects no base table rows will still update the view, because the scalar Stream Aggregate in the indexed view maintenance portion of the query plan will produce an output row even when it is presented with no input. The Compute Scalar that follows the Stream Aggregate will also generate a NULL sum when the count of rows is zero.

The following script demonstrates the bug in action:

-- So we can undo
BEGIN TRANSACTION;

-- Show the starting state
SELECT * FROM dbo.T1 AS T1;
SELECT * FROM dbo.IV AS IV OPTION (EXPAND VIEWS);
SELECT * FROM dbo.IV AS IV WITH (NOEXPAND);

-- A table variable intended to hold new base table rows
DECLARE @NewRows AS table (GroupID integer NOT NULL, Value integer NOT NULL);

-- Insert to the base table (no rows in the table variable!)
INSERT dbo.T1
SELECT NR.GroupID,NR.Value
FROM @NewRows AS NR;

-- Show the final state
SELECT * FROM dbo.T1 AS T1;
SELECT * FROM dbo.IV AS IV OPTION (EXPAND VIEWS);
SELECT * FROM dbo.IV AS IV WITH (NOEXPAND);

-- Undo the damage
ROLLBACK TRANSACTION;

The output of that script is shown below:

Corruption demo results

The final state of the indexed view’s Total Sum column does not match the underlying view query or the base table data. The NULL sum has corrupted the view, which can be confirmed by running DBCC CHECKTABLE WITH EXTENDED_LOGICAL_CHECKS (on the indexed view).

The execution plan responsible for the corruption is shown below:

Corrupting plan

Zooming in shows the zero-rows input to the Stream Aggregate and the one-row output:

Plan fragment

If you want to try the corruption script above with a delete instead of an insert, here is an example:

-- No rows match this predicate
DELETE dbo.T1
WHERE Value BETWEEN 10 AND 50;

The delete affects no base table rows, but it still changes the indexed view’s sum column to NULL.

Generalizing the bug

You can probably come up with any number of insert, and delete base table queries that affect no rows, and cause this indexed view corruption. However, the same basic issue applies to a broader class of problem than just inserts and deletes that affect no base table rows.

It is possible, for example, to produce the same corruption using an insert that does add rows to the base table. The essential ingredient is that no added rows should qualify for the view. This will result in an empty input to the Stream Aggregate, and the corruption-causing NULL row output from the following Compute Scalar.

One way to achieve this is to include a WHERE clause in the view that rejects some of the base table rows:

ALTER VIEW dbo.IV
WITH SCHEMABINDING
AS
SELECT
    TotalSum = SUM(T1.Value),
    NumRows = COUNT_BIG(*)
FROM dbo.T1 AS T1
WHERE 
    -- New!
    T1.GroupID BETWEEN 1 AND 3;
GO
CREATE UNIQUE CLUSTERED INDEX cuq
ON dbo.IV (NumRows);

Given the new restriction on group IDs included in the view, the following insert will add rows to the base table, but still corrupt the indexed view will a NULL sum:

-- So we can undo
BEGIN TRANSACTION;

-- Show the starting state
SELECT * FROM dbo.IV AS IV OPTION (EXPAND VIEWS);
SELECT * FROM dbo.IV AS IV WITH (NOEXPAND);

-- The added row does not qualify for the view
INSERT dbo.T1
    (GroupID, Value)
VALUES
    (4, 100);

-- Show the final state
SELECT * FROM dbo.IV AS IV OPTION (EXPAND VIEWS);
SELECT * FROM dbo.IV AS IV WITH (NOEXPAND);

-- Undo the damage
ROLLBACK TRANSACTION;

The output shows the now-familiar index corruption:

Filtered view corruption

A similar effect can be produced using a view that contains one or more inner joins. As long as rows added to the base table are rejected (for example by failing to join), the Stream Aggregate will receive no rows, the Compute Scalar will generate a NULL sum, and the indexed view will become corrupted.

Final Thoughts

This problem happens not to occur for update queries (as far as I can tell), but this appears to be more by accident than design—the problematic Stream Aggregate is still present in potentially-vulnerable update plans, but the Compute Scalar that generates the NULL sum is not added (or perhaps optimized away). Please let me know if you manage to reproduce the bug using an update query.

Until this bug is corrected (or, perhaps, scalar aggregates become disallowed in indexed views) be very careful about using aggregates in an indexed view without a GROUP BY clause.

Update

The bug described in this article has been fixed in:

Acknowledgement

This article was prompted by a Connect item (sadly lost when Microsoft retired the site) submitted by Vladimir Moldovanenko, who was kind enough to leave a comment on an old article of mine (which concerns a different indexed view corruption caused by the MERGE statement). Vladimir was using scalar aggregates in an indexed view for sound reasons, so don’t be too quick to judge this bug as an edge case that you will never encounter in a production environment. My thanks to Vladimir for alerting me to his Connect item.