Generally speaking, an index can improve the efficiency of read operations, by providing a data structure that SQL Server can use to seek or scan specific data pages, while skipping non-relevant pages in the process.
Regarding Non-Clustered Indexes, since they typically represent narrower copies of the underlying table, even if a complete index scan is required, the amount of data pages that end up being read can be significantly smaller in comparison to an equivalent Clustered Index scan.
Filtered indexes can improve this process even further, simply by reducing the amount of data comprehended by the index, hence reducing the number of nodes and data pages to be traversed.
In order to illustrate a scenario where the usage of a filtered index can be beneficial, let’s create a store procedure in the StackOverflow2010 database, that will retrieve all questions (PostTypeId=1) with a creation date and a score greater than or equal to the inputted values:
CREATE OR ALTER PROCEDURE GetQuestionsByDateAndMinimumScore (@Score int,@CreationDate datetime) AS BEGIN SELECT p.Id,p.creationDate,u.DisplayName, p.title,p.Score,p.ViewCount,p.AnswerCount FROM Posts p INNER JOIN Users u ON u.id = p.OwnerUserId WHERE p.CreationDate >= @CreationDate AND Score >= @Score AND PostTypeId = 1 ORDER BY CreationDate DESC END
Next, we’ll create a non-filtered index on the table Posts, in order to improve the performance of our query.
CREATE INDEX IX_CreationDate_PostTypeId_Score ON Posts (CreationDate,Score,PostTypeId)
After its creation, it’s time to execute the store procedure and analyze our baseline values.
EXEC GetQuestionsByMinimumScoreAndDate 500, '2009-07-31 00:00:00.000'
In the execution plan, we are able to see that SQL Server used our index to seek the Posts table and ended up reading 2 809 841 rows, which translated into 12 000 logical reads.
Now, the fun part. Since we care only about question posts and they only represent about one third of the table, we could create a filtered index instead.
CREATE INDEX IX_CreationDate_PostTypeId_Score_Filtered ON Posts (CreationDate,Score,PostTypeId) WHERE (PostTypeId = 1)
Finally, after executing the procedure with the same parameters, we should be able to see some improvements.
And indeed we do. This time SQL Server ended up reading 875 361 rows and the IO statistics show 5676 logical reads on the Posts table. Noice.
I’ll Only Create Filtered Indexes From Now On
I mean, you could. You probably shouldn’t though.
In our example we were interested only in querying a specific PostTypeId, but if we needed to address another type, e.g. Answers, our index would be pretty useless and even considering a filtered index would make no sense, because in that particular case, both types represent almost the entire table.
The takeaway is that filtered indexes can be great for performance, but we must always keep in mind the selectivity of what we want to filter and the gains versus costs, because although filtered indexes are lighter by concept, they aren’t exactly free.