Performance Woes & Wows #2: LIKE ‘%Make This Query Go Slower Please%’

Sargability is a funny word that’s used to characterize the ability of the engine to take advantage of an index in order to speed up the execution of a query. Legend says that it is derived from a contraction of Search ARGument ABLE.

Usually, when someone writes a query, the goal is to retrieve the desired information as fast as possible. What’s interesting is that “as fast as possible” can refer either to “execution time” or to “time I will spend writing this query”, and we all know which one it is, most of the time. As a rule of thumb, queries should be written in the simplest way possible, the underlying reason being that if a query is easy for you to read, it will also be easier for the Optimizer to read and process.

But that’s not always true. For example, we use functions because they save us effort and improve query readability, but typically they don’t make the Optimizer’s life easier. Au contraire, because they tend to make predicates non-sargable. On other occasions, bighearted developers decide to allow the customer to search on every damn existing column, using LIKE with leading and trailing % wildcards, and that’s also a sargability capital sin, as we’ll see further on.

Non-Sargable Predicates

So, let’s talk a bit more about bad ideas. Not those that start with “Hold my beer and watch this”, but instead “It runs fast on my machine, let’s take it to Production”.

Non-sargable predicates have a negative impact on query performance, e.g. by causing poor cardinality estimates or preventing index seeks, which consequently leads to inappropriate plan choices and increased CPU consumption.

The following list includes some of the most common patterns that kill predicate sargability. We’ll very briefly go over them and then focus on one specific case.

ColumnX = Function(ColumnX): As already pointed out, functions can be naughty. One might be tempted to use them because they encapsulate logic in a way that is very practical for us humans, but this often implies a performance loss. The query below, where we extract the year from a date column and compare it against a specific value, is one of those cases. SQL Server won’t be able to use any index efficiently, even if we have an index on that field. We’ll shortly discuss why this happens.

SELECT COUNT(1) FROM Matches WHERE YEAR(Date) = 1999
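
For reference, a sargable rewrite keeps the column untouched and pushes the logic to the other side of the comparison; here a half-open date range replaces YEAR() (a sketch, assuming Date is a date or datetime column):

-- Sargable alternative: compare the bare column against a half-open range,
-- so an index on Date can be used for a seek.
SELECT COUNT(1) 
FROM Matches 
WHERE [Date] >= '19990101' AND [Date] < '20000101'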

ColumnX = ColumnY + ColumnZ: A similar behaviour is observed when we filter on the result of an arithmetic operation between columns. It’s important to stress that a predicate containing an arithmetic operation between a column and a constant literal value would also be non-sargable.

SELECT * 
FROM Matches 
WHERE FullTimeGoals = HalfTimeHomeGoals + HalfTimeAwayGoals
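
When such a filter is frequent, one possible workaround for the column-versus-constant variant (a sketch, not part of the original demo; the column and index names are made up) is to materialize the expression as an indexed computed column, so the predicate becomes a plain column comparison:

-- Materialize the expression; PERSISTED is optional for indexing but keeps it stored.
ALTER TABLE Matches
	ADD HalfTimeTotalGoals AS (HalfTimeHomeGoals + HalfTimeAwayGoals) PERSISTED;

CREATE NONCLUSTERED INDEX IX_Matches_HalfTimeTotalGoals
	ON Matches (HalfTimeTotalGoals);

-- A constant comparison against the computed column can now use an index seek.
SELECT *
FROM Matches
WHERE HalfTimeTotalGoals = 3;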

@Parameter1 IS NULL OR ColumnX = @Parameter1: The heart of that single query where search dreams come true, and someone got really proud of it. It’s possible to search for everything. Slowly. At runtime SQL Server won’t be able to figure this out, hence the non-sargability. There are a couple of ways to get around this, one of them being to add OPTION (RECOMPILE). This hint enables the Parameter Embedding Optimization, which replaces query parameters with their literal values when the statement is compiled and enables further complex simplifications, like removing contradictions. Nevertheless, as this hint forces the recompilation of the statement, it introduces some overhead that needs to be taken into account.

CREATE OR ALTER PROCEDURE SearchNonFriendlyMatch 
(
    @HomeTeam INT = NULL,
    @AwayTeam INT = NULL,
    @Date DATE    = NULL,
    @FTG TINYINT  = NULL
) AS
BEGIN
	SELECT 
		HomeTeam, AwayTeam, [Date], FullTimeGoals
	FROM 
		Matches
	WHERE
		League IS NOT NULL AND
		(@HomeTeam IS NULL OR HomeTeam = @HomeTeam) AND 
		(@AwayTeam IS NULL OR AwayTeam = @AwayTeam) AND 
		(@Date IS NULL OR [Date] = @Date) AND 
		(@FTG IS NULL OR FullTimeGoals >= @FTG)
END
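
For reference, the OPTION (RECOMPILE) variant mentioned above simply appends the hint to the statement. It is sketched below as a separate procedure (the _Recompile name is illustrative), so the parameters get embedded as literals and the contradictions are removed each time the statement is compiled:

CREATE OR ALTER PROCEDURE SearchNonFriendlyMatch_Recompile 
(
    @HomeTeam INT = NULL,
    @AwayTeam INT = NULL,
    @Date DATE    = NULL,
    @FTG TINYINT  = NULL
) AS
BEGIN
	SELECT 
		HomeTeam, AwayTeam, [Date], FullTimeGoals
	FROM 
		Matches
	WHERE
		League IS NOT NULL AND
		(@HomeTeam IS NULL OR HomeTeam = @HomeTeam) AND 
		(@AwayTeam IS NULL OR AwayTeam = @AwayTeam) AND 
		(@Date IS NULL OR [Date] = @Date) AND 
		(@FTG IS NULL OR FullTimeGoals >= @FTG)
	OPTION (RECOMPILE)	-- parameters are embedded as literals at compile time
END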

ColumnX LIKE ‘%Make This Query Go Slower Please%’

When we have this wildcard prefixing and suffixing a string, we’re telling SQL Server to retrieve all records that contain the specified text, regardless of what comes before or after it.

SELECT  TOP 5 * FROM Teams WHERE Name LIKE '%ar%'

The execution of the previous query returns these results:

Id   Country    Name
7    Portugal   SC Beira-Mar
8    England    Cardiff City FC
23   England    Arsenal FC
28   Spain      FC Barcelona
68   Italy      Parma

But why is this bad for performance?

Let’s put ourselves in SQL Server’s shoes and pretend that we’ve received a list containing names of football teams, ordered alphabetically. We need to count how many of these names contain the letters “ar”. We roll up our sleeves and start analyzing all characters from the first record, then the second, then the third, and so on, until we reach the last item in the list. We had no alternative but to scan through all values, simply because the order of this list wasn’t helpful at all. Recalling the table above, it’s easy to understand why: the substring can be anywhere, so there is no shortcut, no possibility to discard any subset of these names. SQL Server has the exact same struggle, scanning through the entire index that our list represents.

Now, let’s execute a similar query:

SELECT COUNT(1) FROM Teams t WHERE t.Name LIKE '%Benf%'

And look into its execution plan:

This is a pretty small table but it’s enough to prove the point. This query ran in about 600 milliseconds and IO statistics show 1 Scan Count and 1226 logical reads. The cost is almost entirely in the Index Scan, which was chosen even though SQL Server estimated just 9 rows to be retrieved. Well, that wasn’t really a choice, in this case an Index Scan was made because the non-sargable search prevented SQL Server from using an Index Seek, as we already know.
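
For reference, those timings and read counts come from the session statistics options, which you can switch on if you want to follow along:

-- Show elapsed/CPU time and logical reads in the Messages tab for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;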

With that being said, let’s remove the leading wildcard and execute the query:

SELECT  COUNT(1) FROM Teams t WHERE t.Name LIKE 'Benf%'

And this time, it is possible to confirm by the execution plan that the query was much more efficient using the Index Seek.

It has executed in virtually no time and IO statistics show 1 Scan Count and 3 logical reads.

If we return to our ordered list, it’s very easy to understand why the query executed so much faster: now we can benefit from that alphabetical order, going directly to the letter B, looking through a few rows and quickly returning the ones that match the search.

What Should I Do, Then?

We saw that the sargable version of our query ran in virtually no time, while the non-sargable version took about 600ms, and even though this gap is proportionally huge, users don’t complain about half-second queries. Nevertheless, the real problem with this type of query is that it won’t scale. When you try to run this query against a table with millions of rows instead of a couple thousand, it will be much more painful. You’ve been warned.

But what can we do to avoid it?

The first and obvious option would be to analyze if that leading wildcard is really needed.
If you can simply get rid of it, problem solved! If not, there’s still hope…

One way to improve these searches is by using Full-Text Search (FTS). FTS is able to quickly find all matches for a search term in a table, because it doesn’t have to scan rows. Instead, it takes advantage of full-text indexes that store positional information for all terms found in the columns on which the index is created. Instead of the LIKE operator, FTS enables the use of CONTAINS/CONTAINSTABLE or FREETEXT/FREETEXTTABLE, as exemplified below:

SELECT COUNT(1) FROM Teams t WHERE CONTAINS((t.Name), @Name) 

FTS has a few limitations and requires some setup effort, but I often work with it and it can be pretty awesome.
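
That setup, in its most basic form, might look roughly like this (a sketch with illustrative names; it assumes a unique, single-column, non-nullable index, here called PK_Teams, already exists on the Teams table):

-- Create a full-text catalog and a full-text index on the Name column.
CREATE FULLTEXT CATALOG FT_FStats AS DEFAULT;

CREATE FULLTEXT INDEX ON Teams ([Name])
	KEY INDEX PK_Teams
	WITH CHANGE_TRACKING AUTO;

-- Prefix searches can then use CONTAINS instead of LIKE:
SELECT COUNT(1) FROM Teams t WHERE CONTAINS(t.Name, '"Benf*"');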

Additionally, there are newer, hotter technologies designed to solve these issues, for example ElasticSearch, which allows fast text searches in a very scalable way. It also supports pretty complex searches but, to me, the nicest advantage compared to Full-Text Search is the possibility to take the heavy processing completely outside SQL Server. CPU cores in SQL Server can be very expensive and should be treated with respect, my boss says.

Now you know why you should avoid queries LIKE ‘%these%’ and you have some ideas to improve them. Hope this was helpful to you 🖖

Performance Woes & Wows #1: Implicit Conversions

Implicit conversion… Sounds very clever, so it can’t be that bad, right? Well, in fact, implicit conversions can be huge performance killers and silent enough to stay off the radar, if we are not aware. The good news is that they tend to be very easy to solve, so let’s make an easy buck.

The story begins with mismatched data types on a JOIN operator or a WHERE clause that force SQL Server to implicitly convert values, on the fly. This can be as subtle as mistaking a VARCHAR for an NVARCHAR, although the consequences can be pretty awful. But, before diving into the problem itself, let’s talk about conversions and data type precedence.

Implicit vs Explicit Conversion

Data types can be either implicitly or explicitly converted.

Implicit conversions are nearly invisible to the user because they are automatically executed, i.e. they do not require the explicit use of any function.

Conversely, explicit conversions require the use of the CAST or CONVERT functions. The recommendation is to use CAST when ISO compliance is desirable and CONVERT to take advantage of its style functionality.
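
For instance (a small illustration; style 103 is the dd/mm/yyyy format):

DECLARE @d datetime = '2019-08-17T15:30:00';

-- CAST: ISO compliant, but no control over the output format (default style).
SELECT CAST(@d AS varchar(30));        -- e.g. Aug 17 2019  3:30PM

-- CONVERT: the third argument selects a style; 103 = dd/mm/yyyy.
SELECT CONVERT(varchar(10), @d, 103);  -- 17/08/2019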

Nonetheless, it’s not possible to convert every possible data type combination. The following matrix (provided by Microsoft here) shows all data type conversions for SQL Server system-supplied data types.

It’s important to notice that when an explicit conversion is done, the resulting data type is determined by the statement itself.

Implicit conversions, such as those in assignment statements, result in the data type that was defined by the variable declaration or column definition. For comparison operators or other expressions, the resulting data type depends on the rules of data type precedence.

Data Type Precedence

This rule simply states that, when an operator combines expressions of different data types, the data type with the lower precedence is converted to the data type with the higher precedence. The order is defined as follows:

 1. user-defined data types (highest)    11. real          21. text
 2. sql_variant                          12. decimal       22. image
 3. XML                                  13. money         23. timestamp
 4. datetimeoffset                       14. smallmoney    24. uniqueidentifier
 5. datetime2                            15. bigint        25. nvarchar
 6. datetime                             16. int           26. nchar
 7. smalldatetime                        17. smallint      27. varchar
 8. date                                 18. tinyint       28. char
 9. time                                 19. bit           29. varbinary
10. float                                20. ntext         30. binary (lowest)

Data type precedence (Transact-SQL)
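
A quick way to see the rule in action: int has a higher precedence than varchar, so in a mixed comparison the string side is the one that gets converted.

-- '10' is implicitly converted to int, so this is a numeric comparison (10 > 9).
-- A pure string comparison would give the opposite result ('10' < '9').
SELECT CASE WHEN '10' > 9 THEN 'numeric comparison' ELSE 'string comparison' END;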

How to Uncover Implicit Conversions

There are a couple of ways to uncover implicit conversions. The most usual is, probably, by analyzing the Execution Plan. If we’re inspecting the execution of a query where an implicit conversion that may affect cardinality estimate happened, a warning will be displayed. We’ll further look into this.

It’s also possible to capture these occurrences through the use of DMVs. I’ve seen some examples using the sys.dm_exec_query_plan DMV to accomplish this, but we’re going to use the brand new and awesome sys.dm_exec_query_plan_stats DMV introduced by SQL Server 2019. We’ll take a deeper look into it in a future post, but for now just keep in mind that this DMV shows us the last actual plan for a given query. Also, make sure to activate the scoped configuration as specified below:

ALTER DATABASE SCOPED CONFIGURATION SET LAST_QUERY_PLAN_STATS = ON;

--ImplicitConversions
DECLARE @DBName VARCHAR(20) = 'FStats';

SELECT TOP 30
	t.text AS 'Query',
	qs.total_worker_time/qs.execution_count AS 'Worker Time (Avg)', 
	qs.total_elapsed_time/qs.execution_count AS 'Elapsed Time (Avg)',
	qs.total_logical_reads/qs.execution_count AS 'Logical Reads (Avg)',
	qs.total_physical_reads/qs.execution_count AS 'Physical Reads (Avg)',
	qs.max_elapsed_time AS 'Max Elapsed Time',
	qs.max_logical_reads AS 'Max Logical Reads', 
	qs.max_physical_reads AS 'Max Physical Reads', 
	qs.execution_count AS 'Execution Count',
	qp.query_plan AS 'Query Plan'
FROM 
	sys.dm_exec_query_stats AS qs WITH (NOLOCK)
CROSS APPLY 
	sys.dm_exec_sql_text(plan_handle) AS t 
CROSS APPLY 
	sys.dm_exec_query_plan_stats(plan_handle) as qp
WHERE 
	t.text NOT LIKE '--ImplicitConversions%' AND
	CAST(query_plan AS NVARCHAR(MAX)) LIKE ('%PlanAffectingConvert%') AND
	t.[dbid] = DB_ID(@DBName)

Finally, another, more comprehensive approach might be setting up an Extended Events session to collect implicit conversions happening in the database. Creating a plan_affecting_convert event will allow us to uncover these CONVERT_IMPLICIT warnings. For the purposes of this demo it will be configured as follows:

CREATE EVENT SESSION [xe_ImplicitConversions] ON SERVER 
	ADD EVENT sqlserver.plan_affecting_convert
	(
		ACTION(sqlserver.transaction_id)
		WHERE ([sqlserver].[database_name] = N'FStats')
	)
WITH (STARTUP_STATE=ON)
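
Note that STARTUP_STATE=ON only makes the session start automatically with the instance; to begin collecting right away, the session also has to be started explicitly:

-- Start the session now (it can then be watched via "Watch Live Data" in SSMS).
ALTER EVENT SESSION [xe_ImplicitConversions] ON SERVER STATE = START;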

Implicit Conversions Woes

But why should we care about implicit conversions? Does it impact performance that much? Let’s find out! The following query retrieves the disciplinary sheet of football matches where 4 or more players were sent off.

DECLARE @NumRedCards tinyint = 4;
SELECT  
	MatchDetails, 
	HF HomeFouls, AF AwayFouls, 
	HY HomeYellowCards, AY AwayYellowCards, 
	HR HomeRedCards, AR AwayRedCards, TRC TotalRedCards
FROM 
	MatchesDetails 
WHERE 
	TRC >= @NumRedCards

Everything looks fine, right? Despite that, if you sense evil lurking, you’re absolutely correct. Using the previous xe_ImplicitConversions session, we’ll be able to see this:

Accordingly, the DMV query specified above will reveal conversion problems that may affect the cardinality estimate and, consequently, the query plan choice.

If we click on the query plan and then hover on both operators, the following information will pop up:

Besides the obvious warnings, it’s clear that a nasty Clustered Index Scan was done and it’s quite strange that SQL Server didn’t cry for an index, isn’t it? Let’s create the bloody thing and ignore the implicit conversion, for now.

CREATE NONCLUSTERED INDEX IDX_MatchesDetails_TRC_Includes ON MatchesDetails(TRC)
INCLUDE(MatchDetails,HF,AF,HY,AY,AR,HR)
WITH(DATA_COMPRESSION=PAGE)

Noice! Now, with an unstoppable confidence, we will rerun the query and the plan will unequivocally show us that we’ve…

Failed. Miserably. An ugly Clustered Index Scan is still happening and the logical reads remain at 161 516.

Ok, so let’s change the approach here. First by dropping the index:

DROP INDEX IDX_MatchesDetails_TRC_Includes ON MatchesDetails

Then by analyzing the warning presented in the execution plan. According to it, there’s a data type mismatch between TRC and @NumRedCards. If we look into the definition of MatchesDetails table, we’ll confirm that TRC is, in fact, a varchar – genius.

Solving the Case

Elementary, my dear Watson.

Although the correct path here would be to alter the column data type, for simplicity’s sake we’ll just change the data type of @NumRedCards to varchar(2) and rerun the query.

DECLARE @NumRedCards varchar(2) = 4;
SELECT  
	MatchDetails, 
	HF HomeFouls, AF AwayFouls, 
	HY HomeYellowCards, AY AwayYellowCards, 
	HR HomeRedCards, AR AwayRedCards, TRC TotalRedCards
FROM 
	MatchesDetails 
WHERE 
	 TRC >= @NumRedCards

This time, although the Clustered Index Scan hasn’t vanished, we got a missing index hint that corresponds exactly to the one created earlier. After recreating it, and rerunning the last query, the result will be the following:

Success! Just like a rainy cloud, the warning has disappeared and the day is sunny again. We’re now efficiently using the index with an Index Seek and the logical reads dropped to a glorious 4.
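
As noted above, the cleaner long-term fix would be to correct the column itself. A sketch, assuming all existing TRC values are valid small integers, the column holds no NULLs, and nothing else depends on it:

-- Drop the index that references TRC, fix the column type, then recreate the index.
DROP INDEX IDX_MatchesDetails_TRC_Includes ON MatchesDetails;

ALTER TABLE MatchesDetails ALTER COLUMN TRC tinyint NOT NULL;  -- assumes no NULLs and values <= 255

CREATE NONCLUSTERED INDEX IDX_MatchesDetails_TRC_Includes ON MatchesDetails(TRC)
INCLUDE(MatchDetails,HF,AF,HY,AY,AR,HR)
WITH(DATA_COMPRESSION=PAGE);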

Cry Me a River

There’s nothing much left to say, in this context, regarding how implicit conversions can mess up the cardinality estimates and consequently ruin SARGability, index usage, and burn CPU. Despite that, it’s relevant to mention that implicit conversions aren’t always a problem, even if SQL Server says so. One example of that is when the conversion occurs on the SELECT statement, after the rows are fetched. SQL Server will have an embarrassing outbreak, threatening you with the same old warnings, stating that cardinality estimate can be affected, but it’s all a lie, it certainly won’t. Nevertheless, stay alert for the real performance killers! 🖖

Rowstore Adaptive Joins

In this day and age, join operations in SQL Server can be made using one of the following algorithm types: Nested Loops, Merge or Hash. In theory, each one provides better performance under different conditions (e.g. amount of data).

  • Nested Loops: One of the tables is designated as the outer table and the other as the inner table. Each row of the outer table is validated against all rows from the inner table and, if a match is found, the row is included in the result. Nested Loops is typically preferred when the inputs are small. There are three variants of a Nested Loops Join:
    • Naive: Scans an entire table or index;
    • Index: Uses an existing index to seek data;
    • Temporary: Index is built as part of the query plan (and destroyed upon completion of the query);
  • Hash Match: Hash Match consumes two inputs, corresponding to the joined tables. A hash table is created using the first input (also known as the Build input), then the same is done for the second input (the Probe). The process of building the hash table starts with hashing the join key values of the Build input and placing each row in the appropriate bucket, depending on the hash value. Afterwards, the Probe input is processed: the same hash function is applied, the target bucket is determined and, finally, the value is compared to the ones already inside. If there is a match, and after SQL Server verifies that no hash collision happened, the row is returned. The Build input should always be the smaller one, because it will be stored in memory.

And that’s it. No, I did not forget about Merge Join, it happens that Merge was just ignored (like Internet Explorer or Terms of Agreement). Adaptive Join allows SQL Server to choose between Nested Loops and Hash joins on the same execution plan, deferring this choice, so it can understand which one will be better, based on the query input.

So, if until now (at least in rowstore) there wasn’t a way to adapt to different workloads inside the same execution plan and specific join algorithms thrive under different conditions, how can, in theory, Adaptive Joins save the day?

The Horrors of Parameter Sniffing

Parameter sniffing is easily one of the most common problems that almost anyone who works with databases has to deal with. Well, to be fair, parameter sniffing is not a problem by itself. The problem is giving the power of choice to users. Choice is the root of all evil.

A stored procedure gets executed for the first time, the plan is created based on its input and then cached. If no recompile is done, the plan will remain the same and future executions will reuse it, regardless of the input. In theory that’s good, because bad things happen when there isn’t enough memory available, and creating new plans from scratch is quite expensive. The problem begins when, in a subsequent execution, the user changes the input in a way that the query returns a million rows instead of ten. As mentioned in the previous section, each join type thrives under distinct conditions. If, in the first execution, a small amount of data was scanned, SQL Server has potentially used Nested Loops and done something like an Index Seek plus a couple of Key Lookups. In the latter execution, when SQL Server tries to process a million rows using the same approach, it suddenly has a heart attack. Telephones start to ring and clients start to yell as its soul quickly ascends to heaven.

We can try defibrillating by using RECOMPILE, but if that query runs too often, CPU will quickly escalate to 100% and we lose the patient, again. We can also try to optimize the query for a particular case by specifying a join hint or using the OPTIMIZE FOR hint, but even if we get a plan that fits the majority of executions appropriately, the deviations will potentially cause SQL Server to die in agony. Dividing the stored procedure into multiple stored procedures according to the input can do the trick, but in some cases queries can be too complex, not to mention that business logic and magic numbers hardcoded in queries are generally not a good idea. So, what’s left besides a bunch of bad ideas?
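
For reference, those hint-based mitigations look roughly like this (a sketch reusing the demo’s Matches table; the values are purely illustrative):

-- OPTIMIZE FOR: build the plan as if the variable held a "typical" value.
DECLARE @TeamId int = 1;

SELECT COUNT(1)
FROM Matches
WHERE HomeTeam = @TeamId
OPTION (OPTIMIZE FOR (@TeamId = 2));
-- Alternatives mentioned above: OPTION (RECOMPILE), or a join hint such as OPTION (HASH JOIN).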

Perhaps Adaptive Joins is the hero we deserve. Basically, SQL Server defines a threshold (related to the number of rows) that represents the tipping point below which it becomes more efficient to use Nested Loops instead of Hash Match. The Adaptive Join starts the execution as a Hash Join, then, after processing the input from the Build phase, it compares the resulting number of rows to this threshold. If the number of rows is greater than or equal to the threshold, execution proceeds as a Hash Join; otherwise, the algorithm type is changed to Nested Loops.

Therefore, SQL Server becomes able to adapt a single plan to different workloads. If a large number of rows are being processed, it can use Hash Join, otherwise, if few rows are involved, Nested Loop Join will be chosen. No recompile. No hints. No Columnstore wizardry needed.

Now, let’s check if Adaptive Joins live up to the expectations.

Improvise Adapt Overcome

For the purposes of this demo, I’m using a sample of FootballData public stats. Additionally, I’ve added some data in order to facilitate the demonstration of some concepts and behaviors. Bear in mind that in Standard Edition you won’t be able to use this feature.

We’ll first set the database compatibility level to 140 and execute a stored procedure with two different parameters, so we can understand the behavior prior to SQL Server 2019.

ALTER DATABASE [FStats] SET COMPATIBILITY_LEVEL = 140
GO
SET STATISTICS IO ON
GO
CREATE OR ALTER PROCEDURE GetHomeVictoriesByTeamId      
	@TeamId INT      
AS          
BEGIN          
SELECT  
	m.Id,  
	m.FTHG HomeGoals,  
	m.FTAG AwayGoals,    
	d.Att Attendance,  
	d.Stadium,  
	d.Att  
FROM           
	Matches m          
JOIN           
	MatchesDetails d ON m.id = d.MatchDetails       
WHERE      
	m.HomeTeam = @TeamId AND 
	FTR = 'H' AND 
	d.Att > 20000           
ORDER BY       
	d.att DESC        
END 
GO
EXEC GetHomeVictoriesByTeamId 2
EXEC GetHomeVictoriesByTeamId 1

Ouch! Both execution plans used Nested Loops and Index Seek to retrieve results. The plan was created and cached in the first execution that retrieved 18 rows. Then SQL Server reused the plan to process 500304 victories from Benfica (Id = 1) and things got nasty. We can confirm in Messages that the second plan was not efficient.

By executing the same statement again with a recompile, we can notice how a more appropriate plan differs. MatchesDetails’ Scan count dropped from 500304 to 5 and logical reads from 1597472 to 1539. That’s more like it! But, as we discussed, using OPTION (RECOMPILE) is far from always being appropriate, since it is a heavy resource consumer.

EXEC GetHomeVictoriesByTeamId 1 WITH RECOMPILE;

Next, we’ll set the database compatibility level up to 150 and run the stored procedure again, so we can see Adaptive Joins in action.

ALTER DATABASE [FStats] SET COMPATIBILITY_LEVEL = 150
GO
EXEC GetHomeVictoriesByTeamId 1;
EXEC GetHomeVictoriesByTeamId 2;
EXEC GetHomeVictoriesByTeamId 1;

By looking into the first execution plan, we’ll notice the “new” Adaptive Join operator (this looks intuitive), so it looks like we’re ready to adapt and overcome. There are a few things going on in this plan, so let’s analyze it step by step.

The properties of this operator show us some interesting information, e.g. that the threshold for the algorithm to change to Nested Loops is 9607.83. We can also see that 500304 rows were returned, so the chosen join type was “HashMatch”. This is the expected behavior, as the number of rows is higher than the threshold, hence no change to Nested Loops.

If we recall the execution plan, it’s easy to corroborate this information. Since the number of rows exceeded the threshold and a Hash join was done, the execution went to the Index Scan branch. This can also be confirmed by the displayed number of rows that flowed through this operator, in this case 500304. Correspondingly, the lower Index Seek branch, had no rows.

The second plan, corresponding to an input that retrieves far less rows, shows a different execution branch. In the Adaptive Join operator info, it’s possible to notice that the chosen join algorithm type (NestedLoops) is appropriate to the number of rows – 18. It’s also visible that these rows were retrieved by an Index Seek instead of an Index Scan. So far, so good.

The third and last plan, with the same input as the first, will be exactly the same, right? Well… Sadly, no. Although the operators are the same and the execution went through the same branch, it looks like we landed on spillage village. Both the Sort and the Adaptive Join have warning signs on them, stating that data was spilled to tempdb, and we can see that the query took longer to run. Despite that, IO statistics maintain the same values as the first execution. Interestingly, this behavior is caused by another adaptive query processing feature that was explored in a previous post, Adaptive Memory Grant Feedback.

If you are lucky enough, after a couple of executions SQL Server will be able to adapt the memory grant to its ideal value and, just like clouds, when these warnings disappear it will be a beautiful day. Unfortunately, sometimes SQL Server isn’t able to do that and gives up on adapting the grant (for more details, check the mentioned post). Let’s execute the stored procedure a couple more times and check the outcome.

EXEC GetHomeVictoriesByTeamId 1
GO 30

Under these circumstances, SQL Server wasn’t able to adapt the memory grant and we didn’t get the stable ending we were looking for. Nevertheless, things could be worse: The initial memory grant is fairly adequate, so no spills this time. Furthermore, the chosen join algorithm is appropriate, IO statistics and execution time are pretty similar too.

In conclusion, looks like Adaptive Joins can really improve the performance of our servers, with virtually no effort needed. Nonetheless, it’s probably not a good idea to go all-in and change your production server compatibility level to 150, hoping that the robots will do what they were programmed to do. You must have seen enough movies to know that it’s not how it ends. But seriously, Adaptive Joins won’t fix all problems related to parameter sniffing and can actually cause problems that aren’t easy to troubleshoot. So, it’s reasonable to proceed cautiously and make use of tools like Query Store to fix regressions.
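
If a regression does show up, Query Store can pin the last known good plan, roughly like this (the ids below are placeholders you would read from the Query Store reports or catalog views):

-- Find the query and its plans (illustrative filter):
SELECT q.query_id, p.plan_id, qt.query_sql_text
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
WHERE qt.query_sql_text LIKE '%GetHomeVictoriesByTeamId%';

-- Force the plan that behaved well (placeholder ids):
EXEC sp_query_store_force_plan @query_id = 1, @plan_id = 1;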