The Deceiving N

In this post, we’ll briefly look into a common misconception about nchar and nvarchar data types.

We begin by setting the breakfast table:

CREATE TABLE Breakfast
(
    Id SMALLINT IDENTITY(1,1),
    Product NVARCHAR(10)
)
INSERT INTO Breakfast (Product) VALUES ('Caviar'),('Blinis')

Later, someone spots our mistake creating the breakfast menu and tries to apply the only logical correction to it:

UPDATE Breakfast SET Product = 'Beluga Gold Vodka' WHERE Product = 'Blinis' 

But, despite the good thinking and good taste, the attempt fails.

Furthermore, looking into the Breakfast table definition can make thinks even more confusing, because the Lenght value is slighty greater than the inputed string.

It’s important to understand that in NVARCHAR(n) and NCHAR(n), the n doesn’t correspond to the number of characters that can be stored, but rather the string length in byte-pairs (from 1 to 4000). This misconception is probably explained by the fact that in characters defined within the Unicode range of 0 to 65 535, one character is stored per byte-pair. However, in higher Unicode ranges (65 536 to 1 114 111) one character may use two byte-pairs.

In our case, one character is being stored per byte-pair, but the discrepancy that we saw in the column information, more specifically in its lenght can lead to some confusion. Since we’ve initially set our n to be 10 in the Product column, those 10 byte-pairs will correspond and justify the value of 20 presented in the object definition for the column lenght. However, because our NVARCHAR column uses 2 bytes per character, the maximum number of characters that can be stored is 10.

If we run the following query:

SELECT
	column_name,
	character_maximum_length,
	character_octet_length
FROM 
	information_schema.columns
WHERE 
	table_name = 'Breakfast' AND
	Column_name ='Product'

The character_maximum_length and character_octet_length for the Product column will confirm exactly what we’ve just saw.

Now the modifications to allow a proper breakfast can be made.

Za zdarovye!