Understanding the Difference Between nVARCHAR and VARCHAR

7 min readMay 13, 2024

Introduction

Hey there, fellow SQL Server enthusiasts! Today, we’re diving into the world of character data types, specifically nVARCHAR and VARCHAR. As someone who’s worked with SQL Server for years, I’ve come to appreciate the importance of understanding these data types and how they can impact your database design and performance. In this article, we’ll explore the key differences between nVARCHAR and VARCHAR, and I’ll share some personal insights and examples along the way. By the end, you’ll have a solid grasp of when to use each data type and how to optimize your databases accordingly. Let’s get started!

“VARCHAR” is an abbreviation for “variable-length character,” referring to its ability to store character strings of varying lengths efficiently.

The term “VARCHAR” is pronounced as “var-char” or “var-care.”
Here’s a breakdown of the pronunciation:
“VAR” is pronounced as it sounds, rhyming with “car” or “far.”
“CHAR” is pronounced as “char” or “care,” rhyming with “bear” or “dare.”

Example characters

`NVARCHAR` (Unicode characters):

English letters (lowercase and uppercase): a, b, c, …, z, A, B, C, …, Z
Numbers: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Punctuation marks: ., ,, !, ?, “, ‘, (, ), [, ], {, }, …
Special characters: @, #, $, %, ^, &, *, _, +, -, =, |, \, /, ~, `
Accented characters: á, é, í, ó, ú, à, è, ì, ò, ù, â, ê, î, ô, û, ä, ë, ï, ö, ü, ñ, …
Emojis: 😀, 😃, 😄, 😁, 😆, 😅, 😂, 🤣, 😊, 😇, …

Non-Latin characters:

Greek: α, β, γ, δ, ε, ζ, η, θ, ι, κ, λ, μ, ν, ξ, ο, π, ρ, ς, σ, τ, υ, φ, χ, ψ, ω
Cyrillic: А, Б, В, Г, Д, Е, Ё, Ж, З, И, Й, К, Л, М, Н, О, П, Р, С, Т, У, Ю, Я, …
Chinese: 你好, 世界, 欢迎, 谢谢, 再见, …
Japanese: こんにちは, ありがとう, さようなら, …
Arabic: ا, ح, خ, د, س, ش, ص, ض, ط, ظ, ع, غ, ف, ق, ك, ل, م, ن, ه, …

`VARCHAR` (Non-Unicode characters):

English letters (lowercase and uppercase): a, b, c, …, z, A, B, C, …, Z
Numbers: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Punctuation marks: ., ,, !, ?, “, ‘, (, ), [, ], {, }, …
Special characters: @, #, $, %, ^, &, *, _, +, -, =, |, \, /, ~, `

Extended ASCII characters (depending on the collation):

Accented characters: á, é, í, ó, ú, à, è, ì, ò, ù, â, ê, î, ô, û, ä, ë, ï, ö, ü, ñ, …
Additional symbols: ¡, ¢, £, ¤, ¥, ¦, §, ¨, ©, ª, «, µ, ¶, », ¼, ½, ¾, ¿, ×, ÷, …

Note that the specific characters supported by VARCHAR depend on the collation settings of the database or column. The examples provided above assume a collation that supports extended ASCII characters.

Keep in mind that NVARCHAR can store a much wider range of characters, including those from various languages and scripts, as well as emojis and special symbols. VARCHAR, on the other hand, is limited to the characters supported by the specific collation being used.

The Defining Difference

The primary distinction between nVARCHAR and VARCHAR lies in their support for Unicode characters. nVARCHAR is designed to store Unicode data, which means it can handle a wide range of characters from different languages and scripts. On the other hand, VARCHAR is limited to storing non-Unicode data, typically based on the database’s collation setting.

Unicode Support:

NVARCHAR is a Unicode data type that supports a wide range of characters from different languages and scripts, including non-Latin characters, symbols, and emojis.
VARCHAR is a non-Unicode data type that supports a limited set of characters, typically based on the database's collation settings. It is primarily used for storing ASCII or extended ASCII characters.

2. Storage Size:

NVARCHAR uses 2 bytes per character to store data, regardless of the actual character being stored. This allows for consistent storage size and enables the storage of a wide range of Unicode characters.
VARCHAR uses 1 byte per character to store data for non-Unicode characters. If the database collation is set to a single-byte character set, VARCHAR can store ASCII or extended ASCII characters efficiently.

3. Maximum Length:

The maximum length for NVARCHAR is specified in the number of characters. The maximum length for NVARCHAR is 4,000 characters.

For example, NVARCHAR(100) can store up to 100 Unicode characters, which equates to a maximum storage size of 200 bytes (100 characters * 2 bytes per character).

The maximum length for VARCHAR is also specified in the number of characters. The maximum length for VARCHAR is 8,000 characters.

For example, VARCHAR(100) can store up to 100 non-Unicode characters, which equates to a maximum storage size of 100 bytes (100 characters * 1 byte per character).

4. Collation Sensitivity:

NVARCHAR is collation-sensitive, meaning that the collation settings of the database or column can affect the sorting order, comparison, and equality of NVARCHAR values.
VARCHAR is also collation-sensitive, but since it typically deals with a limited character set, the impact of collation may be less noticeable compared to NVARCHAR.

5. Storage Efficiency:

If the majority of the data being stored consists of ASCII or extended ASCII characters, using VARCHAR can be more storage-efficient compared to NVARCHAR, as it uses 1 byte per character instead of 2 bytes.
If the data being stored includes a significant number of Unicode characters or requires support for multiple languages, using NVARCHAR is recommended to ensure proper storage and handling of the data.

When deciding between `NVARCHAR` and `VARCHAR`, consider the following factors:

The type of characters you need to store (Unicode vs. non-Unicode)
The storage efficiency and memory consumption
The collation requirements of your application
The consistency and compatibility with other systems or applications

In general, if you need to store multilingual data or require support for a wide range of characters, it’s recommended to use NVARCHAR. If your data primarily consists of ASCII or extended ASCII characters and storage efficiency is a concern, VARCHAR can be a suitable choice.

Let’s consider an example with `NVARCHAR(150)` and `VARCHAR(150)` columns.

NVARCHAR(150):

NVARCHAR is a Unicode data type that uses 2 bytes per character.
The maximum storage size for NVARCHAR(150) is 150 characters, which equates to 300 bytes (150 * 2 bytes).
SQL Server includes a 2-byte overhead for each NVARCHAR value.
Therefore, the total storage size for one NVARCHAR(150) column per row is 302 bytes (300 bytes + 2 bytes overhead).

2. VARCHAR(150):

VARCHAR is a non-Unicode data type that uses 1 byte per character.
The maximum storage size for VARCHAR(150) is 150 characters, which equates to 150 bytes.
SQL Server includes a 2-byte overhead for each VARCHAR value.
Therefore, the total storage size for one VARCHAR(150) column per row is 152 bytes (150 bytes + 2 bytes overhead).

Now, let’s calculate the table sizes assuming we have 1 million rows in each table:

Table with one NVARCHAR(150) column:

Storage size per row: 302 bytes
Number of rows: 1,000,000
Total table size: 302 bytes * 1,000,000 rows = 302,000,000 bytes ≈ 288 MB

2. Table with one VARCHAR(150) column:

Storage size per row: 152 bytes
Number of rows: 1,000,000
Total table size: 152 bytes * 1,000,000 rows = 152,000,000 bytes ≈ 145 MB

Again, it’s important to note that the actual table sizes may vary based on factors such as data compression, row overhead, and the presence of other columns in the table. If the actual data stored in the columns is shorter than the maximum length, the storage size will be less than the calculated maximum.

Performance Impact

The choice between nVARCHAR and VARCHAR can also have an impact on query performance. In general, using nVARCHAR requires more storage space and can result in slightly slower query execution compared to VARCHAR, especially for large datasets.

However, it’s important to consider the trade-offs. If your application needs to support multilingual data or complex scripts, using nVARCHAR is essential for maintaining data integrity and avoiding encoding issues. The performance impact may be negligible compared to the benefits of proper Unicode support.

I’ve seen cases where developers prematurely optimized their databases by using VARCHAR everywhere, only to face challenges later when the application needed to support international users. It’s crucial to strike a balance and choose the appropriate data type based on your specific requirements.

Conclusion

In summary, understanding the difference between nVARCHAR and VARCHAR is crucial for designing effective and efficient SQL Server databases. nVARCHAR provides full Unicode support at the cost of fixed storage and slightly slower performance, while VARCHAR offers variable storage and faster execution for non-Unicode data.

As you embark on your SQL Server journey, keep these differences in mind and make informed decisions based on your application’s needs. Don’t be afraid to experiment and profile your queries to find the optimal balance between storage, performance, and functionality.

Remember, the key to success in database design is understanding your data and choosing the right tools for the job. With a solid grasp of nVARCHAR and VARCHAR, you’ll be well-equipped to tackle any character data challenges that come your way!

Understanding the Difference Between nVARCHAR and VARCHAR

Introduction

Example characters

`NVARCHAR` (Unicode characters):

`VARCHAR` (Non-Unicode characters):

The Defining Difference

When deciding between `NVARCHAR` and `VARCHAR`, consider the following factors:

Let’s consider an example with `NVARCHAR(150)` and `VARCHAR(150)` columns.

Now, let’s calculate the table sizes assuming we have 1 million rows in each table:

Performance Impact

Conclusion

Written by Joseph Chakola

No responses yet

Understanding the Difference Between nVARCHAR and VARCHAR

Introduction

Example characters

NVARCHAR (Unicode characters):

VARCHAR (Non-Unicode characters):

The Defining Difference

When deciding between NVARCHAR and VARCHAR, consider the following factors:

Let’s consider an example with NVARCHAR(150) and VARCHAR(150) columns.

Now, let’s calculate the table sizes assuming we have 1 million rows in each table:

Performance Impact

Conclusion

Written by Joseph Chakola

No responses yet

`NVARCHAR` (Unicode characters):

`VARCHAR` (Non-Unicode characters):

When deciding between `NVARCHAR` and `VARCHAR`, consider the following factors:

Let’s consider an example with `NVARCHAR(150)` and `VARCHAR(150)` columns.