Why do some SQL strings have an 'N' prefix?

refer: http://databases.aspfaq.com/general/why-do-some-sql-strings-have-an-n-prefix.html

 

You may have seen Transact-SQL code that passes strings around using an N prefix. This denotes that the subsequent string is in Unicode (the N actually stands for National language character set). Which means that you are passing an NCHAR, NVARCHAR or NTEXT value, as opposed to CHAR, VARCHAR or TEXT. See  Article #2354 for a comparison of these data types. 
 
Unicode is typically used in database applications which are designed to facilitate code pages which extend beyond the English and Western Europe code pages ( Erland Sommarskog, a native of Sweden, refers to this set as "Germanic and Romance languages"), for example Chinese. Unicode is designed so that extended character sets can still "fit" into database columns. What this means is that Unicode character data types are limited to half the space, because each byte actually takes two bytes (Unicode is sometimes referred to as "double-wide"). For more information on Unicode, see Unicode.org. Note that there are many encoding schemes in the Unicode standard, but SQL Server only supports one: UTF-16. 
 
While using Unicode is a design choice you can make in building your own applications, some facilities in SQL server require it. One example is sp_executeSQL. If you try the following: 
 
EXEC sp_ExecuteSQL 'SELECT 1'
 
You will get this error: 
 
Server: Msg 214, Level 16, State 2, Procedure sp_executesql, Line 1 
Procedure expects parameter '@statement' of type 'ntext/nchar/nvarchar'.
 
You can get around this in two ways: 
 
-- (a) using the N prefix 
 
EXEC sp_ExecuteSQL N'SELECT 1' 
 
-- (b) using a variable 
 
DECLARE @sql NVARCHAR(100) 
SET @sql = N'SELECT 1' 
EXEC sp_ExecuteSQL @sql
 
Note that implicit conversion makes the N prefix optional in case (b); however, for legibility and consistency, you should always use the prefix when defining Unicode strings. One reason is that leaving it off can actually change your data if it contains Unicode characters (losing the additional information), as in the following example: 
 
DECLARE @n NVARCHAR(10) 
SET @n = 'a' 
PRINT @n 
SET @n = N'a' 
PRINT @n
 
The first assignment, which didn't use the N prefix, gets printed as a regular a. Only the second maintains the character that was actually supposed to be represented. As you can imagine, if you are intending to support data entry in foreign languages and code pages, you will likely need to test for Unicode support (making sure that such columns support Unicode, and that data won't be lost when passed into stored procedures, functions, etc.). Note that your application will need to handle Unicode as well; for example, when you try to print this character from ASP... 
 
<% 
    Response.Write("a") 
%>
 
...it actually prints out the string aa. (This result might depend on your codepage and regional settings.) So you might consider translating your data into its ASCII equivalent, e.g. a = &#0257;. 
 
Another reason you want to avoid implicit conversion is that there are some potentially serious performance issues. Consider the following quite simple repro: 
 
USE tempdb  
GO 
 
CREATE TABLE a 
( 
    b VARCHAR(3), 
    c NVARCHAR(3) 
) 
CREATE INDEX b ON a(b) 
CREATE INDEX c ON a(c) 
GO 
 
SET NOCOUNT ON 
 
INSERT a SELECT 'foo', N'foo' 
INSERT a SELECT 'bar', N'bar' 
 
DECLARE 
    @b VARCHAR(3), 
    @c NVARCHAR(3) 
 
SELECT 
    @b = 'foo', 
    @c = N'foo' 
 
SELECT * FROM a WHERE b = @b 
SELECT * FROM a WHERE b = @c 
SELECT * FROM a WHERE c = @b 
SELECT * FROM a WHERE c = @c 
SELECT * FROM a WHERE b LIKE @b 
SELECT * FROM a WHERE b LIKE @c 
SELECT * FROM a WHERE c LIKE @b 
SELECT * FROM a WHERE c LIKE @c 
 
DROP TABLE a
 
Paste the code into Query Analyzer, turn execution plan on, and let her rip. You'll observe the following breakdown of percentage of work (roughly, depending on your hardware): 
 
VARCHAR = VARCHAR4.48%
VARCHAR = NVARCHAR13.31%
NVARCHAR = VARCHAR4.48%
NVARCHAR = NVARCHAR4.48%
VARCHAR LIKE VARCHAR4.48%
VARCHAR LIKE NVARCHAR13.31%
NVARCHAR LIKE VARCHAR4.48%
NVARCHAR LIKE NVARCHAR4.48%
 
Now, that's not the whole story; we all know that there are many other factors, such as I/O, that will impact the actual time each portion of the query takes. The key is that implicit conversion *can* cause a table scan instead of an index seek, and on larger tables this can really hurt. While it's important to understand why this happens and in which scenarios, my recommendation is to match your character-based datatypes as explicitly as possible. 
 
One other thing to watch out for: your database may be using Unicode without your knowledge. If you upsize from Access to SQL Server, for example, character-based text columns might be translated to Unicode (I believe this is a catch-all technique; in case Access was storing Unicode strings, or if you might be storing Unicode strings later, you won't lose data or require changes). I think the Access upsizing tools should be updated to force a conscious choice, so that you aren't wasting space for nothing, and so that you know that you made a decision at all. 
 
For a more thorough discussion of Unicode and the N prefix, please see  KB #239530, this  MSDN article, and this  Google thread
 
In other RDBMS platforms, or in the ANSI and/or ISO specifications, you might see prefixes other than N being used against values. (Current versions of SQL Server only support Unicode.) Here are the additional monikers I am aware of: 
 
BThis is used to denote a BINARY string expressed in bits (0 and 1 only)
XThis is used to denote a BINARY string expressed in hexadecimal (0 -> F)

Related Articles

转载于:https://www.cnblogs.com/Jenny90/p/3949607.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值