Oracle: Measuring the Length of Chinese Strings and Detecting Garbled Text

MtiredM

Article link: https://blog.csdn.net/MtiredM/article/details/123755048


References:
https://blog.csdn.net/iteye_4537/article/details/82166593
https://blog.csdn.net/tianlesoftware/article/details/6863797

The Oracle LENGTH functions

Official documentation: http://download.oracle.com/docs/cd/E11882_01/server.112/e26088/functions088.htm#SQLRF00658

The LENGTH functions return the length of char. LENGTH calculates length using characters as defined by the input character set.

LENGTHB uses bytes instead of characters.

LENGTHC uses Unicode complete characters.

LENGTH2 uses UCS2 code points.

LENGTH4 uses UCS4 code points.
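To make the different counting units concrete, here is a rough Python sketch (not Oracle code). It assumes that LENGTH4 corresponds to counting full Unicode code points and LENGTH2 to counting UTF-16 code units; the helper names are hypothetical and exist only for this illustration.

```python
# Illustration only: these helpers mimic the counting units, not Oracle itself.

def length_in_code_points(text: str) -> int:
    """Count full Unicode code points (assumed analogue of LENGTH4)."""
    return len(text)


def length_in_utf16_units(text: str) -> int:
    """Count UTF-16 code units (assumed analogue of LENGTH2)."""
    return len(text.encode("utf-16-le")) // 2


def length_in_bytes(text: str, codec: str) -> int:
    """Count bytes in the given encoding (analogue of LENGTHB)."""
    return len(text.encode(codec))


# 'A', a common Chinese character, and U+20000 (outside the Basic Multilingual Plane).
sample = "A安\U00020000"

print(length_in_code_points(sample))     # code points
print(length_in_utf16_units(sample))     # UTF-16 code units (U+20000 needs two)
print(length_in_bytes(sample, "utf-8"))  # bytes in UTF-8
```

The units only diverge for characters outside the Basic Multilingual Plane, which is why the common examples below show the same value for every character-based count.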

char can be any of the data types CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.

The exceptions are LENGTHC, LENGTH2, and LENGTH4, which do not allow char to be a CLOB or NCLOB. The return value is of data type NUMBER. If char has data type CHAR, then the length includes all trailing blanks. If char is null, then this function returns null.

Restriction on LENGTHB
The LENGTHB function is supported for single-byte LOBs only. It cannot be used with CLOB and NCLOB data in a multibyte character set.

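Example (from the Oracle documentation, which assumes a double-byte database character set)

SELECT LENGTH('CANDIDE') "Length in characters" FROM DUAL;

Length in characters
--------------------
7

SELECT LENGTHB('CANDIDE') "Length in bytes" FROM DUAL;

Length in bytes
---------------
14

Note that 14 is only returned when every character occupies two bytes. In ZHS16GBK, ASCII letters occupy one byte each, so LENGTHB('CANDIDE') returns 7 there.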

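LENGTHB can return different values on different databases because their character sets differ. For example, ZHS16GBK stores a Chinese character in two bytes, while UTF8 stores common Chinese characters in three bytes.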
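The byte arithmetic can be checked outside the database. The sketch below uses Python's "gbk" and "utf-8" codecs as stand-ins for the ZHS16GBK and UTF8 database character sets; that mapping is an assumption made for illustration, and the function name is hypothetical.

```python
# Stand-in for comparing LENGTH and LENGTHB under two character sets.
# Assumption: Python's "gbk" and "utf-8" codecs approximate ZHS16GBK and UTF8.

def char_and_byte_length(text: str, codec: str) -> tuple[int, int]:
    """Return (number of characters, number of bytes in the given encoding)."""
    return len(text), len(text.encode(codec))


for value in ("安庆", "AnQing"):
    for codec in ("gbk", "utf-8"):
        chars, size = char_and_byte_length(value, codec)
        print(f"{value!r} in {codec}: {chars} characters, {size} bytes")
```

The ASCII string has the same byte length under both codecs, while the Chinese string grows from two bytes per character to three.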

Example (ZHS16GBK character set)

select length('安庆') from dual;

----------
2

select lengthb('安庆') from dual;

----------
4

select length('AnQing') from dual;

----------
6

select lengthb('AnQing') from dual;

----------
6

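This example shows an important use of LENGTH and LENGTHB: comparing them tells us whether a value contains Chinese characters.

If the value contains Chinese, then LENGTH() != LENGTHB().
If the value contains no Chinese, then LENGTH() = LENGTHB().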

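Checking whether a record contains Chinese

This method is intended for databases whose character set is GBK and is not meant for other character sets. In GBK, a Chinese character has a LENGTH of 1 character and a LENGTHB of 2 bytes. Keep in mind that the comparison flags any multibyte character (full-width punctuation, for example), not only Chinese characters.

select * from t where length(c1) != lengthb(c1);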
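The same comparison can be sketched in Python to see which values such a query would return. This is only an illustration under the assumption that the "gbk" codec behaves like ZHS16GBK; `has_multibyte` and the sample rows are hypothetical.

```python
# Mimics "length(c1) != lengthb(c1)" for a GBK-encoded column.
# Assumption: Python's "gbk" codec approximates ZHS16GBK.

def has_multibyte(value: str, codec: str = "gbk") -> bool:
    """True when the character count differs from the encoded byte count."""
    return len(value) != len(value.encode(codec))


rows = ["AnQing", "安庆", "Dave来自安庆", "hello, world"]
matches = [row for row in rows if has_multibyte(row)]
print(matches)

# A full-width comma is also two bytes in GBK, so it is flagged as well.
print(has_multibyte("a，b"))
```

The second check shows the limitation of the test: it detects multibyte characters in general, so a value with only full-width punctuation is reported too.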

Checking whether a record contains garbled text

This requires the ASCIISTR function.

ASCIISTR takes as its argument a string, or an expression that resolves to a string, in any character set and returns an ASCII version of the string in the database character set. Non-ASCII characters are converted to the form \xxxx, where xxxx represents a UTF-16 code unit.

In other words, ASCIISTR leaves ASCII characters unchanged and rewrites every non-ASCII character as \xxxx.

Example

select asciistr('/\/Dave come from 安?庆?') as str from dual;

---------
/\005C/Dave come from \5B89?\5E86?

Here '安' is converted to \5B89 and '庆' to \5E86.

Note the special character "\": when it appears in the input, it is converted to "\005C".
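The conversion rule can be reproduced with a small Python function. This is a minimal sketch of the behaviour described above, not Oracle's implementation; the function name simply mirrors the SQL function for readability.

```python
# Sketch of the ASCIISTR rule: ASCII stays as-is, the backslash and every
# non-ASCII character become \XXXX, one group per UTF-16 code unit.

def asciistr(text: str) -> str:
    out = []
    for ch in text:
        if ch != "\\" and ord(ch) < 128:
            out.append(ch)  # plain ASCII is copied unchanged
            continue
        data = ch.encode("utf-16-be")  # 2 bytes per UTF-16 code unit
        for i in range(0, len(data), 2):
            unit = int.from_bytes(data[i:i + 2], "big")
            out.append(f"\\{unit:04X}")
    return "".join(out)


print(asciistr("/\\/Dave come from 安?庆?"))
```

Encoding to UTF-16 first means a character outside the Basic Multilingual Plane produces two \XXXX groups, which matches the "UTF-16 code unit" wording in the documentation.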

We can also use the UNISTR function to convert the output of ASCIISTR back:

select UNISTR('\5E86') from dual;

--------
庆

When a record contains garbled Chinese, the damaged characters show up as U+FFFD, the Unicode replacement character:

select UNISTR('\FFFD') from dual;

---------
�
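The reverse conversion can be sketched in Python as well. This is a simplified illustration that handles only \XXXX groups (it does not handle a doubled backslash), and it assumes each group is one UTF-16 code unit; the function name mirrors the SQL function for readability.

```python
import re

# Simplified sketch of UNISTR: each \XXXX group is read as one UTF-16 code unit.

def unistr(text: str) -> str:
    units = re.sub(
        r"\\([0-9A-Fa-f]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )
    # Re-encode through UTF-16 so that surrogate pairs merge into one character.
    return units.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


print(unistr(r"\5E86"))
print(unistr(r"\FFFD") == "\ufffd")
```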

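Note:
Do not run the SQL above in sqlplus, because sqlplus output is affected by the local environment (for example the client's NLS_LANG setting). Test in a third-party tool such as PL/SQL Developer or Toad.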

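Once Chinese data has become garbled, its ASCIISTR value contains one of two special symbols: "?" or \FFFD. Matching these two symbols is enough to tell whether a record contains garbled text. Because a legitimate question mark in the data also matches, treat the result as a list of candidates to review.

select * from USER_INFO where asciistr(USERNAME) like '%?%' or asciistr(USERNAME) like '%\FFFD%';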
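The sketch below shows, in Python, how these two symbols typically come about and how the same check looks outside the database. It is an illustration under assumptions: the two failure modes are simulated with Python codecs, and `looks_garbled` is a hypothetical helper.

```python
# Two common ways Chinese text gets damaged, simulated with Python codecs.

original = "安庆"

# Case 1: the target character set cannot represent the character,
# so each character is replaced by "?".
lossy = original.encode("ascii", errors="replace").decode("ascii")

# Case 2: GBK bytes are read as if they were UTF-8; the invalid byte
# sequences decode to U+FFFD, the replacement character.
mangled = original.encode("gbk").decode("utf-8", errors="replace")


def looks_garbled(value: str) -> bool:
    """Same idea as the LIKE '%?%' OR LIKE '%\\FFFD%' filter."""
    return "?" in value or "\ufffd" in value


for value in (original, lossy, mangled, "Is this ok?"):
    print(repr(value), looks_garbled(value))
```

The last sample is intact text that ends in a question mark, and it is flagged too, which is the false positive to expect from the "?" match.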
