Oracle迁移到mysql字符集_oracle数据库字符集characterset迁移及变更系列一

背景

oracle数据库字符集和应用数据密切相关,数据库迁移也会涉及到数据库字符集的转换,大家常常听到的乱码之类的,导出导入表发生列长度不足的错误,也和数据库字符集不无关系,

本文我沿袭前文:http://blog.itpub.net/9240380/viewspace-1849340/ ,

itpub网友问题之AL32UTF8与ZHS16GBK 2种数据库字符集database characterset,继续学习数据库字符集的相关知识。

结论

1,表名长度最大为30个字节

2,convert函数确实是个利器,可以验证2种字符集是否兼容,达到选择目标数据库字符集的作用

3,nls_lang是OS层面的环境变量,其最后部分指定客户端采用的数据库字符集,控制EXP及IMP和EXPDP及IMPDP的字符集

4,如果NLS_LANG及数据库字符集不一致,会提示可能产生字符集变更的信息,这个可能会引发数据损失

5,千万别太轻信网上的,运行ALERT DATABASE CHARECTERSET INTERNAL_USE变更 数据库字符集,这样会引发隐性或未知的风险

6,nls_length_semantics默认值为BYTE,值也可为CHAR,ORACLE不建议采用后者,引发会引发性能问题以及运行时间错误,可能也会产生BUFFER OVERFLOW

测试

SQL> select * from v$version where rownum=1;

BANNER

----------------------------------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

SQL> create table t_testlength(a int);

Table created.

SQL> select length(object_name) from user_objects where  lower(object_name)='t_testlength';

LENGTH(OBJECT_NAME)

-------------------

12

SQL> alter table t_testlength rename to t_testlength_testlength_testlengt;

alter table t_testlength rename to t_testlength_testlength_testlengt

*

ERROR at line 1:

ORA-00972: identifier is too long

SQL> select length('t_testlength_testlength_testlengt') from dual;

LENGTH('T_TESTLENGTH_TESTLENGTH_TESTLENGT')

-------------------------------------------

33

SQL> select length('t_testlength_testlength_testle') from dual;

LENGTH('T_TESTLENGTH_TESTLENGTH_TESTLE')

----------------------------------------

30

可见表名长度最长为30个字节的长度

SQL> alter table t_testlength rename to t_testlength_testlength_testle;

Table altered.

SQL> alter table t_testlength_testlength_testle rename to t_testlength_testlength_testle1;

alter table t_testlength_testlength_testle rename to t_testlength_testlength_testle1

*

ERROR at line 1:

ORA-00972: identifier is too long

--当前数据库字符集为AL32UTF8

SQL> select parameter,value from nls_database_parameters where parameter='NLS_CHARACTERSET';

PARAMETER                      VALUE

------------------------------ ------------------------------

NLS_CHARACTERSET               AL32UTF8

--长度刻度为字节,还有个值也可以为CHAR

SQL> show parameter nls_length_semantics

NAME                                 TYPE                   VALUE

------------------------------------ ---------------------- ------------------------------

nls_length_semantics                 string                 BYTE

--查看与中文相关的数据库字符集

SQL> select distinct parameter from v$nls_valid_values;

PARAMETER

------------------------------

CHARACTERSET

SORT

TERRITORY

LANGUAGE

SQL> select parameter,value from v$nls_valid_values where parameter='CHARACTERSET' and VALUE  like '%ZH%';

PARAMETER                      VALUE

------------------------------ ------------------------------

CHARACTERSET                   ZHS16CGB231280

CHARACTERSET                   ZHS16MACCGB231280

CHARACTERSET                   ZHS16GBK

CHARACTERSET                   ZHS16DBCS

CHARACTERSET                   ZHS32GB18030

CHARACTERSET                   ZHT32EUC

CHARACTERSET                   ZHT32SOPS

CHARACTERSET                   ZHT16DBT

CHARACTERSET                   ZHT32TRIS

CHARACTERSET                   ZHT16DBCS

CHARACTERSET                   ZHT16BIG5

PARAMETER                      VALUE

------------------------------ ------------------------------

CHARACTERSET                   ZHT16CCDC

CHARACTERSET                   ZHT16MSWIN950

CHARACTERSET                   ZHT16HKSCS

CHARACTERSET                   ZHT16HKSCS31

CHARACTERSET                   ZHS16CGB231280FIXED

CHARACTERSET                   ZHS16GBKFIXED

CHARACTERSET                   ZHS16DBCSFIXED

CHARACTERSET                   ZHT32EUCFIXED

CHARACTERSET                   ZHT32TRISFIXED

CHARACTERSET                   ZHT16DBCSFIXED

CHARACTERSET                   ZHT16BIG5FIXED

22 rows selected.

好像英文字符在2种字符集占用空间是一样的

SQL> select 'abc',dump('abc'),dump(convert('abc','ZHS16GBK','AL32UTF8')) from dual;

'ABC'  DUMP('ABC')                                  DUMP(CONVERT('ABC','ZHS16GBK','AL32UTF8'))

------ -------------------------------------------- ------------------------------------------

abc    Typ=96 Len=3: 97,98,99                       Typ=1 Len=3: 97,98,99

SQL> insert into t_charset values('我们');

1 row created.

SQL> commit;

Commit complete.

SQL> select a,dump(a) from t_charset;

A                              DUMP(A)

------------------------------ --------------------------------------------------

我们                           Typ=1 Len=9: 233,142,180,230,136,156,230,187,145

可见AL32UTF8不能正常转化ZHS16GBK,中文会出现问题,且前者占用字节长度为9个,而后者变成6,造成数据损失

SQL> select a,dump(a),convert(a,'ZHS16GBK','AL32UTF8') mig_char,dump(convert(a,'ZHS16GBK','AL32UTF8')) after_dump from t_charset;

A                    DUMP(A)                                            MIG_CHAR                       AFTER_DUMP

-------------------- -------------------------------------------------- ------------------------------ --------------------------------------------------

我们                 Typ=1 Len=9: 233,142,180,230,136,156,230,187,145                        Typ=1 Len=6: 230,136,145,228,187,172

看到没,如果从AL32UTF8转化为UTF8,不会造成数据损失,中文可以正常显示,可见CONVERT函数是个利器,对于字符集转化

SQL> select a,dump(a),convert(a,'UTF8','AL32UTF8') mig_char,dump(convert(a,'UTF8','AL32UTF8')) after_dump from t_charset;

A                    DUMP(A)                                            MIG_CHAR                       AFTER_DUMP

-------------------- -------------------------------------------------- ------------------------------ --------------------------------------------------

我们                 Typ=1 Len=9: 233,142,180,230,136,156,230,187,145   我们                           Typ=1 Len=9: 233,142,180,230,136,156,230,187,145

SQL> desc database_properties;

Name              Null?    Type

----------------- -------- ------------

PROPERTY_NAME     NOT NULL VARCHAR2(30)

PROPERTY_VALUE             VARCHAR2(400

0)

DESCRIPTION                VARCHAR2(400

0)

SQL> select count(*) from database_properties;

COUNT(*)

----------

36

SQL> select distinct property_name from database_properties;

PROPERTY_NAME

------------------------------

Flashback Timestamp TimeZone

DST_UPGRADE_STATE

NLS_CURRENCY

NLS_ISO_CURRENCY

NLS_DUAL_CURRENCY

DBTIMEZONE

DEFAULT_EDITION

NLS_CALENDAR

NLS_TIMESTAMP_FORMAT

EXPORT_VIEWS_VERSION

NO_USERID_VERIFIER_SALT

PROPERTY_NAME

------------------------------

DST_SECONDARY_TT_VERSION

NLS_TERRITORY

NLS_DATE_LANGUAGE

NLS_LENGTH_SEMANTICS

WORKLOAD_REPLAY_MODE

DEFAULT_PERMANENT_TABLESPACE

DEFAULT_TBS_TYPE

NLS_LANGUAGE

NLS_TIMESTAMP_TZ_FORMAT

NLS_NCHAR_CHARACTERSET

NLS_RDBMS_VERSION

PROPERTY_NAME

------------------------------

NLS_NUMERIC_CHARACTERS

NLS_TIME_FORMAT

NLS_NCHAR_CONV_EXCP

GLOBAL_DB_NAME

DEFAULT_TEMP_TABLESPACE

NLS_DATE_FORMAT

NLS_COMP

DICT.BASE

TDE_MASTER_KEY_ID

DST_PRIMARY_TT_VERSION

NLS_CHARACTERSET

PROPERTY_NAME

------------------------------

NLS_SORT

WORKLOAD_CAPTURE_MODE

NLS_TIME_TZ_FORMAT

36 rows selected.

SQL> select property_name,property_value from database_properties where property_name='NLS_LENGTH_SEMANTICS';

PROPERTY_NAME                  PROPERTY_VALUE

------------------------------ --------------------------------------------------

NLS_LENGTH_SEMANTICS           BYTE

--查官方手册,梳理上述参数nls_length_semantics的含义

Oracle? Database Globalization Support Guide 之

3 Setting Up a Globalization Support Environment

1,nls_length_semantics的可用值为:byte,char,默认值为byte

2,前值适用于单字节数据库字符集编码规则,后者适用于多字节数据库字符集编码规则

3,char,varchar2,long数据类型可以选取byte,也可以选择char

4,nchar,nvarchar2,clob,nclob,只能选用char

5,如果变更nls_length_semantics不会影响已存在的数据

6,nls_length_semantics可以在数据库级,实例级,会话级进行调整

7,sys用户下的数据只会采用byte

8,如果在客户端显示指定与服务器端不同的值,录入数据以客户端为准

9,为了数据兼容,ORACLE不建议配置值为CHAR,因为会导致产生很多问题:运行时错误,BUFFER OVERFLOW

---测试下zhs16gbk存储固定宽度的中文字符,然后转化为AL32UTF8

SQL> conn /as sysdba

Connected.

SQL> alter system enable restricted session;

System altered.

SQL> ALTER DATABASE character set INTERNAL_USE ZHS16GBK;

Database altered

SQL> alter system disable restricted session;

System altered.

SQL> conn scott/system

Connected.

SQL> create table t_charset(a char(10));

Table created.

SQL> insert into t_charset values('伙伴');

1 row created.

SQL> commit;

Commit complete.

--可见占用10个字节

SQL> select a,dump(a) from t_charset;

A                    DUMP(A)

-------------------- --------------------------------------------------

伙伴                 Typ=96 Len=10: 228,188,153,228,188,180,32,32,32,32

SQL> conn /as sysdba

Connected.

SQL> alter system enable restricted session;

System altered.

SQL> alter database character set INTERNAL_USE AL32UTF8;

Database altered.

SQL> alter system disable restricted session;

System altered.

--看到没,转化AL32UTF8,中文显示不出来

SQL> conn scott/system

Connected.

SQL> select a,dump(a) from t_charset;

A                    DUMP(A)

-------------------- --------------------------------------------------

Typ=96 Len=10: 228,188,153,228,188,180,32,32,32,32

SQL> insert into t_charset values('伙伴');

1 row created.

SQL> commit;

Commit complete.

--可见基于同样的中文,ZHS16GBK与AL32UTF8,编码规则发生了变化,虽然占用的字节数相同,当然显示不出来了

SQL> select a,dump(a) from t_charset;

A                    DUMP(A)

-------------------- ----------------------------------------------------------------------------------------------------

Typ=96 Len=10: 228,188,153,228,188,180,32,32,32,32

伙伴                 Typ=96 Len=10: 230,181,188,230,172,142,229,141,179,32

----在al32utf8导出测试表

可见导出失败了(这里失败是因为表中同时存储2种不同字符集的表数据)

[oracle@seconary ~]$ exp file=exp_t_charset.dmp userid=scott  tables=t_charset

Export: Release 11.2.0.1.0 - Production on Sun Nov 29 16:27:59 2015

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Password:

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set

server uses AL32UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...

. . exporting table                      T_CHARSET

EXP-00008: ORACLE error 6552 encountered

ORA-06552: PL/SQL: Compilation unit analysis terminated

ORA-06553: PLS-553: character set name is not recognized

Export terminated successfully with warnings.

[oracle@seconary ~]$

---删除测试表,然后调整数据库字符,再导入测试表

SQL> drop table t_charset purge;

Table dropped.

SQL> conn /as sysdba

Connected.

SQL> alter system enable restricted session;

System altered.

SQL> alter database character set INTERNAL_USE  ZHS16GBK;

Database altered.

SQL> alter system disable restricted session;

System altered.

[oracle@seconary ~]$ imp file=exp_t_charset.dmp userid=scott  tables=t_charset

Import: Release 11.2.0.1.0 - Production on Sun Nov 29 16:31:43 2015

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Password:

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Export file created by EXPORT:V11.02.00 via conventional path

import done in ZHS16GBK character set and AL16UTF16 NCHAR character set

. importing SCOTT's objects into SCOTT

Import terminated successfully without warnings.

---我们重新测试上述的例子,清空测试表

---al32utf8

SQL> conn scott/system

Connected.

SQL> create table t_charset(a char(10));

Table created.

SQL> insert into t_charset values('伙伴');

1 row created.

SQL> commit;

Commit complete.

SQL> select a,dump(a) from t_charset;

A                    DUMP(A)

-------------------- ----------------------------------------------------------------------

伙伴                 Typ=96 Len=10: 230,181,188,230,172,142,229,141,179,32

可见现在表中1种数据库字符集,EXP还是报错,说明用alter database character set internal_use这种方式改数据库字符集,其实具备很大的风险,很可能会把数据库搞坏

[oracle@seconary ~]$ exp file=exp_t_charset_pure.dmp userid=scott  tables=t_charset

Export: Release 11.2.0.1.0 - Production on Sun Nov 29 16:38:37 2015

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Password:

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set  --可见导出是以ZHS16GBK,这个提示与OS的环境变量NLS_LANG有关

server uses AL32UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...

. . exporting table                      T_CHARSET

EXP-00008: ORACLE error 6552 encountered

ORA-06552: PL/SQL: Compilation unit analysis terminated

ORA-06553: PLS-553: character set name is not recognized

Export terminated successfully with warnings.

[oracle@seconary ~]$

---查看NLS_LANG

[oracle@seconary ~]$ env|grep LANG

NLS_LANG=AMERICAN_AMERICA.ZHS16GBK

LANG=en_US.utf8

--调整NLS_LANG

[oracle@seconary ~]$ export NLS_LANG=AMERICAN_AMERICA.AL32UTF8

[oracle@seconary ~]$ env|grep LANG

NLS_LANG=AMERICAN_AMERICA.AL32UTF8

---调整NLS_LANG中的数据库字符集与数据库中的字符集相同,导出还是报错

[oracle@seconary ~]$ exp file=exp_t_charset_pure.dmp userid=scott  tables=t_charset

Export: Release 11.2.0.1.0 - Production on Sun Nov 29 16:42:41 2015

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Password:

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Export done in AL32UTF8 character set and AL16UTF16 NCHAR character set

About to export specified tables via Conventional Path ...

. . exporting table                      T_CHARSET

EXP-00008: ORACLE error 6552 encountered

ORA-06552: PL/SQL: Compilation unit analysis terminated

ORA-06553: PLS-553: character set name is not recognized

Export terminated successfully with warnings.

借此学习下nls_lang中的字符集是否配置不同,会影响EXP导出的字符集导出的结果

可见确实会影响EXP导出的效果,且会影响SQLPLUS的运行

[oracle@seconary ~]$ export NLS_LANG=AMERICAN_AMERICA.ZHT16DBCSFIXED

[oracle@seconary ~]$ exp file=exp_t_charset_pure_other.dmp userid=scott  tables=t_charset

BoBoBoBoBoBo: Release 11.2.0.1.0 - Production on

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

EXP-00092: BoBoBoBoBoBo@@BoBo@@BoBoBo@@BoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBoBo@@BoBo@@BoBoBoBoBoBoBoBo@@BoBoBoBoBoBoBoBoBo

EXP-00000: BoBoBoBoBoBo@@BoBoBoBoBoBoBoBoBoBo@@BoBoBoBoBoBoBoBoBoBoBoBoBoBoBo[oracle@seconary ~]$

[oracle@seconary ~]$

[oracle@seconary ~]$

[oracle@seconary ~]$

[oracle@seconary ~]$ sqlplus '/as sysdba'

Error 19 initializing SQL*Plus

Invalid NLS character set for this OS environment

---可见NLS_LANG确实会影响EXP数据库字符集的选择,所以一定要理解此参数的含义,否则EXP IMP以及EXPDP和IMPDP会产生数据损失以及产生乱码

[oracle@seconary ~]$ export NLS_LANG=AMERICAN_AMERICA.UTF8

[oracle@seconary ~]$ exp file=exp_t_charset_pure_other.dmp userid=scott  tables=t_charset

Export: Release 11.2.0.1.0 - Production on Sun Nov 29 17:24:25 2015

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Password:

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Export done in UTF8 character set and AL16UTF16 NCHAR character set

server uses AL32UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...

. . exporting table                      T_CHARSET          1 rows exported

Export terminated successfully without warnings.

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/9240380/viewspace-1849953/,如需转载,请注明出处,否则将追究法律责任。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值