mysql中怎么存储日文,在UTF-8数据库中存储日文字符

I am having some data reject on an insert of Japanese characters to a

Varchar column in my DB2 UTF-8 database. I am trying to understand how

big to make the Varchar column for the inesert to work successfully for

all of my data. I would also like ot understand if Vargraphic is a

better approach and what ramifications that might have.

This data is sourced from UTF-8 Oracle (using byte semantics) which

defines a columns as

Varchar2(255). I had defined my DB2 columns Varchar(255) but the

insert rejects some rows.

I notice that for Oracle : if I apply the length function to a specific

column , I get a value of 8 for a specified key. However, for this

same key that sucecssfully loaded on the DB2 side, I get a length of

33. Sometimes, the length on the DB2 side is more than 5 times the

Oracle side. I do not understand how this can be.

Also, Can someone clarify how much storage DB2 allocates in bytes, for

the following (assume non-nullable) ? My assumptions are below -

CHAR(1) - assume 1 byte

CHAR(100) - assume 100 bytes

VARCHAR(1) - assume length + 1 byte = 2 bytes

VARCHAR(100) - assume lenght + 100 bytes = 101 bytes at most

VARGRAPHIC(100) - ?

Thank you in advance for your assistance.

解决方案"mike_dba" wrote in message

news:11**********************@i39g2000cwa.googlegr oups.com...I am having some data reject on an insert of Japanese characters to a

Varchar column in my DB2 UTF-8 database. I am trying to understand how

big to make the Varchar column for the inesert to work successfully for

all of my data. I would also like ot understand if Vargraphic is a

better approach and what ramifications that might have.

This data is sourced from UTF-8 Oracle (using byte semantics) which

defines a columns as

Varchar2(255). I had defined my DB2 columns Varchar(255) but the

insert rejects some rows.

I notice that for Oracle : if I apply the length function to a specific

column , I get a value of 8 for a specified key. However, for this

same key that sucecssfully loaded on the DB2 side, I get a length of

33. Sometimes, the length on the DB2 side is more than 5 times the

Oracle side. I do not understand how this can be.

Also, Can someone clarify how much storage DB2 allocates in bytes, for

the following (assume non-nullable) ? My assumptions are below -

CHAR(1) - assume 1 byte

CHAR(100) - assume 100 bytes

VARCHAR(1) - assume length + 1 byte = 2 bytes

VARCHAR(100) - assume lenght + 100 bytes = 101 bytes at most

VARGRAPHIC(100) - ?

Thank you in advance for your assistance.

I would triple the size of the VARCHAR in DB2. You can go up to about 32K

(not quite) with VARCHAR in DB2.

DB2 uses 2 extra bytes for the length, but that does not reduce the amount

of space for the data. If it is nullable, it uses one additional byte.

Thank you for your response. The triple size will work in most cases.

However, we have noticed the following scenario -

I have a varchar(255) in Oracle. For a particular data item on Oracle,

I apply the Oracle function for length and get 86 character back. I

also apply the Oracle function for byte size and get back 238.

Inserting this same data into DB2 varchar(255) rejects. Inserting

into DB2 varchar(2000) works. DB2 length function returns 466 (this

should be bytes, correct ?). How could this be ?

I don''t know how Oracle measures the length of a UTF-8 varchar field or

how it physically stores it but UDB stores UTF-8 data as 1-4 bytes for

each character. Characters in pictographic languages, especially

Japanese, Chinese, and other languages used in that part of the world,

are more often the longer three and four byte ones.

A varchar definition specifies the number of bytes, not the number of

characters to be stored. The data capacity of the column can''t be

precisely determined because each character may occupy a different

number of bytes.

A single UTF-8 character occupies a maximum of four bytes. Specify a

varchar length four times the number of characters you expect to store

and you''ll always have enough space. Unfortunately, this will NOT

prevent storing more characters than you want if each character is

shorter than the 4/character worst case.

Philip Sherman

mike_dba wrote:Thank you for your response. The triple size will work in most cases.

However, we have noticed the following scenario -

I have a varchar(255) in Oracle. For a particular data item on Oracle,

I apply the Oracle function for length and get 86 character back. I

also apply the Oracle function for byte size and get back 238.

Inserting this same data into DB2 varchar(255) rejects. Inserting

into DB2 varchar(2000) works. DB2 length function returns 466 (this

should be bytes, correct ?). How could this be ?

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值