Is it safe to update tables from utf8 to utf8mb4 in MySQL?

I am aware that similar questions have been asked before, but we need a more definitive answer.

Is it safe to update MySQL tables encoded in utf8 to utf8mb4 in all cases? More specifically, is it safe even for varchar fields holding strings generated with, for example (in Java):

new BigInteger(130, random).toString(32)

From our understanding, utf8mb4 is a superset of utf8, so our assumption is that everything should be fine, but we would love some input from more experienced MySQL users.

Solution

How the data was originally inserted into MySQL is irrelevant. Let's suppose you used the entire character set of utf8, i.e. the BMP characters.

utf8mb4 is a superset of utf8mb3 (alias utf8), as the MySQL documentation states:

One advantage of converting from utf8mb3 to utf8mb4 is that this enables applications to use supplementary characters. One tradeoff is that this may increase data storage space requirements.

In terms of table content, conversion from utf8mb3 to utf8mb4 presents no problems:

For a BMP character, utf8mb4 and utf8mb3 have identical storage characteristics: same code values, same encoding, same length.

For a supplementary character, utf8mb4 requires four bytes to store it, whereas utf8mb3 cannot store the character at all. When converting utf8mb3 columns to utf8mb4, you need not worry about converting supplementary characters because there will be none.
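The storage claims above can be checked outside MySQL, because both character sets use the standard UTF-8 byte sequences for BMP characters. The following is a minimal sketch, not part of the original answer; the class and helper names are made up for illustration. It also shows that the token from the question only contains the base-32 digits 0-9 and a-v, so every character is plain ASCII and takes one byte in either character set.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Hypothetical helper class, for illustration only.
class Utf8Widths {
    // Number of bytes the UTF-8 encoding of s occupies.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if s contains a code point outside the BMP (above U+FFFF),
    // i.e. a character that utf8mb3 cannot store.
    static boolean hasSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // Same expression as in the question; the result is random.
        String token = new BigInteger(130, new SecureRandom()).toString(32);
        System.out.println("token is ASCII only: " + (utf8Length(token) == token.length()));
        System.out.println("token has supplementary chars: " + hasSupplementary(token));

        // BMP characters need at most three bytes.
        System.out.println("U+00E9 bytes: " + utf8Length("\u00e9"));
        System.out.println("U+20AC bytes: " + utf8Length("\u20ac"));

        // U+1F600 is a supplementary character, written here as a surrogate pair.
        String emoji = "\uD83D\uDE00";
        System.out.println("U+1F600 bytes: " + utf8Length(emoji));
        System.out.println("U+1F600 is supplementary: " + hasSupplementary(emoji));
    }
}
```

If `hasSupplementary` returns false for everything an application writes today, the stored bytes are the same before and after the conversion.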

In terms of table structure, these are the primary potential incompatibilities:

For the variable-length character data types (VARCHAR and the TEXT types), the maximum permitted length in characters is less for utf8mb4 columns than for utf8mb3 columns.

For all character data types (CHAR, VARCHAR, and the TEXT types), the maximum number of characters that can be indexed is less for utf8mb4 columns than for utf8mb3 columns.

Consequently, to convert tables from utf8mb3 to utf8mb4, it may be necessary to change some column or index definitions.
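The index limit is defined in bytes, which is why fewer characters fit once MySQL has to reserve four bytes per character instead of three. The sketch below does the arithmetic for two key-size limits, 767 and 3072 bytes. These are the InnoDB limits as I understand them for the older (REDUNDANT/COMPACT) and newer (DYNAMIC/COMPRESSED) row formats; treat them as assumptions and check the values for your own server version and storage engine. The class name is hypothetical.

```java
// Hypothetical helper class, for illustration only.
class IndexPrefixLimits {
    // Largest number of characters that fits in an index key of maxKeyBytes
    // when every character may need up to bytesPerChar bytes.
    static int maxIndexableChars(int maxKeyBytes, int bytesPerChar) {
        return maxKeyBytes / bytesPerChar;
    }

    public static void main(String[] args) {
        // Assumed InnoDB key-size limits in bytes; verify for your setup.
        int[] keyLimits = {767, 3072};
        for (int limit : keyLimits) {
            System.out.println(limit + "-byte key: utf8mb3 = "
                    + maxIndexableChars(limit, 3) + " chars, utf8mb4 = "
                    + maxIndexableChars(limit, 4) + " chars");
        }
    }
}
```

Under the 767-byte assumption, a fully indexed VARCHAR(255) column fits with utf8mb3 (765 bytes) but not with utf8mb4 (1020 bytes), which is the kind of definition that may have to change.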

Personally, I had some issues with indexes on relatively long text columns where the maximum index size was reached. It was a search index, not a unique index, so the workaround was to index fewer characters.

Of course, I assume that you will keep the equivalent collation (for example, utf8_general_ci becomes utf8mb4_general_ci). If you change the collation, other issues apply.
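For completeness, here is a sketch of how the conversion statement itself could be assembled from Java. `ALTER TABLE ... CONVERT TO CHARACTER SET ... COLLATE ...` is standard MySQL syntax; the class name, the method name, and the `users` table in the usage note are made up for illustration, and the string is only built here, not executed.

```java
// Hypothetical helper class, for illustration only.
class ConvertStatement {
    // Builds an ALTER TABLE ... CONVERT TO CHARACTER SET statement.
    // The table name is wrapped in backticks, with embedded backticks doubled.
    static String convertToUtf8mb4(String table, String collation) {
        // Accept only utf8mb4 collation names made of lowercase letters,
        // digits and underscores, so the collation matches the character set.
        if (!collation.matches("utf8mb4_[a-z0-9_]+")) {
            throw new IllegalArgumentException("not a utf8mb4 collation: " + collation);
        }
        String quoted = "`" + table.replace("`", "``") + "`";
        return "ALTER TABLE " + quoted
                + " CONVERT TO CHARACTER SET utf8mb4 COLLATE " + collation;
    }

    public static void main(String[] args) {
        // "users" is a placeholder table name.
        System.out.println(convertToUtf8mb4("users", "utf8mb4_general_ci"));
    }
}
```

Running the generated statement against a copy of the table first is a cheap way to find out whether any column or index definition hits the limits described above.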
