mysql数据类型转换优先级,MySQL:转换数据类型和排序规则对存储的数据的影响...

I have a general question about this. There are many times we want to change data-types of fields or collations when lots of data is inserted before. Consider these situations :

converting varchar collation from utf8_general_ci to latin1_swedish_ci: as I know the first has multibyte chars and the second singly byte ones. Does this conversion manipulate stored records correctly? And does this conversion lead to reduction of volume of existing data (maybe 50%)?

Conversion of int(10) to smallint(5): Does the volume of data reduce to 50% correctly?

Or for example: int(10) to unsigned int(10) - text to varchar(1000) - varchar(20) to char(10) , ...

As it is obvious, these actions might be done to increase efficiency, reduce volume of data and ...

Consider I have a table with 1,000,000 records. I want to know if doing such actions have bad effects on stored data, or if it makes low performance for future inserts and selects involving this table.

UPDATE :

When I talk about changing utf8 encoding charset to Latin, of course the values of my field are English (it's obvious if there are Japanese, they will be lost). With this assumption, I'm asking about the resulting table size and performance.

解决方案Converting varchar collation from utf8_general_ci to latin1_swedish_ci: As I know the first has multibyte chars and the second singly byte ones. Does this conversion manipulate stored records correctly? And does this conversion lead to reduction of volume of existing data (maybe 50%)?

Collation is merely the ordering that is used for string comparisons—it has (almost) nothing to do with the character encoding that is used for data storage. I say almost because collations can only be used with certain character sets, so changing collation may force a change in the character encoding.

To the extent that the character encoding is modified, MySQL will correctly re-encode values to the new character set whether going from single to multi-byte or vice-versa. Beware that any values that become too large for the column will be truncated.

Provided that the new character type is of variable-length and that the values are encoded with fewer bytes in the new encoding than before, there will of course be a reduction in the table's size.

Conversion of int(10) to smallint(5): Does the volume of data reduce to 50% correctly?

INT and SMALLINT respectively occupy 4 and 2 bytes regardless of display width: so yes, the size of the table will reduce accordingly.

Or for example: int(10) to unsigned int(10) - text to varchar(1000) - varchar(20) to char(10), ...

INT occupies 4 bytes irrespective of whether it is signed, so there will be no change;

TEXT and VARCHAR(1000) both occupy L+2 bytes (where L is the value's length in bytes), so there will be no change;

VARCHAR(20) occupies L+1 bytes (where L is the value's length in bytes) whereas CHAR(10) occupies 10×w bytes (where w is the number of bytes required for the maximum-length character in the character set), so there may well be a change but it is dependent on the actual values stored and the character encoding used.

Note that, depending on storage engine, reductions in table size may not immediately be released to the filesystem.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值