mysql字符集与校对规则

mysql字符集与校对规则

mysql的字符集和校对规则有四个级别的默认设置:服务器级,数据库级,表级和字段级。他们分别在不同的地方设置,作用也不同。
查看字符集级对应的校对规则。
Charset:字符集,Default collation:默认使用的校对规则。Maxlen:字符集的最大长度

mysql> show character set;
+----------+---------------------------------+---------------------+--------+
| Charset  | Description                     | Default collation   | Maxlen |
+----------+---------------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian              | armscii8_general_ci |      1 |
| ascii    | US ASCII                        | ascii_general_ci    |      1 |
| big5     | Big5 Traditional Chinese        | big5_chinese_ci     |      2 |
| binary   | Binary pseudo charset           | binary              |      1 |
| cp1250   | Windows Central European        | cp1250_general_ci   |      1 |
| cp1251   | Windows Cyrillic                | cp1251_general_ci   |      1 |
| cp1256   | Windows Arabic                  | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic                  | cp1257_general_ci   |      1 |
| cp850    | DOS West European               | cp850_general_ci    |      1 |
| cp852    | DOS Central European            | cp852_general_ci    |      1 |
| cp866    | DOS Russian                     | cp866_general_ci    |      1 |
| cp932    | SJIS for Windows Japanese       | cp932_japanese_ci   |      2 |
| dec8     | DEC West European               | dec8_swedish_ci     |      1 |
| eucjpms  | UJIS for Windows Japanese       | eucjpms_japanese_ci |      3 |
| euckr    | EUC-KR Korean                   | euckr_korean_ci     |      2 |
| gb18030  | China National Standard GB18030 | gb18030_chinese_ci  |      4 |
| gb2312   | GB2312 Simplified Chinese       | gb2312_chinese_ci   |      2 |
| gbk      | GBK Simplified Chinese          | gbk_chinese_ci      |      2 |
| geostd8  | GEOSTD8 Georgian                | geostd8_general_ci  |      1 |
| greek    | ISO 8859-7 Greek                | greek_general_ci    |      1 |
| hebrew   | ISO 8859-8 Hebrew               | hebrew_general_ci   |      1 |
| hp8      | HP West European                | hp8_english_ci      |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak      | keybcs2_general_ci  |      1 |
| koi8r    | KOI8-R Relcom Russian           | koi8r_general_ci    |      1 |
| koi8u    | KOI8-U Ukrainian                | koi8u_general_ci    |      1 |
| latin1   | cp1252 West European            | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European     | latin2_general_ci   |      1 |
| latin5   | ISO 8859-9 Turkish              | latin5_turkish_ci   |      1 |
| latin7   | ISO 8859-13 Baltic              | latin7_general_ci   |      1 |
| macce    | Mac Central European            | macce_general_ci    |      1 |
| macroman | Mac West European               | macroman_general_ci |      1 |
| sjis     | Shift-JIS Japanese              | sjis_japanese_ci    |      2 |
| swe7     | 7bit Swedish                    | swe7_swedish_ci     |      1 |
| tis620   | TIS620 Thai                     | tis620_thai_ci      |      1 |
| ucs2     | UCS-2 Unicode                   | ucs2_general_ci     |      2 |
| ujis     | EUC-JP Japanese                 | ujis_japanese_ci    |      3 |
| utf16    | UTF-16 Unicode                  | utf16_general_ci    |      4 |
| utf16le  | UTF-16LE Unicode                | utf16le_general_ci  |      4 |
| utf32    | UTF-32 Unicode                  | utf32_general_ci    |      4 |
| utf8     | UTF-8 Unicode                   | utf8_general_ci     |      3 |
| utf8mb4  | UTF-8 Unicode                   | utf8mb4_0900_ai_ci  |      4 |
+----------+---------------------------------+---------------------+--------+
41 rows in set (0.04 sec)

可通过命令查看字符集对应的可用的校对规则。

mysql> show collation like 'gbk%';
+----------------+---------+----+---------+----------+---------+---------------+
| Collation      | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+----------------+---------+----+---------+----------+---------+---------------+
| gbk_bin        | gbk     | 87 |         | Yes      |       1 | PAD SPACE     |
| gbk_chinese_ci | gbk     | 28 | Yes     | Yes      |       1 | PAD SPACE     |
+----------------+---------+----+---------+----------+---------+---------------+
2 rows in set (0.00 sec)

校对规则命名约定:相关的字符集名_语言名_ci(大小写不敏感)或_cs(大小写敏感),或字符集名_bin(二元,即比较是基于字符编码的值,所以和语言无关,命名中无语言标识)。
如一下utf8对应的校对规则

mysql> show collation like 'utf8\_%';
+--------------------------+---------+-----+---------+----------+---------+---------------+
| Collation                | Charset | Id  | Default | Compiled | Sortlen | Pad_attribute |
+--------------------------+---------+-----+---------+----------+---------+---------------+
| utf8_bin                 | utf8    |  83 |         | Yes      |       1 | PAD SPACE     |
| utf8_croatian_ci         | utf8    | 213 |         | Yes      |       8 | PAD SPACE     |
| utf8_czech_ci            | utf8    | 202 |         | Yes      |       8 | PAD SPACE     |
| utf8_danish_ci           | utf8    | 203 |         | Yes      |       8 | PAD SPACE     |
| utf8_esperanto_ci        | utf8    | 209 |         | Yes      |       8 | PAD SPACE     |
| utf8_estonian_ci         | utf8    | 198 |         | Yes      |       8 | PAD SPACE     |
| utf8_general_ci          | utf8    |  33 | Yes     | Yes      |       1 | PAD SPACE     |
| utf8_general_mysql500_ci | utf8    | 223 |         | Yes      |       1 | PAD SPACE     |
| utf8_german2_ci          | utf8    | 212 |         | Yes      |       8 | PAD SPACE     |
| utf8_hungarian_ci        | utf8    | 210 |         | Yes      |       8 | PAD SPACE     |
| utf8_icelandic_ci        | utf8    | 193 |         | Yes      |       8 | PAD SPACE     |
| utf8_latvian_ci          | utf8    | 194 |         | Yes      |       8 | PAD SPACE     |
| utf8_lithuanian_ci       | utf8    | 204 |         | Yes      |       8 | PAD SPACE     |
| utf8_persian_ci          | utf8    | 208 |         | Yes      |       8 | PAD SPACE     |
| utf8_polish_ci           | utf8    | 197 |         | Yes      |       8 | PAD SPACE     |
| utf8_romanian_ci         | utf8    | 195 |         | Yes      |       8 | PAD SPACE     |
| utf8_roman_ci            | utf8    | 207 |         | Yes      |       8 | PAD SPACE     |
| utf8_sinhala_ci          | utf8    | 211 |         | Yes      |       8 | PAD SPACE     |
| utf8_slovak_ci           | utf8    | 205 |         | Yes      |       8 | PAD SPACE     |
| utf8_slovenian_ci        | utf8    | 196 |         | Yes      |       8 | PAD SPACE     |
| utf8_spanish2_ci         | utf8    | 206 |         | Yes      |       8 | PAD SPACE     |
| utf8_spanish_ci          | utf8    | 199 |         | Yes      |       8 | PAD SPACE     |
| utf8_swedish_ci          | utf8    | 200 |         | Yes      |       8 | PAD SPACE     |
| utf8_tolower_ci          | utf8    |  76 |         | Yes      |       1 | PAD SPACE     |
| utf8_turkish_ci          | utf8    | 201 |         | Yes      |       8 | PAD SPACE     |
| utf8_unicode_520_ci      | utf8    | 214 |         | Yes      |       8 | PAD SPACE     |
| utf8_unicode_ci          | utf8    | 192 |         | Yes      |       8 | PAD SPACE     |
| utf8_vietnamese_ci       | utf8    | 215 |         | Yes      |       8 | PAD SPACE     |
+--------------------------+---------+-----+---------+----------+---------+---------------+
28 rows in set (0.00 sec)

一,服务器级字符集和校对规则
在mysql服务启动时确定,默认使用latin1作为服务器级字符集。
在这里插入图片描述
可使用如下命令查看当前server的字符集和校对规则。

mysql> show variables like 'character_set_server';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| character_set_server | utf8  |
+----------------------+-------+
1 row in set, 1 warning (0.00 sec)

mysql> show variables like 'collation_server';
+------------------+-----------------+
| Variable_name    | Value           |
+------------------+-----------------+
| collation_server | utf8_general_ci |
+------------------+-----------------+
1 row in set, 1 warning (0.00 sec)

二,数据库级字符集和校对规则
mysql数据库,默认的字符集为:armscii8,默认校对规则为:armscii8_general_ci。

mysql> show variables like
    -> 'character_set_database';
+------------------------+----------+
| Variable_name          | Value    |
+------------------------+----------+
| character_set_database | armscii8 |
+------------------------+----------+
1 row in set, 1 warning (0.00 sec)

mysql> show variables like 'collation_database';
+--------------------+---------------------+
| Variable_name      | Value               |
+--------------------+---------------------+
| collation_database | armscii8_general_ci |
+--------------------+---------------------+
1 row in set, 1 warning (0.00 sec)

三,表级字符集和校对规则
可用如下命令查看表级字符集和校对规则。表级字符集和校对规则可在建表时显式指定。

mysql> show create table tcc \G;
*************************** 1. row ***************************
       Table: tcc
Create Table: CREATE TABLE `tcc` (
  `charset_desc` char(8) COLLATE utf8_bin DEFAULT NULL,
  `collate_desc` char(8) COLLATE utf8_bin DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin
1 row in set (0.00 sec)

四,列字符集和校对规则
使用频率较少。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值