mysql字符集文字乱码_MySQL的字符集和乱码问题

1.字符集知识

#概述

1.字符集是一套文字符号及其编码、比较规则的集合,第一个计算机字符串ASC22.mysql数据库字符集包括字符集(character)和 校对规则,其中字符集使用来定义mysql数据字符串的存储方式,校对规则是定义比较字符串的方式

#扩展

#字符编码:就是人类使用的英文字母、汉字、特殊符号等信息,通过转换规则,将其转换为计算机可以识别的二进制数字的一种编码方式

#mysql数据库常见字符集

常用字符集  长度 说明

GBK2不是国际标准

UTF-8     3中英文混合的环境,建议使用

Latin11mysql默认字符集

Utf8mb44 Utf-8unicode

#选择合适的字符集

1.处理各种各样的文字,发布到不同语言国家地区,应该选择Unicode字符集,在mysql里面就是utf-8(每个汉字三个字节)2.如果只需要支持中文,并且数据量大,性能要求高,可选择GBK(定长,每个汉字双字节)3.处理移动互联网业务,可使用utf8mb4字符集

#查看系统的字符集

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

[root@cots3 ~]# mysql -uroot -p -e "show character set;"Enter password:+----------+---------------------------------+---------------------+--------+

| Charset | Description | Default collation | Maxlen |

+----------+---------------------------------+---------------------+--------+

| big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 |

| dec8 | DEC West European | dec8_swedish_ci | 1 |

| cp850 | DOS West European | cp850_general_ci | 1 |

| hp8 | HP West European | hp8_english_ci | 1 |

| koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 |

| latin1 | cp1252 West European | latin1_swedish_ci | 1 |

| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |

| swe7 | 7bit Swedish | swe7_swedish_ci | 1 |

| ascii | US ASCII | ascii_general_ci | 1 |

| ujis | EUC-JP Japanese | ujis_japanese_ci | 3 |

| sjis | Shift-JIS Japanese | sjis_japanese_ci | 2 |

| hebrew | ISO 8859-8 Hebrew | hebrew_general_ci | 1 |

| tis620 | TIS620 Thai | tis620_thai_ci | 1 |

| euckr | EUC-KR Korean | euckr_korean_ci | 2 |

| koi8u | KOI8-U Ukrainian | koi8u_general_ci | 1 |

| gb2312 | GB2312 Simplified Chinese | gb2312_chinese_ci | 2 |

| greek | ISO 8859-7 Greek | greek_general_ci | 1 |

| cp1250 | Windows Central European | cp1250_general_ci | 1 |

| gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 |

| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |

| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |

| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |

| ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 |

| cp866 | DOS Russian | cp866_general_ci | 1 |

| keybcs2 | DOS Kamenicky Czech-Slovak | keybcs2_general_ci | 1 |

| macce | Mac Central European | macce_general_ci | 1 |

| macroman | Mac West European | macroman_general_ci | 1 |

| cp852 | DOS Central European | cp852_general_ci | 1 |

| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |

| utf8mb4 | UTF-8 Unicode | utf8mb4_general_ci | 4 |

| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |

| utf16 | UTF-16 Unicode | utf16_general_ci | 4 |

| utf16le | UTF-16LE Unicode | utf16le_general_ci | 4 |

| cp1256 | Windows Arabic | cp1256_general_ci | 1 |

| cp1257 | Windows Baltic | cp1257_general_ci | 1 |

| utf32 | UTF-32 Unicode | utf32_general_ci | 4 |

| binary | Binary pseudo charset | binary | 1 |

| geostd8 | GEOSTD8 Georgian | geostd8_general_ci | 1 |

| cp932 | SJIS for Windows Japanese | cp932_japanese_ci | 2 |

| eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci | 3 |

| gb18030 | China National Standard GB18030 | gb18030_chinese_ci | 4 |

+----------+---------------------------------+---------------------+--------+

View Code

#过滤出常用

[root@cots3 ~]# mysql -uroot -p -e "show character set;" |egrep "Charset|gbk|utf8|latin1" | awk '{print $0}'Enter password:

Charset Description Default collation Maxlen

latin1 cp1252 West European latin1_swedish_ci1gbk GBK Simplified Chinese gbk_chinese_ci2utf8 UTF-8 Unicode utf8_general_ci 3utf8mb4 UTF-8 Unicode utf8mb4_general_ci 4

#查看字符集对应校对规则

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

mysql> show collation like 'utf8%';+--------------------------+---------+-----+---------+----------+---------+

| Collation | Charset | Id | Default | Compiled | Sortlen |

+--------------------------+---------+-----+---------+----------+---------+

| utf8_general_ci | utf8 | 33 | Yes | Yes | 1 |

| utf8_bin | utf8 | 83 | | Yes | 1 |

| utf8_unicode_ci | utf8 | 192 | | Yes | 8 |

| utf8_icelandic_ci | utf8 | 193 | | Yes | 8 |

| utf8_latvian_ci | utf8 | 194 | | Yes | 8 |

| utf8_romanian_ci | utf8 | 195 | | Yes | 8 |

| utf8_slovenian_ci | utf8 | 196 | | Yes | 8 |

| utf8_polish_ci | utf8 | 197 | | Yes | 8 |

| utf8_estonian_ci | utf8 | 198 | | Yes | 8 |

| utf8_spanish_ci | utf8 | 199 | | Yes | 8 |

| utf8_swedish_ci | utf8 | 200 | | Yes | 8 |

| utf8_turkish_ci | utf8 | 201 | | Yes | 8 |

| utf8_czech_ci | utf8 | 202 | | Yes | 8 |

| utf8_danish_ci | utf8 | 203 | | Yes | 8 |

| utf8_lithuanian_ci | utf8 | 204 | | Yes | 8 |

| utf8_slovak_ci | utf8 | 205 | | Yes | 8 |

| utf8_spanish2_ci | utf8 | 206 | | Yes | 8 |

| utf8_roman_ci | utf8 | 207 | | Yes | 8 |

| utf8_persian_ci | utf8 | 208 | | Yes | 8 |

| utf8_esperanto_ci | utf8 | 209 | | Yes | 8 |

| utf8_hungarian_ci | utf8 | 210 | | Yes | 8 |

| utf8_sinhala_ci | utf8 | 211 | | Yes | 8 |

| utf8_german2_ci | utf8 | 212 | | Yes | 8 |

| utf8_croatian_ci | utf8 | 213 | | Yes | 8 |

| utf8_unicode_520_ci | utf8 | 214 | | Yes | 8 |

| utf8_vietnamese_ci | utf8 | 215 | | Yes | 8 |

| utf8_general_mysql500_ci | utf8 | 223 | | Yes | 1 |

| utf8mb4_general_ci | utf8mb4 | 45 | Yes | Yes | 1 |

| utf8mb4_bin | utf8mb4 | 46 | | Yes | 1 |

| utf8mb4_unicode_ci | utf8mb4 | 224 | | Yes | 8 |

| utf8mb4_icelandic_ci | utf8mb4 | 225 | | Yes | 8 |

| utf8mb4_latvian_ci | utf8mb4 | 226 | | Yes | 8 |

| utf8mb4_romanian_ci | utf8mb4 | 227 | | Yes | 8 |

| utf8mb4_slovenian_ci | utf8mb4 | 228 | | Yes | 8 |

| utf8mb4_polish_ci | utf8mb4 | 229 | | Yes | 8 |

| utf8mb4_estonian_ci | utf8mb4 | 230 | | Yes | 8 |

| utf8mb4_spanish_ci | utf8mb4 | 231 | | Yes | 8 |

| utf8mb4_swedish_ci | utf8mb4 | 232 | | Yes | 8 |

| utf8mb4_turkish_ci | utf8mb4 | 233 | | Yes | 8 |

| utf8mb4_czech_ci | utf8mb4 | 234 | | Yes | 8 |

| utf8mb4_danish_ci | utf8mb4 | 235 | | Yes | 8 |

| utf8mb4_lithuanian_ci | utf8mb4 | 236 | | Yes | 8 |

| utf8mb4_slovak_ci | utf8mb4 | 237 | | Yes | 8 |

| utf8mb4_spanish2_ci | utf8mb4 | 238 | | Yes | 8 |

| utf8mb4_roman_ci | utf8mb4 | 239 | | Yes | 8 |

| utf8mb4_persian_ci | utf8mb4 | 240 | | Yes | 8 |

| utf8mb4_esperanto_ci | utf8mb4 | 241 | | Yes | 8 |

| utf8mb4_hungarian_ci | utf8mb4 | 242 | | Yes | 8 |

| utf8mb4_sinhala_ci | utf8mb4 | 243 | | Yes | 8 |

| utf8mb4_german2_ci | utf8mb4 | 244 | | Yes | 8 |

| utf8mb4_croatian_ci | utf8mb4 | 245 | | Yes | 8 |

| utf8mb4_unicode_520_ci | utf8mb4 | 246 | | Yes | 8 |

| utf8mb4_vietnamese_ci | utf8mb4 | 247 | | Yes | 8 |

+--------------------------+---------+-----+---------+----------+---------+

53 rows in set (0.01 sec)

View Code

#查看系统默认设置的字符集

mysql> show variables like "character_set%";+--------------------------+----------------------------+

| Variable_name | Value |

+--------------------------+----------------------------+

| character_set_client | utf8 | #客户端字符集

| character_set_connection | utf8 | #连接字符集

| character_set_database | latin1 | #数据库字符集,配置文件指定或库表指定

| character_set_filesystem | binary | #文件系统字符集

| character_set_results | utf8 | #返回结果字符集

| character_set_server | latin1 | #服务器字符集,配置文件指定或库表指定

| character_set_system | utf8 | #系统字符集

| character_sets_dir | /usr/share/mysql/charsets/ |

+--------------------------+----------------------------+

8 rows in set (0.00 sec)

#提示:默认情况下character_set_client,character_set_connection,character_set_results三者的字符集和系统的字符集一致,如果mysql不指定

#查看系统字符集

[root@cots3 ~]# cat /etc/locale.conf

LANG="en_US.UTF-8"[root@cots3~]# echo $LANG

en_US.UTF-8

#指定字符集进行登录数据库

[root@ctos3 ~]# mysql -uroot -p --default-character-set=latin1;

Enter password:

mysql> show variables like 'character_set%';+--------------------------+----------------------------+

| Variable_name | Value |

+--------------------------+----------------------------+

| character_set_client | latin1 |

| character_set_connection | latin1 |

| character_set_database | latin1 |

| character_set_filesystem | binary |

| character_set_results | latin1 |

| character_set_server | latin1 |

| character_set_system | utf8 |

| character_sets_dir | /usr/share/mysql/charsets/ |

+--------------------------+----------------------------+

8 rows in set (0.00sec)

#提示:将原来set_client,set_connection,set_results的字符集更改了

2.乱码问题

#如果我们设置的字符集不相同的话,就会可能出现乱码

#保证不乱码思想:统一字符集,中英文的环境建议选择utf8

#保证不乱码的关键,保证以下几个统一

1.Linux操作系统2.操作系统客户端连接(xshell,ssh)3.mysql服务端4.mysql客户端5.mysql库表6.开发的程序字符集

#例子:统一设置成utf8

#1.Linux系统

[root@mysql-1 ~]# cat /etc/locale.conf #centos7系统

LANG="en_US.UTF-8"[root@mysql-1 ~]# cat /etc/sysconfig/il8n #centos6系统查看

LANG="en_US.UTF-8"[root@mysql-3 ~]# echo $LANG

zh_CN.UTF-8#提示:使用SecureCRT或xshell连接服务器的也要更改

#2.mysql服务端

#在/etc/my.cnf的mysqld模块里面更改

[mysqld]

character-set-server=utf8

#3.mysql客户端

#在/etc/my.cnf文件里面添加客户端模块,无需重启

[client]default-character-set=utf8

#4.mysql库表,一般上面几个改完,库表都会随着mysql的字符集

mysql> create database data_utf8 character set=utf8;

Query OK,1 row affected (0.01sec)

mysql>show create database data_utf8\G;*************************** 1. row ***************************Database: data_utf8

Create Database: CREATE DATABASE `data_utf8`/*!40100 DEFAULT CHARACTER SET utf8*/

1 row in set (0.00 sec)

#5.开发的程序字符集

#在下载的时候选择UTF8

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值