Tigase throws the following error when saving chat history:
java.sql.SQLException: Incorrect string value: '\xF0\x9F\x98\x8D\xE5\x93...' for column '_body' at row 1
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:964)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909)
...
MySQL version: 5.7.17 (check with: mysql> select version();)
mysql-connector-java.jar version: 5.1.42
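Root cause:
MySQL's utf8 character set stores at most 3 bytes per character, but an emoji takes 4 bytes in UTF-8. Since MySQL 5.5.3 there is a 4-byte UTF-8 character set, utf8mb4, in which a character can take up to 4 bytes, so it covers characters (such as emoji) that utf8 cannot store.
utf8mb4 is a superset of utf8, so it is compatible with existing utf8 data.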
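
To make the byte counts concrete, here is a small Python sketch. It is not part of the original setup, and the helper names utf8_len and needs_utf8mb4 are made up for illustration. It decodes the first four bytes quoted in the error message and checks which characters need 4 bytes in UTF-8.

```python
# The first four bytes reported in the error message: \xF0\x9F\x98\x8D
emoji = b"\xf0\x9f\x98\x8d".decode("utf-8")


def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """True if any character takes 4 bytes and so does not fit MySQL's 3-byte utf8."""
    return any(utf8_len(ch) == 4 for ch in text)


print(hex(ord(emoji)), utf8_len(emoji))  # 0x1f60d 4
print(needs_utf8mb4("hello 你好"))        # False: CJK characters take 3 bytes
print(needs_utf8mb4("hello " + emoji))   # True: the emoji takes 4 bytes
```

A check like this could be used on the application side to spot messages that a 3-byte utf8 column would reject.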
Solution:
S1: Change the MySQL configuration file my.cnf
- S1.1 Check the database variables (switch to the database first: use databaseName)
mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+--------------------------+----------------------------+
At this point your MySQL variables should look like the output above.
- S1.2 Edit my.cnf
Mine is at /etc/my.cnf. Add the following to the file:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init_connect='SET NAMES utf8mb4'
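
As a rough sanity check before restarting, the fragment above can be parsed with Python's standard configparser module. This is only a sketch: the function name missing_settings and the EXPECTED table are illustrative, and a real my.cnf may contain lines (for example !includedir directives) that configparser does not understand, so the sketch works on the pasted fragment rather than on the whole file.

```python
import configparser

# The fragment from S1.2, pasted as text.
MY_CNF = """
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init_connect='SET NAMES utf8mb4'
"""

# (section, option) -> value we expect to find.
EXPECTED = {
    ("client", "default-character-set"): "utf8mb4",
    ("mysql", "default-character-set"): "utf8mb4",
    ("mysqld", "character-set-server"): "utf8mb4",
    ("mysqld", "collation-server"): "utf8mb4_unicode_ci",
}


def missing_settings(cnf_text: str) -> list:
    """Return (section, option, actual_value) for every expected setting that is absent or different."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    problems = []
    for (section, option), want in EXPECTED.items():
        got = parser.get(section, option, fallback=None)
        if got != want:
            problems.append((section, option, got))
    return problems


print(missing_settings(MY_CNF))  # [] when every expected setting is present
```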
- S1.3 Restart the database and check the variables
Restart the database: sudo /etc/init.d/mysqld restart
Check the variables:
mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+----------------------------+
It is enough to make sure the following variables are utf8mb4:
character_set_client
character_set_connection
character_set_database
character_set_results
character_set_server
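
The check above can also be done mechanically. The following Python sketch, with illustrative function names, parses the table printed by SHOW VARIABLES (pasted as text) and lists any of the five variables that are not utf8mb4.

```python
# The five variables that must be utf8mb4 according to the list above.
REQUIRED = (
    "character_set_client",
    "character_set_connection",
    "character_set_database",
    "character_set_results",
    "character_set_server",
)


def parse_show_variables(output: str) -> dict:
    """Turn the ASCII table printed by the mysql client into a dict."""
    variables = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            variables[cells[0]] = cells[1]
    return variables


def not_utf8mb4(variables: dict) -> list:
    """Names of required variables whose value is missing or not utf8mb4."""
    return [name for name in REQUIRED if variables.get(name) != "utf8mb4"]


sample = """
| character_set_client     | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database   | utf8    |
| character_set_results    | utf8mb4 |
| character_set_server     | utf8mb4 |
"""
print(not_utf8mb4(parse_show_variables(sample)))  # ['character_set_database']
```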
S2: Convert the database and the existing tables to utf8mb4
-- Change the database character set
ALTER DATABASE dataBaseName CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
-- Change the table character set (converts the existing columns as well)
ALTER TABLE tableName CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
-- Change the character set of a single column (two equivalent forms)
ALTER TABLE tableName CHANGE body body mediumtext CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tableName MODIFY `msg` mediumtext CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
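You can run show create table tableName to view the CREATE TABLE statement and confirm that it has changed.
After these two steps, storing emoji in MySQL no longer raises the error shown at the top of this post. However, when I view the table data with the database tool built into IDEA, the content shows up as question marks (??).
For now, before viewing MySQL table data in IDEA, I first run:
SET NAMES utf8mb4;
and then run the query (for example, a select statement).
My current guess is that the utf8mb4-encoded characters are stored correctly in MySQL and that only the display on query is wrong. For now, SET NAMES utf8mb4; works around it.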
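
When there are many tables, writing the statements by hand gets tedious. The Python sketch below only generates the SQL text for review; it does not connect to MySQL. The database and table names are hypothetical placeholders, and the helper names are made up for illustration.

```python
CHARSET_CLAUSE = "CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci"


def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(database: str, tables: list) -> list:
    """Build one ALTER DATABASE statement plus one ALTER TABLE ... CONVERT TO per table."""
    statements = [f"ALTER DATABASE {quote_identifier(database)} {CHARSET_CLAUSE};"]
    for table in tables:
        statements.append(
            f"ALTER TABLE {quote_identifier(table)} CONVERT TO {CHARSET_CLAUSE};"
        )
    return statements


# Hypothetical names, for illustration only.
for statement in conversion_statements("dataBaseName", ["table_a", "table_b"]):
    print(statement)
```

Reviewing the generated statements before running them is a deliberate choice here, since CONVERT TO rewrites every character column of a table.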
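
One possible explanation, stated here as an assumption rather than a confirmed diagnosis: if the session opened by the tool uses the 3-byte utf8 character set for results, MySQL converts the stored utf8mb4 text to that character set and substitutes '?' for characters it cannot represent. The Python sketch below only emulates that substitution for 4-byte characters; it does not talk to MySQL, the function name is made up, and it does not account for other characters turning into question marks.

```python
def as_3byte_utf8_result(text: str, placeholder: str = "?") -> str:
    """Emulate converting text to a character set limited to 3-byte UTF-8.

    Characters above U+FFFF need 4 bytes in UTF-8, so they are replaced
    with the placeholder; everything else passes through unchanged.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else placeholder for ch in text)


stored = "\U0001F60D你好"              # what is stored in the utf8mb4 column
print(as_3byte_utf8_result(stored))   # ?你好
```

Under this assumption, running SET NAMES utf8mb4; first makes the session use utf8mb4 for results, so no substitution is needed.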