先放结论,读写emoji表情都正常的操作方式是:
1、数据库的表定义,必须是 utf8mb4;
2、SpringBoot配置文件里的连接串必须加 characterEncoding=utf8
3、配置文件里同时要添加初始化SQL语句: initSQL: “SET NAMES utf8mb4”
保证以上3步,就可以正确写入,也能正确读取出emoji表情字符。
-------------------------------------------------我是分隔符-----------------------
今天在审查生产错误日志时,发现了这样的错误日志:
Incorrect string value: '\xF0\x9F\x98\x82\xE4\xB9...' for column 'Remarks' at row 1
放百度查了一下,都是答复说因为emoji表情字符的问题,导致入库失败。
然而,我看了n篇关于Java的文章,都要把连接串里的【characterEncoding=utf8】删除!!!
估计这些人只测试不报错了,却没有发现读取不出来了……
另外,还有很多人说数据库的编码也要改成utf8mb4,实际上我经过测试,数据库编码用utf8,建表用utf8mb4,完全可以正常读写emoji:
CREATE DATABASE aaa DEFAULT CHARACTER SET utf8;
CREATE TABLE `users` (
`userName` VARCHAR(50) NOT NULL
) ENGINE=INNODB DEFAULT CHARSET=utf8mb4 COMMENT='用户表'
我的测试过程,
1、首先要开启MySQL的日志记录,把所有SQL打印出来:
打开my.ini文件,在[mysqld]下增加3个配置(注:我的MySQL版本 5.7.27-log):
general-log=1
log-output=FILE
general-log-file=D:\Program Files\mysqldata\log.sql
2、删除连接串里的:【characterEncoding=utf8】
application.yml如下:
datasource:
url: jdbc:mysql://10.2.5.2:3306/db-utf8?&serverTimezone=Asia/Shanghai&useSSL=false
username: beinet
password: beinet123
driver-class-name: com.mysql.cj.jdbc.Driver
3、启动SpringBoot项目(2.1.4.RELEASE版本),先不做操作;
4、打开MySQL的日志文件,已经有内容了,SpringBoot已建立了连接:
MySQL, Version: 5.7.27-log (MySQL Community Server (GPL)). started with:
TCP Port: 0, Named Pipe: (null)
Time Id Command Argument
2020-07-30T11:45:51.461443Z 2 Connect beinet@10.2.0.43 on db-utf8 using TCP/IP
2020-07-30T11:45:51.473378Z 2 Query /* mysql-connector-java-8.0.15 (Revision: 79a4336f140499bd22dd07f02b708e163844e3d5) */SELECT @@session.auto_increment_increment AS auto_increment_increment, @@character_set_client AS character_set_client, @@character_set_connection AS character_set_connection, @@character_set_results AS character_set_results, @@character_set_server AS character_set_server, @@collation_server AS collation_server, @@collation_connection AS collation_connection, @@init_connect AS init_connect, @@interactive_timeout AS interactive_timeout, @@license AS license, @@lower_case_table_names AS lower_case_table_names, @@max_allowed_packet AS max_allowed_packet, @@net_write_timeout AS net_write_timeout, @@query_cache_size AS query_cache_size, @@query_cache_type AS query_cache_type, @@sql_mode AS sql_mode, @@system_time_zone AS system_time_zone, @@time_zone AS time_zone, @@transaction_isolation AS transaction_isolation, @@wait_timeout AS wait_timeout
2020-07-30T11:45:51.482978Z 2 Query SHOW WARNINGS
2020-07-30T11:45:51.486337Z 2 Query SET NAMES latin1
2020-07-30T11:45:51.487290Z 2 Query SET character_set_results = NULL
2020-07-30T11:45:51.489110Z 2 Query SET autocommit=1
2020-07-30T11:45:51.496463Z 2 Query SELECT @@session.transaction_read_only
2020-07-30T11:45:51.498080Z 2 Query SELECT @@session.transaction_isolation
2020-07-30T11:45:52.302419Z 2 Query SHOW FULL TABLES FROM `db-utf8` LIKE '%'
2020-07-30T11:45:52.307233Z 2 Query SHOW FULL TABLES FROM `db-utf8`
2020-07-30T11:45:52.309415Z 2 Query SHOW FULL COLUMNS FROM `users` FROM `db-utf8` LIKE '%'
2020-07-30T11:45:52.327227Z 2 Query SHOW FULL COLUMNS FROM `users1` FROM `db-utf8` LIKE '%'
可以看到,没有了characterEncoding,MySQL连接默认用 SET NAMES latin1;
4、在SpringBoot项目里,执行一条SQL,插入一个emoji表情,可以看到,插入是正常的,
但是,接着在项目里读取这条新插入的记录,会发现,读取出来的是一个 问号,而不是插入的表情了。
备注1:
SET NAMES utf8mb4,从MySQL官方看,做了3件事:https://dev.mysql.com/doc/refman/5.6/en/charset-connection.html
- character_set_client 客户端发送的语句编码;
- character_set_connection 数据库收到语句后,要转换到的编码;
- character_set_results 数据库执行语句后,返回给客户端的数据编码。
备注2:
在SpringBoot中,有多种开启初始化SQL的写法,
1、用tomcat的写法:
datasource:
url: jdbc:mysql://10.2.5.2:3306/db-utf8?characterEncoding=utf8&serverTimezone=Asia/Shanghai&useSSL=false
username: beinet
password: beinet123
driver-class-name: com.mysql.cj.jdbc.Driver
tomcat:
initSQL: "SET NAMES utf8mb4"
2、用阿里的durid连接池,可以用:
datasource:
url: jdbc:mysql://10.2.5.2:3306/db-utf8?characterEncoding=utf8&serverTimezone=Asia/Shanghai&useSSL=false
username: beinet
password: beinet123
driver-class-name: com.mysql.cj.jdbc.Driver
# 使用Druid数据库连接池
type: com.alibaba.druid.pool.DruidDataSource
druid:
connection-init-sqls: ["SET NAMES utf8mb4"]