Changing a table's character set (utf8 to utf8mb4)

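Contents

Environment:

Test data:

Checking the current database character set:

Changing the character set of table t2 from utf8 to utf8mb4:

Method 1: alter table t2 default character set utf8mb4;

Method 2: alter table t2 convert to character set utf8mb4;

Backing up with mysqldump:

Backing up with mydumper:

Summary: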


Environment:

  • MySQL: 5.6.23
  • Character set: utf8
  • Operating system: CentOS 6

Test data:

# Test table

CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL DEFAULT '' COMMENT '姓名',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

# Test rows

insert into t2 values(1,'张三'),(2,'李四');

Checking the current database character set:

(root@g1-db-test-v01:5623)[test]>show global variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name            | Value                                 |
+--------------------------+---------------------------------------+
| character_set_client     | utf8                                  |
| character_set_connection | utf8                                  |
| character_set_database   | utf8                                  |
| character_set_filesystem | binary                                |
| character_set_results    | utf8                                  |
| character_set_server     | utf8                                  |
| character_set_system     | utf8                                  |
| character_sets_dir       | /data/mysql/mha_mysql/share/charsets/ |
+--------------------------+---------------------------------------+
8 rows in set (0.00 sec)

Checking whether the server supports utf8mb4:

(root@g1-db-test-v01:5623)[test]>show character set like '%utf8%';
+---------+---------------+--------------------+--------+
| Charset | Description   | Default collation  | Maxlen |
+---------+---------------+--------------------+--------+
| utf8    | UTF-8 Unicode | utf8_general_ci    |      3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_general_ci |      4 |
+---------+---------------+--------------------+--------+
2 rows in set (0.00 sec)
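
The Maxlen column above is the maximum number of bytes per character: 3 for utf8, 4 for utf8mb4. As a rough illustration of what that means for the data used later in this post, the following Python sketch measures the UTF-8 byte length of a few characters. The helper names `utf8_byte_length` and `fits_utf8mb3` are made up for this example and are not part of MySQL.

```python
# MySQL's utf8 stores at most 3 bytes per character, utf8mb4 at most 4,
# which is what the Maxlen column reports.
UTF8MB3_MAX_BYTES = 3


def utf8_byte_length(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_utf8mb3(text: str) -> bool:
    """True if every character of text needs at most 3 bytes in UTF-8."""
    return all(utf8_byte_length(ch) <= UTF8MB3_MAX_BYTES for ch in text)


# "\U0001F337" is the tulip emoji (U+1F337).
for ch in ["a", "张", "\U0001F337"]:
    print(repr(ch), utf8_byte_length(ch), fits_utf8mb3(ch))
```

Characters such as 张 fit in 3 bytes and are fine in either character set; the emoji needs 4 bytes and therefore only fits in utf8mb4.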

Changing the character set of table t2 from utf8 to utf8mb4:

Method 1: alter table t2 default character set utf8mb4;

(root@g1-db-test-v01:5623)[test]>alter table t2 default character set utf8mb4;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

(root@g1-db-test-v01:5623)[test]>show create table t2 \G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `name` varchar(20) CHARACTER SET utf8 NOT NULL DEFAULT '' COMMENT '姓名',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

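Note:

After this statement the table's default character set is utf8mb4, but the string column `name` still uses utf8. `default character set` only changes the default applied to columns added later; existing columns and their data are left untouched, which is why the statement reports 0 rows affected.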

Method 2: alter table t2 convert to character set utf8mb4;

(root@g1-db-test-v01:5623)[test]>alter table t2 convert to character set utf8mb4;
Query OK, 2 rows affected (0.06 sec)
Records: 2  Duplicates: 0  Warnings: 0

(root@g1-db-test-v01:5623)[test]>show create table t2 \G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL DEFAULT '' COMMENT '姓名',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

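So when changing a table's character set from utf8 to utf8mb4, use method 2. `convert to` rewrites the existing string columns and their data (note the 2 rows affected), so the per-column `CHARACTER SET utf8` override is gone.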
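
When more than one table needs converting, the statements can be generated rather than typed by hand. The sketch below is one possible way to do that in Python; `quote_identifier` and `build_convert_statements` are hypothetical helpers written for this example, and it only builds the SQL text without connecting to a server. If no collation is given, MySQL applies the character set's default collation (utf8mb4_general_ci on the server shown above).

```python
import re
from typing import Iterable, List, Optional

# Character set and collation names are plain words such as utf8mb4_unicode_ci.
_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_convert_statements(
    tables: Iterable[str],
    charset: str = "utf8mb4",
    collation: Optional[str] = None,
) -> List[str]:
    """Build one ALTER TABLE ... CONVERT TO CHARACTER SET statement per table."""
    for value in (charset, collation):
        if value is not None and not _NAME_RE.fullmatch(value):
            raise ValueError(f"unexpected character set or collation name: {value!r}")
    suffix = f" COLLATE {collation}" if collation else ""
    return [
        f"ALTER TABLE {quote_identifier(table)} CONVERT TO CHARACTER SET {charset}{suffix};"
        for table in tables
    ]


for statement in build_convert_statements(["t2"]):
    print(statement)
```

The generated text can be reviewed first and then run through the mysql client. Converting a large table rewrites its rows, so it is worth trying on a copy before touching production data.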

Backing up with mysqldump:

Insert a row containing a 4-byte character (the tulip emoji, U+1F337):

(root@g1-db-test-v01:5623)[test]>select * from t2;
+----+--------+
| id | name   |
+----+--------+
|  1 | 张三   |
|  2 | 李四   |
+----+--------+
2 rows in set (0.00 sec)

(root@g1-db-test-v01:5623)[test]>set names utf8mb4;
Query OK, 0 rows affected (0.00 sec)

(root@g1-db-test-v01:5623)[test]>insert into t2 values(3,'🌷');
Query OK, 1 row affected (0.00 sec)

(root@g1-db-test-v01:5623)[test]>select * from t2;
+----+--------+
| id | name   |
+----+--------+
|  1 | 张三   |
|  2 | 李四   |
|  3 | 🌷       |
+----+--------+
3 rows in set (0.00 sec)

mysqldump backup command:

When --default-character-set is not given, this mysqldump (5.6) defaults to utf8.

mysqldump \
--host=127.0.0.1 --user=root -p --port=5623 \
--master-data=2 --single-transaction test t2  >/data/mysql/tmp/t2.sql

Inspect the backup file: vim /data/mysql/tmp/t2.sql

-- MySQL dump 10.13  Distrib 5.6.42-84.2, for Linux (x86_64)
--
-- Host: 127.0.0.1    Database: test
-- ------------------------------------------------------
-- Server version       5.6.23-72.1-log

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Position to start replication or point-in-time recovery from
--

-- CHANGE MASTER TO MASTER_LOG_FILE='mha-mysql-bin.000039', MASTER_LOG_POS=13625726;

--
-- Table structure for table `t2`
--

DROP TABLE IF EXISTS `t2`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL DEFAULT '' COMMENT '姓名',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `t2`
--

LOCK TABLES `t2` WRITE;
/*!40000 ALTER TABLE `t2` DISABLE KEYS */;
INSERT INTO `t2` VALUES (1,'张三'),(2,'李四'),(3,'?');
/*!40000 ALTER TABLE `t2` ENABLE KEYS */;
UNLOCK TABLES;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2020-07-02 14:28:26

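Note that the name value of the row with id=3 is garbled: the dump was taken over a utf8 connection (SET NAMES utf8), so the emoji was written as '?'.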
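
The '?' in the dump is what happens when a 4-byte character is sent over a connection whose character set is utf8: the character cannot be represented and is replaced. The following Python sketch is only a rough emulation of that effect, not MySQL's actual conversion code, and the function name `as_utf8mb3_connection` is invented for this example.

```python
def as_utf8mb3_connection(text: str) -> str:
    """Emulate reading text over a utf8 (3-byte) connection.

    MySQL's utf8 covers only the Basic Multilingual Plane (code points up to
    U+FFFF). Anything above that is replaced with '?', as seen in the dump.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else "?" for ch in text)


# The three name values from table t2; "\U0001F337" is the tulip emoji.
for name in ["张三", "李四", "\U0001F337"]:
    print(as_utf8mb3_connection(name))
```

The replacement is lossy: once the dump contains '?', the original character cannot be recovered from the file.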

 

Next, back up with --default-character-set=utf8mb4:

mysqldump \
--host=127.0.0.1 --user=root -p --port=5623 \
--default-character-set=utf8mb4 \
--master-data=2 --single-transaction test t2  >/data/mysql/tmp/t2.sql
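
If the backup is driven from a script, forcing the character set in one place makes it harder to forget. The sketch below builds the same argument list as the command above; `build_mysqldump_args` is a hypothetical helper for illustration, and its default host, user and port are simply the values used in this post.

```python
from typing import List


def build_mysqldump_args(
    database: str,
    table: str,
    host: str = "127.0.0.1",
    user: str = "root",
    port: int = 5623,
    charset: str = "utf8mb4",
) -> List[str]:
    """Build a mysqldump argument list that always sets the character set."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "-p",  # prompt for the password instead of putting it on the command line
        f"--port={port}",
        f"--default-character-set={charset}",
        "--master-data=2",
        "--single-transaction",
        database,
        table,
    ]


print(" ".join(build_mysqldump_args("test", "t2")))
```

The list can be passed to `subprocess.run(args, stdout=output_file, check=True)`; because of `-p`, mysqldump will still ask for the password interactively.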

Inspect the backup file. This time the row with id=3 is not garbled:

-- MySQL dump 10.13  Distrib 5.6.42-84.2, for Linux (x86_64)
--
-- Host: 127.0.0.1    Database: test
-- ------------------------------------------------------
-- Server version       5.6.23-72.1-log

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8mb4 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Position to start replication or point-in-time recovery from
--

-- CHANGE MASTER TO MASTER_LOG_FILE='mha-mysql-bin.000039', MASTER_LOG_POS=13625726;

--
-- Table structure for table `t2`
--

DROP TABLE IF EXISTS `t2`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL DEFAULT '' COMMENT '姓名',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `t2`
--

LOCK TABLES `t2` WRITE;
/*!40000 ALTER TABLE `t2` DISABLE KEYS */;
INSERT INTO `t2` VALUES (1,'张三'),(2,'李四'),(3,'🌷');
/*!40000 ALTER TABLE `t2` ENABLE KEYS */;
UNLOCK TABLES;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2020-07-02 14:31:13
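
Both dumps above record the connection character set in a `SET NAMES` line in the header, so an existing dump file can be checked before it is trusted or restored. A minimal Python sketch, assuming the header format shown in this post (the function name `dump_charset` is made up for this example):

```python
import re
from typing import Optional

# Matches header lines such as: /*!40101 SET NAMES utf8mb4 */;
SET_NAMES_RE = re.compile(r"SET\s+NAMES\s+([A-Za-z0-9_]+)", re.IGNORECASE)


def dump_charset(dump_text: str) -> Optional[str]:
    """Return the character set of the first SET NAMES statement, or None."""
    match = SET_NAMES_RE.search(dump_text)
    return match.group(1).lower() if match else None


print(dump_charset("/*!40101 SET NAMES utf8 */;"))
print(dump_charset("/*!40101 SET NAMES utf8mb4 */;"))
```

For a large dump it is enough to read the first few kilobytes of the file and pass that to the function, since the statement sits in the header.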

Backing up with mydumper:

The backup command:

mydumper \
--host=10.16.81.101 --user=dba --password=doumi1.q --port=5623 \
--database=test -T t2 -o /data/mysql/tmp/test

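Inspect the backup file. mydumper writes SET NAMES binary, and the emoji is dumped intact without any extra option: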

cat test.t2.sql 
/*!40101 SET NAMES binary*/;
/*!40014 SET FOREIGN_KEY_CHECKS=0*/;
/*!40103 SET TIME_ZONE='+00:00' */;
INSERT INTO `t2` VALUES
(1,"张三"),
(2,"李四"),
(3,"🌷");

Summary:

  1. To change a table's character set, use alter table ... convert to character set utf8mb4;
  2. When backing up utf8mb4 tables with mysqldump, pass --default-character-set=utf8mb4;
  3. When the data is utf8mb4, client connections need set names utf8mb4.

Reference: https://dev.mysql.com/doc/refman/5.6/en/charset-unicode-conversion.html

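### Configuring Tomcat to support MySQL with the UTF8MB4 character set

For the Tomcat connection pool to handle UTF8MB4-encoded data and pass it to MySQL correctly, three layers need to be configured.

#### Edit the MySQL configuration file

The MySQL server's default character encoding needs to be set to UTF8MB4. This is done by editing my.cnf (Linux) or my.ini (Windows).

```ini
[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
```

With this configuration the server ignores the character set the client sends during the handshake and forces the configured server character set[^1].

#### Specify the character set when creating the database

When creating a new database, declare the character set and collation explicitly:

```sql
CREATE SCHEMA `test` DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```

This makes utf8mb4 the default character set of the new schema, together with the matching collation[^2].

#### Set the JDBC URL parameters

For a Tomcat application, the data source definition tells the driver which character encoding to use through parameters in the JDBC URL. Add a URL of the following form in context.xml or another suitable place. Because the URL sits in an XML attribute, each & must be written as &amp;.

```xml
<Resource name="jdbc/TestDB" auth="Container" type="javax.sql.DataSource"
          ...
          url="jdbc:mysql://localhost:3306/test?autoReconnect=true&amp;useUnicode=yes&amp;characterEncoding=UTF-8"/>
```

The two parameters to note are `useUnicode=yes` and `characterEncoding=UTF-8`, which tell the JDBC driver how to treat the data coming from the Java application[^3].

A note on version compatibility: some early versions may not support this fully, but according to the official documentation recent versions of MySQL Connector/J support the UTF8MB4 character set[^4].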