问题现象,表注释,数据库注释中文都为乱码:
desc t_dm_test;
address string ????
rowtime string ??????????
inserttime string ????
remark string ??
查看元数据库mysql的配置,全部正常为utf-8
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
检查hive元数据库,发现数据库字符集和表字符集为CHARSET=latin1,应该是mysql没有修改utf8字符集或者字符集未生效时创建的hive元数据库。
mysql> show create database hive;
+----------+---------------------------------------------------------------+
| Database | Create Database |
+----------+---------------------------------------------------------------+
| hive | CREATE DATABASE `hive` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+----------+---------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> show create table COLUMNS_V2;
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| COLUMNS_V2 | CREATE TABLE `COLUMNS_V2` (
`CD_ID` bigint(20) NOT NULL,
`COMMENT` varchar(256) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`COLUMN_NAME` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`TYPE_NAME` mediumtext,
`INTEGER_IDX` int(11) NOT NULL,
PRIMARY KEY (`CD_ID`,`COLUMN_NAME`),
KEY `COLUMNS_V2_N49` (`CD_ID`),
CONSTRAINT `COLUMNS_V2_FK1` FOREIGN KEY (`CD_ID`) REFERENCES `CDS` (`CD_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
知道原因就好处理了,修改元数据库和表的字符类型:
mysql -u -p连接数据库
mysql> alter database hive character set utf8;
Query OK, 1 row affected (0.00 sec)
#修改数据库字符集
alter database hive character set utf8;
#修改字段注释字符集
ALTER TABLE COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
#修改表注释字符集
ALTER TABLE TABLE_PARAMS modify column PARAM_VALUE varchar(40000) character set utf8;
#修改分区参数,支持分区建用中文表示
ALTER TABLE PARTITION_PARAMS modify column PARAM_VALUE varchar(40000) character set utf8;
ALTER TABLE PARTITION_KEYS modify column PKEY_COMMENT varchar(40000) character set utf8;
#修改表名注释,支持中文表示
ALTER TABLE INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
#修改视图,支持视图中文
ALTER TABLE TBLS modify COLUMN VIEW_EXPANDED_TEXT mediumtext CHARACTER SET utf8;
ALTER TABLE TBLS modify COLUMN VIEW_ORIGINAL_TEXT mediumtext CHARACTER SET utf8;
#修改数据库名称注释
ALTER TABLE `DBS` CHANGE COLUMN `DESC` `DESC` VARCHAR(4000) CHARACTER SET 'utf8' NULL DEFAULT NULL ;
hive中查看表信息,中文显示正常
hive> desc t_dm_test;
OK
rowtime string 记录新增或更新时的时
inserttime string 落库时间
remark string 备注