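I. Problem description
Environment: CDH 6.3.1 + Hive 2.1.0. The Hive metastore database is MySQL.
After creating a Hive table, show create table displays the Chinese comments as question marks, as follows: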
CREATE TABLE `stage_mysql.userdevice_default_group_day`(
  `id` int COMMENT '??',
  `user_id` string COMMENT '????',
  `sub_serial` string COMMENT '?????',
  `at_home` int COMMENT '0:?? 1?? 2??',
  `out_door` int COMMENT '0:?? 1?? 2??',
  `at_sleep` int COMMENT '0:?? 1?? 2??')
COMMENT '????????'
PARTITIONED BY (
  `dt` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  'hdfs://nameservice1/user/hive/warehouse/stage_mysql.db/userdevice_default_group_day'
TBLPROPERTIES (
  'transient_lastDdlTime'='1589878417')
II. Solution
1. In MySQL, run show create database hive; the database's default character set turns out to be utf8:
+----------+---------------------------------------------------------------+
| Database | Create Database |
+----------+---------------------------------------------------------------+
| hive | CREATE DATABASE `hive` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+----------+---------------------------------------------------------------+
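2. Change the database's default character set to latin1. This only sets the default for tables created afterward; existing tables and columns keep their current character set.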
mysql> alter database hive default character set latin1;
3. In MySQL, change the Hive metadata columns that hold comments and parameter values to utf8:
mysql> use hive;
mysql> alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
mysql> alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
mysql> alter table PARTITION_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
mysql> alter table PARTITION_KEYS modify column PKEY_COMMENT varchar(4000) character set utf8;
mysql> alter table INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
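Before moving on, it may be worth confirming that the five columns really did switch character set. The query below is a minimal sketch that reads MySQL's information_schema; it assumes the metastore schema is named hive, as in the steps above, and that the table and column names match those in the alter statements.

```sql
-- Run in MySQL against the metastore server.
-- Assumes the metastore schema is named `hive`, as in the steps above.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLUMN_TYPE
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'hive'
  AND (TABLE_NAME, COLUMN_NAME) IN (
        ('COLUMNS_V2',       'COMMENT'),
        ('TABLE_PARAMS',     'PARAM_VALUE'),
        ('PARTITION_PARAMS', 'PARAM_VALUE'),
        ('PARTITION_KEYS',   'PKEY_COMMENT'),
        ('INDEX_PARAMS',     'PARAM_VALUE')
      );
-- CHARACTER_SET_NAME should read utf8 for each row
-- (newer MySQL releases may report it as utf8mb3).
```

Any row still showing latin1 points to an alter statement that did not take effect.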
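4. Re-create the table and run show create table again; the Chinese comments now display correctly. Comments written before the change were presumably already stored as ? characters and are not repaired by altering the columns, which is why the table has to be re-created.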
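To check the fix without touching a production table, one option is a throwaway table. The following HiveQL is only a sketch: the table name tmp_comment_check and its comments are made up for illustration and do not come from the original table.

```sql
-- Run in Hive (beeline or the hive CLI).
-- tmp_comment_check is a hypothetical scratch table, used only to check
-- that Chinese comments survive a round trip through the metastore.
CREATE TABLE tmp_comment_check (
  id int COMMENT '主键',
  user_id string COMMENT '用户ID'
)
COMMENT '中文注释测试表'
STORED AS PARQUET;

-- The comments should come back as Chinese text rather than question marks.
SHOW CREATE TABLE tmp_comment_check;

-- Clean up the scratch table.
DROP TABLE tmp_comment_check;
```

If the comments still come back as question marks here, the client terminal's encoding or the metastore connection settings may also be involved; those are outside the steps described above.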