hive数据导入MySQL乱码_sqoop学习3(数据导入乱码问题)

sqoop将mysql数据库中数据导入hdfs或hive中后中文乱码问题解决办法

[root@spark1 ~]# vi /etc/my.cnf 修改配置文件

在文件内的[mysqld]和client下增加如下1行

[mysqld]

default-character-set=utf8

[client]

default-character-set=utf8

然后在创建数据库和表时都指定字符集为utf8

mysql> create database wujiadong1 character set utf8;

mysql> create table stud_info(

-> stud_code varchar(50) not null,

-> stud_name varchar(50) not null,

-> stud_gend varchar(50) not null default 'M',

-> birthday date null,

-> log_date date null,

-> orig_addr varchar(50) null,

-> lev_date date null,

-> college_code varchar(50) null,

-> college_name varchar(50) null,

-> state varchar(50) null,

-> primary key(stud_code)

-> )character set utf8;

mysql> load data local infile '/root/hive_test/stud_info.csv' into table stud_info

-> fields terminated by ','

-> lines terminated by '\n'

-> ignore 1 lines;

mysql> select * from stud_info; #看中文字符能否正常显示

再向hdfs中导入数据

[root@spark1 ~]# sqoop import --connect jdbc:mysql://192.168.220.144:3306/wujiadong1 --username root --table stud_info --target-dir 'hdfs://spark1:9000/user/sqoop_test1' -m 1

[root@spark1 ~]# hadoop fs -lsr /user/sqoop_test1

[root@spark1 ~]# hadoop fs -cat /user/sqoop_test1/part-m-00000

6165720.html

13659a32071e9843cb6348a8941b4574.png

mysql数据导入hdfs中中文乱码问题总结

修改mysql里面的my.conf文件

创建数据库,指定字符集是utf8

再新的数据库里面创建表,在create table语句里面指定字符集是 utf8

插入中文汉字记录

select看到中文是正常的

依次完成这些操作以后,再用sqoop导入

导入hdfs解决中文乱码问题后,再去导入hive中就没出现乱码问题了,所以应该是一样的解决方法

mysql中的编码查看和修改方法

查看编码方式

mysql> show variables like 'collation_%';

+----------------------+-------------------+

| Variable_name | Value |

+----------------------+-------------------+

| collation_connection | latin1_swedish_ci |

| collation_database | latin1_swedish_ci |

| collation_server | latin1_swedish_ci |

+----------------------+-------------------+

mysql> show variables like 'character_set_%'; 查看mysql数据库默认编码

+--------------------------+----------------------------+

| Variable_name | Value |

+--------------------------+----------------------------+

| character_set_client | latin1 |

| character_set_connection | latin1 |

| character_set_database | latin1 |

| character_set_filesystem | binary |

| character_set_results | latin1 |

| character_set_server | latin1 |

| character_set_system | utf8 |

| character_sets_dir | /usr/share/mysql/charsets/ |

+--------------------------+----------------------------+

修改编码方式在/etc/my.cnf这个文件中修改

[root@spark1 ~]# vi /etc/my.cnf

root@spark1 ~]# service mysqld restart 重启mysql

查看是否变成utf8

mysql> \s

--------------

mysql Ver 14.14 Distrib 5.1.73, for redhat-linux-gnu (x86_64) using readline 5.1

Connection id:6

Current database:

Current user:root@localhost

SSL:Not in use

Current pager:stdout

Using outfile:''

Using delimiter:;

Server version:5.1.73 Source distribution

Protocol version:10

Connection:Localhost via UNIX socket

Server characterset:utf8

Db characterset:utf8

Client characterset:utf8

Conn. characterset:utf8

UNIX socket:/var/lib/mysql/mysql.sock

Uptime:22 min 3 sec

Threads: 1 Questions: 59 Slow queries: 0 Opens: 20 Flush tables: 1 Open tables: 9 Queries per second avg: 0.44

--------------

mysql> show variables like "char%";

+--------------------------+----------------------------+

| Variable_name | Value |

+--------------------------+----------------------------+

| character_set_client | utf8 |

| character_set_connection | utf8 |

| character_set_database | utf8 |

| character_set_filesystem | binary |

| character_set_results | utf8 |

| character_set_server | utf8 |

| character_set_system | utf8 |

| character_sets_dir | /usr/share/mysql/charsets/ |

+--------------------------+----------------------------+

8 rows in set (0.00 sec)

mysql> show variables like "colla%";

+----------------------+-----------------+

| Variable_name | Value |

+----------------------+-----------------+

| collation_connection | utf8_general_ci |

| collation_database | utf8_general_ci |

| collation_server | utf8_general_ci |

+----------------------+-----------------+

3 rows in set (0.00 sec)

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值