Basic Sqoop usage

weixin_34161032

Original link: https://my.oschina.net/u/3862440/blog/2354369

1. Importing MySQL data into HDFS:

sqoop import \
--connect jdbc:mysql://192.168.83.11:3306/sqoop \
--username root \
--password Oracle123 \
--table sqoop1 \
--delete-target-dir \
-m 1
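
The import above passes the password on the command line and relies on the default target directory. A minimal sketch of a variant follows, assuming a password file and an explicit target directory; both paths are hypothetical and not from the original post. Sqoop's --password-file option reads the password from a file (typically written without a trailing newline and restricted to mode 400), and with -m 1 the output is expected in a single part file.

```shell
# Sketch only: both paths below are assumptions, not taken from the original post.
PASSWORD_FILE=/user/root/.mysql.password   # hypothetical file holding only the password
TARGET_DIR=/user/root/sqoop1               # hypothetical HDFS target directory

# Same import as above, but without the password on the command line.
import_sqoop1() {
  sqoop import \
    --connect jdbc:mysql://192.168.83.11:3306/sqoop \
    --username root \
    --password-file "$PASSWORD_FILE" \
    --table sqoop1 \
    --target-dir "$TARGET_DIR" \
    --delete-target-dir \
    -m 1
}

# With -m 1 only one mapper runs, so the rows land in one part file.
show_imported_rows() {
  hdfs dfs -ls "$TARGET_DIR"
  hdfs dfs -cat "$TARGET_DIR/part-m-00000"
}
```

Running import_sqoop1 and then show_imported_rows should list the target directory and print the imported rows as comma-separated text, which is Sqoop's default output format.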

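--query specifies the SQL statement to run. Note that the statement must contain the literal token $CONDITIONS in its WHERE clause (written as AND \$CONDITIONS when the query is wrapped in double quotes). Sqoop requires it for every free-form query, whether or not the query has other WHERE conditions.

--hive-table names the target Hive table

--target-dir sets the HDFS directory the data is written to

--delete-target-dir deletes the target HDFS directory first if it already exists

--split-by names the column used to split the rows across the parallel map tasks (it is not a shuffle key); the primary key is the usual choice

--hive-overwrite overwrites the existing data in the Hive table instead of appending to it

--null-string sets the string written for NULL values in string columns; use '\\N' so that Hive reads them as NULL

--null-non-string sets the string written for NULL values in non-string columns; use '\\N' so that Hive reads them as NULL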
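
The options listed above can be combined in a single free-form query import into Hive. The following is a sketch under stated assumptions: the source table emp, its columns, the Hive table emp_hive, and the staging directory are all hypothetical names chosen for illustration. The query is single-quoted so the shell leaves $CONDITIONS untouched; Sqoop replaces it with a range predicate on the --split-by column for each mapper.

```shell
# Sketch only: emp, emp_hive and the staging directory are hypothetical names.
import_emp_to_hive() {
  sqoop import \
    --connect jdbc:mysql://192.168.83.11:3306/sqoop \
    --username root \
    --password Oracle123 \
    --query 'SELECT empno, ename, sal FROM emp WHERE sal > 1000 AND $CONDITIONS' \
    --split-by empno \
    --target-dir /tmp/sqoop/emp_staging \
    --delete-target-dir \
    --hive-import \
    --hive-table emp_hive \
    --hive-overwrite \
    --null-string '\\N' \
    --null-non-string '\\N' \
    -m 2
}
```

A free-form query needs --target-dir, and it needs --split-by whenever more than one mapper is used, which is why both appear here even though the data ends up in Hive.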

2. Exporting HDFS data to MySQL:

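First create the table structure, because sqoop export needs the target table to exist already (Sqoop feels rather weak here: couldn't it create the table itself?).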

create table emp_1 (
    empno int,
    ename varchar(20),
    job varchar(20),
    mgr int,
    hirdate varchar(20),
    sal double,
    comm double,
    deptno int
);

Then export the data:

sqoop export \
--connect jdbc:mysql://hd1:3306/hive \
--username root \
--password Oracle123 \
--table emp_1 \
--export-dir /user/hive/warehouse/part_emp3/mgr=10/emp.txt \
--fields-terminated-by '\t'
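
Files written by Hive usually store NULL as \N, which an export would otherwise send to MySQL as literal text. A minimal sketch of the same export with NULL handling follows, assuming the data uses Hive's default \N marker. Pointing --export-dir at the partition directory rather than a single file is a variation on the original command, not something the post itself does.

```shell
# Sketch only: assumes the Hive files store NULL as \N (Hive's default).
export_emp_1() {
  sqoop export \
    --connect jdbc:mysql://hd1:3306/hive \
    --username root \
    --password Oracle123 \
    --table emp_1 \
    --export-dir /user/hive/warehouse/part_emp3/mgr=10 \
    --fields-terminated-by '\t' \
    --input-null-string '\\N' \
    --input-null-non-string '\\N'
}
```

The --input-null-string and --input-null-non-string options are the export-side counterparts of --null-string and --null-non-string: they tell Sqoop which input text to turn back into SQL NULL.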

3. Copying a MySQL table's structure into Hive:

sqoop create-hive-table \
--connect jdbc:mysql://hd1:3306/hive \
--table TBS \
--username root \
--password Oracle123 \
--hive-table test
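
create-hive-table copies only the table definition; no rows are transferred. A short sketch of how the result might be checked and then filled follows, assuming the hive command-line client is available on the same host. The follow-up import is a hypothetical next step that the original post does not include.

```shell
# Show the columns of the Hive table produced by create-hive-table.
describe_hive_test() {
  hive -e 'DESCRIBE test;'
}

# Hypothetical follow-up: load the rows of TBS into that Hive table.
load_tbs_into_hive() {
  sqoop import \
    --connect jdbc:mysql://hd1:3306/hive \
    --username root \
    --password Oracle123 \
    --table TBS \
    --hive-import \
    --hive-table test \
    -m 1
}
```

Calling describe_hive_test first confirms that the column list matches the MySQL table before any data is loaded.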
