【sqoop】Incremental import from MySQL by date

1. Create the following test table

+----+------+---------------------+
| id | val  | created_at          |
+----+------+---------------------+
|  0 |    0 | 2012-05-30 15:27:48 |
|  1 |    4 | 2012-05-30 15:28:33 |
|  2 |   14 | 2012-05-30 15:28:33 |
|  3 |  100 | 2012-05-30 15:41:17 |
+----+------+---------------------+
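The post does not show the DDL; a minimal sketch that reproduces this behaviour (the ON UPDATE clause is what makes created_at change automatically in step 3) would be:

CREATE TABLE test (
  id         INT PRIMARY KEY,
  val        INT,
  -- refreshed automatically every time the row is updated
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);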

2. Import the source table into HDFS

sqoop import --connect jdbc:mysql://<host name>/<database> --table test --username hive -P -m 1 --target-dir old
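To sanity-check the import before moving on, the HDFS output can be listed and printed (the part file name assumes the usual single-mapper naming):

hadoop fs -ls old
hadoop fs -cat old/part-m-00000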

3. Update the table

update test set val=210 where id=3;
+----+------+---------------------+
| id | val  | created_at          |
+----+------+---------------------+
|  0 |    0 | 2012-05-30 15:27:48 |
|  1 |    4 | 2012-05-30 15:28:33 |
|  2 |   14 | 2012-05-30 15:28:33 |
|  3 |  210 | 2012-05-31 10:09:28 |
+----+------+---------------------+

4. Create the incremental import job

First find the high-water mark for created_at, which will be used as --last-value:

SELECT MAX(created_at) FROM test;

Then create a saved job that imports only rows modified since that timestamp, replacing max_dt below with the query result (quoted, since the timestamp contains a space):

sqoop job --create mergetest -- import --connect jdbc:mysql://<host name>/<database> --table test --incremental lastmodified --check-column created_at --last-value "max_dt" --username hive -P -m 1 --target-dir new
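A sketch of wiring the two steps together from the shell; the mysql client flags are standard, but treat the connection details and the MAX_DT variable as illustrative:

# fetch the high-water mark; -N suppresses the column header
MAX_DT=$(mysql -N -h <host name> -u hive -p -D <database> -e "SELECT MAX(created_at) FROM test")
# then create the job with:  --last-value "$MAX_DT"

Note that once the saved job has executed, Sqoop updates the job's stored last-value automatically, so later runs of mergetest continue from where the previous one left off without recomputing it.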
5. Run the job

sqoop job --exec mergetest
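The saved job and its stored parameters (including the last-value Sqoop updates after each run) can be inspected with:

sqoop job --list
sqoop job --show mergetest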

6. Merge the old and new data

The merge tool needs the record class Sqoop generated for the test table during the import, so copy the generated test.jar first (the hash directory is from this particular run):

cp /tmp/sqoop-${USER}/compile/028e88f8a52c3505710a26375853e7ff/test.jar  .
sqoop merge --class-name test --new-data new --onto old --target-dir merge --jar-file ./test.jar --merge-key id
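The hash directory under /tmp/sqoop-${USER}/compile changes on every run; if it has already been cleaned up, the same record class and test.jar can be regenerated with the codegen tool (same connection parameters as above, shown here as a sketch):

sqoop codegen --connect jdbc:mysql://<host name>/<database> --table test --username hive -P

Its output reports where test.jar was written.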
7. Load the merged result into Hive

create table test(id int, val int, created_at string) row format delimited fields terminated by ',';
load data inpath 'merge/part-r-00000' into table test;

hive> select * from test;
OK
0	0	2012-05-30 15:27:48.0
1	4	2012-05-30 15:28:33.0
2	14	2012-05-30 15:28:33.0
3	210	2012-05-31 10:09:28.0
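As an alternative to the LOAD step, an external Hive table can be pointed directly at the merge output; the HDFS location below is an assumption and should match the resolved --target-dir merge:

create external table test_merged(id int, val int, created_at string) row format delimited fields terminated by ',' location '/user/<user>/merge';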


