Sqoop Incremental Import/Export and Job Operation Examples

Incremental Import

Append-mode incremental import on an increasing column

# First (full) import

[root@node222 ~]# /usr/local/sqoop-1.4.7/bin/sqoop import --connect jdbc:mysql://192.168.0.200:3306/sakila?useSSL=false --table actor --where "actor_id < 50" --username sakila -P --num-mappers 1 --target-dir /tmp/hive/sqoop/actor_all

...

18/10/15 14:32:14 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7

Enter password:

...

18/10/15 14:32:34 INFO mapreduce.Job: Running job: job_1539583112983_0002

18/10/15 14:32:49 INFO mapreduce.Job: Job job_1539583112983_0002 running in uber mode : false

18/10/15 14:32:49 INFO mapreduce.Job: map 0% reduce 0%

18/10/15 14:33:06 INFO mapreduce.Job: map 100% reduce 0%

18/10/15 14:33:07 INFO mapreduce.Job: Job job_1539583112983_0002 completed successfully

18/10/15 14:33:08 INFO mapreduce.Job: Counters: 30

...

18/10/15 14:33:08 INFO mapreduce.ImportJobBase: Transferred 1.8262 KB in 40.9516 seconds (45.6636 bytes/sec)

18/10/15 14:33:08 INFO mapreduce.ImportJobBase: Retrieved 49 records.

[hadoop@node224 ~]$ hdfs dfs -cat /tmp/hive/sqoop/actor_all/part-m-00000

1,PENELOPE,GUINESS,2006-02-15 04:34:33.0

...

48,FRANCES,DAY-LEWIS,2006-02-15 04:34:33.0

49,ANNE,CRONYN,2006-02-15 04:34:33.0

[hadoop@node224 ~]$

# Append-mode incremental import (the first import covered actor_id < 50)

# Specify the incremental mode

--incremental (mode: append | lastmodified)

# Specify the column used to check for new rows

--check-column (col)

# Specify the last value already imported; only rows with a greater value are fetched

--last-value (value)

# append-mode incremental import

[root@node222 ~]# /usr/local/sqoop-1.4.7/bin/sqoop import --connect jdbc:mysql://192.168.0.200:3306/sakila?useSSL=false --table actor --username sakila -P --incremental append --check-column actor_id --last-value 49 --num-mappers 1 --target-dir /tmp/hive/sqoop/actor_all

...

Enter password:

...

18/10/15 14:43:03 INFO mapreduce.Job: Running job: job_1539583112983_0003

18/10/15 14:43:19 INFO mapreduce.Job: Job job_1539583112983_0003 running in uber mode : false

18/10/15 14:43:19 INFO mapreduce.Job: map 0% reduce 0%

18/10/15 14:43:34 INFO mapreduce.Job: map 100% reduce 0%

18/10/15 14:43:35 INFO mapreduce.Job: Job job_1539583112983_0003 completed successfully

18/10/15 14:43:35 INFO mapreduce.Job: Counters: 30

...

18/10/15 14:43:35 INFO mapreduce.ImportJobBase: Transferred 5.79 KB in 38.6992 seconds (153.2074 bytes/sec)

18/10/15 14:43:35 INFO mapreduce.ImportJobBase: Retrieved 151 records.

18/10/15 14:43:35 INFO util.AppendUtils: Appending to directory actor_all

18/10/15 14:43:35 INFO util.AppendUtils: Using found partition 1

18/10/15 14:43:35 INFO tool.ImportTool: Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:

18/10/15 14:43:35 INFO tool.ImportTool: --incremental append

18/10/15 14:43:35 INFO tool.ImportTool: --check-column actor_id

18/10/15 14:43:35 INFO tool.ImportTool: --last-value 200

18/10/15 14:43:35 INFO tool.ImportTool: (Consider saving this with 'sqoop job --create')

[hadoop@node224 ~]$ hdfs dfs -cat /tmp/hive/sqoop/actor_all/part-m-00001

50,NATALIE,HOPKINS,2006-02-15 04:34:33.0

...

200,JULIA,FAWCETT,2006-02-15 04:34:33.0
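The log above suggests saving this incremental import with `sqoop job --create`, so that Sqoop tracks `--last-value` automatically in its metastore between runs. A minimal sketch, reusing the connection string and paths from the examples above (the job name `actor_append_job` is made up for illustration):

```shell
# Create a saved job; note the space after "--" before "import".
# Sqoop records the new last-value in its metastore after each execution.
/usr/local/sqoop-1.4.7/bin/sqoop job --create actor_append_job -- import \
  --connect jdbc:mysql://192.168.0.200:3306/sakila?useSSL=false \
  --table actor --username sakila -P \
  --incremental append --check-column actor_id --last-value 200 \
  --num-mappers 1 --target-dir /tmp/hive/sqoop/actor_all

# List saved jobs, then execute one; each run imports only rows newer
# than the stored last-value and updates it on success.
/usr/local/sqoop-1.4.7/bin/sqoop job --list
/usr/local/sqoop-1.4.7/bin/sqoop job --exec actor_append_job
```

Note that `-P` prompts for the password interactively on every execution; for scheduled jobs, `--password-file` is the usual alternative.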

Lastmodified-mode incremental import on a timestamp column

# Randomize last_update in actor_new so the timestamps are no longer all identical

UPDATE actor_new SET last_update = DATE_ADD(last_update, INTERVAL (FLOOR(RAND()*199+1)) DAY);

# First (full) import

[root@node222 ~]# /usr/local/sqoop-1.4.7/bin/sqoop import --connect jdbc:mysql://192.168.0.200:3306/sakila?useSSL=false --table actor_new --where "last_update < '2006-04-25 04:34:33'" --username sakila -P --num-mappers 1 --target-dir /tmp/hive/sqoop/actor_lastmodified --delete-target-dir

...

18/10/15 14:57:23 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7

Enter password:

...

18/10/15 14:57:42 INFO mapreduce.Job: Running job: job_1539583112983_0004

18/10/15 14:58:01 INFO mapreduce.Job: Job job_1539583112983_0004 running in uber mode : false

18/10/15 14:58:01 INFO mapreduce.Job: map 0% reduce 0%

18/10/15 14:58:22 INFO mapreduce.Job: map 100% reduce 0%

18/10/15 14:58:23 INFO mapreduce.Job: Job job_1539583112983_0004 completed successfully

...

18/10/15 14:58:23 INFO mapreduce.ImportJobBase: Transferred 2.6592 KB in 32.9053 seconds (82.7527 bytes/sec)

18/10/15 14:58:23 INFO mapreduce.ImportJobBase: Retrieved 69 records.

# When importing with the lastmodified incremental mode, the check column must be a
# timestamp column; rows whose check column is newer than --last-value are imported.
# If the target directory already exists, Sqoop also requires either --append or
# --merge-key to decide how the new rows are combined with the existing data.
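A lastmodified-mode run following the first import might look like the following sketch. The `--last-value` mirrors the WHERE cutoff of the first import above; `--merge-key actor_id` assumes `actor_id` is the primary key of `actor_new`:

```shell
# lastmodified mode: fetch rows whose last_update is newer than --last-value.
# --merge-key runs a merge job that deduplicates updated rows by primary key,
# so a row modified since the first import replaces its old copy in HDFS.
/usr/local/sqoop-1.4.7/bin/sqoop import \
  --connect jdbc:mysql://192.168.0.200:3306/sakila?useSSL=false \
  --table actor_new --username sakila -P \
  --incremental lastmodified --check-column last_update \
  --last-value '2006-04-25 04:34:33' \
  --merge-key actor_id \
  --num-mappers 1 --target-dir /tmp/hive/sqoop/actor_lastmodified
```

On completion, Sqoop prints the `--last-value` to supply for the next incremental run (in lastmodified mode this is the time the import started, not the maximum value seen).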
