sqoop job入门 与sqoop增量导入

sqoop job命令选项

sqoop job可以封装一些简单的代码,方便调用

可以创建,删除,执行,列出job,以及查看job.

Job management arguments:
   --create <job-id>            Create a new saved job
   --delete <job-id>            Delete a saved job
   --exec <job-id>              Run a saved job
   --help                       Print usage instructions
   --list                       List saved jobs
   --meta-connect <jdbc-uri>    Specify JDBC connect string for the
                                metastore
   --show <job-id>              Show the parameters for a saved job
   --verbose                    Print more information while working

比如创建一个mysql导入hdfs的job. 利用sqoop --create job1 就可以创建一个job.

sqoop job --create job1 \
-- import \
--connect jdbc:mysql://mypc:3306/sz \
--username root \
--password 123456 \
--table emp \
--target-dir "/0909/job" \
--delete-target-dir \
-m 1

执行job,只需要将上面建立job的名字写在后面即可.

sqoop job --exec job1

查看job执行结果


[conf]# hdfs dfs -cat /0909/job/*
7369,SMITH,CLERK,7902,1980-12-17,800,null,20
7499,ALLEN,SALESMAN,7698,1981-02-20,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02,2975,null,20
7654,MARTIN,SALESMAN,7698,1981-09-28,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01,2850,null,30
7782,CLARK,MANAGER,7839,1981-06-09,2450,null,10
7788,SCOTT,ANALYST,7566,1987-04-19,3000,null,20
7839,KING,PRESIDENT,null,1981-11-17,5000,null,10
7844,TURNER,SALESMAN,7698,1981-09-08,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23,1100,null,20
7900,JAMES,CLERK,7698,1981-12-03,950,null,30
7902,FORD,ANALYST,7566,1981-12-03,3000,null,20
7934,MILLER,null,7782,1982-01-23,1300,null,10

查看job信息.这样就可以看到job的各种信息了.

sqoop job --show job1
verbose = false
hcatalog.drop.and.create.table = false
db.connect.string = jdbc:mysql://mypc:3306/sz
codegen.output.delimiters.escape = 0
codegen.output.delimiters.enclose.required = false
codegen.input.delimiters.field = 0
split.limit = null
hbase.create.table = false
mainframe.input.dataset.type = p
db.require.password = true
skip.dist.cache = false
hdfs.append.dir = false
db.table = emp
codegen.input.delimiters.escape = 0
accumulo.create.table = false
import.fetch.size = null
codegen.input.delimiters.enclose.required = false
db.username = root
reset.onemapper = false
codegen.output.delimiters.record = 10
import.max.inline.lob.size = 16777216
sqoop.throwOnError = false
hbase.bulk.load.enabled = false
hcatalog.create.table = false
db.clear.staging.table = false
codegen.input.delimiters.record = 0
enable.compression = false
hive.overwrite.table = false
hive.import = false
codegen.input.delimiters.enclose = 0
accumulo.batch.size = 10240000
hive.drop.delims = false
customtool.options.jsonmap = {}
codegen.output.delimiters.enclose = 0
hdfs.delete-target.dir = true
codegen.output.dir = .
codegen.auto.compile.dir = true
relaxed.isolation = false
mapreduce.num.mappers = 1
accumulo.max.latency = 5000
import.direct.split.size = 0
sqlconnection.metadata.transaction.isolation.level = 2
codegen.output.delimiters.field = 44
export.new.update = UpdateOnly
incremental.mode = None
hdfs.file.format = TextFile
sqoop.oracle.escaping.disabled = true
codegen.compile.dir = /tmp/sqoop-root/compile/649e51434cdd01f9036e1a77885d14af
direct.import = false
temporary.dirRoot = _sqoop
hdfs.target.dir = /0909/job
hive.fail.table.exists = false
db.batch = false

删除job

sqoop job --delete job1

再次查看job1就会报错
在这里插入图片描述

sqoop增量导入

增量导入job案例
因为是增量导入,所以不需要–delete-target-dir \ 这个参数

sqoop job --create job2 \
-- import \
--connect jdbc:mysql://mypc:3306/sz \
--username root \
--password 123456 \
--table emp \
--incremental append \
--check-column empno \
--target-dir "/0910/job" \
--fields-terminated-by ',' \
-m 1

查看数据

# hdfs dfs -cat /0910/job/*

在mysql中插入一些数据,再次执行增量导入. 就只会导入新增的数据.判断是否新增依据–check-column empno 所在列的值. job会记住该列的最大值.下一次再次运行job会从这个最大值所在行开始,该值之前的数据不会被重复导入.

总结

  • sqoop job命令可以创建,删除,保存job
  • –incremental append 参数结合–check-column 可以实现mysql向hdfs中增量导入数据,这样更贴近实际应用.
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值