Sqoop

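Sqoop is a tool for moving data between HDFS (or Hive and HBase) and relational databases (this post uses MySQL as the example). It can export data from HDFS into a relational database such as MySQL, and it can import data from a relational database into HDFS; "import" and "export" are both named from the HDFS point of view. Under the hood Sqoop runs a MapReduce job with map tasks only and no reduce phase. When importing from MySQL into a Hive table, the data is first written to the /user/<username> directory on HDFS and then loaded into the Hive table.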
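To make the import path described above concrete, here is a minimal sketch of a MySQL-to-Hive import. The table name `emp` and the Hive table `emp_import` are hypothetical placeholders, and `hwzdb` is assumed as the database only because it appears in the list-databases output in the setup section. The `run` helper and the `DRY_RUN` switch are also illustrative: with `DRY_RUN=1` (the default in this sketch) the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch only: import one MySQL table into Hive with Sqoop 1.4.x.
# emp and emp_import are hypothetical names; adjust them to your schema.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, anything else = run it

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

import_to_hive() {
  # -P prompts for the password instead of exposing it on the command line.
  # -m 1 uses a single map task, so no --split-by column is needed.
  # --delete-target-dir clears an HDFS staging directory left by an earlier run.
  run sqoop import \
    --connect jdbc:mysql://localhost:3306/hwzdb \
    --username root \
    -P \
    --table emp \
    --delete-target-dir \
    --hive-import \
    --hive-table emp_import \
    -m 1
}

import_to_hive
```

Setting `DRY_RUN=0` runs the import for real. While the job is running, the staging data should show up under the importing user's home directory on HDFS (for the `hadoop` user in this post, somewhere below /user/hadoop) before Hive loads it.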

Setting up Sqoop:
1. Download the Sqoop release that matches your CDH version, extract it, then rename the directory or create a symlink.
[hadoop@hadoop001 software]$ tar -zxvf sqoop-1.4.6-cdh5.7.0.tar.gz -C ~/app
[hadoop@hadoop001 app]$ mv sqoop-1.4.6-cdh5.7.0/ ./sqoop-1.4.6
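2. Set the Sqoop environment variables (SQOOP_HOME and PATH in .bash_profile), then reload the profile.
3. In the conf directory, copy sqoop-env-template.sh to sqoop-env.sh, then edit the copy.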
[hadoop@hadoop001 conf]$ cp sqoop-env-template.sh sqoop-env.sh
[hadoop@hadoop001 conf]$ vi sqoop-env.sh
# Set path to where bin/hadoop is available (the Hadoop install directory)
export HADOOP_COMMON_HOME=/home/hadoop/app/hadoop
# Set path to where hadoop-*-core.jar is available (the Hadoop install directory)
export HADOOP_MAPRED_HOME=/home/hadoop/app/hadoop
# Set the path to where bin/hbase is available (not configured for now)
#export HBASE_HOME=
# Set the path to where bin/hive is available (the Hive install directory)
export HIVE_HOME=/home/hadoop/app/hive
# Set the path for where zookeeper config dir is (not configured for now)
#export ZOOCFGDIR=
4. Sqoop talks to MySQL over JDBC, so it needs the MySQL driver jar; copy the jar into Sqoop's lib directory.
5. Done. Run a quick test:
[hadoop@hadoop001 conf]$ sqoop list-databases \
--connect jdbc:mysql://localhost:3306 \
--username root \
--password 123456
Warning: /home/hadoop/app/sqoop-1.4.6/…/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/hadoop/app/sqoop-1.4.6/…/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/app/sqoop-1.4.6/…/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/hadoop/app/sqoop-1.4.6/…/zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
19/07/21 12:56:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.0
19/07/21 12:56:01 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/07/21 12:56:02 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
hivemetadatadb
hwzdb
mysql
performance_schema
test
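
Step 2 above does not show the lines that go into .bash_profile. A minimal sketch, assuming the install directory from step 1 (~/app/sqoop-1.4.6, which matches the path in the warnings printed by the test run):

```shell
# Lines one might add to ~/.bash_profile for step 2.
# Assumes Sqoop was extracted and renamed to ~/app/sqoop-1.4.6 as in step 1.
export SQOOP_HOME="$HOME/app/sqoop-1.4.6"
export PATH="$SQOOP_HOME/bin:$PATH"
```

After saving the file, `source ~/.bash_profile` reloads it in the current shell, and `sqoop version` is a quick way to confirm that the `sqoop` command resolves.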

Common Sqoop arguments:

Common Sqoop errors:
1. Error when importing data from MySQL into a Hive table
Exception in thread "main" java.lang.NoClassDefFoundError: org/json/JSONObject
        at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:42)
        at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:742)
        at org.apache.sqoop.mapreduce.JobBase.putSqoopOptionsToConfiguration(JobBase.java:369)
        at org.apache.sqoop.mapreduce.JobBase.createJob(JobBase.java:355)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:249)
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
        at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.lang.ClassNotFoundException: org.json.JSONObject
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 15 more

Solution: get a copy of java-json.jar and put it in sqoop/lib.
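
The fix above can be sketched as a small helper. The function name `install_json_jar` and the location you downloaded the jar to are assumptions; only the destination (Sqoop's lib directory) comes from the fix itself.

```shell
#!/bin/sh
# Sketch: copy a downloaded java-json.jar into Sqoop's lib directory.
# Where the jar was saved is up to you; pass that path as the first argument.

install_json_jar() {
  jar_path="$1"      # path to the downloaded java-json.jar
  sqoop_home="$2"    # Sqoop install directory, e.g. "$SQOOP_HOME"

  if [ ! -f "$jar_path" ]; then
    echo "jar not found: $jar_path" >&2
    return 1
  fi
  if [ ! -d "$sqoop_home/lib" ]; then
    echo "no lib directory under: $sqoop_home" >&2
    return 1
  fi
  cp "$jar_path" "$sqoop_home/lib/"
}
```

For example, `install_json_jar ./java-json.jar "$SQOOP_HOME"`, then rerun the import that failed.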

2. HiveConf cannot be found
19/07/25 13:59:33 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
19/07/25 13:59:33 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
        at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
        at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
        at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
        at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
        at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
        ... 12 more
Solution: copy hive-common-1.1.0-cdh5.7.0.jar and hive-shims*.jar from hive/lib into sqoop/lib.
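
A minimal sketch of that copy step, written as a helper so it can be reused on other machines. The function name `copy_hive_jars` is hypothetical, and the `hive-common-*.jar` glob is a generalization: on the setup in this post it should match hive-common-1.1.0-cdh5.7.0.jar.

```shell
#!/bin/sh
# Sketch: copy the Hive jars that Sqoop needs into Sqoop's lib directory.

copy_hive_jars() {
  hive_home="$1"     # Hive install directory, e.g. /home/hadoop/app/hive
  sqoop_home="$2"    # Sqoop install directory, e.g. "$SQOOP_HOME"

  # cp fails (and the function returns 1) if a glob matches nothing.
  cp "$hive_home"/lib/hive-common-*.jar "$sqoop_home/lib/" || return 1
  cp "$hive_home"/lib/hive-shims*.jar "$sqoop_home/lib/" || return 1
}
```

With the paths from sqoop-env.sh this would be called as `copy_hive_jars /home/hadoop/app/hive "$SQOOP_HOME"`, followed by rerunning the Hive import.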