Importing data from MySQL into Hive with Sqoop:
The command is as follows:
sqoop import \
--connect jdbc:mysql://localhost:3306/ruoze_g6 \
--username root --password 960210 \
--delete-target-dir \
--table city_info \
--hive-import \
--hive-table city_info \
--hive-overwrite \
--fields-terminated-by '\t' \
--lines-terminated-by '\n' \
--split-by city_id \
-m 2
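The --fields-terminated-by '\t' and --lines-terminated-by '\n' flags control the row layout of the files Sqoop writes; the Hive table must use the same delimiters or the columns will not split apart. A minimal local sketch (no cluster needed; the sample row is made up) of how a tab-delimited row splits back into columns:

```shell
# One row as Sqoop would write it with --fields-terminated-by '\t'
printf '1\tbeijing1\tNC\n' > /tmp/city_info_part.txt

# Hive's default text SerDe splits on the same delimiter; awk mimics that here,
# pulling out the second column (city_name)
awk -F'\t' '{ print $2 }' /tmp/city_info_part.txt
```

If the Hive table had been declared with a different delimiter, every row would land in the first column and the rest would be NULL.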
Environment:
- Hive, Sqoop, and MySQL deployed on a single-node HDFS setup
First attempt (failed):
The job aborted because the JSONObject class (java-json.jar) was missing.
Fix:
- Upload java-json.jar into $SQOOP_HOME/lib:
[hadoop@hadoop004 lib]$ ll java-json.jar
-rw-r--r-- 1 hadoop hadoop 84697 Oct 16 2013 java-json.jar
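When an error names a missing class rather than a jar, it helps to search the lib directories for whichever jar actually ships that class. A hedged fragment (paths assumed; only meaningful on the cluster node):

```shell
# Scan every jar under $SQOOP_HOME/lib for the JSONObject class
# (unzip -l lists a jar's contents without extracting it)
for j in "$SQOOP_HOME"/lib/*.jar; do
  unzip -l "$j" 2>/dev/null | grep -q 'org/json/JSONObject.class' && echo "$j"
done
```

The same loop, pointed at $HIVE_HOME/lib, is how you can track down the jars behind the Hive class errors later in this post.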
Second attempt (still failed):
- java.net.ConnectException: to 0.0.0.0:10020 failed on connection
Searched online for a fix: add the following to mapred-site.xml under $HADOOP_HOME/etc/hadoop:
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>0.0.0.0:10020</value>
</property>
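Setting mapreduce.jobhistory.address only tells clients where to look; in my experience a ConnectException to port 10020 usually means the JobHistory server itself is not running. A hedged fragment for a Hadoop 2.x node (paths assumed):

```shell
# Start the MapReduce JobHistory server (ships with Hadoop 2.x)
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver

# Confirm something is now listening on port 10020
netstat -tln | grep 10020
```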
Third attempt (still failed):
Key lines from the error output:
ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
Fix:
1. Edit ~/.bash_profile (vi ~/.bash_profile) and add:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
2. Reload it:
source ~/.bash_profile
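To confirm the new entry actually reaches Hadoop's runtime classpath, the `hadoop classpath` subcommand prints the effective value; a hedged fragment (assumes HIVE_HOME is set on the node):

```shell
# The Hive lib entry should now appear in the effective classpath
hadoop classpath | tr ':' '\n' | grep "$HIVE_HOME/lib"
```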
- Following the posts found online, also copied $HIVE_HOME/lib/hive-common-1.1.0-cdh5.7.0.jar into $SQOOP_HOME/lib
Fourth attempt (still failed):
19/09/12 21:07:09 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
19/09/12 21:07:20 INFO mapreduce.ImportJobBase: Transferred 129 bytes in 177.6484 seconds (0.7262 bytes/sec)
19/09/12 21:07:20 INFO mapreduce.ImportJobBase: Retrieved 10 records.
19/09/12 21:07:22 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `city_info` AS t LIMIT 1
19/09/12 21:07:23 INFO hive.HiveImport: Loading uploaded data into Hive
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/shims/ShimLoader
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:358)
at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.shims.ShimLoader
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
- After copying hive-exec-1.1.0-cdh5.7.0.jar into $SQOOP_HOME/lib as well, tested again; first verify the jar is in place:
[hadoop@hadoop004 lib]$ ll hive-exec-1.1.0-cdh5.7.0.jar
-rw-r--r-- 1 hadoop hadoop 19272159 Sep 12 21:13 hive-exec-1.1.0-cdh5.7.0.jar
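The copy step itself, for the record; as far as I can tell, the hive-exec jar in this CDH build bundles the missing org.apache.hadoop.hive.shims.ShimLoader class (a fragment run on the cluster node, paths assumed):

```shell
# Copy the jar that provides ShimLoader into Sqoop's lib dir
cp $HIVE_HOME/lib/hive-exec-1.1.0-cdh5.7.0.jar $SQOOP_HOME/lib/
```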
Fifth attempt (succeeded):
- Startup log:
19/09/12 21:18:39 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
19/09/12 21:19:01 INFO mapreduce.ImportJobBase: Transferred 129 bytes in 266.9711 seconds (0.4832 bytes/sec)
19/09/12 21:19:01 INFO mapreduce.ImportJobBase: Retrieved 10 records.
19/09/12 21:19:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `city_info` AS t LIMIT 1
19/09/12 21:19:03 INFO hive.HiveImport: Loading uploaded data into Hive
19/09/12 21:19:14 INFO hive.HiveImport: which: no hbase in (/home/hadoop/app/protobuf-2.5.0/bin:/home/hadoop/app/sqoop-1.4.6-cdh5.7.0/bin:/home/hadoop/app/apache-maven-3.5.4/bin:/home/hadoop/app/scala-2.11.12/bin:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/sbin:/home/hadoop/app/zookeeper-3.4.6/bin:/usr/java/jdk1.8.0_45/bin:/home/hadoop/app/hive/bin:/home/hadoop/app/zookeeper-3.4.6/bin:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/sbin:/home/hadoop/app/hive/bin:/usr/java/jdk1.7.0_45/bin:/usr/java/jdk1.7.0_45/jre/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/curl/bin)
19/09/12 21:19:17 INFO hive.HiveImport:
19/09/12 21:19:17 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/home/hadoop/app/sqoop-1.4.6-cdh5.7.0/lib/hive-common-1.1.0-cdh5.7.0.jar!/hive-log4j.properties
19/09/12 21:19:44 INFO hive.HiveImport: OK
19/09/12 21:19:44 INFO hive.HiveImport: Time taken: 6.317 seconds
19/09/12 21:19:45 INFO hive.HiveImport: Loading data to table default.city_info
19/09/12 21:19:55 INFO hive.HiveImport: Table default.city_info stats: [numFiles=2, numRows=0, totalSize=129, rawDataSize=0]
19/09/12 21:19:55 INFO hive.HiveImport: OK
19/09/12 21:19:55 INFO hive.HiveImport: Time taken: 11.884 seconds
19/09/12 21:19:56 INFO hive.HiveImport: Hive import complete.
19/09/12 21:19:57 INFO hive.HiveImport: Export directory is contains the _SUCCESS file only, removing the directory.
The key line is:
19/09/12 21:19:45 INFO hive.HiveImport: Loading data to table default.city_info
That is, the data was loaded into the city_info table of the default database.
We then query the city_info table under Hive's default database to verify:
mysql> select * from city_info;
+---------+-----------+------+
| city_id | city_name | area |
+---------+-----------+------+
| 1 | beijing1 | NC |
| 2 | SHANGHAI | EC |
| 3 | NANJING | EC |
| 4 | GUANGZHOU | SC |
| 5 | SANYA | SC |
| 6 | WUHAN | CC |
| 7 | CHANGSHA | CC |
| 8 | XIAN | NW |
| 9 | CHENGDU | SW |
| 10 | HAERBIN | NE |
+---------+-----------+------+
10 rows in set (0.04 sec)
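Beyond eyeballing the rows, two quick checks confirm the import landed where expected; a hedged fragment (the warehouse path is the CDH default and may differ on other setups):

```shell
# The files Sqoop/Hive produced for the table (numFiles=2 in the log above)
hdfs dfs -ls /user/hive/warehouse/city_info

# Row count straight from Hive; should report 10
hive -e 'select count(*) from default.city_info'
```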
References:
- https://blog.csdn.net/qq_36168479/article/details/84970352 (this post's problem was closest to mine)
- https://www.cnblogs.com/zll20153246/p/9345921.html (this post was also helpful)