一. Downloading and installing Sqoop
1. Download from: http://ftp.wayne.edu/apache/sqoop/ (the Sqoop 1 line is recommended)
2. Upload the downloaded sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz to the target directory on the server
3. Unpack the archive and prepare the configuration files:
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C app/
cd app/
mv sqoop-1.4.7.bin__hadoop-2.6.0/ sqoop-1.4.7
cd sqoop-1.4.7/conf/
mv sqoop-env-template.sh sqoop-env.sh
vi sqoop-env.sh
4. Edit sqoop-env.sh so that the variables point to the corresponding installation directories, then copy the MySQL JDBC driver jar from the Hive lib directory into Sqoop's lib directory
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/***/***/***/hadoop-2.7.2
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/***/***/***/hadoop-2.7.2
#set the path to where bin/hbase is available
#export HBASE_HOME=/user
#Set the path to where bin/hive is available
#export HIVE_HOME=
export HIVE_HOME=/***/***/apache-hive-1.2.1-bin
#Set the path for where zookeper config dir is
#export ZOOCFGDIR=/user
cp /***/***/apache-hive-1.2.1-bin/lib/mysql-connector-java-5.1.34-bin.jar /***/***/***/sqoop-1.4.7/lib
5. Test the installation
sqoop version
If warnings appear about missing HBase, ZooKeeper, etc. directories, they can be ignored for now
6. Example: import MySQL data into HDFS
sqoop import -m 1 --connect jdbc:mysql://<ip>/<mysql_database> --username <username> --password '<password>' --table <mysql_table> --target-dir /***/***/***/***
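For repeated imports, the command above can be wrapped in a small script so the connection details live in variables. Everything below (host, database, user, table, paths) is a hypothetical placeholder, and the script only prints the assembled command rather than executing it, so it can be reviewed first:

```shell
#!/bin/sh
# Hypothetical connection settings -- replace with your own values.
MYSQL_HOST="192.0.2.10"          # placeholder IP
MYSQL_DB="mydb"                  # placeholder database
MYSQL_USER="etl"                 # placeholder user
MYSQL_TABLE="orders"             # placeholder table
TARGET_DIR="/user/etl/orders"    # placeholder HDFS path

# Assemble the sqoop import command; it is echoed instead of executed
# so it can be inspected (drop the echo and run "$CMD" to import for real).
CMD="sqoop import -m 1 \
  --connect jdbc:mysql://${MYSQL_HOST}/${MYSQL_DB} \
  --username ${MYSQL_USER} --password-file /user/etl/.mysql.pw \
  --table ${MYSQL_TABLE} --target-dir ${TARGET_DIR}"

echo "$CMD"
```

Here --password-file (a standard Sqoop option) keeps the password out of the shell history; the plain --password form shown above works just as well.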
7. Import a MySQL table into a Hive table (the Hive table is created if it does not exist; with --create-hive-table the job fails if it already exists)
sqoop import --connect jdbc:mysql://<ip>/<mysql_database> \
--username <username> --password '<password>' --table <mysql_table> \
--hive-import \
--hive-database <hive_database> \
--create-hive-table \
--fields-terminated-by '\t' \
--null-string '\\N' \
--null-non-string '\\N' \
--split-by id \
-m 1
8. Overwrite the corresponding Hive table with the MySQL table
sqoop import --connect jdbc:mysql://<ip>/<mysql_database> --username <username> --password '<password>' --table <mysql_table> --hive-import --target-dir temp -m 1 --hive-overwrite
9. Incrementally append new MySQL rows to HDFS
sqoop import \
--connect jdbc:mysql://47.94.**.**:***/wesurvey \
--username *** \
--password '***' \
--table vehicle \
--check-column CreatedAt \
--incremental lastmodified \
--last-value "'$date'" \
--as-textfile \
--target-dir /user/hive/warehouse/dms/dms_vehicle \
--fields-terminated-by "\001" \
-m 1 \
--append
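A sketch of how the $date variable used in --last-value might be produced, assuming the job runs once a day and GNU date is available; the format must match the timestamp format of the CreatedAt column:

```shell
# Yesterday at midnight, formatted as a MySQL DATETIME (GNU date assumed).
date=$(date -d '-1 day' '+%Y-%m-%d 00:00:00')
echo "$date"

# The value is then passed as --last-value "'$date'" (as in the command
# above); rows with CreatedAt newer than this value are picked up on the
# next run, and --append writes them as new files under --target-dir.
```

In practice the real "last value" for the next run is printed at the end of each Sqoop incremental job, so a production script would usually persist that printed value instead of recomputing it from the calendar.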