1. Install Sqoop:
1.1 Extract Sqoop and configure the environment variables:
tar -zxvf /bigdata/sqoop-1.4.6-cdh5.13.2.tar.gz -C /apps
Configure the environment variables:
vi /etc/profile
export SQOOP_HOME=/apps/sqoop-1.4.6-cdh5.13.2
export PATH=$PATH:$SQOOP_HOME/bin
Reload the environment and verify:
source /etc/profile
which sqoop
1.2 Configure Sqoop's environment file:
mv /apps/sqoop-1.4.6-cdh5.13.2/conf/sqoop-env-template.sh /apps/sqoop-1.4.6-cdh5.13.2/conf/sqoop-env.sh
vi /apps/sqoop-1.4.6-cdh5.13.2/conf/sqoop-env.sh
export HADOOP_COMMON_HOME=/apps/hadoop-2.6.0-cdh5.13.2
export HADOOP_MAPRED_HOME=/apps/hadoop-2.6.0-cdh5.13.2
export HBASE_HOME=/apps/hbase-1.2.0-cdh5.13.2
export HIVE_HOME=/apps/hive-1.1.0-cdh5.13.2
#export ZOOCFGDIR=
1.3 Copy the MySQL driver jar into Sqoop's lib directory:
cp /bigdata/mysql-connector-java-5.1.32.jar /apps/sqoop-1.4.6-cdh5.13.2/lib
1.4 Test:
sqoop version
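Besides `sqoop version`, it can help to sanity-check that each `*_HOME` directory configured above actually exists. A minimal sketch (`check_home` is a helper written for this note, not a Sqoop tool; the default paths are the ones used in this guide):

```shell
# Sketch: verify that each *_HOME variable points at a real directory.
check_home() {
  local name="$1" path="$2"
  if [ -d "$path" ]; then
    echo "$name OK: $path"
  else
    echo "$name MISSING: $path"
  fi
}

check_home SQOOP_HOME         "${SQOOP_HOME:-/apps/sqoop-1.4.6-cdh5.13.2}"
check_home HADOOP_COMMON_HOME "${HADOOP_COMMON_HOME:-/apps/hadoop-2.6.0-cdh5.13.2}"
check_home HIVE_HOME          "${HIVE_HOME:-/apps/hive-1.1.0-cdh5.13.2}"
```

Any `MISSING` line means the corresponding export in /etc/profile or sqoop-env.sh needs correcting before imports will work.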
2. Download the driver jars: mysql-connector-java-5.1.31.jar (MySQL), ojdbc6.jar (Oracle), and sqljdbc_4.2.8112.200_enu.tar.gz; extracting the tarball yields sqljdbc42.jar (SQL Server).
Copy the driver jars into $HADOOP_HOME/lib, $HADOOP_HOME/share/hadoop/common/lib, and $SQOOP_HOME/lib.
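Since every driver must land in several directories, the copy step can be scripted. A small sketch (`deploy_jar` is a helper invented here, not a Sqoop or Hadoop command; the example paths match this guide's layout):

```shell
# Sketch: copy a JDBC driver jar into each classpath directory,
# creating any directory that does not exist yet.
deploy_jar() {
  local jar="$1"; shift
  for dir in "$@"; do
    mkdir -p "$dir" && cp "$jar" "$dir/"
  done
}

# On the cluster node, for example:
#   deploy_jar /bigdata/mysql-connector-java-5.1.31.jar \
#     "$HADOOP_HOME/lib" \
#     "$HADOOP_HOME/share/hadoop/common/lib" \
#     "$SQOOP_HOME/lib"
```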
3. Import examples:
Import from MySQL into HDFS:
$SQOOP_HOME/bin/sqoop import \
--connect jdbc:mysql://mini01:3306/sales_source \
--username root \
--password-file /sqoop/pwd/test_sqoopPWD.pwd \
--table customer \
--delete-target-dir \
--target-dir /user/hive/warehouse/sales_rds.db/customer \
--fields-terminated-by '\001'
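The `--password-file` option reads the entire file content as the password, so the file must not end with a newline. A sketch of creating one (the filename matches the example above; the password value is a placeholder):

```shell
# Sketch: create a Sqoop password file with no trailing newline
# (echo would append one; printf '%s' does not).
printf '%s' 'MySecretPwd' > /tmp/test_sqoopPWD.pwd
chmod 400 /tmp/test_sqoopPWD.pwd

# On a real cluster, publish it to the HDFS path --password-file points at:
#   hdfs dfs -mkdir -p /sqoop/pwd
#   hdfs dfs -put /tmp/test_sqoopPWD.pwd /sqoop/pwd/
#   hdfs dfs -chmod 400 /sqoop/pwd/test_sqoopPWD.pwd
```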
Import from SQL Server into HDFS:
Single-table import into Hive:
bin/sqoop import \
--connect 'jdbc:sqlserver://192.**.*.5;username=zhangsan;password=123456;database=db1' \
--table tb1 \
--create-hive-table \
--hive-database "hdb1" \
--hive-import \
--fields-terminated-by '\001' \
-m 1
Single-table import into HDFS:
bin/sqoop import \
--connect 'jdbc:sqlserver://192.**.*.5;username=zhangsan;password=123456;database=db1' \
--table tb1 \
--target-dir /data/Sqlserverdata \
--fields-terminated-by '\001' \
--autoreset-to-one-mapper \
--null-string '\\N' \
--null-non-string '\\N' \
-m 2
Import all tables in the database:
bin/sqoop import-all-tables \
--connect 'jdbc:sqlserver://192.**.*.5;username=zhangsan;password=123456;database=db1' \
--create-hive-table \
--hive-database "hdb1" \
--hive-import \
--fields-terminated-by '\001' \
--autoreset-to-one-mapper \
-m 2
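The `--fields-terminated-by '\001'` and `--null-string '\\N'` options used above produce files whose fields are separated by the ASCII SOH control byte (Hive's default delimiter) and whose SQL NULLs appear as the literal `\N` (Hive's default null marker). A quick local demonstration of how such a row splits (the row content is made up):

```shell
# Simulate one output row: id, name, email(NULL), separated by \001 (SOH).
printf '1\001Alice\001\\N\n' > /tmp/demo_row.txt

# awk can split on the same control character; field 3 holds the NULL marker.
awk -F '\001' '{print "name=" $2 ", email=" $3}' /tmp/demo_row.txt
```

Because Hive expects exactly these defaults, tables created over such files need no ROW FORMAT clause to read the data correctly.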
Import from Oracle into HDFS:
Import into a target directory:
bin/sqoop import \
--connect jdbc:oracle:thin:@192.**.*.5:1521:orcl \
--username zhangsan \
--password 123456 \
--delete-target-dir \
--table tb1 \
--target-dir /data/hdb1 \
--null-string '\\N' \
--null-non-string '\\N' \
-m 1 \
--fields-terminated-by '\001'
Import into Hive:
bin/sqoop import \
--connect jdbc:oracle:thin:@192.**.*.5:1521:orcl \
--username zhangsan \
--password 123456 \
--table tb1 \
--create-hive-table \
--hive-database "hdb1" \
--hive-import \
-m 1 \
--fields-terminated-by '\001'
Import the whole database:
bin/sqoop import-all-tables \
--connect jdbc:oracle:thin:@192.**.*.5:1521:orcl \
--username zhangsan \
--password 123456 \
--create-hive-table \
--hive-database "hdb1" \
--hive-import \
-m 1 \
--null-string '\\N' \
--null-non-string '\\N' \
--fields-terminated-by '\001';
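For reference, the connect strings in the examples above follow a different JDBC URL scheme per database (hosts, ports, SIDs, and database names are the placeholders used in this guide):

```shell
# MySQL:      jdbc:mysql://<host>:3306/<database>
# SQL Server: jdbc:sqlserver://<host>;username=<u>;password=<p>;database=<db>
# Oracle:     jdbc:oracle:thin:@<host>:1521:<SID>
```

Note that the Oracle thin-driver URL uses `@host:port:SID`, not the `//host` form of the other two.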