下载
wget http://mirror.bit.edu.cn/apache/sqoop/1.99.7/sqoop-1.99.7-bin-hadoop200.tar.gz
tar -zxvf sqoop-1.99.7-bin-hadoop200.tar.gz
mv sqoop-1.99.7-bin-hadoop200 sqoop
配置环境
export HADOOP_HOME=/home/xiaobin/soft/hadoop-2.6.0
export HADOOP_COMMON_HOME=$HADOOP_HOME/share/hadoop/common
export HADOOP_HDFS_HOME=$HADOOP_HOME/share/hadoop/hdfs
export HADOOP_MAPRED_HOME=$HADOOP_HOME/share/hadoop/mapreduce
export HADOOP_YARN_HOME=$HADOOP_HOME/share/hadoop/yarn
export SQOOP_SERVER_EXTRA_LIB=/home/xiaobin/soft/sqoop/sqoop_lib
修改core-site.xml
特别要注意的是sqoop2是提交任务的用户名,这里要改成你提交任务的用户,否则会报
org.apache.hadoop.security.authorize.AuthorizationException: User: xbuser is not allowed to impersonate xbuser
<property>
<name>hadoop.proxyuser.sqoop2.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.sqoop2.groups</name>
<value>*</value>
</property>
配置hadoopp配置文件路径
vi conf/sqoop.properties
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/home/xiaobin/soft/hadoop-2.6.0/etc/hadoop
创建第三方包路径
mkdir -p /home/xiaobin/soft/sqoop/sqoop_lib
cp mysql-connector-java-5.1.35.jar /home/xiaobin/soft/sqoop/sqoop_lib/
export SQOOP_SERVER_EXTRA_LIB=/home/xiaobin/soft/sqoop/sqoop_lib
初始化仓库
sqoop2-tool upgrade
Setting conf dir: /home/xiaobin/soft/sqoop/bin/../conf
Sqoop home directory: /home/xiaobin/soft/sqoop
Sqoop tool executor:
Version: 1.99.7
Revision: 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
Compiled on Tue Jul 19 16:08:27 PDT 2016 by abefine
Running tool: class org.apache.sqoop.tools.tool.UpgradeTool
0 [main] INFO org.apache.sqoop.core.PropertiesConfigurationProvider - Starting config file poller thread
Tool class org.apache.sqoop.tools.tool.UpgradeTool has finished correctly.
启动server
sqoop2-server start
Starting the Sqoop2 server...
0 [main] INFO org.apache.sqoop.core.SqoopServer - Initializing Sqoop server.
4 [main] INFO org.apache.sqoop.core.PropertiesConfigurationProvider - Starting config file poller thread
Sqoop2 server started.
查看进程SqoopJettyServer,默认监听12000端口
jps
11987 NailgunRunner
19013 Launcher
2991 Main
20093 NameNode
20653 ResourceManager
20238 DataNode
3194 RemoteMavenServer
23093 SqoopJettyServer
20796 NodeManager
23186 Jps
20474 SecondaryNameNode
启动client
sqoop2-shell
set server --host localhost --port 12000 --webapp sqoop
创建Link(mysql link ,hdfs link)
#MySql Link
sqoop:000> create link -connector generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: First Link
Database connection
Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://localhost:3306/test
Username: root
Password: ******
Fetch Size: 100
Connection Properties:
There are currently 0 values in the map:
entry#
SQL Dialect
Identifier enclose: 注意 这里不能直接回车!要打一个空格符号!因为如果不打,查询mysql表的时候会在表上加上“”,导致查询出错!
New link was successfully created with validation status OK and name First Link
#HDFS Link
sqoop:000> create link -connector hdfs-connector
Creating link for connector with name hdfs-connector
Please fill following values to create new link object
Name: Second Link
HDFS cluster
URI: hdfs://xiaobin:9000/
Conf directory:
Additional configs::
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and name Second Link
创建job
sqoop:000> create job -f "First Link" -t "Second Link"
Creating job for links with from name First Link and to name Second Link
Please fill following values to create new job object
Name: Sqoopy
Database source
Schema name: test
Table name: usert
SQL statement:
Column names:
There are currently 0 values in the list:
element#
Partition column:
Partition column nullable:
Boundary query:
Incremental read
Check column:
Last value:
Target configuration
Override null value:
Null value:
File format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
2 : PARQUET_FILE
Choose: 0
Compression codec:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0
Custom codec:
Output directory: /test/sqoop
Append mode:
Throttling resources
Extractors: 2
Loaders: 2
Classpath configuration
Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name Sqoopy
启动job
sqoop:000> start job -name Sqoopy
Submission details
Job Name: Sqoopy
Server URL: http://localhost:12000/sqoop/
Created by: xiaobin
Creation date: 2018-06-12 15:37:48 CST
Lastly updated by: xiaobin
External ID: job_1528789035231_0001
http://localhost:8088/proxy/application_1528789035231_0001/
2018-06-12 15:37:48 CST: BOOTING - Progress is not available
查看生成的文件
xiaobin@xiaobin:~/soft/hadoop-2.6.0$ hadoop dfs -ls /test/sqoop
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Found 2 items
-rw-r--r-- 3 xiaobin supergroup 61398 2018-06-12 15:38 /test/sqoop/906b57d6-6117-4494-b0e0-e7e25eb09981.txt
-rw-r--r-- 3 xiaobin supergroup 60931 2018-06-12 15:38 /test/sqoop/d0cdfc13-3c0d-4e74-a1fe-c973bfc97e13.txt