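Sqoop is split into a client and a server. The server is installed on one node of the Hadoop cluster, and that node acts as the entry point for connecting to Sqoop; the client does not need Hadoop installed.
1. Downloading, unpacking, and setting the environment variables are not covered here.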
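2. Go to server/conf under the Sqoop installation directory.
In sqoop.properties, change the seventh line from the end so that it points to the Hadoop installation directory.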
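The edit above identifies the line only by position, which breaks easily if the file differs between releases. A minimal sketch of setting the value by key instead follows. The helper name `set_property` is hypothetical, and the property name used in the usage note is an assumption based on the default sqoop.properties shipped with Sqoop 1.99.x, where that setting expects Hadoop's configuration directory; check it against your own file before relying on it.

```shell
# Hypothetical helper: set key=value in a .properties file.
# Replaces the existing line for the key, or appends one if the key is absent.
set_property() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "="; done = 0 }
        $1 == k { print k "=" v; done = 1; next }   # rewrite the matching line
        { print }                                    # keep every other line
        END { if (!done) print k "=" v }             # append when not found
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Usage, under the assumptions above: `set_property server/conf/sqoop.properties org.apache.sqoop.submission.engine.mapreduce.configuration.directory /opt/hadoop/hadoop-2.4.1/etc/hadoop/`.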
3. Next, edit catalina.properties and set common.loader as follows:
common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/common/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/common/lib/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/hdfs/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/hdfs/lib/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/mapreduce/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/mapreduce/lib/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/tools/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/tools/lib/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/yarn/*.jar,/opt/hadoop/hadoop-2.4.1/share/hadoop/yarn/lib/*.jar
The entries under /opt/hadoop/hadoop-2.4.1 above are Hadoop's jar directories.
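Typing ten near-identical paths by hand is error-prone. The sketch below generates the Hadoop part of common.loader from a single installation path. The function name `hadoop_loader_entries` is hypothetical, and it assumes the share/hadoop layout shown in the line above.

```shell
# Hypothetical helper: print the comma-separated Hadoop jar entries
# for common.loader, given the Hadoop installation directory.
hadoop_loader_entries() {
    hadoop_home=$1
    entries=""
    for module in common hdfs mapreduce tools yarn; do
        for dir in "$hadoop_home/share/hadoop/$module" \
                   "$hadoop_home/share/hadoop/$module/lib"; do
            # Add a comma only between entries, never at the start.
            entries="${entries:+$entries,}$dir/*.jar"
        done
    done
    printf '%s\n' "$entries"
}
```

Calling `hadoop_loader_entries /opt/hadoop/hadoop-2.4.1` prints the ten entries in the same order as above; the result still has to be appended by hand after the `${catalina.home}/../lib/*.jar` entry.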
4. Copy mysql-connector-java-5.1.10-bin.jar into server/lib/.
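A forgotten or misplaced driver jar only shows up later, when creating the connection fails. As a small sketch, the copy can be wrapped so that it fails loudly right away. The function name `install_jdbc_driver` is hypothetical, and both paths are supplied by the caller.

```shell
# Hypothetical helper: copy a JDBC driver jar into the server's lib
# directory and confirm that the copy is actually there afterwards.
install_jdbc_driver() {
    jar=$1
    lib_dir=$2
    [ -f "$jar" ] || { echo "driver jar not found: $jar" >&2; return 1; }
    [ -d "$lib_dir" ] || { echo "lib directory not found: $lib_dir" >&2; return 1; }
    cp "$jar" "$lib_dir/" && [ -f "$lib_dir/$(basename "$jar")" ]
}
```

For example, from the Sqoop installation directory: `install_jdbc_driver mysql-connector-java-5.1.10-bin.jar server/lib`.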
OK, now the setup can be tested:
./bin/sqoop.sh server start   (start the Sqoop server)
./bin/sqoop.sh client   (open the Sqoop console)
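Before pointing the client at the server, it can help to confirm from an ordinary shell that the server is reachable at all. The sketch below builds the server URL from host, port, and webapp name, then probes it with curl. Both function names are hypothetical, and the `version` path is an assumption about the Sqoop 2 REST interface; adjust it if your release exposes something different.

```shell
# Hypothetical helper: build the server base URL.
# Port and webapp default to 12000 and "sqoop".
sqoop_server_url() {
    host=$1
    port=${2:-12000}
    webapp=${3:-sqoop}
    printf 'http://%s:%s/%s/\n' "$host" "$port" "$webapp"
}

# Hypothetical helper: succeed only if the server answers over HTTP.
# Assumes curl is installed and that <base URL>version is a valid endpoint.
sqoop_server_up() {
    curl --silent --fail --max-time 5 \
        "$(sqoop_server_url "$@")version" > /dev/null
}
```

For example, `sqoop_server_up msg-01 12000 sqoop && echo reachable` prints `reachable` only when the request succeeds.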
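sqoop:000> set server --host msg-01 --port 12000 --webapp sqoop   (connect to the server)
sqoop:000> show version --all   (show the server and client versions; if the server part reports an error, restart the server, starting with ./bin/sqoop.sh server stop)
sqoop:000> show connector --all   (list the connectors)
sqoop:000> show connection --all   (list the connections)
sqoop:000> show connection --xid 1   (show the connection with id 1)
sqoop:000> create connection --cid 1   (create a connection for the connector with id 1)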
<pre name="code" class="java">Creating connection for connector with id 1
Please fill following values to create new connection object
Name: mysql --enter a name
Connection configuration
JDBC Driver Class: com.mysql.jdbc.Driver --enter
JDBC Connection String: jdbc:mysql://10.1.65.121:3306/sqoop --enter
Username: root --enter
Password: ****** --enter
JDBC Connection Properties:
There are currently 0 values in the map:
entry#
Security related configuration options
Max connections: 20 --enter
New connection was successfully created with validation status FINE and persistent id 1
sqoop:000> create job --xid 1 --type import
Creating job for connection with id 1
Please fill following values to create new job object
Name: mysql_job
Database configuration
Schema name:
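Table name: userinfo --to import a whole table, enter its name; Table name and Table SQL statement cannot both be set
Table SQL statement: --if used, it must have the form: select * from userinfo where ${CONDITIONS}
Table column names:
Partition column name: id --the column used to fill in the filter condition (userid)
Nulls in partition column:
Boundary query: --when using the SQL statement option, enter a query here whose return values are integers; Sqoop fills in the ${CONDITIONS} placeholder automatically when the job runs. Example: select 0,3 from userinfo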
Output configuration
Storage type:
0 : HDFS
Choose: 0
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 1
Compression format:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
Choose: 0
Output directory: /home/jifeng/out
Throttling resources
Extractors:
Loaders:
New job was successfully created with validation status FINE and persistent id 1
sqoop:000> start job --jid 1 --start the job
Submission details
Job ID: 1
Server URL: http://localhost:12000/sqoop/
......
hadoop fs -ls /mysql/out