1. Download and unpack the package
2. Configure the environment variables
export SQOOP_HOME=/home/admin/sqoop-1.2.0-CDH3B4
export PATH=$PATH:$SQOOP_HOME/bin
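These exports only last for the current shell session; appending them to ~/.bashrc (a common convention, assumed here) makes them permanent. A minimal sketch that demonstrates the effect against a temporary file instead of the real ~/.bashrc:

```shell
# Write the two exports to a temp file standing in for ~/.bashrc,
# source it, and confirm that Sqoop's bin directory landed on PATH.
rcfile=$(mktemp)
cat > "$rcfile" <<'EOF'
export SQOOP_HOME=/home/admin/sqoop-1.2.0-CDH3B4
export PATH=$PATH:$SQOOP_HOME/bin
EOF
. "$rcfile"
echo "$PATH" | grep -q "$SQOOP_HOME/bin" && echo "sqoop bin is on PATH"
rm -f "$rcfile"
```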
3. Test the installation
[admin@server1 ~]$ sqoop help
usage: sqoop COMMAND [ARGS]
Available commands:
See 'sqoop help COMMAND' for information on a specific command.
4. Import from MySQL into HDFS
Using the hive database instance from the earlier Hive exercise, display the contents of the table SEQUENCE_TABLE:
mysql> select * from SEQUENCE_TABLE;
+-----------------------------------------------------------+----------+
| SEQUENCE_NAME                                             | NEXT_VAL |
+-----------------------------------------------------------+----------+
| org.apache.hadoop.hive.metastore.model.MColumnDescriptor  |       16 |
| org.apache.hadoop.hive.metastore.model.MDatabase          |        6 |
| org.apache.hadoop.hive.metastore.model.MSerDeInfo         |       16 |
| org.apache.hadoop.hive.metastore.model.MStorageDescriptor |       16 |
| org.apache.hadoop.hive.metastore.model.MTable             |       16 |
+-----------------------------------------------------------+----------+
5 rows in set (0.00 sec)
Copy mysql-connector-java-5.1.18-bin.jar into the lib directory of sqoop-1.2.0-CDH3B4, then use Sqoop to import the data from SEQUENCE_TABLE into HDFS:
[admin@server1 bin]$ sqoop import --connect jdbc:mysql://server1/hive --username hive --password hive --table SEQUENCE_TABLE
12/12/16 01:27:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
12/12/16 01:27:16 INFO tool.CodeGenTool: Beginning code generation
12/12/16 01:27:16 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `SEQUENCE_TABLE` AS t LIMIT 1
12/12/16 01:27:16 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `SEQUENCE_TABLE` AS t LIMIT 1
12/12/16 01:27:16 INFO orm.CompilationManager: HADOOP_HOME is /home/admin/hadoop-0.20.2/bin/..
12/12/16 01:27:16 INFO orm.CompilationManager: Found hadoop core jar at: /home/admin/hadoop-0.20.2/bin/../hadoop-0.20.2-core.jar
12/12/16 01:27:18 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-admin/compile/415f2a5412b5c2aadd764748
12/12/16 01:27:18 WARN manager.MySQLManager: It looks like you are importing from mysql.
12/12/16 01:27:18 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
12/12/16 01:27:18 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
12/12/16 01:27:18 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
12/12/16 01:27:18 INFO mapreduce.ImportJobBase: Beginning import of SEQUENCE_TABLE
12/12/16 01:27:19 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `SEQUENCE_TABLE` AS t LIMIT 1
12/12/16 01:27:20 WARN db.TextSplitter: Generating splits for a textual index column.
12/12/16 01:27:20 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
12/12/16 01:27:20 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.
12/12/16 01:27:20 INFO mapred.JobClient: Running job: job_201212152320_0004
12/12/16 01:27:59 INFO mapred.JobClient: Job complete: job_201212152320_0004
12/12/16 01:27:59 INFO mapred.JobClient: Counters: 5
12/12/16 01:27:59 INFO mapreduce.ImportJobBase: Transferred 274 bytes in 40.4297 seconds (6.7772 bytes/sec)
12/12/16 01:27:59 INFO mapreduce.ImportJobBase: Retrieved 5 records.
View the data in HDFS:
[admin@server1 bin]$ hadoop dfs -cat SEQUENCE_TABLE/part*
org.apache.hadoop.hive.metastore.model.MColumnDescriptor,16
org.apache.hadoop.hive.metastore.model.MDatabase,6
org.apache.hadoop.hive.metastore.model.MSerDeInfo,16
org.apache.hadoop.hive.metastore.model.MStorageDescriptor,16
org.apache.hadoop.hive.metastore.model.MTable,16
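A quick way to cross-check the import is to count the lines in the part files and compare against Sqoop's "Retrieved 5 records" line. A minimal sketch, run here against a local copy of the output shown above (on the cluster you would pipe `hadoop dfs -cat SEQUENCE_TABLE/part*` instead):

```shell
# Count imported rows; each table row becomes one CSV line in HDFS.
workdir=$(mktemp -d)
cat > "$workdir/part-m-00000" <<'EOF'
org.apache.hadoop.hive.metastore.model.MColumnDescriptor,16
org.apache.hadoop.hive.metastore.model.MDatabase,6
org.apache.hadoop.hive.metastore.model.MSerDeInfo,16
org.apache.hadoop.hive.metastore.model.MStorageDescriptor,16
org.apache.hadoop.hive.metastore.model.MTable,16
EOF
rows=$(cat "$workdir"/part* | wc -l)
echo "imported rows: $rows"   # should match "Retrieved 5 records"
rm -rf "$workdir"
```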
5. Export from HDFS to MySQL
HDFS contains a file test.txt with the following contents:
[admin@server1 ~]$ hadoop dfs -cat /test.txt
aaaa,111
bbbb,222
cccc,333
dddd,444
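Sqoop export fails the whole job when it hits a malformed input line, so it can save time to sanity-check the delimited file first: every row should have exactly two comma-separated fields, with a numeric second field to match the target column types. A minimal sketch against a local copy of test.txt (on the cluster you would pipe `hadoop dfs -cat /test.txt` instead):

```shell
# Flag rows that do not have exactly 2 fields or a non-numeric 2nd field.
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
aaaa,111
bbbb,222
cccc,333
dddd,444
EOF
bad=$(awk -F',' 'NF != 2 || $2 !~ /^[0-9]+$/' "$datafile" | wc -l)
echo "malformed rows: $bad"   # 0 means safe to export
rm -f "$datafile"
```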
Create a table named test in MySQL:
mysql> CREATE TABLE test(str VARCHAR(10), num INT);
Query OK, 0 rows affected (0.01 sec)
Run the export:
[admin@server1 ~]$ sqoop export --connect jdbc:mysql://server1/hive --username hive --password hive --table test --export-dir /test.txt
12/12/16 01:51:50 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
12/12/16 01:51:50 INFO tool.CodeGenTool: Beginning code generation
12/12/16 01:51:50 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
12/12/16 01:51:50 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
12/12/16 01:51:50 INFO orm.CompilationManager: HADOOP_HOME is /home/admin/hadoop-0.20.2/bin/..
12/12/16 01:51:50 INFO orm.CompilationManager: Found hadoop core jar at: /home/admin/hadoop-0.20.2/bin/../hadoop-0.20.2-core.jar
12/12/16 01:51:52 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-admin/compile/bebfa051f0e18c14ad2c4665
12/12/16 01:51:52 INFO mapreduce.ExportJobBase: Beginning export of test
12/12/16 01:51:52 INFO manager.MySQLManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
12/12/16 01:51:53 INFO input.FileInputFormat: Total input paths to process : 1
12/12/16 01:51:53 INFO input.FileInputFormat: Total input paths to process : 1
12/12/16 01:51:53 INFO mapred.JobClient: Running job: job_201212152320_0005
12/12/16 01:52:03 INFO mapred.JobClient: Job complete: job_201212152320_0005
12/12/16 01:52:03 INFO mapred.JobClient: Counters: 6
12/12/16 01:52:03 INFO mapreduce.ExportJobBase: Transferred 42 bytes in 10.7834 seconds (3.8949 bytes/sec)
12/12/16 01:52:03 INFO mapreduce.ExportJobBase: Exported 4 records.
View the records in the test table in MySQL:
mysql> select * from test;
+------+------+
| str  | num  |
+------+------+
| aaaa |  111 |
| bbbb |  222 |
| cccc |  333 |
| dddd |  444 |
+------+------+
4 rows in set (0.00 sec)