把数据从mysql导入到hdfs中_sqoop第2关:mysql导入数据至hdfs上-CSDN博客

本文链接：https://blog.csdn.net/fsz521job/article/details/50278885

1.设置MySQL 数据库驱动

复制mysql-connector-java-5.1.6.jar 到sqoop的lib目录

2.Sqoop操作

sqoop help import 查看帮助信息
查看数据库列表

[root@single bin]# **./sqoop list-databases --connect jdbc:mysql://192.168.2.1:3306/ --username root --P**
15/12/12 23:05:59 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Enter password: 
15/12/12 23:06:04 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
alibabanew
echshop
mysql
nextapp
numysql
performance_schema
sakila
sys
weixinmgr
wordpress
world

执行导入命令将ts_user表导入到hdfs：

[root@single bin]#./sqoop import --connect jdbc:mysql://192.168.2.1:3306/weixinmgr --username root --P --table ts_user -m 1

用sqoop导入mysql出现错误

上搜索是mysql的bug，下载新的jar就可以了

[root@single bin]#wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.32.tar.gz
99% [===============================================================================================================> ] 3,774,587   16.4K/s eta(英国中部时99% [===============================================================================================================> ] 3,780,347   17.2K/s eta(英国中部时99% [===============================================================================================================> ] 3,786,107   16.9K/s eta(英国中部时99% [===============================================================================================================> ] 3,791,867   17.6K/s eta(英国中部时100%[================================================================================================================>] 3,795,067   17.0K/s eta(英国中部时100%[================================================================================================================>] 3,795,067   17.0K/s   in 3m 32s  

2015-12-12 23:29:13 (17.5 KB/s) - 已保存 “mysql-connector-java-5.1.32.tar.gz” [3795067/3795067])

重新执行

[hadoop@single bin]$ ./sqoop import --connect jdbc:mysql://192.168.2.1:3306/weixinmgr --username root --P --table ts_user -m 1
15/12/12 23:48:18 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Enter password: 
15/12/12 23:48:22 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/12/12 23:48:22 INFO tool.CodeGenTool: Beginning code generation
15/12/12 23:48:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `ts_user` AS t LIMIT 1
15/12/12 23:48:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `ts_user` AS t LIMIT 1
15/12/12 23:48:23 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/java/hadoop-2.6.2
注: /tmp/sqoop-hadoop/compile/ac79c6428b049576508e90af2cb05afe/ts_user.java使用或覆盖了已过时的 API。
注: 有关详细信息, 请使用 -Xlint:deprecation 重新编译。
15/12/12 23:48:26 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/ac79c6428b049576508e90af2cb05afe/ts_user.jar
15/12/12 23:48:26 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/12/12 23:48:26 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/12/12 23:48:26 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/12/12 23:48:26 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/12/12 23:48:26 INFO mapreduce.ImportJobBase: Beginning import of ts_user
15/12/12 23:48:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/12 23:48:27 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/12/12 23:48:28 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/12/12 23:48:28 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/12/12 23:48:38 INFO db.DBInputFormat: Using read commited transaction isolation
15/12/12 23:48:38 INFO mapreduce.JobSubmitter: number of splits:1
15/12/12 23:48:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449928075440_0002
15/12/12 23:48:43 INFO impl.YarnClientImpl: Submitted application application_1449928075440_0002
15/12/12 23:48:43 INFO mapreduce.Job: The url to track the job: http://single.hadoop.com:8088/proxy/application_1449928075440_0002/
15/12/12 23:48:43 INFO mapreduce.Job: Running job: job_1449928075440_0002
15/12/12 23:49:31 INFO mapreduce.Job: Job job_1449928075440_0002 running in uber mode : false
15/12/12 23:49:31 INFO mapreduce.Job:  map 0% reduce 0%
15/12/12 23:49:51 INFO mapreduce.Job:  map 100% reduce 0%
15/12/12 23:49:53 INFO mapreduce.Job: Job job_1449928075440_0002 completed successfully
15/12/12 23:49:54 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=124049
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=40
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters 
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=15767
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=15767
                Total vcore-seconds taken by all map tasks=15767
                Total megabyte-seconds taken by all map tasks=16145408
        Map-Reduce Framework
                Map input records=1
                Map output records=1
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=178
                CPU time spent (ms)=2150
                Physical memory (bytes) snapshot=101388288
                Virtual memory (bytes) snapshot=2060509184
                Total committed heap usage (bytes)=28442624
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=40
15/12/12 23:49:54 INFO mapreduce.ImportJobBase: Transferred 40 bytes in 85.9276 seconds (0.4655 bytes/sec)
15/12/12 23:49:54 INFO mapreduce.ImportJobBase: Retrieved 1 records.

查看结果

[hadoop@single hadoop-2.6.2]$ bin/hadoop fs -ls
drwxr-xr-x   - hadoop supergroup          0 2015-12-12 17:04 output
drwxr-xr-x   - hadoop supergroup          0 2015-12-12 23:49 ts_user

[hadoop@single hadoop-2.6.2]$ bin/hadoop fs -text ts_user/part-m-00000
结果：  1,admin,admin,0,1,2015-12-03,2015-12-23