Sqoop in Practice: an Essential Tool for Moving Traditional Databases into the Big Data Era


First, the MySQL JDBC connector has to be upgraded from 5.1.17 to 5.1.32, as shown below:

[hdfs@bd4 ~]$ ll /usr/share/java/mysql-connector-java*
-rw-r--r-- 1 root root 819803 Jun 22  2012 /usr/share/java/mysql-connector-java-5.1.17.jar
lrwxrwxrwx 1 root root     74 Dec  2 19:05 /usr/share/java/mysql-connector-java.jar -> /usr/share/mysql-connector-java-5.1.32/mysql-connector-java-5.1.32-bin.jar

Otherwise Sqoop fails with the following error:

14/09/08 01:10:12 ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@1d5a7f6 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.  
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@1d5a7f6 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.  
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:930)  
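A minimal sketch of the fix, assuming the 5.1.32 driver has already been unpacked under /usr/share/mysql-connector-java-5.1.32 (the path from the listing above). The symlink must be replaced as root, so those lines are left commented:

```shell
# Paths taken from the directory listing above; adjust to your layout.
NEW_JAR=/usr/share/mysql-connector-java-5.1.32/mysql-connector-java-5.1.32-bin.jar
LINK=/usr/share/java/mysql-connector-java.jar
# As root: -s symbolic, -f replace the old 5.1.17 link, -n don't follow the old link
# ln -sfn "$NEW_JAR" "$LINK"
# readlink "$LINK"    # should now print the 5.1.32 path
basename "$NEW_JAR"   # quick sanity check of the target file name
# → mysql-connector-java-5.1.32-bin.jar
```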


Then switch to the hdfs user and run the import command:

[hdfs@bd4 ~]$ sqoop import --connect jdbc:mysql://ip:3306/sqoop --username root -password 123456 --table test --fields-terminated-by '\t' -m 1
Warning: /usr/hdp/2.3.2.0-2950/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/hdp/2.3.2.0-2950/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/12/02 19:05:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.3.2.0-2950
15/12/02 19:05:46 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/12/02 19:05:47 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/12/02 19:05:47 INFO tool.CodeGenTool: Beginning code generation
15/12/02 19:05:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
15/12/02 19:05:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
15/12/02 19:05:47 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.3.2.0-2950/hadoop-mapreduce
Note: /tmp/sqoop-hdfs/compile/12ab2787b9e0e246c43048c26814edb8/test.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/12/02 19:05:49 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/12ab2787b9e0e246c43048c26814edb8/test.jar
15/12/02 19:05:49 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/12/02 19:05:49 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/12/02 19:05:49 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/12/02 19:05:49 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/12/02 19:05:49 INFO mapreduce.ImportJobBase: Beginning import of test
15/12/02 19:05:51 INFO impl.TimelineClientImpl: Timeline service address: http://bd2.com:8188/ws/v1/timeline/
15/12/02 19:05:51 INFO client.RMProxy: Connecting to ResourceManager at bd2.com/10.252.169.250:8050
15/12/02 19:05:56 INFO db.DBInputFormat: Using read commited transaction isolation
15/12/02 19:05:56 INFO mapreduce.JobSubmitter: number of splits:1
15/12/02 19:05:56 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1447409744056_0024
15/12/02 19:05:57 INFO impl.YarnClientImpl: Submitted application application_1447409744056_0024
15/12/02 19:05:57 INFO mapreduce.Job: The url to track the job: http://bd2.com:8088/proxy/application_1447409744056_0024/
15/12/02 19:05:57 INFO mapreduce.Job: Running job: job_1447409744056_0024
15/12/02 19:06:12 INFO mapreduce.Job: Job job_1447409744056_0024 running in uber mode : false
15/12/02 19:06:12 INFO mapreduce.Job:  map 0% reduce 0%
15/12/02 19:06:22 INFO mapreduce.Job:  map 100% reduce 0%
15/12/02 19:06:23 INFO mapreduce.Job: Job job_1447409744056_0024 completed successfully
15/12/02 19:06:23 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=142123
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=25
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters 
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=6993
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=6993
                Total vcore-seconds taken by all map tasks=6993
                Total megabyte-seconds taken by all map tasks=3580416
        Map-Reduce Framework
                Map input records=3
                Map output records=3
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=68
                CPU time spent (ms)=840
                Physical memory (bytes) snapshot=163049472
                Virtual memory (bytes) snapshot=1102458880
                Total committed heap usage (bytes)=97517568
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=25
15/12/02 19:06:23 INFO mapreduce.ImportJobBase: Transferred 25 bytes in 32.3223 seconds (0.7735 bytes/sec)
15/12/02 19:06:23 INFO mapreduce.ImportJobBase: Retrieved 3 records.
[hdfs@bd4 ~]$ 


As shown, the import succeeded; the result lands under the hdfs user's home directory on HDFS:

[hdfs@bd4 ~]$ hadoop fs -ls -R /
drwxr-xr-x   - hdfs      hdfs            0 2015-12-02 19:06 /user/hdfs/test
-rw-r--r--   3 hdfs      hdfs            0 2015-12-02 19:06 /user/hdfs/test/_SUCCESS
-rw-r--r--   3 hdfs      hdfs           25 2015-12-02 19:06 /user/hdfs/test/part-m-00000


View the file contents:

[hdfs@bd4 ~]$ hadoop fs -cat /user/hdfs/test/part-m-00000
2       mumu
1       xiangge
3       maoge
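One caveat before re-running the import: MapReduce will not overwrite an existing output directory, so a second run against the same table aborts with an "output directory already exists" error. A hedged sketch of a re-runnable variant using --target-dir plus --delete-target-dir (both standard Sqoop 1.4 options); the command is commented out because it needs the live cluster:

```shell
# Same import as above, but idempotent: Sqoop removes the target dir first.
# -P prompts for the password instead of exposing it on the command line.
# sqoop import \
#   --connect jdbc:mysql://ip:3306/sqoop --username root -P \
#   --table test --fields-terminated-by '\t' -m 1 \
#   --target-dir /user/hdfs/test --delete-target-dir
```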


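One prerequisite for the export below: sqoop export does not create the destination table, so test2 must already exist in MySQL. A minimal DDL sketch (the column names Id and namestr match the result set shown at the end; the types are assumptions):

```shell
# Run on the MySQL side first (commented: needs the live database).
# mysql -h ip -u root -p -e \
#   "CREATE TABLE IF NOT EXISTS sqoop.test2 (Id INT PRIMARY KEY, namestr VARCHAR(64));"
```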
The export works as well. Sqoop does print a password-related warning even though the password is correct: the WARN line in the log below only means that supplying the password on the command line is insecure, and suggests using -P (interactive prompt) instead:

[hdfs@bd4 ~]$ sqoop export --connect jdbc:mysql://ip:3306/sqoop --username root -password 123456 --table test2 --export-dir '/user/hdfs/test/part-m-00000'  --fields-terminated-by '\t'
Warning: /usr/hdp/2.3.2.0-2950/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/hdp/2.3.2.0-2950/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/12/04 14:18:42 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.3.2.0-2950
15/12/04 14:18:42 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/12/04 14:18:42 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/12/04 14:18:42 INFO tool.CodeGenTool: Beginning code generation
15/12/04 14:18:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test2` AS t LIMIT 1
15/12/04 14:18:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test2` AS t LIMIT 1
15/12/04 14:18:42 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.3.2.0-2950/hadoop-mapreduce
Note: /tmp/sqoop-hdfs/compile/70145de68e097becfb9fa8ee50764798/test2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/12/04 14:18:45 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/70145de68e097becfb9fa8ee50764798/test2.jar
15/12/04 14:18:45 INFO mapreduce.ExportJobBase: Beginning export of test2
15/12/04 14:18:47 INFO impl.TimelineClientImpl: Timeline service address: http://bd2.com:8188/ws/v1/timeline/
15/12/04 14:18:47 INFO client.RMProxy: Connecting to ResourceManager at bd2.com/10.252.169.250:8050
15/12/04 14:18:51 INFO input.FileInputFormat: Total input paths to process : 1
15/12/04 14:18:51 INFO input.FileInputFormat: Total input paths to process : 1
15/12/04 14:18:51 INFO mapreduce.JobSubmitter: number of splits:4
15/12/04 14:18:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1447409744056_0033
15/12/04 14:18:52 INFO impl.YarnClientImpl: Submitted application application_1447409744056_0033
15/12/04 14:18:52 INFO mapreduce.Job: The url to track the job: http://bd2.com:8088/proxy/application_1447409744056_0033/
15/12/04 14:18:52 INFO mapreduce.Job: Running job: job_1447409744056_0033
15/12/04 14:19:00 INFO mapreduce.Job: Job job_1447409744056_0033 running in uber mode : false
15/12/04 14:19:00 INFO mapreduce.Job:  map 0% reduce 0%
15/12/04 14:19:24 INFO mapreduce.Job:  map 100% reduce 0%
15/12/04 14:19:24 INFO mapreduce.Job: Job job_1447409744056_0033 completed successfully
15/12/04 14:19:24 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=567916
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=659
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=19
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters 
                Launched map tasks=4
                Data-local map tasks=4
                Total time spent by all maps in occupied slots (ms)=82285
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=82285
                Total vcore-seconds taken by all map tasks=82285
                Total megabyte-seconds taken by all map tasks=42129920
        Map-Reduce Framework
                Map input records=3
                Map output records=3
                Input split bytes=576
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=503
                CPU time spent (ms)=4200
                Physical memory (bytes) snapshot=637882368
                Virtual memory (bytes) snapshot=4463616000
                Total committed heap usage (bytes)=391643136
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=0
15/12/04 14:19:24 INFO mapreduce.ExportJobBase: Transferred 659 bytes in 37.0326 seconds (17.7951 bytes/sec)
15/12/04 14:19:24 INFO mapreduce.ExportJobBase: Exported 3 records.
[hdfs@bd4 ~]$ 



The result in MySQL:

mysql> select * from test2;
+----+---------+
| Id | namestr |
+----+---------+
|  1 | xiangge |
|  2 | mumu    |
|  3 | maoge   |
+----+---------+
3 rows in set (0.00 sec)
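To get rid of the insecure-password warning for good, Sqoop (1.4.4+) supports --password-file, which reads the password from a protected file instead of the command line. A sketch; the file names are illustrative and the cluster-side steps are commented:

```shell
# Create the local password file. printf '%s' writes no trailing newline;
# a stray newline would become part of the password and authentication fails.
rm -f /tmp/sqoop.pwd
printf '%s' '123456' > /tmp/sqoop.pwd
chmod 400 /tmp/sqoop.pwd
# Ship it to HDFS and point the job at it (commented: needs the live cluster):
# hdfs dfs -put -f /tmp/sqoop.pwd /user/hdfs/.sqoop.pwd
# hdfs dfs -chmod 400 /user/hdfs/.sqoop.pwd
# sqoop export --connect jdbc:mysql://ip:3306/sqoop --username root \
#   --password-file /user/hdfs/.sqoop.pwd --table test2 \
#   --export-dir /user/hdfs/test --fields-terminated-by '\t'
```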

