Using Sqoop 1.4.6 With Hadoop 2.7.4

This article describes the installation, configuration, and basic usage of Sqoop 1.4.6 on a Hadoop 2.7.4 cluster.
I. Installation and Configuration
1. Installing Sqoop

[hadoop@hdp01 ~]$ wget http://mirror.bit.edu.cn/apache/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
[hadoop@hdp01 ~]$ tar -xzf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
[hadoop@hdp01 ~]$ mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha /u01/sqoop
--Edit the Sqoop environment script
[hadoop@hdp01 ~]$ cd /u01/sqoop/conf
[hadoop@hdp01 conf]$ cp sqoop-env-template.sh sqoop-env.sh
[hadoop@hdp01 conf]$ vi sqoop-env.sh
export HADOOP_COMMON_HOME=/u01/hadoop
export HADOOP_MAPRED_HOME=/u01/hadoop
export HBASE_HOME=/u01/hbase
export HIVE_HOME=/u01/hive
export ZOOCFGDIR=/u01/zookeeper/conf
--Comment out the following block in $SQOOP_HOME/bin/configure-sqoop to suppress the HCatalog and Accumulo warnings at startup
#if [ -z "${HCAT_HOME}" ]; then
#  if [ -d "/usr/lib/hive-hcatalog" ]; then
#    HCAT_HOME=/usr/lib/hive-hcatalog
#  elif [ -d "/usr/lib/hcatalog" ]; then
#    HCAT_HOME=/usr/lib/hcatalog
#  else
#    HCAT_HOME=${SQOOP_HOME}/../hive-hcatalog
#    if [ ! -d ${HCAT_HOME} ]; then
#       HCAT_HOME=${SQOOP_HOME}/../hcatalog
#    fi
#  fi
#fi
#if [ -z "${ACCUMULO_HOME}" ]; then
#  if [ -d "/usr/lib/accumulo" ]; then
#    ACCUMULO_HOME=/usr/lib/accumulo
#  else
#    ACCUMULO_HOME=${SQOOP_HOME}/../accumulo
#  fi
#fi
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
#  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
#  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#
#if [ ! -d "${ACCUMULO_HOME}" ]; then
#  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
#  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
--Edit the user's environment variables
[hadoop@hdp01 ~]$ vi .bash_profile
export SQOOP_HOME=/u01/sqoop
export SQOOP_CONF_DIR=$SQOOP_HOME/conf
export SQOOP_CLASSPATH=$SQOOP_CONF_DIR
export PATH=$PATH:$SQOOP_HOME/bin
[hadoop@hdp01 ~]$ source .bash_profile
--Verify the Sqoop installation
[hadoop@hdp01 ~]$ sqoop version
2017-12-28 09:30:01,801 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Sqoop 1.4.6
git commit id c0c5a81723759fa575844a0a1eae8f510fa32c25
Compiled by root on Mon Apr 27 14:38:36 CST 2015
Alternatively, run sqoop-version.
--Copy the JDBC drivers
Copy the MySQL, PostgreSQL, and Oracle JDBC driver jars into $SQOOP_HOME/lib.
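For example (a sketch; the exact jar names are assumptions and depend on the driver versions you downloaded):

[hadoop@hdp01 ~]$ cp mysql-connector-java-5.1.44-bin.jar /u01/sqoop/lib/
[hadoop@hdp01 ~]$ cp postgresql-42.1.4.jar /u01/sqoop/lib/
[hadoop@hdp01 ~]$ cp ojdbc7.jar /u01/sqoop/lib/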

II. Using Sqoop
1. Testing each JDBC connection
1.1 Connecting Sqoop to MySQL

[hadoop@hdp01 bin]$ sqoop list-tables --username root -P --connect jdbc:mysql://192.168.120.92:3306/smsqw?useSSL=false
2017-12-28 09:38:19,587 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 09:38:23,067 [myid:] - INFO  [main:MySQLManager@69] - Preparing to use a MySQL streaming resultset.
Phone
TestPhone
history_store
tbAreaprefix
tbAreaprefix_bak
tbBill
tbBilltmp
tbCat
tbContact
tbDataPath
tbDeliverMsg
tbDeliverMsg2
tbDest
tbLocPrefix
tbMessage
tbPrice
tbReceiver
tbSSLog
tbSendState
tbSendState2
tbSmsSendState
tbTest
tbUser

1.2 Connecting Sqoop to PostgreSQL

[hadoop@hdp01 ~]$ sqoop list-tables --username rhnuser -P --connect jdbc:postgresql://192.168.120.93:5432/rhndb
2017-12-28 09:40:24,842 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 09:40:29,775 [myid:] - INFO  [main:SqlManager@98] - Using default fetchSize of 1000
rhnservergroupmembers
rhntemplatestring
rhnservergrouptypefeature
rhnserverhistory
qrtz_fired_triggers

1.3 Connecting Sqoop to Oracle

[hadoop@hdp01 ~]$ sqoop list-tables --username spwuser -P --connect jdbc:oracle:thin:@192.168.120.121:1521/rhndb --driver oracle.jdbc.driver.OracleDriver
2017-12-28 10:01:43,337 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 10:01:43,425 [myid:] - INFO  [main:SqlManager@98] - Using default fetchSize of 1000
rhnservergroupmembers
rhntemplatestring
rhnservergrouptypefeature
rhnserverhistory
qrtz_fired_triggers

1.4 Connecting Sqoop to Hive
Using its PostgreSQL definition, create a table named rhnpackagefile in Hive without importing any data; the data import itself is covered in the next section.

[hadoop@hdp01 ~]$ sqoop create-hive-table --connect jdbc:postgresql://192.168.120.93:5432/rhndb --table rhnpackagefile --username rhnuser -P --hive-database hivedb
2017-12-28 10:32:01,376 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 10:32:04,699 [myid:] - INFO  [main:BaseSqoopTool@1353] - Using Hive-specific delimiters for output. You can override
2017-12-28 10:32:04,699 [myid:] - INFO  [main:BaseSqoopTool@1354] - delimiters with --fields-terminated-by, etc.
2017-12-28 10:32:04,819 [myid:] - INFO  [main:SqlManager@98] - Using default fetchSize of 1000
2017-12-28 10:32:05,015 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM "rhnpackagefile" AS t LIMIT 1
2017-12-28 10:32:05,674 [myid:] - INFO  [main:HiveImport@194] - Loading uploaded data into Hive
2017-12-28 10:32:09,089 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Class path contains multiple SLF4J bindings.
2017-12-28 10:32:09,090 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,090 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,090 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,091 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/tez/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,091 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,091 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2017-12-28 10:32:09,095 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2017-12-28 10:32:11,996 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - 
2017-12-28 10:32:11,996 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - Logging initialized using configuration in jar:file:/u01/hive/lib/hive-common-2.3.2.jar!/hive-log4j2.properties Async: true
2017-12-28 10:32:16,650 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - OK
2017-12-28 10:32:16,783 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - Time taken: 3.433 seconds
2017-12-28 10:32:17,248 [myid:] - INFO  [main:HiveImport@242] - Hive import complete.
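To confirm that the table was created, a quick check (assuming the hive CLI is on the PATH):

[hadoop@hdp01 ~]$ hive -e 'DESCRIBE hivedb.rhnpackagefile;'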

2. Data Migration
2.1 PostgreSQL → Hive

[hadoop@hdp01 ~]$ sqoop import --connect jdbc:postgresql://192.168.120.93:5432/rhndb --table rhnpackagefile --username rhnuser -P --fields-terminated-by ',' --hive-import --hive-database hivedb --columns package_id,capability_id,device,inode,file_mode,username,groupname,rdev,file_size,mtime,checksum_id,linkto,flags,verifyflags,lang,created,modified --split-by modified -m 4
2017-12-28 11:24:46,666 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 11:24:48,891 [myid:] - INFO  [main:SqlManager@98] - Using default fetchSize of 1000
2017-12-28 11:24:48,894 [myid:] - INFO  [main:CodeGenTool@92] - Beginning code generation
2017-12-28 11:24:49,091 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM "rhnpackagefile" AS t LIMIT 1
2017-12-28 11:24:49,127 [myid:] - INFO  [main:CompilationManager@94] - HADOOP_MAPRED_HOME is /u01/hadoop
Note: /tmp/sqoop-hadoop/compile/ca09f6bb133fa32808220902aedc0437/rhnpackagefile.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-12-28 11:24:50,481 [myid:] - INFO  [main:CompilationManager@330] - Writing jar file: /tmp/sqoop-hadoop/compile/ca09f6bb133fa32808220902aedc0437/rhnpackagefile.jar
2017-12-28 11:24:50,493 [myid:] - WARN  [main:PostgresqlManager@119] - It looks like you are importing from postgresql.
2017-12-28 11:24:50,493 [myid:] - WARN  [main:PostgresqlManager@120] - This transfer can be faster! Use the --direct
2017-12-28 11:24:50,494 [myid:] - WARN  [main:PostgresqlManager@121] - option to exercise a postgresql-specific fast path.
2017-12-28 11:24:50,495 [myid:] - INFO  [main:ImportJobBase@235] - Beginning import of rhnpackagefile
2017-12-28 11:24:50,496 [myid:] - INFO  [main:Configuration@1019] - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-12-28 11:24:50,634 [myid:] - INFO  [main:Configuration@1019] - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-12-28 11:24:51,160 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 11:24:51,506 [myid:] - INFO  [main:TimelineClientImpl@123] - Timeline service address: http://hdp01:8188/ws/v1/timeline/
2017-12-28 11:24:51,696 [myid:] - INFO  [main:AHSProxy@42] - Connecting to Application History server at hdp01.thinkjoy.tt/192.168.120.96:10201
2017-12-28 11:24:53,801 [myid:] - INFO  [main:DBInputFormat@192] - Using read commited transaction isolation
2017-12-28 11:24:53,805 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 11:24:53,805 [myid:] - INFO  [main:DataDrivenDBInputFormat@147] - BoundingValsQuery: SELECT MIN("modified"), MAX("modified") FROM "rhnpackagefile"
2017-12-28 11:25:14,854 [myid:] - WARN  [main:TextSplitter@64] - Generating splits for a textual index column.
2017-12-28 11:25:14,854 [myid:] - WARN  [main:TextSplitter@65] - If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
2017-12-28 11:25:14,854 [myid:] - WARN  [main:TextSplitter@67] - You are strongly encouraged to choose an integral split column.
2017-12-28 11:25:14,903 [myid:] - INFO  [main:JobSubmitter@396] - number of splits:6
2017-12-28 11:25:14,997 [myid:] - INFO  [main:JobSubmitter@479] - Submitting tokens for job: job_1514358672274_0009
2017-12-28 11:25:15,453 [myid:] - INFO  [main:YarnClientImpl@236] - Submitted application application_1514358672274_0009
2017-12-28 11:25:15,485 [myid:] - INFO  [main:Job@1289] - The url to track the job: http://hdp01:8088/proxy/application_1514358672274_0009/
2017-12-28 11:25:15,486 [myid:] - INFO  [main:Job@1334] - Running job: job_1514358672274_0009
2017-12-28 11:25:24,763 [myid:] - INFO  [main:Job@1355] - Job job_1514358672274_0009 running in uber mode : false
2017-12-28 11:25:24,764 [myid:] - INFO  [main:Job@1362] -  map 0% reduce 0%
2017-12-28 11:26:00,465 [myid:] - INFO  [main:Job@1362] -  map 17% reduce 0%
2017-12-28 11:26:01,625 [myid:] - INFO  [main:Job@1362] -  map 50% reduce 0%
2017-12-28 11:26:03,643 [myid:] - INFO  [main:Job@1362] -  map 83% reduce 0%
2017-12-28 11:34:22,028 [myid:] - INFO  [main:Job@1362] -  map 100% reduce 0%
2017-12-28 11:34:22,035 [myid:] - INFO  [main:Job@1373] - Job job_1514358672274_0009 completed successfully
2017-12-28 11:34:22,162 [myid:] - INFO  [main:Job@1380] - Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=860052
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=913
                HDFS: Number of bytes written=3985558014
                HDFS: Number of read operations=24
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=12
        Job Counters 
                Killed map tasks=1
                Launched map tasks=7
                Other local map tasks=7
                Total time spent by all maps in occupied slots (ms)=1208611
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=1208611
                Total vcore-seconds taken by all map tasks=1208611
                Total megabyte-seconds taken by all map tasks=4331661824
        Map-Reduce Framework
                Map input records=18680041
                Map output records=18680041
                Input split bytes=913
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=4453
                CPU time spent (ms)=180780
                Physical memory (bytes) snapshot=1957969920
                Virtual memory (bytes) snapshot=30116270080
                Total committed heap usage (bytes)=1611661312
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=3985558014
2017-12-28 11:34:22,170 [myid:] - INFO  [main:ImportJobBase@184] - Transferred 3.7118 GB in 571.0001 seconds (6.6566 MB/sec)
2017-12-28 11:34:22,174 [myid:] - INFO  [main:ImportJobBase@186] - Retrieved 18680041 records.
2017-12-28 11:34:22,215 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM "rhnpackagefile" AS t LIMIT 1
2017-12-28 11:34:22,245 [myid:] - INFO  [main:HiveImport@194] - Loading uploaded data into Hive
2017-12-28 11:34:28,609 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - 
2017-12-28 11:34:28,609 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - Logging initialized using configuration in jar:file:/u01/hive/lib/hive-common-2.3.2.jar!/hive-log4j2.properties Async: true
2017-12-28 11:34:31,619 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - OK
2017-12-28 11:34:31,622 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - Time taken: 1.666 seconds
2017-12-28 11:34:32,026 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - Loading data to table hivedb.rhnpackagefile
2017-12-28 11:36:14,783 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - OK
2017-12-28 11:36:14,908 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - Time taken: 103.285 seconds
2017-12-28 11:36:15,363 [myid:] - INFO  [main:HiveImport@242] - Hive import complete.
2017-12-28 11:36:15,372 [myid:] - INFO  [main:HiveImport@278] - Export directory is contains the _SUCCESS file only, removing the directory.
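As a sanity check, the Hive row count should match the 18680041 records reported above (a sketch, assuming the hive CLI is available):

[hadoop@hdp01 ~]$ hive -e 'SELECT COUNT(*) FROM hivedb.rhnpackagefile;'

Also note the TextSplitter warnings in the log: because modified is not an integral column, Sqoop fell back to textual splitting (hence 6 splits instead of the requested 4) and warned that a case-insensitive sort order could cause partial or duplicate imports; an integral split column is preferable.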

2.2 MySQL → HDFS
Note: the connect string below omits useSSL=false, which is why the MySQL driver prints SSL warnings during this run (compare section 1.1).

[hadoop@hdp01 ~]$ sqoop import --connect jdbc:mysql://192.168.120.92:3306/smsqw --username smsqw -P --table tbDest --columns iMsgID,cDest,tTime,cSMID,iReSend,tLastProcess,cEnCode,tCreateDT,iNum,iResult,iPriority,iPayment,cState,tGpTime --split-by tGpTime --target-dir /user/DataSource/MySQL/tbDest
2017-12-28 14:36:52,550 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 14:36:55,496 [myid:] - INFO  [main:MySQLManager@69] - Preparing to use a MySQL streaming resultset.
2017-12-28 14:36:55,497 [myid:] - INFO  [main:CodeGenTool@92] - Beginning code generation
Thu Dec 28 14:36:55 CST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
2017-12-28 14:36:56,233 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM `tbDest` AS t LIMIT 1
2017-12-28 14:36:56,253 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM `tbDest` AS t LIMIT 1
2017-12-28 14:36:56,260 [myid:] - INFO  [main:CompilationManager@94] - HADOOP_MAPRED_HOME is /u01/hadoop
Note: /tmp/sqoop-hadoop/compile/4a4024e6b2baa336939a9310f627636a/tbDest.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-12-28 14:36:57,637 [myid:] - INFO  [main:CompilationManager@330] - Writing jar file: /tmp/sqoop-hadoop/compile/4a4024e6b2baa336939a9310f627636a/tbDest.jar
2017-12-28 14:36:57,650 [myid:] - WARN  [main:MySQLManager@107] - It looks like you are importing from mysql.
2017-12-28 14:36:57,650 [myid:] - WARN  [main:MySQLManager@108] - This transfer can be faster! Use the --direct
2017-12-28 14:36:57,650 [myid:] - WARN  [main:MySQLManager@109] - option to exercise a MySQL-specific fast path.
2017-12-28 14:36:57,650 [myid:] - INFO  [main:MySQLManager@189] - Setting zero DATETIME behavior to convertToNull (mysql)
2017-12-28 14:36:57,652 [myid:] - INFO  [main:ImportJobBase@235] - Beginning import of tbDest
2017-12-28 14:36:57,653 [myid:] - INFO  [main:Configuration@1019] - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-12-28 14:36:57,820 [myid:] - INFO  [main:Configuration@1019] - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-12-28 14:36:58,229 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 14:36:58,581 [myid:] - INFO  [main:TimelineClientImpl@123] - Timeline service address: http://hdp01:8188/ws/v1/timeline/
2017-12-28 14:36:58,770 [myid:] - INFO  [main:AHSProxy@42] - Connecting to Application History server at hdp01.thinkjoy.tt/192.168.120.96:10201
Thu Dec 28 14:37:01 CST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
2017-12-28 14:37:01,123 [myid:] - INFO  [main:DBInputFormat@192] - Using read commited transaction isolation
2017-12-28 14:37:01,124 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 14:37:01,124 [myid:] - INFO  [main:DataDrivenDBInputFormat@147] - BoundingValsQuery: SELECT MIN(`tGpTime`), MAX(`tGpTime`) FROM `tbDest`
2017-12-28 14:37:17,446 [myid:] - INFO  [main:JobSubmitter@396] - number of splits:4
2017-12-28 14:37:17,541 [myid:] - INFO  [main:JobSubmitter@479] - Submitting tokens for job: job_1514358672274_0012
2017-12-28 14:37:17,966 [myid:] - INFO  [main:YarnClientImpl@236] - Submitted application application_1514358672274_0012
2017-12-28 14:37:17,996 [myid:] - INFO  [main:Job@1289] - The url to track the job: http://hdp01:8088/proxy/application_1514358672274_0012/
2017-12-28 14:37:17,996 [myid:] - INFO  [main:Job@1334] - Running job: job_1514358672274_0012
2017-12-28 14:37:26,149 [myid:] - INFO  [main:Job@1355] - Job job_1514358672274_0012 running in uber mode : false
2017-12-28 14:37:26,150 [myid:] - INFO  [main:Job@1362] -  map 0% reduce 0%
2017-12-28 14:39:52,733 [myid:] - INFO  [main:Job@1362] -  map 25% reduce 0%
2017-12-28 14:40:14,978 [myid:] - INFO  [main:Job@1362] -  map 75% reduce 0%
2017-12-28 14:40:43,183 [myid:] - INFO  [main:Job@1362] -  map 100% reduce 0%
2017-12-28 14:40:43,191 [myid:] - INFO  [main:Job@1373] - Job job_1514358672274_0012 completed successfully
2017-12-28 14:40:43,321 [myid:] - INFO  [main:Job@1380] - Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=573248
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=609
                HDFS: Number of bytes written=5399155888
                HDFS: Number of read operations=16
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=8
        Job Counters 
                Killed map tasks=2
                Launched map tasks=6
                Other local map tasks=6
                Total time spent by all maps in occupied slots (ms)=724670
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=724670
                Total vcore-seconds taken by all map tasks=724670
                Total megabyte-seconds taken by all map tasks=2597217280
        Map-Reduce Framework
                Map input records=31037531
                Map output records=31037531
                Input split bytes=609
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=3675
                CPU time spent (ms)=588590
                Physical memory (bytes) snapshot=4045189120
                Virtual memory (bytes) snapshot=20141694976
                Total committed heap usage (bytes)=1943535616
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=5399155888
2017-12-28 14:40:43,329 [myid:] - INFO  [main:ImportJobBase@184] - Transferred 5.0284 GB in 225.0893 seconds (22.8755 MB/sec)
2017-12-28 14:40:43,335 [myid:] - INFO  [main:ImportJobBase@186] - Retrieved 31037531 records.
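The import writes one part file per map task under the --target-dir; to inspect the result (a sketch):

[hadoop@hdp01 ~]$ hdfs dfs -ls /user/DataSource/MySQL/tbDest
[hadoop@hdp01 ~]$ hdfs dfs -cat /user/DataSource/MySQL/tbDest/part-m-00000 | head -3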

2.3 HDFS → MySQL
Note that sqoop export requires the target table (tbDest2 here) to already exist in MySQL with a schema matching the exported files.

[hadoop@hdp01 ~]$ sqoop export --connect jdbc:mysql://192.168.120.92:3306/smsqw?useSSL=false --username smsqw -P --table tbDest2 --export-dir /user/DataSource/MySQL/tbDest
2017-12-28 16:03:18,922 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 16:03:21,934 [myid:] - INFO  [main:MySQLManager@69] - Preparing to use a MySQL streaming resultset.
2017-12-28 16:03:21,934 [myid:] - INFO  [main:CodeGenTool@92] - Beginning code generation
2017-12-28 16:03:22,343 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM `tbDest2` AS t LIMIT 1
2017-12-28 16:03:22,365 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM `tbDest2` AS t LIMIT 1
2017-12-28 16:03:22,373 [myid:] - INFO  [main:CompilationManager@94] - HADOOP_MAPRED_HOME is /u01/hadoop
Note: /tmp/sqoop-hadoop/compile/332a6c4b30e942c56cf7f507cdff5761/tbDest2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-12-28 16:03:23,752 [myid:] - INFO  [main:CompilationManager@330] - Writing jar file: /tmp/sqoop-hadoop/compile/332a6c4b30e942c56cf7f507cdff5761/tbDest2.jar
2017-12-28 16:03:23,762 [myid:] - INFO  [main:ExportJobBase@378] - Beginning export of tbDest2
2017-12-28 16:03:23,762 [myid:] - INFO  [main:Configuration@1019] - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-12-28 16:03:24,011 [myid:] - INFO  [main:Configuration@1019] - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-12-28 16:03:24,738 [myid:] - INFO  [main:Configuration@1019] - mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
2017-12-28 16:03:24,742 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2017-12-28 16:03:24,743 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 16:03:25,087 [myid:] - INFO  [main:TimelineClientImpl@123] - Timeline service address: http://hdp01:8188/ws/v1/timeline/
2017-12-28 16:03:25,269 [myid:] - INFO  [main:AHSProxy@42] - Connecting to Application History server at hdp01.thinkjoy.tt/192.168.120.96:10201
2017-12-28 16:03:27,400 [myid:] - INFO  [main:FileInputFormat@281] - Total input paths to process : 4
2017-12-28 16:03:27,406 [myid:] - INFO  [main:FileInputFormat@281] - Total input paths to process : 4
2017-12-28 16:03:27,484 [myid:] - INFO  [main:JobSubmitter@396] - number of splits:4
2017-12-28 16:03:27,493 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2017-12-28 16:03:27,493 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 16:03:27,577 [myid:] - INFO  [main:JobSubmitter@479] - Submitting tokens for job: job_1514358672274_0020
2017-12-28 16:03:28,062 [myid:] - INFO  [main:YarnClientImpl@236] - Submitted application application_1514358672274_0020
2017-12-28 16:03:28,091 [myid:] - INFO  [main:Job@1289] - The url to track the job: http://hdp01:8088/proxy/application_1514358672274_0020/
2017-12-28 16:03:28,092 [myid:] - INFO  [main:Job@1334] - Running job: job_1514358672274_0020
2017-12-28 16:17:18,663 [myid:] - INFO  [main:Job@1355] - Job job_1514358672274_0020 running in uber mode : false
2017-12-28 16:17:18,665 [myid:] - INFO  [main:Job@1362] -  map 0% reduce 0%
2017-12-28 16:17:34,148 [myid:] - INFO  [main:Job@1362] -  map 1% reduce 0%
2017-12-28 16:17:43,200 [myid:] - INFO  [main:Job@1362] -  map 2% reduce 0%
2017-12-28 16:17:55,269 [myid:] - INFO  [main:Job@1362] -  map 3% reduce 0%
......
2017-12-28 16:40:15,427 [myid:] - INFO  [main:Job@1362] -  map 100% reduce 0%
2017-12-28 16:40:32,491 [myid:] - INFO  [main:Job@1373] - Job job_1514358672274_0020 completed successfully
2017-12-28 16:40:32,659 [myid:] - INFO  [main:Job@1380] - Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=571960
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=5401517442
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=70
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters 
                Launched map tasks=4
                Other local map tasks=1
                Rack-local map tasks=3
                Total time spent by all maps in occupied slots (ms)=4931826
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=4931826
                Total vcore-seconds taken by all map tasks=4931826
                Total megabyte-seconds taken by all map tasks=17675664384
        Map-Reduce Framework
                Map input records=31037531
                Map output records=31037531
                Input split bytes=2192
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=21815
                CPU time spent (ms)=1522470
                Physical memory (bytes) snapshot=3453595648
                Virtual memory (bytes) snapshot=20112125952
                Total committed heap usage (bytes)=477102080
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=0
2017-12-28 16:40:32,667 [myid:] - INFO  [main:ExportJobBase@301] - Transferred 5.0306 GB in 2,227.9141 seconds (2.3122 MB/sec)
2017-12-28 16:40:32,671 [myid:] - INFO  [main:ExportJobBase@303] - Exported 31037531 records.
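To verify the export, compare the MySQL row count against the 31037531 records reported above (a sketch, run from any host with the mysql client installed):

[hadoop@hdp01 ~]$ mysql -h 192.168.120.92 -u smsqw -p -e 'SELECT COUNT(*) FROM smsqw.tbDest2;'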

For reference, the import and export options used in this article are summarized below:
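--connect <jdbc-uri>          JDBC connection string for the source or target database
--username <user>             database user name
-P                            prompt interactively for the password
--table <name>                table to read from (import) or load into (export)
--columns <c1,c2,...>         restrict the transfer to the listed columns
--split-by <column>           column used to partition rows across map tasks
-m <n>                        number of parallel map tasks
--target-dir <path>           HDFS destination directory for an import
--export-dir <path>           HDFS source directory for an export
--fields-terminated-by <ch>   field delimiter for the generated files
--hive-import                 load the imported data into Hive
--hive-database <db>          target Hive database
--driver <class>              JDBC driver class to use (needed for Oracle above)
--direct                      database-specific fast path (as suggested in the logs)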
