# Preparation: create the target directory on HDFS
[root@leader sqoop-1.4.7.bin__hadoop-2.6.0]# hdfs dfs -mkdir -p /sqoop/mysql/
[root@leader sqoop-1.4.7.bin__hadoop-2.6.0]# hdfs dfs -ls -R /sqoop
drwxr-xr-x   - root supergroup          0 2022-03-26 02:40 /sqoop/mysql
# --num-mappers 3 splits the import into three map tasks
# --fields-terminated-by ",\t" sets the field delimiter; only one character is allowed, so the first one (the comma) is used
# Run the import; -P prompts for the password interactively
[root@leader sqoop-1.4.7.bin__hadoop-2.6.0]# /opt/sqoop-1.4.7.bin__hadoop-2.6.0/bin/sqoop import \
> --connect jdbc:mysql://slave1:3306/javaAndBigdata \
> --username root \
> -P \
> --table student \
> --target-dir /sqoop/mysql/javaAndBigdata/student \
> --num-mappers 3 \
> --fields-terminated-by ",\t"
Warning: /opt/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
2022-03-26 03:01:17,514 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
# Enter the MySQL password at the prompt
Enter password:
# As noted above, ",\t" has multiple characters, so only the first one (the comma) is used as the delimiter
2022-03-26 03:01:22,314 WARN sqoop.SqoopOptions: Character argument ,\t has multiple characters; only the first will be used.
2022-03-26 03:01:22,405 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
2022-03-26 03:01:22,405 INFO tool.CodeGenTool: Beginning code generation
2022-03-26 03:01:22,771 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
2022-03-26 03:01:22,797 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student` AS t LIMIT 1
2022-03-26 03:01:22,807 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/hadoop-3.2.2
Note: /tmp/sqoop-root/compile/830756523a1c4105ded8d1a83aec3c65/student.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2022-03-26 03:01:24,041 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/830756523a1c4105ded8d1a83aec3c65/student.jar
2022-03-26 03:01:24,052 WARN manager.MySQLManager: It looks like you are importing from mysql.
# Using --direct is faster and recommended for MySQL
2022-03-26 03:01:24,052 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
2022-03-26 03:01:24,052 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
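As the warning suggests, the same import can use MySQL's fast path. A sketch of the command with `--direct` added (same connection settings as above; this is an untested variant, and `--direct` requires the `mysqldump` binary to be present on the task nodes; a single-character delimiter is used since `--direct` passes the delimiter straight through):

```shell
/opt/sqoop-1.4.7.bin__hadoop-2.6.0/bin/sqoop import \
  --connect jdbc:mysql://slave1:3306/javaAndBigdata \
  --username root \
  -P \
  --table student \
  --target-dir /sqoop/mysql/javaAndBigdata/student \
  --num-mappers 3 \
  --fields-terminated-by "," \
  --direct
```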
2022-03-26 03:01:24,052 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
2022-03-26 03:01:24,058 INFO mapreduce.ImportJobBase: Beginning import of student
2022-03-26 03:01:24,058 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2022-03-26 03:01:24,178 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
2022-03-26 03:01:24,751 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2022-03-26 03:01:25,212 INFO client.AHSProxy: Connecting to Application History server at slave2/192.168.52.11:10200
2022-03-26 03:01:25,368 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
2022-03-26 03:01:25,784 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1648229768990_0002
2022-03-26 03:01:27,552 INFO db.DBInputFormat: Using read commited transaction isolation
2022-03-26 03:01:27,553 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`stu_id`), MAX(`stu_id`) FROM `student`
2022-03-26 03:01:27,556 INFO db.IntegerSplitter: Split size: 3; Num splits: 3 from: 1 to: 10
# The data is split into three chunks, one per mapper
2022-03-26 03:01:27,630 INFO mapreduce.JobSubmitter: number of splits:3
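The split boundaries in the log can be reproduced by hand. A simplified sketch of how the range of `stu_id` (1 to 10, from the BoundingValsQuery) is divided across 3 mappers — this is assumed, approximate behavior, not Sqoop's exact IntegerSplitter algorithm:

```shell
# Divide the key range [lo, hi] into roughly equal chunks, one per mapper
lo=1; hi=10; mappers=3
size=$(( (hi - lo) / mappers ))   # (10 - 1) / 3 = 3, matching "Split size: 3"
i=$lo
while [ "$i" -lt "$hi" ]; do
  end=$(( i + size ))
  [ "$end" -gt "$hi" ] && end=$hi
  echo "mapper query: stu_id >= $i AND stu_id < $end"
  i=$end
done
```

Each mapper then issues its own range query against MySQL, which is why three parallel SELECTs run instead of one.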
2022-03-26 03:01:27,768 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648229768990_0002
2022-03-26 03:01:27,770 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-03-26 03:01:28,002 INFO conf.Configuration: resource-types.xml not found
2022-03-26 03:01:28,002 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2022-03-26 03:01:28,291 INFO impl.YarnClientImpl: Submitted application application_1648229768990_0002
# The MapReduce job is submitted to YARN
2022-03-26 03:01:28,345 INFO mapreduce.Job: The url to track the job: http://slave1:8088/proxy/application_1648229768990_0002/
2022-03-26 03:01:28,346 INFO mapreduce.Job: Running job: job_1648229768990_0002
2022-03-26 03:01:35,489 INFO mapreduce.Job: Job job_1648229768990_0002 running in uber mode : false
2022-03-26 03:01:35,491 INFO mapreduce.Job: map 0% reduce 0%
2022-03-26 03:01:41,641 INFO mapreduce.Job: map 67% reduce 0%
2022-03-26 03:01:45,672 INFO mapreduce.Job: map 100% reduce 0%
2022-03-26 03:01:45,686 INFO mapreduce.Job: Job job_1648229768990_0002 completed successfully
# Detailed MapReduce job counters
2022-03-26 03:01:45,816 INFO mapreduce.Job: Counters: 34
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=747984
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=320
HDFS: Number of bytes written=101
HDFS: Number of read operations=18
HDFS: Number of large read operations=0
HDFS: Number of write operations=6
HDFS: Number of bytes read erasure-coded=0
Job Counters
Killed map tasks=1
Launched map tasks=3
Other local map tasks=3
Total time spent by all maps in occupied slots (ms)=40740
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=10185
Total vcore-milliseconds taken by all map tasks=10185
Total megabyte-milliseconds taken by all map tasks=20858880
Map-Reduce Framework
Map input records=9
Map output records=9
Input split bytes=320
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=310
CPU time spent (ms)=2890
Physical memory (bytes) snapshot=680730624
Virtual memory (bytes) snapshot=11030802432
Total committed heap usage (bytes)=546832384
Peak Map Physical memory (bytes)=237473792
Peak Map Virtual memory (bytes)=3677790208
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=101
2022-03-26 03:01:45,824 INFO mapreduce.ImportJobBase: Transferred 101 bytes in 21.0565 seconds (4.7966 bytes/sec)
2022-03-26 03:01:45,829 INFO mapreduce.ImportJobBase: Retrieved 9 records.
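The reported throughput is just the bytes-written counter divided by the wall-clock time, which we can sanity-check (awk is used here only for floating-point division):

```shell
# Verify the summary line: 101 bytes over 21.0565 seconds
rate=$(awk 'BEGIN { printf "%.4f", 101 / 21.0565 }')
echo "$rate bytes/sec"
```

This matches the 4.7966 bytes/sec in the log; the rate is tiny only because the table is tiny, not because the cluster is slow.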
# Note: an earlier run (at 02:58) had failed because the target directory already existed
2022-03-26 02:58:17,680 ERROR tool.ImportTool: Import failed: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://mycluster/sqoop/mysql/javaAndBigdata/student already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:164)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:277)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1565)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1562)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1562)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1583)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:200)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:173)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:270)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:520)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
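The stack trace above is the classic re-run failure: MapReduce refuses to write into an existing output directory. Two common fixes, using the paths from this example:

```shell
# Option 1: remove the old output before re-running the import
hdfs dfs -rm -r /sqoop/mysql/javaAndBigdata/student

# Option 2: let Sqoop remove it automatically by adding this flag
# to the sqoop import command:
#   --delete-target-dir
```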