Sqoop: Importing Data from HDFS into MySQL

This post shows how to use Sqoop to load data from HDFS into a MySQL database: inspecting the HDFS file, creating the target table, running the export, and verifying the imported data.

In the previous two posts we used Sqoop to import data from MySQL into HDFS; here we look at the reverse direction: exporting data from an HDFS file into a MySQL database.

Sqoop: Importing Data from MySQL into HDFS
https://www.cndba.cn/dave/article/3305

Sqoop: Exporting All Tables in a Database to HDFS
https://www.cndba.cn/dave/article/3306

1 Viewing the Data in the HDFS File

For convenience, we reuse the data from the earlier tests:

[dave@www.cndba.cn data]$ hdfs dfs -cat /user/hadoop/employees/part-m-00000|head
10001,1953-09-02,Georgi,Facello,M,1986-06-26
10002,1964-06-02,Bezalel,Simmel,F,1985-11-21
10003,1959-12-03,Parto,Bamford,M,1986-08-28
10004,1954-05-01,Chirstian,Koblick,M,1986-12-01
10005,1955-01-21,Kyoichi,Maliniak,M,1989-09-12
10006,1953-04-20,Anneke,Preusig,F,1989-06-02
10007,1957-05-23,Tzvetan,Zielinski,F,1989-02-10
10008,1958-02-19,Saniya,Kalloufi,M,1994-09-15
10009,1952-04-19,Sumant,Peac,F,1985-02-18
10010,1963-06-01,Duangkaew,Piveteau,F,1989-08-24
cat: Unable to write to output stream.
[dave@www.cndba.cn data]$ hdfs dfs -cat /user/hadoop/employees/part-m-00000|wc -l
300024
[dave@www.cndba.cn data]$

Next we load the data in /user/hadoop/employees/part-m-00000 into the MySQL database.
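Each record is one comma-separated line. As a quick sanity check (a minimal sketch, using a sample line copied from the listing above), we can confirm that every record carries six fields, since the target table will need one column per field:

```shell
# Sample record copied from the HDFS listing above.
line='10001,1953-09-02,Georgi,Facello,M,1986-06-26'

# Count the comma-separated fields; the target table needs one column each.
printf '%s\n' "$line" | awk -F',' '{print NF}'
# → 6
```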


2 Importing into MySQL

2.1 Creating the Table

In real use the table should be designed around the field types in the data; here we simply copy the structure of the original employees table:

CREATE TABLE `dave` (
  `emp_no` int(11) NOT NULL,
  `birth_date` date NOT NULL,
  `first_name` varchar(14) NOT NULL,
  `last_name` varchar(16) NOT NULL,
  `gender` enum('M','F') NOT NULL,
  `hire_date` date NOT NULL,
  PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
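sqoop export matches the comma-separated fields to the table's columns by position, so the field order in the file must line up with the column order in the CREATE TABLE statement. A minimal sketch of that positional pairing (the sample record is taken from the listing in section 1):

```shell
# One sample record from the HDFS file and the column list of `dave`.
record='10001,1953-09-02,Georgi,Facello,M,1986-06-26'
columns='emp_no,birth_date,first_name,last_name,gender,hire_date'

# Pair each field with the column it will be inserted into, by position.
awk -v r="$record" -v c="$columns" 'BEGIN {
  n = split(r, rv, ","); split(c, cv, ",")
  for (i = 1; i <= n; i++) print cv[i] "=" rv[i]
}'
```

The output shows the row exactly as the export will insert it, e.g. `emp_no=10001` and `hire_date=1986-06-26`.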

2.2 Importing the Data

The sqoop export command reads the files under --export-dir and inserts each record into the table named by --table; -P prompts for the database password interactively:

[dave@www.cndba.cn data]$ sqoop export --connect jdbc:mysql://192.168.56.2:3306/employees --username root --table dave --export-dir /user/hadoop/employees -P

Warning: /home/hadoop/sqoop/../hbase does not exist! HBase imports will fail.

Please set $HBASE_HOME to the root of your HBase installation.

Warning: /home/hadoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: /home/hadoop/sqoop/../accumulo does not exist! Accumulo imports will fail.

Please set $ACCUMULO_HOME to the root of your Accumulo installation.

2019-03-03 02:23:14,354 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7

Enter password:

2019-03-03 02:23:16,203 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

2019-03-03 02:23:16,222 INFO tool.CodeGenTool: Beginning code generation

Sun Mar 03 02:23:16 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

2019-03-03 02:23:16,721 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dave` AS t LIMIT 1

2019-03-03 02:23:16,768 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dave` AS t LIMIT 1

2019-03-03 02:23:16,786 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop

Note: /tmp/sqoop-hadoop/compile/fd09937f2198b625a8a66c3104f75e94/dave.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

2019-03-03 02:23:19,013 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/fd09937f2198b625a8a66c3104f75e94/dave.jar

2019-03-03 02:23:19,034 INFO mapreduce.ExportJobBase: Beginning export of dave

2019-03-03 02:23:19,035 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address

2019-03-03 02:23:19,292 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar

2019-03-03 02:23:20,614 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative

2019-03-03 02:23:20,618 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative

2019-03-03 02:23:20,618 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps

2019-03-03 02:23:20,782 INFO client.RMProxy: Connecting to ResourceManager at Master/192.168.56.100:8032

2019-03-03 02:23:21,273 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1551513927152_0010

2019-03-03 02:23:25,675 INFO input.FileInputFormat: Total input files to process : 1

2019-03-03 02:23:25,681 INFO input.FileInputFormat: Total input files to process : 1

2019-03-03 02:23:25,807 INFO mapreduce.JobSubmitter: number of splits:4

2019-03-03 02:23:25,856 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative

2019-03-03 02:23:25,856 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled

2019-03-03 02:23:26,110 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1551513927152_0010

2019-03-03 02:23:26,112 INFO mapreduce.JobSubmitter: Executing with tokens: []

2019-03-03 02:23:26,451 INFO conf.Configuration: resource-types.xml not found

2019-03-03 02:23:26,451 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.

2019-03-03 02:23:26,536 INFO impl.YarnClientImpl: Submitted application application_1551513927152_0010

2019-03-03 02:23:26,585 INFO mapreduce.Job: The url to track the job: http://Master:8088/proxy/application_1551513927152_0010/

2019-03-03 02:23:26,585 INFO mapreduce.Job: Running job: job_1551513927152_0010

2019-03-03 02:23:36,783 INFO mapreduce.Job: Job job_1551513927152_0010 running in uber mode : false

2019-03-03 02:23:36,784 INFO mapreduce.Job: map 0% reduce 0%

2019-03-03 02:24:07,055 INFO mapreduce.Job: map 29% reduce 0%

2019-03-03 02:24:08,065 INFO mapreduce.Job: map 37% reduce 0%

2019-03-03 02:24:13,104 INFO mapreduce.Job: map 61% reduce 0%

2019-03-03 02:24:14,110 INFO mapreduce.Job: map 82% reduce 0%

2019-03-03 02:24:15,152 INFO mapreduce.Job: map 85% reduce 0%

2019-03-03 02:24:16,174 INFO mapreduce.Job: map 93% reduce 0%

2019-03-03 02:24:17,184 INFO mapreduce.Job: map 100% reduce 0%

2019-03-03 02:24:17,202 INFO mapreduce.Job: Job job_1551513927152_0010 completed successfully

2019-03-03 02:24:17,302 INFO mapreduce.Job: Counters: 33

File System Counters

FILE: Number of bytes read=0

FILE: Number of bytes written=920724

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=13838998

HDFS: Number of bytes written=0

HDFS: Number of read operations=19

HDFS: Number of large read operations=0

HDFS: Number of write operations=0

HDFS: Number of bytes read erasure-coded=0

Job Counters

Launched map tasks=4

Data-local map tasks=4

Total time spent by all maps in occupied slots (ms)=148315

Total time spent by all reduces in occupied slots (ms)=0

Total time spent by all map tasks (ms)=148315

Total vcore-milliseconds taken by all map tasks=148315

Total megabyte-milliseconds taken by all map tasks=151874560

Map-Reduce Framework

Map input records=300024

Map output records=300024

Input split bytes=606

Spilled Records=0

Failed Shuffles=0

Merged Map outputs=0

GC time elapsed (ms)=1115

CPU time spent (ms)=20710

Physical memory (bytes) snapshot=645414912

Virtual memory (bytes) snapshot=11055927296

Total committed heap usage (bytes)=486801408

Peak Map Physical memory (bytes)=165249024

Peak Map Virtual memory (bytes)=2770202624

File Input Format Counters

Bytes Read=0

File Output Format Counters

Bytes Written=0

2019-03-03 02:24:17,313 INFO mapreduce.ExportJobBase: Transferred 13.1979 MB in 56.6817 seconds (238.4305 KB/sec)

2019-03-03 02:24:17,318 INFO mapreduce.ExportJobBase: Exported 300024 records.

[dave@www.cndba.cn data]$
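The export above relied on Sqoop's defaults: a comma as the input field separator and four map tasks (the "number of splits:4" line in the log). Had the file used a different delimiter, it would have to be stated explicitly. A hedged sketch with those defaults spelled out (--input-fields-terminated-by and --num-mappers are standard sqoop export options; the connection details are the same ones used above):

```shell
# The same export with the defaults made explicit.
SQOOP_EXPORT="sqoop export \
  --connect jdbc:mysql://192.168.56.2:3306/employees \
  --username root -P \
  --table dave \
  --export-dir /user/hadoop/employees \
  --input-fields-terminated-by ',' \
  --num-mappers 4"

# Guarded so the command only runs where sqoop is actually installed;
# elsewhere, just print the command that would be executed.
if command -v sqoop >/dev/null 2>&1; then
  eval "$SQOOP_EXPORT"
else
  echo "$SQOOP_EXPORT"
fi
```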

Verify the data:


mysql> use employees
Database changed
mysql> select count(*) from dave;
+----------+
| count(*) |
+----------+
|   300024 |
+----------+
1 row in set (0.05 sec)

Copyright notice: this is an original post by the author and may not be reproduced without permission.
