sqoop2 导入mysql_Sqoop2从Mysql导入Hdfs (hadoop-2.7.1,Sqoop 1.99.6)

本文档详细介绍了如何配置和使用Sqoop2将数据从MySQL数据库导入到Hadoop HDFS中,包括环境搭建、权限授权、启动Sqoop2服务器和Shell、创建链接、创建和启动导入作业等步骤。
摘要由CSDN通过智能技术生成

一、环境搭建

1.Hadoop

2.Sqoop2.x

3. mysql

二、从mysql导入hdfs

1.创建mysql数据库、表、以及测试数据

xxxxxxxx$  mysql -uroot -p

Enter password:

mysql> show databases;

+--------------------+

| Database           |

+--------------------+

| information_schema |

| mysql              |

| performance_schema |

| test               |

+--------------------+

4 rows in set (0.00 sec)

test  => 是新建的数据库

mysql> use test;

mysql> show tables;

+----------------------+

| Tables_in_test       |

+----------------------+               |

| test                 |

+----------------------+

1 rows in set (0.00 sec)

test => 是新增的表

mysql> desc test;

+-------+-------------+------+-----+---------+----------------+

| Field | Type        | Null | Key | Default | Extra          |

+-------+-------------+------+-----+---------+----------------+

| id    | int(11)     | NO   | PRI | NULL    | auto_increment |

| name  | varchar(45) | YES  |     | NULL    |                |

| age   | int(11)     | YES  |     | NULL    |                |

+-------+-------------+------+-----+---------+----------------+

3 rows in set (0.00 sec)

mysql> select * from test;

+----+------+------+

| id | name | age  |

+----+------+------+

|  7 | a    |    1 |

|  8 | b    |    2 |

|  9 | c    |    3 |

+----+------+------+

3 rows in set (0.00 sec)

2. 为各个用户授权

注意:sqoop提交job后,各个节点在map阶段会访问数据库,所以需事先授权

mysql> grant [all | select | ...] on {db}.{table} to {user}@{host} identified by {passwd};

mysql> flush privileges;

#我给特定的hostname授权 username:root passwd:root 访问db:test 中任意table,权限是all

mysql> grant all on test.* to 'root'@{host} identified by 'root';

3.启动sqoop2-server

[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ pwd

/home/hadoop/sqoop-1.99.6-bin-hadoop200

[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ ./bin/sqoop2-server start

...

...

webui可以访问校验,也可以查看log

4.启动sqoop2-shell

[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ pwd

/home/hadoop/sqoop-1.99.6-bin-hadoop200

[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ ./bin/sqoop2-shell

...

...

sqoop:000> show version

...

...

sqoop:000> show connector

+----+------------------------+---------+------------------------------------------------------+----------------------+

| Id |          Name          | Version |                        Class                         | Supported Directions |

+----+------------------------+---------+------------------------------------------------------+----------------------+

| 1  | generic-jdbc-connector | 1.99.6  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO              |

| 2  | kite-connector         | 1.99.6  | org.apache.sqoop.connector.kite.KiteConnector        | FROM/TO              |

| 3  | hdfs-connector         | 1.99.6  | org.apache.sqoop.connector.hdfs.HdfsConnector        | FROM/TO              |

| 4  | kafka-connector        | 1.99.6  | org.apache.sqoop.connector.kafka.KafkaConnector      | TO                   |

+----+------------------------+---------+------------------------------------------------------+----------------------+

根据你的connector创建connector

sqoop:000> create link -c 1      => 先创建jdbc

会填写name、jdbc-driver、url、username、passwd等等

sqoop:000> create link -c 3      => 创建hdfs

会填写name、hdfs url、等等

sqoop:000> show link

+----+-------------+--------------+------------------------+---------+

| Id |    Name     | Connector Id |     Connector Name     | Enabled |

+----+-------------+--------------+------------------------+---------+

| 3  | 10-21_jdbc1 | 1            | generic-jdbc-connector | true    |

| 4  | 10-21_hdfs1 | 3            | hdfs-connector         | true    |

+----+-------------+--------------+------------------------+---------+

创建job -f=> from  -t to 即从哪些导入到哪里

sqoop:000> create job -f 3 -t 4

会填写,相应的table信息。还有hdfs信息

sqoop:000> show job

+----+---------------+----------------+--------------+---------+

| Id |     Name      | From Connector | To Connector | Enabled |

+----+---------------+----------------+--------------+---------+

| 1  | 10-20_sqoopy2 | 1              | 3            | true    |

+----+---------------+----------------+--------------+---------+

#启动job

sqoop:000> start job -j 2

...

...

...

可以再webui上访问到,查看进度,也可以使用

sqoop:000> status job -j 2

sqoop的guide

5.troubleshooting

多看日志,慢慢的排查

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值