Sqoop安装以及指令学习 ----------语句学习一条龙

Sqoop介绍

  • Sqoop是一个用于在Hadoop和关系型数据库之间传输数据的工具
  1. 将数据从RDBMS导入到HDFS
    RDBMS:HDFS、Hive、Hbase
  2. 从HDFS导出数据到RDBMS
  3. 使用MapReduce导入和导出数据,提供并行操作和容错
  • 目标用户
  1. 系统管理员、数据库管理员
  2. 大数据分析师、大数据开发工程师等

Sqoop安装

1.解压安装:

[root@zjw opt]# tar -zxf sqoop-1.4.6-cdh5.14.2.tar.gz 
[root@zjw opt]# mv sqoop-1.4.6-cdh5.14.2 soft/sqoop146

在这里插入图片描述

2.添加所需驱动
连接Hadoop得三个驱动分别在这三个位置:

在这里插入图片描述
连接mysql的驱动可在自己idea maven本地仓库找。
添加驱动:
在这里插入图片描述
在这里插入图片描述
3. 配置文件
首先准备好hadoop和hive的环境变量:


[root@zjw conf]# echo $HADOOP_HOME
/opt/soft/hadoop/hadoop-2.6.0-cdh5.14.2
[root@zjw conf]# echo $HIVE_HOME\
> 
/opt/soft/hive/hive-1.1.0-cdh5.14.2

export HADOOP_COMMON_HOME=/opt/soft/hadoop/hadoop-2.6.0-cdh5.14.2

复制临时配置文件:

[root@zjw conf]# cp sqoop-env-template.sh sqoop-env.sh

在里面添加环境变量:

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/opt/soft/hadoop/hadoop-2.6.0-cdh5.14.2

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
export HIVE_HOME=/opt/soft/hive/hive-1.1.0-cdh5.14.2

#Set the path for where zookeper config dir is
export ZOOCFGDIR=/opt/soft/zookeeper/zookeeper-3.4.5-cdh5.14.2/conf

4.环境变量配置
在这里插入图片描述


# SQOOP_HOME
export SQOOP_HOME=/opt/soft/sqoop146
export PATH=$PATH:$SQOOP_HOME/bin

然后激活环境变量:

[root@zjw sqoop146]# vi /etc/profile
[root@zjw sqoop146]# source /etc/profile

Sqoop操作

连接mysql

  1. 连接mysql 拉取databases
语句:
[root@zjw conf]# sqoop list-databases --username root --password 1234 --connect jdbc:mysql://192.168.56.100:3306/
执行结果:
[root@zjw conf]# sqoop list-databases --username root --password 1234 --connect jdbc:mysql://192.168.56.100:3306/
Warning: /opt/soft/sqoop146/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/soft/sqoop146/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
20/06/26 10:13:10 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.14.2
20/06/26 10:13:10 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
20/06/26 10:13:10 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Fri Jun 26 10:13:11 CST 2020 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
information_schema
hive
mydemo
mysql
performance_schema
sys

  1. 查看表
语句:
[root@zjw conf]# sqoop list-tables --username root --password 1234 --connect jdbc:mysql://192.168.56.100:3306/mydemo
执行结果:
[root@zjw conf]# sqoop list-tables --username root --password 1234 --connect jdbc:mysql://192.168.56.100:3306/mydemo
Warning: /opt/soft/sqoop146/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/soft/sqoop146/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
20/06/26 10:44:57 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.14.2
20/06/26 10:44:57 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
20/06/26 10:44:58 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Fri Jun 26 10:44:58 CST 2020 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
user


Hadoop拉取

启动hadoop 然后准备利用sqoop导出:

全量导入:

[root@zjw conf]# sqoop import --connect jdbc:mysql://192.168.56.100:3306/mydemo --driver com.mysql.jdbc.Driver --username root --password 1234 --table user --target-dir /tmp/user -m 3

此时会可能会出现少jar包的错误:

在这里插入图片描述
下载地址:java-json.jar

在这里插入图片描述
再次执行:
在这里插入图片描述
查看其中的一个分片内容:

[root@zjw conf]# hdfs dfs -text /tmp/user/part-m-00000
20/06/26 11:01:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1,zs,male

按sql语句导入:

[root@zjw conf]# sqoop import --connect jdbc:mysql://192.168.56.100:3306/mydemo --driver com.mysql.jdbc.Driver --username root --password 1234 --query "select * from user where gender='male' and \$CONDITIONS" --target-dir /tmp/user1 --split-by Id --fields-terminated-by ','  -m 1

在这里插入图片描述
查看查询的内容:

[root@zjw conf]# hdfs dfs -text /tmp/user1/part-m-00000
20/06/26 11:33:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1,zs,male

Append 追加:

[root@zjw myshl]# sqoop import --username root --password 1234 --connect jdbc:mysql://192.168.56.100:3306/mydemo --driver com.mysql.jdbc.Driver --query "select * from myorder where orderdate='2020-06-25' and \$CONDITIONS" --incremental append --check-column orderdate --target-dir /mydata/mytest -m 1

然后追加6-26日的:

[root@zjw myshl]# sqoop import --username root --password 1234 --connect jdbc:mysql://192.168.56.100:3306/mydemo --driver com.mysql.jdbc.Driver --query "select * from myorder where orderdate='2020-06-26' and \$CONDITIONS" --incremental append --check-column orderdate --target-dir /mydata/mytest -m 1

在这里插入图片描述
创建一个外部表接数据:

hive> create database myddd;
OK
Time taken: 0.8 seconds
hive> use myddd;
hive> create external table mytab(
    > custid string,
    > custname string,
    > birthday string
    > )
    > row format delimited fields terminated by ','
    > location '/mydata/mytest';
OK
Time taken: 0.277 seconds
hive> select * from mytab;
OK
1	dd0001	2020-06-25
2	dd0002	2020-06-25
3	dd0003	2020-06-26
4	dd0004	2020-06-26
3	dd0003	2020-06-26
4	dd0004	2020-06-26
Time taken: 0.388 seconds, Fetched: 6 row(s)

每次运行 这边外部表就会接取数据

  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值