Sqoop-1.4.6-cdh5.7.0.tar.gz Installation and Usage

1. Introduction to Sqoop

2. Downloading and Extracting Sqoop

3. Directory Layout and Environment Variables

4. Adding the MySQL Driver and Configuring sqoop-env.sh

5. Viewing Sqoop Command Help

1. Introduction to Sqoop

  • Apache Sqoop™ is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.

  • Sqoop successfully graduated from the Incubator in March of 2012 and is now a Top-Level Apache Project.

  • Latest stable release is 1.4.7 (download, documentation). The latest cut of Sqoop2 is 1.99.7. Note that 1.99.7 is not compatible with 1.4.7 and is not feature complete; it is not intended for production deployment.

Summary:

  • Sqoop is a bridge for exchanging data between Hadoop and relational databases. Under the hood it runs a MapReduce job that reads data out of the relational database and writes it to the destination (Hive, HBase).

Question: can Sqoop import relational-database data into Hive and HBase?

  • Answer: yes, as long as the destination stores its data on HDFS. Hive, for example, keeps its data on HDFS.

  • Sqoop 1.4.7: version 1; production generally uses version 1

  • Sqoop 1.99.7: version 2

2. Downloading and Extracting Sqoop

1. Download URL

  • Find the version used here, sqoop-1.4.6-cdh5.7.0, at: http://archive.cloudera.com/cdh5/cdh/5/

2. Download and extract sqoop-1.4.6-cdh5.7.0.tar.gz

1. Download with wget:
wget http://archive.cloudera.com/cdh5/cdh/5/sqoop-1.4.6-cdh5.7.0.tar.gz

2. Extract (into ~/app, matching the SQOOP_HOME path configured below):

  • tar -xzvf sqoop-1.4.6-cdh5.7.0.tar.gz -C ~/app/

Note: after extraction, check that the files are owned by the correct user and group.
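The ownership check above can be done like this (a sketch; the `hadoop` user/group and the `~/app` path follow the directory listing below and are assumptions about this setup):

```shell
# Check who owns the extracted directory (should be the hadoop user, not root)
ls -ld ~/app/sqoop-1.4.6-cdh5.7.0

# If it was extracted as root, hand it back to the hadoop user and group
sudo chown -R hadoop:hadoop ~/app/sqoop-1.4.6-cdh5.7.0
```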

3. Sqoop Directory Layout and Environment Variables

1. Sqoop directory structure:

[hadoop@hadoop004 sqoop-1.4.6-cdh5.7.0]$ ll
total 1880
drwxr-xr-x 2 hadoop hadoop    4096 Sep 12 14:35 bin                              // scripts
drwxr-xr-x 2 hadoop hadoop    4096 Sep 12 14:35 conf                             // configuration files
drwxr-xr-x 5 hadoop hadoop    4096 Sep 12 14:35 docs                             // documentation
drwxr-xr-x 2 hadoop hadoop    4096 Sep 12 14:35 lib                              // dependency jars
-rw-rw-r-- 1 hadoop hadoop 1035867 Mar 24  2016 sqoop-1.4.6-cdh5.7.0.jar         // main jar
-rw-rw-r-- 1 hadoop hadoop  656201 Mar 24  2016 sqoop-test-1.4.6-cdh5.7.0.jar    // test jar
drwxr-xr-x 7 hadoop hadoop    4096 Mar 24  2016 src                              // source code

2. Configure the environment variables and apply them

1. Edit ~/.bash_profile:
export SQOOP_HOME=/home/hadoop/app/sqoop-1.4.6-cdh5.7.0
export PATH=$SQOOP_HOME/bin:$PATH

2. Apply the changes:
source ~/.bash_profile
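A quick check that the variables took effect (assuming the paths set above):

```shell
# SQOOP_HOME should print the install directory
echo $SQOOP_HOME

# sqoop should now resolve from $SQOOP_HOME/bin
which sqoop
```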

4. Adding the MySQL Driver and Configuring sqoop-env.sh

1. Make sure the MySQL JDBC driver jar is present in $SQOOP_HOME/lib

[hadoop@hadoop004 lib]$ ll mysql-connector-java-5.1.46-bin.jar 
-rw-r--r-- 1 hadoop hadoop 1004840 Feb 26  2018 mysql-connector-java-5.1.46-bin.jar
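If the driver is not there yet, it just needs to be copied into the lib directory. A sketch, assuming the jar has already been downloaded into the current directory:

```shell
# Copy the MySQL JDBC driver into Sqoop's lib directory
cp mysql-connector-java-5.1.46-bin.jar $SQOOP_HOME/lib/

# Verify it is in place
ls -l $SQOOP_HOME/lib/mysql-connector-java-*.jar
```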

2. Configure sqoop-env.sh

  • [hadoop@hadoop000 conf]$ cp sqoop-env-template.sh sqoop-env.sh

Then edit sqoop-env.sh and uncomment/set the following (the template ships with these lines commented out):

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0

#Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/app/hive
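With sqoop-env.sh in place, a quick smoke test of the install (this needs the environment above; it cannot run without Sqoop installed):

```shell
# Print the Sqoop build version. Warnings about HBASE_HOME, HCAT_HOME,
# or ZOOKEEPER not being set are harmless if those components are unused.
sqoop version
```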


5. Viewing Sqoop Command Help

  • sqoop help
Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

Note:

The import/export direction in the help output is from Hadoop's point of view:
import: MySQL --> Hadoop
export: Hadoop --> MySQL
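The two directions look like this in practice (a sketch only; the JDBC URL, credentials, table names, and HDFS paths are placeholder assumptions, and running them requires a live MySQL and Hadoop cluster):

```shell
# import: MySQL --> HDFS (reads table emp, writes files under --target-dir)
sqoop import \
  --connect jdbc:mysql://localhost:3306/testdb \
  --username root --password 'secret' \
  --table emp \
  --target-dir /user/hadoop/emp \
  -m 1

# export: HDFS --> MySQL (reads files under --export-dir, writes table emp_backup)
sqoop export \
  --connect jdbc:mysql://localhost:3306/testdb \
  --username root --password 'secret' \
  --table emp_backup \
  --export-dir /user/hadoop/emp
```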

1. More detailed help for the import command:

  • sqoop import --help

  • Usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]

When importing data from MySQL into HDFS:
1. We need to know the destination directory on HDFS, and reading from MySQL requires the JDBC url, password, and so on.

2. Hive arguments:

Argument                     Description
--create-hive-table          Fail if the target hive table exists
--hive-database              Sets the database name to use when importing to hive
--hive-delims-replacement    Replace Hive record \0x01 and row delimiters (\n\r) from imported string fields with user-defined string
--hive-drop-import-delims    Drop Hive record \0x01 and row delimiters (\n\r) from imported string fields
--hive-home                  Override $HIVE_HOME
--hive-import                Import tables into Hive
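Combined, a Hive import might look like this (a sketch; the connection details and table name are placeholder assumptions):

```shell
# Import a MySQL table directly into a Hive table,
# failing if the Hive table already exists
sqoop import \
  --connect jdbc:mysql://localhost:3306/testdb \
  --username root --password 'secret' \
  --table emp \
  --hive-import \
  --create-hive-table \
  --hive-database default \
  --hive-drop-import-delims
```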