Big Data Technology: A Brief Introduction to Using Sqoop

I. What Sqoop Is

Sqoop (pronounced like "scoop") is an open-source tool used mainly to transfer data between Hadoop (Hive) and traditional relational databases (MySQL, PostgreSQL, ...). It can import data from a relational database (e.g. MySQL, Oracle, Postgres) into HDFS, and it can also export data from HDFS back into a relational database.

II. Installing Sqoop

1. Download the Sqoop package: [Sqoop download](http://mirrors.hust.edu.cn/apache/sqoop/1.4.6/)
2. Upload the downloaded package sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz to your Linux machine
3. Extract the archive to the target directory
    For example: $ tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz -C /opt/module
4. Edit the configuration file
    ·In the conf directory under the Sqoop installation, rename sqoop-env-template.sh to sqoop-env.sh
        $ mv sqoop-env-template.sh sqoop-env.sh
        This setting points to the Hadoop installation directory
        export HADOOP_COMMON_HOME=/home/admin/modules/hadoop-2.7.2
        This one points to the MapReduce installation directory, i.e. the same Hadoop installation
        export HADOOP_MAPRED_HOME=/home/admin/modules/hadoop-2.7.2
        This one points to the Hive installation directory
        export HIVE_HOME=/home/admin/modules/apache-hive-1.2.2-bin
        This one points to the ZooKeeper installation directory
        export ZOOKEEPER_HOME=/home/admin/modules/zookeeper-3.4.5
        This one points to ZooKeeper's conf directory
        export ZOOCFGDIR=/home/admin/modules/zookeeper-3.4.5/conf
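The rename-and-edit step above can be scripted. A minimal sketch, run here against a temporary directory so it is safe to try as-is; on a real install, point SQOOP_CONF at $SQOOP_HOME/conf and keep whichever paths match your layout:

```shell
# Demo in a temp dir; on a real system set SQOOP_CONF="$SQOOP_HOME/conf".
SQOOP_CONF="$(mktemp -d)"
touch "$SQOOP_CONF/sqoop-env-template.sh"    # stands in for the shipped template

# Step 4: rename the template, then append the environment settings.
mv "$SQOOP_CONF/sqoop-env-template.sh" "$SQOOP_CONF/sqoop-env.sh"
cat >> "$SQOOP_CONF/sqoop-env.sh" <<'EOF'
export HADOOP_COMMON_HOME=/home/admin/modules/hadoop-2.7.2
export HADOOP_MAPRED_HOME=/home/admin/modules/hadoop-2.7.2
export HIVE_HOME=/home/admin/modules/apache-hive-1.2.2-bin
export ZOOKEEPER_HOME=/home/admin/modules/zookeeper-3.4.5
export ZOOCFGDIR=/home/admin/modules/zookeeper-3.4.5/conf
EOF
```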

III. Copying the Database Driver JAR

Copy the MySQL driver jar mysql-connector-java-5.1.27-bin.jar into Sqoop's lib directory
    For example: $ cp mysql-connector-java-5.1.27-bin.jar /opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib

IV. Verifying the Sqoop Configuration

·From the Sqoop installation directory, run the help command
    $ bin/sqoop help
 If output like the following appears, Sqoop is configured correctly (the HBase, HCatalog, and Accumulo warnings are harmless if you do not use those components):
Warning: /opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/12/27 20:34:31 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.
    ·Test whether Sqoop can connect to the MySQL database
        $ bin/sqoop list-databases \
            --connect jdbc:mysql://hadoop102:3306 \
            --username root \
            --password 123456
    ·If the list of databases is printed, the connection is configured correctly:
        Warning: /opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/bin/../../hbase does not exist! HBase imports will fail.
        Please set $HBASE_HOME to the root of your HBase installation.
        Warning: /opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/bin/../../hcatalog does not exist! HCatalog jobs will fail.
        Please set $HCAT_HOME to the root of your HCatalog installation.
        Warning: /opt/module/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/bin/../../accumulo does not exist! Accumulo imports will fail.
        Please set $ACCUMULO_HOME to the root of your Accumulo installation.
        17/12/27 20:42:10 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
        17/12/27 20:42:10 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
        17/12/27 20:42:10 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
        information_schema
        company
        metastore
        mysql
        performance_schema
        test
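Note the WARN line in the output above: putting --password on the command line is insecure. Sqoop also accepts -P (prompt at runtime) and --password-file. A minimal sketch of the password-file approach; the file path here is illustrative:

```shell
# Store the password in a file only the owner can read (path is illustrative).
PW_FILE="$(mktemp -d)/sqoop.pw"
printf '%s' '123456' > "$PW_FILE"    # printf avoids a trailing newline
chmod 400 "$PW_FILE"

# Then reference the file instead of --password (needs a live MySQL server):
#   $ bin/sqoop list-databases \
#       --connect jdbc:mysql://hadoop102:3306 \
#       --username root \
#       --password-file "file://$PW_FILE"
```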

V. Sqoop Usage Examples

1. Import data from a MySQL relational database into HDFS
    $ bin/sqoop import \
    --connect jdbc:mysql://hadoop102:3306/company \
    --username root \
    --password 123456 \
    --table staff \
    --target-dir /user/company \
    --delete-target-dir \
    --num-mappers 1 \
    --fields-terminated-by "\t"
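After a successful run, the data lands under /user/company in files such as part-m-00000, one database row per line with fields joined by the "\t" delimiter. A purely local illustration of that output format (the sample rows are made up):

```shell
# Illustrative only: what two rows of the staff table might look like in the
# imported file when --fields-terminated-by "\t" is used (fake sample data).
DEMO_FILE="$(mktemp)"
printf '1\tThomas\tMale\n2\tCatalina\tFemale\n' > "$DEMO_FILE"
cat "$DEMO_FILE"
```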
2. Import data from a MySQL relational database into a Hive table
    $ bin/sqoop import \
    --connect jdbc:mysql://hadoop102:3306/company \
    --username root \
    --password 123456 \
    --table staff \
    --num-mappers 1 \
    --hive-import \
    --fields-terminated-by "\t" \
    --hive-overwrite \
    --hive-table staff_hive
3. Export data from a Hive table into MySQL
    $ bin/sqoop export \
    --connect jdbc:mysql://hadoop102:3306/company \
    --username root \
    --password 123456 \
    --table staff \
    --num-mappers 1 \
    --export-dir /user/hive/warehouse/staff_hive \
    --input-fields-terminated-by "\t"
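The three cases above repeat the same connection options. One common tidy-up is a small wrapper that keeps them in one place; a sketch using this article's illustrative host and credentials, with the final command echoed instead of executed so it can be read without a cluster:

```shell
# Shared connection options (hadoop102 / root / 123456 are this article's
# illustrative values; in practice prefer a --password-file).
sqoop_conn_args() {
  printf '%s\n' \
    '--connect' 'jdbc:mysql://hadoop102:3306/company' \
    '--username' 'root' \
    '--password' '123456'
}

# Case 1 rebuilt on top of the shared options; 'echo' keeps this runnable
# without a cluster. Drop the echo to actually launch the import.
run_import() {
  echo bin/sqoop import $(sqoop_conn_args) \
    --table staff \
    --target-dir /user/company \
    --delete-target-dir \
    --num-mappers 1 \
    --fields-terminated-by '\t'
}

run_import
```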