Sqoop 1.4.4: Importing Multiple Relational Database Tables into HDFS or Hive at Once

Questions to guide your reading:

        1. Which Sqoop tool is used to import multiple tables at once?

        2. What three conditions must be met before a multi-table import?

        3. How do you specify the target directory in HDFS? How do you specify the target database in Hive?

1. Introduction

Sometimes we need to import several tables from a relational database into HDFS or Hive in one go. For this, Sqoop provides another tool: sqoop-import-all-tables. The data of each table is stored in its own HDFS directory, named after that table.

Before using multi-table import, the following three conditions must all be met:
    1. Each table must have a single-column primary key;
    2. You must import all rows of each table, not just a subset;
    3. You must use the default split column for every table and must not impose any conditions via a WHERE clause.

The --table, --split-by, --columns, and --where arguments are not valid for the sqoop-import-all-tables command. The --exclude-tables option can be used to exclude one or more tables from the import. Apart from that, usage is much the same as a single-table import.
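For example, to import every table in spice except vmLog (vmLog is simply one of the tables listed in the next section, picked here only to illustrate the option), a command along these lines should work; note that -P prompts for the password instead of passing it on the command line:

[hadoopUser@secondmgt ~]$ sqoop-import-all-tables --connect jdbc:mysql://secondmgt:3306/spice --username hive -P --exclude-tables vmLog --as-textfile --warehouse-dir /output/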

2. The Relational Database Tables

My spice database contains the following four tables:

mysql> show tables;
+-----------------+
| Tables_in_spice |
+-----------------+
| servers         |
| users           |
| vmLog           |
| vms             |
+-----------------+
4 rows in set (0.00 sec)

3. Importing Multiple Tables into HDFS at Once

[hadoopUser@secondmgt ~]$ sqoop-import-all-tables --connect jdbc:mysql://secondmgt:3306/spice  --username hive --password hive --as-textfile --warehouse-dir /output/
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
15/01/19 20:21:15 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/01/19 20:21:15 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/01/19 20:21:15 INFO tool.CodeGenTool: Beginning code generation
15/01/19 20:21:15 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `servers` AS t LIMIT 1
15/01/19 20:21:15 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `servers` AS t LIMIT 1
15/01/19 20:21:15 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0
Note: /tmp/sqoop-hadoopUser/compile/0bdbced5e58f170e1670516db3339f91/servers.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/01/19 20:21:16 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoopUser/compile/0bdbced5e58f170e1670516db3339f91/servers.jar
15/01/19 20:21:16 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/01/19 20:21:16 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/01/19 20:21:16 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/01/19 20:21:16 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/01/19 20:21:16 INFO mapreduce.ImportJobBase: Beginning import of servers
15/01/19 20:21:16 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoopUser/cloud/hbase/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/01/19 20:21:17 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/01/19 20:21:17 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/01/19 20:21:17 INFO client.RMProxy: Connecting to ResourceManager at secondmgt/192.168.2.133:8032
15/01/19 20:21:18 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`src_id`), MAX(`src_id`) FROM `servers`
15/01/19 20
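
Once the job completes, each table should land in its own subdirectory under the --warehouse-dir path. The listing below is what I would expect given the four tables above, not output captured from this run:

[hadoopUser@secondmgt ~]$ hadoop fs -ls /output/
(one subdirectory per table: /output/servers, /output/users, /output/vmLog and /output/vms, each holding the part-m-* files written by the map tasks)

As for question 3 on Hive: the same tool can load every table into a specific Hive database. This is only a sketch; it assumes your Sqoop build supports the --hive-database option, and spice_hive is a hypothetical target database name:

[hadoopUser@secondmgt ~]$ sqoop-import-all-tables --connect jdbc:mysql://secondmgt:3306/spice --username hive -P --hive-import --hive-database spice_hive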