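In the previous article, I described one way to get Hive analysis results into MySQL: store the results in another Hive table, then use Sqoop to load that table directly into MySQL. I think that approach can be very effective when the data volume is especially large. In general, though, the results of Hive analysis, queries, and statistics are not very large. For that case I tried connecting to Hive with the Hive JDBC driver and writing the query result set directly into the database through the MySQL JDBC driver. It worked, and it was also much faster than the Sqoop approach.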
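The core of this approach can be sketched as a small copy loop: read each row from the Hive result set and add it to a batched MySQL insert. This is only a minimal sketch under assumptions, not necessarily the code used in this article. The class name `HiveToMySqlCopy`, the helper names, and the batch size parameter are hypothetical, and the sketch assumes the MySQL target table has the same column order as the Hive query.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper class, for illustration only.
final class HiveToMySqlCopy {

    /** Builds "INSERT INTO <table> VALUES (?, ?, ...)" with one placeholder per column. */
    static String buildInsertSql(String table, int columnCount) {
        if (columnCount < 1) {
            throw new IllegalArgumentException("columnCount must be >= 1");
        }
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table).append(" VALUES (");
        for (int i = 0; i < columnCount; i++) {
            sql.append(i == 0 ? "?" : ", ?");
        }
        return sql.append(")").toString();
    }

    /**
     * Copies every row of the Hive result set into the MySQL prepared statement,
     * sending a batch every batchSize rows. Returns the number of rows read.
     */
    static long copyRows(ResultSet hiveRows, PreparedStatement mysqlInsert, int batchSize)
            throws SQLException {
        int columnCount = hiveRows.getMetaData().getColumnCount();
        long copied = 0;
        while (hiveRows.next()) {
            // JDBC column indexes start at 1.
            for (int col = 1; col <= columnCount; col++) {
                mysqlInsert.setObject(col, hiveRows.getObject(col));
            }
            mysqlInsert.addBatch();
            if (++copied % batchSize == 0) {
                mysqlInsert.executeBatch();
            }
        }
        if (copied % batchSize != 0) {
            mysqlInsert.executeBatch(); // flush the final partial batch
        }
        return copied;
    }
}
```

In use, `hiveRows` would come from a `Statement.executeQuery` call on a Hive JDBC connection, and `mysqlInsert` from `prepareStatement(buildInsertSql(...))` on a MySQL JDBC connection. Batching keeps the number of round trips to MySQL small, which matters more than anything else for result sets of a few thousand rows.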
1. Start the Hive metastore service
[hadoopUser@secondmgt ~]$ hive --service metastore
Starting Hive Metastore Server
15/04/22 14:53:12 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/04/22 14:53:12 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/04/22 14:53:12 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/04/22 14:53:12 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/04/22 14:53:12 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/04/22 14:53:12 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/04/22 14:53:12 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/04/22 14:53:12 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
2. Start the HiveServer2 service
[hadoopUser@secondmgt ~]$ hive --service hiveserver2
Starting HiveServer2
15/04/22 14:58:22 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/04/22 14:58:22 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/04/22 14:58:22 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/04/22 14:58:22 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/04/22 14:58:22 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/04/22 14:58:22 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/04/22 14:58:22 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/04/22 14:58:22 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
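Once HiveServer2 is running, a JDBC client reaches it through a `jdbc:hive2://` URL. The sketch below shows how such a URL could be formed and a connection opened. It rests on several assumptions: the host name `secondmgt` is taken from the shell prompt above, port 10000 is the HiveServer2 default and may have been changed in your configuration, and the empty password assumes no authentication is configured. The class name `HiveServer2Url` is hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper class, for illustration only.
final class HiveServer2Url {

    /** Formats a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database. */
    static String of(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    /**
     * Opens a connection to HiveServer2. This needs the hive-jdbc driver
     * (org.apache.hive.jdbc.HiveDriver) on the classpath and a running server.
     * The empty password is an assumption (no authentication configured).
     */
    static Connection connect(String host, int port, String database, String user)
            throws SQLException {
        return DriverManager.getConnection(of(host, port, database), user, "");
    }
}
```

For the setup shown here, `HiveServer2Url.of("secondmgt", 10000, "default")` yields `jdbc:hive2://secondmgt:10000/default`.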
3. The Hive table mapped to the HBase table
Part of the query output is shown below. HBase currently holds 6649 rows.
hive> select * from transjtxx_hbase;
32108800000000004620140317000817 02 03 苏K22F91 0.00 3 1 0 0
32108800000000004620140317000820 02 03 苏HP062H 0.00 6 1 0 0
32108800000000004620140317000823 02 03