将Hive统计分析结果导入到MySQL数据库表中（一）——Sqoop导入方式

最新推荐文章于 2021-03-13 03:11:34 发布

NIITYZU

最新推荐文章于 2021-03-13 03:11:34 发布

阅读量1.3w

点赞数 3

分类专栏： Hive学习

本文链接：https://blog.csdn.net/NIITYZU/article/details/45190787

版权

本文介绍了如何将Hive中对HBase数据进行统计分析后的结果，通过Sqoop导入到MySQL数据库的过程。首先创建Hive外部表关联HBase，然后将Hive查询结果插入到临时表，接着配置MySQL表结构与Hive表匹配，并解决因字段分隔符和关键字冲突导致的Sqoop导出问题。当MySQL表已有数据时，需要清理相关进程以完成导入。

摘要由CSDN通过智能技术生成

最近在做一个交通流的数据分析，需求是对于海量的城市交通数据，需要使用MapReduce清洗后导入到HBase中存储，然后使用Hive外部表关联HBase，对HBase中数据进行查询、统计分析，将分析结果保存在一张Hive表中，最后使用Sqoop将该表中数据导入到MySQL中。整个流程大概如下：

下面我主要介绍Hive关联HBase表——Sqoop导出Hive表到MySQL这些流程，原始数据集收集、MapReduce清洗及WEB界面展示此处不介绍。

一、HBase数据库表

hbase(main):003:0> list
TABLE
transtable
1 row(s) in 0.0250 seconds

=> ["transtable"]
hbase(main):004:0> describe 'transtable'
DESCRIPTION                                                                                                        ENABLED
 'transtable', {NAME => 'jtxx', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VER true
 SIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLO
 CKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0480 seconds

创建一个名叫“transtable”的HBase表，列族是：“jtxx”。HBase中的部分数据如下：

hbase(main):008:0> get 'transtable','32108417000000013220140317000701'
COLUMN                                        CELL
 jtxx:cdbh                                    timestamp=1429597736296, value=03
 jtxx:clbj                                    timestamp=1429597736296, value=0
 jtxx:cllb                                    timestamp=1429597736296, value=0
 jtxx:cllx                                    timestamp=1429597736296, value=3
 jtxx:clsd                                    timestamp=1429597736296, value=127.00
 jtxx:hphm                                    timestamp=1429597736296, value=\xE8\x8B\x8FKYV152
 jtxx:wflx                                    timestamp=1429597736296, value=0
 jtxx:xsfx                                    timestamp=1429597736296, value=03
8 row(s) in 0.1550 seconds

二、创建Hive外部表关联HBase表

create external table transJtxx_Hbase
 (
    clxxbh string,
    xsfx string,
    cdbh string,
    hphm string,
    clsd string,
    cllx string,
    clbj string,
    cllb string,
    wflx string
)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
with serdeproperties ("hbase.columns.mapping" =":key,jtxx:xsfx,jtxx:cdbh,jtxx:hphm,jtxx:clsd,jtxx:cllx,jtxx:clbj,jtxx:cllb,jtxx:wflx") TBLPROPERTIES ("hbase.table.name" = "transtable");

hbase.columns.mapping要对应hbase数据库transtable表中列族下的列限定符。此处一定要是外部表

查看是否关联成功，如何执行一条语句能够查询出HBase表中数据，则关联成功。

hive> select *  from transjtxx_hbase where clxxbh like '321084170000000132%';
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1428394594787_0007, Tracking URL = http://secondmgt:8088/proxy/application_1428394594787_0007/
Kill Command = /home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/bin/hadoop job  -kill job_1428394594787_0007
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-04-21 17:27:18,136 Stage-1 map = 0%,  reduce = 0%
2015-04-21 17:27:35,029 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 12.31 sec
MapRed

最低0.47元/天解锁文章

NIITYZU

关注

3
点赞
踩
26

收藏

觉得还不错? 一键收藏
0
评论
将Hive统计分析结果导入到MySQL数据库表中（一）——Sqoop导入方式

最近在做一个交通流的数据分析，需求是对于海量的城市交通数据，需要使用MapReduce清洗后导入到HBase中存储，然后使用Hive外部表关联HBase，对HBase中数据进行查询、统计分析，将分析结果保存在一张Hive表中，最后使用Sqoop将该表中数据导入到MySQL中。整个流程大概如下：下面我主要介绍Hive关联HBase表——Sqoop导出Hive表到MySQL这
复制链接

扫一扫

专栏目录