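HBase is a non-relational, distributed database, so it cannot perform operations such as multi-table joins that relational databases support. Because HBase offers high capacity and high throughput, however, we can keep the data in HBase and map the HBase table into Hive, where it can be queried with HiveQL. The example below walks through the mapping.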
I. Prepare the HBase table
1. Create a namespace
create_namespace 'testConnectHive'
2. Create a student table in the namespace, with the column families stuinfo and schoolinfo
create 'testConnectHive:student','stuinfo','schoolinfo'
3. Insert data into the table
hbase(main):004:0> put 'testConnectHive:student','1001','stuinfo:stuname','zhangsan'
Took 0.6039 seconds
hbase(main):005:0> put 'testConnectHive:student','1001','stuinfo:stuno','1001'
Took 0.0107 seconds
hbase(main):006:0> put 'testConnectHive:student','1001','stuinfo:stugender','male'
Took 0.1695 seconds
hbase(main):007:0> put 'testConnectHive:student','1001','schoolinfo:schoolname','beijingdaxue'
Took 0.0342 seconds
hbase(main):008:0> put 'testConnectHive:student','1001','schoolinfo:schoollocation','beijing'
4. View the table definition
desc 'testConnectHive:student'
II. Map the HBase table into Hive
1. Start the HiveServer2 remote service
nohup hive --service hiveserver2 &
2. Create a Hive table that maps to the HBase table
create database testConnectHive;
use testConnectHive;
create external table student(
id string,
stuname string,
stuno string,
schoolname string,
schoollocation string
)stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with
serdeproperties ('hbase.columns.mapping'=':key,stuinfo:stuname,stuinfo:stuno,schoolinfo:schoolname,schoolinfo:schoollocation')
tblproperties ('hbase.table.name'='testConnectHive:student');
select * from student;
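
As a follow-up, here is a minimal HiveQL sketch of two things the mapping makes possible. It assumes the external table student above was created successfully. The names student_cf and score are hypothetical and are not part of the original setup: student_cf maps each whole column family to a Hive map (which also exposes stuinfo:stugender, a column the mapping above leaves out), and score stands in for an ordinary Hive table to join against.

```sql
use testConnectHive;

-- Assumption: student_cf is a hypothetical second external table over the same HBase table.
-- A mapping entry that ends with ':' (for example 'stuinfo:') maps the whole column family
-- to a Hive map<string,string> keyed by column qualifier.
create external table student_cf(
id string,
stuinfo map<string,string>,
schoolinfo map<string,string>
)stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with
serdeproperties ('hbase.columns.mapping'=':key,stuinfo:,schoolinfo:')
tblproperties ('hbase.table.name'='testConnectHive:student');

-- Read a qualifier that the fixed-column mapping did not expose.
select id, stuinfo['stuname'], stuinfo['stugender'] from student_cf;

-- Assumption: score is a hypothetical native Hive table used only to illustrate a join.
create table if not exists score(
stuno string,
course string,
points int
);

-- Join the HBase-backed table with the Hive table, which HBase alone cannot do.
select s.stuname, s.schoolname, c.course, c.points
from student s
join score c on s.stuno = c.stuno;
```

Because student and student_cf are external tables, dropping them in Hive removes only the Hive metadata; the rows stay in HBase.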