Integrating HBase with Hive

Environment: hadoop-2.6.0 (Master, Slave1, Slave2), hbase-0.98.6-hadoop2, hive-1.2.1

1. Hive–HBase integration requires seven jars: guava, hbase-common, hbase-server, hbase-client, hbase-protocol, hbase-it, and htrace-core.

Go into $HIVE_HOME/lib and $HBASE_HOME/lib and check whether Hive and HBase ship the same version of the guava jar. If the versions differ, remove Hive's copy from hive/lib:

rm -rf guava-XX.jar

Then, from hbase/lib, copy guava-12.0.1.jar into hive/lib, together with the remaining six jars:

[root@Master lib]# cp guava-12.0.1.jar /usr/soft/hive-1.2.1/lib/

[root@Master lib]# cp hbase-common-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp hbase-server-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp hbase-client-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp hbase-protocol-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp hbase-it-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp htrace-core-2.04.jar /usr/soft/hive-1.2.1/lib/
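The check-and-copy step above can be sketched as a small shell script. The mktemp directories below are placeholders that only stand in for real installations so the sketch is self-contained; in practice, point HIVE_HOME and HBASE_HOME at the actual /usr/soft/... paths.

```shell
# Placeholder layout standing in for the real installs (adjust to your paths).
BASE=$(mktemp -d)
HIVE_HOME="$BASE/hive-1.2.1"; HBASE_HOME="$BASE/hbase-0.98.6-hadoop2"
mkdir -p "$HIVE_HOME/lib" "$HBASE_HOME/lib"
touch "$HIVE_HOME/lib/guava-11.0.2.jar"   # Hive's own (mismatched) guava
JARS="guava-12.0.1.jar hbase-common-0.98.6-hadoop2.jar hbase-server-0.98.6-hadoop2.jar hbase-client-0.98.6-hadoop2.jar hbase-protocol-0.98.6-hadoop2.jar hbase-it-0.98.6-hadoop2.jar htrace-core-2.04.jar"
for j in $JARS; do touch "$HBASE_HOME/lib/$j"; done

# The actual procedure: drop Hive's guava, then copy the seven jars over.
rm -f "$HIVE_HOME"/lib/guava-*.jar
for j in $JARS; do cp "$HBASE_HOME/lib/$j" "$HIVE_HOME/lib/"; done
ls "$HIVE_HOME/lib"
```

After the loop, hive/lib contains exactly the seven jars copied from hbase/lib, and the old guava is gone.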

2. Edit hive-site.xml under hive/conf and append the following property at the end:

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>Master</value>
</property>
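If ZooKeeper runs on more than one node rather than only on Master, hbase.zookeeper.quorum takes a comma-separated host list; a sketch using this cluster's node names:

```xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>Master,Slave1,Slave2</value>
</property>
```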


3. Start Hive. HBase and Hive can be integrated in two ways. The first is to create a Hive-managed table, hbase_table_1, whose data is stored in an HBase table. In the column mapping, :key maps to the HBase row key and cf1:val to column val in column family cf1:


hive (default)> CREATE TABLE hbase_table_1(key int, value string)   
              > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'  
              > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")  
              > TBLPROPERTIES ("hbase.table.name" = "xyz"); 
OK
Time taken: 4.938 seconds

Check in the HBase shell that the xyz table was created:

hbase(main):004:0> list
TABLE                                                                           
basic                                                                           
sub_user                                                                        
test                                                                            
xyz                                                                             
4 row(s) in 0.0310 seconds

=> ["basic", "sub_user", "test", "xyz"]


Insert data into the Hive table hbase_table_1:


hive (default)> insert overwrite table hbase_table_1 select empno, ename from emp;
Query ID = root_20170825100051_a6ec3c4e-f78c-4a63-9db3-291d9e73c0f9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1502944428192_0002, Tracking URL = http://10.226.118.24:8888/proxy/application_1502944428192_0002/
Kill Command = /usr/soft/hadoop-2.6.0/bin/hadoop job  -kill job_1502944428192_0002
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2017-08-25 10:01:09,798 Stage-0 map = 0%,  reduce = 0%
2017-08-25 10:01:12,978 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.72 sec
MapReduce Total cumulative CPU time: 1 seconds 720 msec
Ended Job = job_1502944428192_0002
MapReduce Jobs Launched: 
Stage-Stage-0: Map: 1   Cumulative CPU: 1.72 sec   HDFS Read: 9729 HDFS Write: 263148 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 720 msec
OK
empno	ename
Time taken: 22.516 seconds


Check that the HBase table xyz now contains the data:

hbase(main):006:0> scan 'xyz'
ROW                   COLUMN+CELL                                               
 7369                 column=cf1:val, timestamp=1503626471989, value=SMITH      
 7499                 column=cf1:val, timestamp=1503626471989, value=ALLEN      
 7521                 column=cf1:val, timestamp=1503626471989, value=WARD       
 7566                 column=cf1:val, timestamp=1503626471989, value=JONES      
4 row(s) in 0.0800 seconds

4. The second approach is to create an external table, hbase_test, over a table that already exists in HBase (here, test). Because the table is external, dropping it in Hive leaves the underlying HBase table intact:

hbase(main):003:0> scan 'test'
ROW                   COLUMN+CELL                                               
 10002                column=cf:age, timestamp=1502847463784, value=56          
 10002                column=cf:name, timestamp=1502847451295, value=zhangsan   
 10003                column=cf:age, timestamp=1503279594383, value=35          
 10003                column=cf:name, timestamp=1502847534361, value=zhaoliu    
2 row(s) in 0.2230 seconds

hive (default)> create external table hbase_test(id int, name string, age int)
              > stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
              > with serdeproperties ("hbase.columns.mapping" = ":key,cf:name,cf:age")
              > tblproperties ("hbase.table.name" = "test");
OK
Time taken: 2.619 seconds

hive (default)> select * from hbase_test ;
OK
hbase_test.id	hbase_test.name	hbase_test.age
10002	zhangsan	56
10003	zhaoliu	35
Time taken: 0.595 seconds, Fetched: 2 row(s)







