I. Integrating Hive with HBase
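Hive is often used together with HBase, with HBase serving as Hive's storage location, so integrating the two is especially important. Once Hive can read data in HBase, you can run HQL statements against HBase tables to query and insert data, and even run complex queries such as JOIN and UNION. This feature was introduced in Hive 0.6.0. The integration works by having the two systems talk to each other through their public APIs, and that communication relies mainly on the classes in the hive-hbase-handler-*.jar library. Operating on HBase tables through Hive is only a convenience: the HiveQL engine runs on MapReduce here, so performance is unsatisfactory.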
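To make the idea of querying HBase through HQL concrete, here is a minimal sketch, assuming the HBaseStorageHandler class that ships in hive-hbase-handler. The helper name write_hbase_table_hql, the table names hive_hbase_demo and hbase_demo, and the column family cf are all hypothetical and are not taken from this article's setup. The function only writes a HiveQL script to a file; it does not contact Hive or HBase.

```shell
# write_hbase_table_hql is a hypothetical helper (not part of Hive or HBase).
# It writes a HiveQL script to the file given as $1.
write_hbase_table_hql() {
  cat > "$1" <<'EOF'
-- hive_hbase_demo, hbase_demo and the column family cf are hypothetical names.
-- ":key" maps the first Hive column to the HBase row key;
-- "cf:val" maps the second Hive column to qualifier val in column family cf.
CREATE TABLE hive_hbase_demo (key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:val")
TBLPROPERTIES ("hbase.table.name" = "hbase_demo");

-- Once defined, the table is queried with ordinary HiveQL.
SELECT key, value FROM hive_hbase_demo LIMIT 10;
EOF
}
```

Once the setup steps are complete, the generated file could be run with something like `write_hbase_table_hql /tmp/hbase_demo.hql && hive -f /tmp/hbase_demo.hql`.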
Steps:
1. Copy the HBase-related JARs into hive/lib, as follows:
[hadoop@bus-stable hive]$ cp /opt/hbase/lib/hbase-protocol-1.4.5.jar /opt/hive/lib/
[hadoop@bus-stable hive]$ cp /opt/hbase/lib/hbase-server-1.4.5.jar /opt/hive/lib/
[hadoop@bus-stable hive]$ cp /opt/hbase/lib/hbase-client-1.4.5.jar /opt/hive/lib/
[hadoop@bus-stable hive]$ cp /opt/hbase/lib/hbase-common-1.4.5.jar /opt/hive/lib/
[hadoop@bus-stable hive]$ cp /opt/hbase/lib/hbase-common-1.4.5-tests.jar /opt/hive/lib/
[hadoop@bus-stable hive]$
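The five cp commands above differ only in the JAR name, so they could be wrapped in a small loop. This is a sketch under assumptions: copy_hbase_jars is a hypothetical helper, and it expects the JARs to follow the naming pattern shown above (name-version.jar, plus hbase-common-version-tests.jar).

```shell
# copy_hbase_jars is a hypothetical helper (not part of Hive or HBase).
# Usage: copy_hbase_jars <hbase-lib-dir> <hive-lib-dir> <hbase-version>
# It stops at the first missing or uncopyable JAR and returns a non-zero status.
copy_hbase_jars() {
  chj_src=$1
  chj_dst=$2
  chj_version=$3
  for chj_jar in \
    "hbase-protocol-${chj_version}.jar" \
    "hbase-server-${chj_version}.jar" \
    "hbase-client-${chj_version}.jar" \
    "hbase-common-${chj_version}.jar" \
    "hbase-common-${chj_version}-tests.jar"
  do
    if [ ! -f "$chj_src/$chj_jar" ]; then
      echo "missing: $chj_src/$chj_jar" >&2
      return 1
    fi
    cp "$chj_src/$chj_jar" "$chj_dst/" || return 1
  done
}
```

With the paths and version used in this article, the call would be `copy_hbase_jars /opt/hbase/lib /opt/hive/lib 1.4.5`.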
2. Reference HBase in hive-site.xml by adding the following:
[hadoop@bus-stable hive]$ vim /opt/hive/conf/hive-site.xml
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///opt/hive/lib/hive-hbase-handler-2.3.3.jar,
file:///opt/hive/lib/hbase-protocol-1.4.5.jar,
file:///opt/hive/lib/hbase-server-1.4.5.jar,
file:///opt/hive/lib/hbase-client-1.4.5.jar,
file:///opt/hive/lib/hbase-common-1.4.5.jar,
file:///opt/hive/lib/hbase-common-1.4.5-tests.jar,
file:///opt/hive/lib/zookeeper-3.4.6.jar,
file:///opt/hive/lib/guava-14.0.1.jar</value>
  <description>The location of the plugin jars that contain implementations of user defined functions and serdes.</description>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>open-stable,permission-stable,sp-stable</value>
</property>
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
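A wrong file name or version number in hive.aux.jars.path is easy to miss, so it can help to confirm that every listed JAR exists before starting Hive. The following is a minimal sketch, not a Hive feature: check_aux_jars is a hypothetical helper that takes the comma-separated value, strips the file:// prefix from each entry, and prints the entries that are not regular files. It assumes the paths contain no spaces or glob characters.

```shell
# check_aux_jars is a hypothetical helper (not part of Hive).
# $1 is the comma-separated value of hive.aux.jars.path.
# Prints "missing: <path>" for each absent file; returns 1 if any is missing.
check_aux_jars() {
  caj_missing=0
  # Turn commas into spaces; unquoted expansion then splits on spaces and newlines.
  for caj_entry in $(printf '%s' "$1" | tr ',' ' '); do
    caj_path=${caj_entry#file://}   # file:///opt/x.jar becomes /opt/x.jar
    if [ ! -f "$caj_path" ]; then
      echo "missing: $caj_path"
      caj_missing=1
    fi
  done
  return "$caj_missing"
}
```

For example, `check_aux_jars "file:///opt/hive/lib/hive-hbase-handler-2.3.3.jar,file:///opt/hive/lib/guava-14.0.1.jar"` prints nothing and returns 0 when both files are present.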
3. Start Hive:
[hadoop@bus-stable hive]$ hive -hiveconf hbase.master=oversea-stable:60000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/opt/apache-hive-2.3.3-bin/lib/hive-common-2.3.3.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> create table htest(key int,value string) stored