2.4 Other Integrations
2.4.1 Integrating Hive with Spark
To integrate Spark with Hive, copy hive-site.xml from $HIVE_HOME/conf into $SPARK_HOME/conf. (Apply the same configuration on all 3 servers.)
[root@bigdata2 spark-2.3.0-bin-hadoop2.7]# cd $HIVE_HOME/conf
[root@bigdata2 conf]# cp hive-site.xml $SPARK_HOME/conf
To run ./spark-sql on YARN, also place mysql-connector-java-5.1.38.jar under $SPARK_HOME/jars.
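The two copy steps above can be wrapped in one function and repeated on each server. This is only a minimal sketch: the function name link_hive_to_spark is hypothetical, and the jar location in the usage line is an assumption that should be replaced with wherever the connector jar actually lives.

```shell
# Hypothetical helper: copy hive-site.xml (and optionally a JDBC driver jar)
# into a Spark installation.
# Arguments: <hive_home> <spark_home> [path to the MySQL connector jar]
link_hive_to_spark() {
  local hive_home="$1" spark_home="$2" jdbc_jar="${3:-}"
  if [ ! -f "$hive_home/conf/hive-site.xml" ]; then
    echo "hive-site.xml not found under $hive_home/conf" >&2
    return 1
  fi
  # Create the target directories if they are missing, then copy.
  mkdir -p "$spark_home/conf" "$spark_home/jars"
  cp "$hive_home/conf/hive-site.xml" "$spark_home/conf/" || return 1
  if [ -n "$jdbc_jar" ]; then
    cp "$jdbc_jar" "$spark_home/jars/" || return 1
  fi
}

# Usage (the jar path below is an assumed location):
# link_hive_to_spark "$HIVE_HOME" "$SPARK_HOME" /path/to/mysql-connector-java-5.1.38.jar
```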
2.4.2 Integrating Hive with HBase
(1) Edit hive-site.xml and add the following property (the ZooKeeper addresses)
<property>
<name>hbase.zookeeper.quorum</name>
<value>bigdata2:2181,bigdata3:2181,bigdata4:2181,bigdata5:2181,bigdata6:2181</value>
</property>
(2) Add the HBase dependency jars
Put the jars from the lib folder of the HBase installation directory on Hive's classpath.
Add the following to hive-env.sh:
export HIVE_CLASSPATH=$HIVE_CLASSPATH:$HBASE_HOME/lib/*
Sync the above configuration to the other 2 machines.
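Java treats a plain directory on the classpath as a place to look for .class files, not for jars, so one alternative is to list every jar explicitly. The following is a minimal sketch of that idea; the function name jars_to_classpath is hypothetical and not part of Hive or HBase.

```shell
# Hypothetical helper: print a colon-separated classpath containing
# every *.jar file found directly inside the given directory.
jars_to_classpath() {
  local dir="$1" cp="" jar
  for jar in "$dir"/*.jar; do
    # Skip the unexpanded pattern when the directory has no jars.
    [ -e "$jar" ] || continue
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
}

# Possible use inside hive-env.sh:
# export HIVE_CLASSPATH="$HIVE_CLASSPATH:$(jars_to_classpath "$HBASE_HOME/lib")"
```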
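2.5 Connection Methods
2.5.1 CLI Connection
2.5.2 HiveServer2/beeline
For more on using beeline, see: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients (it describes the HiveServer2/beeline configuration in more detail)
The latest Hive version in use here is hive-2.3.5. It requires the following changes to the Hadoop cluster; without them HiveServer2/beeline cannot be used.
2.5.2.1 Modify the following settings in the Hadoop cluster's core-site.xml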
<property>
<name>hadoop.proxyuser.root.hosts</name>
<!-- <value>master</value> -->
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<!-- <value>hadoop</value> -->
<value>*</value>
</property>
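Notes on this configuration:
1. Setting hadoop.proxyuser.root.hosts to * means the proxy user root may connect from any host and still access the HDFS cluster.
2. hadoop.proxyuser.root.groups specifies the groups whose members the proxy user root may impersonate; * allows any group.
3. The root in hadoop.proxyuser.root.hosts is the name of the proxy user, here the OS user under which Hadoop was installed and runs.
Restart the Hadoop cluster for these changes to take effect.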
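Because the user name is part of the property name, the same pair of properties has to be rewritten for any other proxy user. The sketch below only prints the XML for a given user; it does not edit core-site.xml. The function name proxyuser_properties is hypothetical.

```shell
# Hypothetical helper: print the two proxy-user properties for a user.
# Arguments: <user> [hosts, default *] [groups, default *]
proxyuser_properties() {
  local user="$1" hosts="${2:-*}" groups="${3:-*}"
  cat <<EOF
<property>
  <name>hadoop.proxyuser.${user}.hosts</name>
  <value>${hosts}</value>
</property>
<property>
  <name>hadoop.proxyuser.${user}.groups</name>
  <value>${groups}</value>
</property>
EOF
}

# Prints the configuration used in this section (user root, any host, any group).
proxyuser_properties root
```

Passing explicit hosts and groups, for example proxyuser_properties root hadoop1 hadoop, gives a narrower configuration than the wildcard values.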
2.5.2.2 Modify the following in hive-site.xml:
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.server2.thrift.client.user</name>
<value>root</value>
<description>Username to use against thrift client</description>
</property>
<property>
<name>hive.server2.thrift.client.password</name>
<value>123456</value>
<description>Password to use against thrift client</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>hadoop1</value>
<description>Bind host on which to run the HiveServer2 Thrift service.</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
</property>
<property>
<name>hive.server2.thrift.http.port</name>
<value>10001</value>
<description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'http'.</description>
</property>
Note that the user name root and the password here are the operating system login name and password.
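To double-check which host and port were actually written to hive-site.xml, the value of a property can be read back from the file. This is a rough sketch that assumes the one-tag-per-line layout shown above (it is not a real XML parser); the function name get_property is hypothetical.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop-style XML file laid out one tag per line.
# Arguments: <file> <property name>
get_property() {
  local file="$1" name="$2"
  awk -v n="$name" '
    /<!--/   { next }                      # ignore commented-out lines
    /<name>/ { gsub(/.*<name>|<\/name>.*/, ""); found = ($0 == n); next }
    found && /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); print; exit }
  ' "$file"
}

# Usage:
# get_property "$HIVE_HOME/conf/hive-site.xml" hive.server2.thrift.port
```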
Then run:
nohup hive --service hiveserver2 &
Or start it in a way similar to the following:
nohup hiveserver2 1>/home/hadoop/hiveserver.log 2>/home/hadoop/hiveserver.err &
Or: nohup hiveserver2 1>/dev/null 2>/dev/null &
Or: nohup hiveserver2 >/dev/null 2>&1 &
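HiveServer2 usually needs some time after launch before it accepts connections, so a beeline connection attempted immediately may be refused. The following is a minimal sketch of a wait loop. It assumes bash, because it relies on bash's /dev/tcp pseudo-device; the function name wait_for_port is hypothetical, and hadoop1 and 10000 are the bind host and port configured above.

```shell
# Hypothetical helper (bash only): wait until a TCP port accepts connections.
# Arguments: <host> <port> [timeout in seconds, default 60]
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-60}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # Opening /dev/tcp/<host>/<port> succeeds only if something is listening.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Usage:
# nohup hiveserver2 >/dev/null 2>&1 &
# wait_for_port hadoop1 10000 120 && echo "HiveServer2 is accepting connections"
```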
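Ways to log in to beeline:
beeline -u jdbc:hive2://hadoop1:10000 -n root
-u: specifies the JDBC connection URL of HiveServer2
-n: specifies the user name (the password is passed separately with -p)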
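The URL passed to -u is easy to mistype (the :// after hive2 is required), so it can help to build it in one place. A minimal sketch follows; the function name hive2_url is hypothetical, and the default port 10000 and database name default are assumptions based on the configuration above.

```shell
# Hypothetical helper: build a HiveServer2 JDBC URL.
# Arguments: <host> [port, default 10000] [database, default "default"]
hive2_url() {
  local host="$1" port="${2:-10000}" db="${3:-default}"
  printf 'jdbc:hive2://%s:%s/%s\n' "$host" "$port" "$db"
}

# Usage (runs one statement non-interactively with beeline's -e option):
# beeline -u "$(hive2_url hadoop1)" -n root -p 123456 -e 'show databases;'
```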
There is also another way to connect:
start beeline first, then enter: !connect jdbc:hive2://hadoop1:10000
[root@hadoop1 apache-hive-2.3.4-bin]# bin/beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/installed/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/installed/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 2.3.4 by Apache Hive
beeline> !connect jdbc:hive2://hadoop1:10000
Connecting to jdbc:hive2://hadoop1:10000
Enter username for jdbc:hive2://hadoop1:10000:
Enter password for jdbc:hive2://hadoop1:10000:
Connected to: Apache Hive (version 2.3.4)
Driver: Hive JDBC (version 2.3.4)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoop1:10000>
Another example:
[root@hadoop1 apache-hive-2.3.4-bin]# beeline -u jdbc:hive2://hadoop1:10000 -n root
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/installed/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/installed/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://hadoop1:10000
Connected to: Apache Hive (version 2.3.4)
Driver: Hive JDBC (version 2.3.4)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 2.3.4 by Apache Hive
0: jdbc:hive2://hadoop1:10000>
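2.5.3 Hive WUI
Omitted for now.
2.5.4 Integration with SQuirreL SQL Client
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients (the bottom of that page covers this integration)
2.5.5 Integration with Oracle SQL Developer
Hive can also be integrated with Oracle SQL Developer: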
https://community.hortonworks.com/articles/1887/connect-oracle-sql-developer-to-hive.html