1. Hive Installation and Deployment
(1) Upload apache-hive-3.1.2-bin.tar.gz to the /opt/software directory on the Linux host
(2) Extract apache-hive-3.1.2-bin.tar.gz into the /opt/module/ directory
[muzili@hadoop102 software]$ tar -zxvf /opt/software/apache-hive-3.1.2-bin.tar.gz -C /opt/module/
(3) Rename the extracted apache-hive-3.1.2-bin directory to hive
[muzili@hadoop102 software]$ mv /opt/module/apache-hive-3.1.2-bin/ /opt/module/hive
(4) Edit /etc/profile.d/my_env.sh to add the environment variables
[muzili@hadoop102 software]$ sudo vim /etc/profile.d/my_env.sh
Add the following content:
#HIVE_HOME
export HIVE_HOME=/opt/module/hive
export PATH=$PATH:$HIVE_HOME/bin
Restart the shell session (e.g. reopen the Xshell window) or source /etc/profile.d/my_env.sh so the environment variables take effect
[muzili@hadoop102 software]$ source /etc/profile.d/my_env.sh
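The two export lines above just append Hive's bin directory to PATH. A minimal sanity check, assuming the /opt/module/hive install path used in this guide, can be sketched as:

```shell
# Sketch: verify the environment variables from my_env.sh took effect.
# /opt/module/hive is the install path assumed throughout this guide.
export HIVE_HOME=/opt/module/hive
export PATH=$PATH:$HIVE_HOME/bin

echo "$HIVE_HOME"
# Wrap PATH in colons so the match cannot hit a partial directory name.
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "hive bin is on PATH" ;;
  *)                    echo "hive bin is NOT on PATH" ;;
esac
```

If the second line prints "hive bin is NOT on PATH", the file was not sourced in the current session.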
(5) Resolve the logging jar conflict. In the /opt/module/hive/lib directory, rename the conflicting SLF4J binding:
[muzili@hadoop102 lib]$ mv log4j-slf4j-impl-2.10.0.jar log4j-slf4j-impl-2.10.0.jar.bak
(6) In the $HIVE_HOME/conf directory, copy hive-env.sh.template to hive-env.sh and set the following variables
[muzili@hadoop102 conf]$ cp hive-env.sh.template hive-env.sh
[muzili@hadoop102 conf]$ vim hive-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_212
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export HADOOP_CONF_DIR=/opt/module/hadoop-3.1.3/etc/hadoop
export HIVE_HOME=/opt/module/hive
export HIVE_CONF_DIR=/opt/module/hive/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib
2. Storing Hive Metadata in MySQL
2.1 Copy the JDBC driver
Copy the MySQL JDBC driver into Hive's lib directory
[muzili@hadoop102 lib]$ cp /opt/software/mysql-connector-java-5.1.27.jar /opt/module/hive/lib/
2.2 Configure the Metastore to use MySQL
(1) Create a new hive-site.xml file in the $HIVE_HOME/conf directory
[muzili@hadoop102 conf]$ vim hive-site.xml
(2) Add the following content
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- JDBC connection to the MySQL database holding Hive metadata -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop102:3306/metastore?useSSL=false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>000000</value>
    </property>
    <!-- Default HDFS location for Hive-managed tables -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <!-- HiveServer2 Thrift endpoint -->
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>hadoop102</value>
    </property>
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <!-- Show column headers and the current database in the CLI -->
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
</configuration>
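A config file like this can also be written non-interactively with a heredoc, which is handy when scripting the install. The sketch below writes a cut-down hive-site.xml (only the connection URL, using the hadoop102 example value from above) to a temporary directory rather than the real conf directory:

```shell
# Sketch: generate a minimal hive-site.xml non-interactively.
# hadoop102/metastore are the example values used in this guide;
# a temp dir stands in for $HIVE_HOME/conf.
conf_dir=$(mktemp -d)
cat > "$conf_dir/hive-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop102:3306/metastore?useSSL=false</value>
    </property>
</configuration>
EOF
# Quick check that the JDBC URL landed in the file.
grep -o 'jdbc:mysql://[^<]*' "$conf_dir/hive-site.xml"
# prints jdbc:mysql://hadoop102:3306/metastore?useSSL=false
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the XML, so the file is written byte-for-byte.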
3. Starting Hive
3.1 Initialize the metastore database
(1) Log in to MySQL
[muzili@hadoop102 conf]$ mysql -uroot -p000000
(2) Create the Hive metastore database
mysql> create database metastore;
mysql> quit;
(3) Initialize the Hive metastore schema
[muzili@hadoop102 conf]$ schematool -initSchema -dbType mysql -verbose
3.2 Start the Hive client
(1) Start the Hive client
[muzili@hadoop102 hive]$ bin/hive
(2) List the databases
hive (default)> show databases;
OK
database_name
default
Note:
If you are using PostgreSQL instead of MySQL, configure it as follows.
Edit the ${HIVE_HOME}/conf/hive-site.xml file; if it does not exist, create it.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:postgresql://<ip>:5432/<db></value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>org.postgresql.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value><username></value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value><password></value>
    </property>
</configuration>
Initialize the PostgreSQL schema
$ bin/schematool -dbType postgres -initSchema -verbose
Troubleshooting:
Hive cannot access spark-assembly-*.jar when starting against Spark 2.4.4:
ls: cannot access /share/apps/spark-2.4.4/lib/spark-assembly-*.jar: No such file or directory
Starting with Spark 2, the single large jar under lib/ was split into many smaller jars, so the original spark-assembly-*.jar no longer exists and Hive cannot find it.
Fix:
Edit the /<PathToHive>/bin/hive script and replace the reference to lib/spark-assembly-*.jar with jars/*.jar; the error will no longer appear.
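That edit can be scripted with sed. The sketch below runs against a throwaway copy of the offending line rather than the real launcher; the line shown mimics the one found in older bin/hive scripts, so check your actual script before applying the same substitution to it:

```shell
# Sketch: rewrite lib/spark-assembly-*.jar to jars/*.jar in a copy of the
# hive launcher line. The echoed line mimics older bin/hive scripts; the
# variable name in your script may differ by version.
script=$(mktemp)
echo 'sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`' > "$script"
# Escape * and . so sed matches the literal glob text in the script.
sed -i 's|lib/spark-assembly-\*\.jar|jars/*.jar|' "$script"
cat "$script"
```

After the substitution the line globs over ${SPARK_HOME}/jars/*.jar, which is where Spark 2 keeps its jars.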
If you see the following error:
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
then the metastore database has not been initialized; run schematool (in Hive's bin directory) to initialize it.