For reference, the Hive version I installed is 3.1.3, running against Hadoop 3.2.0.
Download
Official download link:
https://dlcdn.apache.org/hive/hive-3.1.3/
Download apache-hive-3.1.3-bin.tar.gz and upload the archive to your Linux virtual machine.
Installation
Go to the directory containing the archive and extract it into /export/servers:
tar -xzvf ./apache-hive-3.1.3-bin.tar.gz -C /export/servers/
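A quick guarded check that the extraction landed where expected (the path below is the tar -C target used above; adjust it if you extracted elsewhere):

```shell
# The Hive home created by the extraction should contain bin/ and conf/.
HIVE_DIR=/export/servers/apache-hive-3.1.3-bin
if [ -d "$HIVE_DIR/bin" ] && [ -d "$HIVE_DIR/conf" ]; then
  EXTRACT_OK=yes
else
  EXTRACT_OK=no
  echo "Hive directory incomplete or elsewhere: $HIVE_DIR"
fi
```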
Hive configuration
1) Configure environment variables in /etc/profile by adding the following lines:
export HIVE_HOME=/export/servers/apache-hive-3.1.3-bin
export HIVE_CONF_DIR=/export/servers/apache-hive-3.1.3-bin/conf
export PATH=$PATH:$HIVE_HOME/bin
Run source /etc/profile to make the configuration take effect.
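To sanity-check the values before relying on them, the same three variables can be exported in the current shell and inspected (paths assumed from the profile snippet above):

```shell
# Same variables as in /etc/profile; exporting them inline lets you
# confirm the paths resolve as intended before Hive is first launched.
export HIVE_HOME=/export/servers/apache-hive-3.1.3-bin
export HIVE_CONF_DIR=$HIVE_HOME/conf
export PATH=$PATH:$HIVE_HOME/bin
echo "HIVE_CONF_DIR=$HIVE_CONF_DIR"
```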
2) In the conf folder under the Hive root directory, create a file named hive-site.xml and add the following content:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
  <!-- database settings: start -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive_meta?useSSL=false</value>
    <description>MySQL connection URL</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>MySQL JDBC driver class</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>database username</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>database password</description>
  </property>
  <!-- database settings: end -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive/warehouse</value>
    <description>HDFS directory used by Hive</description>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
    <description>enable Hive's concurrency mode</description>
  </property>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
    <description>lock manager class used for concurrency control</description>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>my2308-host</value>
    <description>host the HiveServer2 Thrift server binds to (use your own hostname)</description>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
    <description>port the HiveServer2 Thrift server listens on</description>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>true</value>
  </property>
  <!-- other settings: end -->
</configuration>
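A malformed hive-site.xml makes Hive fail at startup with a cryptic SAX parse error, so it is worth validating the file now. A guarded sketch using xmllint (shipped with libxml2 on most distros; the path assumes the install location used above):

```shell
# Validate that hive-site.xml is well-formed XML before starting Hive.
HIVE_SITE=/export/servers/apache-hive-3.1.3-bin/conf/hive-site.xml
if [ -f "$HIVE_SITE" ] && command -v xmllint >/dev/null 2>&1; then
  xmllint --noout "$HIVE_SITE" && XML_OK=yes || XML_OK=no
else
  XML_OK=skipped
  echo "skipping check: hive-site.xml or xmllint not found"
fi
```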
3) Edit the Hadoop configuration file core-site.xml at $HADOOP_HOME/etc/hadoop/core-site.xml, adding the following properties inside the <configuration> element:
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
  <description>groups whose members the superuser may impersonate</description>
</property>
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
  <description>hosts from which the superuser may connect as a proxy</description>
</property>
<property>
  <name>hadoop.proxyuser.root.users</name>
  <value>*</value>
  <description>users the superuser may impersonate</description>
</property>
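Hadoop reads proxy-user settings at startup, so after editing core-site.xml you need to either restart HDFS/YARN or refresh the configuration in place. A guarded sketch using the standard admin commands:

```shell
# Refresh impersonation (proxy-user) settings without a full restart.
# Run on the NameNode / ResourceManager host as the HDFS superuser.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfsadmin -refreshSuperUserGroupsConfiguration
  yarn rmadmin -refreshSuperUserGroupsConfiguration
  REFRESHED=yes
else
  REFRESHED=no
  echo "hdfs not on PATH; restart HDFS/YARN to pick up the change"
fi
```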
4) Copy the template hive-env.sh.template to hive-env.sh in /export/servers/apache-hive-3.1.3-bin/conf:
cp hive-env.sh.template hive-env.sh
In hive-env.sh, add the Hadoop installation directory (use the path of your own Hadoop install):
export HADOOP_HOME=/export/servers/hadoop-3.2.0
5) Rename the log configuration template:
mv hive-log4j2.properties.template hive-log4j2.properties
6) In MySQL, create the database hive_meta that Hive will use for its metastore:
create database hive_meta default charset utf8 collate utf8_general_ci;
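If you prefer to script this step, a non-interactive variant of the same statement (credentials are the sample root/123456 from hive-site.xml; substitute your own):

```shell
# Create the metastore database from the command line; guarded so the
# snippet degrades gracefully when the mysql client is absent.
if command -v mysql >/dev/null 2>&1; then
  mysql -uroot -p123456 -e \
    "create database if not exists hive_meta default charset utf8 collate utf8_general_ci;"
  DB_RC=$?
else
  DB_RC=127
  echo "mysql client not on PATH"
fi
```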
7) Copy the MySQL JDBC driver jar into /export/servers/apache-hive-3.1.3-bin/lib; note your MySQL version and choose the matching connector.
cp mysql-connector-java-5.1.40-bin.jar /export/servers/apache-hive-3.1.3-bin/lib
8) Remove the conflicting log4j jar (log4j-slf4j-impl-2.4.1.jar), which clashes with the copy bundled with Hadoop:
rm -f /export/servers/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.4.1.jar
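After the removal, Hive's lib directory should hold no log4j-slf4j-impl jar; otherwise SLF4J prints a "multiple bindings" warning on every command. A guarded check (path assumed from the install location above):

```shell
# Count leftover log4j-slf4j-impl jars in Hive's lib directory.
LIB=/export/servers/apache-hive-3.1.3-bin/lib
if [ -d "$LIB" ]; then
  LEFT=$(ls "$LIB" 2>/dev/null | grep -c 'log4j-slf4j-impl')
else
  LEFT=0
fi
echo "conflicting bindings left: $LEFT"
```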
9) Initialize the MySQL metastore schema:
schematool -dbType mysql -initSchema
schematool is Hive's built-in schema management tool, located in the bin directory under the Hive root. After initialization, you should see that the hive_meta database now contains a large number of metastore tables.
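To confirm the initialization from the Hive side as well, schematool can report the schema version it finds in the metastore; for a fresh 3.1.3 install it should print 3.1.0. A guarded sketch:

```shell
# Query the schema version recorded in the metastore.
if command -v schematool >/dev/null 2>&1; then
  schematool -dbType mysql -info
  CHECKED=yes
else
  CHECKED=no
  echo "schematool not on PATH; run 'source /etc/profile' first"
fi
```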