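Preface
Introduction to Hive
Hive is an open-source data warehouse tool built on Hadoop for storing and processing massive amounts of structured data. Facebook open-sourced it in August 2008, and it exposes HQL (Hive SQL), a SQL-like query language, as its data access interface.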
1. Install MySQL first
(1) Install the wget command
yum -y install wget
(2) Download the MySQL repo package
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
(3) Install the mysql-community-release-el7-5.noarch.rpm package
rpm -ivh mysql-community-release-el7-5.noarch.rpm
(4) Check the repo files that the package installed
ls -1 /etc/yum.repos.d/mysql-community*
(5) Install MySQL
yum install mysql-server
(6) Start the MySQL service
systemctl start mysql.service
(7) Log in to MySQL
mysql -uroot -p
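The initial root password is empty, so just press Enter at the password prompt.
Set a password (replace username and new_password with your own values):
set password for username@localhost = password('new_password');
Exit MySQL; the new password applies from the next login.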
exit;
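2. Install Hive
(1) Extract apache-hive-3.1.3-bin.tar.gz into the target directory (the same directory that HIVE_HOME points to in the next step)
tar -xzvf apache-hive-3.1.3-bin.tar.gz -C /export/servers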
(2) Configure the environment variables (in the global profile /etc/profile)
export HIVE_HOME=/export/servers/apache-hive-3.1.3-bin
export HIVE_CONF_DIR=/export/servers/apache-hive-3.1.3-bin/conf
export PATH=$PATH:$HIVE_HOME/bin
Reload the profile so the changes take effect:
source /etc/profile
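A quick way to confirm that the variables above are visible in the current shell is sketched below. The function name check_hive_env is a hypothetical helper, not something shipped with Hive; it only inspects HIVE_HOME and PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hive): reports whether the Hive
# environment variables from /etc/profile are visible in this shell.
check_hive_env() {
    if [ -z "${HIVE_HOME:-}" ]; then
        echo "HIVE_HOME is not set"
        return 1
    fi
    # Wrap PATH in colons so the match also works for the first and last entries.
    case ":$PATH:" in
        *":$HIVE_HOME/bin:"*) echo "ok: $HIVE_HOME/bin is on PATH" ;;
        *) echo "$HIVE_HOME/bin is missing from PATH"; return 1 ;;
    esac
}
```

Calling check_hive_env right after source /etc/profile should report ok; any other message suggests the export lines were not saved or the profile was not reloaded.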
(3) In the conf directory under the Hive root directory, create a file named hive-site.xml with the following content:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<!-- database start -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive_meta?useSSL=false</value>
<description>MySQL connection URL</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>Database user name</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>Database password</description>
</property>
<!-- database end -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/hive/warehouse</value>
<description>HDFS directory used by Hive</description>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
<property>
<name>hive.support.concurrency</name>
<value>true</value>
<description>Enable Hive concurrency support</description>
</property>
<property>
<name>hive.txn.manager</name>
<value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
<description>Transaction and lock manager class used for concurrency control</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>my2308-host</value>
<description>Address that the HiveServer2 Thrift service binds to</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>Port of the HiveServer2 Thrift service</description>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
</property>
<!-- other settings end -->
</configuration>
Change my2308-host in this file to the IP address of your own virtual machine (if you have not set up host name mapping).
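The host name substitution can also be scripted. The following is a minimal sketch, assuming the file still contains the literal placeholder my2308-host; the function name set_thrift_host and the example address 192.168.1.100 are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: swaps the placeholder host name in hive-site.xml
# for the address passed as the second argument.
# Assumes the address contains no "/" characters (true for IPs and host names).
set_thrift_host() {
    site_file=$1
    new_host=$2
    # Write to a temporary file first so a failed sed never truncates the original.
    tmp_file="${site_file}.tmp"
    sed "s/my2308-host/${new_host}/g" "$site_file" > "$tmp_file" && mv "$tmp_file" "$site_file"
}

# Example (assumed address):
# set_thrift_host /export/servers/apache-hive-3.1.3-bin/conf/hive-site.xml 192.168.1.100
```

Writing through a temporary file rather than editing in place keeps the sketch portable across sed variants.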
(4) Edit $HADOOP_HOME/etc/hadoop/core-site.xml to enable Hadoop's proxy-user feature
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
<description>Groups whose members the superuser is allowed to impersonate</description>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
<description>Hosts from which the superuser is allowed to act as a proxy</description>
</property>
<property>
<name>hadoop.proxyuser.root.users</name>
<value>*</value>
</property>
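The property names above embed the name of the user that runs HiveServer2 (root in this guide). If Hive runs under a different account, the names have to change accordingly. The sketch below prints the three properties for an arbitrary user; proxyuser_properties is a hypothetical helper name, and the user hive in the example is only an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the three hadoop.proxyuser.<user>.* properties
# for the given user, using "*" (allow all) as in the configuration above.
proxyuser_properties() {
    user=$1
    for key in groups hosts users; do
        printf '<property>\n<name>hadoop.proxyuser.%s.%s</name>\n<value>*</value>\n</property>\n' "$user" "$key"
    done
}

# Example: print the block for a user named "hive" (assumed account name)
# proxyuser_properties hive
```

The output is meant to be pasted inside the configuration element of core-site.xml; Hadoop typically needs a restart before it picks up the change.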
(5) Copy the template file hive-env.sh.template to hive-env.sh (run this in $HIVE_HOME/conf)
cp hive-env.sh.template hive-env.sh
Add the Hadoop directory location to hive-env.sh
HADOOP_HOME=/opt/hadoop-3.1.3
(6) Rename the logging configuration template (also in $HIVE_HOME/conf)
mv hive-log4j2.properties.template hive-log4j2.properties
(7) Create the metastore database hive_meta for Hive in MySQL (log in to MySQL first)
create database hive_meta default charset utf8 collate utf8_general_ci;
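The same statement can be issued without an interactive MySQL session. This is a sketch only: metastore_db_sql is a hypothetical helper, and the IF NOT EXISTS clause is an addition that makes the step safe to rerun.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the CREATE DATABASE statement for the metastore.
# The database name defaults to hive_meta, matching hive-site.xml above.
metastore_db_sql() {
    db_name=${1:-hive_meta}
    echo "CREATE DATABASE IF NOT EXISTS ${db_name} DEFAULT CHARSET utf8 COLLATE utf8_general_ci;"
}

# Usage: pipe the statement into the mysql client (prompts for the root password)
# metastore_db_sql hive_meta | mysql -uroot -p
```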
(8) Copy the MySQL driver jar to /export/servers/apache-hive-3.1.3-bin/lib
cp mysql-connector-java-5.1.40-bin.jar /export/servers/apache-hive-3.1.3-bin/lib
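(9) Remove the conflicting log4j binding (log4j-slf4j-impl-*.jar; the version number in the file name depends on the Hive release, so a wildcard is used here)
rm -f /export/servers/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-*.jar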
(10) Initialize the Hive metastore schema in MySQL
schematool -dbType mysql -initSchema
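After initialization it can help to confirm that the schema is in place. The sketch below wraps schematool's -info option, which reports the metastore schema version; verify_metastore_schema is a hypothetical wrapper name, and the PATH check is only there to give a clearer message when the environment variables from step (2) are not loaded.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: checks that schematool is reachable, then asks it
# to report the metastore schema version stored in MySQL.
verify_metastore_schema() {
    if ! command -v schematool >/dev/null 2>&1; then
        echo "schematool not found on PATH; run 'source /etc/profile' first"
        return 1
    fi
    schematool -dbType mysql -info
}
```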
3. Start Hive
Run the following from the $HIVE_HOME/bin directory.
Start Hive for JDBC connections (connecting through beeline)
First start the hiveserver2 service.
Start hiveserver2 in the foreground:
hiveserver2
Finally, connect with beeline from a second terminal:
beeline -u jdbc:hive2://localhost:10000 -n root
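HiveServer2 can take a while before it starts listening, so a connection attempted too early may be refused. The sketch below polls the Thrift port before launching beeline. It assumes bash (for the /dev/tcp pseudo-device); wait_for_port is a hypothetical helper, and the 120-second timeout in the example is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: polls a TCP port once per second until it accepts
# connections or the timeout (in seconds, default 60) runs out.
wait_for_port() {
    host=$1
    port=$2
    timeout=${3:-60}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # The subshell opens and immediately closes a test connection.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example: wait up to 120 seconds, then connect
# wait_for_port localhost 10000 120 && beeline -u jdbc:hive2://localhost:10000 -n root
```

The function returns 0 as soon as the port accepts a connection and 1 if the timeout expires first, so it can be chained with && as in the example.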