To explore the mystery and greatness of Hive, we set out on the road to learning it. Whether the tool is good or bad can wait; first, let's install Hive...
We will use MySQL to store Hive's metadata (the Metastore), so MySQL gets installed first. The installation and configuration steps are as follows.
The whole process is divided into 7 parts:
1. Install MySQL
2. Install Hive
3. Configure the Hive Metastore to use MySQL
4. Hadoop cluster setup
5. Hive warehouse directory configuration
6. Query result display configuration (optional, per your preference)
7. Hive log file configuration
----------------------------------OK, here begins my long write-up-------------------------------------------------
》》》》》》> 1. Install MySQL
step 1: check whether MySQL is already installed; if so, remove the old installation
# yum list installed | grep mysql ---- check
# yum -y remove mysql-libs.x86_64 ---- remove
step 2: install the MySQL community repo package and MySQL server
# rpm -Uvh http://repo.mysql.com/mysql-community-release-el6-5.noarch.rpm (repo RPM for el6)
# yum install mysql-community-server -y
or, from the base repositories:
# yum -y install mysql mysql-server mysql-devel
or, on el7:
# wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm
# rpm -ivh mysql-community-release-el7-5.noarch.rpm
# yum -y install mysql-community-server
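The two repo RPMs above target different OS major releases (el6 vs el7). As a rough sketch (the `el_tag` helper is hypothetical, not from the original post), the matching repo URL could be picked automatically from /etc/redhat-release:

```shell
#!/bin/sh
# Pick the MySQL community repo RPM that matches the OS major release.
# el_tag derives "el6" / "el7" from a redhat-release string.
el_tag() {
    # e.g. "CentOS release 6.8 (Final)" -> el6
    major=$(echo "$1" | grep -o '[0-9]\+' | head -n 1)
    echo "el${major}"
}

# Fall back to a sample string so the sketch runs on any machine.
tag=$(el_tag "$(cat /etc/redhat-release 2>/dev/null || echo 'release 6')")
case "$tag" in
    el6) url="http://repo.mysql.com/mysql-community-release-el6-5.noarch.rpm" ;;
    el7) url="http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm" ;;
esac
echo "repo RPM: $url"
# Then: rpm -Uvh "$url" && yum -y install mysql-community-server
```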
step 3: start MySQL
# service mysqld start
step 4: set a login password for the root user
# mysqladmin -uroot password 'rootroot'
step 5: log in to MySQL
# mysql -uroot -prootroot
step 6: after logging in, grant access privileges so that other machines inside the cluster can connect
mysql> grant all privileges on *.* to root@'%' identified by 'rootroot';
(the password here should match the one set in step 4)
You can then quit with exit;
At this point, the MySQL installation is complete.
》》》》》》> 2. Install Hive
step 1: upload the Hive package apache-hive-1.2.1-bin.tar.gz, downloaded earlier, from the local machine to the VM (hadoop011)
Press Alt+p to open the sftp session
sftp> put G:/Hive/apache-hive-1.2.1-bin.tar.gz
Then go to the home directory (# cd ~), locate the file, and move it to the target path:
# mv apache-hive-1.2.1-bin.tar.gz /opt/soft
step 2: extract the package into /opt/app/
# tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /opt/app/
Go to /opt/app/ and rename the extracted directory:
# cd /opt/app
# mv apache-hive-1.2.1-bin apache-hive-1.2.1
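The `-C` flag tells tar to extract into the given directory. A self-contained sketch of the extract-and-rename steps above, using a throwaway archive under a temp dir instead of the real Hive tarball:

```shell
#!/bin/sh
# Demonstrate `tar -zxf ... -C <dir>` followed by a rename, mirroring the
# Hive steps above. A fake archive stands in for apache-hive-1.2.1-bin.tar.gz.
work=$(mktemp -d)
mkdir -p "$work/apache-hive-1.2.1-bin"
echo demo > "$work/apache-hive-1.2.1-bin/README"
tar -czf "$work/pkg.tar.gz" -C "$work" apache-hive-1.2.1-bin

mkdir -p "$work/app"
tar -zxf "$work/pkg.tar.gz" -C "$work/app"    # like: tar -zxvf ... -C /opt/app/
mv "$work/app/apache-hive-1.2.1-bin" "$work/app/apache-hive-1.2.1"
ls "$work/app"                                 # the renamed directory
```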
step 3: configure hive-env.sh
Go to /opt/app/apache-hive-1.2.1/conf, find hive-env.sh.template, and rename it:
[root@hadoop011 conf]# mv hive-env.sh.template hive-env.sh
[root@hadoop011 conf]# vim hive-env.sh
Edit hive-env.sh and add the HADOOP_HOME and HIVE_CONF_DIR paths:
export HADOOP_HOME=/opt/app/hadoop-2.7.2
export HIVE_CONF_DIR=/opt/app/apache-hive-1.2.1/conf
The full hive-env.sh after editing:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Set Hive and Hadoop environment variables here. These variables can be used
# to control the execution of Hive. It should be used by admins to configure
# the Hive installation (so that users do not have to set environment variables
# or set command line parameters to get correct behavior).
#
# The hive service being invoked (CLI/HWI etc.) is available via the environment
# variable SERVICE
# Hive Client memory usage can be an issue if a large number of clients
# are running at the same time. The flags below have been useful in
# reducing memory usage:
#
# if [ "$SERVICE" = "cli" ]; then
# if [ -z "$DEBUG" ]; then
# export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit"
# else
# export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
# fi
# fi
# The heap size of the jvm stared by hive shell script can be controlled via:
# export HADOOP_HEAPSIZE=1024
# Larger heap size may be required when running queries over large number of files or partitions.
# By default hive shell scripts use a heap size of 256 (MB). Larger heap size would also be
# appropriate for hive server (hwi etc).
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/opt/app/hadoop-2.7.2
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/app/apache-hive-1.2.1/conf
# Folder containing extra ibraries required for hive compilation/execution can be controlled by:
# export HIVE_AUX_JARS_PATH=
"hive-env.sh" 54L, 2407C 已写入
At this point Hive can already be started, but the overall setup is not finished yet, so let's continue...
》》》》》》> 3. Configure the Hive Metastore to use MySQL
step 1: upload the downloaded MySQL JDBC driver to the VM
Alt+p:
sftp> put G:/Hive/mysql-connector-java-5.1.37-bin.jar
Uploading mysql-connector-java-5.1.37-bin.jar to /root/mysql-connector-java-5.1.37-bin.jar
100% 962KB 962KB/s 00:00:00
G:/Hive/mysql-connector-java-5.1.37-bin.jar: 985603 bytes transferred in 0 seconds (962 KB/s)
sftp>
Move (or copy) mysql-connector-java-5.1.37-bin.jar to /opt/app/apache-hive-1.2.1/lib/
[root@hadoop011 ~]# mv mysql-connector-java-5.1.37-bin.jar /opt/app/apache-hive-1.2.1/lib/
step 2: configure the Metastore to use MySQL
Create hive-site.xml under /opt/app/apache-hive-1.2.1/conf/
[root@hadoop011 ~]# cd /opt/app/apache-hive-1.2.1/conf
[root@hadoop011 conf]# touch hive-site.xml
Put the configuration content (see https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin) into hive-site.xml
[root@hadoop011 conf]# vim hive-site.xml
Configuration:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop102:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>000000</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
Two of the values need to be changed:
a. My MySQL instance runs on hadoop015:
<value>jdbc:mysql://hadoop102:3306/metastore?createDatabaseIfNotExist=true</value>
——> <value>jdbc:mysql://hadoop015:3306/metastore?createDatabaseIfNotExist=true</value>
b. The MySQL password was set to 'rootroot', so it needs to be changed here:
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>rootroot</value>
  <description>password to use against metastore database</description>
</property>
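Both edits (a and b) can be scripted with sed instead of editing by hand. A minimal sketch, assuming the hostname hadoop015 and password rootroot used in this walkthrough (the file below is a trimmed stand-in for the real hive-site.xml):

```shell
#!/bin/sh
# Apply edits a and b from above: point the JDBC URL at the MySQL host
# (hadoop015) and replace the placeholder password with the real one.
site=$(mktemp)
cat > "$site" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop102:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>000000</value>
  </property>
</configuration>
EOF
sed -i -e 's/hadoop102/hadoop015/' -e 's/000000/rootroot/' "$site"
grep '<value>' "$site"
```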
step 3: the configuration is done; you can reboot and verify
# reboot
# service mysqld start
# mysql -uroot -prootroot
mysql> show databases;
After Hive has connected once, you will see a metastore database here (created automatically thanks to createDatabaseIfNotExist=true), which means the configuration works.
Of course, you can also start the cluster and Hive at this point.
Think we're done? No, no, no; there are more settings to go through...
》》》》》》> 4. Hadoop cluster setup
step 1: start the cluster
[root@hadoop011 ~]# start-dfs.sh
[root@hadoop012 ~]# start-yarn.sh
step 2: create the directory /user/hive/warehouse on HDFS (any path you like) to serve as the Hive warehouse
[root@hadoop011 conf]# hadoop fs -mkdir -p /user/hive/warehouse
step 3: change the permissions
[root@hadoop011 conf]# hadoop fs -chmod g+w /user/hive/warehouse
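The g+w bit gives the group write access, so Hive users other than the directory owner can create databases and tables under the warehouse; HDFS permissions mirror POSIX mode bits. A local illustration of the same permission change:

```shell
#!/bin/sh
# Local illustration of what `hadoop fs -chmod g+w` grants: adding the
# group-write bit to a warehouse directory.
d=$(mktemp -d)/warehouse
mkdir -p "$d"
chmod 755 "$d"            # before: group has r-x only
chmod g+w "$d"            # after:  group has rwx
stat -c '%A' "$d"         # prints the mode string, e.g. drwxrwxr-x
```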
》》》》》》> 5. Warehouse directory configuration
Change the default warehouse location: copy the following property from hive-default.xml.template (under /opt/app/apache-hive-1.2.1/conf) into hive-site.xml.
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
》》》》》》> 6. Query result display configuration
To display the current database name in the prompt and column headers in query results, add the following to hive-site.xml:
<property>
  <name>hive.cli.print.header</name>
  <value>true</value>
</property>
<property>
  <name>hive.cli.print.current.db</name>
  <value>true</value>
</property>
》》》》》》> 7. Hive log file configuration
Locate the log configuration file:
# cd /opt/app/apache-hive-1.2.1/conf
There you will find hive-log4j.properties.template; rename it to hive-log4j.properties
[root@hadoop011 conf]# mv hive-log4j.properties.template hive-log4j.properties
[root@hadoop011 conf]# pwd
/opt/app/apache-hive-1.2.1/conf
Edit the file and change the log directory:
[root@hadoop011 conf]# vim hive-log4j.properties
File contents:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Define some default values that can be overridden by system properties
hive.log.threshold=ALL
hive.root.logger=INFO,DRFA
hive.log.dir=/opt/app/apache-hive-1.2.1/logs
hive.log.file=hive.log
Original value: hive.log.dir=${java.io.tmpdir}/${user.name}
New value: hive.log.dir=/opt/app/apache-hive-1.2.1/logs
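This change, too, can be scripted. A sketch that rewrites the hive.log.dir line (shown here against a one-line stand-in for hive-log4j.properties; you may also want to create the target directory beforehand):

```shell
#!/bin/sh
# Rewrite hive.log.dir from the tmpdir-based default to a fixed path
# under the Hive install, as described above.
props=$(mktemp)
echo 'hive.log.dir=${java.io.tmpdir}/${user.name}' > "$props"
sed -i 's|^hive.log.dir=.*|hive.log.dir=/opt/app/apache-hive-1.2.1/logs|' "$props"
cat "$props"
# Suggested: mkdir -p /opt/app/apache-hive-1.2.1/logs
```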
---------------------------------
OK, that covers the essential configuration. You can of course configure more as needed, such as runtime parameters.
For my current stage of learning, this is enough.
So, finally, start Hive:
# reboot
[root@hadoop011 ~]# start-dfs.sh
[root@hadoop012 ~]# start-yarn.sh
Go to the directory /opt/app/apache-hive-1.2.1/bin
[root@hadoop011 bin]# ./hive
------------ Let the learning journey begin --------------------------------