Hive Cluster Setup (with MySQL Metastore)

Hive is a data warehouse tool built on top of Hadoop for working with structured data. It provides a set of tools for data extraction, transformation, loading, and analysis, and exposes HQL, a SQL-like language, for querying massive datasets stored on Hadoop. In other words, the data lives on HDFS and the computation runs as MapReduce or Spark jobs, so Hive itself carries very little load and does not need to be clustered.
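For example, an HQL query like the one below (a generic illustration, not part of this setup) is compiled into MapReduce or Spark jobs that read the table's files directly from HDFS:

```sql
-- The GROUP BY aggregation runs as a MapReduce (or Spark) job;
-- the table's data is just delimited files under the HDFS warehouse directory.
SELECT url, COUNT(*) AS hits
FROM page_views
GROUP BY url;
```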

1. Software environment:

CentOS 6.8 hosts: sparknode1, sparknode2, sparknode3, sparknode4

Hadoop version: 2.7.5

ZooKeeper version: 3.4.11

HBase version: 1.4.0

2. Four Hadoop + HDFS + HBase nodes were set up, named Sparknode1 (master), Sparknode2, Sparknode3, and Sparknode4, plus a three-node ZooKeeper cluster named zookeeper1, zookeeper2, and zookeeper3. Instead of using the ZooKeeper that ships with HBase, a separate ZooKeeper cluster was deployed.

3. Download the Hive installation package:

http://www.trieuvan.com/apache/hive/hive-2.3.2/

4. Upload it to CentOS with the rz command, then extract it:

tar -zxvf apache-hive-2.3.2-bin.tar.gz


5. Configure the environment variables:

vim /etc/profile

export HIVE_HOME=/usr/soft/apache-hive-2.3.2-bin

export HIVE_CONF_DIR=$HIVE_HOME/conf

export CLASSPATH=$CLASSPATH:$HIVE_HOME/lib

export PATH=$PATH:$HIVE_HOME/bin


source /etc/profile
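After sourcing the profile, it is worth sanity-checking that the variables resolve as expected. A minimal sketch, assuming the install path used above:

```shell
# Re-declare the variables from /etc/profile and confirm they compose correctly.
export HIVE_HOME=/usr/soft/apache-hive-2.3.2-bin
export HIVE_CONF_DIR=$HIVE_HOME/conf
export PATH=$PATH:$HIVE_HOME/bin
echo "$HIVE_CONF_DIR"
```

On the cluster node itself, hive --version should then resolve from the new PATH.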

6. Configure MySQL:

(1) Check which MySQL packages are already installed:

rpm -qa | grep mysql


(2) Remove the MySQL that ships with CentOS:

rpm -e mysql-5.1.73-8.el6_8.x86_64 --nodeps

(3) Install MySQL server:

yum -y install mysql-server

(4) Initialize MySQL

a. Set the MySQL root password (run with root privileges):

cd /usr/bin

./mysql_secure_installation

b. Enter the current password for the root user; root has no password initially, so just press Enter:

Enter current password for root (enter for none):

c. Set the password for the MySQL root user (it must match the Hive configuration below; here it is set to 123456):

Set root password? [Y/n] Y

New password:

Re-enter new password:

Password updated successfully!

Reloading privilege tables..

... Success!

d. Remove anonymous users:

Remove anonymous users? [Y/n] Y

... Success!

e. Disallow remote root login? Choose N, so that remote connections remain possible:

Disallow root login remotely? [Y/n] N

... Success!

f. Remove the test database:

Remove test database and access to it? [Y/n] Y

Dropping test database...

... Success!

Removing privileges on test database...

... Success!

g. Reload the privilege tables:

Reload privilege tables now? [Y/n] Y

... Success!

h. Done:

All done! If you've completed all of the above steps, your MySQL

installation should now be secure.

Thanks for using MySQL!

i. Log in to MySQL and grant remote access. The password in IDENTIFIED BY must be the actual root password (123456, as set above) and must match javax.jdo.option.ConnectionPassword in hive-site.xml below:

mysql -uroot -p

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;

FLUSH PRIVILEGES;

exit;
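Granting *.* to root@'%' works, but a more contained alternative is a dedicated metastore database and user. A sketch, where the database name hive matches the ConnectionURL used later and 'hivepass' is a placeholder to substitute:

```sql
-- Run inside the mysql client (MySQL 5.x syntax).
CREATE DATABASE IF NOT EXISTS hive;
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hivepass';
FLUSH PRIVILEGES;
```

If you take this route, set ConnectionUserName and ConnectionPassword in hive-site.xml accordingly.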

7. Configure Hive

(1) Copy hive-env.sh.template to hive-env.sh and edit it as follows:

cp hive-env.sh.template hive-env.sh


(2) Copy hive-default.xml.template to hive-site.xml, then edit hive-site.xml, deleting everything inside the file except a single empty <configuration> element:

cp hive-default.xml.template hive-site.xml

Configure it as follows:

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database (must match the MySQL root password set earlier)</description>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoCreateTables</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoCreateColumns</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/tmp_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.exec.dynamic.partition</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/log/HiveJobsLog</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <!-- Note: this repeats hive.downloaded.resources.dir; the later value wins. -->
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/log/ResourcesLog</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/log/HiveRunLog</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/log/OpertitionLog</value>
    <description>Top level directory where operation tmp are stored if logging functionality is enabled</description>
  </property>
  <property>
    <name>hive.hwi.war.file</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/lib/hive-hwi-2.1.1.jar</value>
    <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
  </property>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>master</value>
    <description>This is the host address the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>This is the port the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>master</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.http.port</name>
    <value>10001</value>
  </property>
  <property>
    <name>hive.server2.thrift.http.path</name>
    <value>cliservice</value>
  </property>
  <property>
    <name>hive.server2.webui.host</name>
    <value>master</value>
  </property>
  <property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
  </property>
  <property>
    <name>hive.scratch.dir.permission</name>
    <value>755</value>
  </property>
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///opt/spark-2.1.2-bin-hadoop2.7/jars</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.auto.convert.join</name>
    <value>false</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.enabled</name>
    <value>true</value>
    <description>Dynamic resource allocation</description>
  </property>
  <property>
    <name>spark.driver.extraJavaOptions</name>
    <value>-XX:PermSize=128M -XX:MaxPermSize=512M</value>
  </property>
</configuration>

8. Configure logging: copy hive-log4j2.properties.template to hive-log4j2.properties and edit it as follows:

cp hive-log4j2.properties.template hive-log4j2.properties

vim hive-log4j2.properties


9. Configure the $HIVE_HOME/conf/hive-config.sh file (adjust these paths to your actual Java, Hive, and Hadoop install locations):

## Add the following three lines

export JAVA_HOME=/home/centos/soft/java

export HIVE_HOME=/home/centos/soft/hive

export HADOOP_HOME=/home/centos/soft/hadoop

## Modify the following line

HIVE_CONF_DIR=$HIVE_HOME/conf

10. Put the MySQL JDBC driver jar into the $HIVE_HOME/lib directory:


11. Copy jline-2.12.jar from $HIVE_HOME/lib into $HADOOP_HOME/share/hadoop/yarn/lib, and delete the older jline jar already in that directory.

12. Copy tools.jar from $JAVA_HOME/lib into $HIVE_HOME/lib:

cp $JAVA_HOME/lib/tools.jar ${HIVE_HOME}/lib

13. Initialize Hive

Choose either MySQL or Derby as the metastore database.

Note: first check whether MySQL contains residual Hive metadata from a previous install; if it does, delete it before initializing.

schematool -dbType mysql -initSchema ## MySQL as the metastore database

Here -dbType mysql selects MySQL as the store for Hive's metadata. To use Derby instead, run:

schematool -dbType derby -initSchema ## Derby as the metastore database

The matching hive-schema-*.mysql.sql script shipped with Hive creates the metastore tables in the configured database.
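Before running schematool, leftover metadata can be checked for and removed like this (a sketch; hive is the database name from the ConnectionURL, and the DROP is destructive):

```sql
-- Run in the mysql client before schematool -initSchema.
SHOW DATABASES LIKE 'hive';
DROP DATABASE IF EXISTS hive;  -- removes all existing Hive metadata!
```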

14. Start the Hive CLI (its embedded metastore client connects to MySQL):

hive


15. Test:

show databases;


show tables;

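Beyond listing databases and tables, a quick end-to-end smoke test is to create and drop a throwaway table (the table name here is purely illustrative); its data directory should appear under the warehouse path /hive on HDFS:

```sql
-- In the Hive CLI:
CREATE TABLE smoke_test (id INT, name STRING);
SHOW TABLES;
DROP TABLE smoke_test;
```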
