Hive Cluster Setup (with MySQL Metastore)

Hive is a data warehouse tool built on top of Hadoop for working with structured data. It provides a set of tools for data extraction, transformation, loading, and analysis, and exposes HQL, a SQL-like language, for querying massive datasets stored on Hadoop. In other words, the data lives on HDFS and the computation runs as MapReduce or Spark jobs, so Hive itself carries very little load and does not need to be clustered.
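For example, an HQL query like the one below (a generic illustration, not part of this setup) is compiled into MapReduce or Spark jobs that read the table's files directly from HDFS:

```sql
-- The GROUP BY aggregation runs as a MapReduce (or Spark) job;
-- the table's data is just delimited files under the HDFS warehouse directory.
SELECT url, COUNT(*) AS hits
FROM page_views
GROUP BY url;
```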

1. Software environment:

CentOS 6.8 hosts: sparknode1, sparknode2, sparknode3, sparknode4

Hadoop version: 2.7.5

ZooKeeper version: 3.4.11

HBase version: 1.4.0

2. Four Hadoop + HDFS + HBase nodes were set up, named Sparknode1 (master), Sparknode2, Sparknode3, and Sparknode4, plus a three-node ZooKeeper cluster named zookeeper1, zookeeper2, and zookeeper3. Instead of using the ZooKeeper that ships with HBase, a separate ZooKeeper cluster was deployed.

3. Download the Hive installation package:

http://www.trieuvan.com/apache/hive/hive-2.3.2/

4. Upload it to CentOS with the rz command, then extract it:

tar -zxvf apache-hive-2.3.2-bin.tar.gz


5. Configure the environment variables:

vim /etc/profile

export HIVE_HOME=/usr/soft/apache-hive-2.3.2-bin

export HIVE_CONF_DIR=$HIVE_HOME/conf

export CLASSPATH=$CLASSPATH:$HIVE_HOME/lib

export PATH=$PATH:$HIVE_HOME/bin


source /etc/profile
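After sourcing the profile, it is worth sanity-checking that the variables resolve as expected. A minimal sketch, assuming the install path used above:

```shell
# Re-declare the variables from /etc/profile and confirm they compose correctly.
export HIVE_HOME=/usr/soft/apache-hive-2.3.2-bin
export HIVE_CONF_DIR=$HIVE_HOME/conf
export PATH=$PATH:$HIVE_HOME/bin
echo "$HIVE_CONF_DIR"
```

On the cluster node itself, hive --version should then resolve from the new PATH.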

6. Configure MySQL:

(1) Check which MySQL packages are already installed:

rpm -qa | grep mysql


(2) Remove the MySQL that ships with CentOS:

rpm -e mysql-5.1.73-8.el6_8.x86_64 --nodeps

(3) Install MySQL server:

yum -y install mysql-server

(4) Initialize MySQL

a. Set the MySQL root password (run with root privileges):

cd /usr/bin

./mysql_secure_installation

b. Enter the current password for the root user; root has no password initially, so just press Enter:

Enter current password for root (enter for none):

c. Set the password for the MySQL root user (it must match the Hive configuration below; here it is set to 123456):

Set root password? [Y/n] Y

New password:

Re-enter new password:

Password updated successfully!

Reloading privilege tables..

... Success!

d. Remove anonymous users:

Remove anonymous users? [Y/n] Y

... Success!

e. Disallow remote root login? Choose N, so that remote connections remain possible:

Disallow root login remotely? [Y/n] N

... Success!

f. Remove the test database:

Remove test database and access to it? [Y/n] Y

Dropping test database...

... Success!

Removing privileges on test database...

... Success!

g. Reload the privilege tables:

Reload privilege tables now? [Y/n] Y

... Success!

h. Done:

All done! If you've completed all of the above steps, your MySQL

installation should now be secure.

Thanks for using MySQL!

i. Log in to MySQL and grant remote access. The password in IDENTIFIED BY must be the actual root password (123456, as set above) and must match javax.jdo.option.ConnectionPassword in hive-site.xml below:

mysql -uroot -p

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;

FLUSH PRIVILEGES;

exit;
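Granting *.* to root@'%' works, but a more contained alternative is a dedicated metastore database and user. A sketch, where the database name hive matches the ConnectionURL used later and 'hivepass' is a placeholder to substitute:

```sql
-- Run inside the mysql client (MySQL 5.x syntax).
CREATE DATABASE IF NOT EXISTS hive;
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hivepass';
FLUSH PRIVILEGES;
```

If you take this route, set ConnectionUserName and ConnectionPassword in hive-site.xml accordingly.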

7. Configure Hive

(1) Copy hive-env.sh.template to hive-env.sh and edit it as follows:

cp hive-env.sh.template hive-env.sh


(2) Copy hive-default.xml.template to hive-site.xml, then edit hive-site.xml, deleting everything inside the file except a single empty <configuration> element:

cp hive-default.xml.template hive-site.xml

Configure it as follows:

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database (must match the MySQL root password set earlier)</description>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoCreateTables</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoCreateColumns</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/tmp_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.exec.dynamic.partition</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/log/HiveJobsLog</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <!-- Note: this repeats hive.downloaded.resources.dir; the later value wins. -->
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/log/ResourcesLog</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/log/HiveRunLog</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/log/OpertitionLog</value>
    <description>Top level directory where operation tmp are stored if logging functionality is enabled</description>
  </property>
  <property>
    <name>hive.hwi.war.file</name>
    <value>/usr/soft/apache-hive-2.3.2-bin/lib/hive-hwi-2.1.1.jar</value>
    <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
  </property>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>master</value>
    <description>This is the host address the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>This is the port the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>master</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.http.port</name>
    <value>10001</value>
  </property>
  <property>
    <name>hive.server2.thrift.http.path</name>
    <value>cliservice</value>
  </property>
  <property>
    <name>hive.server2.webui.host</name>
    <value>master</value>
  </property>
  <property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
  </property>
  <property>
    <name>hive.scratch.dir.permission</name>
    <value>755</value>
  </property>
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///opt/spark-2.1.2-bin-hadoop2.7/jars</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.auto.convert.join</name>
    <value>false</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.enabled</name>
    <value>true</value>
    <description>Dynamic resource allocation</description>
  </property>
  <property>
    <name>spark.driver.extraJavaOptions</name>
    <value>-XX:PermSize=128M -XX:MaxPermSize=512M</value>
  </property>
</configuration>

8. Configure logging: copy hive-log4j2.properties.template to hive-log4j2.properties and edit it as follows:

cp hive-log4j2.properties.template hive-log4j2.properties

vim hive-log4j2.properties


9. Configure the $HIVE_HOME/conf/hive-config.sh file (adjust these paths to your actual Java, Hive, and Hadoop install locations):

## Add the following three lines

export JAVA_HOME=/home/centos/soft/java

export HIVE_HOME=/home/centos/soft/hive

export HADOOP_HOME=/home/centos/soft/hadoop

## Modify the following line

HIVE_CONF_DIR=$HIVE_HOME/conf

10. Put the MySQL JDBC driver jar into the $HIVE_HOME/lib directory:


11. Copy jline-2.12.jar from $HIVE_HOME/lib into $HADOOP_HOME/share/hadoop/yarn/lib, and delete the older jline jar already in that directory.

12. Copy tools.jar from $JAVA_HOME/lib into $HIVE_HOME/lib:

cp $JAVA_HOME/lib/tools.jar ${HIVE_HOME}/lib

13. Initialize Hive

Choose either MySQL or Derby as the metastore database.

Note: first check whether MySQL contains residual Hive metadata from a previous install; if it does, delete it before initializing.

schematool -dbType mysql -initSchema ## MySQL as the metastore database

Here -dbType mysql selects MySQL as the store for Hive's metadata. To use Derby instead, run:

schematool -dbType derby -initSchema ## Derby as the metastore database

The matching hive-schema-*.mysql.sql script shipped with Hive creates the metastore tables in the configured database.
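Before running schematool, leftover metadata can be checked for and removed like this (a sketch; hive is the database name from the ConnectionURL, and the DROP is destructive):

```sql
-- Run in the mysql client before schematool -initSchema.
SHOW DATABASES LIKE 'hive';
DROP DATABASE IF EXISTS hive;  -- removes all existing Hive metadata!
```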

14. Start the Hive CLI (its embedded metastore client connects to MySQL):

hive


15. Test:

show databases;


show tables;

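Beyond listing databases and tables, a quick end-to-end smoke test is to create and drop a throwaway table (the table name here is purely illustrative); its data directory should appear under the warehouse path /hive on HDFS:

```sql
-- In the Hive CLI:
CREATE TABLE smoke_test (id INT, name STRING);
SHOW TABLES;
DROP TABLE smoke_test;
```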
