Installing Hive on a Hadoop cluster and configuring a MySQL database to store the metadata


Article link: https://blog.csdn.net/weixin_30851261/article/details/113265342


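This article describes how to install Hive 2.0.0 on a Hadoop cluster and configure it to store its metadata in a MySQL database. It covers installing MySQL, creating the user and the database, installing Hive, editing the configuration file, adding the JDBC driver, and starting Hive.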


2016-06-07 16:15:33 Author: MangoCool Source: MangoCool

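Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables and provides simple SQL querying, translating SQL statements into MapReduce jobs for execution. Its main advantage is a low learning curve: simple MapReduce statistics can be written quickly as SQL-like statements without developing dedicated MapReduce applications, which makes it well suited to statistical analysis in a data warehouse.

We use Hadoop 2.7.2 and Hive 2.0.0 here. Hadoop has already been installed on three servers: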

192.168.21.6 slave

192.168.21.181 master

192.168.21.9 slave

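MySQL installation

This step is not strictly required, because by default Hive stores its metadata in Derby. The drawback is that only one Hive instance can access the metadata at a time, so the default is only suitable for local testing during development.

Hive can instead be configured to use a relational database such as MySQL, so that the metadata is stored separately and shared among multiple service instances.

1. Install MySQL: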

yum install -y mysql-server

2. Start MySQL:

service mysqld start

3. Log in as root and create a new user:

mysql -u root -p

The initial root password is empty, so just press Enter at the password prompt.

mysql> use mysql;

mysql> update user set password = Password('root') where User = 'root';

mysql> create user 'hive'@'%' identified by 'hive';

mysql> grant all privileges on *.* to 'hive'@'%' with grant option;

mysql> flush privileges;

mysql> exit;

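4. Create the database. The session above was closed by exit, so log in again first (the root password is now root):

mysql -u root -p

mysql> create database hive;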
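The grant in step 3 gives the hive user every privilege on every database. If a narrower grant is preferred, one option (not part of the original setup) is to generate SQL that scopes the privileges to the metastore database only and run it in place of the `*.*` grant, after the create user statement. This is a minimal sketch: the function name metastore_grant_sql and its parameters are hypothetical, and the statements assume the MySQL 5.x syntax used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print SQL that creates the metastore database and
# grants an existing user privileges on that database only (not on *.*).
metastore_grant_sql() {
  local db="$1" user="$2"
  cat <<EOF
CREATE DATABASE IF NOT EXISTS ${db};
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'%';
FLUSH PRIVILEGES;
EOF
}

# Print the statements for review before running them.
metastore_grant_sql hive hive
```

To apply the statements, the output can be piped to the client, for example `metastore_grant_sql hive hive | mysql -u root -p`.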

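Hive installation and configuration

Hive only needs to be installed on the master node.

1. Download apache-hive-2.0.0-bin.tar.gz.

2. Extract it: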

tar -zxvf apache-hive-2.0.0-bin.tar.gz

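3. Configure /etc/profile (the HIVE_HOME below assumes the extracted apache-hive-2.0.0-bin directory has been moved or renamed to /home/hadoop/SW/hive-2.0.0):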

export HIVE_HOME=/home/hadoop/SW/hive-2.0.0

export PATH=$HIVE_HOME/bin:$HIVE_HOME/conf:$PATH

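4. Create the Hive data directories:

Create the directories in HDFS that will hold Hive data (/tmp may already exist). From Hadoop's bin directory, run the following; -p creates missing parent directories and does not fail if the directory already exists:

./hadoop fs -mkdir -p /tmp

./hadoop fs -mkdir -p /user/hive/warehouse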

./hadoop fs -chmod 777 /tmp

./hadoop fs -chmod 777 /user/hive/warehouse

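/tmp holds temporary files produced during execution, and /user/hive/warehouse holds the data files that Hive manages.

5. Hive configuration file:

The Hive configuration file is named hive-site.xml and lives in $HIVE_HOME/conf. It does not exist by default and has to be created manually. The same directory contains a template file, hive-default.xml.template; copy it with cp to create hive-site.xml, then set the following properties: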

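<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
</configuration>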

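Here the Hive service and the metadata (metastore) service are deployed on the same server. Note that hive.metastore.local has had no effect since Hive 0.10; whether a remote metastore is used is determined by whether hive.metastore.uris is set.

The two can also be deployed on different servers, which is the better arrangement: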

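<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.21.8:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10001</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.21.8:9083</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.default.fileformat</name>
    <value>Parquet</value>
    <description>
      Expects one of [textfile, sequencefile, rcfile, orc].
      Default file format for CREATE TABLE statement.
      Users can explicitly override it by CREATE TABLE ... STORED AS [FORMAT]
    </description>
  </property>
  <property>
    <name>hive.query.result.fileformat</name>
    <value>Parquet</value>
    <description>
      Expects one of [textfile, sequencefile, rcfile].
      Default file format for storing result of the query.
    </description>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>spark</value>
    <description>
      Expects one of [mr, tez, spark].
      Chooses execution engine. Options are: mr (Map reduce, default), tez, spark.
      While MR remains the default engine for historical reasons,
      it is itself a historical engine and is deprecated in Hive 2 line.
      It may be removed without further warning.
    </description>
  </property>
</configuration>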
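Before initializing the schema, it can help to confirm that hive-site.xml really contains the four javax.jdo.option.* connection properties listed above. The following is a minimal sketch of such a check: the function name check_hive_site is hypothetical, the check is a plain text search rather than an XML parse, and it assumes each `<name>…</name>` element sits on a single line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which metastore connection properties are
# missing from a hive-site.xml file. Returns non-zero if any is missing.
check_hive_site() {
  local file="$1" missing=0 key
  for key in \
    javax.jdo.option.ConnectionURL \
    javax.jdo.option.ConnectionDriverName \
    javax.jdo.option.ConnectionUserName \
    javax.jdo.option.ConnectionPassword; do
    # -F matches the key literally; this is a text check, not an XML parse.
    if ! grep -qF "<name>${key}</name>" "$file"; then
      echo "missing: ${key}"
      missing=1
    fi
  done
  return "$missing"
}

# Check the file under HIVE_HOME if it exists (the fallback path is the
# HIVE_HOME used in this article).
site="${HIVE_HOME:-/home/hadoop/SW/hive-2.0.0}/conf/hive-site.xml"
if [ -f "$site" ]; then
  if check_hive_site "$site"; then
    echo "all metastore connection properties present"
  fi
fi
```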

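Download the JDBC driver. The URL is quoted because it contains a ?, and -O saves the file under the name used by the cp command below:

wget -O mysql-connector-java-5.1.38.jar "http://search.maven.org/remotecontent?filepath=mysql/mysql-connector-java/5.1.38/mysql-connector-java-5.1.38.jar"

Copy it into hive-2.0.0/lib: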

cp mysql-connector-java-5.1.38.jar hive-2.0.0/lib

Initialize the metastore schema before the first start:

bin/schematool -dbType mysql -initSchema

Start Hive:

hive --service hiveserver2 &

hive --service metastore &

To keep the Hive services running in the background after you log out, run:

nohup hive --service hiveserver2 &

nohup hive --service metastore &
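With nohup alone, output goes to nohup.out in the current directory and the process ID is not recorded. A minimal sketch that sends each service's output to its own log file and saves the PID reported by the shell follows. The helper name start_bg, the hive-logs directory, and the file names are hypothetical choices, not Hive conventions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start a command in the background with nohup,
# writing its output to <dir>/<name>.log and its PID to <dir>/<name>.pid.
start_bg() {
  local dir="$1" name="$2"
  shift 2
  mkdir -p "$dir"
  nohup "$@" > "${dir}/${name}.log" 2>&1 &
  echo $! > "${dir}/${name}.pid"
}

# Start the services only when the hive command is on PATH.
if command -v hive > /dev/null 2>&1; then
  start_bg "$HOME/hive-logs" metastore hive --service metastore
  start_bg "$HOME/hive-logs" hiveserver2 hive --service hiveserver2
fi
```

A service started this way can later be stopped with, for example, `kill "$(cat "$HOME/hive-logs/metastore.pid")"`.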

Start the CLI:

hive shell

hive

CLI in debug mode:

hive --hiveconf hive.root.logger=DEBUG,console
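The HiveServer2 service started earlier can also be reached through Beeline, the JDBC client that ships with Hive. The sketch below builds the JDBC URL and connects only if beeline is on the PATH. The helper name hs2_url and the user name hadoop are assumptions; 10000 is HiveServer2's default port, while the second configuration above sets hive.server2.thrift.port to 10001.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a HiveServer2 JDBC URL from a host and a port.
hs2_url() {
  local host="${1:-localhost}" port="${2:-10000}"
  echo "jdbc:hive2://${host}:${port}"
}

url="$(hs2_url localhost 10000)"
echo "$url"

# Connect only when beeline is available; -u is the JDBC URL and -n the
# user name (hadoop is an assumed OS user, adjust as needed).
if command -v beeline > /dev/null 2>&1; then
  beeline -u "$url" -n hadoop
fi
```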
