Installing Hive on a Cluster

·清尘·

Article link: https://blog.csdn.net/u012969412/article/details/78008510

Deployment reference: http://blog.csdn.net/w12345_ww/article/details/51910030

Hive concepts reference: http://blog.csdn.net/liyongke89/article/details/51392910

Strongly recommended reading: http://blog.csdn.net/t1dmzks/article/details/72026876

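Hive is built on top of Hadoop MapReduce. Its architecture consists of the following components: the CLI (Command Line Interface), JDBC/ODBC, the Thrift Server, the Web GUI, the Metastore, and the Driver (Compiler, Optimizer, and Executor). These fall into two groups: server-side components and client-side components.

JDBC: Java DataBase Connectivity.

ODBC: Open Database Connectivity.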

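Server-side components:

1. Driver: consists of the Compiler, Optimizer, and Executor. It parses, compiles, and optimizes HiveQL (SQL-like) statements, generates an execution plan, and then calls the underlying MapReduce framework to run it.

2. Metastore: the metadata service. It stores Hive's metadata in a relational database; the supported databases include Derby and MySQL. Because the metadata is critical to Hive, the Metastore service can be split out and installed on a remote server cluster, which decouples it from the Hive service and makes Hive more robust.

3. Thrift service: Thrift is a software framework developed at Facebook for building scalable, cross-language services. Hive integrates it so that programs written in different languages can call Hive's interfaces.

Client-side components:

1. CLI: Command Line Interface.

2. Thrift client: it does not appear in the usual architecture diagram, but many of Hive's client interfaces, including JDBC and ODBC, are built on top of it.

3. Web GUI: Hive can also be accessed through a web page. This interface corresponds to Hive's HWI component (Hive Web Interface), whose service must be started before use. Note that HWI was removed in Hive 2.2.0, so it is not available in the 2.3.0 release installed below.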

Step 1: Download the Hive binary package

Pay attention to the version number. Download mirror: http://mirrors.hust.edu.cn/apache/

$ wget -O apache-hive-2.3.0-bin.tar.gz "http://mirrors.hust.edu.cn/apache/hive/hive-2.3.0/apache-hive-2.3.0-bin.tar.gz"
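
The later steps expect Hive to live in /home/hadoop/apache-hive-2.3.0-bin, so the tarball has to be unpacked first. The following is a minimal sketch of that step; the function name unpack_hive is hypothetical, and the target directory /home/hadoop is assumed from the HIVE_HOME value used in Step 2.

```shell
# Unpack a Hive binary tarball into a target directory.
# Usage: unpack_hive <tarball> <target-dir>
unpack_hive() {
    tarball="$1"
    target="$2"
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    mkdir -p "$target" || return 1
    # -x extract, -z gunzip, -f archive file, -C switch to the target directory first
    tar -xzf "$tarball" -C "$target"
}

# Example call (paths assumed from this article):
# unpack_hive apache-hive-2.3.0-bin.tar.gz /home/hadoop
```

After the call, /home/hadoop/apache-hive-2.3.0-bin should contain the bin, conf, and lib directories used in the following steps.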

Step 2: Configure environment variables

##################### hive env ################################
export HIVE_HOME=/home/hadoop/apache-hive-2.3.0-bin  # set HIVE_HOME to the Hive installation directory on this machine
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:$HIVE_HOME/lib/*:.
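
Once the variables are in place and the shell has re-read them (for example with `source /etc/profile`, assuming that is where they were added), a quick sanity check can catch a mistyped path before the later steps fail. This is only a sketch; check_hive_env is a hypothetical helper, not part of Hive.

```shell
# Check that HIVE_HOME points at a directory and that its bin/ is on PATH.
check_hive_env() {
    if [ -z "${HIVE_HOME:-}" ] || [ ! -d "$HIVE_HOME" ]; then
        echo "HIVE_HOME is unset or not a directory" >&2
        return 1
    fi
    # Wrap PATH in colons so the first and last entries match as well.
    case ":$PATH:" in
        *":$HIVE_HOME/bin:"*)
            echo "ok: $HIVE_HOME/bin is on PATH"
            ;;
        *)
            echo "$HIVE_HOME/bin is missing from PATH" >&2
            return 1
            ;;
    esac
}

# Example:
# source /etc/profile && check_hive_env
```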

Step 3: Install MySQL

Hive's metadata is updated frequently, so it is kept in a MySQL database, while the actual table data is stored in HDFS or HBase.

$ sudo apt-get install mysql-server

Start MySQL:

$ sudo service mysql start

Open the MySQL console:

$ mysql -u root -p <Enter>
$ <no password set, just press Enter>

Step 4: Create the hive user and database in MySQL

Create the hive user and the hive database:

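create database hive;
grant all on hive.* to hive@'localhost' identified by 'hive';  # creates the hive user (password "hive") and grants it the hive database
# To allow connections from other hosts, use instead: grant all on hive.* to hive@'%' identified by 'hive';
# On old MySQL versions the user can also be created with the statement below, but it still needs a grant on hive.*:
# insert into mysql.user(Host,User,Password) values("localhost","hive",password("hive"));
flush privileges;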
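
The statements above target older MySQL releases. From MySQL 5.7 the mysql.user table no longer has a Password column, and MySQL 8.0 no longer accepts `grant ... identified by`. On those versions the user is created with CREATE USER and then granted privileges separately. The sketch below writes such a script to a file; the helper name and the file name are hypothetical, and the password "hive" is simply the value used elsewhere in this article.

```shell
# Write the SQL that creates the hive database and user to the given file.
write_hive_user_sql() {
    cat > "$1" <<'SQL'
CREATE DATABASE IF NOT EXISTS hive;
CREATE USER IF NOT EXISTS 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'localhost';
FLUSH PRIVILEGES;
SQL
}

# Example:
# write_hive_user_sql create_hive_user.sql
# mysql -u root -p < create_hive_user.sql
```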

Verify that MySQL is configured correctly:

$ mysql -u hive -p <Enter>
$ <enter the password "hive", then press Enter>
mysql> show databases;

Output like the following, with hive in the list, means the setup succeeded.

+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| test               |
+--------------------+

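Step 5: Copy the MySQL JDBC driver into Hive

MySQL connector download page: https://dev.mysql.com/downloads/file/?id=472650

Download it to a directory of your choice, then copy the jar into Hive's lib directory.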

$ cp mysql-connector-java-5.1.44-bin.jar /home/hadoop/apache-hive-2.3.0-bin/lib

Step 6: Create the HDFS and local directories

Create the working directories on HDFS:

$ hdfs dfs -mkdir -p /hive/warehouse
$ hdfs dfs -mkdir -p /hive/logs
$ hdfs dfs -mkdir -p /hive/tmp

Create the matching local directories:

$ mkdir -p ./tmp/downloadedsource # run inside $HIVE_HOME
$ mkdir -p ./tmp/exec # run inside $HIVE_HOME

These directories are all referenced in hive-site.xml below.
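
Hive's Getting Started guide also makes its HDFS directories group-writable with `chmod g+w`. Applying the same idea to the /hive directories used in this article might look like the sketch below. The function name is hypothetical, and whether the chmod is needed depends on which users run Hive jobs on your cluster.

```shell
# Create Hive's HDFS directories and make them group-writable.
# The directory names follow this article; the chmod step is an
# assumption borrowed from the Hive Getting Started guide.
prepare_hive_hdfs_dirs() {
    for dir in /hive/warehouse /hive/logs /hive/tmp; do
        hdfs dfs -mkdir -p "$dir" || return 1
        hdfs dfs -chmod g+w "$dir" || return 1
    done
    # List the result so the permissions can be checked by eye.
    hdfs dfs -ls /hive
}

# Example (HDFS must be running):
# prepare_hive_hdfs_dirs
```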

Step 7: Edit the Hive configuration files

1. Create hive-env.sh and the other files from their templates

$ cd /home/hadoop/apache-hive-2.3.0-bin/conf
$ cp hive-env.sh.template hive-env.sh
$ cp hive-default.xml.template hive-site.xml
$ cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
$ cp hive-log4j2.properties.template hive-log4j2.properties

Edit the configuration file with:

$ sudo vim hive-env.sh

Set HADOOP_HOME and HIVE_CONF_DIR:
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/home/hadoop/hadoop-2.2.0 # can be omitted if HADOOP_HOME is already set in /etc/profile
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/hadoop/apache-hive-2.3.0-bin/conf

2. Create hive-default.xml and hive-site.xml

$ cd /home/hadoop/apache-hive-2.3.0-bin/conf
$ cp hive-default.xml.template hive-default.xml

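Note: hive-default.xml only documents the default settings (Hive ignores changes made to it), while hive-site.xml holds the site-specific settings that override the defaults.

Create a new hive-site.xml and edit it:

$ vim hive-site.xml

Replace its contents with the following.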

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>

    


    <!-- ################################### Basic settings ################################### -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name> 
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
        <!-- JDBC connection; make sure the host after // is localhost -->
    </property>

    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>username to use against metastore database</description> 
        <!-- MySQL user name for the hive database, as set in Step 4 -->
    </property>

    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
        <description>password to use against metastore database</description>  
        <!-- MySQL password for the hive database, as set in Step 4 -->
    </property>



    <property>
        <name>hive.metastore.uris</name>
        <value/>  <!-- left at the default (empty) -->
        <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
    </property>

    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/hive/warehouse</value>   <!-- this directory must exist on HDFS (created in Step 6) -->
        <description>location of default database for the warehouse; Hive's root directory on HDFS</description>
    </property>

    <property>
        <name>hive.exec.local.scratchdir</name>
        <value>/home/hadoop/apache-hive-2.3.0-bin/tmp/exec</value> <!-- local directory -->
        <description>Local scratch space for Hive jobs</description>
    </property>

    <property>
        <name>hive.downloaded.resources.dir</name>
        <value>/home/hadoop/apache-hive-2.3.0-bin/tmp/downloadedsource</value> <!-- local directory -->
        <description>Temporary local directory for added resources in the remote file system.</description>
    </property>

    <property>
        <name>datanucleus.autoCreateTables</name>
        <value>true</value>
        <description> If the database tables do not exist yet when the JDO API is used, create them automatically from the entity metadata </description>
    </property> 

    <property>
        <name>datanucleus.autoCreateSchema</name>
        <value>true</value>
        <description> If the database schema does not exist yet when the JDO API is used, create it automatically from the entity metadata </description>
    </property>    

    
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
        <description> Works around the exception "Caused by: MetaException(message:Version information not found in metastore. )" </description>
        <!-- note the autoCreateSchema, autoCreateTables, and schema.verification settings here -->
    </property>


</configuration>
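
Hive 2.x also ships a schematool command that creates the metastore tables explicitly. It is an alternative to relying on the auto-create settings above with hive.metastore.schema.verification disabled. The sketch below assumes the MySQL connection settings from hive-site.xml above and that $HIVE_HOME/bin is on PATH; the function name is hypothetical. It is meant for a fresh hive database, since -initSchema is expected to fail if the metastore tables already exist.

```shell
# Initialise the metastore schema in MySQL, then print the schema
# version that schematool reports for the metastore.
init_metastore_schema() {
    schematool -dbType mysql -initSchema || return 1
    schematool -dbType mysql -info
}

# Example (run once, before starting the metastore service):
# init_metastore_schema
```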

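Step 8: Start the Hive services

Hive only needs to be installed on one node of the cluster.

Before using Hive, start the metastore and HiveServer2 services.

In practice, both services are usually run in the background:

$ nohup hive --service metastore &
$ nohup hive --service hiveserver2 & # note: running hive --service hiveserver & here fails because the hiveserver service cannot be found

After that, type hive in another terminal to start working with Hive.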
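
Because HiveServer2 is started above, its JDBC endpoint can also be checked with Beeline, the client shipped in $HIVE_HOME/bin. The sketch below assumes HiveServer2's default port 10000 and the OS user hadoop (guessed from the /home/hadoop paths in this article); the function name is hypothetical. Depending on the cluster's Hadoop proxy-user settings, the connection may be refused until impersonation is allowed for that user.

```shell
# Connect to HiveServer2 over JDBC and list the databases.
# Usage: hs2_smoke_test [jdbc-url] [user]
hs2_smoke_test() {
    url="${1:-jdbc:hive2://localhost:10000}"
    user="${2:-hadoop}"
    # -u JDBC URL, -n user name, -e run one statement and exit
    beeline -u "$url" -n "$user" -e 'show databases;'
}

# Example:
# hs2_smoke_test
# hs2_smoke_test jdbc:hive2://192.168.199.123:10000 hadoop
```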

For ways to check Hive's running status, see: http://blog.csdn.net/gamer_gyt/article/details/52062460

HiveServer2 Web UI (replace the IP address with that of your own Hive node): http://192.168.199.123:10002/
