Hive Installation and Configuration

1. Hive is a data warehouse built on top of HDFS, so it needs a working Hadoop environment. For how to set up Hadoop, see this earlier post: http://blog.csdn.net/jthink_/article/details/38622297

2. Download Hive (hive-0.11.0.tar.gz) and put it in a suitable location (note: here it goes on the bg01 host):

For example, mine lives under /usr/local/bg.
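
A minimal sketch of the download and unpack steps, assuming the Apache archive URL for the 0.11.0 release (adjust the URL and target directory to your own setup):

# download the 0.11.0 release tarball (URL is an assumption: the Apache archive layout)
wget https://archive.apache.org/dist/hive/hive-0.11.0/hive-0.11.0.tar.gz
# unpack into /usr/local/bg on bg01
tar -zxvf hive-0.11.0.tar.gz -C /usr/local/bg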

3. Edit the configuration (the files below all live in the conf directory of the Hive installation):

Copy hive-default.xml.template to hive-default.xml; this file holds Hive's default configuration.

Create a new file, hive-site.xml; settings in it override those in hive-default.xml, so any tuning changes go into this file.

Copy hive-env.sh.template to hive-env.sh, with the following content:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hive and Hadoop environment variables here. These variables can be used
# to control the execution of Hive. It should be used by admins to configure
# the Hive installation (so that users do not have to set environment variables
# or set command line parameters to get correct behavior).
#
# The hive service being invoked (CLI/HWI etc.) is available via the environment
# variable SERVICE

# Hive Client memory usage can be an issue if a large number of clients
# are running at the same time. The flags below have been useful in
# reducing memory usage:
#
# if [ "$SERVICE" = "cli" ]; then
#   if [ -z "$DEBUG" ]; then
#     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit"
#   else
#     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
#   fi
# fi

# The heap size of the jvm started by hive shell script can be controlled via:
#
export HADOOP_HEAPSIZE=1024
#
# Larger heap size may be required when running queries over large number of files or partitions.
# By default hive shell scripts use a heap size of 256 (MB).  Larger heap size would also be
# appropriate for hive server (hwi etc).

# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/usr/local/bg/hadoop-1.2.1

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/usr/local/bg/hive-0.11.0/conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/usr/local/bg/hive-0.11.0/lib

4. Hive's metadata needs to be stored in a traditional RDBMS; MySQL is used here, so install MySQL first:

sudo apt-get install mysql-server

Then edit the my.cnf file and comment out the bind-address = 127.0.0.1 line (remember to use a # for the comment), so that MySQL is not restricted to local connections.
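
A minimal sketch of that edit, assuming the Ubuntu default config path /etc/mysql/my.cnf (check where your distribution keeps it):

# comment out bind-address so MySQL listens on all interfaces
sudo sed -i 's/^bind-address/#bind-address/' /etc/mysql/my.cnf
# restart MySQL so the change takes effect
sudo service mysql restart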

With that done, our hive-site.xml looks like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>

  <property>
    <name>hive.querylog.location</name>
    <value>/usr/local/bg/hive-0.11.0/log</value>
    <description>Location of Hive run time structured log file</description>
  </property>

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive-${user.name}</value>
    <description>Scratch space for Hive jobs</description>
  </property>

  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>
</configuration>
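
Before starting Hive, you can sanity-check the JDBC credentials above by connecting with the mysql client directly (an optional check; root/root matches the values in hive-site.xml):

# connect with the same account Hive will use
mysql -u root -proot -e 'SHOW DATABASES;'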


The /usr/local/bg/hive-0.11.0/log directory does not exist by default; you have to create it yourself.
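
For example:

mkdir -p /usr/local/bg/hive-0.11.0/log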

Oh, and you also need to download the MySQL JDBC driver JAR into Hive's lib directory.
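
A sketch of fetching it from Maven Central; the 5.1.49 version here is just an example, any mysql-connector-java release compatible with your MySQL server will do:

# fetch the connector JAR (version is an example)
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar
# drop it into Hive's lib directory so the metastore can load it
cp mysql-connector-java-5.1.49.jar /usr/local/bg/hive-0.11.0/lib/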

5. Start Hive

First set up the environment variables:

sudo vim /etc/profile

# set hive environment
export HIVE_HOME=/usr/local/bg/hive-0.11.0
export PATH=$PATH:$HIVE_HOME/bin
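
Then reload the profile so the new variables take effect in the current shell:

source /etc/profile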

The command is simply: hive. Then, inside the Hive shell, run:

show tables;

If that runs without errors, the configuration is working.

One more note: Hadoop has to be started before running Hive.
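
For a slightly fuller smoke test (a sketch; t_test is just an example table name), you can create, list, and drop a table in one shot from the command line:

# end-to-end check: DDL goes through the MySQL metastore, data dirs land in HDFS
hive -e "CREATE TABLE t_test (id INT, name STRING); SHOW TABLES; DROP TABLE t_test;"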
