Hive Installation and Integration with Spark

 

This post walks through the basics of running Spark SQL with Hive as the metastore, covering the following:

1. Hive installation

2. Integrating Spark with Hive

3. Spark SQL operations

Note: Hadoop and Spark need to be installed before following the steps in this post.

For Hadoop installation, see: https://my.oschina.net/u/729917/blog/1556872

For Spark installation, see: https://my.oschina.net/u/729917/blog/1556871

1. Hive installation

a) Install the MySQL database; installation guides are easy to find online.
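Hive's metastore will connect to MySQL with the account configured later in hive-site.xml (root / 0000 in this post). A minimal check after installing MySQL, assuming those credentials — note that the JDBC URL used later has createDatabaseIfNotExist=true, so pre-creating the hive database is optional:

mysql -u root -p
mysql> CREATE DATABASE IF NOT EXISTS hive;    -- optional: pre-create the metastore database
mysql> exit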

b) Download Hive from a mirror: http://mirror.bit.edu.cn/apache/hive/. In this post the downloaded archive is saved as /home/hadoop/tools/apache-hive-2.2.0-bin.tar.gz.

c) Move the archive to the target directory and extract it. Here it is extracted under /usr/local/, the same directory where Hadoop and Spark are installed.

sudo mv /home/hadoop/tools/apache-hive-2.2.0-bin.tar.gz /usr/local/
cd /usr/local
sudo tar -zxvf apache-hive-2.2.0-bin.tar.gz

d) Configure environment variables

vim ~/.bashrc
export HIVE_HOME=/usr/local/apache-hive-2.2.0-bin
export PATH=$PATH:${HIVE_HOME}/bin

Apply the environment variables:

 source ~/.bashrc
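A quick way to confirm the variables took effect (assuming the paths above):

echo $HIVE_HOME     # should print /usr/local/apache-hive-2.2.0-bin
hive --version      # should report Hive 2.2.0 if bin is on PATH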

e) Create a new hive-site.xml in the conf directory and configure Hive to store its metadata in MySQL.

hadoop@Master:/usr/local/apache-hive-2.2.0-bin/conf$ touch hive-site.xml 

The contents of hive-site.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
   <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>0000</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
        <description>
        Enforce metastore schema version consistency.
        True: Verify that version information stored in metastore matches with one from Hive jars.  Also disable automatic
              schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
              proper metastore schema migration. (Default)
        False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
        </description>
    </property>
</configuration>
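Two things are usually also needed before starting a MySQL-backed Hive 2.x metastore (these are assumptions about a typical setup, not steps shown in the original post): the MySQL JDBC driver jar must be in Hive's lib directory, and the metastore schema normally has to be initialized once with schematool. The driver jar name and location below are only examples.

# copy the MySQL JDBC driver into Hive's lib directory (jar name/version is illustrative)
sudo cp /home/hadoop/tools/mysql-connector-java-5.1.44.jar /usr/local/apache-hive-2.2.0-bin/lib/

# initialize the metastore schema in MySQL (run once)
schematool -dbType mysql -initSchema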

f) Start Hive by simply running the hive command.

Output on a successful start:

hadoop@Master:~$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.2.0-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/apache-hive-2.2.0-bin/lib/hive-common-2.2.0.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 
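At the hive> prompt, a quick smoke test can confirm the MySQL-backed metastore is working; the test table below is only an illustration:

hive> show databases;
hive> create table test(id int, name string);
hive> show tables;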

2. Integrating Spark with Hive

a) Create a hive-site.xml file in Spark's conf directory:

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ touch hive-site.xml 
<configuration>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://Master:9083</value>
    </property>
</configuration>

b) Start Hadoop and Spark, for example:
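A sketch using the standard start scripts, assuming the install locations used elsewhere in this post and a Spark standalone cluster:

# start HDFS and YARN
/usr/local/hadoop-2.7.3/sbin/start-dfs.sh
/usr/local/hadoop-2.7.3/sbin/start-yarn.sh

# start the Spark standalone cluster
/usr/local/spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh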

c) Start the Hive metastore service:

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/bin$ hive --service metastore&
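Before moving on, it can help to confirm the metastore service is actually listening on port 9083 (the port referenced in Spark's hive-site.xml); one way:

netstat -tlnp | grep 9083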

d) Start spark-sql to test the integration:

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/bin$ ./spark-sql 

Partial output after a successful start:

17/11/19 21:50:37 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a
17/11/19 21:50:37 INFO SessionState: Created local directory: /tmp/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a
17/11/19 21:50:37 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a/_tmp_space.db
17/11/19 21:50:37 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/usr/local/spark-2.2.0-bin-hadoop2.7/bin/spark-warehouse
17/11/19 21:50:37 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
17/11/19 21:50:38 INFO SessionState: Created local directory: /tmp/2110b645-b83e-4b65-87a8-5e9f1482699e_resources
17/11/19 21:50:38 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e
17/11/19 21:50:38 INFO SessionState: Created local directory: /tmp/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e
17/11/19 21:50:38 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e/_tmp_space.db
17/11/19 21:50:38 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/usr/local/spark-2.2.0-bin-hadoop2.7/bin/spark-warehouse
spark-sql> 
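From the spark-sql> prompt, queries now run against the same MySQL-backed metastore as Hive. A small illustrative check — the test table refers to the one created in the earlier Hive session and is only an example:

spark-sql> show databases;
spark-sql> show tables;
spark-sql> select * from test;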

 

Reposted from: https://my.oschina.net/u/729917/blog/1575860
