Hive on Spark Explained (Installation Packages Included)

Installation Prerequisites

  1. Pre-modified Hive source package: apache-hive-3.1.3-src.zip
  2. Hadoop-free ("without hadoop") Spark package: spark-3.2.2-bin-without-hadoop.tgz
  3. Maven distribution: apache-maven-3.6.1-bin.tar.gz
    Download link (Baidu Netdisk):
    https://pan.baidu.com/s/1e4JGWbzGv-e0xpPB_G3nqQ?pwd=g7gz
    Extraction code: g7gz

Install Maven

  1. Extract (the tarball unpacks to apache-maven-3.6.1, without the -bin suffix):
	tar -zxvf apache-maven-3.6.1-bin.tar.gz -C /opt/module/
	cd /opt/module
	mv apache-maven-3.6.1 maven-3.6.1
  2. Create a local Maven repository directory:
	mkdir -p /opt/module/maven_repo
  3. Configure conf/settings.xml (the stray nested <mirrors> tag and the duplicate mirror id in the original snippet are fixed here; mirror ids must be unique):
 <!-- Local repository path -->
 <localRepository>/opt/module/maven_repo</localRepository>
 <!-- Remote repository mirrors -->
 <mirrors>
    <!-- Aliyun mirror of Maven Central -->
    <mirror>
      <id>alimaven</id>
      <name>aliyun maven</name>
      <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>

    <mirror>
        <id>aliyun-spring-plugin</id>
        <mirrorOf>*</mirrorOf>
        <name>spring-plugin</name>
        <url>https://maven.aliyun.com/repository/spring-plugin</url>
    </mirror>

    <mirror>
        <id>aliyunmaven</id>
        <mirrorOf>*</mirrorOf>
        <name>Aliyun public repository</name>
        <url>https://maven.aliyun.com/repository/public</url>
    </mirror>

    <mirror>
        <id>nexus-aliyun</id>
        <mirrorOf>*,!cloudera</mirrorOf>
        <name>Nexus aliyun</name>
        <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    </mirror>
 </mirrors>
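
With settings.xml in place, it helps to put mvn on the PATH before building. A minimal sketch, assuming environment variables are managed through /etc/profile.d (the file name is illustrative; any sourced profile works):

	# /etc/profile.d/maven.sh -- illustrative location
	export MAVEN_HOME=/opt/module/maven-3.6.1
	export PATH=$PATH:$MAVEN_HOME/bin

Then source the file and run mvn -version to confirm Maven 3.6.1 is picked up.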

Compile and Install Hive

  1. Extract:
	unzip apache-hive-3.1.3-src.zip
  2. Compile and package (tests are skipped; -Dmaven.javadoc.skip=true additionally skips javadoc generation, which can otherwise fail the build):
	mvn clean package -Pdist -DskipTests -Dmaven.javadoc.skip=true
  3. If the build fails on the WebHCat module, package hcatalog separately, then rerun the full build:
	cd hcatalog
	mvn package -Pdist -DskipTests
	cd ..
	mvn clean package -Pdist -DskipTests
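
Once the full build succeeds, the packaged distribution lands under packaging/target; a quick check before moving on:

	ls packaging/target/ | grep apache-hive-3.1.3-bin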
  4. When packaging completes, move the built distribution out of packaging/target:
	mv /opt/software/apache-hive-3.1.3-src/packaging/target/apache-hive-3.1.3-bin /opt/module
	cd /opt/module
	mv apache-hive-3.1.3-bin hive-3.1.3
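
Optionally export HIVE_HOME as well, so the hive and schematool commands below can be run from anywhere. A minimal sketch, same /etc/profile.d assumption as above:

	# /etc/profile.d/hive.sh -- illustrative location
	export HIVE_HOME=/opt/module/hive-3.1.3
	export PATH=$PATH:$HIVE_HOME/bin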
  5. Create conf/hive-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop102:3306/hive?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>000000</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>hadoop102</value>
    </property>
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.execution.engine</name>
        <value>spark</value>
    </property>
    <property>
        <name>spark.yarn.jars</name>
        <value>hdfs://hadoop102:8020/spark-jars/*</value>
    </property>
    <property>
        <name>hive.spark.client.connect.timeout</name>
        <value>10000ms</value>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://hadoop102:9083</value>
    </property>
</configuration>
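
Since hive-site.xml points the metastore at MySQL, the MySQL JDBC driver must be in Hive's lib and the metastore schema must be initialized once before first use. A minimal sketch; the driver jar name is illustrative, so match it to your MySQL version:

	# put the MySQL JDBC driver on Hive's classpath (jar name is illustrative)
	cp mysql-connector-java-5.1.49.jar /opt/module/hive-3.1.3/lib/
	# create the metastore tables in the hive database on hadoop102
	/opt/module/hive-3.1.3/bin/schematool -dbType mysql -initSchema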

  6. Create the Spark event log directory on HDFS (use an absolute path so it matches spark.eventLog.dir below):
	hdfs dfs -mkdir /spark-history
  7. In Hive's conf directory, create the Spark configuration file spark-defaults.conf (a properties file, not XML):
	spark.master                     yarn
	spark.eventLog.enabled           true
	spark.eventLog.dir               hdfs://hadoop102:8020/spark-history
	spark.executor.memory            1g
	spark.driver.memory              1g

Install and Configure Spark

  1. Extract:
	tar -zxvf spark-3.2.2-bin-without-hadoop.tgz -C /opt/module
	cd /opt/module
	mv spark-3.2.2-bin-without-hadoop spark-3.2.2
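
As with Maven and Hive, exporting SPARK_HOME keeps later paths short. A minimal sketch, same /etc/profile.d assumption:

	# /etc/profile.d/spark.sh -- illustrative location
	export SPARK_HOME=/opt/module/spark-3.2.2
	export PATH=$PATH:$SPARK_HOME/bin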
  2. conf/spark-env.sh (the without-hadoop build needs Hadoop's classpath supplied explicitly):
	export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
	export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
	export SPARK_DIST_CLASSPATH=$(hadoop classpath)
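
The without-hadoop build relies entirely on that classpath expansion, so it is worth confirming that hadoop resolves for the user who will run Hive:

	# should print a long colon-separated list of Hadoop jar and conf paths
	hadoop classpath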
  3. conf/spark-defaults.conf:
	spark.yarn.historyServer.address    hadoop102:18080
	spark.yarn.jars                     hdfs://hadoop102:8020/spark-jars/*
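
spark.yarn.historyServer.address assumes a Spark history server is listening on hadoop102:18080. A minimal sketch for bringing one up against the event log directory configured earlier (the extra property is an assumption; add it to spark-defaults.conf if it is not already set):

	# spark-defaults.conf also needs:
	#   spark.history.fs.logDirectory  hdfs://hadoop102:8020/spark-history
	/opt/module/spark-3.2.2/sbin/start-history-server.sh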
  4. Adjust the jars under Spark's jars directory: remove the conflicting log4j 1.x and Parquet jars, copy in Hive's client jars, then upload Spark's jars to the HDFS directory that spark.yarn.jars points to:
	cd /opt/module/spark-3.2.2/jars/
	rm -f log4j-1.2.17.jar
	rm -f parquet-hadoop-1.12.2.jar
	cd /opt/module/hive-3.1.3/lib/
	cp log4j-*.jar parquet-hadoop-bundle-1.10.0.jar hive-beeline-3.1.3.jar \
	   hive-cli-3.1.3.jar hive-exec-3.1.3.jar hive-jdbc-3.1.3.jar \
	   hive-metastore-3.1.3.jar /opt/module/spark-3.2.2/jars/
	hdfs dfs -mkdir /spark-jars
	hdfs dfs -put /opt/module/spark-3.2.2/jars/* /spark-jars
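
A quick sanity check that the upload landed where spark.yarn.jars points:

	hdfs dfs -ls /spark-jars | head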

Start and Test

  1. Copy the Spark client jars Hive needs into hive-3.1.3/lib:
	cd /opt/module/spark-3.2.2/jars/
	cp kryo-shaded-4.0.2.jar minlog-1.3.0.jar scala-xml_2.12-1.2.0.jar \
	   spark-core_2.12-3.2.2.jar chill-java-0.10.0.jar chill_2.12-0.10.0.jar \
	   jackson-module* jersey-container-servlet-core* jersey-server-2.34.jar \
	   json4s-ast_2.12-3.7.0-M11.jar spark-launcher_2.12-3.2.2.jar \
	   spark-network-shuffle_2.12-3.2.2.jar spark-unsafe_2.12-3.2.2.jar \
	   xbean-asm9-shaded-4.20.jar /opt/module/hive-3.1.3/lib/
  2. Start bin/hive and run a test query (the student table must already exist; a sketch for creating one follows):
	select count(1) from student;
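
If no student table exists yet, a small throwaway table is enough to exercise the engine; the insert statement itself should launch a Spark job on YARN. Table name and rows here are illustrative:

	-- illustrative test table; any small table works
	create table if not exists student (id int, name string);
	insert into student values (1, 'aa'), (2, 'bb');
	select count(1) from student;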
  3. Results: the Hive query output and the YARN application page (screenshots omitted).