Installing Tez 0.9.0 on Hive 2.3.6
Prerequisites
- Hadoop (mine is 2.7.1)
- Hive (mine is 2.3.6)
Preparing the Tez environment
Download the Tez release archive (download link -) and extract it (mine is 0.9.0).
Go into the conf directory under the Tez install directory:
[root@hadoop001 conf]# vi tez-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- The Tez tarball on HDFS; for basic use, this one property is enough -->
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez-0.9.0/tez.tar.gz</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
  </property>
  <property>
    <name>tez.history.logging.service.class</name>
    <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
  </property>
</configuration>
Upload the Tez tarball to HDFS (note: this is the tez.tar.gz found under the share directory after extraction, not the archive you originally downloaded):
hdfs dfs -mkdir /tez-0.9.0
hdfs dfs -put /usr/local/tez-0.9.0/share/tez.tar.gz /tez-0.9.0
The Hadoop jars under Tez's lib directory do not match the Hadoop version actually installed, so they must be replaced with matching ones.
# Delete the jars with the wrong version:
[root@hadoop01 tez-0.9.0]# rm -rf ./lib/hadoop-mapreduce-client-core-2.7.0.jar ./lib/hadoop-mapreduce-client-common-2.7.0.jar
# Copy the matching jars from the Hadoop directory:
[root@hadoop01 tez-0.9.0]# cp /usr/local/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.1.jar /usr/local/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.1.jar /usr/local/tez-0.9.0/lib/
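The two commands above can be generalized with a glob. The sketch below rehearses the swap against fake files under a mktemp directory so it is safe to run anywhere; on a real node the two roots would be /usr/local/tez-0.9.0 and /usr/local/hadoop-2.7.1.

```shell
# Throwaway layout standing in for the real install directories.
TEZ=$(mktemp -d); HADOOP=$(mktemp -d)
mkdir -p "$TEZ/lib" "$HADOOP/share/hadoop/mapreduce"
touch "$TEZ/lib/hadoop-mapreduce-client-core-2.7.0.jar" \
      "$TEZ/lib/hadoop-mapreduce-client-common-2.7.0.jar"
touch "$HADOOP/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.1.jar" \
      "$HADOOP/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.1.jar"

# Drop the 2.7.0 copies, pull in the 2.7.1 ones.
rm -f "$TEZ"/lib/hadoop-mapreduce-client-*-2.7.0.jar
cp "$HADOOP"/share/hadoop/mapreduce/hadoop-mapreduce-client-*-2.7.1.jar "$TEZ/lib/"

ls "$TEZ/lib"
```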
Installing Tez into Hive
Configure hive-env.sh, adding the following:
# set TEZ_HOME according to your actual install location
export TEZ_HOME=/usr/local/tez-0.9.0
export TEZ_JARS=""
for jar in `ls $TEZ_HOME |grep jar`; do
export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done
# find the matching lzo jar under your own Hadoop install
export HIVE_AUX_JARS_PATH=/usr/local/hadoop-2.7.1/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar$TEZ_JARS
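To see what those loops actually produce, here is a throwaway sketch that reproduces them against fake jars in a temp directory (the jar names are made up for the demo):

```shell
# Fake Tez layout under mktemp, standing in for /usr/local/tez-0.9.0.
TEZ_HOME=$(mktemp -d)
mkdir -p "$TEZ_HOME/lib"
touch "$TEZ_HOME/tez-api-0.9.0.jar" "$TEZ_HOME/lib/commons-io-2.4.jar"

# Same loops as in hive-env.sh above.
TEZ_JARS=""
for jar in `ls $TEZ_HOME | grep jar`; do
  TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
  TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done

echo "$TEZ_JARS"
```

Note that TEZ_JARS ends up with a leading colon, which is why hive-env.sh can append it directly after the lzo jar path without inserting another separator.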
Alternatively, you can simply copy the jars under tez and tez/lib into $HIVE_HOME/lib (then no hive-env.sh configuration is needed).
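That copy alternative boils down to one cp. The sketch below runs it against throwaway directories so it can be tried anywhere; on a real node TEZ_HOME would be /usr/local/tez-0.9.0 and HIVE_HOME your Hive install.

```shell
# Fake Tez and Hive layouts under mktemp; jar names are made up.
TEZ_HOME=$(mktemp -d); HIVE_HOME=$(mktemp -d)
mkdir -p "$TEZ_HOME/lib" "$HIVE_HOME/lib"
touch "$TEZ_HOME/tez-api-0.9.0.jar" "$TEZ_HOME/lib/commons-io-2.4.jar"

# Copy every jar from tez and tez/lib into Hive's lib directory.
cp "$TEZ_HOME"/*.jar "$TEZ_HOME"/lib/*.jar "$HIVE_HOME/lib/"
ls "$HIVE_HOME/lib"
```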
Start Hive and test
hive (default)> set hive.execution.engine=tez;
hive (default)> select
> id,
> count(id)
> from sq11
> group by id;
03:38:10.961 [59fd501b-ea06-4ce3-83e3-24b71a218eb8 main] ERROR org.apache.hadoop.hdfs.KeyProviderCache - Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
Query ID = root_20190919033809_a5d0a892-78d7-4c21-ae26-234fc6429664
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1568834471830_0001)
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 6.99 s
----------------------------------------------------------------------------------------------
If the progress bar appears, the setup works.
Installing Tez into Hadoop
This approach affects the existing Hadoop cluster: every MapReduce job running on YARN is forced onto the Tez engine, so everything Hive runs naturally uses Tez as well.
Modify hadoop-env.sh
Add the following:
export TEZ_HOME=/usr/local/tez-0.9.0 # your Tez install directory
for jar in `ls $TEZ_HOME |grep jar`; do
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_HOME/lib/$jar
done
Modify mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn-tez</value>
</property>
Sync the modified files to every node in the cluster.
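A sketch of that sync step, assuming two hypothetical workers named hadoop002 and hadoop003 and the install path used throughout this post. The snippet only prints the scp commands so it is safe to run anywhere; pipe the output to sh (or run the commands by hand) to actually copy.

```shell
# Hypothetical config dir and worker hosts -- adjust to your cluster.
HADOOP_CONF=/usr/local/hadoop-2.7.1/etc/hadoop
OUT=$(for host in hadoop002 hadoop003; do
  # Print the copy command for each worker instead of executing it.
  printf 'scp %s/hadoop-env.sh %s/mapred-site.xml %s:%s/\n' \
    "$HADOOP_CONF" "$HADOOP_CONF" "$host" "$HADOOP_CONF"
done)
echo "$OUT"
```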
Test whether Hive now runs Tez jobs directly
Run a statement:
set hive.execution.engine=tez;
hive (default)> select
> id,
> count(id)
> from zzy.l1
> group by id;
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2019-09-19 04:10:56,905 Stage-1 map = 0%, reduce = 0%
2019-09-19 04:11:01,094 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1568837085850_0002
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
A 8
B 7
Time taken: 17.341 seconds, Fetched: 2 row(s)
Original post: https://blog.csdn.net/weixin_43326165/article/details/100997261
If you run into errors, check the Hive log file (by default /tmp/<username>/hive.log).
Errors like these are typically caused by the process being killed for running out of memory (Tez is demanding on memory); try allocating more memory, or tuning YARN.
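If the logs show containers being killed for exceeding their memory limit, one common first step is to cap Tez's memory requests in tez-site.xml so they fit inside what YARN can actually grant. This is an illustrative sketch, not from the original post; the values must be tuned to your cluster.

```xml
<!-- Illustrative values only; keep them at or below
     yarn.scheduler.maximum-allocation-mb on your cluster. -->
<property>
  <name>tez.am.resource.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>tez.task.resource.memory.mb</name>
  <value>1024</value>
</property>
```

On the Hive side, the container size can likewise be lowered per session with set hive.tez.container.size=1024; before running the query.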