- Download the Tez source code to compile it. Tez 0.8 supports Hadoop 2.6+ and Tez 0.9 supports Hadoop 2.7+. CDH 5.x uses Hadoop 2.6, so download tez-0.8.5 here.
tar -zxvf tez-0.8.5.tar.gz
- Install the build environment
2.1 Install JDK 1.8
2.2 Install Maven 3
Download the package: apache-maven-3.5.4-bin.tar.gz
mkdir -p /usr/local/software/maven
tar -zxvf apache-maven-3.5.4-bin.tar.gz -C /usr/local/software/maven
[root@cm ~]# vim /etc/profile
export MAVEN_HOME=/usr/local/software/maven/apache-maven-3.5.4
export PATH=${MAVEN_HOME}/bin:$PATH
source /etc/profile
2.3 Install OS dependencies
yum -y install gcc gcc-c++ libstdc++-devel make build
2.4 Install Protobuf 2.5.0, which has to be built and installed from source
https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
[root@cdh05 tez-0.8.5]# tar -zxvf protobuf-2.5.0.tar.gz
[root@cdh05 tez-0.8.5]# cd protobuf-2.5.0/
[root@cdh05 protobuf-2.5.0]# ./configure
[root@cdh05 protobuf-2.5.0]# make && make install
[root@cdh05 tez-0.8.5]# protoc --version
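The Tez build expects exactly Protobuf 2.5.0, so it can help to check the reported version before starting Maven. The following is a minimal sketch; the helper name `is_protoc_250` is made up for illustration, and it assumes `protoc --version` prints a single line ending in the version number.

```shell
# Hypothetical helper: succeeds only when the given `protoc --version`
# output reports version 2.5.0.
is_protoc_250() {
    # Keep the last whitespace-separated field of the version line.
    version=$(printf '%s\n' "$1" | awk '{print $NF}')
    [ "$version" = "2.5.0" ]
}

# Usage on the build host (left commented out so the snippet has no side effects):
# is_protoc_250 "$(protoc --version)" || echo "protoc 2.5.0 is required" >&2
```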
- Modify the Tez project
3.1 Modify pom.xml
3.1.1 Change the Hadoop dependency version

3.1.2 Add the Cloudera repository
<repositories>
  <repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    <name>Cloudera Repositories</name>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
<pluginRepositories>
  <pluginRepository>
    <id>cloudera</id>
    <name>Cloudera Repositories</name>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </pluginRepository>
</pluginRepositories>

3.1.3 Exclude the three modules tez-ext-service-tests, tez-ui, and tez-ui2 from the build for now (comment out their module entries)
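One way to exclude the modules without editing by hand is to wrap their `<module>` entries in XML comments. This is only a sketch: the function name `disable_modules` is hypothetical, and it assumes each module sits on its own line as `<module>name</module>` in the root pom.xml. A `.bak` copy of the file is left next to the original.

```shell
# Hypothetical helper: comment out the <module> entries for the given names.
# $1 = path to pom.xml, remaining arguments = module names to disable.
disable_modules() {
    pom=$1
    shift
    for m in "$@"; do
        # Match only lines that still start with <module>, so a second run
        # does not wrap an entry that is already commented out.
        sed -i.bak "s|^\([[:space:]]*\)<module>$m</module>|\1<!-- <module>$m</module> -->|" "$pom"
    done
}

# Usage (run from the Tez source root):
# disable_modules pom.xml tez-ext-service-tests tez-ui tez-ui2
```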

3.1.4 Add dependencies
<dependency>
  <groupId>org.codehaus.jackson</groupId>
  <artifactId>jackson-mapper-asl</artifactId>
  <version>1.9.13</version>
</dependency>
<dependency>
  <groupId>org.codehaus.jackson</groupId>
  <artifactId>jackson-core-asl</artifactId>
  <version>1.9.13</version>
</dependency>
<dependency>
  <groupId>org.codehaus.jackson</groupId>
  <artifactId>jackson-jaxrs</artifactId>
  <version>1.9.13</version>
</dependency>
<dependency>
  <groupId>org.codehaus.jackson</groupId>
  <artifactId>jackson-xc</artifactId>
  <version>1.9.13</version>
</dependency>
3.1.5 Modify JobContextImpl.java
vi /root/wf/apache-tez-0.8.5-src/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
Add the following method at the end of the class:
  /**
   * Get the boolean value for the property that specifies which classpath
   * takes precedence when tasks are launched. True - user's classes takes
   * precedence. False - system's classes takes precedence.
   * @return true if user's classes should take precedence
   */
  @Override
  public boolean userClassesTakesPrecedence() {
    return getJobConf().getBoolean(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, false);
  }
- Build the Tez project
mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true
After the build finishes, the Tez packages are under apache-tez-0.8.5-src/tez-dist/target
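Before moving on, it may be worth confirming that both tarballs were actually produced. The sketch below uses a hypothetical helper `check_tez_artifacts` and assumes the build emits `tez-0.8.5.tar.gz` and `tez-0.8.5-minimal.tar.gz` in the target directory.

```shell
# Hypothetical helper: verify that the expected Tez tarballs exist.
# $1 = path to tez-dist/target
check_tez_artifacts() {
    target=$1
    for f in tez-0.8.5.tar.gz tez-0.8.5-minimal.tar.gz; do
        if [ ! -f "$target/$f" ]; then
            echo "missing: $target/$f" >&2
            return 1
        fi
    done
    return 0
}

# Usage:
# check_tez_artifacts apache-tez-0.8.5-src/tez-dist/target
```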

- Upload the Tez package to HDFS
hdfs dfs -mkdir /user/tez
hdfs dfs -chmod -R 775 /user/tez/
hdfs dfs -put tez-0.8.5.tar.gz /user/tez/
hdfs dfs -ls /user/tez
- Create the Tez directory /opt/cloudera/parcels/tez on the Linux host
cd /opt/cloudera/parcels
mkdir tez
cd tez
mkdir conf
- Copy the jars under tez-0.8.5-minimal into the Tez directory
cp tez-0.8.5-minimal/*.jar /opt/cloudera/parcels/tez/
cp -r tez-0.8.5-minimal/lib /opt/cloudera/parcels/tez/

- Create the Tez configuration file
cd /opt/cloudera/parcels/tez/conf
vi tez-site.xml

<configuration>
  <property>
    <name>tez.lib.uris</name>
    <!-- Points to the Tez tar.gz package on HDFS -->
    <value>/user/tez/tez-0.8.5.tar.gz</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>false</value>
    <description>Whether to use the cluster's own Hadoop libraries. If true, the minimal Tez package can be used; if false, the full tez-0.8.5.tar.gz package is required.</description>
  </property>
  <property>
    <name>hive.tez.container.size</name>
    <value>4096</value>
    <description>Set hive.tez.container.size to be the same as, or a small multiple (1 or 2 times) of, the YARN container size yarn.scheduler.minimum-allocation-mb, but NEVER more than yarn.scheduler.maximum-allocation-mb</description>
  </property>
  <property>
    <name>tez.task.launch.env</name>
    <value>LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native</value>
  </property>
  <property>
    <name>tez.am.launch.env</name>
    <value>LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native</value>
  </property>
</configuration>
- Configure Tez in CM: go to the Hive configuration and set
HADOOP_CLASSPATH=/opt/cloudera/parcels/tez/conf:/opt/cloudera/parcels/tez/*:/opt/cloudera/parcels/tez/lib/*
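The classpath value is built from three entries under the Tez directory: the conf directory, the top-level jars, and the lib jars. If the install location differs, the value could be generated rather than typed; the helper name `tez_classpath` below is hypothetical.

```shell
# Hypothetical helper: print the HADOOP_CLASSPATH value for a Tez install.
# $1 = Tez install directory (defaults to the path used in this guide).
tez_classpath() {
    tez_home=${1:-/opt/cloudera/parcels/tez}
    # The asterisks are kept literal: the JVM expands classpath wildcards itself.
    printf '%s\n' "$tez_home/conf:$tez_home/*:$tez_home/lib/*"
}

# Usage:
# echo "HADOOP_CLASSPATH=$(tez_classpath)"
```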


- Restart Hive
Running a Tez job fails with a kryo exception

Fix:
cp /opt/cloudera/parcels/CDH/jars/kryo-2.22.jar /opt/cloudera/parcels/tez/lib
The tez-0.8.5.tar.gz uploaded to HDFS must also be repacked to include kryo-2.22.jar
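Repacking the tarball could look like the sketch below. The function name `add_jar_to_tarball` is hypothetical, and the sketch assumes the Tez tarball has no wrapping top-level directory, so the jar goes straight into `lib/` at the archive root. Use absolute paths for all three arguments.

```shell
# Hypothetical helper: add a jar to the lib/ directory of a tarball.
# $1 = original tarball, $2 = jar to add, $3 = output tarball
add_jar_to_tarball() {
    src=$1
    jar=$2
    out=$3
    work=$(mktemp -d) || return 1
    # Unpack, drop the jar into lib/, and pack everything again.
    tar -xzf "$src" -C "$work" || return 1
    mkdir -p "$work/lib"
    cp "$jar" "$work/lib/" || return 1
    tar -czf "$out" -C "$work" . || return 1
    rm -rf "$work"
}

# Usage, followed by replacing the copy on HDFS (-f overwrites the old file):
# add_jar_to_tarball "$PWD/tez-0.8.5.tar.gz" /opt/cloudera/parcels/CDH/jars/kryo-2.22.jar /tmp/tez-0.8.5.tar.gz
# hdfs dfs -put -f /tmp/tez-0.8.5.tar.gz /user/tez/
```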
- Copy the /opt/cloudera/parcels/tez directory to all nodes
scp -r tez/ bdpnode2:/opt/cloudera/parcels/
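With more than one worker node, the copy can be repeated in a loop. The following is a sketch only: the function name `sync_tez_dir` and the `DRY_RUN` switch are made up for illustration, and any host other than bdpnode2 is a placeholder. By default it only prints the commands it would run.

```shell
# Hypothetical helper: copy the Tez directory to every host given as an argument.
# Set DRY_RUN=0 to actually run scp; the default only prints the commands.
sync_tez_dir() {
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp -r /opt/cloudera/parcels/tez $host:/opt/cloudera/parcels/"
        else
            scp -r /opt/cloudera/parcels/tez "$host:/opt/cloudera/parcels/"
        fi
    done
}

# Usage (bdpnode3 is a placeholder host name):
# DRY_RUN=0 sync_tez_dir bdpnode2 bdpnode3
```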
- Restart Hive and verify


Set the Tez queue
set tez.queue.name=root.test;
Set Tez as the Hive execution engine
set hive.execution.engine=tez;
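For a quick smoke test from the command line, the two settings can be placed in front of a query and passed to `hive -e`. The helper name `tez_session_sql` is hypothetical, and the table in the usage line is a placeholder for any existing table.

```shell
# Hypothetical helper: print a Hive script that runs one query on Tez.
# $1 = YARN queue name, $2 = query text without the trailing semicolon
tez_session_sql() {
    queue=$1
    query=$2
    printf 'set hive.execution.engine=tez;\nset tez.queue.name=%s;\n%s;\n' "$queue" "$query"
}

# Usage (some_table is a placeholder):
# hive -e "$(tez_session_sql root.test 'select count(*) from some_table')"
```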




