Recently, to improve the performance of complex SQL processing, I decided to set up Hive on Tez on our existing cluster to speed up ETL. I hit a few pitfalls along the way, so I'm recording them here.

    The production environment runs hadoop2.3.0-cdh5.0.2. First, download the Tez 0.5.3 source from the official site and install it following the official documentation: http://tez.apache.org/install.html. Deploying Tez is straightforward; just follow the docs step by step. One thing to watch out for: the latest Tez ships a Tez UI that depends on Hadoop 2.4.0, so if hadoop.version in the source's pom.xml is set to anything earlier than 2.4.0 the build will fail. That's fine though; build against Hadoop 2.4.0 and it still runs correctly on Hadoop 2.3.0. Also, do not put tez-site.xml in a directory managed by CDH, such as /etc/hadoop/conf, because a cluster restart will delete the file. For more detailed Tez configuration, see http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.7/bk_installing_manually_book/content/rpm-chap-tez_configure_tez.html
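    For reference, about the only setting tez-site.xml really has to contain is tez.lib.uris, pointing at the Tez tarball you upload to HDFS after the build. A minimal sketch, assuming the tarball was uploaded to an illustrative /apps/tez-0.5.3 directory (the HDFS path is only an example, adjust it to your cluster):

  <property>
    <name>tez.lib.uris</name>
    <!-- HDFS location of the tez-dist tarball produced by the build; the path below is only an example -->
    <value>hdfs:///apps/tez-0.5.3/tez-0.5.3.tar.gz</value>
  </property>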


    Once Tez is deployed, download the Hive 0.14.0 binary release and simply unpack it. Then create a hive-site.xml under Hive's conf directory with the usual settings; to make this Hive instance run on Tez, add the following to that file:

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn-tez</value>
  </property>

Of course, this setting could also go into mapred-site.xml, but I recommend adding it only to the hive-site.xml of the Hive instance that needs it, so it doesn't affect other Hive installations on the cluster.

     Then open the Hive CLI and start by running set hive.execution.engine=tez; (a short session sketch follows the summary below). A quick word on what this setting means in combination with the framework name:


Setting execution engine to mr and framework name to yarn = Hive compiles to MR and runs on MR.

Setting execution engine to mr and framework name to yarn-tez = Hive compiles to MR and runs on Tez.

Setting execution engine to tez = Hive compiles to Tez and runs on Tez.
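To make the last combination concrete, here is a minimal session sketch (demo_table is a made-up table name used only for illustration). With mapreduce.framework.name already set to yarn-tez in hive-site.xml, the query below is compiled to and executed as a Tez DAG:

hive> set hive.execution.engine=tez;
hive> select count(*) from demo_table;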


     If the job fails with an error like this:

Vertex failed, vertexName=Map 1, vertexId=vertex_1421076019070_0156_1_01, diagnostics=[Task failed, taskId=task_1421076019070_0156_1_01_000000, diagnostics=[AttemptID:attempt_1421076019070_0156_1_01_000000_0 Info:Container container_1421076019070_0156_01_000002 COMPLETED with diagnostics set to [Exception from container-launch.
Container id: container_1421076019070_0156_01_000002
Exit code: 255
Stack trace: ExitCodeException exitCode=255: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:196)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
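The container-launch message above only reports exit code 255; the real exception has to be pulled from the container logs via the YARN CLI. A sketch of the command (the application ID here is simply inferred from the container ID in the message above):

yarn logs -applicationId application_1421076019070_0156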

     Use yarn logs -applicationId as shown to view the logs; in my case they contained the following error:


java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
    at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
    at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:132)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:163)
    at org.apache.tez.runtime.library.common.sort.impl.IFile$Writer.<init>(IFile.java:128)
    at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.spill(DefaultSorter.java:749)
    at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.sortAndSpill(DefaultSorter.java:723)
    at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.flush(DefaultSorter.java:610)
    at org.apache.tez.runtime.library.output.OnFileSortedOutput.close(OnFileSortedOutput.java:134)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:331)
    at org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:567)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
2015-01-22 01:23:25,821 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status -1


The fix is either to configure native Snappy support properly, or to add the following to hive-site.xml:

  <property>
    <name>mapreduce.output.fileoutputformat.compress.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
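If you would rather keep Snappy, the other route mentioned above is to actually configure it, i.e. make the native Hadoop libraries (libhadoop/libsnappy) visible to the Tez containers. A sketch in tez-site.xml using the container launch environment; the native-library path below is only an example and depends on your install (CDH parcel installs typically keep it under /opt/cloudera/parcels/CDH/lib/hadoop/lib/native):

  <property>
    <name>tez.am.launch.env</name>
    <!-- make libsnappy/libhadoop visible to the Tez ApplicationMaster; adjust the path to your environment -->
    <value>LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native</value>
  </property>
  <property>
    <name>tez.task.launch.env</name>
    <!-- same, for the task containers -->
    <value>LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native</value>
  </property>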


If you find that jobs hang at runtime, see https://issues.apache.org/jira/browse/TEZ-704; setting mapreduce.reduce.cpu.vcores to 1 fixes it.
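For completeness, the corresponding entry for that workaround in hive-site.xml (or mapred-site.xml) looks like this:

  <property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>1</value>
  </property>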