Problem
Description
While building a big data stack, I ran a test at the Hive stage. The statement and the resulting error were as follows:
hive> insert overwrite table dw.people3 select * from ods.people1;
Query ID = hdfs_20190103084929_d1a58db2-8ce6-40b1-a55c-2fe881e67ad5
Total jobs = 1
Launching Job 1 out of 1
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Suspecting that Tez might be the cause, I switched hive.execution.engine from Tez to MapReduce and got a similar error:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
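At first I had no leads. I checked the Hive log directory but found nothing wrong there. Searching online turned up other people with the same error, and most of the suggested fixes were to adjust YARN parameters, as follows:
- Increase yarn.scheduler.minimum-allocation-mb from the default 1024 to 2048;
- Increase yarn.nodemanager.vmem-pmem-ratio from the default 2.1 to 3.0.
These changes did not solve the problem, so I restored the original values and kept investigating.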
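Solution
I kept searching on Baidu and Google and found a discussion of a similar problem on Stack Overflow. It did not offer a solution, but it pointed to another location where Hive writes logs: /tmp/<user>/hive.log. I was running as the hdfs user, so I looked at /tmp/hdfs/hive.log and, sure enough, found errors there: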
2019-01-03T09:09:40,935 INFO [5171b2ea-7c81-400b-a266-013c2b191617 main] hooks.ATSHook: Created ATS Hook
2019-01-03T09:09:40,936 INFO [ATS Logger 0] hooks.ATSHook: Received pre-hook notification for :hdfs_20190103090940_da2b453b-5a1f-4aba-8223-06ee0d4160d5
2019-01-03T09:09:40,938 ERROR [5171b2ea-7c81-400b-a266-013c2b191617 main] ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
2019-01-03T09:09:40,938 INFO [5171b2ea-7c81-400b-a266-013c2b191617 main] ql.Driver: Completed executing command(queryId=hdfs_20190103090940_da2b453b-5a1f-4aba-8223-06ee0d4160d5); Time taken: 0.005 seconds
2019-01-03T09:09:40,941 INFO [ATS Logger 0] hooks.ATSHook: Received post-hook notification for :hdfs_20190103090940_da2b453b-5a1f-4aba-8223-06ee0d4160d5
2019-01-03T09:09:40,941 WARN [ATS Logger 1] hooks.ATSHook: Failed to send event to ATS
java.lang.NullPointerException
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingObject(TimelineClientImpl.java:481) ~[phoenix-4.13.1-HBase-1.2-client.jar:4.13.1-HBase-1.2]
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$1.run(TimelineClientImpl.java:332) ~[phoenix-4.13.1-HBase-1.2-client.jar:4.13.1-HBase-1.2]
...
2019-01-03T09:09:40,941 WARN [ATS Logger 1] hooks.ATSHook: Failed to send event to ATS
java.lang.NullPointerException
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingObject(TimelineClientImpl.java:481) ~[phoenix-4.13.1-HBase-1.2-client.jar:4.13.1-HBase-1.2]
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$1.run(TimelineClientImpl.java:332) ~[phoenix-4.13.1-HBase-1.2-client.jar:4.13.1-HBase-1.2]
...
2019-01-03T09:09:40,947 INFO [5171b2ea-7c81-400b-a266-013c2b191617 main] conf.HiveConf: Using the default value passed in for log id: 5171b2ea-7c81-400b-a266-013c2b191617
......
2019-01-03T09:09:42,766 ERROR [5171b2ea-7c81-400b-a266-013c2b191617 main] exec.Task: Failed to execute tez graph.
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.isOpen(TezSessionState.java:163) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:144) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) ~[hive-exec-2.3.3.jar:2.3.3]
...
Caused by: java.util.concurrent.ExecutionException: java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_162]
at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_162]
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.isOpen(TezSessionState.java:158) ~[hive-exec-2.3.3.jar:2.3.3]
... 20 more
Caused by: java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState$1.call(TezSessionState.java:333) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState$1.call(TezSessionState.java:326) ~[hive-exec-2.3.3.jar:2.3.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_162]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
Caused by: java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider
at java.lang.ClassLoader.defineClass1(Native Method) ~[?:1.8.0_162]
at java.lang.ClassLoader.defineClass(ClassLoader.java:763) ~[?:1.8.0_162]
...
Caused by: java.lang.ClassNotFoundException: org.apache.phoenix.shaded.org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_162]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_162]
...
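From this information, the conclusion is that a Phoenix dependency is missing: the innermost ClassNotFoundException names a class under org.apache.phoenix.shaded. I recognized at once that the missing jar was phoenix-hive.jar.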
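After I put this jar into Hive's lib directory and ran the Hive statement again, the problem no longer occurred; the job was submitted and completed successfully.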
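Reading a long trace like the one above by eye is error-prone, so here is a minimal Python sketch that pulls out the last "Caused by:" line, which in a single Java stack trace is the root cause. The function name innermost_cause and the regular expression are my own illustration, not part of Hive. It also assumes the text passed in contains only one trace; with several traces in the same log it simply returns the last "Caused by:" it sees.

```python
import re

# Matches lines such as "Caused by: java.lang.ClassNotFoundException: some.Class"
CAUSED_BY = re.compile(r"^Caused by: (?P<exc>[\w.$]+)(?:: (?P<msg>.*))?$")


def innermost_cause(log_text):
    """Return (exception, message) from the last 'Caused by:' line, or None."""
    last = None
    for line in log_text.splitlines():
        match = CAUSED_BY.match(line.strip())
        if match:
            # Later matches overwrite earlier ones, so the deepest cause wins.
            last = (match.group("exc"), match.group("msg") or "")
    return last


# Lines copied from the hive.log excerpt above.
sample = """\
Caused by: java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider
at java.lang.ClassLoader.defineClass1(Native Method) ~[?:1.8.0_162]
Caused by: java.lang.ClassNotFoundException: org.apache.phoenix.shaded.org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_162]
"""

print(innermost_cause(sample))
```

For the excerpt above this yields the ClassNotFoundException together with the fully qualified name of the class that could not be loaded, which is the piece of information needed to work out which jar is missing.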
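As for why I could tell right away that the missing jar was phoenix-hive.jar: the hive-metastore service had hit the same problem earlier, and I had already solved it there.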
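This time the problem arose because I cut corners: I had added phoenix-hive.jar only to the Hive lib directory on the machine where the hive-metastore service is installed, while I was running Hive from a different node, which led to this mishap. Still, it was a blessing in disguise, since I gained some experience from it.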
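Since the root of the mistake was a jar present on one node but not on another, a quick check on each node could have caught it earlier. The following is a rough Python sketch, not anything shipped with Hive: the function name jars_containing is hypothetical, and the lib directory you pass in depends on your own installation. It relies only on the fact that a jar is a zip archive whose entries are class files named by package path.

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir, class_name):
    """Return the names of jars directly under lib_dir that contain class_name."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives.
            continue
    return hits
```

Calling jars_containing with the Hive lib directory of the node in question and the class name from the ClassNotFoundException returns an empty list when no jar there provides the class, which is the situation described above on the node where I was running Hive.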