Hive Learning Notes 4 - Hive on Spark: Getting Started - Common Issues


Issue1:

[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected

Cause:

Hive has upgraded to JLine2, but jline 0.9.94 exists in the Hadoop lib.

Resolution:

  1. Delete jline from the Hadoop lib directory (it's only pulled in transitively from ZooKeeper).
  2. export HADOOP_USER_CLASSPATH_FIRST=true
  3. If this error occurs during mvn test, run mvn clean install on both the root project and the itests directory. A shell sketch of steps 1 and 2 follows this list.
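
A minimal sketch of steps 1 and 2 (the jline jar's exact name and location vary by Hadoop version and distribution, so locate it first; the path below is an Apache Hadoop 2.x layout):

  # find the stale jline jar pulled in transitively via ZooKeeper
  find $HADOOP_HOME -name "jline-*.jar"
  # delete it from the Hadoop lib (example path; adjust to what find reports)
  rm $HADOOP_HOME/share/hadoop/yarn/lib/jline-0.9.94.jar
  # ensure Hive's JLine2 takes precedence on the classpath
  export HADOOP_USER_CLASSPATH_FIRST=true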


Issue2:

Error: Could not find or load main class org.apache.spark.deploy.SparkSubmit

Cause:

Spark dependency not correctly set.

Resolution:

Add the Spark dependency to Hive; see Step 1 above.
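
Per Step 1 of the linked guide, Hive must be able to locate the Spark jars. A minimal sketch, assuming a Spark 1.x install at /opt/spark and Hive at /opt/hive (both paths are assumptions; adjust to your layout):

  # option 1: let Hive find Spark via the environment
  export SPARK_HOME=/opt/spark
  # option 2 (pre-Hive 2.2.0): link the Spark assembly jar into Hive's lib directory
  # (the assembly jar name varies by Spark/Hadoop version)
  ln -s /opt/spark/lib/spark-assembly-*.jar /opt/hive/lib/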


Issue3:

Exception in thread "Driver" scala.MatchError: java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/TaskAttemptContext (of class java.lang.NoClassDefFoundError)
  at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:432)

Cause:

MR is not on the YARN classpath.

Resolution:

If on HDP, change the framework path from the unresolved form

/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework

to one with a concrete version, for example:

/hdp/apps/2.2.0.0-2041/mapreduce/mapreduce.tar.gz#mr-framework
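
This path is typically configured via the mapreduce.application.framework.path property in mapred-site.xml; a sketch with the concrete version from the example above:

  <property>
    <name>mapreduce.application.framework.path</name>
    <value>/hdp/apps/2.2.0.0-2041/mapreduce/mapreduce.tar.gz#mr-framework</value>
  </property>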


Issue4:

org.apache.spark.SparkException: Job aborted due to stage failure:

Task 5.0:0 had a not serializable result: java.io.NotSerializableException: org.apache.hadoop.io.BytesWritable

Cause:

Spark serializer not set to Kryo.

Resolution:

Set spark.serializer to org.apache.spark.serializer.KryoSerializer; see Step 3 above.
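
This can be set in hive-site.xml or spark-defaults.conf; a minimal per-session sketch from the Hive CLI:

  set spark.serializer=org.apache.spark.serializer.KryoSerializer;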


Issue5:

Run query and get an error like:

FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

In Hive logs, it shows:

java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
  at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)

Cause:

Happens on Mac (not officially supported).

This is a general Snappy issue on Mac and is not unique to Hive on Spark, but the workaround is noted here because it is needed to start the Spark client.

Resolution:

Run this command before starting Hive or HiveServer2:

export HADOOP_OPTS="-Dorg.xerial.snappy.tempdir=/tmp -Dorg.xerial.snappy.lib.name=libsnappyjava.jnilib $HADOOP_OPTS"


Issue6:

Spark executors are killed repeatedly and Spark keeps retrying the failed stage; you may find messages like the following in the YARN NodeManager log.

WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=217989,containerID=container_1421717252700_0716_01_50767235] is running beyond physical memory limits. Current usage: 43.1 GB of 43 GB physical memory used; 43.9 GB of 90.3 GB virtual memory used. Killing container.

Cause:

For Spark on YARN, the NodeManager kills a Spark executor if its memory usage exceeds the configured limit of "spark.executor.memory" + "spark.yarn.executor.memoryOverhead".

Resolution:

Increase "spark.yarn.executor.memoryOverhead" to make sure it covers the executor off-heap memory usage.


Issue7:

Stack trace: ExitCodeException exitCode=1: .../launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR.../usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:$PWD/__app__.jar:$PWD/*: bad substitution

Cause:

The key mapreduce.application.classpath in /etc/hadoop/conf/mapred-site.xml contains a variable that is invalid in bash: ${hdp.version} is not a valid shell variable name, hence the "bad substitution" error.

Resolution:

From mapreduce.application.classpath in /etc/hadoop/conf/mapred-site.xml, remove

:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar
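
A sketch of the trimmed property (the remaining classpath value here is illustrative; keep whatever entries your cluster already lists, minus the hadoop-lzo one):

  <property>
    <name>mapreduce.application.classpath</name>
    <!-- hadoop-lzo-0.6.0.${hdp.version}.jar removed: bash cannot expand ${hdp.version} in launch_container.sh -->
    <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:/etc/hadoop/conf/secure</value>
  </property>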


Reference:

https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

User documentation: https://cwiki.apache.org/confluence/display/Hive/Home#Home-UserDocumentation

