We ran into a number of problems during the upgrade and resolved them one by one.
One of them: after the Spark Thrift Server started, running a SQL query failed with an error.
> select * from xxx.xxx limit 10;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 5.0 failed 10 times, most recent failure: Lost task 0.9 in stage 5.0 (TID 59, node41, executor 2): java.io.StreamCorruptedException: invalid stream header: 0000005B
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:866)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.<init>(JavaSerializer.scala:63)
at org.apache.spark.serializer.JavaDeserializationStream.<init>(JavaSerializer.scala:63)
at org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:126)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:113)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:313)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace: (state=,code=0)
Problem analysis: the error message itself is not very informative, but the failure was clearly introduced by the upgrade, so the most likely suspects are the environment variables and the Spark launch parameters. (A java.io.StreamCorruptedException thrown while deserializing a task usually indicates that the executors are running a different Spark version than the driver, which points in the same direction.)
1. The SPARK_HOME environment variable. Checked it: SPARK_HOME was set correctly.
2. The Spark launch parameters. Here we found the problem: the spark.yarn.archive parameter still pointed to the 2.3.3 environment.
This parameter is set in $SPARK_HOME/conf/spark-defaults.conf:
spark.yarn.archive hdfs://myname/user/hive/spark2.3.3/jars
Change it to:
spark.yarn.archive hdfs://myname/user/hive/spark2.4.6/jars
At the same time, upload the jars that Spark 2.4.6 depends on to the new path hdfs://myname/user/hive/spark2.4.6/jars (see the command sketch below).
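A minimal command sketch for the checks and the fix above, assuming the 2.4.6 jars are available locally under $SPARK_HOME/jars and the HDFS path is the one used in this post (adjust both to your environment):

```bash
# 1. Confirm SPARK_HOME points at the new 2.4.6 installation
echo $SPARK_HOME

# 2. Check which archive spark.yarn.archive currently ships to YARN
grep spark.yarn.archive $SPARK_HOME/conf/spark-defaults.conf

# Create the new directory on HDFS and upload the 2.4.6 jars
hdfs dfs -mkdir -p hdfs://myname/user/hive/spark2.4.6/jars
hdfs dfs -put $SPARK_HOME/jars/*.jar hdfs://myname/user/hive/spark2.4.6/jars/
```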
Restart the Spark Thrift Server (or rerun etl_sql) and the problem is resolved.
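For reference, a typical restart-and-verify sequence; the beeline host, port, and table name below are placeholders, not values from this cluster:

```bash
# Restart the Thrift Server so it picks up the updated spark-defaults.conf
$SPARK_HOME/sbin/stop-thriftserver.sh
$SPARK_HOME/sbin/start-thriftserver.sh

# Re-run the failing query through beeline
beeline -u jdbc:hive2://localhost:10000 -e "select * from xxx.xxx limit 10;"
```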