Apache Spark: error when starting spark-sql

I. Problem

Affected versions:
Apache Spark 2.4.0
Apache Spark 3.0.0

After installing Spark, running spark-sql fails with Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT

Command:

./bin/spark-sql

Error log:

2021-08-02 15:00:04,213 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
	at org.apache.spark.sql.hive.HiveUtils$.formatTimeVarsForHiveClient(HiveUtils.scala:204)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:90)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-08-02 15:00:04,409 INFO util.ShutdownHookManager: Shutdown hook called
2021-08-02 15:00:04,410 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-16fcc4aa-301d-428f-b840-cd29602d253b

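II. Solution

Cause: a NoSuchFieldError means that the HiveConf class loaded at runtime does not define the HIVE_STATS_JDBC_TIMEOUT field that Spark's HiveUtils was compiled against. This suggests that the Hive classes on the classpath do not match the Hive version this Spark build expects.

1 Modify the source code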

hadoop@company:/opt/os_ws/spark$ pwd
/opt/os_ws/spark
hadoop@company:/opt/os_ws/spark$ vim sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala

Search for HIVE_STATS_JDBC_TIMEOUT and comment out the lines that reference it.
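The search-and-comment step can be sketched in shell. This is only an illustration: comment_out_matches is a hypothetical helper, not part of Spark, and it assumes that every line mentioning the field can safely be commented out with `//`. Review the result by hand before rebuilding.

```shell
#!/bin/sh
# Hypothetical helper: prefix every line of FILE that contains PATTERN
# with "// ", keeping the original indentation.
# A backup of the unmodified file is written to FILE.bak.
comment_out_matches() {
    file=$1
    pattern=$2
    # Show the matching lines with their line numbers first.
    grep -n "$pattern" "$file"
    # -i.bak edits in place and keeps a backup (works with GNU and BSD sed).
    sed -i.bak "/$pattern/ s|^\([[:space:]]*\)|\1// |" "$file"
}

# Usage from the Spark source root (path taken from the step above):
# comment_out_matches \
#     sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala \
#     HIVE_STATS_JDBC_TIMEOUT
```

Running the helper twice would comment the same lines twice, so restore from the .bak file before retrying.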

2 Rebuild and redeploy

For building, see:
https://blog.csdn.net/qq_39945938/article/details/119236982

For deployment, see:
https://blog.csdn.net/weixin_44449270/article/details/86102461

Starting spark-sql again now fails with a different error:

root@company:/opt/soft/spark-2.4.0-bin-hadoop3.1.1/bin# spark-sql 
21/08/02 15:43:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.spark.sql.hive.HiveUtils$.formatTimeVarsForHiveClient(HiveUtils.scala:192)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:90)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.1
	at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
	at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
	at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
	at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
	... 15 more

3 Fix "Unrecognized Hadoop major version number"

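The stack trace shows Hive's ShimLoader rejecting the Hadoop version string 3.1.1. Back up the existing jars in $SPARK_HOME/jars/, then copy the jars from $HIVE_HOME/lib/ into $SPARK_HOME/jars/.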
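A minimal sketch of the backup-and-copy step follows. replace_hive_jars is a hypothetical helper, and the default pattern hive-*.jar is an assumption: this write-up does not record exactly which jars were copied, so adjust the pattern to your case.

```shell
#!/bin/sh
# Hypothetical helper: move jars matching PATTERN out of SPARK_JARS into
# a sibling backup directory, then copy the matching jars from HIVE_LIB.
# PATTERN defaults to 'hive-*.jar' (an assumption, see the text above).
replace_hive_jars() {
    hive_lib=$1
    spark_jars=${2%/}                 # drop a trailing slash, if any
    pattern=${3:-hive-*.jar}
    backup_dir="$spark_jars.bak"

    mkdir -p "$backup_dir" || return 1
    # Move the old jars away so two versions never sit side by side.
    find "$spark_jars" -maxdepth 1 -type f -name "$pattern" \
        -exec mv {} "$backup_dir"/ \;
    # Copy the replacement jars from the Hive installation.
    find "$hive_lib" -maxdepth 1 -type f -name "$pattern" \
        -exec cp {} "$spark_jars"/ \;
}

# Usage:
# replace_hive_jars "$HIVE_HOME/lib" "$SPARK_HOME/jars"
```

Keeping the old jars in a separate directory makes it easy to roll back if the replaced jars turn out to be incompatible with the Spark build.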
Starting spark-sql again raises another error, this time about missing write permission.

4 Fix "The dir: /tmp/hive on HDFS should be writable"

Grant full permissions on the directory:

hdfs dfs -chmod -R 777 /tmp/hive
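If you prefer to change permissions only when needed, the current mode can be checked first. The sketch below is an assumption-laden illustration: is_writable_by_all is a hypothetical helper that inspects a permission string in the form printed in the first column of `hdfs dfs -ls -d` output.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a permission string such as
# "drwxrwxrwx" grants write access to owner, group and others.
is_writable_by_all() {
    perms=$1
    case $perms in
        ??w??w??w*) return 0 ;;   # write bit set in all three triplets
        *)          return 1 ;;
    esac
}

# Usage (requires a running HDFS):
# perms=$(hdfs dfs -ls -d /tmp/hive | awk '{print $1}')
# is_writable_by_all "$perms" || hdfs dfs -chmod -R 777 /tmp/hive
```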

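After the permissions are changed, this error disappears. Others report that spark-sql starts at this point, but here it failed with yet another error: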

hadoop@company:/opt/soft/spark-2.4.0-bin-hadoop3.1.1/bin$ ./spark-sql 
2021-08-02 16:43:36,877 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-08-02 16:43:37,066 INFO conf.HiveConf: Found configuration file file:/opt/soft/apache-hive-3.1.2-bin/conf/hive-site.xml
Hive Session ID = c9b1fce5-35e7-4716-a948-e71a69881c54
2021-08-02 16:43:37,161 INFO SessionState: Hive Session ID = c9b1fce5-35e7-4716-a948-e71a69881c54
2021-08-02 16:43:37,657 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/c9b1fce5-35e7-4716-a948-e71a69881c54
2021-08-02 16:43:37,665 INFO session.SessionState: Created local directory: /tmp/hive/tmpdir/hadoop/c9b1fce5-35e7-4716-a948-e71a69881c54
2021-08-02 16:43:37,692 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/c9b1fce5-35e7-4716-a948-e71a69881c54/_tmp_space.db
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.session.SessionState$LogHelper.<init>(Lorg/apache/commons/logging/Log;)V
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:303)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-08-02 16:43:37,703 INFO util.ShutdownHookManager: Shutdown hook called
2021-08-02 16:43:37,704 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-9520a36a-9613-4081-b57e-65435736a034

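This looks like a version conflict between Hive and Spark: the log shows the Hive 3.1.2 configuration being loaded, while SparkSQLCLIDriver calls a SessionState$LogHelper constructor that the Hive classes now on the classpath do not provide.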
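One way to confirm such a mismatch is to compare the Hive jar versions in both installations. The following is a rough sketch: hive_exec_versions is a hypothetical helper, and it assumes the jars follow the hive-exec-<version>.jar naming convention.

```shell
#!/bin/sh
# Hypothetical helper: print the version part of every hive-exec jar
# found directly in DIR, e.g. "hive-exec-3.1.2.jar" -> "3.1.2".
hive_exec_versions() {
    dir=$1
    for jar in "$dir"/hive-exec-*.jar; do
        [ -e "$jar" ] || continue     # the glob matched nothing
        name=${jar##*/}               # strip the directory part
        name=${name#hive-exec-}       # strip the artifact prefix
        echo "${name%.jar}"           # strip the extension
    done
}

# Usage:
# echo "Spark jars: $(hive_exec_versions "$SPARK_HOME/jars")"
# echo "Hive lib:   $(hive_exec_versions "$HIVE_HOME/lib")"
```

If the two directories report different versions, the Spark build and the Hive jars on its classpath were most likely not made for each other.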

References

http://blog.51yip.com/hadoop/2329.html
