We launched spark-sql from a Spark 2.3.0 build compiled from source to run SQL operations, and hit the following error:
Spark assembly has been built with Hive, including Datanucleus jars on classpath
java.lang.ClassNotFoundException: org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:342)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Failed to load main class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
You need to build Spark with -Phive and -Phive-thriftserver.
The message makes the fix clear: the build must include the two profiles -Phive and -Phive-thriftserver. Rebuilding with them produces at least two additional jars, spark-hive_2.11-2.3.0.jar and spark-hive-thriftserver_2.11-2.3.0.jar, and the missing class SparkSQLCLIDriver lives in spark-hive-thriftserver_2.11-2.3.0.jar.
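A rebuild along these lines should resolve the error. This is a sketch using Spark's bundled Maven wrapper; the -Pyarn profile and the jar path under assembly/target are assumptions for illustration and may differ in your environment:

```shell
# Rebuild Spark 2.3.0 with Hive support and the Thrift server enabled.
# -Pyarn is optional here, shown only as a common companion profile.
./build/mvn -Pyarn -Phive -Phive-thriftserver -DskipTests clean package

# Afterwards, confirm the missing class is now on the classpath.
# The jars/ location below assumes the standard in-tree build layout.
jar tf assembly/target/scala-2.11/jars/spark-hive-thriftserver_2.11-2.3.0.jar \
  | grep SparkSQLCLIDriver
```

If the grep prints org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.class, spark-sql should start normally against the new build.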