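Plenty of people online have hit these same problems, but hardly any of the posts explain how to fix them. Below are the ones I ran into myself, each with the fix that worked.
I will keep adding to this list as I run into more.
All of these occurred when running the program locally from IDEA.
Error 1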
Exception in thread "main" java.lang.NoClassDefFoundError: scala/Product$class
at org.apache.spark.SparkConf$DeprecatedConfig.<init>(SparkConf.scala:723)
at org.apache.spark.SparkConf$.<init>(SparkConf.scala:571)
at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala)
at org.apache.spark.SparkConf.set(SparkConf.scala:92)
at org.apache.spark.SparkConf.set(SparkConf.scala:81)
at org.apache.spark.SparkConf.setMaster(SparkConf.scala:113)
at cn.spark.WordCount$.main(WordCount.scala:23)
at cn.spark.WordCount.main(WordCount.scala)
Caused by: java.lang.ClassNotFoundException: scala.Product$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 8 more
Process finished with exit code 1
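This error is caused by the wrong Scala version: the project must use Scala 2.11.x.
The reason is that the Spark build used here is compiled against Scala 2.11, as the official Spark documentation states. The missing class scala.Product$class is a trait implementation class that exists in the Scala 2.11 library but not in 2.12 and later, where traits are compiled to Java 8 interfaces with default methods.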
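To confirm which Scala library the program actually runs against, one option is to ask the classpath directly. The sketch below is only an illustration: the class ScalaVersionCheck and its methods are hypothetical names, and it assumes that scala.util.Properties.versionNumberString can be called through reflection as a static method, as is the case for the standard scala-library jar. Run it with the same classpath as the Spark program.

```java
import java.util.Optional;

public class ScalaVersionCheck {

    /** Full version of the scala-library on the classpath, or empty if none is found. */
    public static Optional<String> scalaLibraryVersion() {
        try {
            // scala.util.Properties is a Scala object; its methods are
            // exposed to Java as static forwarders (assumption, see above).
            Class<?> props = Class.forName("scala.util.Properties");
            Object version = props.getMethod("versionNumberString").invoke(null);
            return Optional.of(String.valueOf(version));
        } catch (ReflectiveOperationException | LinkageError e) {
            return Optional.empty();
        }
    }

    /** Reduces a full Scala 2.x version such as "2.11.12" to its binary version "2.11". */
    public static String binaryVersion(String fullVersion) {
        String[] parts = fullVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not a Scala version: " + fullVersion);
        }
        return parts[0] + "." + parts[1];
    }

    public static void main(String[] args) {
        Optional<String> version = scalaLibraryVersion();
        if (version.isPresent()) {
            System.out.println("scala-library " + version.get()
                    + " (binary version " + binaryVersion(version.get()) + ")");
        } else {
            System.out.println("No scala-library found on the classpath");
        }
    }
}
```

If the reported binary version is not 2.11, the classpath is picking up a different Scala library than the one Spark was built for.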
Error 2
Disconnected from the target VM, address: '127.0.0.1:63201', transport: 'socket'
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.spark.util.Utils$.getCallSite(Utils.scala:1434)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:76)
at cn.spark.WordCount$.main(WordCount.scala:24)
at cn.spark.WordCount.main(WordCount.scala)
Process finished with exit code 1
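This happens because the Scala version configured for the project differs from the one installed on the local machine. The Spark jars call Predef.refArrayOps with the signature of the Scala version they were compiled against, and the Scala library found at run time does not provide a method with that signature.
Error 3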
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$scope()Lscala/xml/TopScope$;
at org.apache.spark.ui.jobs.AllJobsPage.<init>(AllJobsPage.scala:39)
at org.apache.spark.ui.jobs.JobsTab.<init>(JobsTab.scala:38)
at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:67)
at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:84)
at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:221)
at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:163)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:452)
at cn.spark.WordCount$.main(WordCount.scala:24)
at cn.spark.WordCount.main(WordCount.scala)
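This one is caused by a dependency problem.
In pom.xml, depend on the Scala 2.11 build of Spark, spark-core_2.11 (the ${spark.version} property must be defined in the pom's <properties> section):
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>${spark.version}</version>
</dependency>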
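The _2.11 suffix on the artifactId is the Scala binary version that the artifact was built for, and it has to agree with the Scala version of the project. As a rough illustration of that rule, the sketch below compares the two; ScalaSuffixCheck and its methods are hypothetical helper names, not part of Maven or Spark.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScalaSuffixCheck {

    // Matches a trailing Scala binary-version suffix, e.g. "_2.11" in "spark-core_2.11".
    private static final Pattern SUFFIX = Pattern.compile("_(\\d+\\.\\d+)$");

    /** Extracts the Scala binary version from an artifactId, if it carries one. */
    public static Optional<String> scalaSuffix(String artifactId) {
        Matcher m = SUFFIX.matcher(artifactId);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** True when the artifact's suffix agrees with the given Scala version. */
    public static boolean matches(String artifactId, String scalaVersion) {
        return scalaSuffix(artifactId)
                .map(suffix -> scalaVersion.equals(suffix)
                        || scalaVersion.startsWith(suffix + "."))
                .orElse(true); // no suffix: the artifact is not tied to a Scala version
    }

    public static void main(String[] args) {
        System.out.println(matches("spark-core_2.11", "2.11.12")); // true
        System.out.println(matches("spark-core_2.11", "2.12.8"));  // false
    }
}
```

Every Scala-suffixed dependency in the pom should pass this check against the same Scala version, otherwise errors like the three above are likely to come back.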
Error 4
Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: file
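This is caused by an earlier upload of the file that failed.
When the Hadoop client uploads the local file cheap_all to HDFS, Hadoop uses fs.FSInputChecker to check whether a .crc checksum file exists for the file being uploaded. If one exists, the file is verified against it, and if verification fails the file is not uploaded.
Deleting the hidden .crc file fixes the problem.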
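The cleanup can also be scripted. The following is a minimal sketch using only the JDK, assuming the checksum file sits next to the data file and follows the .<name>.crc naming that Hadoop's local file system uses (for cheap_all that would be .cheap_all.crc). CrcCleanup and its methods are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CrcCleanup {

    /** Path of the hidden checksum file kept next to a local file: ".<name>.crc". */
    public static Path crcFileFor(Path file) {
        return file.resolveSibling("." + file.getFileName() + ".crc");
    }

    /** Deletes the checksum file if present; returns true when something was deleted. */
    public static boolean deleteStaleCrc(Path file) throws IOException {
        return Files.deleteIfExists(crcFileFor(file));
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name from the example above when no argument is given.
        Path file = Paths.get(args.length > 0 ? args[0] : "cheap_all");
        System.out.println(deleteStaleCrc(file)
                ? "Deleted " + crcFileFor(file)
                : "No checksum file found for " + file);
    }
}
```

Only the checksum file is removed; the data file itself is left untouched, so the upload can simply be retried afterwards.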
Error 5
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file
This one is similar: the output directory already exists. Delete that directory and run the job again.
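When the job writes to the local file system, as the file prefix in the message suggests, the directory can be removed before each run with a few lines of plain Java. This is a sketch under that assumption; OutputDirCleanup is a hypothetical helper and "output" is a placeholder path. For a directory on HDFS, hdfs dfs -rm -r does the same job from the command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OutputDirCleanup {

    /** Recursively deletes a local directory; returns false if it did not exist. */
    public static boolean deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directory.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // "output" is only a placeholder; pass the job's real output path.
        Path out = Paths.get(args.length > 0 ? args[0] : "output");
        System.out.println(deleteRecursively(out) ? "Deleted " + out : out + " does not exist");
    }
}
```

Because the deletion is recursive and permanent, double-check the path before wiring this into a job.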