The error:
java.lang.ClassNotFoundException: scalaBase.day7.sparkWC2
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:725)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:193)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:218)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:132)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
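This ClassNotFoundException means spark-submit could not find scalaBase.day7.sparkWC2 inside the submitted jar, which almost always means Maven never compiled or packaged the Scala sources. A quick way to confirm, using the jar path from the submit command below:

jar tf /ajar/swc4.jar | grep sparkWC2

If this prints nothing, the class file is simply not in the jar, and the pom.xml fix in the Solution section below applies.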
My submit command:
/usr/local/spark-2.2.0-bin-hadoop2.6/bin/spark-submit \
--class scalaBase.day7.sparkWC2 \
--master spark://mini1:7077 \
--executor-memory 512m \
--total-executor-cores 2 \
/ajar/swc4.jar \
hdfs://mini1:9000/data/wcount \
hdfs://mini1:9000/out/wcount_20200211_2
My IDEA code:
package scalaBase.day7

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

// Runs on the cluster
object sparkWC2 {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf()
    conf.setAppName("myWC")
    // conf.setJars(Array("/ajar/swc2.jar"))
    // conf.setMaster("local")
    val sc = new SparkContext(conf)
    // args(0): input path, args(1): output path
    val r1: RDD[String] = sc.textFile(args(0))
    // split into words, count them, and sort by count in descending order
    val r2 = r1.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _).sortBy(_._2, false)
    println(r2.collect().toBuffer)
    r2.saveAsTextFile(args(1))
    sc.stop()
  }
}
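Side note: you can sanity-check the packaged jar in local mode before going through the cluster; a missing class fails immediately with the same ClassNotFoundException. A sketch reusing the paths above (the output directory is a hypothetical fresh one, since Spark refuses to overwrite an existing output path):

/usr/local/spark-2.2.0-bin-hadoop2.6/bin/spark-submit \
--class scalaBase.day7.sparkWC2 \
--master local[2] \
/ajar/swc4.jar \
hdfs://mini1:9000/data/wcount \
hdfs://mini1:9000/out/wcount_local_test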
Solution:
If you want to package the job for the cluster, add the following to pom.xml!
<build>
  <!-- must match the actual Scala source folder of the project -->
  <sourceDirectory>src/main/scalaJob</sourceDirectory>
  <!--<testSourceDirectory>src/test/scala</testSourceDirectory>-->
  <plugins>
    <!-- compiles the Scala sources; without it Maven only compiles Java -->
    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <version>3.2.2</version>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
          <configuration>
            <args>
              <!--<arg>-make:transitive</arg>-->
              <arg>-dependencyfile</arg>
              <arg>${project.build.directory}/.scala_dependencies</arg>
            </args>
          </configuration>
        </execution>
      </executions>
    </plugin>
    <!-- builds a fat jar and strips signature files that would otherwise break it -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.3</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <filters>
              <filter>
                <artifact>*:*</artifact>
                <excludes>
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                  <exclude>META-INF/*.RSA</exclude>
                </excludes>
              </filter>
            </filters>
            <transformers>
              <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                <!-- may stay empty; spark-submit supplies the main class via its class option -->
                <mainClass></mainClass>
              </transformer>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>
Note:
1. Change this to the name of your project's actual source folder (mine, for example, was originally src/main/scala):
<sourceDirectory>src/main/scalaJob</sourceDirectory>
2. The <source>/<target> values of the maven-compiler-plugin inside <build></build> must match the values of the first two lines inside <properties></properties> (here the value is 1.8):
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>1.8</source>
    <target>1.8</target>
  </configuration>
</plugin>
<properties>
  <maven.compiler.source>1.8</maven.compiler.source>
  <maven.compiler.target>1.8</maven.compiler.target>
  <encoding>UTF-8</encoding>
  <scala.version>2.11.8</scala.version>
  <scala.compat.version>2.11</scala.compat.version>
  <spark.version>2.2.0</spark.version>
  <hadoop.version>2.7.4</hadoop.version>
</properties>
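After updating pom.xml, rebuild and redeploy the jar; the artifact name under target/ depends on your artifactId and version, so the name below is a placeholder:

mvn clean package
cp target/<artifactId>-<version>.jar /ajar/swc4.jar
jar tf /ajar/swc4.jar | grep sparkWC2    # the class should now appear

Then rerun the spark-submit command from the top.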