It took me the whole morning, but I finally got this program running on Spark.
I wrote a simple Scala program in Eclipse; the code is as follows:
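package spark.wordcount

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._ // pair-RDD implicits (reduceByKey); only required before Spark 1.3

object WordCount {
  def main(args: Array[String]): Unit = {
    val infile = "/input" // Should be some file on your system
    val conf = new SparkConf().setAppName("word count")
    val sc = new SparkContext(conf)

    // Read the input as lines, using at least 2 partitions
    val indata = sc.textFile(infile, 2).cache()

    // Split each line on spaces, pair every word with 1, then sum the 1s per word
    val words = indata
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey((a, b) => a + b)

    // Write the (word, count) pairs; the output directory must not already exist
    words.saveAsTextFile("/output")
    println("All words are counted!")
    sc.stop()
  }
}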
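To check what the flatMap / map / reduceByKey chain computes without a cluster, here is a minimal sketch of the same steps on an ordinary in-memory Seq. The object name LocalWordCount and the sample lines are made up for illustration, and groupBy plus sum stands in for Spark's reduceByKey, so this is only an approximation of the job above, not the job itself.

```scala
// Hypothetical local stand-in for the Spark job; uses only the Scala standard library.
object LocalWordCount {
  // Split each line on spaces, pair every word with 1, then add up the 1s per word.
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .groupBy(_._1) // plays the role of reduceByKey's grouping by key
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Made-up sample input
    val counts = count(Seq("a b a", "b c"))
    // Sort by word so the printed order is stable
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

With the sample input this prints (a,2), (b,2) and (c,1), one pair per line.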
Then I submitted it with spark-submit:
[root@sparkmaster bin]# ./spark-submit --class WordCount /opt/spark-wordcount-in-scala.jar
java.lang.ClassNotFoundException: WordCount
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:274)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:174)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:689)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
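I puzzled over this for a long time and searched Bing extensively without finding an answer. After a good deal of Googling, I found a suggestion that it might be related to the package name, so I tried submitting it like this: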
[root@sparkmaster bin]# ./spark-submit --class spark.wordcount.WordCount /opt/spark-wordcount-in-scala.jar
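It finally ran!
So the value after --class must have the form packageName.objectName, that is, the fully qualified name of the object that defines main. Because the source file declares package spark.wordcount, the bare name WordCount cannot be found.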
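The stack trace shows the failure coming from Class.forName, so the same behaviour can be reproduced without Spark. Below is a minimal sketch under that assumption: LookupDemo and canLoad are hypothetical names introduced only for this illustration, placed in the same package as the word count job.

```scala
package spark.wordcount

// Hypothetical object; it exists only so that there is a class to look up.
object LookupDemo {
  // Returns true when the JVM can load a class under the given name.
  def canLoad(name: String): Boolean =
    try { Class.forName(name); true }
    catch { case _: ClassNotFoundException => false }

  def main(args: Array[String]): Unit = {
    // Simple name only, like --class WordCount: expected to fail
    println(canLoad("LookupDemo"))
    // Package plus object name, like --class spark.wordcount.WordCount
    println(canLoad("spark.wordcount.LookupDemo"))
  }
}
```

Assuming no other class called LookupDemo sits in the default package on the classpath, this should print false and then true, which matches what I saw with the two spark-submit commands.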