I'm a complete Spark beginner. I ran into plenty of problems when I first started using Spark, so I'm writing them all down here in the hope that they'll serve as a reference for whoever comes next. Veterans, please don't laugh~~~
1. Clear the cached MAC-to-interface binding (udev's persistent net rules, e.g. after cloning a VM): rm -rf /etc/udev/rules.d/70-persistent-net.rules
2. Submitting a jar to a standalone master with spark-submit:
spark-submit \
--class main.scala.SparkWordCount \
--master spark://192.168.109.130:7077 \
/home/yingying/SparkTest.jar \
file:///usr/spark/spark-1.5.1-bin-hadoop2.6/README.md
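For context, a word-count class like the SparkWordCount submitted above boils down to the classic flatMap → map → reduceByKey pipeline. Here is a plain-Scala sketch of the same counting logic (object and method names are mine, not from the original jar); on the cluster the input would come from the README.md path passed as the jar's argument:

```scala
object WordCountSketch {
  // local analogue of the RDD pipeline
  //   sc.textFile(path).flatMap(_.split("\\s+")).map((_, 1)).reduceByKey(_ + _)
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))              // lines -> words
      .filter(_.nonEmpty)                    // drop empty tokens from blank lines
      .groupBy(identity)                     // group identical words together
      .map { case (w, ws) => (w, ws.size) }  // word -> occurrence count

  def main(args: Array[String]): Unit = {
    // counts: to -> 2, be -> 2, or -> 1, not -> 1 (map iteration order unspecified)
    println(countWords(Seq("to be or", "not to be")))
  }
}
```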
3. When you hit Error: Could not find or load main class org.apache.spark.launcher.Main, just reinstall Spark = = (it usually means the installation is incomplete or corrupted).
4. The "*** is already defined as object ***" error
After writing SogouResult and compiling it, the error "Sogou is already as object SogouResult" appeared.
This error is most likely not a problem with your code but with the Scala SDK version. I hit it on scala-2.11.4; after switching to scala-2.10.4 and recompiling, it went away. Two places in the IDE need to be checked and set to scala-2.10.4: Libraries and Global Libraries.
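If you build with sbt instead of the IDE, the same mismatch can be avoided by pinning the Scala version in build.sbt. A minimal sketch, assuming the Spark 1.5.1 / Scala 2.10.4 combination used above (adjust the versions to your setup; `%%` makes sbt pick the spark-core build matching your Scala version):

```scala
scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.1" % "provided"
```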
import org.apache.spark.{SparkConf, SparkContext}

object Median {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    // each line of the input file "data" is assumed to hold one integer
    val data = spark.textFile("data").map(_.trim.toInt)
    // bucket every number by its last digit, summing the values in each bucket
    val mappeddata = data.map(num => (num % 10, num))
    val count = mappeddata.reduceByKey((a, b) => a + b).sortByKey().collect()
    val sum_count = count.map(data => data._2).sum
    var temp = 0
    var index = 0
    val mid = sum_count / 2
    // walk the sorted buckets until the running sum crosses half the grand total
    for (i <- count.indices if temp < mid) {
      temp = temp + count(i)._2
      index = i
    }
    println("bucket holding the midpoint: " + count(index)._1)
    spark.stop()
  }
}
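The bucket walk in that snippet can be sanity-checked locally without a cluster. A plain-Scala sketch of the same search (object and method names are mine), replacing the RDD with an in-memory Seq:

```scala
object BucketMedianSketch {
  // sum each last-digit bucket, then walk the sorted buckets until the
  // running sum reaches half of the grand total
  def medianBucket(nums: Seq[Int]): Int = {
    val count = nums.groupBy(_ % 10).toSeq
      .map { case (k, vs) => (k, vs.sum) }
      .sortBy(_._1)
    val mid = count.map(_._2).sum / 2
    var temp = 0
    var index = 0
    for (i <- count.indices if temp < mid) {
      temp += count(i)._2
      index = i
    }
    count(index)._1
  }

  def main(args: Array[String]): Unit = {
    // for 1..100 the grand total is 5050, and the running sum first
    // crosses 2525 in the bucket of numbers ending in 5
    println(medianBucket(1 to 100)) // prints 5
  }
}
```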