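First install and configure the JDK, Scala, and Spark, and confirm that both Scala and Spark start.
1. Configure spark-env.sh (in the conf directory of the Spark installation)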
cd conf
cp spark-env.sh.template spark-env.sh
vim spark-env.sh
Append the following line to the end of the file:
export LD_LIBRARY_PATH=$JAVA_LIBRARY_PATH
2. Create log4j.properties from its template (edit it afterwards if needed)
cp log4j.properties.template log4j.properties
3. Add the Hadoop native library directory to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native/:$LD_LIBRARY_PATH
4. Start Spark
spark-shell
5. Count word frequencies
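// Read the input file from the local filesystem
val textFile = sc.textFile("file:/home/data/words")
// Split each line into words, pair each word with 1, sum the 1s per word, then print the counts
textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _).collect().foreach(println)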
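
The flatMap and map calls above turn every word into a (word, 1) pair; the frequency of a word is the sum of the 1s that share that word. The sketch below shows the same idea on plain Scala collections, so it can be pasted into spark-shell or an ordinary Scala REPL without reading any file. The sample lines and the names `lines` and `counts` are made up for illustration and are not part of the original steps.

```scala
// Hypothetical sample input standing in for the contents of the words file
val lines = Seq("spark scala spark", "scala hadoop spark")

val counts = lines
  .flatMap(line => line.split(" "))   // split every line into words
  .map(word => (word, 1))             // pair each word with a count of 1
  .groupBy(_._1)                      // collect the pairs that share a word
  .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // add up the 1s per word

// Print one (word, count) pair per line; the order is not guaranteed
counts.foreach(println)
```

On an RDD the grouping and summing step is usually written as reduceByKey(_ + _), which combines the counts on each partition before shuffling.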