1. For the pom file, see "Processing Socket Data with Spark Streaming: WordCount"
2. Code
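package com.kinglone.streaming

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

/**
 * Processing file system data with Spark Streaming
 */
object FileWordCount {
  def main(args: Array[String]): Unit = {
    // local[2]: run locally with two threads
    val sparkConf = new SparkConf().setMaster("local[2]").setAppName("FileWordCount")
    // Process the stream in 5-second batches
    val ssc = new StreamingContext(sparkConf, Seconds(5))
    // Monitor the directory for new files; a local Windows path needs three slashes after "file:"
    val lines = ssc.textFileStream("file:///D:/test/streaming/")
    // Split each line on spaces, pair every word with 1, then sum the counts per word
    val result = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    // Print the first elements of each batch result to the console
    result.print()
    ssc.start()
    ssc.awaitTermination()
  }
}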
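To see what the flatMap / map / reduceByKey chain computes for a single batch, the same logic can be sketched on an ordinary Scala collection, with no Spark dependency. The object name WordCountSketch, the method countWords, and the sample lines are made up for illustration; groupBy plus a sum stands in for reduceByKey, which plain collections do not have.

```scala
// Plain-Scala sketch of the per-batch computation (illustrative names, no Spark required).
object WordCountSketch {
  // Mirrors lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // one element per word
      .map((_, 1))             // pair every word with a count of 1
      .groupBy(_._1)           // group the pairs by word (stand-in for reduceByKey)
      .map { case (word, pairs) => word -> pairs.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    // Hypothetical batch contents: the lines of the files picked up in one interval
    val batch = Seq("hello spark", "hello streaming")
    countWords(batch).foreach(println)
  }
}
```

In the streaming job this computation runs once per 5-second batch, over the lines of whatever new files appeared in that interval; counts are not carried over between batches.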
3. Start the program, then copy data files into the D:/test/streaming/ directory on drive D
The console then prints the word-count results.
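For step 3, note that the Spark Streaming guide asks for files to appear in the monitored directory atomically (by move or rename) and assigns them to a batch by modification time, so a file copied with an old timestamp may be skipped. One possible way to drop a fresh test file is sketched below. The object DropTestFile, the staging directory D:/test/staging, and the file contents are assumptions for illustration, not part of the original setup; the staging directory is assumed to be on the same drive so that the atomic move is supported.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths, StandardCopyOption}

// Sketch: write a file outside the watched directory, then move it in atomically.
object DropTestFile {
  // Writes `content` to stagingDir/name, then moves it to watchedDir/name.
  // Returns the final path inside the watched directory.
  def drop(stagingDir: Path, watchedDir: Path, name: String, content: String): Path = {
    Files.createDirectories(stagingDir)
    Files.createDirectories(watchedDir)
    val staged = stagingDir.resolve(name)
    Files.write(staged, content.getBytes(StandardCharsets.UTF_8))
    // ATOMIC_MOVE throws if the two directories are on different file systems
    Files.move(staged, watchedDir.resolve(name), StandardCopyOption.ATOMIC_MOVE)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical staging path; the watched path matches the streaming job
    val target = drop(
      Paths.get("D:/test/staging"),
      Paths.get("D:/test/streaming"),
      s"words-${System.currentTimeMillis()}.txt", // unique name per run
      "hello spark\nhello streaming\n"
    )
    println(s"Wrote $target")
  }
}
```

Run this while FileWordCount is already running; the new file should be picked up in one of the following batches.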