Offline computing / real-time computing
Differences between offline and real-time computing
Offline computing
Real-time computing
A MySQL database provides real-time queries, not real-time computation
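The distinction in the headings above can be sketched in plain Scala, without Spark. This is only an illustration under assumed names (BatchVsStreaming, batchTotal, and RunningTotal are hypothetical and not part of any library): the offline version computes once over a complete, bounded data set, while the real-time version updates its result every time a record arrives. A database query resembles the first case, because it runs on demand against data that is already stored.

```scala
object BatchVsStreaming {
  // Offline (batch): the whole bounded data set exists before the computation starts.
  def batchTotal(amounts: Seq[Int]): Int = amounts.sum

  // Real-time (streaming): keep a running result and update it for each incoming record.
  final class RunningTotal {
    private var total = 0
    def onRecord(amount: Int): Int = {
      total += amount
      total
    }
  }

  def main(args: Array[String]): Unit = {
    val amounts = Seq(10, 20, 5)

    // One result, produced after all the data has been collected.
    println(s"batch total: ${batchTotal(amounts)}")

    // A fresh result after every record, without waiting for the data to end.
    val running = new RunningTotal
    amounts.foreach(a => println(s"running total: ${running.onRecord(a)}"))
  }
}
```

The batch call prints a single total of 35, while the streaming loop prints 10, 30, and 35 as the records arrive.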
Spark Streaming
Spark Streaming vs. Flink
Problems that can arise with Spark Streaming
Writing a simple Spark Streaming program
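On the virtual machine, run nc -lk 8888 to open a listening socket on port 8888; each line typed there is sent to the program as test input
Run the code in IntelliJ IDEA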
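import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
import org.apache.spark.streaming.{Durations, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}

object Demo1 {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf()
    // Use at least 2 local threads: the socket receiver occupies one,
    // so a single thread would leave none for processing the batches.
    conf.setMaster("local[2]")
    conf.setAppName("Demo1")
    val sc: SparkContext = new SparkContext(conf)

    /**
     * Create the Spark Streaming environment.
     * The batch interval sets how often a batch runs (every 5 seconds here).
     */
    val ssc: StreamingContext = new StreamingContext(sc, Durations.seconds(5))

    /**
     * Read data from a socket.
     * Run nc -lk 8888 on the host "master" to supply test input.
     */
    val lines: ReceiverInputDStream[String] = ssc.socketTextStream("master", 8888)

    /**
     * Process the data: WordCount.
     * Words are comma-separated; counts are computed separately for each batch.
     */
    val wordDS: DStream[(String, Int)] = lines.flatMap(_.split(","))
      .map(w => (w, 1))
      .reduceByKey(_ + _)

    // Print the first elements of each batch to the console
    wordDS.print()

    /**
     * Start Spark Streaming and block until the job is stopped or fails.
     */
    ssc.start()
    ssc.awaitTermination()
    ssc.stop()
  }
}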
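
One point the program above does not make obvious is that reduceByKey on a DStream is applied to each 5-second batch independently, so the printed counts start again from zero in every batch. The following is a minimal sketch of that behaviour in plain Scala, without Spark. The names MicroBatchWordCount and countBatch and the sample input are hypothetical, and each inner Seq stands for the lines received during one batch interval.

```scala
object MicroBatchWordCount {
  // Same logic as the DStream pipeline: split on commas, pair each word with 1, sum per word.
  def countBatch(lines: Seq[String]): Map[String, Int] =
    lines.flatMap(_.split(","))
      .map(w => (w, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Assumed sample input: two batch intervals' worth of lines.
    val batches = Seq(
      Seq("java,spark", "spark"),
      Seq("spark,flink")
    )

    // Each batch is counted on its own; nothing carries over from the previous batch.
    batches.zipWithIndex.foreach { case (batch, i) =>
      println(s"batch $i: ${countBatch(batch)}")
    }
  }
}
```

In this sketch "spark" is counted twice in the first batch and once in the second, never three times. If running totals across batches are needed, Spark Streaming provides stateful operators such as updateStateByKey, which also requires a checkpoint directory to be set on the StreamingContext.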