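Background:
When combining DStream and RDD operations, you need to apply a function to every RDD in the DStream and return a new DStream. The `transform` operator does exactly this.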
Demo
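import scala.collection.mutable.ListBuffer

// Assumes `ssc` is an already-created StreamingContext.
val blackList = new ListBuffer[(String, Boolean)]
blackList.append(("James", true))
blackList.append(("Wade", true))
// Initialize the blacklist RDD
val blackRdd = ssc.sparkContext.parallelize(blackList.toList)
// Business logic: read comma-separated lines from a socket
val lines = ssc.socketTextStream("hadoop000", 9999)
// transform: join each RDD of the DStream with the blacklist RDD
lines.map(x => (x.split(",")(0), x)).transform(rdd => {
  rdd.leftOuterJoin(blackRdd)
    // keep only records whose key is not flagged in the blacklist
    .filter(x => !x._2._2.getOrElse(false))
    .map(x => (x._1, x._2._1))
}).print()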
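To see what the function passed to `transform` does to a single batch, here is a minimal Spark-free sketch on plain Scala collections. It mirrors the three steps of the demo (left outer join, filter, map), but the names `BlacklistSketch` and `filterBatch` and the sample input lines are hypothetical and used only for illustration. It also assumes each name appears at most once in the blacklist, so a `Map` lookup can stand in for `leftOuterJoin`.

```scala
// Spark-free sketch of the per-batch logic used inside transform.
// `BlacklistSketch` and `filterBatch` are hypothetical names, not Spark APIs.
object BlacklistSketch {
  // Same blacklist entries as in the demo.
  val blackList: Map[String, Boolean] = Map("James" -> true, "Wade" -> true)

  // Mimics leftOuterJoin + filter + map for one batch of raw lines.
  def filterBatch(lines: Seq[String]): Seq[(String, String)] =
    lines
      // key each line by its first comma-separated field
      .map(line => (line.split(",")(0), line))
      // "left outer join": the flag is None when the key is not blacklisted
      .map { case (name, line) => (name, (line, blackList.get(name))) }
      // drop records whose key is flagged true
      .filter { case (_, (_, flag)) => !flag.getOrElse(false) }
      // restore the (key, original line) shape
      .map { case (name, (line, _)) => (name, line) }

  def main(args: Array[String]): Unit = {
    val batch = Seq("James,login", "Kobe,login", "Wade,logout")
    filterBatch(batch).foreach(println) // prints (Kobe,Kobe,login)
  }
}
```

In the streaming version, the same logic runs once per batch interval, on the RDD that `transform` hands to the function.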