Difference between countByValueAndWindow and countByWindow
Looking at the source, countByWindow first maps every element to the number 1 (as 1L), then calls reduceByWindow.
def countByWindow(
    windowDuration: Duration,
    slideDuration: Duration): DStream[Long] = ssc.withScope {
  // Map every element of the DStream to 1L, then reduce over the window:
  // _ + _ adds the batches entering the window, _ - _ subtracts the batches leaving it.
  this.map(_ => 1L).reduceByWindow(_ + _, _ - _, windowDuration, slideDuration)
}
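To see what this map-then-reduce amounts to, here is a minimal plain-Scala sketch that needs no Spark. It models the stream as a sequence of batches and assumes a window of 3 batches sliding by 1 batch. The names batches, windowLen, ones, direct and incremental are made up for illustration and are not part of the Spark API. The count is computed two ways: directly, and incrementally by adding the newest batch and subtracting the one that left the window, which is the role the `_ + _` and `_ - _` arguments play.

```scala
object CountByWindowSketch {
  // Hypothetical input: each inner Seq is one batch of the stream.
  val batches: Seq[Seq[String]] = Seq(
    Seq("a", "b"),
    Seq("a"),
    Seq("c", "a", "b"),
    Seq.empty[String],
    Seq("b")
  )

  // Assumed window length, measured in batches (the slide is one batch).
  val windowLen = 3

  // Like map(_ => 1L): every element becomes 1L, so each batch sums to its size.
  val ones: Seq[Long] = batches.map(_.map(_ => 1L).sum)

  // Direct form: sum the ones over the batches currently inside the window.
  val direct: Seq[Long] = ones.indices.map { i =>
    ones.slice(math.max(0, i - windowLen + 1), i + 1).sum
  }

  // Incremental form: previous total + entering batch - leaving batch.
  val incremental: Seq[Long] = ones.indices.scanLeft(0L) { (prev, i) =>
    val leaving = if (i >= windowLen) ones(i - windowLen) else 0L
    prev + ones(i) - leaving
  }.tail // drop the initial 0L seed

  def main(args: Array[String]): Unit = {
    println(direct)
    println(incremental)
  }
}
```

Both forms give the same sequence of totals. The incremental one only touches two batches per slide, whatever the window length.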
countByValueAndWindow first maps every element a to the Tuple2 (a, 1L), then calls reduceByKeyAndWindow.
def countByValueAndWindow(
    windowDuration: Duration,
    slideDuration: Duration,
    numPartitions: Int = ssc.sc.defaultParallelism)
    (implicit ord: Ordering[T] = null)
    : DStream[(T, Long)] = ssc.withScope {
  this.map((_, 1L)).reduceByKeyAndWindow(
    (x: Long, y: Long) => x + y,    // add counts from batches entering the window
    (x: Long, y: Long) => x - y,    // subtract counts from batches leaving the window
    windowDuration,
    slideDuration,
    numPartitions,
    (x: (T, Long)) => x._2 != 0L    // drop keys whose count has fallen to 0
  )
}
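The per-key version can be sketched the same way in plain Scala, without Spark. It assumes a window of 2 batches sliding by 1 batch. The helpers countBatch and combine, and the values batches, windowLen, perBatch and windows, are hypothetical names used only for illustration. The sketch keeps one count per distinct value, updates it with x + y and x - y, and then applies the same "count != 0" filter as the source.

```scala
object CountByValueAndWindowSketch {
  // Hypothetical input: each inner Seq is one batch of the stream.
  val batches: Seq[Seq[String]] = Seq(
    Seq("a", "b"),
    Seq("a"),
    Seq("c"),
    Seq("b")
  )

  // Assumed window length, measured in batches (the slide is one batch).
  val windowLen = 2

  // Like map((_, 1L)) followed by a per-key sum inside a single batch.
  def countBatch(batch: Seq[String]): Map[String, Long] =
    batch.map((_, 1L)).groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

  // Combine two count maps key by key with f (x + y or x - y); a missing key counts as 0.
  def combine(a: Map[String, Long], b: Map[String, Long])(f: (Long, Long) => Long): Map[String, Long] =
    (a.keySet ++ b.keySet).map(k => k -> f(a.getOrElse(k, 0L), b.getOrElse(k, 0L))).toMap

  val perBatch: Seq[Map[String, Long]] = batches.map(countBatch)

  // One map of value -> count per window position.
  val windows: Seq[Map[String, Long]] =
    perBatch.indices.scanLeft(Map.empty[String, Long]) { (prev, i) =>
      val added = combine(prev, perBatch(i))(_ + _)
      val leaving = if (i >= windowLen) perBatch(i - windowLen) else Map.empty[String, Long]
      // Same role as the filter (x: (T, Long)) => x._2 != 0L in the source.
      combine(added, leaving)(_ - _).filter(_._2 != 0L)
    }.tail // drop the empty seed map

  def main(args: Array[String]): Unit = windows.foreach(println)
}
```

Without the final filter, a value that has completely left the window (here "b" at the third position) would linger in the result with a count of 0. This also makes the difference between the two methods visible in their return types: countByWindow yields a single total per window (DStream[Long]), while countByValueAndWindow yields one count per distinct value (DStream[(T, Long)]).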