Spark ~ The aggregate Operator Explained
If you can't figure out how to write the parameters, use the first style below: defining named functions yourself is the easiest to follow.
I'll write up the theory when I have time.
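For reference, the signature of aggregate on an RDD[T] in the Spark Scala API looks like this (seqOp folds the elements within each partition, combOp merges the per-partition results):

```scala
def aggregate[U: ClassTag](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U) => U): U
```

Note that the result type U can differ from the element type T, which is what makes aggregate more general than reduce.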
// Example 1: pass named functions as the seqOp and combOp parameters
val rdd = sc.parallelize(List(1,2,3,4,5,6,7,8,9,10), 3)
def fun1(x: Int, y: Int): Int = x + y  // seqOp: folds the elements of each partition, starting from the zero value
def fun2(x: Int, y: Int): Int = x + y  // combOp: merges the per-partition results
val cai = rdd.aggregate(0)(fun1, fun2)
println(cai)
println("-----------------------------------------------")
// Example 2: the elements are Lists, so seqOp adds each inner list's sum (6 + 21 + 39)
val rdd1 = sc.parallelize(List(List(1,2,3), List(6,7,8), List(8,9,22)))
val luo = rdd1.aggregate(0)(_ + _.sum, _ + _)
println(luo)
println("-----------------------------------------------")
// Example 3: same as Example 1, but with anonymous functions
val rdd3 = sc.parallelize(List(1,2,3,4,5,6,7,8,9,10), 3)
val ying = rdd3.aggregate(0)(_ + _, _ + _)
println(ying)
Output:
55
-----------------------------------------------
66
-----------------------------------------------
55
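Why the zero value matters: aggregate applies zeroValue once per partition (as the starting point for seqOp) and once more when merging (as the starting point for combOp). Here is a minimal plain-Scala sketch of that semantics, no Spark needed; `simulate` and the partition layout are my own illustration, assuming parallelize splits 10 elements over 3 partitions as [1,2,3], [4,5,6], [7,8,9,10]:

```scala
object AggregateDemo {
  // Simulate RDD.aggregate on pre-split "partitions":
  // fold each partition with seqOp starting from zero,
  // then fold the partial results with combOp, starting from zero again.
  def simulate[A, B](partitions: Seq[Seq[A]], zero: B)(seqOp: (B, A) => B, combOp: (B, B) => B): B =
    partitions.map(p => p.foldLeft(zero)(seqOp)).foldLeft(zero)(combOp)

  def main(args: Array[String]): Unit = {
    val parts = Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9, 10))
    println(simulate(parts, 0)(_ + _, _ + _)) // 55, as in Examples 1 and 3
    // With zero = 1 the result would be 59, not 56: the 1 is added in each
    // of the 3 partitions and once more in the combine step (55 + 3 + 1).
    println(simulate(parts, 1)(_ + _, _ + _)) // 59
  }
}
```

This is why a zero value that is not a neutral element for your operations (like 1 for addition) changes the result in a partition-count-dependent way.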