aggregateMessages
The Graph class provides the aggregation method aggregateMessages. The official documentation gives the following example of how to use it:
// Import the graph types and the random graph generation library
import org.apache.spark.graphx.{Graph, VertexRDD}
import org.apache.spark.graphx.util.GraphGenerators
// Create a graph with "age" as the vertex property. Here we use a random graph for simplicity.
val graph: Graph[Double, Int] =
  GraphGenerators.logNormalGraph(sc, numVertices = 100).mapVertices( (id, _) => id.toDouble )
// Compute the number of older followers and their total age
val olderFollowers: VertexRDD[(Int, Double)] = graph.aggregateMessages[(Int, Double)](
  triplet => { // Map Function
    if (triplet.srcAttr > triplet.dstAttr) {
      // Send message to destination vertex containing counter and age
      triplet.sendToDst((1, triplet.srcAttr))
    }
  },
  // Add counter and age
  (a, b) => (a._1 + b._1, a._2 + b._2) // Reduce Function
)
To make the method easier to understand, the steps below work through it with simpler code:
1. Generate a random graph data set:
val graph =
  GraphGenerators.logNormalGraph(sc, numVertices = 100).mapVertices( (id, _) => id.toDouble )
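To pin down what the function does, pick the edges with srcId = 80; the data is shown in the figure below:
2. Use aggregateMessages to compute, for source vertex 80, the number of its out-neighbors (one count per outgoing edge) and the sum of their attributes. The key point is the difference between sendToDst and sendToSrc: the former aggregates messages at the destination vertex (dst), while the latter aggregates them at the source vertex (src). The call is defined as follows: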
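val olderFollower = graph.aggregateMessages[(Int, Double)](
  // Each edge sends (1, attribute of its destination) back to its source vertex
  e => e.sendToSrc((1, e.dstAttr)),
  // Add up the counters and the attribute totals
  (a, b) => (a._1 + b._1, a._2 + b._2) )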
The aggregated result for srcId = 80 is shown in the figure below:
3. If aggregateMessages is instead defined with sendToDst, as follows:
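val olderFollowers = graph.aggregateMessages[(Int, Double)](
  // Each edge sends (1, attribute of its source) to its destination vertex
  e => e.sendToDst((1, e.srcAttr)),
  // Add up the counters and the attribute totals
  (a, b) => (a._1 + b._1, a._2 + b._2) )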
The original data is as follows:
and the result is shown in the figure below:
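Because the random graph changes from run to run, the difference between sendToSrc and sendToDst may be easier to check on a tiny fixed graph. The following is only a sketch in plain Scala collections, not GraphX itself: `Edge`, `vertexAttr`, `merge`, `toSrc` and `toDst` are hypothetical names introduced for illustration, and the three edges are made-up data. It assumes, as in the steps above, that each vertex attribute is the vertex id converted to Double.

```scala
// Plain-Scala emulation of the two aggregateMessages calls above (no Spark required).
// All names here are hypothetical helpers, not part of the GraphX API.
final case class Edge(srcId: Long, dstId: Long)

object AggregateSketch {
  // Vertex attribute = vertex id as Double, mirroring mapVertices((id, _) => id.toDouble)
  def vertexAttr(id: Long): Double = id.toDouble

  // Made-up edges: 1 -> 2, 1 -> 3, 2 -> 3
  val edges: Seq[Edge] = Seq(Edge(1L, 2L), Edge(1L, 3L), Edge(2L, 3L))

  // Same merge step as the reduce function in the text: add counters and totals
  def merge(a: (Int, Double), b: (Int, Double)): (Int, Double) =
    (a._1 + b._1, a._2 + b._2)

  // Emulates e.sendToSrc((1, e.dstAttr)): messages are collected at the source vertex
  def toSrc: Map[Long, (Int, Double)] =
    edges.groupBy(_.srcId).map { case (id, es) =>
      id -> es.map(e => (1, vertexAttr(e.dstId))).reduce((a, b) => merge(a, b))
    }

  // Emulates e.sendToDst((1, e.srcAttr)): messages are collected at the destination vertex
  def toDst: Map[Long, (Int, Double)] =
    edges.groupBy(_.dstId).map { case (id, es) =>
      id -> es.map(e => (1, vertexAttr(e.srcId))).reduce((a, b) => merge(a, b))
    }
}

// AggregateSketch.toSrc contains 1 -> (2, 5.0) and 2 -> (1, 3.0)
// AggregateSketch.toDst contains 2 -> (1, 1.0) and 3 -> (2, 3.0)
```

In this sketch vertex 3 has no outgoing edges, so it is absent from the sendToSrc result, and vertex 1 has no incoming edges, so it is absent from the sendToDst result. A vertex that receives no message likewise does not appear in the VertexRDD returned by aggregateMessages.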