1. Connected Components
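Meaning: The connected components algorithm labels each connected component of a graph with the ID of its lowest-numbered vertex. In a social network, for example, connected components can serve as an approximation of clusters.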
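To make the "lowest-numbered vertex ID" labelling concrete, here is a minimal sketch in plain Scala with no Spark dependency. It is not how GraphX implements the algorithm; the object `MinIdComponents`, the helper `label`, and the edge list are made up for illustration. It only shows what the resulting labels look like on a tiny graph.

```scala
// Illustration only: label every vertex with the smallest vertex ID
// found in its connected component. Names here are hypothetical and
// are not part of the GraphX API.
object MinIdComponents {
  def label(edges: Seq[(Long, Long)]): Map[Long, Long] = {
    // Start with every vertex labelled by its own ID.
    var labels: Map[Long, Long] =
      edges.flatMap { case (a, b) => Seq(a, b) }.distinct.map(v => v -> v).toMap
    var changed = true
    // Push the smaller label across each edge until nothing changes.
    while (changed) {
      changed = false
      for ((a, b) <- edges) {
        val m = math.min(labels(a), labels(b))
        if (labels(a) != m || labels(b) != m) {
          labels = labels.updated(a, m).updated(b, m)
          changed = true
        }
      }
    }
    labels
  }

  def main(args: Array[String]): Unit = {
    // Made-up edge list: {1, 2, 3} and {4, 5} form two components.
    val edges = Seq((2L, 1L), (3L, 2L), (5L, 4L))
    // Prints (1,1) (2,1) (3,1) (4,4) (5,4), one pair per line.
    label(edges).toSeq.sortBy(_._1).foreach(println)
  }
}
```

Each printed pair is (vertex ID, component ID), the same shape as the `cc` result in the GraphX example that follows.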
Example:
package sparkGraphX

import org.apache.spark.graphx.{GraphLoader, VertexId, VertexRDD}
import org.apache.spark.{SparkConf, SparkContext}

object connectionTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SimpleGraphX").setMaster("local")
    val sc = new SparkContext(conf)
    sc.setLogLevel("WARN")
    // Load the graph as in the PageRank example.
    // edgeListFile expects one "srcId dstId" pair per line.
    val graph = GraphLoader.edgeListFile(sc, "D:/testData/vertices.txt")
    // Find the connected components; each vertex is paired with its component ID
    val cc: VertexRDD[VertexId] = graph.connectedComponents().vertices
    cc.collect.foreach(println(_))
    // The code below maps vertex IDs to vertex names, then prints each name with its value.
    val users = sc.textFile("D:/testData/user.txt").map { line =>
      val fields = line.split(",")
(fields