Spark GraphX Connected Components (for my own study only)

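import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import scala.collection.mutable.ListBuffer

// Create the test data (assumes `sc` is an existing SparkContext)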
val userRDD: RDD[(String, Map[String, String])] = sc
  .parallelize {
    Seq(
      ("M001", Map(("imei", "IM001"), ("mac", "MA001"), ("openudid", "OU001"), ("idfa", "ID001"))),
      ("M002", Map(("mac", "MA002"), ("openudid", "OU001"))),
      ("M003", Map(("imei", "IM003"), ("androidid", "AN003"), ("idfa", "ID001"))),
      ("M004", Map(("imei", "IM004"), ("mac", "MA004"), ("openudid", "OU004"))),
      ("M005", Map(("androidid", "AN005"), ("idfa", "ID005")))
    )
  }

//    userRDD.foreach(println)

val vertex: RDD[(VertexId, String)] = userRDD
  .flatMap {
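    case (mainID, map) => {
      val list = new ListBuffer[(VertexId, String)]()
      // Tag the mainID vertex with a "mainID#" prefix so it can be picked out at the end
      // Note: String.hashCode serves as the VertexId, so distinct IDs could collide on larger data
      list += mainID.hashCode.toLong -> s"mainID#${mainID}"
      map.foreach {
        case (idName, id) =>
          list += id.hashCode.toLong -> s"${idName}#${id}"
      }
      // Convert the mutable buffer to an immutable List before returning it
      list.toList
    }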
  }

//    vertex.foreach(println)

val edges: RDD[Edge[String]] = userRDD
  .flatMap {
    case (mainID, map) => {
      val list: ListBuffer[Edge[String]] = new ListBuffer[Edge[String]]()
      map.foreach {
        case (idName, id) => list += Edge(mainID.hashCode.toLong, id.hashCode.toLong, mainID)
      }
      // Convert the mutable buffer to an immutable List before returning it
      list.toList
    }
  }

//    edges.foreach(println)

// Build the graph
val graph: Graph[String, String] = Graph(vertex, edges)

// Compute the connected components; each vertex is labeled with the smallest VertexId in its component
val ccGraph: Graph[VertexId, String] = graph.connectedComponents()

//    ccGraph.edges.foreach(println)
//    ccGraph.vertices.foreach(println)

ccGraph
  .vertices
  .join(vertex)
  .map { case (vertexId, (ccVertexId, attr)) =>
    (ccVertexId, (vertexId, attr))
  }
  .groupByKey()
  .values
  .map { iter =>
    iter
      .filter {
        // Keep only the mainID vertices; the intermediate device-ID vertices are dropped
        case (_, attr) => attr.startsWith("mainID")
      }
      .map(tuple => tuple._2)
      .reduce((mainIdAttr1, mainIdAttr2) => s"$mainIdAttr1, $mainIdAttr2")
  }
  .foreach(println)

// Keep the application alive for a while (presumably so the Spark UI can be inspected)
Thread.sleep(1000000)

if (!sc.isStopped) sc.stop()
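
To check what the job above should print without starting Spark, here is a minimal plain-Scala sketch that groups the same test data with a union-find structure. It is not how GraphX computes connected components internally; the object name `ConnectedIdsSketch` and the helper `groupMainIds` are made up for this illustration. As in the listing above, device IDs are matched by value only, regardless of the ID type.

```scala
import scala.collection.mutable

// Hypothetical helper object, for illustration only (not part of GraphX).
object ConnectedIdsSketch {

  // Same test data as in the Spark listing.
  val users: Seq[(String, Map[String, String])] = Seq(
    "M001" -> Map("imei" -> "IM001", "mac" -> "MA001", "openudid" -> "OU001", "idfa" -> "ID001"),
    "M002" -> Map("mac" -> "MA002", "openudid" -> "OU001"),
    "M003" -> Map("imei" -> "IM003", "androidid" -> "AN003", "idfa" -> "ID001"),
    "M004" -> Map("imei" -> "IM004", "mac" -> "MA004", "openudid" -> "OU004"),
    "M005" -> Map("androidid" -> "AN005", "idfa" -> "ID005")
  )

  // Groups main IDs that are linked, directly or transitively, by a shared device ID.
  def groupMainIds(data: Seq[(String, Map[String, String])]): Set[Set[String]] = {
    // Union-find parent pointers; nodes are "mainID#<id>" or "id#<deviceId>".
    val parent = mutable.Map.empty[String, String]

    def find(x: String): String = {
      val p = parent.getOrElseUpdate(x, x)
      if (p == x) x
      else {
        val root = find(p)
        parent(x) = root // path compression
        root
      }
    }

    def union(a: String, b: String): Unit = {
      val ra = find(a)
      val rb = find(b)
      if (ra != rb) parent(ra) = rb
    }

    // One edge per (mainID, device ID) pair, mirroring the edges RDD.
    for ((mainId, ids) <- data; (_, id) <- ids) union(s"mainID#$mainId", s"id#$id")

    // Register main IDs that have no device IDs at all.
    data.foreach { case (mainId, _) => find(s"mainID#$mainId") }

    // Keep only the main IDs, grouped by the root of their component.
    data.map(_._1).groupBy(m => find(s"mainID#$m")).values.map(_.toSet).toSet
  }
}
```

Calling `ConnectedIdsSketch.groupMainIds(ConnectedIdsSketch.users)` yields three groups: M001, M002 and M003 together (M001 and M002 share OU001, M001 and M003 share ID001), and M004 and M005 each on their own. The Spark job should print the same grouping, one line per component.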
