Spark GraphX Graph Data Analysis (Part 2)

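1. Property Operators

The property operators are mapVertices, mapEdges, and mapTriplets. They work like the map operation on an RDD: each one transforms attributes and leaves the graph structure unchanged.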

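// Transform vertex attributes
  def mapVertices[VD2](map: (VertexId, VD) => VD2): Graph[VD2, ED]
// Transform edge attributes
  def mapEdges[ED2](map: Edge[ED] => ED2): Graph[VD, ED2]
// Transform edge attributes with access to the whole triplet (edge plus both endpoints)
  def mapTriplets[ED2](map: EdgeTriplet[VD, ED] => ED2): Graph[VD, ED2]

Example (run in spark-shell, where sc is predefined):

import org.apache.spark.graphx.{Edge, Graph}

// Vertices: (id, (name, age))
val p = sc.parallelize(Array((1L,("Alice",28)),(2L,("Bob",27)),(3L,("Charlie",65)),(4L,("David",42)),(5L,("Ed",55)),(6L,("Fran",50))))

// Edges: Edge(srcId, dstId, weight)
val re = sc.parallelize(Array(Edge(2L,1L,7),Edge(2L,4L,2),Edge(3L,2L,4),Edge(3L,6L,3),Edge(4L,1L,1),Edge(5L,2L,2),Edge(5L,3L,8),Edge(5L,6L,3)))

val graph = Graph(p, re)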

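// Change the structure of the vertex attribute: (name, age) becomes (id, name)
val gr =
  graph.mapVertices { case (id, (name, age)) => (id, name) }
// gr2 has the same effect as gr
val gr2 =
  graph.mapVertices { case (id, attr) => (id, attr._1) }

gr.vertices.collect.foreach(println)
Output: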
(4,(4,David))
(1,(1,Alice))
(6,(6,Fran))
(3,(3,Charlie))
(5,(5,Ed))
(2,(2,Bob))


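// Change the edge attributes
// Note: mapEdges replaces only the attribute with whatever the function returns,
// so returning an Edge here nests an Edge inside each edge (see the output).
// To simply double the weight, use graph.mapEdges(x => x.attr * 2).
val gr3 =
    graph.mapEdges(x => Edge(x.srcId, x.dstId, x.attr * 2))

gr3.edges.collect.foreach(println)
Output: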
Edge(2,1,Edge(2,1,14))
Edge(2,4,Edge(2,4,4))
Edge(3,2,Edge(3,2,8))
Edge(3,6,Edge(3,6,6))
Edge(4,1,Edge(4,1,2))
Edge(5,2,Edge(5,2,4))
Edge(5,3,Edge(5,3,16))
Edge(5,6,Edge(5,6,6))
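
The signature list above includes mapTriplets, but the examples do not exercise it. The following is a minimal sketch, not part of the original post, run on a three-vertex subset of the sample data. The names users, likes, labelled, and doubled are made up for illustration, and SparkContext.getOrCreate is used so the snippet works both inside and outside spark-shell.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Reuse the shell's SparkContext if there is one; otherwise start a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("mapTripletsSketch"))

// A three-vertex subset of the sample graph (illustrative names).
val users = sc.parallelize(Seq(
  (1L, ("Alice", 28)), (2L, ("Bob", 27)), (4L, ("David", 42))))
val likes = sc.parallelize(Seq(
  Edge(2L, 1L, 7), Edge(2L, 4L, 2), Edge(4L, 1L, 1)))
val graph = Graph(users, likes)

// mapTriplets can read both endpoint attributes; the function returns
// only the NEW edge attribute, not a whole Edge.
val labelled = graph.mapTriplets(t => s"${t.srcAttr._1}->${t.dstAttr._1}:${t.attr}")
val labels = labelled.edges.map(_.attr).collect().sorted
labels.foreach(println)

// For comparison: mapEdges returning just the doubled weight (no nested Edge).
val doubled = graph.mapEdges(e => e.attr * 2)
val weights = doubled.edges.map(_.attr).collect().sorted
println(weights.mkString(","))
```

Because the function's result becomes the edge attribute directly, labelled has type Graph[(String, Int), String] and doubled keeps Int edge attributes.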

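2. Structural Operators

// Reverse the direction of every edge
  def reverse: Graph[VD, ED]
// Build a subgraph from an edge predicate and a vertex predicate
  def subgraph(epred: EdgeTriplet[VD,ED] => Boolean,
               vpred: (VertexId, VD) => Boolean): Graph[VD, ED]

Example: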

// reverse: flip the direction of every edge
val gr4 = graph.reverse

gr4.edges.collect.foreach(println)
Output:
Edge(1,2,7)
Edge(1,4,1)
Edge(2,3,4)
Edge(2,5,2)
Edge(3,5,8)
Edge(4,2,2)
Edge(6,3,3)
Edge(6,5,3)



// subgraph with vpred: keep only vertices whose age is under 65, plus the edges between them
graph.subgraph(vpred = (id, t) => t._2 < 65).triplets.collect.foreach(println)
Output:
((2,(Bob,27)),(1,(Alice,28)),7)
((2,(Bob,27)),(4,(David,42)),2)
((4,(David,42)),(1,(Alice,28)),1)
((5,(Ed,55)),(2,(Bob,27)),2)
((5,(Ed,55)),(6,(Fran,50)),3)

// subgraph with epred: keep only edges whose source vertex has age under 65
graph.subgraph(epred = ep => ep.srcAttr._2 < 65).triplets.collect.foreach(println)
Output:
((2,(Bob,27)),(1,(Alice,28)),7)
((2,(Bob,27)),(4,(David,42)),2)
((4,(David,42)),(1,(Alice,28)),1)
((5,(Ed,55)),(2,(Bob,27)),2)
((5,(Ed,55)),(3,(Charlie,65)),8)
((5,(Ed,55)),(6,(Fran,50)),3)
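
The two calls above differ in a way the triplet output does not show: vpred removes vertices (and every edge touching them), while epred alone only filters edges and keeps every vertex. The sketch below, an illustration rather than part of the original post, checks this by counting vertices and edges and also combines both predicates. The thresholds (weight >= 4, weight >= 3) and the names byEdge and both are arbitrary choices for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Reuse the shell's SparkContext if there is one; otherwise start a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("subgraphSketch"))

// Same sample data as the article.
val users = sc.parallelize(Seq(
  (1L, ("Alice", 28)), (2L, ("Bob", 27)), (3L, ("Charlie", 65)),
  (4L, ("David", 42)), (5L, ("Ed", 55)), (6L, ("Fran", 50))))
val likes = sc.parallelize(Seq(
  Edge(2L, 1L, 7), Edge(2L, 4L, 2), Edge(3L, 2L, 4), Edge(3L, 6L, 3),
  Edge(4L, 1L, 1), Edge(5L, 2L, 2), Edge(5L, 3L, 8), Edge(5L, 6L, 3)))
val graph = Graph(users, likes)

// epred only: edges with weight < 4 are dropped, all vertices remain.
val byEdge = graph.subgraph(epred = t => t.attr >= 4)
val byEdgeCounts = (byEdge.vertices.count(), byEdge.edges.count())

// Both predicates: vertex 3 (age 65) is removed together with its edges,
// then the remaining edges are filtered by weight.
val both = graph.subgraph(
  epred = t => t.attr >= 3,
  vpred = (id, attr) => attr._2 < 65)
val bothEdges = both.edges.collect().map(e => (e.srcId, e.dstId, e.attr)).toSet

println(s"epred only: $byEdgeCounts")                 // (vertices, edges)
println(s"both: ${both.vertices.count()} vertices, edges = $bothEdges")
```

On this data the epred-only subgraph should still report 6 vertices with 3 edges, while the combined call should leave 5 vertices and the two edges (2,1,7) and (5,6,3).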

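3. Join Operators

The join operators bring in data from an external RDD and use it to modify vertex attributes.

// Only vertices with a match in the RDD are transformed; the attribute type stays VD
  def joinVertices[U](table: RDD[(VertexId, U)])(map: (VertexId, VD, U) => VD): Graph[VD, ED]

// map runs on every vertex; for a vertex with no match in the RDD, the value is None
  def outerJoinVertices[U, VD2](table: RDD[(VertexId, U)])(map: (VertexId, VD, Option[U]) => VD2)
    : Graph[VD2, ED]

Example 1: append an email domain to each name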

val t =
    sc.makeRDD(Array((1L,"qq.com"),(2L,"163.com"),(3L,"gmail.com")))

// joinVertices: only vertices that have a match in t are updated
val g1 =
    graph.joinVertices(t)((id, v, cmpy) => (v._1 + "@" + cmpy, v._2))

// outerJoinVertices: every vertex is processed; cmpy is an Option, and unmatched vertices get None
val g2 =
    graph.outerJoinVertices(t)((id, v, cmpy) => (v._1 + "@" + cmpy, v._2))

g1.triplets.collect.foreach(println)
Output:
((2,(Bob@163.com,27)),(1,(Alice@qq.com,28)),7)
((2,(Bob@163.com,27)),(4,(David,42)),2)
((3,(Charlie@gmail.com,65)),(2,(Bob@163.com,27)),4)
((3,(Charlie@gmail.com,65)),(6,(Fran,50)),3)
((4,(David,42)),(1,(Alice@qq.com,28)),1)
((5,(Ed,55)),(2,(Bob@163.com,27)),2)
((5,(Ed,55)),(3,(Charlie@gmail.com,65)),8)
((5,(Ed,55)),(6,(Fran,50)),3)

g2.triplets.collect.foreach(println)
Output:
((2,(Bob@Some(163.com),27)),(1,(Alice@Some(qq.com),28)),7)
((2,(Bob@Some(163.com),27)),(4,(David@None,42)),2)
((3,(Charlie@Some(gmail.com),65)),(2,(Bob@Some(163.com),27)),4)
((3,(Charlie@Some(gmail.com),65)),(6,(Fran@None,50)),3)
((4,(David@None,42)),(1,(Alice@Some(qq.com),28)),1)
((5,(Ed@None,55)),(2,(Bob@Some(163.com),27)),2)
((5,(Ed@None,55)),(3,(Charlie@Some(gmail.com),65)),8)
((5,(Ed@None,55)),(6,(Fran@None,50)),3)
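
In the g2 output the Option wrapper leaks into the string as Some(...) or None, because the Option is concatenated directly. One way to avoid that, sketched here under the assumption that unmatched vertices should simply keep their original attribute, is to pattern match on the Option inside outerJoinVertices. The names users, likes, domains, and withMail are illustrative, and the graph is a three-vertex subset of the sample data.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Reuse the shell's SparkContext if there is one; otherwise start a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("outerJoinSketch"))

// A three-vertex subset of the sample graph (illustrative names).
val users = sc.parallelize(Seq(
  (1L, ("Alice", 28)), (2L, ("Bob", 27)), (4L, ("David", 42))))
val likes = sc.parallelize(Seq(
  Edge(2L, 1L, 7), Edge(2L, 4L, 2), Edge(4L, 1L, 1)))
val graph = Graph(users, likes)

// Vertex 4 deliberately has no entry, so it receives None in the join.
val domains = sc.makeRDD(Seq((1L, "qq.com"), (2L, "163.com")))

val withMail = graph.outerJoinVertices(domains) {
  // Matched vertex: unwrap the Option and append the domain to the name.
  case (_, (name, age), Some(domain)) => (s"$name@$domain", age)
  // Unmatched vertex: keep the attribute unchanged (assumed policy).
  case (_, attr, None) => attr
}

val result = withMail.vertices.collect().toMap
result.toSeq.sortBy(_._1).foreach(println)
```

With this policy the result matches what joinVertices produces. outerJoinVertices becomes the better fit when unmatched vertices need a different default, or when the new attribute has a different type from the old one.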

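Example 2: count the likes each user has given and received

// inDeg = likes received (incoming edges), outDeg = likes given (outgoing edges)
case class user(name:String,age:Int,inDeg:Int,outDeg:Int)

// Vertices missing from inDegrees/outDegrees have no such edges, so default to 0
val g =
    graph.outerJoinVertices(graph.inDegrees) {
      case (id, u, indeg) => user(u._1, u._2, indeg.getOrElse(0), 0)
    }.outerJoinVertices(graph.outDegrees) {
      case (id, u, outdeg) => user(u.name, u.age, u.inDeg, outdeg.getOrElse(0))
    }

g.vertices.collect.foreach(println)

Output: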

(4,user(David,42,1,1))
(1,user(Alice,28,2,0))
(6,user(Fran,50,2,0))
(3,user(Charlie,65,1,2))
(5,user(Ed,55,0,3))
(2,user(Bob,27,2,2))