Efficient Closest Community Search over Large Graphs: the community search problem

Introduction

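This paper studies the closest community search problem: given a graph G and a set of query vertices Q, find a connected subgraph of G that contains Q. The subgraph must also be highly cohesive, meaning that its vertices are closely related to one another.
The paper computes it with a two-phase approach: (1) compute g0, the maximal connected subgraph of G that contains Q and is the most cohesive, and (2) iteratively remove from g0 the vertex farthest from Q, then also remove any other vertices that violate the cohesiveness requirement.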
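
To make "farthest from Q" concrete, here is a minimal Python sketch of the query distance. It assumes the definition commonly used for this problem, namely that the query distance of a vertex v is the largest hop distance from v to any vertex of Q; this post does not spell the definition out, so treat it as an assumption. The adjacency-dict representation and the name `query_distance` are illustrative only.

```python
from collections import deque

def query_distance(adj, Q):
    """Return {v: max over q in Q of the hop distance between v and q}."""
    qdist = {v: 0 for v in adj}
    for q in Q:
        # Plain BFS from the query vertex q.
        dist = {q: 0}
        queue = deque([q])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for v in adj:
            # Vertices unreachable from q get an infinite distance.
            qdist[v] = max(qdist[v], dist.get(v, float("inf")))
    return qdist

# Path graph a - b - c - d with query set {a, c}.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(query_distance(adj, {"a", "c"}))  # {'a': 2, 'b': 1, 'c': 2, 'd': 3}
```

This runs one BFS per query vertex, which is fine for small Q.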

Algorithms

Phase 1
Baseline algorithm: O(n + m)

Baseline-S1
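First compute the core number of every vertex; a peeling algorithm does this in linear time.
The goal is to find the maximal k-core that contains the query vertices Q.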
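
The peeling step can be sketched in Python as follows. This is a bucket-based variant with lazy deletion, written for illustration and not necessarily the exact procedure of [1]; the function name `core_numbers` and the adjacency-dict input are assumptions.

```python
def core_numbers(adj):
    """Peel vertices in non-decreasing order of remaining degree."""
    deg = {v: len(adj[v]) for v in adj}
    max_deg = max(deg.values(), default=0)
    buckets = [[] for _ in range(max_deg + 1)]
    for v, d in deg.items():
        buckets[d].append(v)
    core = {}
    for d in range(max_deg + 1):
        while buckets[d]:
            v = buckets[d].pop()
            if v in core or deg[v] != d:
                continue  # stale entry left behind by an earlier decrement
            core[v] = d
            for w in adj[v]:
                # Only neighbors with a larger remaining degree lose a unit,
                # so no vertex ever drops below the current level d.
                if w not in core and deg[w] > d:
                    deg[w] -= 1
                    buckets[deg[w]].append(w)
    return core

# A 4-clique {1,2,3,4}, a triangle {5,6,7}, the edge 4-5, and a pendant vertex 8.
adj = {
    1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3, 5},
    5: {4, 6, 7}, 6: {5, 7}, 7: {5, 6, 8}, 8: {7},
}
print(core_numbers(adj))
```

Each vertex enters a bucket once initially and at most once per incident edge afterwards, so the total work stays linear in n + m.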

Input: Graph G = (V, E) and a set of query vertices Q ⊂ V
Output: kQ and the (kQ, ∞)-community of Q
1 Run the peeling algorithm of [1] to compute the core number for all vertices of G;
2 Initialize a priority queue PQ to contain an arbitrary vertex of Q; /* PQ is distinct from the query set Q */
3 kQ ← n;
4 while not all vertices of Q have been visited do
5     u ← pop the vertex with the maximum core number from PQ;
6     Mark u as visited;
7     if core(u) < kQ then kQ ← core(u);
8     for each neighbor v ∈ N(u) do
9         if v is not in PQ and has not been visited then Push v into PQ;
10 g0 ← the connected component of the kQ-core of G that contains Q;
11 return (kQ, g0);
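
Steps 2 to 10 above could look like this in Python. The sketch assumes the core numbers are already available as a dict `core` (hard-coded here for a small example graph); `baseline_s1` and the adjacency-dict layout are illustrative names, and the binary heap makes this version O((n + m) log n) instead of the linear bound quoted for the paper's algorithm.

```python
import heapq

def baseline_s1(adj, core, Q):
    """Return (kQ, g0): the largest k such that Q lies in one connected
    component of the k-core, together with that component's vertex set."""
    Q = set(Q)
    start = next(iter(Q))
    heap = [(-core[start], start)]  # max-heap on core number via negation
    seen = {start}
    unvisited = set(Q)
    kq = len(adj)                   # step 3: kQ <- n
    while heap and unvisited:
        _, u = heapq.heappop(heap)
        unvisited.discard(u)
        kq = min(kq, core[u])
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                heapq.heappush(heap, (-core[v], v))
    if unvisited:
        raise ValueError("Q does not lie in a single connected component")
    # Step 10: connected component of the kQ-core that contains Q.
    g0, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if core[v] >= kq and v not in g0:
                g0.add(v)
                stack.append(v)
    return kq, g0

# A 4-clique {1,2,3,4}, a triangle {5,6,7}, the edge 4-5, and a pendant vertex 8.
adj = {
    1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3, 5},
    5: {4, 6, 7}, 6: {5, 7}, 7: {5, 6, 8}, 8: {7},
}
core = {1: 3, 2: 3, 3: 3, 4: 3, 5: 2, 6: 2, 7: 2, 8: 1}
kq, g0 = baseline_s1(adj, core, {1, 6})
print(kq, sorted(g0))  # 2 [1, 2, 3, 4, 5, 6, 7]
```

Always expanding the vertex with the largest core number means the smallest core number seen before all of Q is reached is exactly kQ.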
Improved algorithm: O(n0 + m0)

Indexed-S1
1 Compute kQ based on the index I;
2 Conduct a pruned breadth-first search on G by starting from an arbitrary vertex of Q and visiting only vertices whose core numbers are at least kQ;
3 g0 ← the subgraph of G induced by vertices visited at Line 2;
4 return (kQ, g0);
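
A minimal Python sketch of the pruned breadth-first search follows. The index I is not described in this post, so the sketch simply takes kQ and the core numbers as given inputs; `pruned_bfs`, `core`, and the adjacency-dict layout are hypothetical names chosen for illustration.

```python
from collections import deque

def pruned_bfs(adj, core, Q, kq):
    """Vertices reachable from Q through vertices with core number >= kq."""
    start = next(iter(Q))
    visited = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            # Pruning: never step onto a vertex outside the kQ-core.
            if core[v] >= kq and v not in visited:
                visited.add(v)
                queue.append(v)
    return visited

# A 4-clique {1,2,3,4}, a triangle {5,6,7}, the edge 4-5, and a pendant vertex 8.
adj = {
    1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3, 5},
    5: {4, 6, 7}, 6: {5, 7}, 7: {5, 6, 8}, 8: {7},
}
core = {1: 3, 2: 3, 3: 3, 4: 3, 5: 2, 6: 2, 7: 2, 8: 1}
print(sorted(pruned_bfs(adj, core, {1, 2}, 3)))  # [1, 2, 3, 4]
```

The search only expands vertices of g0, which is why this phase can avoid touching the rest of G.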

Phase 2
Baseline algorithm: O(n0 × m0)

Baseline-S2
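Starting from the g0 returned by Baseline-S1, repeatedly delete the vertex with the largest query distance, as long as what remains is still a connected subgraph that contains Q.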

Input: A set of query vertices Q ⊂ V, an integer kQ, and a graph g0 that contains Q and has minimum vertex degree kQ
Output: Closest community of Q
1 Compute the query distance for all vertices of g0;
2 i ← 0;
3 while true do
4     u ← the vertex in g_i with the largest query distance;
5     g_{i+1} ← the connected component of the kQ-core of g_i \ {u} that contains Q;
6     if g_{i+1} = ∅ then break;
7     else i ← i + 1;
8 return g_i;
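
Here is a small Python sketch of this baseline. It assumes that the query distance of a vertex is its largest hop distance to any vertex of Q, measured inside g0; the helper names (`query_distance`, `kcore_component`, `baseline_s2`) and the adjacency-dict layout are illustrative, not taken from the paper.

```python
from collections import deque

def query_distance(adj, vertices, Q):
    """Max hop distance to any query vertex, measured inside `vertices`."""
    qdist = dict.fromkeys(vertices, 0)
    for q in Q:
        dist, queue = {q: 0}, deque([q])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in vertices and w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for v in vertices:
            qdist[v] = max(qdist[v], dist.get(v, float("inf")))
    return qdist

def kcore_component(adj, vertices, Q, k):
    """Component of the k-core of the induced subgraph that contains Q, or set()."""
    alive = set(vertices)
    deg = {v: len(adj[v] & alive) for v in alive}
    stack = [v for v in alive if deg[v] < k]
    while stack:                       # peel vertices of degree < k
        v = stack.pop()
        if v not in alive:
            continue
        alive.remove(v)
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
                if deg[w] < k:
                    stack.append(w)
    if not Q <= alive:
        return set()
    start = next(iter(Q))
    comp, todo = {start}, [start]
    while todo:                        # collect the component around Q
        u = todo.pop()
        for w in adj[u]:
            if w in alive and w not in comp:
                comp.add(w)
                todo.append(w)
    return comp if Q <= comp else set()

def baseline_s2(adj, Q, kq, g0):
    Q, g = set(Q), set(g0)
    qdist = query_distance(adj, g, Q)  # step 1: computed once, on g0
    while True:
        u = max(g, key=lambda v: qdist[v])
        nxt = kcore_component(adj, g - {u}, Q, kq)
        if not nxt:
            return g
        g = nxt

# Two triangles {1,2,3} and {3,4,5} sharing vertex 3; query vertex 1, kQ = 2.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(sorted(baseline_s2(adj, {1}, 2, {1, 2, 3, 4, 5})))  # [1, 2, 3]
```

Every round recomputes a k-core and a connected component from scratch, which is where the O(n0 × m0) cost comes from.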
Improved algorithm: O(m0 + n0 log n0)

LinearOrder-S2:
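Instead of checking connectivity right after each vertex u is deleted, the algorithm first builds a hierarchical structure.
The structure is recorded in two sequences, seq and targets.
Vertices are removed in decreasing order of query distance and appended to seq. The vertex that starts each removal round is also appended to targets, while vertices removed only because their degree drops below kQ are appended to seq alone.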

/* Compute the hierarchical structure for the (kQ, d)-communities */
1 Compute the query distance for all vertices of g0;
2 Sort vertices of g0 in decreasing order with respect to their query distances;
3 seq ← ∅; targets ← ∅;
4 g′ ← g0; deg(u) ← the degree of u in g′ for all vertices u ∈ g′;
5 while g′ is not empty do
6     u ← the vertex in g′ with the largest query distance;
7     if Q ∩ seq = ∅ then Append u to targets;
8     queue ← {u}; /* queue is a FIFO queue, distinct from the query set Q */
9     while queue ≠ ∅ do
10        Pop a vertex v from queue, and append v to seq;
11        for each neighbor w of v in g′ do
12            deg(w) ← deg(w) − 1;
13            if deg(w) = kQ − 1 then Push w into queue;
14        Remove v from g′;
The second part merges vertices with a disjoint-set (union-find) structure:
/* Search for the closest community of Q */
15 Initialize an empty disjoint-set data structure S;
16 for each vertex u ∈ targets in the reverse order do
17     for each vertex v ∈ seq between u (inclusive) and the next target vertex (exclusive) do
18         Add a singleton set for v into S;
19         for each neighbor w of v in g0 do
20             if w ∈ S then Union v and w in S;
21     if Q is entirely contained in a single set of S then break;
22 return all vertices in the set of S that contains Q;
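
Both parts of LinearOrder-S2 can be sketched in Python as below. The sketch assumes that g0 is a connected kQ-core containing Q and that the query distances are passed in precomputed as a dict `qdist`; the function name, the adjacency-dict layout, and storing targets as positions in seq are choices made for illustration.

```python
from collections import deque

def linear_order_s2(adj, Q, kq, g0, qdist):
    Q = set(Q)
    # Part 1: peel g0 in decreasing query distance, recording seq and targets.
    order = sorted(g0, key=lambda v: qdist[v], reverse=True)
    alive = set(g0)
    deg = {v: len(adj[v] & alive) for v in alive}
    seq, targets = [], []          # targets stores positions in seq
    query_removed = False
    for u in order:
        if u not in alive:
            continue               # already removed by an earlier cascade
        if not query_removed:
            targets.append(len(seq))
        queue = deque([u])
        while queue:
            v = queue.popleft()
            seq.append(v)
            alive.remove(v)
            if v in Q:
                query_removed = True
            for w in adj[v]:
                if w in alive:
                    deg[w] -= 1
                    if deg[w] == kq - 1:   # w just fell below the threshold
                        queue.append(w)

    # Part 2: add the rounds back in reverse order, merging with union-find.
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    end = len(seq)
    for start in reversed(targets):
        for v in seq[start:end]:
            parent[v] = v
            for w in adj[v]:
                if w in parent:
                    parent[find(v)] = find(w)
        end = start
        if all(q in parent for q in Q) and len({find(q) for q in Q}) == 1:
            break
    root = find(next(iter(Q)))
    return {v for v in parent if find(v) == root}

# Two triangles {1,2,3} and {3,4,5} sharing vertex 3; query vertex 1, kQ = 2.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
qdist = {1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
print(sorted(linear_order_s2(adj, {1}, 2, {1, 2, 3, 4, 5}, qdist)))  # [1, 2, 3]
```

Each vertex is peeled once and re-added once, so apart from the sort the work is near-linear in the size of g0; this sketch omits union by rank for brevity.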
CCS algorithm

Notation: n and m are the numbers of vertices and edges of G; n0 and m0 are the numbers of vertices and edges of g0.
Input: Graph G = (V, E), a set of query vertices Q, and an index I
Output: Closest community of Q
1 Compute kQ based on the index I;
2 h_0 ← the subgraph of G induced by Q;
3 i ← 0; g ← ∅;
4 while true do
5     g′ ← the connected component of the kQ-core of h_i that contains Q;
6     g ← LinearOrder-S2(Q, kQ, g′);
7     if g = ∅ then
8         i ← i + 1; h_i ← h_{i−1};
9         while h_i ≠ G and the size of h_i is less than twice that of h_{i−1} do
10            Get the next vertex u that has the smallest query distance;
11            Add to h_i the vertex u and its adjacent edges to existing vertices of h_i;
12    else break;
13 return g;
