Graph Machine Learning Fundamentals: CS224W (13-communities)

ZreviaX

Article link: https://blog.csdn.net/WindGrin_/article/details/137894104

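This article covers network community detection as taught in Stanford's CS224W course (Machine Learning with Graphs): the configuration model as a null model, how the modularity value is computed, hierarchical clustering with the Louvain algorithm, and the Community Affiliation Graph Model behind BigCLAM for detecting overlapping communities.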

CS224W: Machine Learning with Graphs

Stanford / Winter 2021

13-communities

Network communities
- Sets of nodes with lots of internal connections and few external ones (to the rest of the network)
Null Model: Configuration Model

Given a real graph $G$ on $n$ nodes and $m$ edges, construct a rewired network $G^{'}$. Treat $G^{'}$ as a multigraph (multiple edges may exist between a pair of nodes).
- Same degree distribution but uniformly random connections
- The expected number of edges between nodes $i$ and $j$ of degrees $k_i$ and $k_j$ equals
  
  $k_{i} \cdot \frac{k_{j}}{2 m}=\frac{k_{i} k_{j}}{2 m}$
  - There are $2m$ directed edges in total (counting both $i \rightarrow j$ and $j \rightarrow i$)
  - For each of the $k_i$ outgoing edges of node $i$, the chance of it landing on node $j$ is $k_j/2m$, hence $k_i k_j/2m$
- The expected number of edges in the (multigraph) $G^{'}$
  
  $\begin{aligned} &\frac{1}{2} \sum_{i \in N} \sum_{j \in N} \frac{k_{i} k_{j}}{2 m}=\frac{1}{2} \cdot \frac{1}{2 m} \sum_{i \in N} k_{i}\left(\sum_{j \in N} k_{j}\right)= \\ &\frac{1}{4 m} 2 m \cdot 2 m=m \end{aligned}$
- So under the null model, both the degree distribution and the total number of edges of the graph are preserved
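
The edge-count identity above can be checked numerically. The following is a minimal sketch, assuming a small hypothetical degree sequence; the function name `expected_edges` is illustrative and not from the lecture.

```python
from itertools import product

def expected_edges(degrees):
    """Expected number of edges in the configuration-model multigraph."""
    two_m = sum(degrees)  # the degrees sum to 2m
    # Sum k_i * k_j / 2m over all ordered pairs (i, j), then halve,
    # because every undirected edge is counted in both directions.
    total = sum(ki * kj / two_m for ki, kj in product(degrees, repeat=2))
    return total / 2

degrees = [3, 2, 2, 2, 1]   # hypothetical degree sequence with 2m = 10
m = sum(degrees) // 2
print(expected_edges(degrees), m)
```

Up to floating-point rounding, the two printed numbers agree, which is the statement that the null model keeps the total edge count at $m$.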
Modularity $Q$
- A measure of how well a network is partitioned into communities
- Given a partitioning of the network into disjoint groups $s \in S$
- Modularity values take range $[- 1, 1]$
  - It is positive if the number of edges within groups exceeds the expected number
  - $Q$ greater than 0.3-0.7 means significant community structure
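
As an illustration, modularity can be computed directly from an adjacency matrix. This sketch assumes the standard form $Q = \frac{1}{2m} \sum_{s \in S} \sum_{i, j \in s}\left(A_{ij} - \frac{k_i k_j}{2m}\right)$, which matches the per-community expression used in the modularity-gain derivation later in these notes. The example graph (two triangles joined by one edge) and the helper names are assumptions made for the example.

```python
def adjacency(n, edges):
    """Build a symmetric adjacency matrix from an undirected edge list."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] += 1
        adj[j][i] += 1
    return adj

def modularity(adj, communities):
    """Q for a partition given as a list of disjoint node lists."""
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    q = 0.0
    for group in communities:
        for i in group:
            for j in group:
                # observed weight minus the configuration-model expectation
                q += adj[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m

# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = adjacency(6, edges)
q_split = modularity(adj, [[0, 1, 2], [3, 4, 5]])
q_single = modularity(adj, [[0, 1, 2, 3, 4, 5]])
print(q_split, q_single)
```

Splitting the graph into its two triangles gives $Q = 5/14 \approx 0.357$, while putting every node into one community gives $Q = 0$ up to rounding.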

Louvain Algorithm

Overview
- For community detection
  - $O(n \log n)$ run time
- Supports weighted graphs
- Can detect hierarchical communities
Louvain: $1^{st}$ phase (Partitioning)
- Put each node in a graph into a distinct community (one node per community)
- For each node $i$, the algorithm performs two calculations (the order in which nodes are processed affects the final result, but studies indicate the effect is small enough to ignore)
  - Compute the modularity delta ($\Delta Q$) when putting node $i$ into the community of some neighbor $j$
  - Move $i$ to the community of the node $j$ that yields the largest gain in $\Delta Q$
- Phase 1 runs until no movement yields a gain
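
The local-moving loop can be sketched as follows. This is a deliberately naive version, assuming an unweighted graph stored as an adjacency matrix: it recomputes the full modularity for every candidate move instead of using an incremental $\Delta Q$, so it illustrates the control flow only and does not reach the $O(n \log n)$ run time. The function names and the example graph are hypothetical.

```python
def modularity(adj, label):
    """Q for a partition given as one community label per node."""
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    n = len(adj)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if label[i] == label[j]:
                q += adj[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m

def louvain_phase1(adj):
    n = len(adj)
    label = list(range(n))          # start with one node per community
    improved = True
    while improved:                 # stop when no move yields a gain
        improved = False
        for i in range(n):
            best_label, best_q = label[i], modularity(adj, label)
            for j in range(n):
                if adj[i][j] == 0 or label[j] == label[i]:
                    continue        # only try communities of neighbors
                old = label[i]
                label[i] = label[j]
                q = modularity(adj, label)
                label[i] = old
                if q > best_q + 1e-12:   # tolerance against float noise
                    best_label, best_q = label[j], q
            if best_label != label[i]:
                label[i] = best_label
                improved = True
    return label

# Two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
label = louvain_phase1(adj)
print(label)
```

On this graph the loop ends with each triangle in its own community; the actual label values depend on the visiting order.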
Louvain: $2^{nd}$ phase (Restructuring)
- The communities obtained in the first phase are contracted into super-nodes, and the network is rebuilt accordingly
  - Super-nodes are connected if there is at least one edge between the nodes of the corresponding communities
  - The weight of the edge between two super-nodes is the sum of the weights of all edges between their corresponding communities
- Phase 1 is then run on the super-node network, so repeating the two phases builds a hierarchical clustering from the bottom up
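
A minimal sketch of the contraction step, assuming the graph is a symmetric weight matrix and the phase-1 result is a list of community labels. Keeping intra-community weight as a self-loop on the super-node is an assumption of this sketch (it is the usual way to preserve the total weight $2m$); the notes above only describe the edges between different super-nodes.

```python
def contract(adj, label):
    """Collapse each community into one super-node.

    Returns the super-node weight matrix and the community ids, in the
    order used for the rows and columns of that matrix.
    """
    ids = sorted(set(label))
    index = {c: k for k, c in enumerate(ids)}
    size = len(ids)
    super_adj = [[0] * size for _ in range(size)]
    for i, row in enumerate(adj):
        for j, w in enumerate(row):
            # weight between communities accumulates on one super-edge;
            # weight inside a community accumulates on the diagonal
            super_adj[index[label[i]]][index[label[j]]] += w
    return super_adj, ids

# Two triangles joined by the edge 2-3, already split into two communities.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
super_adj, ids = contract(adj, [0, 0, 0, 1, 1, 1])
print(super_adj)  # [[6, 1], [1, 6]]
```

The entries of the contracted matrix still sum to $2m = 14$, so modularity computed on the super-node network is consistent with the original graph.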

Modularity Gain

What is $\Delta Q$ when node $i$ is moved from community $D$ to community $C$?

$\Delta Q(D \rightarrow i \rightarrow C)=\Delta Q(D \rightarrow i)+\Delta Q(i \rightarrow C)$
Deriving $\Delta Q(i \rightarrow C)$
- First, we derive modularity within $C$ , i.e., $Q (C)$
  - $\boldsymbol{\Sigma}_{\boldsymbol{i n}} \equiv \sum_{i, j \in C} A_{i j}$ : sum of link weights between nodes in $C$
  - $\boldsymbol{\Sigma}_{\text {tot }} \equiv \sum_{i \in C} k_{i}$ : sum of all link weights of nodes in $C$
  $Q(C) \equiv \frac{1}{2 m} \sum_{i, j \in C}\left[A_{i j}-\frac{k_{i} k_{j}}{2 m}\right]=\frac{\sum_{i, j \in C} A_{i j}}{2 m}-\frac{\left(\sum_{i \in C} k_{i}\right)\left(\sum_{j \in C} k_{j}\right)}{(2 m)^{2}} = \frac{\Sigma_{i n}}{2 m}-\left(\frac{\Sigma_{t o t}}{2 m}\right)^{2}$
- Second, further define something useful
  - $\boldsymbol{k}_{i, i n} \equiv \sum_{j \in C} A_{i j}+\sum_{j \in C} A_{j i}$ : sum of link weights between node $i$ and $C$
  - $\boldsymbol{k}_{\boldsymbol{i}}$ : sum of all link weights (i.e., degree) of node $i$
$\begin{aligned} &Q_{\text {before }}=Q(C)+Q(\{i\}) \\ &\quad=\left[\frac{\Sigma_{\text {in }}}{2 m}-\left(\frac{\Sigma_{\text {tot }}}{2 m}\right)^{2}\right]+\left[0-\left(\frac{k_{i}}{2 m}\right)^{2}\right] \end{aligned}$

$\begin{aligned} &Q_{\mathrm{after}}=Q(C+\{i\}) \\ &\quad=\frac{\Sigma_{i n}+k_{i, i n}}{2 m}-\left(\frac{\Sigma_{t o t}+k_{i}}{2 m}\right)^{2} \end{aligned}$

$\begin{aligned} \Delta Q(i \rightarrow C)=& Q_{\mathrm{after}}-Q_{\mathrm{before}} \\ =& {\left[\frac{\Sigma_{i n}+k_{i, i n}}{2 m}-\left(\frac{\Sigma_{t o t}+k_{i}}{2 m}\right)^{2}\right] } \\ &-\left[\frac{\Sigma_{i n}}{2 m}-\left(\frac{\Sigma_{t o t}}{2 m}\right)^{2}-\left(\frac{k_{i}}{2 m}\right)^{2}\right] \end{aligned}$
- $\Delta Q(D \rightarrow i)$ can be derived similarly
$\Delta Q(D \rightarrow i \rightarrow C)=\Delta Q(D \rightarrow i)+\Delta Q(i \rightarrow C)$
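
Expanding the squares in $Q_{\mathrm{after}}-Q_{\mathrm{before}}$ cancels every term that does not involve node $i$, which leaves $\Delta Q(i \rightarrow C)=\frac{k_{i, i n}}{2 m}-\frac{2 \Sigma_{t o t} k_{i}}{(2 m)^{2}}$. The sketch below evaluates the unsimplified expression and compares it with this closed form. The function name and the numbers are illustrative: they correspond to a graph of two triangles joined by one edge ($m = 7$), with $C$ holding two nodes of one triangle and $i$ being the third.

```python
def delta_q_join(sigma_in, sigma_tot, k_i, k_i_in, m):
    """Modularity gain from moving an isolated node i into community C."""
    two_m = 2 * m
    q_after = (sigma_in + k_i_in) / two_m - ((sigma_tot + k_i) / two_m) ** 2
    q_before = (sigma_in / two_m - (sigma_tot / two_m) ** 2
                - (k_i / two_m) ** 2)
    return q_after - q_before

# C = {0, 1}: one internal edge counted in both directions -> sigma_in = 2,
# degrees 2 + 2 -> sigma_tot = 4. Node i = 2 has degree 3 and two edges
# into C, counted in both directions -> k_i_in = 4.
gain = delta_q_join(sigma_in=2, sigma_tot=4, k_i=3, k_i_in=4, m=7)
closed_form = 4 / 14 - 2 * 4 * 3 / 14 ** 2
print(gain, closed_form)
```

Both values equal $8/49 \approx 0.163$ up to rounding. Note that $k_{i,in}$ and $\Sigma_{in}$ count each undirected edge twice, following the definitions above.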

Detecting Overlapping Communities: BigCLAM

Community Affiliation Graph Model (AGM)

Key Insight: How is a network generated from community affiliations?

Given parameters $(V, C, M, \{p_c\})$
- Nodes in community $c$ connect to each other by flipping a coin with probability $p_c$ (each pair of nodes in the same community is linked by an independent Bernoulli trial)
- Nodes that belong to multiple communities have multiple coin flips (one chance to connect in every community they share)
  
  $p(u, v)=1-\prod_{c \in M_{u} \cap M_{v}}\left(1-p_{c}\right)$
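
The generative process can be sketched in a few lines. This is a minimal example under assumed inputs: `membership` maps each node to the set of communities it belongs to (the $M$ of the model) and `p` maps each community to its coin-flip probability. All names and values are hypothetical, and pairs that share no community get probability 0, exactly as the product formula implies.

```python
import random
from itertools import combinations

def edge_probability(u, v, membership, p):
    """1 - product of (1 - p_c) over the communities shared by u and v."""
    miss = 1.0
    for c in membership[u] & membership[v]:
        miss *= 1.0 - p[c]      # every shared community is one more flip
    return 1.0 - miss

def sample_agm(membership, p, seed=0):
    """Draw one undirected graph from the affiliation model."""
    rng = random.Random(seed)
    edges = []
    for u, v in combinations(sorted(membership), 2):
        if rng.random() < edge_probability(u, v, membership, p):
            edges.append((u, v))
    return edges

membership = {"a": {"A"}, "b": {"A", "B"}, "c": {"A", "B"}, "d": {"B"}}
p = {"A": 0.5, "B": 0.5}
print(edge_probability("b", "c", membership, p))  # 0.75
print(sample_agm(membership, p))
```

Nodes `b` and `c` share two communities, so they are more likely to be linked (0.75) than `a` and `b`, which share only one (0.5).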
Key Insight: Detecting communities with AGM: given a graph, find the model $F$ (that is, recover the AGM parameters from the observed network; this is the BigCLAM model)

Maximum likelihood estimation
- Given real graph $G$
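- Find the model/parameters $F$ that maximize the graph likelihood $P(G \mid F)$: the model should assign high probability to node pairs that are connected in $G$ and low probability to pairs that are not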
“Relax” the AGM: Memberships have strengths
- For community $C$ , we model the probability of $u$ and $v$ being connected as
  
  $P_{C}(u, v)=1-\exp \left(-F_{u C} \cdot F_{v C}\right)$
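  where $F_{uC}, F_{vC} \geq 0$, so $0 \leq P_{C}(u, v) \leq 1$
  - $P_C(u, v) = 0$ (the nodes are not connected through $C$) if and only if at least one of the two nodes has $F_C = 0$
  - $P_C(u, v) \approx 1$ (the nodes are connected) when both nodes have a large $F_C$
- For nodes $u$ and $v$ that belong to multiple communities $\Gamma$, the probability that they are connected is
  
  $P(u, v)=1-\prod_{C \in \Gamma}\left(1-P_{C}(u, v)\right)$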
BigCLAM Model

$\mathrm{P}(u, v)=1-\exp \left(-F_{u}^{T} F_{v}\right)$
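
This compact form follows from substituting the per-community probability $P_{C}(u, v)=1-\exp \left(-F_{u C} \cdot F_{v C}\right)$ into the multi-community formula. A short derivation, writing $F_u$ for the vector of membership strengths $F_{uC}$ of node $u$:

$\begin{aligned} P(u, v) &=1-\prod_{C \in \Gamma}\left(1-P_{C}(u, v)\right) \\ &=1-\prod_{C \in \Gamma} \exp \left(-F_{u C} \cdot F_{v C}\right) \\ &=1-\exp \left(-\sum_{C \in \Gamma} F_{u C} \cdot F_{v C}\right) \\ &=1-\exp \left(-F_{u}^{T} F_{v}\right) \end{aligned}$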
- Given a network $G (V, E)$ , we maximize the likelihood (probability) of $G$ under our model
  
  $\begin{aligned} P(G \mid \boldsymbol{F}) &=\prod_{(u, v) \in E} P(u, v) \prod_{(u, v) \notin E}(1-P(u, v)) \\ &=\prod_{(u, v) \in E}\left(\mathbf{1}-\exp \left(-\boldsymbol{F}_{u}^{\boldsymbol{T}} \boldsymbol{F}_{v}\right)\right) \prod_{(u, v) \notin E} \exp \left(-\boldsymbol{F}_{u}^{\boldsymbol{T}} \boldsymbol{F}_{v}\right) \end{aligned}$
- The likelihood is a product of many small probabilities, which is numerically unstable
- We therefore maximize the log-likelihood instead
  
  $\begin{aligned} &\log (P(G \mid \boldsymbol{F})) \\ &=\log \left(\prod_{(u, v) \in E}\left(1-\exp \left(-\boldsymbol{F}_{u}^{T} \boldsymbol{F}_{v}\right)\right) \prod_{(u, v) \notin E} \exp \left(-\boldsymbol{F}_{u}^{T} \boldsymbol{F}_{v}\right)\right) \\ &=\sum_{(u, v) \in E} \log \left(1-\exp \left(-\boldsymbol{F}_{u}^{T} \boldsymbol{F}_{v}\right)\right)-\sum_{(u, v) \notin E} \boldsymbol{F}_{u}^{T} \boldsymbol{F}_{v} \\ &\equiv \ell(\boldsymbol{F}) \end{aligned}$
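
The log-likelihood can be evaluated directly from this last expression. The sketch below assumes `F` is a list of non-negative membership vectors (one per node) and `edges` is a list of undirected node pairs; the function name and the clamp value `1e-12`, which only keeps the logarithm finite when a connected pair has a zero dot product, are assumptions of this example.

```python
import math
from itertools import combinations

def log_likelihood(F, edges):
    """l(F): sum of log(1 - exp(-Fu.Fv)) over edges minus Fu.Fv over non-edges."""
    edge_set = {frozenset(e) for e in edges}
    total = 0.0
    for u, v in combinations(range(len(F)), 2):
        dot = sum(a * b for a, b in zip(F[u], F[v]))
        if frozenset((u, v)) in edge_set:
            # 1 - exp(-x) computed as -expm1(-x) for accuracy at small x
            total += math.log(max(-math.expm1(-dot), 1e-12))
        else:
            total -= dot
    return total

edges = [(0, 1)]
good = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # nodes 0 and 1 share a community
bad = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]   # nodes 1 and 2 share one instead
print(log_likelihood(good, edges), log_likelihood(bad, edges))
```

The memberships that put the two connected nodes in the same community score much higher than the memberships that group an unconnected pair.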
- Optimizing $\ell(\boldsymbol{F})$
  - Start with random memberships $F$ and iterate until convergence
  - For $u \in V$
    - Update membership $F_u$ for node $u$ while fixing the memberships of all other nodes
    - Specifically, we do gradient ascent, where we make small changes to $F_u$ that lead to an increase in log-likelihood
    $\nabla \ell(F_u)=\sum_{v \in \mathcal{N}(u)}\left(\frac{\exp \left(-F_{u}^{T} F_{v}\right)}{1-\exp \left(-F_{u}^{T} F_{v}\right)}\right) \cdot F_{v}-\sum_{v \notin \mathcal{N}(u)} F_{v}$
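
One row update can be sketched as below. This is a naive version that loops over every other node for each update. The step size, the clamp `1e-12`, the projection back onto non-negative values (to respect the constraint that membership strengths are $\geq 0$), and all function names are assumptions of this sketch, not details given in the notes.

```python
import math

def log_likelihood(F, neighbors):
    total = 0.0
    for u in range(len(F)):
        for v in range(u + 1, len(F)):
            dot = sum(a * b for a, b in zip(F[u], F[v]))
            if v in neighbors[u]:
                total += math.log(max(-math.expm1(-dot), 1e-12))
            else:
                total -= dot
    return total

def gradient(F, neighbors, u):
    """Gradient of the log-likelihood with respect to the row F[u]."""
    grad = [0.0] * len(F[u])
    for v in range(len(F)):
        if v == u:
            continue
        if v in neighbors[u]:
            dot = sum(a * b for a, b in zip(F[u], F[v]))
            # exp(-x) / (1 - exp(-x)) equals 1 / (exp(x) - 1);
            # the clamp avoids division by zero when the dot product is 0
            coeff = 1.0 / max(math.expm1(dot), 1e-12)
        else:
            coeff = -1.0
        for c in range(len(grad)):
            grad[c] += coeff * F[v][c]
    return grad

def update_row(F, neighbors, u, step=0.01):
    """One gradient-ascent step on F[u], holding all other rows fixed."""
    grad = gradient(F, neighbors, u)
    F[u] = [max(0.0, f + step * g) for f, g in zip(F[u], grad)]

F = [[0.5, 0.1], [0.4, 0.2], [0.1, 0.6]]
neighbors = {0: {1}, 1: {0}, 2: set()}   # a single edge 0-1
before = log_likelihood(F, neighbors)
update_row(F, neighbors, 0)
after = log_likelihood(F, neighbors)
print(before, after)
```

With a small enough step the log-likelihood increases after the update, which is the behavior the bullet above describes.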
Time complexity of BigCLAM gradient ascent