Graph Clustering with Graph Neural Networks

lynn_lxy

Published 2023-12-13 19:28:51


Tags: python, clustering, deep learning

Article link: https://blog.csdn.net/qq_59269283/article/details/134976346


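(Preface: this is my first paper-reading note as a beginner. I wrote it to explain the paper to myself in my own words, so many parts are not very detailed. Corrections are welcome.)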

I. Main idea: Deep Modularity Networks (DMoN)

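Before going into details, here are two questions I had before reading the paper.

1. What is DMoN?

It is a layer that is plugged into a GNN, much like the convolution and pooling layers that commonly appear in GNNs.

2. What does DMoN do here?

As the title says, it performs graph clustering, and it does so by optimizing modularity.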

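This brings us to one of the preliminaries covered in the paper, modularity.

Modularity measures the difference between the number of edges that actually fall inside clusters and the number expected under a null model. It is defined as

$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{d_i d_j}{2m} \right] \delta(c_i, c_j)$$

Here m is the number of edges in the graph (so 2m is the sum of all node degrees), A_ij is the adjacency-matrix entry for nodes i and j, d_i and d_j are the degrees of nodes i and j, and δ(c_i, c_j) = 1 when the two nodes are in the same cluster and 0 otherwise. Note that modularity is bounded between -1/2 and 1.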
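To make the modularity definition above concrete, here is a minimal NumPy sketch that evaluates Q for a hard partition. The function name `modularity` and the toy graph (two triangles joined by one edge) are assumptions for illustration and do not come from the paper.

```python
import numpy as np


def modularity(adjacency, labels):
    """Modularity Q of a hard partition of an undirected graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    degrees = adjacency.sum(axis=1)                     # d_i
    two_m = degrees.sum()                               # 2m = sum of all degrees
    same_cluster = labels[:, None] == labels[None, :]   # delta(c_i, c_j)
    null_model = np.outer(degrees, degrees) / two_m     # d_i * d_j / 2m
    return float(((adjacency - null_model) * same_cluster).sum() / two_m)


# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

q_good = modularity(A, [0, 0, 0, 1, 1, 1])    # one cluster per triangle: 5/14
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one cluster: 0
print(q_good, q_single)
```

Putting every node in a single cluster always gives Q = 0, because the observed and expected edge counts cancel exactly; a partition that follows the two triangles scores higher.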

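Because maximizing modularity is known to be NP-hard, the paper introduces spectral modularity maximization, which uses a spectral relaxation to make the problem tractable. In matrix form the objective is

$$Q = \frac{1}{2m} \operatorname{Tr}(C^\top B C)$$

C is the n x k cluster assignment matrix, d is the degree vector, and B is the modularity matrix:

$$B = A - \frac{d d^\top}{2m}$$

So what is C in practice? Once the entries of C are relaxed from binary to real values, the optimal C is the matrix whose columns are the eigenvectors of B belonging to its k largest eigenvalues.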
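The spectral view can be checked numerically. The sketch below builds the modularity matrix B for a small toy graph (an assumption for illustration), evaluates the trace form for a one-hot C, and uses the sign pattern of the leading eigenvector of B as a two-way split. The sign-based split is the classic two-cluster heuristic, shown here only to illustrate the relaxation; it is not the procedure DMoN itself uses.

```python
import numpy as np

# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

d = A.sum(axis=1)                # degree vector
two_m = d.sum()                  # 2m
B = A - np.outer(d, d) / two_m   # modularity matrix

# Hard assignment written as a one-hot n x k matrix C.
labels = np.array([0, 0, 0, 1, 1, 1])
C = np.eye(2)[labels]
q_trace = float(np.trace(C.T @ B @ C) / two_m)

# Relaxed problem: eigen-decomposition of the symmetric matrix B.
eigenvalues, eigenvectors = np.linalg.eigh(B)   # eigenvalues in ascending order
leading = eigenvectors[:, -1]                   # eigenvector of the largest eigenvalue
split = (leading > 0).astype(int)               # sign pattern as a 2-way split
print(q_trace, eigenvalues[-1], split)
```

On this graph the leading eigenvector separates the two triangles; which triangle gets label 0 depends on the arbitrary sign of the eigenvector.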

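Back to the main question for now: what is DMoN? According to the paper, it is a method for attributed graph clustering with graph neural networks. Inspired by the modularity quality function and its spectral optimization, the paper proposes a fully differentiable unsupervised clustering objective that optimizes soft cluster assignments and uses a null model to control for inhomogeneities in the graph.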

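First, the soft cluster assignment C is obtained with a softmax, where X is the node feature matrix:

$$C = \operatorname{softmax}(\operatorname{GCN}(\tilde{A}, X))$$

Regarding the GCN in this formula, the paper's Preliminaries note two changes to the classic GCN architecture: ReLU is replaced with SELU, and self-loop creation is removed in favour of a trainable skip connection W_skip.

Here Ã is the normalized adjacency matrix, with D the diagonal degree matrix. It is simple to compute:

$$\tilde{A} = D^{-1/2} A D^{-1/2}$$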
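The pieces described above (normalized adjacency without self-loops, a SELU activation, a skip connection, and a row-wise softmax) can be sketched in plain NumPy. This is one possible reading, not the authors' implementation: the layer form `selu(A_norm @ X @ W + X @ W_skip)`, the helper names, and the random weights and features are all assumptions for illustration.

```python
import numpy as np

SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """SELU activation; the exponent is clipped at 0 to avoid overflow."""
    return SELU_SCALE * np.where(x > 0, x, SELU_ALPHA * (np.exp(np.minimum(x, 0.0)) - 1.0))


def normalize_adjacency(adjacency):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, without adding self-loops."""
    degrees = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(degrees)
    nonzero = degrees > 0
    inv_sqrt[nonzero] = degrees[nonzero] ** -0.5
    return adjacency * inv_sqrt[:, None] * inv_sqrt[None, :]


def gcn_layer(norm_adj, features, weight, weight_skip):
    """Assumed layer form: graph convolution plus a trainable skip connection."""
    return selu(norm_adj @ features @ weight + features @ weight_skip)


def softmax(logits):
    """Numerically stable row-wise softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Toy inputs (hypothetical): 6 nodes, 4 features, k = 2 clusters.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
W = rng.normal(size=(4, 2))
W_skip = rng.normal(size=(4, 2))

A_norm = normalize_adjacency(A)
C = softmax(gcn_layer(A_norm, X, W, W_skip))   # soft assignments, shape [n, k]
print(C.round(3))
```

Each row of C is a probability distribution over the k clusters, which is what makes the assignment "soft" and the objective differentiable.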

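At this point both B and C, everything needed for spectral modularity maximization, are available, so a clustering objective that simply maximizes modularity seems within reach. The DMoN objective, however, is:

$$\mathcal{L}_{\mathrm{DMoN}}(C; A) = -\frac{1}{2m} \operatorname{Tr}(C^\top B C) + \frac{\sqrt{k}}{n} \left\| \sum_i C_i^\top \right\|_F - 1$$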

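Besides modularity, DMoN clearly takes a second factor into account: collapse regularization.

If the assignment matrix C is left unconstrained, problems arise easily. The paper points out that, without constraints on C, spectral clustering for both the min-cut and the modularity objectives has spurious local minima: assigning all nodes to the same cluster is a trivial locally optimal solution, and gradient-based optimizers get trapped in it. Collapse regularization is there to prevent this.

Collapse regularization is a relaxed constraint that prevents trivial partitions without dominating the optimization of the main objective. It sums the rows of C to obtain the soft cluster sizes and takes the Frobenius norm of the result; with the scaling and offset in the objective, it is 0 when the clusters are perfectly balanced and reaches √k - 1 when every node is in one cluster. In addition to collapse regularization, the authors apply dropout before the softmax to further discourage trivial clusters.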

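The appeal of this formulation is that optimizing the collapse regularizer avoids trivial clusterings without affecting the asymptotic behaviour of the overall objective.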
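The behaviour of the collapse regularizer is easy to verify on small assignment matrices. The sketch below assumes the form sqrt(k)/n * ||sum_i C_i||_F - 1; the function name and the three test matrices are hypothetical.

```python
import numpy as np


def collapse_regularizer(assignments):
    """sqrt(k) / n * ||column sums of C||_F - 1 for an [n, k] assignment matrix."""
    n, k = assignments.shape
    cluster_sizes = assignments.sum(axis=0)   # soft cluster sizes, shape [k]
    return float(np.sqrt(k) / n * np.linalg.norm(cluster_sizes) - 1.0)


n, k = 6, 3
balanced = np.eye(k)[np.arange(n) % k]          # two nodes in each of the 3 clusters
collapsed = np.eye(k)[np.zeros(n, dtype=int)]   # every node in cluster 0
uniform = np.full((n, k), 1.0 / k)              # maximally uncertain soft assignment

r_balanced = collapse_regularizer(balanced)     # 0 for perfectly balanced clusters
r_collapsed = collapse_regularizer(collapsed)   # sqrt(k) - 1 for a full collapse
r_uniform = collapse_regularizer(uniform)       # also 0: only cluster sizes matter
print(r_balanced, r_collapsed, r_uniform)
```

The uniform case shows that the regularizer only looks at cluster sizes: it penalizes unbalanced clusters but does not by itself push assignments toward being confident, which is left to the modularity term.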

II. Code

import tensorflow as tf


class DMoN(tf.keras.layers.Layer):

  def __init__(self,
               n_clusters,
               collapse_regularization=0.1,
               dropout_rate=0,
               do_unpooling=False):
    """Initializes the layer with specified parameters."""
    super(DMoN, self).__init__()
    self.n_clusters = n_clusters
    self.collapse_regularization = collapse_regularization
    self.dropout_rate = dropout_rate
    self.do_unpooling = do_unpooling

  def build(self, input_shape):
    """Builds the Keras model according to the input shape."""
    self.transform = tf.keras.models.Sequential([
        tf.keras.layers.Dense(
            self.n_clusters,
            kernel_initializer='orthogonal',
            bias_initializer='zeros'),
        tf.keras.layers.Dropout(self.dropout_rate)
    ])
    super(DMoN, self).build(input_shape)

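  def call(self, inputs):
    """Performs DMoN clustering according to input features and input graph."""
    # inputs is a (features, adjacency) pair: dense [n, d] node features and
    # a sparse [n, n] adjacency matrix.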
    features, adjacency = inputs

    assert isinstance(features, tf.Tensor)
    assert isinstance(adjacency, tf.SparseTensor)
    assert len(features.shape) == 2
    assert len(adjacency.shape) == 2
    assert features.shape[0] == adjacency.shape[0]

    assignments = tf.nn.softmax(self.transform(features), axis=1)
    cluster_sizes = tf.math.reduce_sum(assignments, axis=0)  # Size [k].
    assignments_pooling = assignments / cluster_sizes  # Size [n, k].

    degrees = tf.sparse.reduce_sum(adjacency, axis=0)  # Size [n].
    degrees = tf.reshape(degrees, (-1, 1))

    number_of_nodes = adjacency.shape[1]
    # Sum of all degrees; for a symmetric adjacency this is twice the edge count.
    number_of_edges = tf.math.reduce_sum(degrees)

    # Computes the size [k, k] pooled graph as S^T*A*S in two multiplications.
    graph_pooled = tf.transpose(
        tf.sparse.sparse_dense_matmul(adjacency, assignments))
    graph_pooled = tf.matmul(graph_pooled, assignments)

    # We compute the rank-1 normalizer matrix S^T*d*d^T*S efficiently
    # in three matrix multiplications by first processing the left part S^T*d
    # and then multiplying it by the right part d^T*S.
    # Left part is [k, 1] tensor.
    normalizer_left = tf.matmul(assignments, degrees, transpose_a=True)
    # Right part is [1, k] tensor.
    normalizer_right = tf.matmul(degrees, assignments, transpose_a=True)

    # Normalizer is rank-1 correction for degree distribution for degrees of the
    # nodes in the original graph, cast to the pooled graph.
    normalizer = tf.matmul(normalizer_left,
                           normalizer_right) / 2 / number_of_edges
    spectral_loss = -tf.linalg.trace(graph_pooled -
                                     normalizer) / 2 / number_of_edges
    self.add_loss(spectral_loss)

    collapse_loss = tf.norm(cluster_sizes) / number_of_nodes * tf.sqrt(
        float(self.n_clusters)) - 1
    self.add_loss(self.collapse_regularization * collapse_loss)

    features_pooled = tf.matmul(assignments_pooling, features, transpose_a=True)
    features_pooled = tf.nn.selu(features_pooled)
    if self.do_unpooling:
      features_pooled = tf.matmul(assignments_pooling, features_pooled)
    return features_pooled, assignments

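The key part is clearly the third method, call. The softmax output, assignments, is the matrix C from the paper. assignments_pooling is C with each column divided by its cluster size; it is used only to pool node features into per-cluster averages and does not enter the loss. spectral_loss and collapse_loss correspond to the negated modularity term and the collapse regularization term; both are registered with add_loss so that they are minimized during training. Note that number_of_edges is computed as the sum of all degrees, which equals 2m for a symmetric adjacency matrix, so the normalizing constants in the code as written are not identical to those in the formula.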
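The reason call computes the pooled graph and a rank-1 normalizer separately is that the dense n x n matrix B never has to be built. The NumPy sketch below mirrors that factorization and checks it against the dense trace form. It follows the 2m convention of the paper's formula rather than the `number_of_edges` variable in the code above; the function name and toy inputs are assumptions for illustration.

```python
import numpy as np


def spectral_loss(adjacency, assignments):
    """-Tr(C^T B C) / 2m, computed without forming the dense matrix B."""
    degrees = adjacency.sum(axis=1, keepdims=True)            # d, shape [n, 1]
    two_m = degrees.sum()
    graph_pooled = assignments.T @ adjacency @ assignments    # C^T A C, [k, k]
    pooled_degrees = assignments.T @ degrees                  # C^T d, [k, 1]
    normalizer = pooled_degrees @ pooled_degrees.T / two_m    # C^T d d^T C / 2m
    return float(-np.trace(graph_pooled - normalizer) / two_m)


# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Random soft assignments with rows summing to 1.
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 2))
C_soft = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Dense reference: build B explicitly and take the trace.
d = A.sum(axis=1)
B = A - np.outer(d, d) / d.sum()
loss_factored = spectral_loss(A, C_soft)
loss_dense = float(-np.trace(C_soft.T @ B @ C_soft) / d.sum())

# For a hard assignment the loss is minus the modularity of the partition.
C_hard = np.eye(2)[np.array([0, 0, 0, 1, 1, 1])]
loss_hard = spectral_loss(A, C_hard)
print(loss_factored, loss_dense, loss_hard)
```

With a sparse adjacency matrix the factored form costs roughly the number of edges times k, whereas the dense form needs memory and time quadratic in the number of nodes.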
