[Close Reading] Multi-View Attribute Graph Convolution Networks for Clustering

Paper link: Multi-View Attribute Graph Convolution Networks for Clustering | IJCAI

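The English is typed entirely by hand! It is a summary and paraphrase of the original paper. Unavoidable spelling and grammar errors may appear; if you spot any, corrections in the comments are welcome! This post leans toward personal notes, so read with caution.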


1.1. Thoughts


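(2)It seems that multi-modal work mostly refers to sMRI, fMRI and DTI. But EEG can apparently also be turned into a brain graph, so why have I not yet seen EEG used as a fourth modality? I also noticed that these modalities do not share a brain atlas (AAL seems usable for both fMRI and sMRI), so how would the node counts be aligned?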


2. Section-by-Section Close Reading

2.1. Abstract

        ①Existing GNNs ignore the node features (in fact some of them do use node features) and graph reconstruction (judging from the later sections, the authors seem to regard one encoding pass followed by one decoding pass as graph reconstruction)

        ②They proposed novel Multi-View Attribute Graph Convolution Networks (MAGCN) with two-pathway encoders for clustering. The first pathway is multi-view attribute graph attention networks, which reduce noise and redundancy and learn the embedding features of multi-view graph data. The second pathway is consistent embedding encoders, which capture the geometric relationship and the consistency of probability distribution among different views

2.2. Introduction 

        ①Graph embedding is the ability to transform graph data into a low-dimensional (fewer features), compact (less scattered) and continuous (not discrete) feature space

        ②GNN is suitable for handling single-view data rather than multi-view

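        ③The limits of existing multi-view models: a) they cannot assign different weights to different neighbors, b) they might ignore the node features or graph reconstruction, c) they do not consider the similarities between different views (oh, this can be taken into account too? I want to see how they do it)

        ④Existing GNNs mostly focus on multi-graph (multiple graphs of what, again?) instead of multi-attribute (why use the social network example? They say people can have several attributes such as job and hobbies, but then what would the attributes of a brain graph be?)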

 paragon  n. a model of perfection, a perfect person (or thing); a flawless diamond (of 100 carats or more)

2.3. Related Work

        The authors enumerate some neighbor-aggregation, attention-based and multi-view models

2.4. Proposed Methodology

2.4.1. Notation

        ①Define a graph \mathbf{G}=\mathbf{(V,E)}(\mathbf{G}\in\mathbb{R}^{n\times n}), where \mathbf{V}=\{v_{1},v_{2},...,v_{n}\} denotes the node set, \mathbf{E} denotes the edge set and n denotes the number of nodes

        ②The attribute features of the nodes: \mathbf{X}_{m}=\{x_{m}^{1},...,x_{m}^{i},...,x_{m}^{n}\}(\mathbf{X}_{m}\in \mathbb{R}^{n\times d_{m}}),m=1,2,...,M, where M denotes the number of views

2.4.2. The Framework of MAGCN

        ①The overall framework:

They first encode \mathbf{X}_{m} into the graph embedding \mathbf{H}_{m}=\{h_{m}^{1},...,h_{m}^{i},...,h_{m}^{n}\}(\mathbf{H}_{m}\in \mathbb{R}^{n\times d}) with the multi-view attribute graph convolution encoders (green). The consistent embedding encoders (purple) then transform \mathbf{H}_{m} into the consistent clustering embedding \mathbf{Z}

(1)Multi-view Attribute Graph Convolution Encoder

        ①The graph embedding function can be simply expressed as f_m(\mathbf{G},\mathbf{X}_m;\theta)\to\mathbf{H}_m, where \theta denotes the auto-encoder parameter

        ② Part of Multi-view Attribute Graph Convolution Encoder (MAGCE) for view m:

        ③The l-th output of MAGCE:


where \mathbf{G^{\prime}}=\mathbf{G}+\mathbf{I}_{N} denotes the "relevance coefficient matrix with added self-connection" (is this the functional connectivity matrix plus I, the adjacency matrix plus I, or something else?);


\sigma denotes the activation function

        ④The layer index l starts from 0 and ends at L

        ⑤The learnable relevance matrix \mathbf{S} in the l-th layer:

\mathbf{S}=\varphi\left(\mathbf{G}\odot t_s^{(l)}\mathbf{H}_m^{(l)}\mathbf{W}^{(l)}+\mathbf{G}\odot t_r^{(l)}\mathbf{H}_m^{(l)}\mathbf{W}^{(l)}\right)

where t_s^{(l)} and t_r^{(l)}\in\mathbb{R}^{1\times d_l} denote the trainable parameters and \varphi denotes the activation function

        ⑥Normalizing \mathbf{S} to get the final relevance coefficient \mathbf{G}:


where \mathbf{N}_{i} denotes the neighbors of node i
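
The normalization equation itself is not reproduced in these notes. Below is a minimal sketch of step ⑥, assuming the normalization is a softmax over each node's neighbors (as is common in graph attention networks). The function name `normalize_relevance` and the toy values are hypothetical and not taken from the paper.

```python
import math

def normalize_relevance(S, neighbors):
    """Row-wise softmax of S restricted to each node's neighbor set.

    S: n x n list of lists holding the raw relevance scores.
    neighbors: neighbors[i] is the list of neighbor indices N_i of node i.
    Assumption: softmax normalization; the notes do not show the exact formula.
    """
    n = len(S)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if not neighbors[i]:
            continue  # isolated node: its row stays at zero
        # Subtract the largest score for numerical stability.
        row_max = max(S[i][j] for j in neighbors[i])
        exps = {j: math.exp(S[i][j] - row_max) for j in neighbors[i]}
        total = sum(exps.values())
        for j, e in exps.items():
            G[i][j] = e / total
    return G

# Toy example: 3 nodes, node 1 has a single neighbor.
S = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 0.5],
     [2.0, 0.5, 0.0]]
neighbors = [[1, 2], [0], [0, 1]]
G = normalize_relevance(S, neighbors)
```

Under this assumption each row of \mathbf{G} sums to 1 over the neighbors of node i, and non-neighbors keep a coefficient of 0.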

        ⑦The output of the \left ( l-1 \right )-th layer of the multi-view attribute graph convolution decoders:


        ⑧The reconstructed graph structure is \mathbf{\hat{G}}_{m}^{ij}=\phi\left(-{h_{m}^{i}}^{\top}h_{m}^{j}\right), where \phi \left ( \cdot \right ) denotes the inner product operator

        ⑨The reconstruction loss:


(2)Consistent Embedding Encoders

        ①Reducing the dimensionality of graph embeddings by mapping function:


where \eta denotes the encoder parameter

        ②The similarity between two views can be measured by Manhattan distance, Euclidean distance, cosine similarity, etc.

        ③The loss function of geometric relationship consistency:

\mathcal{L}_{geo}=\min_{\eta}\sum_{i\neq j}^{M}\left\|\mathbf{Z}_{i}-\mathbf{Z}_{j}\right\|_{F}^{2}
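
A minimal sketch of how the value of \mathcal{L}_{geo} could be computed for fixed per-view embeddings, using plain nested lists. The helper names `frobenius_sq` and `geometric_loss` are hypothetical. Following the index i\neq j in the formula, every unordered pair of views is counted twice.

```python
def frobenius_sq(A, B):
    """Squared Frobenius norm of A - B for two equally shaped matrices."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

def geometric_loss(Z_views):
    """Sum of squared Frobenius distances over all ordered view pairs i != j."""
    M = len(Z_views)
    return sum(frobenius_sq(Z_views[i], Z_views[j])
               for i in range(M)
               for j in range(M)
               if i != j)

# Toy example with M = 2 views, n = 2 nodes, 2-dimensional embeddings.
Z1 = [[1.0, 0.0], [0.0, 1.0]]
Z2 = [[1.0, 1.0], [0.0, 1.0]]
loss = geometric_loss([Z1, Z2])
```

The loss is zero only when all views produce identical embeddings, which is what pushes the views toward a shared geometry.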

        ④Defining the adaptive fusion \mathbf{Z}=\sum_{m=1}^{M}\beta_{m}\mathbf{Z}_{m}
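
A minimal sketch of the fusion step as a weighted sum of the per-view embeddings. The notes do not say how the weights \beta are obtained, so here they are simply passed in as fixed numbers. `fuse_views` is a hypothetical name.

```python
def fuse_views(Z_views, betas):
    """Weighted sum of per-view embedding matrices (nested lists)."""
    n, d = len(Z_views[0]), len(Z_views[0][0])
    Z = [[0.0] * d for _ in range(n)]
    for beta, Z_m in zip(betas, Z_views):
        for r in range(n):
            for c in range(d):
                Z[r][c] += beta * Z_m[r][c]
    return Z

# Toy example: two views of a single node, weights chosen by hand.
Z = fuse_views([[[1.0, 0.0]], [[0.0, 1.0]]], [0.25, 0.75])
```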

        ⑤The original probability distribution \mathbf{Q} of \mathbf{Z} with t-distribution: 


where \{\mu_{j}\}_{j=1}^{k} denotes the k initial cluster centroids, \alpha denotes the degrees of freedom and q_{ij} denotes the probability of assigning node i to cluster j
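
The equation for q_{ij} is not reproduced in these notes. Below is a minimal sketch, assuming the Student's t kernel commonly used in deep embedded clustering: q_{ij} is proportional to (1+\|z_i-\mu_j\|^2/\alpha)^{-(\alpha+1)/2} and is normalized over the k clusters. The function name `soft_assign` and the toy data are hypothetical.

```python
def soft_assign(Z, centroids, alpha=1.0):
    """Soft cluster assignments Q from embeddings Z and centroids.

    Assumption: Student's t kernel with alpha degrees of freedom,
    normalized over clusters so that each row of Q sums to 1.
    """
    Q = []
    for z in Z:
        kernel = []
        for mu in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
            kernel.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(kernel)
        Q.append([k_val / total for k_val in kernel])
    return Q

# Toy example: each node sits exactly on one of the two centroids.
Z = [[0.0, 0.0], [2.0, 2.0]]
centroids = [[0.0, 0.0], [2.0, 2.0]]
Q = soft_assign(Z, centroids)
```

With \alpha=1 and a squared distance of 8 between the centroids, the node on the first centroid gets assignments of roughly 0.9 and 0.1.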

        ⑥The target probability distribution \mathbf{P} of \mathbf{Z}:


where f_{j}=\sum_{i}q_{ij} denotes soft cluster frequencies
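
The equation for p_{ij} is not reproduced in these notes. Below is a minimal sketch, assuming the usual deep-embedded-clustering target: square each q_{ij}, divide by the soft cluster frequency f_j, then renormalize every row. The function name `target_distribution` and the toy \mathbf{Q} are hypothetical.

```python
def target_distribution(Q):
    """Sharpened target distribution P computed from soft assignments Q.

    Assumption: p_ij is proportional to q_ij^2 / f_j with f_j = sum_i q_ij.
    """
    k = len(Q[0])
    f = [sum(row[j] for row in Q) for j in range(k)]  # soft cluster frequencies
    P = []
    for row in Q:
        weights = [row[j] ** 2 / f[j] for j in range(k)]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P

# Toy example: two nodes, two clusters.
Q = [[0.9, 0.1], [0.2, 0.8]]
P = target_distribution(Q)
```

Under this assumption \mathbf{P} puts more mass than \mathbf{Q} on the already dominant cluster of each node, which is why it can serve as a self-training target.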



2.4.3. Task for Clustering

        ①Total loss function:

{\mathcal L}=\min_{g,c,\mathbf{P}}{\mathcal L}_{re}+\lambda_{2}{\mathcal L}_{geo}+\lambda_{3}{\mathcal L}_{pro}

        ②Clustering label of node i: y_i=\arg\max_k\left(p_{ik}\right)
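
A minimal sketch of the label assignment: each node takes the index of its largest entry in \mathbf{P}. The function name `cluster_labels` and the toy matrix are hypothetical.

```python
def cluster_labels(P):
    """Index of the largest probability in each row of P."""
    return [max(range(len(row)), key=row.__getitem__) for row in P]

# Toy example: node 0 goes to cluster 1, node 1 to cluster 0.
labels = cluster_labels([[0.1, 0.9], [0.7, 0.3]])
```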

2.5. Experiments

2.5.1. Experimental Setting

(1)Metrics and Databases

        ①Dataset: Cora, Citeseer, Pubmed

        ②Evaluation metrics: clustering accuracy (ACC), normalized mutual information (NMI) and adjusted Rand index (ARI)

        ③Constructing view 2: applying Fast Fourier Transform (FFT), Gabor transform, Euler transform and Cartesian product to view 1
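
The notes do not define ACC. Below is a minimal sketch, assuming the usual definition for clustering: cluster indices are arbitrary, so the accuracy is taken under the best one-to-one mapping from predicted clusters to true classes. This toy version tries every permutation, which is only workable for a handful of clusters. The function name `clustering_accuracy` and the labels are hypothetical.

```python
from itertools import permutations

def clustering_accuracy(y_true, y_pred, k):
    """Best-mapping accuracy for labels in range(k), by brute force."""
    best_hits = 0
    for perm in permutations(range(k)):
        # perm[c] is the true class assigned to predicted cluster c.
        hits = sum(1 for t, p in zip(y_true, y_pred) if perm[p] == t)
        best_hits = max(best_hits, hits)
    return best_hits / len(y_true)

# Toy example: clusters 0 and 1 are swapped and one node is misassigned.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 0, 0, 2, 1]
acc = clustering_accuracy(y_true, y_pred, k=3)
```

For many clusters the factorial search is impractical. The Hungarian algorithm (for instance `scipy.optimize.linear_sum_assignment`) is the usual choice for finding the mapping.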

(2)Implementation Details

        ①The node representation dimensions of the two layers are [512, 512] for Cora, [2000, 512] for Citeseer and [128, 64] for Pubmed

        ②They adopt a fully connected layer in the integrate-encoder for all datasets

        ③Activation function: ReLU

        ④\lambda _1=1,\lambda _2=10^{-2},\lambda _3=10^2

(3)Comparison Algorithms

        ①node attribute: K-Means

        ②graph structure: Graph Encoder, DeepWalk, denoising autoencoder for graph embedding (DNGR) and modularized nonnegative matrix factorization (M-NMF)

        ③graph structure & node attribute: graph autoencoders (GAE) and variational graph auto-encoders (VGAE), marginalized graph autoencoder (MGAE), adversarial regularized graph autoencoder (ARGAE) and adversarial variational regularized graph autoencoder (ARVGAE), deep attentional embedding graph clustering (DAEGC) and graph attention auto-encoders (GATE)

        ④deep multi-view clustering: deep canonical correlation analysis (DCCA) and deep canonically correlated autoencoders (DCCAE)

2.5.2. Experimental Results

(1)Evaluation Metrics with Comparison Algorithms

        ①Comparison table:

(2)Analysis of Probability Distribution Consistency

        ①Over the iterations, \mathbf{Q}_1, \mathbf{Q}_2 and \mathbf{P} steadily become more accurate at prediction:

where the x-axis denotes the clusters and the y-axis denotes the cluster probability

(3)Impact of Parameters

        ①The control-variable method (varying one parameter while fixing the others) is used to analyze the three regularization parameters:

(4)Analyzing Different View 2

        ①Comparison of different methods for constructing view 2:

2.6. Conclusions

        As a model which contains dual encoders, MAGCN reconstructs the high dimensional features and integrates low dimensional consistent information

3. Supplementary Knowledge

3.1. Discrete Data



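  1. Categorical data: for example, a person's sex is usually described with labels such as "male" or "female", and these labels are not continuous. Similarly, blood type (A, B, AB, O) or ethnicity is also discontinuous.

  2. Integer features: when a feature can only take integer values, the feature space is also discontinuous. Examples are the count of an object or the rating level of some indicator.

  3. Binary features: a binary feature can only take 0 or 1, so its feature space is clearly discontinuous. This is common in computer vision and machine learning applications, for example whether a certain feature is activated or present.

  4. Timestamp data: time itself is continuous, but when data is recorded at specific intervals (hours, days, months, etc.), the feature space becomes discontinuous.

  5. Geographic data: in geographic information systems, coordinates (longitude and latitude) can in theory be continuous, but because of limits in data acquisition or processing needs, data may only be recorded at specific locations, which yields a discontinuous feature space.

  6. Gene sequence data: in bioinformatics, a gene sequence consists of a series of discrete base pairs (A, T, C, G), and the changes between them are discontinuous.

  7. Text data: when processing text, words or phrases serve as features, and the transitions between them are usually discontinuous. Methods such as word embeddings can map text into a continuous space, but the original space of words or phrases is still discontinuous.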



3.2. Multi-modality and multi-view


My impression is that, for brain graphs, different imaging methods (EEG, fMRI, sMRI, DTI, CT) and the like are what is called multi-modality





4. Reference List

Cheng, J. et al. (2020) 'Multi-View Attribute Graph Convolution Networks for Clustering', Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, pp. 2973-2979. doi: Multi-View Attribute Graph Convolution Networks for Clustering | IJCAI

