Paper: Multi-view graph convolutional networks with attention mechanism - ScienceDirect
Code: Multi-View GCNs with Attention Mechanism (MAGCN)
The English here is typed entirely by hand, summarizing and paraphrasing the original paper, so some unavoidable spelling and grammar mistakes may appear; if you spot any, feel free to point them out in the comments! This post leans toward personal notes, so take it with a grain of salt!
1. TL;DR
1.1. Takeaways
(1)This is a node classification task, unlike the graph classification typically used for brain applications, so to save time I feel it can be skipped...
(2)2022 really was a year of rapid GNN development...
(3)It would be hard to apply to brain networks, though... how would a brain network get multiple views? Surely not a structural connectivity matrix on top of the functional connectivity matrix... Although that is technically possible, the medical literature seems to say that conditions like ASD and AD have little to do with brain structure; they presumably do not change it
(4)Multi-view is a very interesting idea, though I don't know whether it counts as novel in the biochemistry field; I haven't come across it in brain networks yet. It just feels like this kind of approach needs a lot of data; is there really that much raw information available?
1.2. Paper summary figure
2. Section-by-section close reading
2.1. Abstract
①Most GCN-based models rely on fixed adjacency matrices, namely a single-view topology of the underlying graph (single-view topology... emm... I can't explain this term very well)
②⭐However, this limits the model and makes it error-prone when the topology suffers from data collection issues
③They propose a Multi-View Graph Convolutional Network with Attention Mechanism (MAGCN), which combines topological multi-view graphs with an attention-based feature aggregation approach
④MAGCN handles the node classification problem... (welp, that one left me speechless; oh well, I've read this far, might as well continue)
error-prone adj. liable to error; easily going wrong; tending to make mistakes
2.2. Introduction
①For node classification, the graph structure may differ between the training domain and the target domain
②They aim to construct an approximate graph structure based on multi-view graphs, namely with multiple adjacency matrices
③Briefly introduce their model
2.3. Related work
①Spatial-based methods such as diffusion convolutional neural networks (DCNN), GraphSAGE, MoNet, MPNN, and graph isomorphism networks (GIN)
②Spectral-based approaches such as GCN and ChebNet
③Models that consider both topology and attention mechanisms still do not incorporate multi-view graphs
disentangle vt. to free; to extricate; to sort out (confused arguments, ideas, etc.); to untangle; to unravel
2.4. Preliminaries
①An undirected graph $G=(V,E)$, where $V$ denotes the node set and the number of nodes is $n$, $E$ denotes the set of edges, and $A\in\mathbb{R}^{n\times n}$ represents the adjacency matrix
②Then a multi-view graph can be $\mathcal{G}=\{G^{(1)},G^{(2)},\dots,G^{(M)}\}$, where $M$ denotes the number of views. Furthermore, the representation can be simplified as the set of adjacency matrices $\{A^{(1)},A^{(2)},\dots,A^{(M)}\}$
③The graph Fourier transform (GFT) is $\hat{x}=U^{\top}x$, where $U$ denotes the eigenvector matrix of $L=I_n-D^{-1/2}AD^{-1/2}$, the normalized graph Laplacian matrix; $D$ denotes the degree matrix
④The graph convolution operator in the Fourier domain uses a filter $g_\theta=\mathrm{diag}(\theta)$
⑤The graph convolution with convolution operator $\star$:
$g_\theta\star x=Ug_\theta U^{\top}x$
and it can be approximated by:
$g_\theta\star x\approx\sum_{k=0}^{K}\theta_k T_k(\tilde{L})x$
to reduce the computational complexity, where $\tilde{L}=\frac{2}{\lambda_{\max}}L-I_n$ represents the scaled Laplacian matrix, $\lambda_{\max}$ represents the largest eigenvalue of $L$, $T_k(\cdot)$ denotes the Chebyshev polynomial of order $k$, and $\theta_k$ denotes the Chebyshev coefficients
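(Not from the paper; just my own minimal numpy sketch of the Chebyshev approximation above, to make the recurrence concrete. The toy graph and the $\theta$ values are made up.)

```python
import numpy as np

def cheb_filter(A, x, theta):
    """Approximate g_theta * x with a K-th order Chebyshev expansion."""
    n = A.shape[0]
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt       # normalized Laplacian
    lam_max = np.linalg.eigvalsh(L).max()             # largest eigenvalue
    L_tilde = (2.0 / lam_max) * L - np.eye(n)         # scaled Laplacian
    T_prev, T_curr = x, L_tilde @ x                   # T_0(L~)x and T_1(L~)x
    out = theta[0] * T_prev + theta[1] * T_curr
    for k in range(2, len(theta)):
        T_next = 2.0 * L_tilde @ T_curr - T_prev      # Chebyshev recurrence
        out = out + theta[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # toy path graph
x = np.array([1.0, 0.0, -1.0])
print(cheb_filter(A, x, theta=[0.5, 0.3, 0.2]))               # K = 2
```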
⑥The filter, taking the first-order approximation ($K=1$, $\lambda_{\max}\approx 2$), gives the layer-wise rule:
$H^{(l+1)}=\sigma\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}H^{(l)}W^{(l)}\right)$
where $\tilde{A}=A+I_n$;
the added $I_n$ means it contains the self-loops;
$\tilde{D}_{ii}=\sum_j\tilde{A}_{ij}$;
$W^{(l)}$ is the trainable weight matrix of layer $l$
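(Again not the authors' code; a minimal numpy sketch of the ⑥ propagation rule with made-up shapes, to check the renormalization trick end to end.)

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: sigma(D~^-1/2 A~ D~^-1/2 H W) with A~ = A + I."""
    A_tilde = A + np.eye(A.shape[0])                  # add self-loops
    D_inv_sqrt = np.diag(A_tilde.sum(axis=1) ** -0.5)
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt         # symmetric normalization
    return np.maximum(A_hat @ H @ W, 0.0)             # ReLU as the activation

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
H = rng.normal(size=(3, 4))                           # 3 nodes, 4 features
W = rng.normal(size=(4, 2))                           # trainable weight matrix
print(gcn_layer(A, H, W).shape)                       # (3, 2)
```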
2.5. Multi-view graph convolutional network with attention mechanism
①Define the graph $G=(V,E)$, where $V=\{v_1,v_2,\dots,v_n\}$ denotes the nodes with features $x_1,x_2,\dots,x_n$ respectively. Combining all the features, there is a feature matrix $X\in\mathbb{R}^{n\times d}$
②The traditional GCN can be $H^{(l+1)}=\sigma\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}H^{(l)}W^{(l)}\right)$, where $\sigma$ denotes the designed activation function
③They change the single-view graph to a multi-view graph $\{A^{(1)},\dots,A^{(M)}\}$ by information theory (one illustrative way to obtain an extra view is sketched below)
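(How the paper derives the extra views is the information-theoretic part I won't reproduce here. Below is only a hypothetical sketch of one such view, thresholded cosine similarity between node features, matching the feature-similarity view mentioned in §2.7 ⑪; the 0.5 threshold is my own assumption.)

```python
import numpy as np

def similarity_view(X, threshold=0.5):
    """Build an extra adjacency view by thresholding pairwise cosine similarity."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True) # row-normalize features
    S = Xn @ Xn.T                                     # pairwise cosine similarity
    A_view = (S > threshold).astype(float)            # add an edge above threshold
    np.fill_diagonal(A_view, 0.0)                     # drop self-similarity
    return A_view

X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
print(similarity_view(X))                             # a second view besides the given A
```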
④The overall framework, which contains two multi-GCN blocks and one multi-view attention block:
(Why does the multi-view graph have 5 nodes? There are N topologies, each with its feature matrix. What the author means is that it starts as M, becomes F after the unfold step, changes again after GAP, and changes once more after the merge with softmax)
conundrum n. a difficult problem; a riddle; a complex puzzle; a confusing question
(1)Multi-GCN(unfold)block
①Input: the feature matrix and the $M$ adjacency views
②Output: one hidden representation per view, obtained by applying the GCN propagation rule to each view separately, where $M$ denotes the number of views (a minimal sketch follows below)
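(My own minimal sketch of the unfold step, restating the gcn_layer helper from §2.4 so it runs on its own; the two toy views are made up.)

```python
import numpy as np

def gcn_layer(A, H, W):
    A_tilde = A + np.eye(A.shape[0])
    D_inv_sqrt = np.diag(A_tilde.sum(axis=1) ** -0.5)
    return np.maximum(D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W, 0.0)

def multi_gcn_unfold(views, X, W):
    """Run the same GCN layer independently on each of the M adjacency views."""
    return [gcn_layer(A_m, X, W) for A_m in views]

rng = np.random.default_rng(1)
A1 = np.array([[0, 1], [1, 0]], dtype=float)          # view 1: given topology
A2 = np.eye(2)                                        # view 2: a made-up view
outs = multi_gcn_unfold([A1, A2], rng.normal(size=(2, 3)), rng.normal(size=(3, 4)))
print(len(outs), outs[0].shape)                       # 2 views, each (2, 4)
```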
(2)Attention block
①The attention block combines an identity stage and an attention-distribution learning stage. The identity stage passes its input through unchanged, and the attention-distribution learning stage includes global average pooling (GAP) and an MLP
②The schematic of GAP:
③The traditional GAP is $z_c^{(l)}=\frac{1}{n}\sum_{i=1}^{n}H_{ic}^{(l)}$, where $l$ denotes the layer and $c$ indexes the feature channel
④In order to change the weight of each node, the authors propose a graph GAP that additionally aggregates each node's neighborhood before pooling,
where $\mathcal{N}_i^{(m)}$ represents the neighbors of the $i$-th node on the $m$-th view;
this neighborhood term performs the graph aggregation and reflects the improvement of the model
⑤Then, learn the attention weights through the MLP
⑥With the learned attention weights, map the view representations to a single weighted combination (see the sketch below)
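(A much-simplified sketch of my reading of the attention block: pool each view, run a small MLP with the 6-3-M layer sizes reported in §2.7, softmax into view weights, then merge. Plain mean-pooling stands in for the paper's graph GAP, and the random untrained weights are purely illustrative.)

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_merge(view_outputs, rng):
    M = len(view_outputs)
    desc = np.array([H.mean() for H in view_outputs])  # one pooled scalar per view
    W1 = rng.normal(size=(M, 6))                       # MLP weights: trained in
    W2 = rng.normal(size=(6, 3))                       # practice, random here
    W3 = rng.normal(size=(3, M))                       # last layer has M neurons
    scores = np.tanh(np.tanh(desc @ W1) @ W2) @ W3
    alpha = softmax(scores)                            # attention weights over views
    merged = sum(a * H for a, H in zip(alpha, view_outputs))
    return merged, alpha

rng = np.random.default_rng(2)
views = [rng.normal(size=(4, 5)) for _ in range(3)]    # 3 made-up view outputs
merged, alpha = attention_merge(views, rng)
print(merged.shape, alpha.sum())                       # (4, 5), weights sum to 1
```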
(3)Multi-GCN(merge)
①Classify the merged representation with a softmax output layer, where $C$ is the number of classes
②According to the semi-supervised method, they apply the cross-entropy error over all labeled examples as the loss:
$\mathcal{L}=-\sum_{i\in\mathcal{Y}_L}\sum_{c=1}^{C}Y_{ic}\ln\hat{Y}_{ic}$
where $\mathcal{Y}_L$ is the set of labeled nodes and $Y$ is the label indicator matrix
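(A minimal sketch of the loss in ②: only labeled nodes contribute; the toy labels and predictions are made up.)

```python
import numpy as np

def masked_cross_entropy(Y, Y_hat, labeled_idx):
    """L = -sum over labeled nodes i and classes c of Y_ic * ln(Y_hat_ic)."""
    eps = 1e-12                                       # avoid log(0)
    return -np.sum(Y[labeled_idx] * np.log(Y_hat[labeled_idx] + eps))

Y = np.array([[1, 0], [0, 1], [0, 0]], dtype=float)   # node 2 is unlabeled
Y_hat = np.array([[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]])
print(masked_cross_entropy(Y, Y_hat, labeled_idx=[0, 1]))
```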
2.6. Theoretical analysis
The authors rigorously prove mathematically why their proposal is good; this exceeds my mathematical ability, so I will skip it for now
2.7. Experiments
①They apply attack simulations with different levels of topology perturbations to prove the robustness of MAGCN
②The datasets:
③The output dimension of multi-GCN (unfold): 16
④Layers of MLP in attention: 3
⑤Numbers of neurons in the first, second, and last layers: 6, 3, and a number equal to the number of views, respectively
⑥Optimizer: Adam
⑦Learning rate: 0.01
⑧Weight decay: 0.0005
⑨Weight initialization: Glorot uniform initializer
⑩Dropout rate: 0.5
⑪⭐Number of views: all three datasets use 3 views, namely topology, feature similarity between nodes, and text similarity (an edge is added when the similarity value exceeds a threshold)
⑫Comparisons with 10 runs:
⑬Choices in ablation study:
GCN+View 1: GCN with view 1 (the given adjacency matrix)
GCN+View 2: GCN with view 2 (the similarity-based graph)
GCN+View 3: GCN with view 3 (the b-matching graph)
MLP+GCN+View 1,2,3: GCN with three views via a standard MLP
MAGCN+View 1,2,3: the authors' MAGCN with three views
and the comparison:
⑭Results visualized by t-SNE (the left is GCN and the right is MAGCN):
⑮Robustness analysis with random topology attack (RTA): randomly delete some edges at a rate ranging from 0.1 to 1 (a minimal sketch of this attack follows after this list):
⑯Robustness analysis with low label rates (LLR): label rate sets are {0.025, 0.02, 0.015, 0.01, 0.005}:
⑰The other MAGCN variant uses cosine similarity: $\cos(x_i,x_j)=\frac{x_i^{\top}x_j}{\lVert x_i\rVert\,\lVert x_j\rVert}$. There are ablation choices as well:
GCN+View 1: GCN with view 1, i.e., the given adjacency matrix
GCN+View 2: GCN with view 2, i.e., the similarity-based graph
GCN+View 2*: GCN with view 2*, i.e., the trainable weighted similarity-based graph built on cosine similarity
MAGCN+View 1,2: MAGCN with view 1 and view 2
MAGCN+View 1,2*: MAGCN with view 1 and view 2*
and the comparison:
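(As promised above, a minimal sketch of the RTA perturbation in ⑮: randomly delete a fraction of the existing undirected edges; the exact sampling scheme is my own assumption.)

```python
import numpy as np

def random_topology_attack(A, rate, seed=0):
    """Randomly remove `rate` of the existing edges of an undirected graph."""
    rng = np.random.default_rng(seed)
    iu, ju = np.triu_indices_from(A, k=1)             # upper-triangle entries
    edges = np.flatnonzero(A[iu, ju] > 0)             # indices of existing edges
    drop = rng.choice(edges, size=int(round(rate * len(edges))), replace=False)
    A_atk = A.copy()
    A_atk[iu[drop], ju[drop]] = 0.0                   # delete the edge
    A_atk[ju[drop], iu[drop]] = 0.0                   # keep the matrix symmetric
    return A_atk

A = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]], dtype=float)
print(random_topology_attack(A, rate=0.5))            # roughly half the edges gone
```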
2.8. Conclusion
The MAGCN is able to capture the node features from different hops of neighbors
3. Supplementary knowledge
3.1. Discrete convolution
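The one-line definition: $(y*k)[n]=\sum_m y[m]\,k[n-m]$. A tiny numpy check of it (the toy signal and kernel are made up):

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])                         # toy signal
k = np.array([0.5, 0.5])                              # simple averaging kernel
print(np.convolve(y, k))                              # [0.5, 1.5, 2.5, 1.5]
```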
3.2. Information theory
Suggested reading: 信息论入门教程 - 阮一峰的网络日志 (ruanyifeng.com) (an introductory tutorial on information theory)
4. Reference List
Yao K. et al. (2022) 'Multi-view graph convolutional networks with attention mechanism', Artificial Intelligence, 307.