[Paper] Semi-Supervised Classification with Graph Convolutional Networks

This paper introduces a scalable and efficient semi-supervised learning method based on graph convolutional networks that operates directly on graphs. Via a localized first-order approximation of spectral graph convolutions, the model scales linearly in the number of graph edges while encoding both local graph structure and node features. Experiments show that this approach outperforms existing methods on a variety of datasets.

We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes. In a number of experiments on citation networks and on a knowledge graph dataset we demonstrate that our approach outperforms related methods by a significant margin.

We present

  • a scalable approach
  • for semi-supervised learning
  • on graph-structured data
  • based on an efficient variant of convolutional neural networks
  • which operate directly on graphs.

Questions:

  1. How does the paper achieve scalability?
  2. How does the paper achieve efficiency?
  3. Why emphasize "directly on graphs"? Is there such a thing as operating indirectly on a graph?

We motivate the choice of our convolutional architecture

  • via a localized first-order approximation
  • of spectral graph convolutions.

Questions:
4. What is a "localized first-order approximation"?

Our model

  • scales linearly in the number of graph edges
  • and learns hidden layer representations that encode
    • both local graph structure
    • and features of nodes.

Questions:
5. What does "scales linearly in the number of graph edges" mean?
6. How does the model encode both local graph structure and node features?

In a number of experiments on citation networks and on a knowledge graph dataset we demonstrate that our approach outperforms related methods by a significant margin.


1 Introduction

We consider the problem of classifying nodes (such as documents) in a graph (such as a citation network), where labels are only available for a small subset of nodes. This problem can be framed as graph-based semi-supervised learning, where label information is smoothed over the graph via some form of explicit graph-based regularization (Zhu et al., 2003; Zhou et al., 2004; Belkin et al., 2006; Weston et al., 2012), e.g. by using a graph Laplacian regularization term in the loss function:

Consider a node classification problem on a graph where labels are known for only a subset of nodes, i.e. graph-based semi-supervised learning. The loss function typically adds a regularization term for smoothing, such as a graph Laplacian regularization term:

$$\mathcal{L} = \mathcal{L}_0 + \lambda \mathcal{L}_{\text{reg}}, \quad \text{with} \quad \mathcal{L}_{\text{reg}} = \sum_{i,j} A_{ij}\,\| f(X_i) - f(X_j) \|^2 = f(X)^\top \Delta f(X)$$

where $\mathcal{L}_0$ is the supervised loss on the labeled nodes, $f(\cdot)$ is a differentiable function such as a neural network, $A$ is the adjacency matrix, and $\Delta = D - A$ is the unnormalized graph Laplacian. The regularization term minimizes the distance between the predicted labels of connected nodes (based on the assumption that adjacent nodes are more likely to be similar).
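As a small NumPy sketch (my own toy code, with hypothetical names), the Laplacian regularization term can be computed either as a trace over the Laplacian or as a sum over edges, and the two forms agree:

```python
import numpy as np

def laplacian_reg(A, F):
    """Graph Laplacian regularization term f(X)^T Δ f(X).

    A: (N, N) symmetric adjacency matrix; F: (N, C) model outputs f(X).
    Equals 0.5 * sum_{i,j} A_ij * ||F_i - F_j||^2.
    """
    D = np.diag(A.sum(axis=1))   # degree matrix
    L = D - A                    # unnormalized graph Laplacian Δ
    return np.trace(F.T @ L @ F)

# Sanity check on a path graph 0-1-2: the pairwise form agrees
# with the trace form.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
F = np.array([[1.0], [0.0], [1.0]])
pairwise = 0.5 * sum(A[i, j] * np.sum((F[i] - F[j]) ** 2)
                     for i in range(3) for j in range(3))
assert np.isclose(laplacian_reg(A, F), pairwise)
```

Note the term is large exactly when connected nodes get dissimilar outputs, which is what the smoothness prior penalizes.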

(On the image Laplacian operator: https://blog.csdn.net/songzitea/article/details/12842825)
(On the graph Laplacian operator, a fairly accessible introduction: https://zhuanlan.zhihu.com/p/50742283)

Thought: why add this regularization term at all? Because this is a semi-supervised problem: we have labels for only some nodes, yet we need predictions for all of them. With $L_0$ alone as the objective, only the labeled nodes take part in the computation; the unlabeled nodes contribute no gradient information, so we cannot learn to predict them. Once the regularization term is added, the predictions for unlabeled nodes also enter the objective (via a meaningful prior: adjacent nodes tend to have similar labels). This paper instead proposes a method that can predict unlabeled nodes without this regularization term.

Questions:
7. What method does the paper use so that, with labels for only some nodes, it can still predict all nodes?

In this work, we encode the graph structure directly using a neural network model $f(X, A)$ and train on a supervised target $L_0$ for all nodes with labels, thereby avoiding explicit graph-based regularization in the loss function. Conditioning $f(\cdot)$ on the adjacency matrix of the graph will allow the model to distribute gradient information from the supervised loss $L_0$ and will enable it to learn representations of nodes both with and without labels.

This paper classifies nodes directly with a neural network $f(X, A)$, building the objective $L_0$ from the labeled nodes only, so the Laplacian regularization term is no longer needed. The trained model predicts labels not only for labeled nodes but also for unlabeled ones (it is still a semi-supervised model), because during computation the adjacency matrix $A$ distributes gradient information across the whole graph (including the unlabeled nodes).

The paper says the adjacency matrix $A$ distributes gradient information across the whole graph. How exactly does that work?
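One way to see it (a toy example of my own, not from the paper): in a two-layer model, the output at a labeled node is a function of the features of every node within two hops, so the supervised loss at that node produces gradients that flow through the unlabeled neighbors' representations as well.

```python
import numpy as np

# Toy path graph 0-1-2-3; suppose only node 0 is labeled.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_tilde = A + np.eye(4)                        # add self-loops
D_inv_sqrt = np.diag(1 / np.sqrt(A_tilde.sum(1)))
A_norm = D_inv_sqrt @ A_tilde @ D_inv_sqrt     # symmetric normalization

# After two propagation steps, node 0's output mixes the features of
# every node within 2 hops -- including unlabeled nodes 1 and 2 -- so
# the supervised loss at node 0 back-propagates through them.
two_hop = np.linalg.matrix_power(A_norm, 2)
print(two_hop[0])   # nonzero for nodes 0, 1, 2; zero for node 3
```

Node 3 is three hops away, so with only two layers it receives no gradient from node 0's loss; stacking more layers enlarges the receptive field.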

Our contributions are two-fold.

  • Firstly, we introduce a simple and well-behaved layer-wise propagation rule for neural network models which operate directly on graphs and show how it can be motivated from a first-order approximation of spectral graph convolutions (Hammond et al., 2011).
  • Secondly, we demonstrate how this form of a graph-based neural network model can be used for fast and scalable semi-supervised classification of nodes in a graph.
  • Experiments on a number of datasets demonstrate that our model compares favorably both in classification accuracy and efficiency (measured in wall-clock time) against state-of-the-art methods for semi-supervised learning.

2 Fast Approximate Convolutions on Graphs

$$H^{(l+1)} = \sigma\left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right)$$

where $\tilde{A} = A + I_N$ is the adjacency matrix with added self-loops, $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$ is its degree matrix, $W^{(l)}$ is a layer-specific trainable weight matrix, $\sigma(\cdot)$ is an activation function such as ReLU, and $H^{(l)}$ is the matrix of activations in the $l$-th layer, with $H^{(0)} = X$.

This paper proposes a GCN model of the above form, composed of multiple layers, each propagating to the next via the formula above.
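A minimal NumPy sketch of this propagation rule (variable names are my own; ReLU is assumed as the activation):

```python
import numpy as np

def gcn_layer(A, H, W, activation=lambda x: np.maximum(x, 0)):
    """One GCN propagation step: H' = sigma(D~^{-1/2} A~ D~^{-1/2} H W)."""
    A_tilde = A + np.eye(A.shape[0])              # A~ = A + I_N (self-loops)
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(1))    # D~^{-1/2} as a vector
    A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return activation(A_hat @ H @ W)

# Tiny example: 3 nodes on a path graph, 2 input features, 2 hidden units.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H0 = rng.standard_normal((3, 2))
W0 = rng.standard_normal((2, 2))
H1 = gcn_layer(A, H0, W0)
print(H1.shape)  # (3, 2)
```

Each layer thus averages a node's own features with its neighbors' (degree-normalized) before the linear transform, which is how local graph structure and node features get encoded together.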

This model can be interpreted from two angles. The first is as an approximation of traditional spectral GCN methods: "the form of this propagation rule can be motivated via a first-order approximation of localized spectral filters on graphs (Hammond et al., 2011; Defferrard et al., 2016)."
