Contrastive Graph Poisson Networks: Semi-Supervised Learning with Extremely Limited Labels

To address the performance degradation of graph neural networks under extremely limited labels, the paper proposes a new framework named CGPN. CGPN combines a Graph Poisson Network with GAT to generate two comparable views of the graph, and uses contrastive learning to mine additional supervision signals from the massive unlabeled data. By contrastively propagating the limited label information, CGPN improves the model's learning ability under extremely low label rates, and experiments confirm its superiority on node classification tasks with scarce labels.

To tackle semi-supervised node classification with extremely limited labels, Wan et al. proposed the CGPN framework, which utilizes a Graph Poisson Network and a GAT to generate two comparable views, and then applies a contrastive objective to exploit the supervision information hidden in the massive unlabeled data.

In short, the paper mainly uses contrastive learning to find supervision signals among the large number of unlabeled nodes, with a Graph Poisson Network and a GAT generating the two views, respectively. This improves node classification under extremely limited labels.

Abstract: Most existing GNN models require sufficient labeled data for effective network training, and their performance can be seriously degraded when labels are extremely limited. To address this issue, the authors propose a new framework termed Contrastive Graph Poisson Networks (CGPN) for node classification under extremely limited labeled data. Essentially, CGPN enhances the learning performance of GNNs under extremely limited labels by contrastively propagating the limited labels to the entire graph.

Introduction: Graph-based Semi-Supervised Learning (SSL) refers to classifying unlabeled data based on a handful of labeled data and a given graph structure indicating the connections between all data. Recently, graph-based SSL has attracted increasing attention due to its solid mathematical foundation and satisfactory performance. Current GNNs, such as Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs), require sufficient labeled data to obtain satisfactory generalization abilities. Unfortunately, the reliance on sufficient labeled data increases the burden of data collection, and the number of labels can be extremely limited in some real-world scenarios. The performance of most current GNNs seriously declines as the label size shrinks, since the scarce supervision signals are insufficient to train a model with satisfactory discriminative ability.

In line with the aforementioned observations, the paper proposes a new framework termed Contrastive Graph Poisson Networks (CGPN) to address the problem of semi-supervised node classification under extremely low label rates. Deriving from variational inference, the proposed CGPN framework approximates the intractable posterior with a surrogate distribution, where two types of GNNs are adopted for instantiation.

The authors first design a new Graph Poisson Network (GPN) to propagate the limited labels to the entire graph effectively. Meanwhile, another GNN, such as GAT, is exploited together with the proposed GPN to model the approximated posterior according to variational inference.

On this basis, predictions are acquired from two comparable views, where a contrastive objective can be naturally incorporated to jointly refine the learning processes of the GPN and GNN models. Moreover, the supervision signals implicitly contained in the massive unlabeled data can be exploited with the formulated contrastive loss. As a result, the learning ability of the proposed framework is lifted. Experimental results on benchmark datasets confirm the strong benefits of CGPN when dealing with semi-supervised node classification at very low label rates.

Contributions of this paper:

First, we propose a novel GNN framework termed CGPN to solve the semi-supervised node classification with extremely limited labels. CGPN significantly outperforms the existing GNNs.

Second, we design a new Graph Poisson Network (GPN). Different from the Poisson learning algorithm, our GPN incorporates graph-structure information and could be trained in an end-to-end manner to guide the propagation of labels more flexibly.

Third, we integrate contrastive learning into the variational inference framework, so that extra supervision information can be explored from the massive unlabeled data to help train our CGPN framework.

Related work:

  • Graph-based semi-supervised learning

The early graph-based techniques are designed based on the simple assumption that nearby nodes are likely to have the same label. This goal can be achieved through low-dimensional embeddings with Laplacian eigenmaps.

Meanwhile, graph partition offers another important line in graph-based SSL. To further enhance the learning capacities, various techniques have been proposed to model the data features and graph structure jointly.

Recently, a set of graph-based SSL approaches have been proposed to improve the performance of the above-mentioned techniques.

  • Graph neural networks

Early-stage works aim to derive diverse types of graph convolution in the spectral domain based on graph spectral theory. Another line of research focuses on directly performing graph convolution in the spatial domain. In spatial GNN models, the convolution operation is defined as a weighted average function over the neighbors of each node, which characterizes the impact exerted on the target node by its neighboring ones.
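The "weighted average over neighbors" view of spatial graph convolution can be made concrete with a few lines of NumPy. This is a generic illustrative sketch (not code from the paper); the function name, the toy graph, and the use of a row-normalized adjacency with self-loops are my assumptions:

```python
import numpy as np

def spatial_graph_conv(A, H, W):
    """One spatial graph-convolution step: each node's new feature is a
    weighted average of its own and its neighbors' features, then a linear map."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # neighborhood sizes
    return (A_hat / deg) @ H @ W            # row-normalized average, then transform

# Toy graph: 3 nodes on a path 0-1-2, 2-d features, identity weight matrix
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
W = np.eye(2)
H_new = spatial_graph_conv(A, H, W)
```

Node 1, having both endpoints as neighbors, averages all three feature vectors, while the endpoints only mix with node 1. GAT replaces the uniform weights `1/deg` with learned attention coefficients.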

Methodology

  • Inference framework

To infer the labels of the unlabeled nodes, i.e., Y_U, we need to estimate the posterior distribution given the node features X, the observed labels Y_L, and the adjacency matrix A, namely p_θ(Y_U | A, X, Y_L) with parameters θ. Computation of this posterior is usually analytically intractable, so we resort to approximate posterior inference methods. Inspired by recent advances in scalable variational inference, a distribution q_φ(Y_U | A, X, Y_L) parameterized by φ is introduced to approximate the true posterior p_θ(Y_U | A, X, Y_L). Afterwards, the Evidence Lower BOund (ELBO) can be written as:

log p_θ(Y_L | A, X) ≥ E_{q_φ(Y_U | A, X, Y_L)}[log p_θ(Y_L | A, X, Y_U)] − KL(q_φ(Y_U | A, X, Y_L) ‖ p_θ(Y_U | A, X))

In this formula, KL(·‖·) denotes the KL divergence between two distributions. In practice, GNNs are still needed to specify the parametric forms of q_φ(Y_U | A, X, Y_L) and p_θ(Y | A, X). In this paper, q_φ(Y_U | A, X, Y_L) is instantiated with a Graph Poisson Network, and p_θ(Y | A, X) is instantiated with a GAT.
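For categorical class predictions, the KL term between the two branches can be evaluated in closed form per node. The following NumPy sketch is illustrative only (the function name and the toy distributions are my assumptions, not values from the paper):

```python
import numpy as np

def categorical_kl(q, p, eps=1e-12):
    """Row-wise KL(q || p) between categorical distributions.
    Each row of q and p is a probability vector over the classes."""
    q = np.clip(q, eps, 1.0)
    p = np.clip(p, eps, 1.0)
    return np.sum(q * (np.log(q) - np.log(p)), axis=1)

# Toy predicted class distributions for 2 unlabeled nodes over 3 classes
q = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])   # e.g. from the GPN branch (q_phi)
p = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.6, 0.2]])   # e.g. from the GAT branch (p_theta)
kl = categorical_kl(q, p)        # one non-negative value per node
```

Minimizing this term pulls the surrogate distribution toward the GAT branch, which is exactly the regularizing role the KL plays in the ELBO.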

  • Instantiations

The superiority of Poisson learning over traditional Laplacian learning has been proven both theoretically and experimentally at very low label rates. However, the graph structure has not been fully leveraged to guide the propagation of labels in Poisson learning. Concretely, Poisson learning relies on a fixed graph, which can be noisy in reality, and thus the intrinsic relationships among graph nodes cannot be well explored. Meanwhile, the structural information constituted by the neighboring node features has not been exploited, since Poisson learning mainly emphasizes the propagation of the input label information. As a consequence, inaccurate label predictions can be accumulated with iterative propagation, which inevitably results in performance degradation. To handle these difficulties, the authors propose a more flexible GNN model called "Graph Poisson Networks" (GPN). Inspired by GAT, they adaptively capture the importance of the neighbors exerting influence on the target node via an attention mechanism. In this way, the graph information can be gradually refined via network training, which makes the propagation of labels more reasonable.

Note that when GAT is employed to instantiate p_θ(Y | A, X), the attention coefficients are shared across GPN and GAT, so that the scale of network parameters can be reduced. Additionally, the GAT can help guide the propagation of labels through the shared attention coefficients.
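The paper's exact GPN update is not reproduced in these notes, but the core idea, label mass propagated along attention-weighted edges with the sources re-injected each step, can be sketched in NumPy. Everything below (function names, uniform edge scores, the specific update rule, the toy path graph) is an illustrative assumption, not the authors' implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def attention_label_propagation(scores, A, Y0, steps=10):
    """Propagate initial label mass Y0 over the graph, weighting each edge
    by an attention coefficient derived from raw edge scores."""
    masked = np.where(A > 0, scores, -1e9)   # restrict attention to real edges
    P = softmax(masked)                      # attention coefficients, rows sum to 1
    F = Y0.copy()
    for _ in range(steps):
        F = P @ F + Y0                       # propagate, then re-inject source labels
    return F / F.sum(axis=1, keepdims=True)  # normalize to class distributions

# Toy path graph 0-1-2: node 0 labeled class 0, node 2 labeled class 1
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
Y0 = np.array([[1., 0.],
               [0., 0.],
               [0., 1.]])
pred = attention_label_propagation(np.zeros((3, 3)), A, Y0, steps=10)
```

With uniform scores this reduces to plain neighborhood averaging; in GPN the scores would be produced by a trainable attention mechanism shared with the GAT branch, so the propagation weights improve during training.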

  • Contrastive label inference

The authors intend to leverage supervision signals beyond the limited labels. In this paper, contrastive learning is utilized to explore extra supervision information from the massive unlabeled data for model training, which can improve the performance of label inference. To be specific, the agreement between the predictions of the same node generated from q_φ(Y_U | A, X, Y_L) and p_θ(Y | A, X), denoted z_i and z'_i, is maximized. Meanwhile, the predictions of different node pairs are pulled apart. As a result, the pairwise contrastive loss between z_i and z'_i can be defined as:

ℓ(z_i, z'_i) = −log [ exp(⟨z_i, z'_i⟩/τ) / ( exp(⟨z_i, z'_i⟩/τ) + Σ_{k≠i} exp(⟨z_i, z'_k⟩/τ) + Σ_{k≠i} exp(⟨z_i, z_k⟩/τ) ) ]

In this formula, τ denotes a tunable temperature parameter and ⟨·,·⟩ denotes the inner product. The numerator is the positive pair formed by the same node in different views; the first term of the denominator equals the numerator, and the remaining two terms are the negative pairs formed by different nodes in different views and by different nodes in the same view, respectively. According to this formula, the overall contrastive objective to be minimized averages the pairwise loss over all nodes, symmetrized across the two views:

L_con = (1 / 2n) Σ_{i=1}^{n} [ ℓ(z_i, z'_i) + ℓ(z'_i, z_i) ]
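The pairwise contrastive loss above can be sketched directly in NumPy. This is a minimal illustrative implementation of the InfoNCE-style objective described in the text (the function name and toy inputs are my assumptions); real training code would backpropagate through it:

```python
import numpy as np

def pairwise_contrastive_loss(Z1, Z2, tau=0.5):
    """InfoNCE-style loss between two views' node predictions.
    Positive pair: the same node across views.  Negative pairs: different
    nodes across views and different nodes within the same view."""
    sim12 = np.exp((Z1 @ Z2.T) / tau)   # inner products across the two views
    sim11 = np.exp((Z1 @ Z1.T) / tau)   # inner products within view 1
    pos = np.diag(sim12)                # same node, different views
    # denominator: positive pair + cross-view negatives + intra-view negatives
    denom = sim12.sum(axis=1) + sim11.sum(axis=1) - np.diag(sim11)
    return -np.log(pos / denom).mean()

# Sanity check: perfectly aligned views incur a lower loss than mismatched ones
l_aligned = pairwise_contrastive_loss(np.eye(3), np.eye(3))
l_mismatched = pairwise_contrastive_loss(np.eye(3), np.roll(np.eye(3), 1, axis=0))
```

Note the `- np.diag(sim11)` term: a node is never treated as its own negative within a view, matching the k ≠ i condition in the formula.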

In addition to the contrastive loss, a standard multiclass softmax cross-entropy loss L_ce should also be applied to penalize the difference between the outcomes of q_φ(Y_U | A, X, Y_L) on the labeled nodes and the ground-truth labels. Hence, by assigning the weight hyperparameters α and β to L_con and L_ce correspondingly, the total loss is:

L = α·L_con + β·L_ce
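The remaining pieces of the objective, the cross-entropy on labeled nodes and the weighted combination, are standard; the following sketch (illustrative names and toy numbers, not the paper's code) shows both:

```python
import numpy as np

def cross_entropy(pred, labels, eps=1e-12):
    """Mean multiclass cross-entropy: -log of the probability assigned
    to each node's ground-truth class, averaged over the labeled nodes."""
    picked = pred[np.arange(len(labels)), labels]
    return -np.mean(np.log(np.clip(picked, eps, 1.0)))

def total_loss(l_con, l_ce, alpha=1.0, beta=1.0):
    """Weighted sum of the contrastive and cross-entropy terms."""
    return alpha * l_con + beta * l_ce

# Toy example: 2 labeled nodes, 2 classes
pred = np.array([[0.9, 0.1],
                 [0.2, 0.8]])
ce = cross_entropy(pred, np.array([0, 1]))
```

In practice α and β would be tuned as hyperparameters, trading off the unsupervised contrastive signal against the scarce labeled supervision.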

Framework of CGPN: (figure omitted)

Contrastive learning in CGPN: (figure omitted)

Experiments:

For all the adopted datasets, one, two, three, and four labeled nodes per class are randomly chosen for training, respectively, in order to evaluate model performance under label-scarce settings. The hyperparameters, such as the number of hidden units and the learning rate, are determined via grid search. In the experiments, the original architecture of GCN is adopted in both the baselines and CGPN-GCN. In CGPN-GAT, the attention coefficients are shared between GPN and GAT, where only the single-head attention mechanism is utilized for simplicity.
