DGL Official Tutorial: Relational graph convolutional network


Relational graph convolutional network

Author: Lingfan Yu, Mufei Li, Zheng Zhang

In this tutorial, you learn how to implement a relational graph convolutional network (R-GCN). This type of network is one effort to generalize GCN to handle different relationships between entities in a knowledge base. To learn more about the research behind R-GCN, see Modeling Relational Data with Graph Convolutional Networks.

A straightforward graph convolutional network (GCN) (see the DGL tutorial on GCN) exploits the structural information of a dataset, that is, the graph connectivity, to improve the extraction of node representations. Graph edges are left as untyped.

A knowledge graph is made up of a collection of triples in the form subject, relation, object. Edges thus encode important information and have their own embeddings to be learned. Furthermore, there may exist multiple edges between any given pair of nodes.
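
For example, a small fragment of such a knowledge graph can be stored as a plain list of (subject, relation, object) triples. This is only an illustrative sketch; the relation names are made up:

# Illustrative (subject, relation, object) triples; names are examples only
triples = [
    ('Mikhail Baryshnikov', 'educated_at', 'Vaganova Academy'),
    ('Mikhail Baryshnikov', 'lived_in', 'Russia'),
    ('Vaganova Academy', 'located_in', 'Russia'),
]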

A brief introduction to R-GCN

In statistical relational learning (SRL), there are two fundamental tasks:

  • Entity classification, where you assign types and categorical properties to entities.
  • Link prediction, where you recover missing triples.

In both cases, missing information is expected to be recovered from the neighborhood structure of the graph. For example, the R-GCN paper cited earlier provides the following example. Knowing that Mikhail Baryshnikov was educated at the Vaganova Academy implies both that Mikhail Baryshnikov should have the label person, and that the triple (Mikhail Baryshnikov, lived in, Russia) must belong to the knowledge graph.

R-GCN solves these two problems using a common graph convolutional network. It is extended with multi-edge encoding to compute embeddings of the entities, but with different downstream processing.

  • Entity classification is done by attaching a softmax classifier at the final embedding of an entity (node). Training is performed with a standard cross-entropy loss.
  • Link prediction is done by reconstructing an edge with an autoencoder architecture, using a parameterized score function. Training uses negative sampling.

This tutorial focuses on the first task, entity classification, to show how to generate entity representations. Complete code for both tasks can be found in the DGL Github repository.

Key ideas of R-GCN

Recall that in GCN, the hidden representation for each node $i$ at the $(l+1)^{th}$ layer is computed by:

$$h_i^{(l+1)} = \sigma\left(\sum_{j\in N_i}\frac{1}{c_i} W^{(l)} h_j^{(l)}\right) \qquad (1)$$

where $c_i$ is a normalization constant.
The key difference between R-GCN and GCN is that in R-GCN, edges can represent different relations. In GCN, the weight $W^{(l)}$ in equation (1) is shared by all edges in layer $l$. In contrast, in R-GCN, different edge types use different weights, and only edges of the same relation type $r$ are associated with the same projection weight $W_r^{(l)}$.

So in R-GCN, the hidden representation of an entity at the $(l+1)^{th}$ layer can be formulated as:

$$h_i^{(l+1)} = \sigma\left(W_0^{(l)}h_i^{(l)}+\sum_{r\in R}\sum_{j\in N_i^r}\frac{1}{c_{i,r}}W_r^{(l)}h_j^{(l)}\right) \qquad (2)$$

where $N_i^r$ denotes the set of neighbor indices of node $i$ under relation $r\in R$, and $c_{i,r}$ is a normalization constant. In entity classification, the R-GCN paper uses $c_{i,r}=|N_i^r|$.
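
To make equation (2) concrete, here is a naive, loop-based reading of it for a single node $i$ (an illustrative sketch only; the DGL layer later in this tutorial computes the same thing with message passing instead of Python loops):

import torch

def rgcn_node_update(h, W0, W, neighbors, i):
    # h: (N, d_in) node features; W0: (d_in, d_out) self-loop weight
    # W: dict mapping relation r -> its (d_in, d_out) weight W_r
    # neighbors: dict mapping relation r -> list of neighbor indices N_i^r
    out = h[i] @ W0                         # self-connection term
    for r, js in neighbors.items():
        c_ir = len(js)                      # c_{i,r} = |N_i^r|
        for j in js:
            out = out + (h[j] @ W[r]) / c_ir
    return torch.relu(out)                  # sigma chosen as ReLU here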

The problem with directly applying the above equation is the rapid growth of the number of parameters, especially with highly multi-relational data. In order to reduce model parameter size and prevent overfitting, the original paper proposes to use basis decomposition:

$$W_r^{(l)}=\sum_{b=1}^B a_{rb}^{(l)}V_b^{(l)} \qquad (3)$$

As a result, the weight $W_r^{(l)}$ is a linear combination of basis transformations $V_b^{(l)}$ with coefficients $a_{rb}^{(l)}$. The number of bases $B$ is much smaller than the number of relations in the knowledge base.
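
In code, equation (3) amounts to a single tensor contraction. A minimal sketch with illustrative shapes (not the layer code itself, which appears below):

import torch

num_rels, B, in_feat, out_feat = 91, 10, 16, 16
V = torch.randn(B, in_feat, out_feat)   # basis transformations V_b
a = torch.randn(num_rels, B)            # coefficients a_rb
# W_r = sum_b a_rb * V_b, computed for all relations at once
W = torch.einsum('rb,bio->rio', a, V)
print(W.shape)                          # torch.Size([91, 16, 16])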

Note:
Another weight regularization, block decomposition, is implemented in the link prediction implementation.
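
For reference, a rough sketch of what block decomposition looks like (an illustrative assumption, not code from this tutorial): each $W_r^{(l)}$ is constrained to be block-diagonal, built from $B$ small blocks, which also cuts the parameter count.

import torch

num_rels, B, d_in, d_out = 4, 2, 6, 6
blocks = torch.randn(num_rels, B, d_in // B, d_out // B)
# each W_r is the direct sum (block-diagonal stack) of its B small blocks
W = torch.stack([torch.block_diag(*blocks[r]) for r in range(num_rels)])
print(W.shape)                          # torch.Size([4, 6, 6])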

Implement R-GCN in DGL

An R-GCN model is composed of several R-GCN layers. The first R-GCN layer also serves as the input layer and takes in features (for example, description texts) that are associated with the node entities and projects them to a hidden space. In this tutorial, we use only the entity ID as an entity feature.
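
Because the feature is just the entity ID, the effective input is a one-hot vector, and multiplying a one-hot vector by a weight matrix is the same as selecting one row of that matrix. A quick sketch of this equivalence (illustrative shapes only), which is exactly the embedding-lookup trick used in the input layer below:

import torch

num_nodes, out_feat = 5, 3
W = torch.randn(num_nodes, out_feat)
i = 2
onehot = torch.zeros(num_nodes)
onehot[i] = 1.0
assert torch.allclose(onehot @ W, W[i])  # matmul == row lookup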

R-GCN layers

For each node, an R-GCN layer performs the following steps:

  • Compute outgoing messages using the node representation and the weight matrix associated with the edge type (message function)
  • Aggregate incoming messages and generate new node representations (reduce and apply function)

The following code is the definition of an R-GCN hidden layer.

Note:
Each relation type is associated with a different weight. Therefore, the full weight matrix has three dimensions: relation, input_feature, output_feature.

import torch
import torch.nn as nn
import torch.nn.functional as F
from dgl import DGLGraph
import dgl.function as fn
from functools import partial

class RGCNLayer(nn.Module):
    def __init__(self, in_feat, out_feat, num_rels, num_bases=-1, bias=None,
                 activation=None, is_input_layer=False):
        super(RGCNLayer, self).__init__()
        self.in_feat = in_feat
        self.out_feat = out_feat
        self.num_rels = num_rels
        self.num_bases = num_bases
        self.bias = bias
        self.activation = activation
        self.is_input_layer = is_input_layer

        # sanity check
        if self.num_bases <= 0 or self.num_bases > self.num_rels:
            self.num_bases = self.num_rels

        # weight bases in equation (3)
        self.weight = nn.Parameter(torch.Tensor(self.num_bases, self.in_feat,
                                                self.out_feat))
        if self.num_bases < self.num_rels:
            # linear combination coefficients in equation (3)
            self.w_comp = nn.Parameter(torch.Tensor(self.num_rels, self.num_bases))

        # add bias (replace the boolean flag with an actual parameter)
        if self.bias:
            self.bias = nn.Parameter(torch.Tensor(out_feat))
        else:
            self.bias = None

        # init trainable parameters
        nn.init.xavier_uniform_(self.weight,
                                gain=nn.init.calculate_gain('relu'))
        if self.num_bases < self.num_rels:
            nn.init.xavier_uniform_(self.w_comp,
                                    gain=nn.init.calculate_gain('relu'))
        if self.bias is not None:
            # xavier init requires >= 2 dims, so initialize the 1-D bias to zero
            nn.init.zeros_(self.bias)

    def forward(self, g):
        if self.num_bases < self.num_rels:
            # generate all weights from bases (equation (3))
            weight = self.weight.view(self.in_feat, self.num_bases, self.out_feat)
            weight = torch.matmul(self.w_comp, weight).view(self.num_rels,
                                                        self.in_feat, self.out_feat)
        else:
            weight = self.weight

        if self.is_input_layer:
            def message_func(edges):
                # for input layer, matrix multiply can be converted to be
                # an embedding lookup using source node id
                embed = weight.view(-1, self.out_feat)
                index = edges.data['rel_type'] * self.in_feat + edges.src['id']
                return {'msg': embed[index] * edges.data['norm']}
        else:
            def message_func(edges):
                w = weight[edges.data['rel_type']]
                msg = torch.bmm(edges.src['h'].unsqueeze(1), w).squeeze(1)
                msg = msg * edges.data['norm']
                return {'msg': msg}

        def apply_func(nodes):
            h = nodes.data['h']
            if self.bias is not None:
                h = h + self.bias
            if self.activation:
                h = self.activation(h)
            return {'h': h}

        g.update_all(message_func, fn.sum(msg='msg', out='h'), apply_func)
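
As a quick sanity check, a hidden (non-input) RGCNLayer can be exercised on a tiny hypothetical graph. The node and edge data below are made up for illustration; the real graph construction follows in the dataset section:

# Toy usage sketch: 3 nodes, 3 edges, 2 relation types (made-up data)
toy_g = DGLGraph()
toy_g.add_nodes(3)
toy_g.add_edges([0, 1, 2], [1, 2, 0])
toy_g.edata.update({'rel_type': torch.tensor([0, 1, 0]),
                    'norm': torch.ones(3, 1)})
toy_g.ndata['h'] = torch.randn(3, 4)     # input features of size 4
layer = RGCNLayer(4, 8, num_rels=2, activation=F.relu)
layer(toy_g)                             # updates toy_g.ndata['h'] in place
print(toy_g.ndata['h'].shape)            # torch.Size([3, 8])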

Full R-GCN model defined

class Model(nn.Module):
    def __init__(self, num_nodes, h_dim, out_dim, num_rels,
                 num_bases=-1, num_hidden_layers=1):
        super(Model, self).__init__()
        self.num_nodes = num_nodes
        self.h_dim = h_dim
        self.out_dim = out_dim
        self.num_rels = num_rels
        self.num_bases = num_bases
        self.num_hidden_layers = num_hidden_layers

        # create rgcn layers
        self.build_model()

        # create initial features
        self.features = self.create_features()

    def build_model(self):
        self.layers = nn.ModuleList()
        # input to hidden
        i2h = self.build_input_layer()
        self.layers.append(i2h)
        # hidden to hidden
        for _ in range(self.num_hidden_layers):
            h2h = self.build_hidden_layer()
            self.layers.append(h2h)
        # hidden to output
        h2o = self.build_output_layer()
        self.layers.append(h2o)

    # initialize feature for each node
    def create_features(self):
        features = torch.arange(self.num_nodes)
        return features

    def build_input_layer(self):
        return RGCNLayer(self.num_nodes, self.h_dim, self.num_rels, self.num_bases,
                         activation=F.relu, is_input_layer=True)

    def build_hidden_layer(self):
        return RGCNLayer(self.h_dim, self.h_dim, self.num_rels, self.num_bases,
                         activation=F.relu)

    def build_output_layer(self):
        return RGCNLayer(self.h_dim, self.out_dim, self.num_rels, self.num_bases,
                         activation=partial(F.softmax, dim=1))

    def forward(self, g):
        if self.features is not None:
            g.ndata['id'] = self.features
        for layer in self.layers:
            layer(g)
        return g.ndata.pop('h')

Handle dataset

This tutorial uses the Institute for Applied Informatics and Formal Description Methods (AIFB) dataset from the R-GCN paper.

# load graph data
from dgl.contrib.data import load_data
import numpy as np
data = load_data(dataset='aifb')
num_nodes = data.num_nodes
num_rels = data.num_rels
num_classes = data.num_classes
labels = data.labels
train_idx = data.train_idx
# split training and validation set
val_idx = train_idx[:len(train_idx) // 5]
train_idx = train_idx[len(train_idx) // 5:]

# edge type and normalization factor
edge_type = torch.from_numpy(data.edge_type)
edge_norm = torch.from_numpy(data.edge_norm).unsqueeze(1)

labels = torch.from_numpy(labels).view(-1)

out:

Loading dataset aifb
Number of nodes:  8285
Number of edges:  66371
Number of relations:  91
Number of classes:  4
removing nodes that are more than 3 hops away

Create graph and model

# configurations
n_hidden = 16 # number of hidden units
n_bases = -1 # use number of relations as number of bases
n_hidden_layers = 0 # use 1 input layer, 1 output layer, no hidden layer
n_epochs = 25 # epochs to train
lr = 0.01 # learning rate
l2norm = 0 # L2 norm coefficient

# create graph
g = DGLGraph()
g.add_nodes(num_nodes)
g.add_edges(data.edge_src, data.edge_dst)
g.edata.update({'rel_type': edge_type, 'norm': edge_norm})

# create model
model = Model(len(g),
              n_hidden,
              num_classes,
              num_rels,
              num_bases=n_bases,
              num_hidden_layers=n_hidden_layers)

Training loop

# optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=l2norm)

print("start training...")
model.train()
for epoch in range(n_epochs):
    optimizer.zero_grad()
    logits = model.forward(g)
    loss = F.cross_entropy(logits[train_idx], labels[train_idx])
    loss.backward()

    optimizer.step()

    train_acc = torch.sum(logits[train_idx].argmax(dim=1) == labels[train_idx])
    train_acc = train_acc.item() / len(train_idx)
    val_loss = F.cross_entropy(logits[val_idx], labels[val_idx])
    val_acc = torch.sum(logits[val_idx].argmax(dim=1) == labels[val_idx])
    val_acc = val_acc.item() / len(val_idx)
    print("Epoch {:05d} | ".format(epoch) +
          "Train Accuracy: {:.4f} | Train Loss: {:.4f} | ".format(
              train_acc, loss.item()) +
          "Validation Accuracy: {:.4f} | Validation loss: {:.4f}".format(
              val_acc, val_loss.item()))

out:

start training...
Epoch 00000 | Train Accuracy: 0.1696 | Train Loss: 1.3865 | Validation Accuracy: 0.2500 | Validation loss: 1.3862
Epoch 00001 | Train Accuracy: 0.9196 | Train Loss: 1.3434 | Validation Accuracy: 0.9286 | Validation loss: 1.3574
Epoch 00002 | Train Accuracy: 0.9286 | Train Loss: 1.2764 | Validation Accuracy: 0.9643 | Validation loss: 1.3140
Epoch 00003 | Train Accuracy: 0.9286 | Train Loss: 1.1893 | Validation Accuracy: 1.0000 | Validation loss: 1.2546
Epoch 00004 | Train Accuracy: 0.9286 | Train Loss: 1.0996 | Validation Accuracy: 1.0000 | Validation loss: 1.1837
Epoch 00005 | Train Accuracy: 0.9286 | Train Loss: 1.0229 | Validation Accuracy: 1.0000 | Validation loss: 1.1083
Epoch 00006 | Train Accuracy: 0.9464 | Train Loss: 0.9611 | Validation Accuracy: 1.0000 | Validation loss: 1.0355
Epoch 00007 | Train Accuracy: 0.9464 | Train Loss: 0.9116 | Validation Accuracy: 0.9643 | Validation loss: 0.9708
Epoch 00008 | Train Accuracy: 0.9554 | Train Loss: 0.8726 | Validation Accuracy: 0.9643 | Validation loss: 0.9181
Epoch 00009 | Train Accuracy: 0.9643 | Train Loss: 0.8429 | Validation Accuracy: 0.9643 | Validation loss: 0.8785
Epoch 00010 | Train Accuracy: 0.9643 | Train Loss: 0.8213 | Validation Accuracy: 0.9643 | Validation loss: 0.8503
Epoch 00011 | Train Accuracy: 0.9643 | Train Loss: 0.8062 | Validation Accuracy: 0.9643 | Validation loss: 0.8308
Epoch 00012 | Train Accuracy: 0.9643 | Train Loss: 0.7954 | Validation Accuracy: 0.9643 | Validation loss: 0.8175
Epoch 00013 | Train Accuracy: 0.9643 | Train Loss: 0.7875 | Validation Accuracy: 0.9643 | Validation loss: 0.8085
Epoch 00014 | Train Accuracy: 0.9732 | Train Loss: 0.7813 | Validation Accuracy: 0.9643 | Validation loss: 0.8024
Epoch 00015 | Train Accuracy: 0.9732 | Train Loss: 0.7764 | Validation Accuracy: 0.9643 | Validation loss: 0.7983
Epoch 00016 | Train Accuracy: 0.9732 | Train Loss: 0.7726 | Validation Accuracy: 0.9643 | Validation loss: 0.7956
Epoch 00017 | Train Accuracy: 0.9821 | Train Loss: 0.7695 | Validation Accuracy: 0.9643 | Validation loss: 0.7940
Epoch 00018 | Train Accuracy: 0.9821 | Train Loss: 0.7671 | Validation Accuracy: 0.9643 | Validation loss: 0.7933
Epoch 00019 | Train Accuracy: 0.9821 | Train Loss: 0.7650 | Validation Accuracy: 0.9643 | Validation loss: 0.7932
Epoch 00020 | Train Accuracy: 0.9821 | Train Loss: 0.7633 | Validation Accuracy: 0.9643 | Validation loss: 0.7937
Epoch 00021 | Train Accuracy: 0.9821 | Train Loss: 0.7617 | Validation Accuracy: 0.9643 | Validation loss: 0.7947
Epoch 00022 | Train Accuracy: 0.9821 | Train Loss: 0.7601 | Validation Accuracy: 0.9643 | Validation loss: 0.7963
Epoch 00023 | Train Accuracy: 0.9821 | Train Loss: 0.7585 | Validation Accuracy: 0.9643 | Validation loss: 0.7983
Epoch 00024 | Train Accuracy: 0.9821 | Train Loss: 0.7568 | Validation Accuracy: 0.9643 | Validation loss: 0.8007

The second task, link prediction

So far, you have seen how to use DGL to implement entity classification with an R-GCN model. In the knowledge base setting, the representations generated by R-GCN can be used to uncover potential relationships between nodes. In the R-GCN paper, the authors feed the entity representations generated by R-GCN into the DistMult prediction model to predict possible relations.
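
As a rough sketch of the idea (an illustrative assumption; see the linked example below for the actual implementation), DistMult scores a triple $(s, r, o)$ with a bilinear form whose relation matrix is diagonal:

import torch

def distmult_score(emb_s, emb_r, emb_o):
    # score(s, r, o) = sum_k e_s[k] * r[k] * e_o[k]
    return torch.sum(emb_s * emb_r * emb_o, dim=-1)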

The implementation is similar to the one presented here, but with an extra DistMult layer stacked on top of the R-GCN layers. You can find the complete implementation of link prediction with R-GCN in our Github Python code example: https://github.com/dmlc/dgl/blob/master/examples/pytorch/rgcn/link_predict.py

Total running time of the script: (0 minutes 6.505 seconds)

Download script: 4_rgcn.py

Download script: 4_rgcn.ipynb
