DGL official tutorial: Line graph neural network

平湖片帆

Original link: https://docs.dgl.ai/en/latest/tutorials/models/1_gnn/6_line_graph.html#implementing-and-as-tensor-operation

Line graph neural network

Author: Qi Huang, Yu Gai, Minjie Wang, Zheng Zhang
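In this tutorial, you learn how to solve community detection tasks by implementing a line graph neural network (LGNN). Community detection, or graph clustering, consists of partitioning the vertices of a graph into clusters in which nodes are more similar to one another.

In the Graph convolutional network tutorial, you learned how to classify the nodes of an input graph in a semi-supervised setting. You used a graph convolutional neural network (GCN) as an embedding mechanism for graph features.

To generalize a graph neural network (GNN) to supervised community detection, a line-graph based variation of GNN is introduced in the research paper Supervised Community Detection with Line Graph Neural Networks. One of the highlights of the model is that it augments the straightforward GNN architecture so that it also operates on a line graph of edge adjacencies, defined with a non-backtracking operator.

A line graph neural network (LGNN) shows how DGL can implement an advanced graph algorithm by mixing basic tensor operations, sparse-matrix multiplication, and message-passing APIs.

In the following sections, you learn about community detection, line graphs, LGNN, and its implementation.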

Supervised community detection task with the Cora dataset

Community detection

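In a community detection task, you cluster similar nodes instead of labeling them. Node similarity is typically described as having higher inner density within each cluster.

What is the difference between community detection and node classification? Compared with node classification, community detection focuses on retrieving cluster information in the graph, rather than assigning a specific label to a node. For example, as long as a node is clustered with its community members, it does not matter whether the node is assigned to "community A" or "community B". By contrast, assigning all "great movies" to the label "bad movies" would be a disaster in a movie network classification task.

What, then, is the difference between a community detection algorithm and other clustering algorithms such as k-means? A community detection algorithm operates on graph-structured data. Compared with k-means, community detection leverages the graph structure instead of clustering nodes based on their features alone.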

Cora dataset

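To be consistent with the GCN tutorial, you use the Cora dataset to illustrate a simple community detection task. Cora is a scientific publication dataset with 2708 papers belonging to seven different machine learning fields. Here, you formulate Cora as a directed graph, with each node being a paper and each edge being a citation link (A->B means A cites B). Here is a visualization of the whole Cora dataset.

Image: https://i.imgur.com/X404Byc.png

Cora naturally contains seven classes, and the statistics below show that each class satisfies our assumption of a community, that is, nodes of the same class have a higher connection probability among themselves than with nodes of a different class. The following code snippet verifies that there are more intra-class edges than inter-class edges.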

import torch
import torch as th
import torch.nn as nn
import torch.nn.functional as F

import dgl
from dgl.data import citation_graph as citegrh

data = citegrh.load_cora()

G = dgl.DGLGraph(data.graph)
labels = th.tensor(data.labels)

# find all the nodes labeled with class 0
label0_nodes = th.nonzero(labels == 0).squeeze()
# find all the edges pointing to class 0 nodes
src, _ = G.in_edges(label0_nodes)
src_labels = labels[src]
# find all the edges whose both endpoints are in class 0
intra_src = th.nonzero(src_labels == 0)
print('Intra-class edges percent: %.4f' % (len(intra_src) / len(src_labels)))

out:

Intra-class edges percent: 0.7680
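
To make the statistic above concrete without downloading Cora, here is a minimal pure-Python sketch of the same computation on a small graph. The edge list, the labels, and the helper name `intra_class_fraction` are hypothetical and chosen only for illustration.

```python
def intra_class_fraction(edges, labels, cls):
    """Fraction of edges pointing into class `cls` whose source is also in `cls`."""
    # Keep only the edges whose destination node belongs to the requested class.
    incoming = [(src, dst) for src, dst in edges if labels[dst] == cls]
    if not incoming:
        return 0.0
    # Among those, count the edges whose source node has the same class.
    intra = sum(1 for src, _ in incoming if labels[src] == cls)
    return intra / len(incoming)


# Hypothetical toy graph: nodes 0-2 are class 0, nodes 3-4 are class 1.
toy_labels = [0, 0, 0, 1, 1]
toy_edges = [(0, 1), (1, 2), (2, 0), (3, 0), (3, 4), (4, 3)]

fraction = intra_class_fraction(toy_edges, toy_labels, cls=0)
print('Intra-class edges fraction: %.4f' % fraction)
```

Four edges point into class 0 and three of them start in class 0, so the fraction for this toy graph is 0.75.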

Binary community subgraph from Cora with a test dataset

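Without loss of generality, in this tutorial you limit the scope of the task to binary community detection.

Note:
To create a practice binary-community dataset from Cora, first extract all two-class pairs from the seven classes of the original Cora. For each pair, you treat each class as one community, and find the largest subgraph that contains at least one cross-community edge as the training example. As a result, there are a total of 21 training samples in this small dataset.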
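
The count of 21 follows from choosing 2 classes out of 7. The short check below enumerates the pairs. The variable names are illustrative only, and this is not how the dataset itself is built.

```python
from itertools import combinations

num_classes = 7  # Cora has seven classes
# Every unordered pair of distinct classes gives one binary-community sample.
class_pairs = list(combinations(range(num_classes), 2))
print(len(class_pairs))
```
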
With the following code, you can visualize one of the training samples and its community structure.

import networkx as nx
import matplotlib.pyplot as plt

train_set = dgl.data.CoraBinary()
G1, pmpd1, label1 = train_set[1]
nx_G1 = G1.to_networkx()

def visualize(labels, g):
    pos = nx.spring_layout(g, seed=1)
    plt.figure(figsize=(8, 8))
    plt.axis('off')
    nx.draw_networkx(g, pos=pos, node_size=50, cmap=plt.get_cmap('coolwarm'),
                     node_color=labels, edge_color='k',
                     arrows=False, width=0.5, style='dotted', with_labels=False)
visualize(label1, nx_G1)

Image: https://docs.dgl.ai/en/latest/_images/sphx_glr_6_line_graph_001.png
To learn more, see the original research paper for how to generalize to the case of multiple communities.

Community detection in a supervised setting

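The community detection problem can be tackled with both supervised and unsupervised approaches. You can formulate community detection in a supervised setting as follows:

- Each training example consists of $(G, L)$, where $G$ is a directed graph $(V, E)$. For each node $v$ in $V$, we assign a ground truth community label $z_v \in \{0,1\}$.
- The parameterized model $f(G, \theta)$ predicts a label set $\tilde{Z} = f(G)$ for the nodes $V$.
- For each example $(G, L)$, the model learns to minimize a specially designed loss function (the equivariant loss) $L_{equivariant} = (\tilde{Z}, Z)$.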

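Note:
In this supervised setting, the model naturally predicts a label for each community. However, the community assignment should be equivariant to label permutations. To achieve this, in each forward pass, we take the minimum among the losses calculated from all possible permutations of the labels.
Mathematically, this means $L_{equivariant} = \underset{\pi \in S_c}{min} -\log(\hat{\pi}, \pi)$, where $S_c$ is the set of all permutations of the labels, $\hat{\pi}$ is the set of predicted labels, and $-\log(\hat{\pi},\pi)$ denotes the negative log likelihood.
For instance, for a sample graph with nodes {1,2,3,4} and community assignment {A,A,A,B}, with each node's label l∈{0,1}, the group of all possible permutations is Sc={{0,0,0,1},{1,1,1,0}}.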
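
The minimum over label permutations can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `nll` and `equivariant_loss` and the probabilities are hypothetical, and the mean negative log likelihood stands in for the cross-entropy used later in the training loop.

```python
import math
from itertools import permutations


def nll(probs, targets):
    """Mean negative log likelihood of `targets` under per-node class probabilities."""
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)


def equivariant_loss(probs, targets, num_classes):
    """Minimum NLL over all relabelings (permutations) of the community ids."""
    losses = []
    for perm in permutations(range(num_classes)):
        relabeled = [perm[t] for t in targets]
        losses.append(nll(probs, relabeled))
    return min(losses)


# Hypothetical predictions for nodes {1, 2, 3, 4} with assignment {A, A, A, B}.
# The model puts the first three nodes in class 1 and the last one in class 0,
# which is the same partition as `targets` with the two ids swapped.
probs = [[0.1, 0.9], [0.2, 0.8], [0.1, 0.9], [0.7, 0.3]]
targets = [0, 0, 0, 1]

plain = nll(probs, targets)
best = equivariant_loss(probs, targets, num_classes=2)
print(plain > best)
```

The plain loss penalizes the swapped ids heavily, while the permutation minimum scores the prediction as if the labels were {1,1,1,0}.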

Line graph neural network key ideas

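A key innovation in this topic is the use of a line graph. Unlike the models in previous tutorials, message passing happens not only on the original graph (for example, the binary community subgraph from Cora), but also on the line graph associated with the original graph.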

What is a line-graph?

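In graph theory, a line graph is a graph representation that encodes the edge adjacency structure of the original graph.

Specifically, a line graph $L(G)$ turns each edge of the original graph $G$ into a node. This is illustrated in the figure below (taken from the research paper).

Image: https://i.imgur.com/4WO5jEm.png

Here, $e_{A}:= (i\rightarrow j)$ and $e_{B}:= (j\rightarrow k)$ are two edges in the original graph $G$. In the line graph $G_L$, they correspond to the nodes $v^{l}_{A}, v^{l}_{B}$.

The next natural question is, how do you connect nodes in the line graph? In other words, how do you connect two edges? Here, we use the following connection rule:

Two nodes $v^{l}_{A}, v^{l}_{B}$ in $lg$ are connected if the corresponding two edges $e_{A}, e_{B}$ in $g$ share one and only one node: the destination node of $e_{A}$ is the source node of $e_{B}$ ($j$).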

Note:
Mathematically, this definition corresponds to a notion called the non-backtracking operator: $B_{(i \rightarrow j), (\hat{i} \rightarrow \hat{j})} = \begin{cases} 1 \text{ if } j = \hat{i}, \hat{j} \neq i\\ 0 \text{ otherwise} \end{cases}$, where an edge is formed in the line graph if $B_{node1, node2} = 1$.
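
The connection rule can be checked on a tiny example. The sketch below builds the non-backtracking matrix directly from an edge list in plain Python. The function name `non_backtracking_matrix` and the toy graph are hypothetical, and the dense nested-list representation is chosen for readability rather than efficiency.

```python
def non_backtracking_matrix(edges):
    """B[a][b] = 1 if edge a = (i -> j) is followed by edge b = (j -> k) with k != i."""
    m = len(edges)
    B = [[0] * m for _ in range(m)]
    for a, (i, j) in enumerate(edges):
        for b, (i_hat, j_hat) in enumerate(edges):
            # Edge b must start where edge a ends, and must not go straight back.
            if j == i_hat and j_hat != i:
                B[a][b] = 1
    return B


# Hypothetical directed graph: a path 0 -> 1 -> 2 plus the reverse edge 1 -> 0.
edges = [(0, 1), (1, 2), (1, 0)]
B = non_backtracking_matrix(edges)
for row in B:
    print(row)
```

Only the pair (0 -> 1, 1 -> 2) is connected. The pair (0 -> 1, 1 -> 0) is excluded because it backtracks to the node it came from.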

One layer in LGNN, algorithm structure

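LGNN chains together a series of line graph neural network layers. The graph representation x and its line graph companion y evolve with the dataflow as follows.

Image: https://i.imgur.com/bZGGIGp.png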

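At the $k$-th layer, the $i$-th neuron of the $l$-th channel updates its embedding $x^{(k+1)}_{i,l}$ with:

$x^{(k+1)}_{i,l} = \rho[x^{(k)}_{i}\theta^{(k)}_{1,l}+(Dx^{(k)})_{i}\theta^{(k)}_{2,l}+\sum^{J-1}_{j=0}(A^{2^{j}}x^{(k)})_{i}\theta^{(k)}_{3+j,l}+[\{\text{Pm},\text{Pd}\}y^{(k)}]_{i}\theta^{(k)}_{3+J,l}]+\text{skip-connection} \qquad i \in V, l = 1,2,3, ... b_{k+1}/2$

Then, the line graph representation $y^{(k+1)}_{i',l'}$ is updated with:

$y^{(k+1)}_{i',l'} = \rho[y^{(k)}_{i'}\gamma^{(k)}_{1,l'}+(D_{L(G)}y^{(k)})_{i'}\gamma^{(k)}_{2,l'}+\sum^{J-1}_{j=0}(A_{L(G)}^{2^{j}}y^{(k)})_{i'}\gamma^{(k)}_{3+j,l'}+[\{\text{Pm},\text{Pd}\}^{T}x^{(k+1)}]_{i'}\gamma^{(k)}_{3+J,l'}]+\text{skip-connection} \qquad i' \in V_{l}, l' = 1,2,3, ... b'_{k+1}/2$

Here, skip-connection refers to performing the same operation without the nonlinearity $\rho$, and with the linear projections $\theta_{\{\frac{b_{k+1}}{2} + 1, ..., b_{k+1}-1, b_{k+1}\}}$ and $\gamma_{\{\frac{b_{k+1}}{2} + 1, ..., b_{k+1}-1, b_{k+1}\}}$.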

Implement LGNN in DGL

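Even though the equations in the previous section might look intimidating, it helps to understand the following points before you implement the LGNN.

The two equations are symmetric and can be implemented as two instances of the same class with different parameters. The first equation operates on the graph representation $x$, whereas the second operates on the line graph representation $y$. Let us denote this abstraction as $f$. Then the first is $f(x, y; \theta_x)$, and the second is $f(y, x; \theta_y)$. That is, they are parameterized to compute the representations of the original graph and its companion line graph, respectively.

Each equation consists of four terms. Take the first one as an example.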

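- $x^{(k)}\theta^{(k)}_{1,l}$, a linear projection of the previous layer's output $x^{(k)}$, denoted as $\text{prev}(x)$.
- $(Dx^{(k)})\theta^{(k)}_{2,l}$, a linear projection of the degree operator on $x^{(k)}$, denoted as $\text{deg}(x)$.
- $\sum^{J-1}_{j=0}(A^{2^{j}}x^{(k)})\theta^{(k)}_{3+j,l}$, a summation of the $2^{j}$ adjacency operators on $x^{(k)}$, denoted as $\text{radius}(x)$.
- $[\{Pm,Pd\}y^{(k)}]\theta^{(k)}_{3+J,l}$, fusing the embedding information of the other graph using the incidence matrix $\{Pm, Pd\}$, followed by a linear projection, denoted as $\text{fuse}(y)$.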
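Each of the terms is performed again with different parameters, and without the nonlinearity after the sum. Therefore, $f$ can be written as:

$f(x^{(k)}, y^{(k)}) = \rho[\text{prev}(x^{(k)}) + \text{deg}(x^{(k)}) + \text{radius}(x^{(k)}) + \text{fuse}(y^{(k)})] + \text{prev}(x^{(k)}) + \text{deg}(x^{(k)}) + \text{radius}(x^{(k)}) + \text{fuse}(y^{(k)})$

The two equations are chained up in the following order:

$x^{(k+1)} = f(x^{(k)}, y^{(k)})$
$y^{(k+1)} = f(y^{(k)}, x^{(k+1)})$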

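Keep in mind the observations listed in this overview and proceed to the implementation. An important point is that you use a different strategy for each of the noted terms.

Note:
You can understand $\{Pm, Pd\}$ more thoroughly with the explanation linked from the original tutorial. Roughly speaking, there is a relationship between how $g$ and $lg$ (the line graph) work together and loopy belief propagation. Here, you implement $\{Pm, Pd\}$ as a SciPy COO sparse matrix in the dataset, and stack them as tensors when batching. Another batching solution is to treat $\{Pm, Pd\}$ as the adjacency matrix of a bipartite graph, which maps the line graph's features to the graph's features, and vice versa.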

Implementing prev and deg as tensor operation

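Linear projection and the degree operation are both simple matrix multiplications. Write them as PyTorch tensor operations.
In __init__, you define the projection variables.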

self.linear_prev = nn.Linear(in_feats, out_feats)
self.linear_deg = nn.Linear(in_feats, out_feats)

In forward(), $\text{prev}$ and $\text{deg}$ are the same as any other PyTorch tensor operations.

prev_proj = self.linear_prev(feat_a)
deg_proj = self.linear_deg(deg * feat_a)
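
The expression `deg * feat_a` relies on the fact that multiplying by a diagonal degree matrix $D$ is the same as scaling each node's feature row by its degree. The plain-Python check below illustrates this on a hypothetical 3-node example. The helper `matmul` and all values are made up for illustration.

```python
def matmul(A, B):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Hypothetical example: 3 nodes with degrees 2, 1, 3 and 2 features per node.
deg = [2.0, 1.0, 3.0]
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Explicit diagonal degree matrix D, then the dense product D x.
D = [[deg[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
dense = matmul(D, x)

# Row-wise scaling, which is what a broadcasted product computes
# when the degree vector has shape (N, 1).
scaled = [[deg[i] * v for v in x[i]] for i in range(3)]

print(dense == scaled)
```

Avoiding the explicit matrix $D$ keeps the operation linear in the number of nodes instead of quadratic.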

Implementing radius as message passing in DGL

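As discussed in the GCN tutorial, you can formulate one adjacency operator as one step of message passing. As a generalization, $2^j$ adjacency operations can be formulated as performing $2^j$ steps of message passing. Therefore, the summation is equivalent to summing the nodes' representations after $2^j, j=0, 1, 2..$ steps of message passing, that is, gathering information from the $2^j$ neighborhood of each node.

In __init__, define the projection variables used in each $2^j$ steps of message passing.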

self.linear_radius = nn.ModuleList(
        [nn.Linear(in_feats, out_feats) for i in range(radius)])

In forward(), use the following function aggregate_radius() to gather data from multiple hops. Note that update_all is called multiple times.

# Return a list containing features gathered from multiple radius.
import dgl.function as fn
def aggregate_radius(radius, g, z):
    # initializing list to collect message passing result
    z_list = []
    g.ndata['z'] = z
    # pulling message from 1-hop neighbourhood
    g.update_all(fn.copy_src(src='z', out='m'), fn.sum(msg='m', out='z'))
    z_list.append(g.ndata['z'])
    for i in range(radius - 1):
        for j in range(2 ** i):
            #pulling message from 2^j neighborhood
            g.update_all(fn.copy_src(src='z', out='m'), fn.sum(msg='m', out='z'))
        z_list.append(g.ndata['z'])
    return z_list
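
To see why this loop structure yields the powers $A^{1}, A^{2}, A^{4}, \dots$, the following sketch repeats the same schedule with a dense adjacency matrix in plain Python and compares it with applying the matrix a fixed number of times. The names `matvec`, `aggregate_radius_dense`, and `apply_power`, as well as the toy graph, are hypothetical. The sketch assumes sum aggregation over incoming edges.

```python
def matvec(A, z):
    """One sum-aggregation step: node v receives the sum of z[u] over edges u -> v."""
    return [sum(a * val for a, val in zip(row, z)) for row in A]


def aggregate_radius_dense(radius, A, z):
    """Follow the loop structure of aggregate_radius, using a dense matrix A."""
    z_list = []
    z = matvec(A, z)                 # 1 hop in total
    z_list.append(z)
    for i in range(radius - 1):
        for _ in range(2 ** i):      # 2^i more steps, so the hop count doubles
            z = matvec(A, z)
        z_list.append(z)
    return z_list


def apply_power(A, z, k):
    """Reference: apply A to z exactly k times."""
    for _ in range(k):
        z = matvec(A, z)
    return z


# Hypothetical graph with edges 0 -> 1, 1 -> 2, 2 -> 0, 0 -> 2.
# A[v][u] = 1 when there is an edge u -> v.
A = [[0, 0, 1],
     [1, 0, 0],
     [1, 1, 0]]
z0 = [1, 2, 3]

z_list = aggregate_radius_dense(3, A, z0)
expected = [apply_power(A, z0, k) for k in (1, 2, 4)]
print(z_list == expected)
```

Because the feature is overwritten after every step, each additional block of $2^{i}$ steps doubles the cumulative hop count from $2^{i}$ to $2^{i+1}$.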

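Implementing fuse as sparse matrix multiplication

$\{Pm, Pd\}$ is a sparse matrix with only two non-zero entries in each column. Therefore, you construct it as a sparse matrix in the dataset, and implement $\text{fuse}$ as a sparse matrix multiplication.

In forward():

fuse = self.linear_fuse(th.mm(pm_pd, feat_b))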
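
As a rough illustration of the fuse step, the sketch below multiplies a COO-style incidence matrix by per-edge features in plain Python. It assumes a layout with one row per node and one column per edge, with a 1 at the source row and at the destination row of each column. That layout and the helper name `coo_matmul` are assumptions for illustration; the exact values stored in the dataset's matrix may differ.

```python
def coo_matmul(entries, num_rows, dense):
    """Multiply a COO sparse matrix, given as (row, col, value) triples, by a dense matrix."""
    width = len(dense[0])
    out = [[0.0] * width for _ in range(num_rows)]
    for r, c, v in entries:
        for k in range(width):
            out[r][k] += v * dense[c][k]
    return out


# Hypothetical graph with 3 nodes and edges e0 = (0 -> 1), e1 = (1 -> 2).
edges = [(0, 1), (1, 2)]

# Assumed incidence layout: column e has a 1 at its source row and destination row,
# so every column holds exactly two non-zero entries.
entries = []
for e, (src, dst) in enumerate(edges):
    entries.append((src, e, 1.0))
    entries.append((dst, e, 1.0))

# One scalar feature per edge, that is, per line graph node.
edge_feats = [[10.0], [20.0]]
node_feats = coo_matmul(entries, num_rows=3, dense=edge_feats)
print(node_feats)
```

Each node ends up with the sum of the features of its incident edges, which is how line graph information flows back to the original graph under this assumed layout.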

Completing $f (x, y)$

Finally, the following shows how to sum up all the terms, and pass the result to the skip connection and batch norm.

result = prev_proj + deg_proj + radius_proj + fuse

Pass the result to the skip connection.

result = th.cat([result[:, :n], F.relu(result[:, n:])], 1)
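
The slicing above leaves the first `n` channels linear and applies ReLU only to the remaining ones. The plain-Python sketch below shows the same split on a hypothetical 2-by-4 matrix. The helper names `relu` and `half_relu` are illustrative only.

```python
def relu(v):
    """Rectified linear unit for a single number."""
    return v if v > 0 else 0.0


def half_relu(rows, n):
    """Keep the first n channels unchanged and apply ReLU to the remaining channels."""
    return [row[:n] + [relu(v) for v in row[n:]] for row in rows]


# Hypothetical pre-activation values for 2 nodes with 4 output channels each.
result = [[-1.0, 2.0, -3.0, 4.0],
          [5.0, -6.0, 7.0, -8.0]]
n = 2  # half of the output channels

out = half_relu(result, n)
print(out)
```

Negative values survive in the first two channels, which is what lets that half act as the linear skip path.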

Then pass the result to batch norm.

result = self.bn(result) #Batch Normalization.

Here is the complete code for one LGNN layer's abstraction $f (x, y)$

class LGNNCore(nn.Module):
    def __init__(self, in_feats, out_feats, radius):
        super(LGNNCore, self).__init__()
        self.out_feats = out_feats
        self.radius = radius

        self.linear_prev = nn.Linear(in_feats, out_feats)
        self.linear_deg = nn.Linear(in_feats, out_feats)
        self.linear_radius = nn.ModuleList(
                [nn.Linear(in_feats, out_feats) for i in range(radius)])
        self.linear_fuse = nn.Linear(in_feats, out_feats)
        self.bn = nn.BatchNorm1d(out_feats)

    def forward(self, g, feat_a, feat_b, deg, pm_pd):
        # term "prev"
        prev_proj = self.linear_prev(feat_a)
        # term "deg"
        deg_proj = self.linear_deg(deg * feat_a)

        # term "radius"
        # aggregate 2^j-hop features
        hop2j_list = aggregate_radius(self.radius, g, feat_a)
        # apply linear transformation
        hop2j_list = [linear(x) for linear, x in zip(self.linear_radius, hop2j_list)]
        radius_proj = sum(hop2j_list)

        # term "fuse"
        fuse = self.linear_fuse(th.mm(pm_pd, feat_b))

        # sum them together
        result = prev_proj + deg_proj + radius_proj + fuse

        # skip connection and batch norm
        n = self.out_feats // 2
        result = th.cat([result[:, :n], F.relu(result[:, n:])], 1)
        result = self.bn(result)

        return result

Chain-up LGNN abstractions as an LGNN layer

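To implement the chained update $x^{(k+1)} = f(x^{(k)}, y^{(k)})$, $y^{(k+1)} = f(y^{(k)}, x^{(k+1)})$, chain up two LGNNCore instances with different parameters in the forward pass, as in the example code.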

class LGNNLayer(nn.Module):
    def __init__(self, in_feats, out_feats, radius):
        super(LGNNLayer, self).__init__()
        self.g_layer = LGNNCore(in_feats, out_feats, radius)
        self.lg_layer = LGNNCore(in_feats, out_feats, radius)

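    def forward(self, g, lg, x, lg_x, deg_g, deg_lg, pm_pd):
        # update the graph representation, fusing in the line graph features
        next_x = self.g_layer(g, x, lg_x, deg_g, pm_pd)
        # the line graph uses the transposed {Pm, Pd} to map node features to edges
        pm_pd_y = th.transpose(pm_pd, 0, 1)
        # note: this call receives x (the layer input), not next_x
        next_lg_x = self.lg_layer(lg, lg_x, x, deg_lg, pm_pd_y)
        return next_x, next_lg_x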

Chain-up LGNN layers

Define an LGNN with three hidden layers, as in the following example.

class LGNN(nn.Module):
    def __init__(self, radius):
        super(LGNN, self).__init__()
        self.layer1 = LGNNLayer(1, 16, radius)  # input is scalar feature
        self.layer2 = LGNNLayer(16, 16, radius)  # hidden size is 16
        self.layer3 = LGNNLayer(16, 16, radius)
        self.linear = nn.Linear(16, 2)  # predict two classes

    def forward(self, g, lg, pm_pd):
        # compute the degrees
        deg_g = g.in_degrees().float().unsqueeze(1)
        deg_lg = lg.in_degrees().float().unsqueeze(1)
        # use degree as the input feature
        x, lg_x = deg_g, deg_lg
        x, lg_x = self.layer1(g, lg, x, lg_x, deg_g, deg_lg, pm_pd)
        x, lg_x = self.layer2(g, lg, x, lg_x, deg_g, deg_lg, pm_pd)
        x, lg_x = self.layer3(g, lg, x, lg_x, deg_g, deg_lg, pm_pd)
        return self.linear(x)

Training and inference

First, load the data.

from torch.utils.data import DataLoader
training_loader = DataLoader(train_set,
                             batch_size=1,
                             collate_fn=train_set.collate_fn,
                             drop_last=True)

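Next, define the main training loop. Note that each training sample contains three objects: a DGLGraph, a SciPy sparse matrix pmpd, and a label array in numpy.ndarray. Generate the line graph by using this command: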

lg = g.line_graph(backtracking=False)

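Note that backtracking=False is required to correctly simulate the non-backtracking operation. We also define a utility function to convert the SciPy sparse matrix to a torch sparse tensor.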

# Create the model
model = LGNN(radius=3)
# define the optimizer
optimizer = th.optim.Adam(model.parameters(), lr=1e-2)

# A utility function to convert a scipy.coo_matrix to torch.SparseFloat
def sparse2th(mat):
    value = mat.data
    indices = th.LongTensor([mat.row, mat.col])
    tensor = th.sparse.FloatTensor(indices, th.from_numpy(value).float(), mat.shape)
    return tensor

# Train for 20 epochs
for i in range(20):
    all_loss = []
    all_acc = []
    for [g, pmpd, label] in training_loader:
        # Generate the line graph.
        lg = g.line_graph(backtracking=False)
        # Create torch tensors
        pmpd = sparse2th(pmpd)
        label = th.from_numpy(label)

        # Forward
        z = model(g, lg, pmpd)

        # Calculate loss:
        # Since there are only two communities, there are only two permutations
        #  of the community labels.
        loss_perm1 = F.cross_entropy(z, label)
        loss_perm2 = F.cross_entropy(z, 1 - label)
        loss = th.min(loss_perm1, loss_perm2)

        # Calculate accuracy:
        _, pred = th.max(z, 1)
        acc_perm1 = (pred == label).float().mean()
        acc_perm2 = (pred == 1 - label).float().mean()
        acc = th.max(acc_perm1, acc_perm2)
        all_loss.append(loss.item())
        all_acc.append(acc.item())

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    niters = len(all_loss)
    print("Epoch %d | loss %.4f | accuracy %.4f" % (i,
        sum(all_loss) / niters, sum(all_acc) / niters))

out:

Epoch 0 | loss 0.5751 | accuracy 0.6873
Epoch 1 | loss 0.5025 | accuracy 0.7742
Epoch 2 | loss 0.5078 | accuracy 0.7551
Epoch 3 | loss 0.4895 | accuracy 0.7624
Epoch 4 | loss 0.4682 | accuracy 0.7910
Epoch 5 | loss 0.4461 | accuracy 0.7992
Epoch 6 | loss 0.4815 | accuracy 0.7838
Epoch 7 | loss 0.4542 | accuracy 0.7970
Epoch 8 | loss 0.4338 | accuracy 0.8172
Epoch 9 | loss 0.4694 | accuracy 0.7604
Epoch 10 | loss 0.4525 | accuracy 0.7958
Epoch 11 | loss 0.4388 | accuracy 0.7941
Epoch 12 | loss 0.4440 | accuracy 0.8092
Epoch 13 | loss 0.4325 | accuracy 0.7982
Epoch 14 | loss 0.4087 | accuracy 0.8137
Epoch 15 | loss 0.4073 | accuracy 0.8129
Epoch 16 | loss 0.4123 | accuracy 0.8133
Epoch 17 | loss 0.4061 | accuracy 0.8201
Epoch 18 | loss 0.4100 | accuracy 0.8123
Epoch 19 | loss 0.4170 | accuracy 0.8348
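
The accuracy reported above is taken as the better of the two label assignments, because only the partition matters. The plain-Python sketch below reproduces that idea on hypothetical predictions. The function names `accuracy` and `binary_partition_accuracy` are illustrative and not part of the tutorial code.

```python
def accuracy(pred, label):
    """Fraction of nodes whose predicted id equals the given label."""
    return sum(int(p == t) for p, t in zip(pred, label)) / len(label)


def binary_partition_accuracy(pred, label):
    """Best accuracy over the two possible namings of a binary partition."""
    flipped = [1 - t for t in label]
    return max(accuracy(pred, label), accuracy(pred, flipped))


# Hypothetical output: the partition is almost right, but the ids are swapped.
pred = [1, 1, 1, 0, 0, 1]
label = [0, 0, 0, 1, 1, 1]

raw = accuracy(pred, label)
best = binary_partition_accuracy(pred, label)
print(raw < best)
```

Here only one node matches under the original naming, while five of six match after swapping the two community ids.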

Visualize training progress

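You can visualize the network's community prediction on one training example, together with the ground truth. Start with the following code example.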

pmpd1 = sparse2th(pmpd1)
LG1 = G1.line_graph(backtracking=False)
z = model(G1, LG1, pmpd1)
_, pred = th.max(z, 1)
visualize(pred, nx_G1)

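Compare this with the ground truth. Note that the colors of the two communities might be swapped, because the model is trained to predict the partition correctly, not the specific label values.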

visualize(label1, nx_G1)

Here is an animation to better understand the process (40 iterations).

Image: https://i.imgur.com/KDUyE1S.gif

Batching graphs for parallelism

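LGNN takes a collection of different graphs. You might wonder whether batching can be used for parallelism.
Batching is built into the data loader itself. In the collate_fn for the PyTorch data loader, graphs are batched using DGL's batched_graph API. DGL batches graphs by merging them into one large graph, with each smaller graph's adjacency matrix being a block along the diagonal of the large graph's adjacency matrix. Concatenate $\{Pm, Pd\}$ as a block diagonal matrix, in correspondence with the DGL batched graph API.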

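import numpy as np
import scipy.sparse as sp

def collate_fn(batch):
    # each sample is a (graph, pmpd, label) triple
    graphs, pmpds, labels = zip(*batch)
    batched_graphs = dgl.batch(graphs)
    # stack the {Pm, Pd} matrices along the diagonal to match the batched graph
    batched_pmpds = sp.block_diag(pmpds)
    batched_labels = np.concatenate(labels, axis=0)
    return batched_graphs, batched_pmpds, batched_labels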
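
To show what the block diagonal stacking does to the incidence matrices, here is a small plain-Python sketch with dense nested lists. The helper `block_diag` below is a hypothetical stand-in written for illustration, and the two matrices are made up. In the tutorial code the same role is played by the sparse routine called in collate_fn.

```python
def block_diag(blocks):
    """Place each matrix (a list of rows) along the diagonal of one larger zero matrix."""
    total_rows = sum(len(b) for b in blocks)
    total_cols = sum(len(b[0]) for b in blocks)
    out = [[0] * total_cols for _ in range(total_rows)]
    r0 = c0 = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                out[r0 + i][c0 + j] = v
        # Shift the offsets so the next block starts below and to the right.
        r0 += len(b)
        c0 += len(b[0])
    return out


# Hypothetical node-by-edge matrices for two small graphs.
pmpd_a = [[1, 0], [1, 1], [0, 1]]   # 3 nodes, 2 edges
pmpd_b = [[1], [1]]                 # 2 nodes, 1 edge

batched = block_diag([pmpd_a, pmpd_b])
for row in batched:
    print(row)
```

The off-diagonal zeros guarantee that nodes of one graph never receive edge features from another graph in the batch.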