Task02: Message Passing Graph Neural Networks

idruglab.com

Article link: https://blog.csdn.net/m0_46306014/article/details/118059635

I. MPNN

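The Message Passing Neural Network (MPNN) was first proposed in the paper "Neural Message Passing for Quantum Chemistry", where it achieved good results on quantum-chemical property prediction. For an introduction, see the article written by 阿泽的学习笔记.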

The MPNN framework:

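MPNN consists of two main phases: a message passing phase and a readout phase. Only the message passing phase is covered here. In layer $k$, the representation of node $i$ is updated from its own previous representation and the messages sent by its neighbors $j \in \mathcal{N}(i)$: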
$\mathbf{x}_i^{(k)} = \gamma^{(k)} \left( \mathbf{x}_i^{(k-1)}, \square_{j \in \mathcal{N}(i)} \, \phi^{(k)}\left(\mathbf{x}_i^{(k-1)}, \mathbf{x}_j^{(k-1)},\mathbf{e}_{j,i}\right) \right),$

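The formula has three components:
- Message creation ( $\phi$ ): builds a message for each edge from the two endpoint representations and the edge features $\mathbf{e}_{j,i}$.
- Message aggregation ( $\square$ ): a differentiable, permutation-invariant function such as sum, mean, or max.
- Node update ( $\gamma$ ): combines the node's previous representation with the aggregated message.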
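
To make the three roles concrete, here is a minimal, dependency-free sketch of one message passing step on a toy graph. Plain Python lists stand in for tensors, and the particular choices (message = neighbor features scaled by an edge weight, sum aggregation, update = own features plus the aggregate) are illustrative assumptions, not the functions used in the MPNN paper. All function names are hypothetical.

```python
# Toy sketch of one message passing step; every name here is hypothetical.

def phi(x_i, x_j, e_ji):
    """Message creation: the neighbor's features scaled by the edge weight."""
    return [e_ji * v for v in x_j]

def aggregate_sum(messages, dim):
    """Permutation-invariant aggregation (the square operator): element-wise sum."""
    out = [0.0] * dim
    for m in messages:
        out = [a + b for a, b in zip(out, m)]
    return out

def gamma(x_i, agg):
    """Node update: the node's own features plus the aggregated message."""
    return [a + b for a, b in zip(x_i, agg)]

def message_passing_step(x, edges, edge_weight):
    """x: list of feature vectors; edges: list of (j, i) pairs meaning j -> i."""
    dim = len(x[0])
    new_x = []
    for i in range(len(x)):
        msgs = [phi(x[i], x[j], edge_weight[(j, t)]) for (j, t) in edges if t == i]
        new_x.append(gamma(x[i], aggregate_sum(msgs, dim)))
    return new_x

# Path graph 0 - 1 - 2, stored as directed edges in both directions.
x = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
edge_weight = {e: 1.0 for e in edges}

x_next = message_passing_step(x, edges, edge_weight)
print(x_next)  # [[1.0, 1.0], [3.0, 3.0], [2.0, 3.0]]
```

Node 1 has two neighbors, so its new representation combines both of their features with its own. Because the sum does not depend on the order of the edges, reordering `edges` leaves the result unchanged.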

The material below is covered in more detail in the Datawhale open-source project.

II. Analysis of the `MessagePassing` Base Class

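PyTorch Geometric (PyG) provides the MessagePassing base class, which encapsulates the message passing workflow. By inheriting from it, we can conveniently build message passing graph neural networks. To build the simplest such class, we only need to define **the message() method ( $\phi$ ), the update() method ( $\gamma$ ), and the aggregation scheme to use** (aggr="add", aggr="mean", or aggr="max"). All of this is done with the help of the following methods:

MessagePassing(aggr="add", flow="source_to_target", node_dim=-2) (the object initializer)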
MessagePassing.propagate(edge_index, size=None, **kwargs)
MessagePassing.message(...)
MessagePassing.aggregate(...)
MessagePassing.message_and_aggregate(...)
MessagePassing.update(aggr_out, ...)

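Inside propagate(), the methods message() (or message_and_aggregate()), aggregate(), and update() are called. propagate() first checks whether edge_index is a SparseTensor and whether the subclass implements message_and_aggregate(). If both hold, it runs the subclass's message_and_aggregate() followed by update(); otherwise it runs message(), aggregate(), and update() in that order.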

Different graph neural networks are implemented by overriding the message(), aggregate(), message_and_aggregate(), and update() methods.
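
The dispatch described above can be sketched with a small stand-in class. This is only an illustration of the control flow, not PyG's implementation: `ToyMessagePassing` and its subclasses are hypothetical, node features are plain floats, and a boolean `sparse` flag is assumed in place of the real check that edge_index is a SparseTensor.

```python
class ToyMessagePassing:
    """Hypothetical stand-in that mimics only the dispatch logic of propagate()."""

    def __init__(self, aggr="add"):
        self.aggr = aggr
        # The fused path exists only if a subclass overrides message_and_aggregate().
        self.fuse = (type(self).message_and_aggregate
                     is not ToyMessagePassing.message_and_aggregate)

    def propagate(self, edges, x, sparse=False):
        if sparse and self.fuse:
            out = self.message_and_aggregate(edges, x)
        else:
            msgs = [self.message(x[j]) for (j, _) in edges]
            out = self.aggregate(msgs, [i for (_, i) in edges], len(x))
        return self.update(out)

    def message(self, x_j):
        return x_j

    def aggregate(self, msgs, index, num_nodes):
        buckets = [[] for _ in range(num_nodes)]
        for m, i in zip(msgs, index):
            buckets[i].append(m)
        if self.aggr == "add":
            return [sum(b, 0.0) for b in buckets]
        if self.aggr == "mean":
            return [sum(b) / len(b) if b else 0.0 for b in buckets]
        if self.aggr == "max":
            return [max(b) if b else 0.0 for b in buckets]
        raise ValueError(f"unknown aggr: {self.aggr}")

    def message_and_aggregate(self, edges, x):
        raise NotImplementedError

    def update(self, aggr_out):
        return aggr_out


class ScaledSum(ToyMessagePassing):
    """Overrides message() only, so the separate path is always used."""
    def message(self, x_j):
        return 0.5 * x_j


class FusedSum(ToyMessagePassing):
    """Overrides message_and_aggregate(), enabling the fused path."""
    def message_and_aggregate(self, edges, x):
        out = [0.0] * len(x)
        for j, i in edges:
            out[i] += x[j]
        return out


edges = [(0, 1), (2, 1), (1, 0)]   # (source j, target i)
x = [1.0, 2.0, 4.0]
separate = ScaledSum().propagate(edges, x)
fused = FusedSum().propagate(edges, x, sparse=True)
print(separate)  # [1.0, 2.5, 0.0]
print(fused)     # [2.0, 5.0, 0.0]
```

Calling `FusedSum().propagate(edges, x)` without `sparse=True` falls back to the separate message/aggregate path, which mirrors the `elif` branch in the real method.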

III. A `MessagePassing` Subclass Example

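We take the GCNConv class, which inherits from MessagePassing, as an example of how to implement a simple graph neural network by subclassing the base class. This section can be read alongside the referenced article.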

GCNConv is defined mathematically as
$\mathbf{x}_i^{(k)} = \sum_{j \in \mathcal{N}(i) \cup \{ i \}} \frac{1}{\sqrt{\deg(i)} \cdot \sqrt{\deg(j)}} \cdot \left( \mathbf{\Theta} \cdot \mathbf{x}_j^{(k-1)} \right),$
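Here, the neighboring node representations $\mathbf{x}_j^{(k-1)}$ are first transformed by multiplication with the weight matrix $\mathbf{\Theta}$, then normalized by the degrees of the edge's endpoints, $\deg(i)$ and $\deg(j)$, and finally summed. The formula can be broken into the following steps: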

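1. Add self-loops to the adjacency matrix.
2. Linearly transform the node representations.
3. Compute the normalization coefficients.
4. Normalize the neighboring node representations.
5. Sum the neighboring node representations ("add" aggregation).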

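Steps 1-3 are usually computed before message passing takes place. Steps 4-5 can be handled easily with the MessagePassing base class. The full implementation of the layer is shown below.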

import torch
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree

class GCNConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        # "Add" aggregation (Step 5).
        # flow='source_to_target' means messages flow from source nodes to target nodes.
        super().__init__(aggr='add', flow='source_to_target')
        self.lin = torch.nn.Linear(in_channels, out_channels)

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        # Step 1: Add self-loops to the adjacency matrix.
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        # Step 2: Linearly transform node feature matrix.
        x = self.lin(x)

        # Step 3: Compute normalization.
        row, col = edge_index
        deg = degree(col, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]

        # Step 4-5: Start propagating messages.
        return self.propagate(edge_index, x=x, norm=norm)

    def message(self, x_j, norm):
        # x_j has shape [E, out_channels]
        # Step 4: Normalize node features.
        return norm.view(-1, 1) * x_j

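GCNConv inherits from MessagePassing and uses "add" (summation) to aggregate information from neighboring nodes. All of the layer's logic takes place in its forward() method. There, we first add self-loops to the edge indices with torch_geometric.utils.add_self_loops() (Step 1), then linearly transform the node representations by calling the torch.nn.Linear instance (Step 2). propagate() is also called from forward(); once it is called, message passing between nodes begins.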

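The normalization coefficients are derived from the degree $\deg(i)$ of each node, which is then converted into one coefficient per edge $(j, i)$, namely $1 / (\sqrt{\deg(i)} \cdot \sqrt{\deg(j)})$. The result is stored in the variable norm, which has shape [num_edges] (Step 3).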

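In the message() method, we normalize the neighboring node representations x_j by norm. At that point x_j has already been gathered per edge, so it has one row for each edge and can be multiplied by norm directly.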
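
The arithmetic of Steps 1 and 3-5 can be checked by hand on a tiny graph. The sketch below is an illustration under simplifying assumptions: it uses scalar node features that are taken to be already linearly transformed (so Step 2 is skipped), and plain Python lists instead of tensors.

```python
import math

# Undirected path graph 0 - 1 - 2, stored as directed edges (source j, target i).
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
num_nodes = 3
x = [1.0, 2.0, 3.0]  # scalar features, assumed already linearly transformed

# Step 1: add self-loops.
edges = edges + [(n, n) for n in range(num_nodes)]

# Step 3: degree counted over target nodes, as degree(col, ...) does above.
deg = [0] * num_nodes
for _, i in edges:
    deg[i] += 1

# One coefficient per edge: 1 / (sqrt(deg(j)) * sqrt(deg(i))).
norm = [1.0 / (math.sqrt(deg[j]) * math.sqrt(deg[i])) for j, i in edges]

# Steps 4-5: scale each message by its coefficient, then sum per target node.
out = [0.0] * num_nodes
for (j, i), w in zip(edges, norm):
    out[i] += w * x[j]

print(deg)  # [2, 3, 2]
print(out)
```

With self-loops included, node 1 has degree 3 and the end nodes have degree 2, so for example the output at node 0 is x[0]/2 + x[1]/sqrt(6).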

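With the above, we have learned how to build a graph neural network that performs a single round of message passing. As the code below shows, it is easy to instantiate and call: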

from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='dataset/Cora', name='Cora')
data = dataset[0]

net = GCNConv(data.num_features, 64)
h_nodes = net(data.x, data.edge_index)
print(h_nodes.shape)  # torch.Size([2708, 64]): one 64-dimensional vector per Cora node

By chaining several such simple graph neural networks, we can build more complex graph neural network models.
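
One reason chaining helps is that each round of message passing only reaches direct neighbors, so k stacked layers let information travel k hops. The sketch below illustrates this with a parameter-free, GCN-style propagation on scalar features; `gcn_layer` is a hypothetical helper written for this example, not a PyG function, and it omits the learnable linear transform.

```python
import math

def gcn_layer(x, edges, num_nodes):
    """One parameter-free GCN-style propagation: self-loops + symmetric normalization."""
    all_edges = edges + [(n, n) for n in range(num_nodes)]
    deg = [0] * num_nodes
    for _, i in all_edges:
        deg[i] += 1
    out = [0.0] * num_nodes
    for j, i in all_edges:
        out[i] += x[j] / math.sqrt(deg[j] * deg[i])
    return out

# Path graph 0 - 1 - 2 - 3; only node 0 carries a signal.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
x0 = [1.0, 0.0, 0.0, 0.0]

h1 = gcn_layer(x0, edges, 4)  # one round: the signal reaches node 1 only
h2 = gcn_layer(h1, edges, 4)  # two rounds: the signal also reaches node 2

print([v > 0 for v in h1])  # [True, True, False, False]
print([v > 0 for v in h2])  # [True, True, True, False]
```

In a real model the layers would be instances of a class such as the GCNConv above, usually with a nonlinearity between them.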

IV. Exercises

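Summarize the execution flow of the MessagePassing base class.
Reproduce the construction of a one-layer graph neural network, and summarize the conventions for building your own graph neural network class by inheriting from MessagePassing.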

Answer:

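Initialization is done through MessagePassing() or MessagePassing.__init__():
- aggr: the aggregation scheme to use ("add", "mean", or "max");
- flow: the direction of message passing ("source_to_target" or "target_to_source");
- node_dim: the axis along which to propagate, that is, which dimension of the node representation tensor is the node dimension; the default is -2.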
- ……
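
The effect of aggr and flow can be illustrated with a small sketch. The function `toy_propagate` below is hypothetical and works on scalar features in plain Python lists; it assumes, as in PyG, that with flow="source_to_target" the senders are edge_index[0] and the receivers are edge_index[1], and the reverse for "target_to_source". Returning 0.0 for nodes that receive no message is a simplifying assumption of this sketch.

```python
def toy_propagate(x, edge_index, aggr="add", flow="source_to_target"):
    """Aggregate raw neighbor features; edge_index = [row, col]."""
    row, col = edge_index
    if flow == "source_to_target":
        senders, receivers = row, col   # j = edge_index[0], i = edge_index[1]
    elif flow == "target_to_source":
        senders, receivers = col, row   # j = edge_index[1], i = edge_index[0]
    else:
        raise ValueError(f"unknown flow: {flow}")

    # Collect the messages arriving at each node.
    inbox = [[] for _ in x]
    for j, i in zip(senders, receivers):
        inbox[i].append(x[j])

    reducers = {"add": sum, "mean": lambda m: sum(m) / len(m), "max": max}
    reducer = reducers[aggr]
    return [reducer(m) if m else 0.0 for m in inbox]

x = [1.0, 10.0, 100.0]
edge_index = [[0, 0, 1], [1, 2, 2]]   # edges 0->1, 0->2, 1->2

print(toy_propagate(x, edge_index, "add", "source_to_target"))   # [0.0, 1.0, 11.0]
print(toy_propagate(x, edge_index, "mean", "source_to_target"))  # [0.0, 1.0, 5.5]
print(toy_propagate(x, edge_index, "max", "target_to_source"))   # [100.0, 100.0, 0.0]
```

Switching flow reverses which end of each edge receives the message, which is why node 0 receives nothing in the first two calls but receives from nodes 1 and 2 in the third.
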
Message passing itself is carried out by MessagePassing.propagate, whose source code is as follows:

def propagate(self, edge_index: Adj, size: Size = None, **kwargs):
      
        size = self.__check_input__(edge_index, size)

        # Run "fused" message and aggregation (if applicable).
        if (isinstance(edge_index, SparseTensor) and self.fuse
                and not self.__explain__):
            coll_dict = self.__collect__(self.__fused_user_args__, edge_index,
                                         size, kwargs)

            msg_aggr_kwargs = self.inspector.distribute(
                'message_and_aggregate', coll_dict)
            out = self.message_and_aggregate(edge_index, **msg_aggr_kwargs)

            update_kwargs = self.inspector.distribute('update', coll_dict)
            return self.update(out, **update_kwargs)

        # Otherwise, run both functions in separation.
        elif isinstance(edge_index, Tensor) or not self.fuse:
            coll_dict = self.__collect__(self.__user_args__, edge_index, size,
                                         kwargs)

            msg_kwargs = self.inspector.distribute('message', coll_dict)
            out = self.message(**msg_kwargs)

            # For `GNNExplainer`, we require a separate message and aggregate
            # procedure since this allows us to inject the `edge_mask` into the
            # message passing computation scheme.
            if self.__explain__:
                edge_mask = self.__edge_mask__.sigmoid()
                # Some ops add self-loops to `edge_index`. We need to do the
                # same for `edge_mask` (but do not train those).
                if out.size(self.node_dim) != edge_mask.size(0):
                    loop = edge_mask.new_ones(size[0])
                    edge_mask = torch.cat([edge_mask, loop], dim=0)
                assert out.size(self.node_dim) == edge_mask.size(0)
                out = out * edge_mask.view([-1] + [1] * (out.dim() - 1))

            aggr_kwargs = self.inspector.distribute('aggregate', coll_dict)
            out = self.aggregate(out, **aggr_kwargs)

            update_kwargs = self.inspector.distribute('update', coll_dict)
            return self.update(out, **update_kwargs)

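That is, propagate() first checks whether edge_index is a SparseTensor and whether the subclass implements message_and_aggregate() (recorded in self.fuse). If so, it runs the subclass's message_and_aggregate() followed by update(); otherwise it runs the subclass's message(), aggregate(), and update() in that order.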

Implementing GATConv by inheriting from the MessagePassing base class:

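# Imports needed by the excerpt below (PyG 1.x layout; torch_sparse is a separate package).
from typing import Optional, Tuple, Union

import torch
import torch.nn.functional as F
from torch import Tensor
from torch.nn import Linear, Parameter
from torch_sparse import SparseTensor, set_diag

from torch_geometric.nn.conv import MessagePassing
from torch_geometric.nn.inits import glorot, zeros
from torch_geometric.typing import Adj, OptPairTensor, OptTensor, Size
from torch_geometric.utils import add_self_loops, remove_self_loops, softmax


class GATConv(MessagePassing):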
   
    def __init__(self, in_channels: Union[int, Tuple[int, int]],
                 out_channels: int, heads: int = 1, concat: bool = True,
                 negative_slope: float = 0.2, dropout: float = 0.0,
                 add_self_loops: bool = True, bias: bool = True, **kwargs):
        kwargs.setdefault('aggr', 'add')
        super(GATConv, self).__init__(node_dim=0, **kwargs)

        self.in_channels = in_channels
        self.out_channels = out_channels
        self.heads = heads
        self.concat = concat
        self.negative_slope = negative_slope
        self.dropout = dropout
        self.add_self_loops = add_self_loops

        if isinstance(in_channels, int):
            self.lin_l = Linear(in_channels, heads * out_channels, bias=False)
            self.lin_r = self.lin_l
        else:
            self.lin_l = Linear(in_channels[0], heads * out_channels, False)
            self.lin_r = Linear(in_channels[1], heads * out_channels, False)

        self.att_l = Parameter(torch.Tensor(1, heads, out_channels))
        self.att_r = Parameter(torch.Tensor(1, heads, out_channels))

        if bias and concat:
            self.bias = Parameter(torch.Tensor(heads * out_channels))
        elif bias and not concat:
            self.bias = Parameter(torch.Tensor(out_channels))
        else:
            self.register_parameter('bias', None)

        self._alpha = None

        self.reset_parameters()

    def reset_parameters(self):
        # Omitted from the original excerpt but called in __init__ above.
        # Assumes glorot and zeros from torch_geometric.nn.inits, as in PyG's GATConv.
        glorot(self.lin_l.weight)
        glorot(self.lin_r.weight)
        glorot(self.att_l)
        glorot(self.att_r)
        zeros(self.bias)

    def forward(self, x: Union[Tensor, OptPairTensor], edge_index: Adj,
                size: Size = None, return_attention_weights=None):
      
        H, C = self.heads, self.out_channels

        x_l: OptTensor = None
        x_r: OptTensor = None
        alpha_l: OptTensor = None
        alpha_r: OptTensor = None
        if isinstance(x, Tensor):
            assert x.dim() == 2, 'Static graphs not supported in `GATConv`.'
            x_l = x_r = self.lin_l(x).view(-1, H, C)
            alpha_l = (x_l * self.att_l).sum(dim=-1)
            alpha_r = (x_r * self.att_r).sum(dim=-1)
        else:
            x_l, x_r = x[0], x[1]
            assert x[0].dim() == 2, 'Static graphs not supported in `GATConv`.'
            x_l = self.lin_l(x_l).view(-1, H, C)
            alpha_l = (x_l * self.att_l).sum(dim=-1)
            if x_r is not None:
                x_r = self.lin_r(x_r).view(-1, H, C)
                alpha_r = (x_r * self.att_r).sum(dim=-1)

        assert x_l is not None
        assert alpha_l is not None

        if self.add_self_loops:
            if isinstance(edge_index, Tensor):
                num_nodes = x_l.size(0)
                if x_r is not None:
                    num_nodes = min(num_nodes, x_r.size(0))
                if size is not None:
                    num_nodes = min(size[0], size[1])
                edge_index, _ = remove_self_loops(edge_index)
                edge_index, _ = add_self_loops(edge_index, num_nodes=num_nodes)
            elif isinstance(edge_index, SparseTensor):
                edge_index = set_diag(edge_index)

        # propagate_type: (x: OptPairTensor, alpha: OptPairTensor)
        out = self.propagate(edge_index, x=(x_l, x_r),
                             alpha=(alpha_l, alpha_r), size=size)

        alpha = self._alpha
        self._alpha = None

        if self.concat:
            out = out.view(-1, self.heads * self.out_channels)
        else:
            out = out.mean(dim=1)

        if self.bias is not None:
            out += self.bias

        if isinstance(return_attention_weights, bool):
            assert alpha is not None
            if isinstance(edge_index, Tensor):
                return out, (edge_index, alpha)
            elif isinstance(edge_index, SparseTensor):
                return out, edge_index.set_value(alpha, layout='coo')
        else:
            return out


    def message(self, x_j: Tensor, alpha_j: Tensor, alpha_i: OptTensor,
                index: Tensor, ptr: OptTensor,
                size_i: Optional[int]) -> Tensor:
        alpha = alpha_j if alpha_i is None else alpha_j + alpha_i
        alpha = F.leaky_relu(alpha, self.negative_slope)
        alpha = softmax(alpha, index, ptr, size_i)
        self._alpha = alpha
        alpha = F.dropout(alpha, p=self.dropout, training=self.training)
        return x_j * alpha.unsqueeze(-1)

    def __repr__(self):
        return '{}({}, {}, heads={})'.format(self.__class__.__name__,
                                             self.in_channels,
                                             self.out_channels, self.heads)
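
The core of message() above is an attention weight per edge, normalized with a softmax over all edges that share the same target node. The sketch below reproduces just that computation for a single head with scalar features. It is a simplified illustration: `gat_message_and_sum` is a hypothetical helper, dropout and multiple heads are left out, and the softmax is written naively without the numerical stabilization a library implementation would use.

```python
import math

def leaky_relu(v, negative_slope=0.2):
    return v if v >= 0 else negative_slope * v

def gat_message_and_sum(x, alpha_src, alpha_dst, edges):
    """Single-head toy version: x holds scalar features, edges are (j, i) pairs."""
    # Unnormalized score per edge, as in message(): leaky_relu(alpha_j + alpha_i).
    scores = [leaky_relu(alpha_src[j] + alpha_dst[i]) for j, i in edges]

    out = [0.0] * len(x)
    weights = []
    for (j, i), s in zip(edges, scores):
        # Softmax denominator: all edges that point at the same target node i.
        denom = sum(math.exp(t) for (_, k), t in zip(edges, scores) if k == i)
        w = math.exp(s) / denom
        weights.append(w)
        out[i] += w * x[j]  # "add" aggregation of the weighted messages
    return out, weights

x = [1.0, 2.0, 3.0]
edges = [(0, 2), (1, 2), (2, 2), (0, 0), (1, 1)]  # self-loops included

# With all-zero scores the softmax is uniform, so node 2 gets a plain average.
out_uniform, w_uniform = gat_message_and_sum(x, [0.0] * 3, [0.0] * 3, edges)

# Unequal source scores shift weight toward the higher-scoring neighbor.
out_attn, w_attn = gat_message_and_sum(x, [2.0, 0.0, -1.0], [0.0] * 3, edges)

print(out_uniform)
print(w_attn[:3])  # weights of the three edges into node 2; they sum to 1
```

In the uniform case node 2 receives (1 + 2 + 3) / 3, while in the second case the edge from node 0 dominates because its score is highest.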