Studying the pygcn Source Code

小鸡炖蘑菇@

Tags: artificial intelligence, algorithms, deep learning, python

Article link: https://blog.csdn.net/weixin_48799576/article/details/131288552

GCN paper: arxiv.org/pdf/1609.02907.pdf

Source code: mirrors / tkipf / pygcn · GitCode

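Data processing

The dataset used here is Cora, a collection of machine learning papers. It consists of two files, cora.content and cora.cites.

Each row of cora.content has the format <paper_id> + <word_attributes> + <class_label>. The first column, <paper_id>, is the paper's id; there are 2708 papers in total. The middle part, <word_attributes>, has length 1433, and each position holds 0 or 1 (whether the corresponding vocabulary word appears in the paper), forming the paper's feature vector. <class_label> is the paper's label, i.e. the category the paper belongs to.

Each row of cora.cites has the format <ID of cited paper> + <ID of citing paper>, meaning that the first paper is cited by the second.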
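To make the two file layouts concrete, here is a minimal sketch that parses a few made-up rows in the same format. The paper ids, the label names, and the width of four feature columns are invented for illustration (the real file has 1433 feature columns), and the parsing uses plain string splitting rather than the loader from the source code.

```python
import numpy as np

# Toy stand-in for cora.content: <paper_id> <word_attributes...> <class_label>
# (ids, labels and the 4 feature columns are made up for illustration)
content_text = """\
31336 0 1 0 1 Neural_Networks
1061127 1 0 0 0 Rule_Learning
1106406 0 0 1 1 Neural_Networks
"""

# Toy stand-in for cora.cites: <ID of cited paper> <ID of citing paper>
cites_text = """\
31336 1061127
31336 1106406
"""

rows = np.array([line.split() for line in content_text.splitlines()])
paper_ids = rows[:, 0].astype(np.int32)       # first column
features = rows[:, 1:-1].astype(np.float32)   # middle columns, 0/1 word indicators
class_labels = rows[:, -1]                    # last column

edges = np.array([line.split() for line in cites_text.splitlines()]).astype(np.int32)

print(paper_ids)
print(features.shape)   # (3, 4)
print(class_labels)
print(edges.shape)      # (2, 2)
```

In the second toy file, paper 31336 is the cited paper in both rows, so it is cited by papers 1061127 and 1106406.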

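Data processing is mostly done by the load_data() function. Its main steps are:

Load the node features and normalize them
Load the labels and one-hot encode them
Build the adjacency matrix, symmetrize it, and normalize it
Convert everything into tensors the model can consume

First, the text file is loaded into a raw NumPy array, from which the features and labels are extracted.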

import numpy as np
import scipy.sparse as sp

idx_features_labels = np.genfromtxt("{}{}.content".format(path, dataset),
                                    dtype=np.dtype(str))  # load the text file into a NumPy array of strings
# features are the second through second-to-last columns; labels are the last column
features = sp.csr_matrix(idx_features_labels[:, 1:-1], dtype=np.float32)
labels = encode_onehot(idx_features_labels[:, -1])
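The encode_onehot() helper is called above but not shown in this post. The following is a minimal sketch of what such a helper could look like, assuming it simply maps every distinct label string to one row of an identity matrix. The name encode_onehot_sketch and the choice to sort the classes (so the column order is reproducible) are assumptions of this sketch; the actual helper may order the classes differently.

```python
import numpy as np

def encode_onehot_sketch(labels):
    """Map each distinct label to a row of the identity matrix (hypothetical helper)."""
    classes = sorted(set(labels))  # sorted so the column order is reproducible
    identity = np.identity(len(classes), dtype=np.int32)
    class_to_row = {c: identity[i] for i, c in enumerate(classes)}
    return np.array([class_to_row[c] for c in labels], dtype=np.int32)

# Made-up label column for four papers
labels = np.array(["Theory", "Neural_Networks", "Theory", "Rule_Learning"])
onehot = encode_onehot_sketch(labels)
print(onehot)
```

Each row contains exactly one 1, and papers with the same label share the same row pattern.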

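Next, the graph and its adjacency matrix are built: the edges read from cora.cites define the citation graph between papers.

Paper ids are arbitrary and unordered, which makes them awkward to use as matrix indices, so the source code renumbers them, using a dictionary that maps each paper id to a consecutive index. Building the graph produces the following parts:

Graph component	Steps
idx (node indices)	extract idx -> build a dictionary that numbers the nodes
edges	read the edges -> express them with node indices
adj (adjacency matrix)	build adj -> symmetrize -> normalize -> convert to a sparse tensor

When the adjacency matrix is built, edges[:, 0] holds paper_id1 and edges[:, 1] holds paper_id2. sp.coo_matrix builds a sparse matrix in "COOrdinate" format: for every edge, the entry at (edges[:, 0], edges[:, 1]) is set to 1.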
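The renumbering and the COO construction can be tried on a tiny graph. This is a sketch with three made-up paper ids and two made-up citation pairs; only the variable names follow the source code.

```python
import numpy as np
import scipy.sparse as sp

# Made-up paper ids in the order they appear in the content file
idx = np.array([31336, 1061127, 1106406], dtype=np.int32)
idx_map = {j: i for i, j in enumerate(idx)}   # paper id -> row number 0..N-1

# Made-up citation pairs expressed with the original paper ids
edges_unordered = np.array([[31336, 1061127],
                            [31336, 1106406]], dtype=np.int32)

# Replace every id by its row number, keeping the (edge_num, 2) shape
edges = np.array([idx_map[i] for i in edges_unordered.flatten()],
                 dtype=np.int32).reshape(edges_unordered.shape)

# One stored 1 per edge, placed at (row=edges[:, 0], col=edges[:, 1])
n = len(idx)
adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])),
                    shape=(n, n), dtype=np.float32)

print(edges)          # node indices instead of paper ids
print(adj.toarray())  # dense view, only sensible for tiny graphs
```

The dense view has ones at (0, 1) and (0, 2) only, which shows that the matrix is still directed at this point.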

idx = np.array(idx_features_labels[:, 0], dtype=np.int32)
idx_map = {j: i for i, j in enumerate(idx)}  # paper id -> consecutive node index

# edges_unordered is read straight from the edge list file: an (edge_num, 2) array
# in which each row holds the paper ids of the two endpoints of one edge
edges_unordered = np.genfromtxt("{}{}.cites".format(path, dataset),
                                dtype=np.int32)

# edges_unordered stores paper ids, so every id is replaced by its node index:
# look it up in idx_map, then reshape back to the shape of edges_unordered
edges = np.array(list(map(idx_map.get, edges_unordered.flatten())),  # flatten returns a 1-D array
                 dtype=np.int32).reshape(edges_unordered.shape)

# A COO matrix stores one value per (row, col) pair, so the adjacency matrix gets one 1 per edge:
# an all-ones array of length edge_num is placed at the positions (edges[:, 0], edges[:, 1]),
# and the matrix has shape (node_size, node_size)
adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])),
                    shape=(labels.shape[0], labels.shape[0]),
                    dtype=np.float32)

# build symmetric adjacency matrix
# The adjacency matrix of an undirected graph is symmetric. The adj built above is directed,
# so it is extended to a symmetric matrix to represent an undirected graph
adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)

features = normalize(features)
# A~ = A + I_N: sp.eye creates an identity matrix (first argument: rows, optional second: columns)
# The paper uses A^ = D~^(-1/2) A~ D~^(-1/2); normalize() in pygcn's utils.py row-normalizes instead, i.e. D~^(-1) A~
adj = normalize(adj + sp.eye(adj.shape[0]))
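The symmetrization and normalization steps can be checked on a three-node toy graph. The sketch below reuses the symmetrization expression from load_data() unchanged. The function row_normalize is a hypothetical stand-in for normalize(), which is not shown in this post; it assumes row normalization, which gives D~^(-1) A~ rather than the symmetric D~^(-1/2) A~ D~^(-1/2) from the paper.

```python
import numpy as np
import scipy.sparse as sp

def row_normalize(mx):
    """Divide each row by its sum; all-zero rows are left as zeros."""
    rowsum = np.asarray(mx.sum(axis=1)).flatten()
    r_inv = np.zeros_like(rowsum, dtype=np.float64)
    nonzero = rowsum != 0
    r_inv[nonzero] = 1.0 / rowsum[nonzero]
    return sp.diags(r_inv).dot(mx)

# Directed toy graph with edges 0->1 and 0->2
adj = sp.coo_matrix((np.ones(2), ([0, 0], [1, 2])), shape=(3, 3), dtype=np.float32)

# Same symmetrization expression as in load_data(): wherever the transpose holds
# a larger value than adj, that value is copied in
adj_sym = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)

# Add self-loops (A~ = A + I), then row-normalize
adj_norm = row_normalize(adj_sym + sp.eye(adj_sym.shape[0]))

print(adj_sym.toarray())
print(adj_norm.toarray())
```

After this step every row of adj_norm sums to 1, so multiplying by it averages each node's features with those of its neighbours.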

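Finally, the variables are converted into tensors that the model can consume.

# Build the training, validation, and test index sets, and create tensors for the feature matrix, label vector, and adjacency matrix to serve as model inputs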
idx_train = range(140)
idx_val = range(200, 500)
idx_test = range(500, 1500)

features = torch.FloatTensor(np.array(features.todense()))
labels = torch.LongTensor(np.where(labels)[1])
adj = sparse_mx_to_torch_sparse_tensor(adj)

idx_train = torch.LongTensor(idx_train)
idx_val = torch.LongTensor(idx_val)
idx_test = torch.LongTensor(idx_test)

return adj, features, labels, idx_train, idx_val, idx_test
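The expression np.where(labels)[1] in the code above turns the one-hot label matrix back into a vector of class indices. A small worked example with made-up labels shows why taking element [1] of the result is enough:

```python
import numpy as np

# One-hot labels for 4 nodes and 3 classes (made-up values)
labels_onehot = np.array([[0, 0, 1],
                          [1, 0, 0],
                          [0, 0, 1],
                          [0, 1, 0]], dtype=np.int32)

# np.where on a 2-D array returns (row_indices, column_indices) of the non-zero
# entries in row-major order; with exactly one 1 per row, the column indices
# are the class index of each node
rows, cols = np.where(labels_onehot)
print(rows)  # [0 1 2 3]
print(cols)  # [2 0 2 1]
```

For a valid one-hot matrix this gives the same result as labels_onehot.argmax(axis=1).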

Model

The model code is shown below. It is a graph convolutional network made of two GraphConvolution layers.

class GCN(nn.Module):
    def __init__(self, nfeat, nhid, nclass, dropout):
        super(GCN, self).__init__()

        self.gc1 = GraphConvolution(nfeat, nhid)
        self.gc2 = GraphConvolution(nhid, nclass)
        self.dropout = dropout

    def forward(self, x, adj):
        x = F.relu(self.gc1(x, adj))
        x = F.dropout(x, self.dropout, training=self.training)
        x = self.gc2(x, adj)
        return F.log_softmax(x, dim=1)

The GraphConvolution layer is defined as follows:

class GraphConvolution(Module):
    """
    Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
    """

    def __init__(self, in_features, out_features, bias=True):
        super(GraphConvolution, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.FloatTensor(in_features, out_features))
        if bias:
            self.bias = Parameter(torch.FloatTensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input, adj):
        support = torch.mm(input, self.weight)
        output = torch.spmm(adj, support)
        if self.bias is not None:
            return output + self.bias
        else:
            return output

    def __repr__(self):
        return self.__class__.__name__ + ' (' \
               + str(self.in_features) + ' -> ' \
               + str(self.out_features) + ')'

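Its inputs are the node features input and the normalized adjacency matrix adj. The layer computes

$$\text{output} = \text{adj} \cdot (\text{input} \cdot W) + b$$

where W is self.weight, b is self.bias (added only when bias=True), and torch.spmm performs the sparse matrix multiplication. If the nodes have no features, their one-hot representations can be used as the model input instead.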
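The forward pass of one layer can be followed step by step with dense NumPy arrays. This is only an illustration of the arithmetic, not the PyTorch layer itself: the 3-node adjacency matrix, the random features, and the random weights are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, in_features, out_features = 3, 4, 2

# Row-normalized adjacency with self-loops for a toy 3-node graph (made-up)
adj = np.array([[1/3, 1/3, 1/3],
                [1/2, 1/2, 0.0],
                [1/2, 0.0, 1/2]])

x = rng.random((n_nodes, in_features))             # node features (input)
weight = rng.random((in_features, out_features))   # layer weight W
bias = np.zeros(out_features)                      # layer bias b

support = x @ weight             # input . W, shape (n_nodes, out_features)
output = adj @ support + bias    # adj . (input . W) + b

print(output.shape)  # (3, 2)

# Node 1 is connected only to itself and node 0, so its output is the
# average of the transformed features of nodes 0 and 1
print(np.allclose(output[1], (support[0] + support[1]) / 2))  # True
```

Multiplying by the weight first and by adj second mirrors the order used in forward(): the feature dimension is reduced before the neighbourhood aggregation.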

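In this code, the weight is first created as a tensor (torch.FloatTensor) and then wrapped in Parameter, which makes it trainable and registers it with the module. It therefore appears in model.parameters() and is updated by the optimizer. In short, Parameter() automatically adds a tensor to the module's parameter list, so that its value is adjusted during training to minimize the loss.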

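Model training

Because the model's last layer outputs F.log_softmax(x, dim=1), the loss function is the negative log likelihood loss (NLL loss). It takes the log-probabilities produced by the model and averages the negative log-probability assigned to each node's target label.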
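The pairing of log-softmax with NLL loss is equivalent to the usual cross-entropy. The sketch below re-implements both pieces in NumPy purely for illustration (these are not the PyTorch functions) and checks the equivalence on made-up logits, assuming the loss is averaged over the samples.

```python
import numpy as np

def log_softmax(z):
    """Row-wise log-softmax, shifted by the row max for numerical stability."""
    shifted = z - z.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def nll_loss(log_probs, targets):
    """Mean of the negative log-probability assigned to each target class."""
    return -log_probs[np.arange(len(targets)), targets].mean()

# Made-up raw scores for 2 nodes and 3 classes, with their target classes
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
targets = np.array([0, 2])

log_probs = log_softmax(logits)
loss = nll_loss(log_probs, targets)

# Cross-entropy computed directly from softmax probabilities gives the same value
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
cross_entropy = -np.log(probs[np.arange(len(targets)), targets]).mean()
print(np.isclose(loss, cross_entropy))  # True
```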

loss_train = F.nll_loss(output[idx_train], labels[idx_train])

The optimizer is Adam:

optimizer = optim.Adam(model.parameters(),
                       lr=args.lr, weight_decay=args.weight_decay)

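The key training code repeats the usual cycle: feed in the data --> forward pass --> compute the loss --> backward pass to compute the gradients --> parameter update: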

    model.train()
    optimizer.zero_grad()
    output = model(features, adj)
    loss_train = F.nll_loss(output[idx_train], labels[idx_train])
    acc_train = accuracy(output[idx_train], labels[idx_train])
    loss_train.backward()
    optimizer.step()
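The accuracy() helper called in the training step is not shown in this post. A minimal sketch of such a helper, assuming it reports the fraction of nodes whose highest-scoring class matches the label, could look like this. The name accuracy_sketch and the example tensors are made up for illustration.

```python
import torch

def accuracy_sketch(output, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    preds = output.argmax(dim=1)
    return (preds == labels).double().mean().item()

# Made-up log-probabilities for 4 nodes and 3 classes
output = torch.log(torch.tensor([[0.7, 0.2, 0.1],
                                 [0.1, 0.8, 0.1],
                                 [0.3, 0.3, 0.4],
                                 [0.6, 0.3, 0.1]]))
labels = torch.tensor([0, 1, 2, 1])

print(accuracy_sketch(output, labels))  # 0.75
```

Three of the four predictions match their labels here; only the last node, predicted as class 0 but labelled 1, is wrong.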
