Task 03 Node Representation Learning on Networks

1. Representation Learning
  • In machine learning:

    • Images -> vectors, videos -> vectors, ...: virtually every deep learning model can be framed as a representation learning problem.
    • Challenge: how do we transfer the success achieved on images/videos to representation learning on graph-structured data?
  • Convolutional neural networks: a powerful tool for representation learning.


  • A CNN on images, seen from the graph perspective: a regular grid graph in Euclidean space

    • Scale invariance
    • Multi-scale structure
  • Why does convolution work on images?

    • The local structure is the same everywhere in an image
    • The convolution kernel matches that local structure
    • This is convolution based on spatial position (spatial, as opposed to spectral)
  • Goal: extend CNNs from Euclidean space to topological space -> graph convolution.

2. Graph Convolutional Networks
  • Graph convolutional network (GCN):

    • Input: the adjacency matrix (num_nodes x num_nodes) and the feature matrix (num_nodes x num_input_features)

    • Output: a new feature matrix (num_nodes x num_output_features)

    • Multiple layers can be stacked

    • At the node level, each layer aggregates a node's own features with those of its neighborhood

      $$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$$

      where $\tilde{A} = A + I_{N}$ is the adjacency matrix with self-loops (so a node's own features are aggregated together with its neighbors' features),

      $\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}$ is the (diagonal) degree matrix of $\tilde{A}$,

      $H$ is the matrix of node representations, i.e. the feature matrix, and $W$ is the weight matrix of model parameters.

      The matrix product of $\tilde{A}$ (symmetrically normalized), $H$, and $W$ plays the role of the convolution operation.

      $\sigma(\cdot)$ is the activation function.

    • Intuitive understanding of the GCN formula (figure)

    • Two-layer GCN architecture & loss function

      $$Z = f(X, A) = \operatorname{softmax}\left(\hat{A}\, \operatorname{ReLU}\left(\hat{A} X W^{(0)}\right) W^{(1)}\right), \qquad \hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$$

      $$\mathcal{L} = -\sum_{l \in \mathcal{Y}_{L}} \sum_{f=1}^{F} Y_{l f} \ln Z_{l f}$$

      The cross-entropy loss is computed only over the labeled node set $\mathcal{Y}_{L}$.

    • Derivation idea behind GCN: approximate spectral-domain filtering directly in the graph's topological (spatial) domain, reducing the number of learnable parameters.

    • Another way to understand GCN: a (weighted) aggregation of neighbor node features; see the NumPy sketch below.
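
      To ground these formulas, here is a minimal NumPy sketch of the symmetric normalization and the two-layer forward pass on a made-up 3-node path graph (all shapes and values are illustrative only, not from the original experiments):

      import numpy as np

      # Toy path graph 0-1-2 (made up for illustration)
      A = np.array([[0., 1., 0.],
                    [1., 0., 1.],
                    [0., 1., 0.]])
      X = np.random.randn(3, 4)                 # feature matrix, N x F_in
      W0 = np.random.randn(4, 8)                # first-layer weights W^(0)
      W1 = np.random.randn(8, 2)                # second-layer weights W^(1)

      A_tilde = A + np.eye(3)                   # A~ = A + I_N (add self-loops)
      d = A_tilde.sum(axis=1)                   # degrees of A~
      A_hat = np.diag(d ** -0.5) @ A_tilde @ np.diag(d ** -0.5)  # D~^{-1/2} A~ D~^{-1/2}

      def relu(x):
          return np.maximum(0., x)

      def softmax(x):
          e = np.exp(x - x.max(axis=1, keepdims=True))
          return e / e.sum(axis=1, keepdims=True)

      # Two-layer GCN: Z = softmax(A^ ReLU(A^ X W0) W1)
      Z = softmax(A_hat @ relu(A_hat @ X @ W0) @ W1)
      print(Z.shape)                            # (3, 2): per-node class probabilities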

3. Graph Attention Networks: GAT (extending the aggregation weights)
  • In GCN the aggregation weights are fixed in advance by the adjacency matrix, e.g. $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$.

  • GAT introduces a self-attention mechanism: the importance of each neighbor is computed from the features of the current node and of that neighbor, and these importance scores replace the fixed adjacency weights in the convolution. (This increases the computational cost.)

  • Advantage: exploiting the similarity between node features reflects the adjacency information better than fixed weights; see the sketch after the formulas.

    $$\alpha_{ij} = \frac{\exp\left(\operatorname{LeakyReLU}\left(\vec{\mathbf{a}}^{T}\left[\mathbf{W} \vec{h}_{i} \,\|\, \mathbf{W} \vec{h}_{j}\right]\right)\right)}{\sum_{k \in \mathcal{N}_{i}} \exp\left(\operatorname{LeakyReLU}\left(\vec{\mathbf{a}}^{T}\left[\mathbf{W} \vec{h}_{i} \,\|\, \mathbf{W} \vec{h}_{k}\right]\right)\right)}$$

    $$\vec{h}_{i}^{\prime} = \sigma\left(\frac{1}{K} \sum_{k=1}^{K} \sum_{j \in \mathcal{N}_{i}} \alpha_{ij}^{k} \mathbf{W}^{k} \vec{h}_{j}\right)$$
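
    A minimal single-head NumPy sketch of these two equations (the toy neighborhood, feature sizes, and random values are all made up; in practice PyG's GATConv implements this):

    import numpy as np

    rng = np.random.default_rng(0)
    h = rng.standard_normal((3, 4))        # features of node 0 and its two neighbors
    W = rng.standard_normal((4, 8))        # shared linear transform W, F x F'
    a = rng.standard_normal(16)            # attention vector a, size 2F'

    def leaky_relu(x, slope=0.2):
        return np.where(x > 0, x, slope * x)

    Wh = h @ W                             # W h_j for every node
    # attention logits e_0j = LeakyReLU(a^T [W h_0 || W h_j]) over neighborhood {0, 1, 2}
    e = np.array([leaky_relu(a @ np.concatenate([Wh[0], Wh[j]])) for j in range(3)])
    alpha = np.exp(e) / np.exp(e).sum()    # softmax -> attention coefficients alpha_0j
    h0_new = np.tanh(alpha @ Wh)           # h'_0 = sigma(sum_j alpha_0j W h_j)
    print(alpha.round(3), h0_new.shape)    # coefficients sum to 1; shape (8,)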

4. PyG Code Practice
  • The GCNConv module in PyG

    Constructor interface:

    GCNConv(in_channels: int, out_channels: int, improved: bool = False, cached: bool = False, add_self_loops: bool = True, normalize: bool = True, bias: bool = True, **kwargs)
    

    Parameters:

    • in_channels: input feature dimension;
    • out_channels: output feature dimension;
    • improved: if True, uses $\mathbf{\hat{A}} = \mathbf{A} + 2\mathbf{I}$, which gives extra weight to the central node's own information;
    • cached: whether to cache the computation of $\mathbf{\hat{D}}^{-1/2} \mathbf{\hat{A}} \mathbf{\hat{D}}^{-1/2}$ for later reuse; this should only be set to True in transductive learning scenarios;
    • add_self_loops: whether to add self-loop edges to the adjacency matrix;
    • normalize: whether to add self-loops and compute the symmetric normalization coefficients on the fly;
    • bias: whether to include a learnable bias term.
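
    A quick usage sketch of GCNConv (the toy graph and feature sizes below are made up for illustration):

    import torch
    from torch_geometric.nn import GCNConv

    conv = GCNConv(in_channels=16, out_channels=32, cached=True)  # cached=True: transductive setting
    x = torch.randn(4, 16)                        # 4 nodes, 16 features each
    edge_index = torch.tensor([[0, 1, 1, 2],      # source nodes
                               [1, 0, 2, 1]])     # target nodes
    out = conv(x, edge_index)
    print(out.shape)                              # torch.Size([4, 32])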
  • Building the MLP, GCN, and GAT models (models.py)

    import torch
    import torch.nn.functional as F
    from torch.nn import Linear
    from torch_geometric.nn import GCNConv
    from torch_geometric.nn import GATConv

    class MLP(torch.nn.Module):
        def __init__(self, in_channels, hidden_channels, out_channels):
            super(MLP, self).__init__()
            torch.manual_seed(42)
            self.lin_one = Linear(in_channels, hidden_channels)
            self.lin_two = Linear(hidden_channels, out_channels)

        def forward(self, x, edge_index):
            # edge_index is unused: an MLP looks at node features only
            x = self.lin_one(x)
            x = x.relu()
            x = F.dropout(x, p=0.5, training=self.training)
            return self.lin_two(x)

    class GCN(torch.nn.Module):
        def __init__(self, in_channels, hidden_channels, out_channels):
            super(GCN, self).__init__()
            torch.manual_seed(42)
            self.conv1 = GCNConv(in_channels, hidden_channels)
            self.conv2 = GCNConv(hidden_channels, out_channels)

        def forward(self, x, edge_index):
            x = self.conv1(x, edge_index)
            x = x.relu()
            x = F.dropout(x, p=0.5, training=self.training)
            return self.conv2(x, edge_index)

    class GAT(torch.nn.Module):
        def __init__(self, in_channels, hidden_channels, out_channels):
            super(GAT, self).__init__()
            torch.manual_seed(42)
            self.conv1 = GATConv(in_channels, hidden_channels)
            self.conv2 = GATConv(hidden_channels, out_channels)

        def forward(self, x, edge_index):
            x = self.conv1(x, edge_index)
            x = x.relu()
            x = F.dropout(x, p=0.5, training=self.training)
            return self.conv2(x, edge_index)
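
    A quick shape check for the three models (toy sizes, made up for illustration; assumes the code above is saved as models.py):

    import torch
    from models import MLP, GCN, GAT

    x = torch.randn(4, 8)                         # 4 nodes, 8 input features
    edge_index = torch.tensor([[0, 1, 2, 3],
                               [1, 2, 3, 0]])
    for Net in (MLP, GCN, GAT):
        model = Net(8, 16, 3)                     # in, hidden, out channels
        print(Net.__name__, model(x, edge_index).shape)  # torch.Size([4, 3])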
    
  • Building the training pipeline (train.py)

    import argparse
    import torch
    from models import MLP,GCN,GAT
    from torch_geometric.datasets import Planetoid
    from torch_geometric.transforms import NormalizeFeatures
    
    # argument parsing
    model_names = ['MLP', 'GCN', 'GAT']
    optimizer_names = ['sgd', 'adam']
    parser = argparse.ArgumentParser(description='Graph neural network')
    parser.add_argument('--model', default= 'GCN', choices=model_names)
    parser.add_argument('--optimizer','-o', default='adam', choices=optimizer_names)
    parser.add_argument('--epochs', default=100, type=int, metavar='N')
    parser.add_argument('--lr', '--learning-rate', default=0.01, type=float, metavar='LR')
    parser.add_argument('--weight-decay', '--wd', default=5e-4, type=float, metavar='W')
    # download the dataset
    dataset = Planetoid(root = "./dataset", name = 'Cora', transform=NormalizeFeatures())
    
    def main():
        # set dataset
        data = dataset[0]
        node_features = data.x.cuda()
        node_label = data.y.cuda()
        edge_index = data.edge_index.cuda()
        in_channels = dataset.num_features
        hidden_channels = 16
        out_channels = dataset.num_classes
        train_index = data.train_mask
        test_index = data.test_mask
        # initialize model
        args = parser.parse_args()
        if args.model == 'MLP':
            model = MLP(in_channels, hidden_channels, out_channels)
        elif args.model == 'GCN':
            model = GCN(in_channels, hidden_channels, out_channels)
        elif args.model == 'GAT':
            model = GAT(in_channels, hidden_channels, out_channels)
        else:
            raise ValueError('Unsupported or unknown architecture')
        # set random seeds
        torch.manual_seed(42)
        torch.cuda.manual_seed_all(0)
        # move model to GPU
        model.cuda()
        # define loss function
        criterion = torch.nn.CrossEntropyLoss().cuda()
    
        # define optimizer
        if args.optimizer == 'adam':
            optimizer = torch.optim.Adam(model.parameters(), lr = args.lr, weight_decay = args.weight_decay)
        elif args.optimizer == 'sgd':
            optimizer = torch.optim.SGD(model.parameters(), lr = args.lr, weight_decay = args.weight_decay)
        for epoch in range(1,args.epochs+1):
            loss = train(node_features, node_label, edge_index, model, train_index, criterion, optimizer)
            if epoch % 10 == 0:
                print(f'Epoch:{epoch:03d},loss:{loss:.4f}')
        test_acc = test(node_features, node_label, edge_index, model, test_index)
        print(f'Test Acc:{test_acc:.4f}')
    
    def train(node_features, node_label, edge_index, model, train_index, criterion, optimizer):
        # start training
        model.train()
        # zero out gradients
        optimizer.zero_grad()
        # compute output logits
        out = model(node_features, edge_index)
        # compute loss
        loss = criterion(out[train_index], node_label[train_index])
        loss.backward()
        optimizer.step()
        return loss
    
    def test(node_features, node_label, edge_index, model, test_index):
        # start testing
        model.eval()
        # compute output vector
        out = model(node_features, edge_index)
        # use the class with highest probability
        pred = out.argmax(dim=1)
        test_correct = pred[test_index]== node_label[test_index]
        test_acc = int(test_correct.sum()) / int(test_index.sum())
        return test_acc
    
    if __name__ == '__main__':
        main()
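
    Assuming the two files are saved as models.py and train.py, training can be launched with, for example, python train.py --model GCN --epochs 100 --lr 0.01. Note that the script moves the model and data to the GPU via .cuda(), so a CUDA-capable device is required as written.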
    
5. Analysis of Results
  • Quantitative results:

    # MLP
    Epoch:010,loss:1.8935
    Epoch:020,loss:1.7579
    Epoch:030,loss:1.5602
    Epoch:040,loss:1.3132
    Epoch:050,loss:1.0011
    Epoch:060,loss:0.8857
    Epoch:070,loss:0.7286
    Epoch:080,loss:0.6340
    Epoch:090,loss:0.5930
    Epoch:100,loss:0.5329
    Test Acc:0.5560
    # GCN
    Epoch:010,loss:1.8752
    Epoch:020,loss:1.7340
    Epoch:030,loss:1.5649
    Epoch:040,loss:1.3471
    Epoch:050,loss:1.1623
    Epoch:060,loss:0.9756
    Epoch:070,loss:0.8021
    Epoch:080,loss:0.7483
    Epoch:090,loss:0.6294
    Epoch:100,loss:0.5731
    Test Acc:0.8090
    # GAT
    Epoch:010,loss:1.8769
    Epoch:020,loss:1.7326
    Epoch:030,loss:1.5355
    Epoch:040,loss:1.2970
    Epoch:050,loss:0.9762
    Epoch:060,loss:0.8291
    Epoch:070,loss:0.6570
    Epoch:080,loss:0.5441
    Epoch:090,loss:0.5030
    Epoch:100,loss:0.4347
    Test Acc:0.7870
    
  • Visualizing the distribution of node representations:

    To visualize the node representations, we first embed the high-dimensional representations into a 2D plane with t-SNE, then scatter-plot the nodes in that plane.

    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE
    
    def visualize(h, color):
        z = TSNE(n_components=2).fit_transform(out.detach().cpu().numpy())
        plt.figure(figsize=(10,10))
        plt.xticks([])
        plt.yticks([])
    
        plt.scatter(z[:, 0], z[:, 1], s=70, c=color, cmap="Set2")
        plt.show()
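
    For example, the "untrained" plots below can be produced by feeding the output of a freshly initialized model to visualize (a sketch reusing the GCN model and the Cora dataset from the earlier scripts):

    import torch
    from models import GCN
    from torch_geometric.datasets import Planetoid
    from torch_geometric.transforms import NormalizeFeatures

    dataset = Planetoid(root="./dataset", name='Cora', transform=NormalizeFeatures())
    data = dataset[0]
    model = GCN(dataset.num_features, 16, dataset.num_classes)  # untrained weights
    model.eval()
    with torch.no_grad():
        out = model(data.x, data.edge_index)
    visualize(out, color=data.y)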
    
  • Untrained node representations vs. the distribution after MLP training (figure)

  • Untrained node representations vs. the distribution after GCN training (figure)

  • Untrained node representations vs. the distribution after GAT training (figure)

Reference:
Datawhale June study group: Graph Neural Networks
