GraphSAGE: Inductive Representation Learning on Large Graphs

BUPT-WT

Article link: https://blog.csdn.net/weixin_41362649/article/details/111505646

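Why GraphSAGE matters:

1. It is one of the most widely used graph convolutional network models (together with GCN and GAT)

2. It performs inductive learning

3. Unlike earlier methods that learn node embeddings directly, it learns functions such as aggregators

4. It explores several aggregators (mean, pooling, LSTM)

5. It is a classic baseline for graph representation learning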

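Main structure of the paper:

I. Abstract

The abstract introduces the wide range of applications of graphs and states the motivation of the paper: inductive learning on graphs. A set of learned functions samples a node's neighbors and aggregates them into a vector representation. The main points are:

1. It proposes an inductive model that can produce representations for unseen nodes and unseen graphs

2. GraphSAGE obtains node features by learning a set of functions

3. It samples and aggregates the features of a node's neighbors, then concatenates the result with the node's own features to form the node representation

4. GraphSAGE is reported to achieve the best results among the compared methods in both transductive and inductive settings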

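II. Introduction

The introduction covers applications of graphs and notes that earlier work mostly targets static graphs, whereas GraphSAGE can handle new nodes and even new graphs. It reviews algorithms such as DeepWalk, Node2vec, and GCN, and presents the paper's main idea: training aggregate functions.

III. Related Work

Reviews earlier approaches based on random walks, matrix factorization, and graph convolution.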

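IV. The GraphSAGE Model

This section covers the forward propagation algorithm, the model parameters, and the aggregator architectures.

The GraphSAGE algorithm is given as Algorithm 1 in the paper (figure not reproduced here). Its core is the aggregation in lines 4 and 5: gathering the information of all sampled neighbors, then combining the node's own information with the neighbor information.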
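For reference, the two steps can be written as follows. This is a transcription from memory of lines 4 and 5 of Algorithm 1, using the paper's notation: h_v^k is the representation of node v at depth k, N(v) is the sampled neighborhood, W^k is the weight matrix of layer k, and sigma is a nonlinearity.

```latex
\begin{align}
h^{k}_{\mathcal{N}(v)} &\leftarrow \mathrm{AGGREGATE}_k\left(\{h^{k-1}_u,\ \forall u \in \mathcal{N}(v)\}\right) \\
h^{k}_v &\leftarrow \sigma\left(W^{k} \cdot \mathrm{CONCAT}\left(h^{k-1}_v,\ h^{k}_{\mathcal{N}(v)}\right)\right)
\end{align}
```

The first line corresponds to the aggregator and the second to the encoder in the code section of this post.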

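Next, the paper introduces the objective function (Section 3.2 of the paper). The model can be trained with supervision or without it. The unsupervised objective matches that of earlier graph embedding algorithms: if two nodes are closely related in the graph structure, their learned embeddings should also be similar.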
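A minimal sketch of that unsupervised objective for a single node, assuming the negative-sampling form J(z_u) = -log σ(z_u·z_v) - Q · E[log σ(-z_u·z_n)], where z_v is the embedding of a node that co-occurs with u on a random walk and the z_n are negative samples. The function names and the plain-list vectors are illustrative only, and the expectation is approximated by the mean over the supplied negatives.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unsupervised_loss(z_u, z_pos, z_negs, Q=None):
    """Loss for one node u with one positive and several negative samples."""
    if Q is None:
        Q = len(z_negs)  # assumption: weight equals the number of negatives
    # pull the positive pair together
    pos_term = -log_sigmoid(dot(z_u, z_pos))
    # push the negative pairs apart (Monte Carlo estimate of the expectation)
    neg_mean = sum(log_sigmoid(-dot(z_u, z_n)) for z_n in z_negs) / len(z_negs)
    return pos_term - Q * neg_mean


good = unsupervised_loss([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]])
bad = unsupervised_loss([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]])
print(good < bad)
```

The loss is lower when the positive neighbor is close to u and the negative sample is far away, which is the behaviour described above.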

It then describes several aggregate functions (Mean, LSTM, and Pooling). The appendix of the paper also gives the minibatch version of the algorithm.
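As a rough illustration of two of these aggregators, the sketch below implements a mean aggregator and a max-pooling aggregator on plain Python lists. The names `W_pool` and `b_pool` are hypothetical stand-ins for the learned pooling layer, and ReLU is assumed as the nonlinearity. The LSTM aggregator is omitted: it is not permutation invariant, and the paper applies it to a random ordering of the neighbors.

```python
def mean_aggregate(neigh_feats):
    """Element-wise mean of the neighbor feature vectors."""
    n = len(neigh_feats)
    return [sum(col) / n for col in zip(*neigh_feats)]


def max_pool_aggregate(neigh_feats, W_pool, b_pool):
    """Pass each neighbor through a one-layer ReLU network, then take the element-wise max."""
    transformed = []
    for h in neigh_feats:
        out = [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
               for row, b in zip(W_pool, b_pool)]
        transformed.append(out)
    return [max(col) for col in zip(*transformed)]


neighbors = [[1.0, 2.0], [3.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
print(mean_aggregate(neighbors))
print(max_pool_aggregate(neighbors, identity, zero_bias))
```

Both functions return the same result for any ordering of `neighbors`, which is the property that makes them suitable for unordered neighbor sets.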

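V. Experiments

Experimental setup, choice of datasets, transductive learning experiments, parameter analysis, and analysis of how different aggregate functions affect the model.

This section mainly describes the experimental parameters and the datasets, and ends with a comparison of results.

VI. Theoretical Analysis && Conclusion

The conclusion summarizes that GraphSAGE is inductive and considers different aggregators when aggregating neighbors. It also discusses several future directions, such as subgraph embedding and neighbor sampling strategies.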

Contributions:

1. Inductive learning

2. A comparison of multiple aggregators

3. Some theoretical analysis is also provided

Key points:

1. Model architecture

2. Sampling of neighbor nodes

3. Batch training
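The neighbor sampling step can be sketched as below. This follows the convention used in the code later in this post (keep all neighbors when a node has fewer than the sample size); the function name and the `rng` parameter are illustrative assumptions.

```python
import random


def sample_neighbors(adj_lists, node, num_sample, rng=random):
    """Return at most num_sample neighbors of node, chosen uniformly without replacement."""
    # sort first: random.sample needs a sequence, and sorting keeps runs reproducible
    neighbors = sorted(adj_lists[node])
    if len(neighbors) <= num_sample:
        return set(neighbors)
    return set(rng.sample(neighbors, num_sample))


adj = {0: {1, 2, 3, 4, 5}, 1: {0}}
rng = random.Random(0)
picked = sample_neighbors(adj, 0, 3, rng)
print(len(picked), picked <= adj[0])
print(sample_neighbors(adj, 1, 3, rng))
```

Capping the neighborhood size bounds the cost of each layer regardless of node degree.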

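Takeaways:

1. The inductive learning approach

2. The discussion of multiple aggregate functions

3. Batch training with sampled neighbors is efficient

4. GCN, GAT, and GraphSAGE are classic baselines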
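To make the efficiency of batch training concrete, the sketch below computes which nodes are needed at each depth for one minibatch, in the spirit of the minibatch algorithm in the paper's appendix. It is an assumption-laden illustration: the function name, the list layout `[B^0, ..., B^K]`, and the without-replacement sampling are choices made here, not taken from the paper.

```python
import random


def nodes_needed_per_layer(batch, adj_lists, num_layers, num_sample, rng=random):
    """Return [B^0, ..., B^K], where B^K is the batch and B^(k-1) adds sampled neighbors of B^k."""
    layers = [set(batch)]
    for _ in range(num_layers):
        current = layers[0]
        needed = set(current)
        for u in current:
            neigh = sorted(adj_lists.get(u, ()))
            if len(neigh) > num_sample:
                neigh = rng.sample(neigh, num_sample)
            needed.update(neigh)
        # prepend, so that index k holds the nodes required at depth k
        layers.insert(0, needed)
    return layers


# path graph 0 - 1 - 2 - 3 - 4
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
layers = nodes_needed_per_layer([0], path, num_layers=2, num_sample=10)
print(layers)
```

Only the nodes in these sets need to be touched for the batch, so the cost depends on the batch size and sample sizes rather than on the size of the whole graph.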

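VII. Coding

Dataset used in the code: cora

The dataset consists of two files:
    cora.cites, which lists the edges, one pair of node IDs per line
    cora.content, which gives the features and the label of each node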

example:

cora.cites

35	1033
35	103482
35	103515
35	1050679
35	1103960
35	1103985
35	1109199
35	1112911
35	1113438
35	1113831
35	1114331
35	1117476
35	1119505
35	1119708
35	1120431
35	1123756
35	1125386
35	1127430
35	1127913
.....

cora.content

31336	0	0	0	0	0	...	1	0	0	0	0	0	0	Neural_Networks
......
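A small sketch of how the two formats shown above can be parsed, run on in-memory strings instead of the real files. The `content_line` here is a shortened, hypothetical record with only three feature values; real records carry 1433.

```python
from collections import defaultdict

cites_sample = "35\t1033\n35\t103482\n35\t103515\n"
content_line = "31336\t0\t1\t0\tNeural_Networks"  # hypothetical short record


def parse_cites(text):
    """Build an undirected adjacency list from 'id<TAB>id' lines."""
    adj = defaultdict(set)
    for line in text.strip().splitlines():
        a, b = line.split()
        adj[a].add(b)
        adj[b].add(a)
    return adj


def parse_content_line(line):
    """Split one record into (paper id, feature vector, class label)."""
    fields = line.strip().split()
    return fields[0], [float(x) for x in fields[1:-1]], fields[-1]


adj = parse_cites(cites_sample)
paper_id, feats, label = parse_content_line(content_line)
print(sorted(adj["35"]), paper_id, feats, label)
```

The loader below does the same thing on the real files and additionally maps paper IDs and class names to integer indices.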

""" Load and preprocess the data """

from collections import defaultdict

import numpy as np


def load_cora():
    num_nodes = 2708
    num_feats = 1433
    feat_data = np.zeros((num_nodes, num_feats))
    labels = np.empty((num_nodes, 1), dtype=np.int64)
    # paper ID (string) -> row index
    node_map = {}
    # class name -> integer label
    label_map = {}

    with open('../cora/cora.content') as fp:
        for i, line in enumerate(fp):
            info = line.strip().split()
            # info[0] is the paper ID, info[-1] the class name, the rest are features
            feat_data[i, :] = [float(ss) for ss in info[1:-1]]

            node_map[info[0]] = i
            if info[-1] not in label_map:
                label_map[info[-1]] = len(label_map)

            labels[i] = label_map[info[-1]]

    # adjacency list: node index -> set of neighbor indices
    adj_lists = defaultdict(set)

    with open('../cora/cora.cites') as fp:
        for line in fp:
            info = line.strip().split()
            uid = node_map[info[0]]
            target_uid = node_map[info[1]]

            # treat every citation as an undirected edge
            adj_lists[uid].add(target_uid)
            adj_lists[target_uid].add(uid)

    return feat_data, labels, adj_lists


""" Build the aggregate function """

import torch
import torch.nn as nn
from torch.autograd import Variable
import random

class MeanAggregator(nn.Module):
    """Aggregates a node's neighborhood as the mean of the sampled neighbors' embeddings."""

    def __init__(self, features, cuda=False, gcn=False):
        super(MeanAggregator, self).__init__()
        # function mapping a LongTensor of node ids to a feature matrix
        self.features = features
        self.cuda = cuda
        # if True, add the node itself to its neighbor set before averaging
        self.gcn = gcn

    def forward(self, nodes, to_neighs, num_sample=10):
        # Sample num_sample neighbors per node; keep all of them if there are fewer.
        # random.sample needs a sequence, so the set is sorted first.
        if num_sample is not None:
            samp_neighs = [set(random.sample(sorted(to_neigh), num_sample))
                           if len(to_neigh) >= num_sample else to_neigh
                           for to_neigh in to_neighs]
        else:
            samp_neighs = to_neighs

        if self.gcn:
            samp_neighs = [samp_neigh | {int(nodes[i])}
                           for i, samp_neigh in enumerate(samp_neighs)]

        # Assign a column index to every distinct node in any neighbor set
        unique_nodes_list = list(set.union(*samp_neighs))
        unique_nodes = {n: i for i, n in enumerate(unique_nodes_list)}

        # mask[i, j] = 1 if unique node j is a sampled neighbor of batch node i
        mask = torch.zeros(len(samp_neighs), len(unique_nodes))
        column_indices = [unique_nodes[n] for samp_neigh in samp_neighs for n in samp_neigh]
        row_indices = [i for i in range(len(samp_neighs)) for _ in range(len(samp_neighs[i]))]
        mask[row_indices, column_indices] = 1

        if self.cuda:
            mask = mask.cuda()

        # Row-normalize so that the matrix product below computes a mean
        num_neigh = mask.sum(1, keepdim=True)
        mask = mask.div(num_neigh)
        if self.cuda:
            embed_matrix = self.features(torch.LongTensor(unique_nodes_list).cuda())
        else:
            embed_matrix = self.features(torch.LongTensor(unique_nodes_list))

        # (batch, unique nodes) x (unique nodes, feature dim) -> (batch, feature dim)
        to_feats = mask.mm(embed_matrix)

        return to_feats
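The mask trick in `MeanAggregator.forward` can be checked by hand on a tiny example. The sketch below redoes the same computation with plain Python lists; `mean_by_mask` and the toy `features` dictionary are illustrative names, and every neighbor set is assumed to be non-empty.

```python
def mean_by_mask(samp_neighs, features):
    """Mean of neighbor features via a row-normalized 0/1 mask times an embedding matrix."""
    unique_nodes_list = sorted(set.union(*samp_neighs))
    col = {n: i for i, n in enumerate(unique_nodes_list)}
    # mask[r][c] = 1 if unique node c is a neighbor of batch row r
    mask = [[0.0] * len(col) for _ in samp_neighs]
    for r, neigh in enumerate(samp_neighs):
        for n in neigh:
            mask[r][col[n]] = 1.0
    # divide each row by its neighbor count
    for row in mask:
        total = sum(row)
        for j in range(len(row)):
            row[j] /= total
    embed = [features[n] for n in unique_nodes_list]
    dim = len(embed[0])
    # matrix product mask x embed
    return [[sum(row[j] * embed[j][d] for j in range(len(row))) for d in range(dim)]
            for row in mask]


features = {1: [1.0, 0.0], 2: [3.0, 2.0], 3: [5.0, 4.0]}
result = mean_by_mask([{1, 2}, {2, 3}], features)
print(result)
```

Node 2 is shared by both rows, so its features are looked up once and reused, which is why the code builds a single embedding matrix over the unique nodes.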



""" Combine a node's own features with the aggregated neighbor features """

import torch
import torch.nn as nn
from torch.nn import init
import torch.nn.functional as F

class Encoder(nn.Module):
    """
    Encodes a node using the 'convolutional' GraphSage approach
    """
    def __init__(self, features, feature_dim, 
            embed_dim, adj_lists, aggregator,
            num_sample=10,
            base_model=None, gcn=False, cuda=False, 
            feature_transform=False): 
        super(Encoder, self).__init__()

        self.features = features
        # input dimension (hidden size before the transformation)
        self.feat_dim = feature_dim
        self.adj_lists = adj_lists
        # module that produces the aggregated neighbor embedding
        self.aggregator = aggregator
        self.num_sample = num_sample
        if base_model is not None:
            self.base_model = base_model

        self.gcn = gcn
        # output dimension (hidden size after the transformation)
        self.embed_dim = embed_dim
        self.cuda = cuda
        self.aggregator.cuda = cuda
        # W has shape (output dim, input dim).
        # Without gcn the input is "self vector || aggregated neighbor vector", so the input dim doubles;
        # with gcn=True there is no concatenation and the input dim is feat_dim.
        self.weight = nn.Parameter(
                torch.FloatTensor(embed_dim, self.feat_dim if self.gcn else 2 * self.feat_dim))
        init.xavier_uniform_(self.weight)

    def forward(self, nodes):
        """
        Generates embeddings for a batch of nodes.

        nodes     -- list of nodes
        """
        neigh_feats = self.aggregator.forward(nodes, [self.adj_lists[int(node)] for node in nodes], 
                self.num_sample)
        if not self.gcn:
            if self.cuda:
                self_feats = self.features(torch.LongTensor(nodes).cuda())
            else:
                self_feats = self.features(torch.LongTensor(nodes))
            # concatenate the self vector and the aggregated neighbor vector (CONCAT in Algorithm 1 line 5)
            combined = torch.cat([self_feats, neigh_feats], dim=1)
        else:
            # gcn mode: skip the concatenation and use only the aggregated neighbor vector
            combined = neigh_feats
        # apply the layer: multiply by W and apply the nonlinearity (Algorithm 1 line 5)
        combined = F.relu(self.weight.mm(combined.t()))
        # node embeddings after one GNN layer, shape embed_dim * nodes
        return combined
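The non-gcn branch of `Encoder.forward` amounts to a concatenation followed by a linear map and a ReLU. A minimal single-node sketch with plain lists, using a hypothetical 3 x 4 weight matrix (embed_dim = 3, feat_dim = 2), shows the shapes involved:

```python
def relu(x):
    return max(0.0, x)


def encode(self_feat, neigh_feat, W):
    """ReLU(W . CONCAT(self_feat, neigh_feat)) for one node."""
    combined = self_feat + neigh_feat  # list concatenation, length 2 * feat_dim
    return [relu(sum(w * x for w, x in zip(row, combined))) for row in W]


W = [
    [1.0, 0.0, 0.0, 0.0],   # copies the first self feature
    [0.0, 0.0, 0.0, -1.0],  # negative pre-activation, clipped by ReLU
    [0.5, 0.5, 0.5, 0.5],   # mixes self and neighbor features
]
embedding = encode([1.0, 2.0], [3.0, 4.0], W)
print(embedding)
```

Each row of `W` has length 2 * feat_dim, matching the `2 * self.feat_dim` used when `gcn` is False.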


""" Define the overall model """

class SupervisedGraphSage(nn.Module):

    def __init__(self, num_classes, enc):
        super(SupervisedGraphSage, self).__init__()
        # enc is enc2 here, i.e. the output of two stacked GNN layers
        self.enc = enc
        self.xent = nn.CrossEntropyLoss()
        # fully connected weight matrix that maps embeddings to num_classes scores
        self.weight = nn.Parameter(torch.FloatTensor(num_classes, enc.embed_dim))
        init.xavier_uniform_(self.weight)

    def forward(self, nodes):
        # embeds: node embeddings after two GNN layers, shape embed_dim * nodes
        embeds = self.enc(nodes)
        # project to num_classes (= 7) scores per node; the softmax is applied inside CrossEntropyLoss
        scores = self.weight.mm(embeds)
        # transpose to nodes * num_classes
        return scores.t()

    def loss(self, nodes, labels):
        # forward pass
        scores = self.forward(nodes)
        # cross entropy between the scores and the integer labels
        return self.xent(scores, labels.squeeze(-1))



""" Train the model """

import time

import numpy as np
from sklearn.metrics import f1_score

def run_cora():
    # set the random seeds
    np.random.seed(1)
    random.seed(1)
    # number of nodes in cora
    num_nodes = 2708
    # load the cora dataset:
    # feat_data: features
    # labels: labels
    # adj_lists: adjacency list, dict (key: node, value: set of neighbors)
    feat_data, labels, adj_lists = load_cora()
    # input feature matrix X, shape = number of nodes * feature dimension
    features = nn.Embedding(2708, 1433)
    # fill X with the raw features; these weights are not updated
    features.weight = nn.Parameter(torch.FloatTensor(feat_data), requires_grad=False)
    # features.cuda()

    # two GNN layers in total
    # first GNN layer
    # aggregate neighbors by mean, Algorithm 1 line 4
    agg1 = MeanAggregator(features, cuda=False)
    # transform the aggregated vector, Algorithm 1 line 5
    # (gcn=True: use only the aggregated neighbor vector, without concatenating the self vector)
    enc1 = Encoder(features, 1433, 128, adj_lists, agg1, gcn=True, cuda=False)

    # second GNN layer
    # takes the output of the first layer as its input features
    # .t() transposes because Encoder outputs shape embed_dim * nodes
    agg2 = MeanAggregator(lambda nodes : enc1(nodes).t(), cuda=False)
    # enc1.embed_dim = 128; the output dimension stays 128
    enc2 = Encoder(lambda nodes : enc1(nodes).t(), enc1.embed_dim, 128, adj_lists, agg2,
            base_model=enc1, gcn=True, cuda=False)

    # number of neighbors to sample per node
    # (Encoder reads `num_sample`; an attribute named `num_samples` would be
    # silently ignored and leave the default of 10 in effect)
    enc1.num_sample = 5
    enc2.num_sample = 5

    # 7-class classification
    # enc2 yields the node embeddings after two GNN layers
    graphsage = SupervisedGraphSage(7, enc2)
    # graphsage.cuda()

    # shuffle the node order
    rand_indices = np.random.permutation(num_nodes)

    # split into test, validation, and training sets
    test = rand_indices[:1000]
    val = rand_indices[1000:1500]
    train = list(rand_indices[1500:])

    # SGD over the trainable parameters, with the learning rate set here
    optimizer = torch.optim.SGD(filter(lambda p : p.requires_grad, graphsage.parameters()), lr=0.7)
    # training time of each batch
    times = []
    # train for 100 batches
    for batch in range(100):
        # take 256 nodes as one batch
        batch_nodes = train[:256]
        # shuffle the training set so that the next batch is random
        random.shuffle(train)
        # record the start time
        start_time = time.time()
        optimizer.zero_grad()
        # cross entropy loss defined in SupervisedGraphSage
        loss = graphsage.loss(batch_nodes, 
                torch.LongTensor(labels[np.array(batch_nodes)]))
        # backpropagate and update the parameters
        loss.backward()
        optimizer.step()
        # record the end time
        end_time = time.time()
        times.append(end_time-start_time)
        print(batch, loss.data)

    # validation
    val_output = graphsage.forward(val)
    # micro F1 score
    print("Validation F1:", f1_score(labels[val], val_output.data.numpy().argmax(axis=1), average="micro"))
    # average training time per batch
    print("Average batch time:", np.mean(times))

""" Model output """

run_cora()

0 tensor(1.9649)
1 tensor(1.9406)
2 tensor(1.9115)
3 tensor(1.8925)
4 tensor(1.8731)
5 tensor(1.8354)
6 tensor(1.8018)
7 tensor(1.7535)
8 tensor(1.6938)
9 tensor(1.6029)
10 tensor(1.6312)
11 tensor(1.5248)
12 tensor(1.4800)
13 tensor(1.4503)
14 tensor(1.4162)
15 tensor(1.3210)
16 tensor(1.2243)
17 tensor(1.2255)
18 tensor(1.0978)
19 tensor(1.1330)
20 tensor(0.9534)
21 tensor(0.9112)
22 tensor(0.9170)
23 tensor(0.7924)
24 tensor(0.8008)
25 tensor(0.7142)
26 tensor(0.7839)
27 tensor(0.8878)
28 tensor(1.2177)
29 tensor(0.9943)
30 tensor(0.8073)
31 tensor(0.6588)
32 tensor(0.6254)
33 tensor(0.5622)
34 tensor(0.5158)
35 tensor(0.4763)
36 tensor(0.5298)
37 tensor(0.5419)
38 tensor(0.5098)
39 tensor(0.4122)
40 tensor(0.4262)
41 tensor(0.4451)
42 tensor(0.4126)
43 tensor(0.4409)
44 tensor(0.3913)
45 tensor(0.4496)
46 tensor(0.4365)
47 tensor(0.4601)
48 tensor(0.4714)
49 tensor(0.4090)
50 tensor(0.4145)
51 tensor(0.3428)
52 tensor(0.3454)
53 tensor(0.3531)
54 tensor(0.3131)
55 tensor(0.2719)
56 tensor(0.3519)
57 tensor(0.3286)
58 tensor(0.3125)
59 tensor(0.2529)
60 tensor(0.3033)
61 tensor(0.2332)
62 tensor(0.3049)
63 tensor(0.3026)
64 tensor(0.3770)
65 tensor(0.3811)
66 tensor(0.3223)
67 tensor(0.2450)
68 tensor(0.2620)
69 tensor(0.2846)
70 tensor(0.2482)
71 tensor(0.3044)
72 tensor(0.4133)
73 tensor(0.3156)
74 tensor(0.4421)
75 tensor(0.2596)
76 tensor(0.2585)
77 tensor(0.2639)
78 tensor(0.2035)
79 tensor(0.2328)
80 tensor(0.1748)
81 tensor(0.1730)
82 tensor(0.1978)
83 tensor(0.1614)
84 tensor(0.1890)
85 tensor(0.1227)
86 tensor(0.1568)
87 tensor(0.1527)
88 tensor(0.2365)
89 tensor(0.2297)
90 tensor(0.1787)
91 tensor(0.1920)
92 tensor(0.1864)
93 tensor(0.1254)
94 tensor(0.1678)
95 tensor(0.1336)
96 tensor(0.1562)
97 tensor(0.2531)
98 tensor(0.2392)
99 tensor(0.2089)
Validation F1: 0.864
Average batch time: 0.047979302406311035
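A note on the metric: for single-label multiclass prediction, micro-averaged F1 equals plain accuracy, so a score of 0.864 on the 500 validation nodes corresponds to 432 correct predictions. The sketch below is a from-scratch illustration of that equality on toy labels; `micro_f1` is an illustrative helper, not the scikit-learn implementation.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all classes, then compute F1 once."""
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1
            elif p == c:
                fp += 1
            elif t == c:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 2, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = micro_f1(y_true, y_pred)
print(score, accuracy)
```

Every misclassified node counts once as a false positive for the predicted class and once as a false negative for the true class, which is why pooled precision, recall, and accuracy coincide.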
