2021-09-08

4500 positive examples (similar pairs) and 4500 negative examples.
Training set: 6299 pairs; test set: 2701 pairs.
————————————————

Graph match

Similarity classification
GNN framework

Each epoch runs through all the batches, 63 batches in total.
Each input batch contains 100 graph pairs; each graph has 189 nodes, so there are 189*2*100 = 37800 nodes in total.
有node_features [37800, 90]
edge_features   None
from_idx		99154
to_idx			99154
graph_idx		37800
labels			100
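
To make the batching concrete, here is a rough sketch of how 100 graph pairs could be flattened into the tensors above. The pack_batch helper and the per-graph edge counts are made up for illustration; only the shapes follow the notes.

import torch

def pack_batch(graphs):
    # Hypothetical packer: flattens a list of (node_features, edge_index) graphs
    # into one node matrix plus flat edge/graph index vectors.
    node_feats, from_idx, to_idx, graph_idx = [], [], [], []
    offset = 0
    for g_id, (x, edges) in enumerate(graphs):          # x: [n_i, 90], edges: [m_i, 2]
        node_feats.append(x)
        from_idx.append(edges[:, 0] + offset)           # shift node ids into the flat index space
        to_idx.append(edges[:, 1] + offset)
        graph_idx.append(torch.full((x.size(0),), g_id, dtype=torch.long))
        offset += x.size(0)
    return (torch.cat(node_feats),                      # [37800, 90] for 200 graphs x 189 nodes
            torch.cat(from_idx), torch.cat(to_idx),     # roughly [99154] each in the notes
            torch.cat(graph_idx))                       # [37800]

# 100 pairs -> 200 graphs; edge counts are illustrative, labels stay one per pair (100)
graphs = [(torch.randn(189, 90), torch.randint(0, 189, (496, 2))) for _ in range(200)]
node_features, from_idx, to_idx, graph_idx = pack_batch(graphs)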

GraphEmbeddingNet(
  (_encoder): GraphEncoder(
    (MLP1): Sequential(
      (0): Linear(in_features=90, out_features=32, bias=True)
    )
# i.e. [37800,90] x [90,32] -> [37800,32]  (1)
  )
  (_prop_layers): ModuleList(
# i.e. the [2,99154,32] edge inputs are concatenated into a single [99154,64] edge_inputs tensor as input
    (0): GraphPropLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
# i.e. [99154,64] x [64,64] =relu=> x [64,64] -> [99154,64],
# then the per-edge messages are summed into [37800,64]; the reverse message net does the same for the opposite edge direction
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
# i.e. the current output is [37800,64]; adding the forward and reverse aggregates keeps the size (2),
# (2) is expanded to [1,37800,64] (3) and (1) is expanded to [1,37800,32] (4),
# and (3) and (4) are fed into the GRU together
      (GRU): GRU(64, 32)
# i.e. the GRU output [1,37800,32] is squeezed back to [37800,32]
    )
# i.e. every layer takes a [37800,32] input (a sketch of one propagation step follows the printout)
    (1): GraphPropLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (GRU): GRU(64, 32)
    )
    (2): GraphPropLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (GRU): GRU(64, 32)
    )
    (3): GraphPropLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (GRU): GRU(64, 32)
    )
    (4): GraphPropLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (GRU): GRU(64, 32)
    )
  )
# i.e. the outputs of the 5 propagation layers together with the encoder output are collected into a [6,37800,32] tensor
  (_aggregator): GraphAggregator(
    (MLP1): Sequential(
      (0): Linear(in_features=32, out_features=256, bias=True)
    )
# i.e. [37800,32] x [32,256] -> [37800,256],
# which is split into two [37800,128] halves that are multiplied element-wise to give [37800,128],
# then summed per graph into [200,128]
    (MLP2): Sequential(
      (0): Linear(in_features=128, out_features=128, bias=True)
    )
# i.e. [200,128] x [128,128] -> [200,128]
  )
)
At this point each of the 200 graphs is represented by a 128-dimensional vector.
Splitting the matrix horizontally gives an upper and a lower [100,128] block, x and y respectively,
so each (x, y) row pair corresponds to one graph pair, one label, and one loss,
i.e. every batch yields a 100-dimensional loss vector.
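
As a reference for the shape bookkeeping in comments (1)-(4), here is a minimal sketch of one propagation step, assuming the usual gather / scatter-sum formulation; the exact concatenation order and message aggregation in the original code may differ.

import torch
import torch.nn as nn

# shapes as in the notes: node_states [N=37800, 32], from_idx/to_idx [E=99154], GRU(64, 32)
N, E, D = 37800, 99154, 32
node_states = torch.randn(N, D)
from_idx = torch.randint(0, N, (E,))
to_idx = torch.randint(0, N, (E,))

message_net = nn.Sequential(nn.Linear(2 * D, 64), nn.ReLU(), nn.Linear(64, 64))
reverse_message_net = nn.Sequential(nn.Linear(2 * D, 64), nn.ReLU(), nn.Linear(64, 64))
gru = nn.GRU(64, D)

# edge inputs: [source_state, target_state] -> [E, 64]
fwd = message_net(torch.cat([node_states[from_idx], node_states[to_idx]], dim=-1))
rev = reverse_message_net(torch.cat([node_states[to_idx], node_states[from_idx]], dim=-1))

# scatter-sum the per-edge messages onto their receiving nodes and add both directions: "(2)"
agg = (torch.zeros(N, 64).index_add_(0, to_idx, fwd)
       + torch.zeros(N, 64).index_add_(0, from_idx, rev))        # [37800, 64]

# GRU update: input (3) = [1, N, 64], hidden (4) = [1, N, 32] -> new node states [N, 32]
_, h = gru(agg.unsqueeze(0), node_states.unsqueeze(0))
node_states = h.squeeze(0)                                        # [37800, 32]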

GMN framework

GraphMatchingNet(
  (_encoder): GraphEncoder(
    (MLP1): Sequential(
      (0): Linear(in_features=90, out_features=32, bias=True)
    )
  )
  (_prop_layers): ModuleList(
    (0): GraphPropMatchingLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
# i.e. the message output is [37800,64] (2)

# the [37800,32] tensor (1) is split into 200 blocks of 189 rows each;
# for each adjacent pair of blocks x and y, a [189,189] similarity matrix is computed,
# which is multiplied by y and by x respectively to give attention_x and attention_y [189,32];
# these are collected into [200,189,32] and flattened back into [37800,32] (5)
# (5) = (1) - (5)
# then (2) and (5) are concatenated into [37800,96] (a sketch of this matching step follows the printout)
      (GRU): GRU(96, 32)
    )
    (1): GraphPropMatchingLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (GRU): GRU(96, 32)
    )
    (2): GraphPropMatchingLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (GRU): GRU(96, 32)
    )
    (3): GraphPropMatchingLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (GRU): GRU(96, 32)
    )
    (4): GraphPropMatchingLayer(
      (_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (_reverse_message_net): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
      (GRU): GRU(96, 32)
    )
  )
  (_aggregator): GraphAggregator(
    (MLP1): Sequential(
      (0): Linear(in_features=32, out_features=256, bias=True)
    )
    (MLP2): Sequential(
      (0): Linear(in_features=128, out_features=128, bias=True)
    )
  )
)
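
The cross-graph matching step annotated as (5) above can be sketched as follows, assuming each pair is stored as two adjacent blocks of 189 rows; the variable names are illustrative, not the library's.

import torch

def cross_graph_attention(h, n_nodes=189):
    # h is the flat [37800, 32] node-state matrix; consecutive 189-row blocks form a pair
    blocks = h.view(-1, 2, n_nodes, h.size(-1))            # [100, 2, 189, 32]
    x, y = blocks[:, 0], blocks[:, 1]                       # [100, 189, 32] each
    sim = torch.einsum('bid,bjd->bij', x, y)                # [100, 189, 189] similarities
    att_x = torch.softmax(sim, dim=2) @ y                   # what each x-node attends to in y
    att_y = torch.softmax(sim, dim=1).transpose(1, 2) @ x   # and vice versa
    att = torch.stack([att_x, att_y], dim=1).reshape_as(h)  # back to [37800, 32]
    return h - att                                          # "(5) = (1) - (5)" in the notes

h = torch.randn(37800, 32)
match = cross_graph_attention(h)
gru_input = torch.cat([torch.randn(37800, 64), match], dim=-1)   # -> [37800, 96] into GRU(96, 32)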

——————————————————————

GCN (with histogram)

Similarity classification

Each epoch re-reads all the batches.
Each batch holds 128 pairs, so there are 50 batches in total.
The model input for one graph pair is each graph's [189,104] node attribute matrix (one-hot encoded from a 1-D vector), the edges (988 and 1002 for the first pair), and the label target.
SimGNN(
# pass each of the two graphs through its own stack of three GCN layers
  (convolution_1): GCNConv(104, 128)
# i.e. [189,104] x [104,128] -> [189,128]
# 988 edges =add self-loops=> 1177 =aggregate & propagate=> 1177
  (relu)
  (dropout)
  (convolution_2): GCNConv(128, 64)
  (relu)
  (dropout)
  (convolution_3): GCNConv(64, 32)
# i.e. the output is [189,32]
  
  (histogram)
# i.e. multiply the two [189,32] matrices to get [189,189], flatten it to [35721,1], and compute a [1,16] histogram from it (see the sketch at the end of this section)

# pass each of the two [189,32] matrices through the attention module separately
  (attention): AttentionModule()
# i.e. [189,32] x [32,32] -> [189,32] =mean=> [32] =tanh=> [32] => [32,1]
# [189,32] x [32,1] =sigmoid=> [189,1]
# [32,189] x [189,1] -> [32,1]

# feed the two [32,1] vectors into the tensor network together
  (tensor_network): TenorNetworkModule()
# i.e. the original [32,32,16] weight tensor is first viewed as [32,512]
# [1,32] x [32,512] -> [1,512] =view=> [32,16]
# [16,32] x [32,1] -> [16,1]  (1)
# the original two [32,1] vectors are concatenated into [64,1]
# [16,64] x [64,1] -> [16,1] (2); (2) + (1) + the [16,1] bias =relu=> [16,1]
# which is transposed and concatenated with the hist into [1,32]

  (fully_connected_first): Linear(in_features=32, out_features=16, bias=True)
  (relu)
# i.e. [1,32] x [32,16] -> [1,16] (without the histogram the weight would be [16,16])
  (scoring_layer): Linear(in_features=16, out_features=1, bias=True)
  (sigmoid)
# [1,16] x [16,1] -> [1,1]
)
A loss is computed for every graph pair in the batch;
the losses within each batch are summed and that sum is backpropagated.
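
The (histogram) step can be sketched like this; normalizing by the total count is an assumption based on common SimGNN implementations.

import torch

def histogram_feature(emb_1, emb_2, bins=16):
    # node-embedding similarity scores of the two graphs, binned into a [1, bins] feature
    scores = emb_1 @ emb_2.t()                  # [189, 32] x [32, 189] -> [189, 189]
    scores = scores.view(-1)                    # -> [35721]
    hist = torch.histc(scores, bins=bins)       # count scores per bin
    hist = hist / hist.sum()                    # normalize (assumed, as in common implementations)
    return hist.view(1, -1)                     # [1, 16]

emb_1, emb_2 = torch.sigmoid(torch.randn(189, 32)), torch.sigmoid(torch.randn(189, 32))
feat = histogram_feature(emb_1, emb_2)          # later concatenated with the [1,16] tensor-network output -> [1,32]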

————————————

Graph-Bert

Graph classification

188 graphs: 169 for training, 19 for testing
Input

x                [169,28]      #  e^x 
Described as node attributes, but for each graph it is really the list of node indices stored in nx (networkx), padded up to the maximum node count with max_node_count + 1.
For example, before padding:
[0, 1, 13, 2, 3, 11, 4, 5, 6, 10, 7, 20, 8, 9, 15, 12, 14, 19, 16, 17, 18, 21, 22]
and after padding:
[0, 1, 13, 2, 3, 11, 4, 5, 6, 10, 7, 20, 8, 9, 15, 12, 14, 19, 16, 17, 18, 21, 22, 29, 29, 29, 29, 29]

d                [169,28]      #  e^d node degrees

w                [169,28,28]   #  e^w edge weights
which can be viewed as the adjacency matrices

wl 				 [169,28]      #  e^r WL
node_color_dict is filled with 1 at every node index, i.e. it is simply a dictionary of which nodes exist.
For example: {0: 1, 1: 1, 13: 1, 2: 1, 3: 1, 11: 1, 4: 1, 5: 1, 6: 1, 10: 1, 7: 1, 20: 1, 8: 1, 9: 1, 15: 1, 12: 1, 14: 1, 19: 1, 16: 1, 17: 1, 18: 1, 21: 1, 22: 1}
node_neighbor_dict traverses each node's neighborhood and records which nodes appear there, somewhat like an adjacency list.
For example: {0: {1: 1, 13: 1}, 1: {0: 1, 2: 1}, 13: {0: 1, 12: 1}, 2: {1: 1, 3: 1, 11: 1}, 3: {2: 1, 4: 1}, 11: {2: 1, 10: 1, 12: 1}, 4: {3: 1, 5: 1}, 5: {4: 1, 6: 1, 10: 1}, 6: {5: 1, 7: 1, 20: 1}, 10: {11: 1, 5: 1, 9: 1}, 7: {6: 1, 8: 1}, 20: {6: 1, 21: 1, 22: 1}, 8: {7: 1, 9: 1}, 9: {10: 1, 8: 1, 15: 1}, 15: {9: 1, 14: 1, 16: 1}, 12: {13: 1, 11: 1, 14: 1}, 14: {15: 1, 12: 1, 19: 1}, 19: {14: 1, 18: 1}, 16: {15: 1, 17: 1}, 17: {16: 1, 18: 1}, 18: {19: 1, 17: 1}, 21: {20: 1}, 22: {20: 1}}
Take node 13's neighbor dictionary {0: 1, 12: 1},
extract its values [1, 1],
prepend node_color_dict[13], which is just another 1, giving ['1', '1', '1'],
join them into '1_1_1',
and hash the string with hashlib.md5.
After collecting the hashes of all nodes, a deduplicated dictionary is built,
e.g. {'4eb90ba61276b0e27cee6f190e612949': 1, '9e8973112eebad7f27f0b762abd14d1e': 2, 'ec308451c1d095c528cfa3c009ea7235': 3},
and each node's hash is then mapped to the corresponding value through this dictionary,
e.g. {13: 1, 0: 1, 1: 1, 2: 2, 3: 1, 11: 2, 4: 1, 5: 2, 6: 2, 10: 2, 7: 1, 20: 2, 8: 1, 9: 2, 15: 2, 12: 2, 14: 2, 19: 1, 16: 1, 17: 1, 18: 1, 21: 3, 22: 3}.
The dictionary is updated over several such iterations;
finally, taking the dictionary's values gives this graph's WL labels [1, 1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 3, 3] (a sketch of one iteration follows below).
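
A compact sketch of one WL relabeling iteration as described above; sorting the neighbor colors before joining is an assumption (it is common in WL implementations), and the label numbering follows first-occurrence order, matching the example dictionaries.

import hashlib

def wl_iteration(node_color_dict, node_neighbor_dict):
    # hash each node's own color plus its (sorted) neighbor colors into a signature
    new_hashes = {}
    for node, neighbors in node_neighbor_dict.items():
        neighbor_colors = sorted(str(node_color_dict[n]) for n in neighbors)
        signature = "_".join([str(node_color_dict[node])] + neighbor_colors)   # e.g. '1_1_1'
        new_hashes[node] = hashlib.md5(signature.encode()).hexdigest()
    # deduplicate the hashes and map each distinct hash to the next integer label
    hash_to_label, new_colors = {}, {}
    for node, h in new_hashes.items():
        if h not in hash_to_label:
            hash_to_label[h] = len(hash_to_label) + 1
        new_colors[node] = hash_to_label[h]
    return new_colors

node_color_dict = {0: 1, 1: 1, 2: 1, 13: 1}                # every node starts with color 1
node_neighbor_dict = {0: {1: 1, 13: 1}, 1: {0: 1, 2: 1}, 13: {0: 1}, 2: {1: 1}}
for _ in range(3):                                          # several iterations, as in the notes
    node_color_dict = wl_iteration(node_color_dict, node_neighbor_dict)
wl_labels = list(node_color_dict.values())                  # the WL codes fed to wl_embeddings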



y_true 			 169
context_idx_list 0



MethodGraphBertGraphClassification(
# first w is passed through a 'none' residual handler, which actually just returns two Nones
# with the 'raw' residual option, w would instead go through the two linears below (784 = 28*28)
  (res_h): Linear(in_features=784, out_features=32, bias=True)
  (res_y): Linear(in_features=784, out_features=2, bias=True)
# these produce residual_h and residual_y respectively
# residual_h is added to the output of every BertLayer
# residual_y is added to the result of cls_y
  
  (bert): MethodGraphBert(
    (embeddings): BertEmbeddings(
      (raw_feature_embeddings): Linear(in_features=28, out_features=32, bias=True)
# i.e. processes w: [169,28,28] x [28,32] -> [169,28,32]
      (tag_embeddings): Embedding(1000, 32)
# i.e. processes x: [169,28] =embed=> [169,28,32]
      (degree_embeddings): Embedding(1000, 32)
# i.e. processes d: [169,28] =embed=> [169,28,32]
      (wl_embeddings): Embedding(1000, 32)
# i.e. processes wl: [169,28] =embed=> [169,28,32]
# the four embeddings are then summed
      (LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.5, inplace=False)
# output (1)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=32, out_features=32, bias=True)
              (key): Linear(in_features=32, out_features=32, bias=True)
              (value): Linear(in_features=32, out_features=32, bias=True)
# i.e. (1) goes through three [169,28,32] x [32,32] -> [169,28,32] projections to obtain Q, K, V
# Q, K, V are each viewed as [169,28,2,16]
# and permuted to [169,2,28,16]
# QK^T is computed: [169,2,28,16] x [169,2,16,28] -> [169,2,28,28]
# scaled by dividing by √16, then softmaxed
              (dropout): Dropout(p=0.3, inplace=False)
# after dropout it is multiplied by V: [169,2,28,28] x [169,2,28,16] -> [169,2,28,16]
# then permuted back to [169,28,2,16] and viewed as [169,28,32], giving output (2) (see the attention sketch after this section)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=32, out_features=32, bias=True)
# i.e. (2): [169,28,32] x [32,32] -> [169,28,32] (3)
              (dropout): Dropout(p=0.5, inplace=False)
# dropout is applied to (3)
              (LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
# (1) and (3) are summed and layer-normalized, giving (4)
# then (4) and (2) are summed and the result (5) is output
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=32, out_features=32, bias=True)
# i.e. (5): [169,28,32] x [32,32] -> [169,28,32],
# then passed through a GELU to give (6)
          )
          (output): BertOutput(
            (dense): Linear(in_features=32, out_features=32, bias=True)
# i.e. (6): [169,28,32] x [32,32] -> [169,28,32] (7)
            (dropout): Dropout(p=0.5, inplace=False)
            (LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
# (7) is dropped out, summed with (5), and layer-normalized, giving output (8)
          )
        )
        (1): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=32, out_features=32, bias=True)
              (key): Linear(in_features=32, out_features=32, bias=True)
              (value): Linear(in_features=32, out_features=32, bias=True)
              (dropout): Dropout(p=0.3, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=32, out_features=32, bias=True)
              (LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.5, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=32, out_features=32, bias=True)
          )
          (output): BertOutput(
            (dense): Linear(in_features=32, out_features=32, bias=True)
            (LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.5, inplace=False)
          )
        )
      )
    )
    (pooler): BertPooler(
# from the encoder output (9) [169,28,32], a [169,32] slice is taken
      (dense): Linear(in_features=32, out_features=32, bias=True)
# [169,32] x [32,32] -> [169,32]
      (activation): Tanh()
    )
# the tanh output (10) is returned together with (9); (10) is actually not used afterwards
  )
# (9) [169,28,32] is averaged over its second dimension, i.e. the mean of the 28 [169,32] slices, giving (11)
  (cls_y): Linear(in_features=32, out_features=2, bias=True)
# (11): [169,32] x [32,2] -> [169,2], followed by log_softmax
)
The output is a [169,2] matrix and the labels form a 169-dimensional vector.
The loss is computed with F.cross_entropy, whose internal nll_loss reduces the [169,2] scores against the labels to 169 per-graph losses and averages them into a single value,
i.e. each graph contributes one loss value.
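
For reference, a minimal sketch of the two-head self-attention reshaping annotated as (2) above; all variable names here are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

B, L, H, heads = 169, 28, 32, 2                 # graphs, max nodes, hidden size, attention heads
d_head = H // heads                             # 16
x = torch.randn(B, L, H)                        # the embedding output (1)

query, key, value = nn.Linear(H, H), nn.Linear(H, H), nn.Linear(H, H)

def split_heads(t):                             # [169,28,32] -> [169,2,28,16]
    return t.view(B, L, heads, d_head).permute(0, 2, 1, 3)

q, k, v = split_heads(query(x)), split_heads(key(x)), split_heads(value(x))
scores = q @ k.transpose(-1, -2) / d_head ** 0.5                 # [169,2,28,28], scaled by √16
probs = F.dropout(torch.softmax(scores, dim=-1), p=0.3, training=True)
context = probs @ v                                              # [169,2,28,28] x [169,2,28,16]
out = context.permute(0, 2, 1, 3).contiguous().view(B, L, H)     # back to [169,28,32]: output (2)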

————————————————————
RWGNN

Fold split: 188 graphs in total, 94 for testing, 84 for training, 10 for validation.
The graphs have different numbers of nodes; node features are always 7-dimensional.

graph_indicator_batch covers the nodes of the graphs in the current batch:
it records the values 0 .. j-i-1 (with i the batch's starting position and j = min(i+64, 84)),
i.e. batch 0 labels its graphs 0, 1, 2, ..., 63 and batch 1 likewise starts again from 0,
where each value is repeated as many times as the corresponding graph has nodes.
Its purpose is to mark, in the stacked node matrix, which graph each node belongs to (a small pooling sketch follows).
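
A small sketch of how such a graph indicator can be built and used to pool node rows per graph; the node counts here are made up (the real first batch has 64 graphs and 1155 nodes).

import torch

node_counts = torch.tensor([23, 17, 30])                    # nodes per graph in this toy batch
graph_indicator = torch.repeat_interleave(torch.arange(len(node_counts)), node_counts)
# -> tensor([0, 0, ..., 1, 1, ..., 2, ...]) of length 70

features = torch.randn(int(node_counts.sum()), 7)           # stacked node features [70, 7]
pooled = torch.zeros(len(node_counts), 7).index_add_(0, graph_indicator, features)
# pooled[i] is the sum of all feature rows whose graph_indicator equals i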



adj_train              per batch  [1155,1155]  [1161,1161]  [381,381]
features_train         per batch  [1155,7]     [1161,7]     [381,7]
graph_indicator_train  per batch  1155         1161         381
y_train                per batch  64           64           24
Taking the first batch as an example: 64 input graphs with 1155 nodes in total.
RW_NN(
# first graph_indicator_train is deduplicated to obtain the unique graph indices and each graph's node count (counts)

# take a learnable parameter of shape [hidden_graphs, (size_hidden_graphs*(size_hidden_graphs-1))//2], denoted adj_hidden [16, C(5,2)]
# and build adj_hidden_norm of shape [hidden_graphs, size_hidden_graphs, size_hidden_graphs], i.e. [16,5,5]
  (Relu)
# the upper triangle of each [5,5] matrix is filled with relu(adj_hidden), which is then added to its own transpose

  (fc): Linear(in_features=7, out_features=4, bias=True)
# i.e. [1155,7] x [7,4] -> [1155,4]
  (sigmoid): Sigmoid()
# the activated result is denoted x
# take a learnable parameter [hidden_graphs, size_hidden_graphs, hidden_dim], denoted z [16,5,4]
# torch.einsum("abc,dc->abd", (z, x)) gives zx, i.e. [16,5,4] x [1155,4] -> [16,5,1155]
  
# next come 2 rounds of the walk loop

# round 1
# first build 16 identity matrices of size 5, i.e. eye [16,5,5]
# torch.einsum("abc,acd->abd", (eye, z)) gives o, i.e. [16,5,5] x [16,5,4] -> [16,5,4]
# torch.einsum("abc,dc->abd", (o, x)) gives t, i.e. [16,5,4] x [1155,4] -> [16,5,1155]
  (dropout)
# t = zx ∘ t, i.e. [16,5,1155] ∘ [16,5,1155] -> [16,5,1155] element-wise product
# a zero matrix temp [16,5,64] is built from t and the number of graphs
# temp.index_add_(2, graph_indicator, t) then accumulates, which is equivalent to
#	for j in range(len(graph_indicator)):	# 1155 nodes
#		temp[:, :, graph_indicator[j]] += t[:, :, j]
# finally temp is summed via sum(temp, dim=1) and transposed, giving t [64,16]

# round 2
# x = adj_train @ x, i.e. [1155,1155] x [1155,4] -> [1155,4]
# torch.einsum("abc,acd->abd", (adj_hidden_norm, z)) gives z, i.e. [16,5,5] x [16,5,4] -> [16,5,4]
# torch.einsum("abc,dc->abd", (z, x)) gives t, i.e. [16,5,4] x [1155,4] -> [16,5,1155]
  (dropout)
# t = zx ∘ t, i.e. [16,5,1155] ∘ [16,5,1155] -> [16,5,1155] element-wise product
# a zero matrix temp [16,5,64] is built again
# temp.index_add_(2, graph_indicator, t) accumulates as above
# temp is summed via sum(temp, dim=1) and transposed, giving t [64,16]

# the t of the two rounds are concatenated into [64,32]
  (bn): BatchNorm1d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (fc1): Linear(in_features=32, out_features=32, bias=True)
# i.e. [64,32] x [32,32] -> [64,32]
  (relu): ReLU()
  (dropout): Dropout(p=0.2, inplace=False)
  (fc2): Linear(in_features=32, out_features=2, bias=True)
# i.e. [64,32] x [32,2] -> [64,2]
# followed by log_softmax
)
Each graph is mapped to a 2-dimensional vector, and the downstream loss is still a two-class cross-entropy, so each batch yields one loss value (a sketch of one walk step follows below).
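
Finally, a minimal sketch of one random-walk step in the terms used above, with random tensors standing in for the real batch.

import torch

# shapes as in the notes
M, S, D, N, G = 16, 5, 4, 1155, 64               # hidden graphs, their size, hidden dim, nodes, graphs
z = torch.randn(M, S, D)                          # hidden-graph node features
x = torch.sigmoid(torch.randn(N, D))              # projected input-graph node features
adj_hidden_norm = torch.rand(M, S, S)
adj = (torch.rand(N, N) < 0.01).float()           # stand-in for the batched adjacency matrix
graph_indicator = torch.randint(0, G, (N,))

zx = torch.einsum("abc,dc->abd", z, x)            # [16,5,1155]

# one walk step (round 2 in the notes): move on both the hidden and the input graphs
x = adj @ x                                        # [1155,1155] x [1155,4] -> [1155,4]
z = torch.einsum("abc,acd->abd", adj_hidden_norm, z)   # [16,5,5] x [16,5,4] -> [16,5,4]
t = zx * torch.einsum("abc,dc->abd", z, x)         # [16,5,1155], element-wise product with zx

# pool node columns into their graphs, then sum over the hidden-graph nodes
temp = torch.zeros(M, S, G).index_add_(2, graph_indicator, t)   # [16,5,64]
t = torch.sum(temp, dim=1).t()                                   # -> [64,16] per-graph features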