4,500 positive examples (similar pairs) and 4,500 negative examples
Training set: 6,299 pairs; test set: 2,701 pairs
————————————————
Graph Match
Similarity classification
GNN framework (GraphEmbeddingNet)
Each epoch runs through all batches, 63 batches in total
Each input batch holds 100 graph pairs; every graph has 189 nodes, so 189*2*100 = 37,800 nodes in total
node_features [37800, 90]
edge_features None
from_idx 99154
to_idx 99154
graph_idx 37800
labels 100
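A minimal sketch of how one such packed batch can be laid out (the graph_idx construction and the ±1 labels are assumptions; the note only lists the shapes):

import torch

# One packed batch, following the shapes listed above.
n_pairs, n_nodes, feat_dim = 100, 189, 90
total_nodes = 2 * n_pairs * n_nodes                      # 189 * 2 * 100 = 37800

node_features = torch.randn(total_nodes, feat_dim)       # [37800, 90]
# graph_idx[i] is the id (0..199) of the graph that node i belongs to (assumed layout)
graph_idx = torch.arange(2 * n_pairs).repeat_interleave(n_nodes)   # [37800]
# from_idx / to_idx are edge endpoints indexing into the packed node tensor
# (99154 edges in the example batch); fabricated here just for illustration
from_idx = torch.randint(0, total_nodes, (99154,))
to_idx = torch.randint(0, total_nodes, (99154,))
labels = torch.randint(0, 2, (n_pairs,)) * 2 - 1          # assumed +1 similar / -1 dissimilar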
GraphEmbeddingNet(
(_encoder): GraphEncoder(
(MLP1): Sequential(
(0): Linear(in_features=90, out_features=32, bias=True)
)
# i.e. [37800,90] x [90,32] -> [37800,32]  (1)
)
(_prop_layers): ModuleList(
# i.e. the [2,99154,32] edge inputs (source/target node states) are concatenated into a [99154,64] edge_inputs tensor as input
(0): GraphPropLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
# i.e. [99154,64] x [64,64] =ReLU=> x [64,64] -> [99154,64],
# then aggregated into [37800,64] and passed on to the reverse message net
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
# i.e. the current output is [37800,64]; the matrix addition leaves the size unchanged  (2)
# (2) is unsqueezed to [1,37800,64]  (3), and (1) is unsqueezed to [1,37800,32]  (4)
# (3) and (4) are fed into the GRU together
(GRU): GRU(64, 32)
# i.e. the [1,37800,32] output is squeezed back to [37800,32]
)
# i.e. each propagation layer takes a [37800,32] input
(1): GraphPropLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(GRU): GRU(64, 32)
)
(2): GraphPropLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(GRU): GRU(64, 32)
)
(3): GraphPropLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(GRU): GRU(64, 32)
)
(4): GraphPropLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(GRU): GRU(64, 32)
)
)
# i.e. the 5 propagation-layer outputs and the encoder output are stacked into a [6,37800,32] tensor
(_aggregator): GraphAggregator(
(MLP1): Sequential(
(0): Linear(in_features=32, out_features=256, bias=True)
)
# i.e. [37800,32] x [32,256] -> [37800,256],
# split into a left and a right [37800,128] half, multiplied element-wise to give [37800,128],
# then aggregated per graph into [200,128]
(MLP2): Sequential(
(0): Linear(in_features=128, out_features=128, bias=True)
)
# i.e. [200,128] x [128,128] -> [200,128]
)
)
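A minimal sketch of the gated aggregation just described; the sigmoid on the left half is an assumption (the note only mentions an element-wise product of the two halves):

import torch

node_states = torch.randn(37800, 32)                      # output of the last prop layer
graph_idx = torch.arange(200).repeat_interleave(189)      # node -> graph id

h = torch.nn.Linear(32, 256)(node_states)                 # MLP1: [37800, 256]
gates, values = h[:, :128], h[:, 128:]                    # two [37800, 128] halves
gated = torch.sigmoid(gates) * values                     # [37800, 128] (sigmoid assumed)

graph_states = torch.zeros(200, 128)
graph_states.index_add_(0, graph_idx, gated)              # per-graph sum -> [200, 128]
graph_states = torch.nn.Linear(128, 128)(graph_states)    # MLP2: [200, 128]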
At this point each of the 200 graphs is represented by a 128-dimensional vector.
Splitting the [200,128] matrix horizontally gives an upper and a lower [100,128] block, x and y respectively;
each (x, y) row pair corresponds to one graph pair, one label, and one loss,
so each batch yields a 100-dimensional loss vector.
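A minimal sketch of this pairing and loss step; the Euclidean distance and margin form are assumptions, the note only states that each pair yields one loss:

import torch

graph_states = torch.randn(200, 128)                      # aggregator output
x, y = graph_states[:100], graph_states[100:]             # upper / lower halves, [100, 128] each
labels = torch.randint(0, 2, (100,)) * 2 - 1              # assumed +1 / -1 per pair

dist = torch.sum((x - y) ** 2, dim=1)                     # [100] squared distances (assumed)
loss = torch.relu(1.0 - labels * (1.0 - dist))            # [100], one loss per pair (margin form assumed)
batch_loss = loss.mean()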
GMN framework (GraphMatchingNet)
GraphMatchingNet(
(_encoder): GraphEncoder(
(MLP1): Sequential(
(0): Linear(in_features=90, out_features=32, bias=True)
)
)
(_prop_layers): ModuleList(
(0): GraphPropMatchingLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
# i.e. the output is [37800,64]  (2)
# the [37800,32] tensor (1) is split into 200 blocks of 189 rows each;
# for each adjacent block pair x and y, first compute a [189,189] similarity matrix,
# then multiply it with y and x respectively to get a pair of attentions attention_x, attention_y [189,32]
# collect them into [200,189,32] and reshape back into [37800,32]  (5)
# (5) = (1) - (5)
# then concatenate (2) and (5) into [37800,96]
(GRU): GRU(96, 32)
)
(1): GraphPropMatchingLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(GRU): GRU(96, 32)
)
(2): GraphPropMatchingLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(GRU): GRU(96, 32)
)
(3): GraphPropMatchingLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(GRU): GRU(96, 32)
)
(4): GraphPropMatchingLayer(
(_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(_reverse_message_net): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Linear(in_features=64, out_features=64, bias=True)
)
(GRU): GRU(96, 32)
)
)
(_aggregator): GraphAggregator(
(MLP1): Sequential(
(0): Linear(in_features=32, out_features=256, bias=True)
)
(MLP2): Sequential(
(0): Linear(in_features=128, out_features=128, bias=True)
)
)
)
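A minimal sketch of the cross-graph attention step inside each GraphPropMatchingLayer, following the comments above; the softmax normalization and the pairing of adjacent 189-row blocks are assumptions:

import torch

n_pairs, n_nodes, dim = 100, 189, 32
h = torch.randn(2 * n_pairs, n_nodes, dim)                # (1) reshaped into 200 blocks of 189 rows

x, y = h[0::2], h[1::2]                                   # adjacent blocks form a pair, [100, 189, 32]
sim = torch.einsum('bik,bjk->bij', x, y)                  # [100, 189, 189] similarity per pair
a_x = torch.softmax(sim, dim=2) @ y                       # attention_x: [100, 189, 32]
a_y = torch.softmax(sim, dim=1).transpose(1, 2) @ x       # attention_y: [100, 189, 32]

# (5) = (1) - attention, stitched back into the packed [37800, 32] layout
cross = torch.stack([x - a_x, y - a_y], dim=1).reshape(-1, dim)    # [37800, 32]
# cross is then concatenated with the [37800, 64] messages (2) into the [37800, 96] GRU input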
——————————————————————
GCN (with histogram)
Similarity classification
Each epoch re-reads all batches
Each batch holds 128 pairs, i.e. 50 batches in total
Model input for one graph pair: the [189,104] node-attribute matrices (one-hot encoded from 1-D label vectors), the edge lists (988 and 1002 edges for the first pair), and the label target
SimGNN(
# pass each of the two graphs through its own three GCN layers
(convolution_1): GCNConv(104, 128)
# i.e. [189,104] x [104,128] -> [189,128]
# 988 edges =add self-loops=> 1177 =aggregate/propagate=> 1177
(relu)
(dropout)
(convolution_2): GCNConv(128, 64)
(relu)
(dropout)
(convolution_3): GCNConv(64, 32)
# i.e. output [189,32]
(histogram)
# i.e. multiply the two [189,32] matrices to get [189,189], flatten to [35721,1], and compute a [1,16] histogram from it
# pass each of the two [189,32] matrices through its own attention module
(attention): AttentionModule()
# i.e. [189,32] x [32,32] -> [189,32] =mean=> 32 =tanh=> 32 => [32,1]
# [189,32] x [32,1] =sigmoid=> [189,1]
# [32,189] x [189,1] -> [32,1]
# both [32,1] graph embeddings are fed into the tensor layer
(tensor_network): TenorNetworkModule()
# i.e. first view the original [32,32,16] weight tensor as [32,512]
# [1,32] x [32,512] -> [1,512] =view=> [32,16]
# [16,32] x [32,1] -> [16,1]  (1)
# cat the original two [32,1] embeddings into [64,1]
# [16,64] x [64,1] -> [16,1]  (2); then (2) + (1) + [16,1] bias =ReLU=> [16,1]
# transpose and concatenate with the histogram into [1,32]
(fully_connected_first): Linear(in_features=32, out_features=16, bias=True)
(relu)
# i.e. [1,32] x [32,16] -> [1,16] (without the histogram the weight would be [16,16])
(scoring_layer): Linear(in_features=16, out_features=1, bias=True)
(sigmoid)
#[1,16]x[16,1]->[1,1]
)
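A minimal sketch of the histogram feature and the attention pooling described in the comments above; torch.histc with 16 bins and the histogram normalization are assumptions based on the shapes in the trace:

import torch

h1 = torch.randn(189, 32)                                 # GCN output of graph 1
h2 = torch.randn(189, 32)                                 # GCN output of graph 2

# histogram feature: pairwise node-score matrix flattened into a 16-bin histogram
scores = torch.mm(h1, h2.t()).view(-1, 1)                 # [35721, 1]
hist = torch.histc(scores, bins=16)
hist = (hist / hist.sum()).view(1, -1)                    # [1, 16] (normalization assumed)

# attention pooling: global context -> sigmoid node weights -> weighted sum
W = torch.randn(32, 32)                                   # attention weight matrix
context = torch.tanh(torch.mean(torch.mm(h1, W), dim=0))  # [32]
weights = torch.sigmoid(torch.mm(h1, context.view(-1, 1)))  # [189, 1]
embedding1 = torch.mm(h1.t(), weights)                    # [32, 1]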
A loss is computed for each graph pair in the batch,
and the losses are summed per batch for backpropagation.
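And a sketch of the tensor-network scoring plus the final layers, following the shapes in the comments (all weights here are random placeholders, not the module's trained parameters):

import torch
import torch.nn.functional as F

k = 16                                                    # number of tensor slices
e1, e2 = torch.randn(32, 1), torch.randn(32, 1)           # attention-pooled graph embeddings

W = torch.randn(32, 32, k)                                # [32, 32, 16] tensor weights
V = torch.randn(k, 64)                                    # weights on the concatenated pair
b = torch.randn(k, 1)                                     # bias

scoring = torch.mm(e1.t(), W.view(32, -1)).view(32, k)    # [1, 512] -> [32, 16]
term1 = torch.mm(scoring.t(), e2)                         # [16, 1]   (1)
term2 = torch.mm(V, torch.cat([e1, e2]))                  # [16, 1]   (2)
ntn = F.relu(term1 + term2 + b)                           # [16, 1]

hist = torch.randn(1, 16)                                 # the [1, 16] histogram from above
x = torch.cat([ntn.t(), hist], dim=1)                     # [1, 32]
x = F.relu(torch.mm(x, torch.randn(32, 16)))              # fully_connected_first
score = torch.sigmoid(torch.mm(x, torch.randn(16, 1)))    # scoring_layer -> [1, 1]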
————————————
Graph-BERT
Category classification
188 graphs in total: 169 for training, 19 for testing
Inputs
x [169,28] # e^x
Nominally node attributes; in fact, for each graph it is just the node-index list stored in nx (networkx), padded with (max node count + 1) where the graph has fewer nodes
For example:
[0, 1, 13, 2, 3, 11, 4, 5, 6, 10, 7, 20, 8, 9, 15, 12, 14, 19, 16, 17, 18, 21, 22]
↓
[0, 1, 13, 2, 3, 11, 4, 5, 6, 10, 7, 20, 8, 9, 15, 12, 14, 19, 16, 17, 18, 21, 22, 29, 29, 29, 29, 29]
d [169,28] # e^d node degrees
w [169,28,28] # e^w edge weights
which can be viewed as adjacency matrices
wl [169,28] # e^r WL
node_color_dict is keyed by node index with value 1, i.e. a dict recording which nodes exist.
For example: {0: 1, 1: 1, 13: 1, 2: 1, 3: 1, 11: 1, 4: 1, 5: 1, 6: 1, 10: 1, 7: 1, 20: 1, 8: 1, 9: 1, 15: 1, 12: 1, 14: 1, 19: 1, 16: 1, 17: 1, 18: 1, 21: 1, 22: 1}
node_neighbor_dict is built by traversing 2-hop neighbors; it also records which nodes exist, somewhat like an adjacency list.
For example: {0: {1: 1, 13: 1}, 1: {0: 1, 2: 1}, 13: {0: 1, 12: 1}, 2: {1: 1, 3: 1, 11: 1}, 3: {2: 1, 4: 1}, 11: {2: 1, 10: 1, 12: 1}, 4: {3: 1, 5: 1}, 5: {4: 1, 6: 1, 10: 1}, 6: {5: 1, 7: 1, 20: 1}, 10: {11: 1, 5: 1, 9: 1}, 7: {6: 1, 8: 1}, 20: {6: 1, 21: 1, 22: 1}, 8: {7: 1, 9: 1}, 9: {10: 1, 8: 1, 15: 1}, 15: {9: 1, 14: 1, 16: 1}, 12: {13: 1, 11: 1, 14: 1}, 14: {15: 1, 12: 1, 19: 1}, 19: {14: 1, 18: 1}, 16: {15: 1, 17: 1}, 17: {16: 1, 18: 1}, 18: {19: 1, 17: 1}, 21: {20: 1}, 22: {20: 1}}
Take node 13's neighbor dict {0: 1, 12: 1},
extract its values [1, 1],
prepend node_color_dict[13] (which is just another 1), giving ['1', '1', '1'],
join these into '1_1_1',
and hash it with hashlib.md5.
Collecting the hashes of all nodes yields a deduplicated dict,
for example {'4eb90ba61276b0e27cee6f190e612949': 1, '9e8973112eebad7f27f0b762abd14d1e': 2, 'ec308451c1d095c528cfa3c009ea7235': 3},
and each node's hash is then mapped to the corresponding value of this dict,
for example {13: 1, 0: 1, 1: 1, 2: 2, 3: 1, 11: 2, 4: 1, 5: 2, 6: 2, 10: 2, 7: 1, 20: 2, 8: 1, 9: 2, 15: 2, 12: 2, 14: 2, 19: 1, 16: 1, 17: 1, 18: 1, 21: 3, 22: 3}.
This dict is updated over several such iterations;
finally, taking the dict's values gives this graph's WL labels: [1, 1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 3, 3] (see the sketch after the input list).
y_true 169
context_idx_list 0
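A minimal sketch of the WL relabelling described above (md5 hashing of a node's own label joined with its neighbors' labels); the function name, the neighbor-dict structure and the number of iterations are assumptions:

import hashlib

def wl_labels(neighbors, iterations=2):
    # neighbors: dict node -> list of neighbor nodes (assumed structure)
    colors = {n: 1 for n in neighbors}                    # initial color 1, as in node_color_dict
    for _ in range(iterations):
        new_colors = {}
        for n, nbrs in neighbors.items():
            # prepend the node's own color to its neighbors' colors and hash the joined string
            parts = [str(colors[n])] + [str(colors[m]) for m in nbrs]
            new_colors[n] = hashlib.md5('_'.join(parts).encode()).hexdigest()
        # map each distinct hash to a small integer label (the deduplicated dict)
        mapping = {h: i + 1 for i, h in enumerate(dict.fromkeys(new_colors.values()))}
        colors = {n: mapping[h] for n, h in new_colors.items()}
    return list(colors.values())

# e.g. wl_labels({0: [1], 1: [0, 2], 2: [1]}) labels a 3-node path graph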
MethodGraphBertGraphClassification(
# first w goes through a 'none' residual step, which in fact just returns two Nones
# with 'raw' residual handling, w would instead go through the two Linear layers below (784 = 28*28)
(res_h): Linear(in_features=784, out_features=32, bias=True)
(res_y): Linear(in_features=784, out_features=2, bias=True)
# producing residual_h and residual_y respectively
# residual_h is added to the output of every BertLayer
# residual_y is added to the cls_y result
(bert): MethodGraphBert(
(embeddings): BertEmbeddings(
(raw_feature_embeddings): Linear(in_features=28, out_features=32, bias=True)
# i.e. processes w: [169,28,28] x [28,32] -> [169,28,32]
(tag_embeddings): Embedding(1000, 32)
# i.e. processes x: [169,28] =embed=> [169,28,32]
(degree_embeddings): Embedding(1000, 32)
# i.e. processes d: [169,28] =embed=> [169,28,32]
(wl_embeddings): Embedding(1000, 32)
# i.e. processes wl: [169,28] =embed=> [169,28,32]
# the four embeddings are then summed
(LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.5, inplace=False)
# output  (1)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0): BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=32, out_features=32, bias=True)
(key): Linear(in_features=32, out_features=32, bias=True)
(value): Linear(in_features=32, out_features=32, bias=True)
# i.e. apply three [169,28,32] x [32,32] -> [169,28,32] projections to (1) to get Q, K, V
# view Q, K, V as [169,28,2,16] each,
# permute to [169,2,28,16]
# compute QK^T: [169,2,28,16] x [169,2,16,28] -> [169,2,28,28]
# scale by dividing by √16, then softmax
(dropout): Dropout(p=0.3, inplace=False)
# after dropout, multiply by V: [169,2,28,28] x [169,2,28,16] -> [169,2,28,16]
# permute back to [169,28,2,16] and view as [169,28,32], output  (2)
)
(output): BertSelfOutput(
(dense): Linear(in_features=32, out_features=32, bias=True)
# i.e. (2): [169,28,32] x [32,32] -> [169,28,32]  (3)
(dropout): Dropout(p=0.5, inplace=False)
# dropout on (3)
(LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
# (1) and (3) are summed and layer-normalized to give (4)
# (4) is then summed with (2) and output as (5)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=32, out_features=32, bias=True)
# i.e. (5): [169,28,32] x [32,32] -> [169,28,32],
# then through a GELU activation to give (6)
)
(output): BertOutput(
(dense): Linear(in_features=32, out_features=32, bias=True)
# i.e. (6): [169,28,32] x [32,32] -> [169,28,32]  (7)
(dropout): Dropout(p=0.5, inplace=False)
(LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
# dropout on (7), sum with (5), layer-normalize, and output  (8)
)
)
(1): BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=32, out_features=32, bias=True)
(key): Linear(in_features=32, out_features=32, bias=True)
(value): Linear(in_features=32, out_features=32, bias=True)
(dropout): Dropout(p=0.3, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=32, out_features=32, bias=True)
(LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.5, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=32, out_features=32, bias=True)
)
(output): BertOutput(
(dense): Linear(in_features=32, out_features=32, bias=True)
(LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.5, inplace=False)
)
)
)
)
(pooler): BertPooler(
# take a [169,32] slice (the first of the 28 positions) of the output (9) [169,28,32]
(dense): Linear(in_features=32, out_features=32, bias=True)
#[169,32]x[32,32]->[169,32]
(activation): Tanh()
)
# the tanh output (10) is returned together with (9); (10) does not actually seem to be used anywhere
)
# (9) [169,28,32] is averaged over the second dimension, i.e. the mean of the 28 [169,32] matrices, giving (11)
(cls_y): Linear(in_features=32, out_features=2, bias=True)
# (11): [169,32] x [32,2] -> [169,2], then log_softmax
)
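A minimal sketch of the multi-head self-attention reshape traced in the comments above (2 heads of size 16; the projections here are fresh Linear layers, not the trained ones):

import math
import torch

batch, seq, dim, heads = 169, 28, 32, 2
head_dim = dim // heads                                   # 16

x = torch.randn(batch, seq, dim)                          # embedding output (1)
Wq, Wk, Wv = (torch.nn.Linear(dim, dim) for _ in range(3))

def split_heads(t):
    # [169, 28, 32] -> [169, 28, 2, 16] -> [169, 2, 28, 16]
    return t.view(batch, seq, heads, head_dim).permute(0, 2, 1, 3)

q, k, v = split_heads(Wq(x)), split_heads(Wk(x)), split_heads(Wv(x))
scores = q @ k.transpose(-1, -2) / math.sqrt(head_dim)    # [169, 2, 28, 28]
probs = torch.softmax(scores, dim=-1)                     # softmax over the key positions
ctx = probs @ v                                           # [169, 2, 28, 16]
out = ctx.permute(0, 2, 1, 3).reshape(batch, seq, dim)    # back to [169, 28, 32]   (2)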
The output is a [169,2] matrix and the labels form a vector of length 169.
The loss is computed with F.cross_entropy, which applies log_softmax followed by nll_loss: the [169,2] predictions are reduced to 169 per-graph negative log-likelihoods against the labels and averaged into a single value,
i.e. each graph contributes one loss value.
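A minimal sketch of this classification head and loss (mean over the 28 positions, a linear classifier, then cross-entropy); the classifier here is a fresh Linear layer standing in for cls_y:

import torch
import torch.nn.functional as F

seq_output = torch.randn(169, 28, 32)                     # (9): encoder output
labels = torch.randint(0, 2, (169,))                      # one class label per graph

pooled = seq_output.mean(dim=1)                           # (11): [169, 32]
logits = torch.nn.Linear(32, 2)(pooled)                   # cls_y stand-in: [169, 2]
loss = F.cross_entropy(logits, labels)                    # 169 per-graph NLLs averaged to one scalar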
————————————————————
RWGNN
Split into folds: 188 graphs in total, 94 for testing, 84 for training, 10 for validation
The graphs have different node counts; node features are all 7-dimensional
graph_indicator_batch marks, for each node of the graphs in the current batch, which graph it belongs to:
it takes values 0 .. j-i (i is the start index of the current batch, j = min(i+64, 84)),
i.e. batch 0 corresponds to 0, 1, 2, ..., 64 and batch 1 corresponds to 0, 1, ..., 64,
where each value is repeated as many times as that graph has nodes;
its role is to tag, in the packed node matrix, the index of the graph each node belongs to
adj_train per batch: [1155,1155] [1161,1161] [381,381]
features_train per batch: [1155,7] [1161,7] [381,7]
graph_indicator_train per batch: 1155 1161 381
y_train per batch: 64 64 24
Taking the first batch as an example: the input is 64 graphs with 1155 nodes in total
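A minimal sketch of how such a graph_indicator vector can be built and used (the node counts below are made up; the real first batch has 1155 nodes):

import torch

node_counts = torch.randint(10, 30, (64,))                # assumed per-graph node counts
graph_indicator = torch.arange(64).repeat_interleave(node_counts)   # one graph id per node

# pooling node-level values into per-graph sums with the indicator
node_values = torch.randn(int(node_counts.sum()), 4)
graph_sums = torch.zeros(64, 4).index_add_(0, graph_indicator, node_values)   # [64, 4]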
RW_NN(
# first deduplicate graph_indicator_train to get the graph indices (unique) and per-graph node counts (counts)
# take a learnable parameter of shape [hidden_graphs, (size_hidden_graphs*(size_hidden_graphs-1))//2], i.e. [16, C(5,2)] = [16,10], denoted adj_hidden
# and build adj_hidden_norm of shape [hidden_graphs, size_hidden_graphs, size_hidden_graphs] = [16,5,5]
(Relu)
# fill the upper triangle of each [5,5] matrix with relu(adj_hidden), then add its transpose
(fc): Linear(in_features=7, out_features=4, bias=True)
# i.e. [1155,7] x [7,4] -> [1155,4]
(sigmoid): Sigmoid()
# denote the activated result x
# take a learnable parameter of shape [hidden_graphs, size_hidden_graphs, hidden_dim], denoted z [16,5,4]
# multiply via torch.einsum("abc,dc->abd", (z, x)) to get zx: [16,5,4] x [1155,4] -> [16,5,1155]
# then 2 loop iterations follow
# iteration 1
# first build 16 identity matrices of size 5, i.e. eye [16,5,5]
# multiply via torch.einsum("abc,acd->abd", (eye, z)) to get o: [16,5,5] x [16,5,4] -> [16,5,4]
# multiply via torch.einsum("abc,dc->abd", (o, x)) to get t: [16,5,4] x [1155,4] -> [16,5,1155]
(dropout)
# t = zx ∘ t, i.e. [16,5,1155] ∘ [16,5,1155] -> [16,5,1155] element-wise multiplication
# build a zero tensor temp [16,5,64] from t and the total number of graphs
# then accumulate via temp.index_add_(2, graph_indicator, t), which is equivalent to
# for j in range(len(graph_indicator)):   # 1155 nodes
#     temp[:, :, graph_indicator[j]] += t[:, :, j]
# then sum temp over dim=1 and transpose to get t [64,16]
# iteration 2
# x = adj_train * x, i.e. [1155,1155] x [1155,4] -> [1155,4]
# multiply via torch.einsum("abc,acd->abd", (adj_hidden_norm, z)) to get z: [16,5,5] x [16,5,4] -> [16,5,4]
# multiply via torch.einsum("abc,dc->abd", (z, x)) to get t: [16,5,4] x [1155,4] -> [16,5,1155]
(dropout)
# t = zx ∘ t, i.e. [16,5,1155] ∘ [16,5,1155] -> [16,5,1155] element-wise multiplication
# build a zero tensor temp [16,5,64] as before
# accumulate via temp.index_add_(2, graph_indicator, t)
# sum temp over dim=1 and transpose to get t [64,16]
# concatenate the t from both iterations to get [64,32]
(bn): BatchNorm1d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(fc1): Linear(in_features=32, out_features=32, bias=True)
# i.e. [64,32] x [32,32] -> [64,32]
(relu): ReLU()
(dropout): Dropout(p=0.2, inplace=False)
(fc2): Linear(in_features=32, out_features=2, bias=True)
# i.e. [64,32] x [32,2] -> [64,2]
#log_softmax
)
Each graph ends up with a 2-dimensional vector; downstream this is still a binary classification with cross-entropy, i.e. each batch yields one loss.
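A minimal sketch of the two random-walk steps traced in the comments (einsum products, dropout omitted, a dense identity matrix standing in for the sparse batch adjacency; the real code's normalization and loop details may differ):

import torch

hidden_graphs, hidden_size, hidden_dim, n_graphs, n_nodes = 16, 5, 4, 64, 1155

x = torch.sigmoid(torch.randn(n_nodes, hidden_dim))       # fc + sigmoid output
adj = torch.eye(n_nodes)                                  # stand-in for the [1155, 1155] adjacency
adj_hidden_norm = torch.randn(hidden_graphs, hidden_size, hidden_size)   # symmetrized hidden adjacency
z = torch.randn(hidden_graphs, hidden_size, hidden_dim)   # learnable hidden-node features
graph_indicator = torch.randint(0, n_graphs, (n_nodes,))  # node -> graph id

zx = torch.einsum('abc,dc->abd', z, x)                    # [16, 5, 1155]
outputs = []
for step in range(2):
    if step == 0:                                         # iteration 1: identity on the hidden graphs
        eye = torch.eye(hidden_size).expand(hidden_graphs, -1, -1)
        o = torch.einsum('abc,acd->abd', eye, z)
    else:                                                 # iteration 2: one walk step on both graphs
        x = torch.mm(adj, x)
        o = torch.einsum('abc,acd->abd', adj_hidden_norm, z)
    t = zx * torch.einsum('abc,dc->abd', o, x)            # [16, 5, 1155] element-wise product
    temp = torch.zeros(hidden_graphs, hidden_size, n_graphs)
    temp.index_add_(2, graph_indicator, t)                # scatter node columns into their graphs
    outputs.append(torch.sum(temp, dim=1).t())            # [64, 16]

features = torch.cat(outputs, dim=1)                      # [64, 32] -> bn, fc1, relu, dropout, fc2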