Transformer for Graphs
Introduction
1、GA: GNNs as Auxiliary Modules
GNN blocks are inserted directly into the Transformer framework; methods are then categorized by the relative position of the GNN and Transformer blocks
2、PE: Improved Positional Embedding from Graphs
Before applying the Transformer model, a GNN (or graph algorithm) compresses the graph's structural information into positional embedding vectors, which are then added to the Transformer's input
3、AT: Improved Attention Matrices from Graphs
Graph information is injected when computing the attention coefficient matrix, e.g., by adding graph-derived bias terms or by restricting each node to attend only to its neighbors
Transformer Architecture for Graphs
GNNs as Auxiliary Modules in Transformer
1、three types
(1) building Transformer blocks on top of GNN blocks
(2) alternately stacking GNN blocks and Transformer blocks
(3) parallelizing GNN blocks and Transformer blocks
2、examples
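The first composition type above, a Transformer block on top of a GNN block, can be sketched with a minimal NumPy example. This is an illustrative sketch, not any specific paper's architecture: `gcn_layer` stands in for a generic message-passing block and `self_attention` for a single-head Transformer attention layer; all weight names and the toy graph are made up for the demo.

```python
import numpy as np

def softmax(x):
    # Numerically stable row-wise softmax.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def gcn_layer(A, X, W):
    # One GCN-style aggregation over the symmetric-normalized adjacency.
    A_hat = A + np.eye(A.shape[0])                     # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)  # ReLU

def self_attention(H, Wq, Wk, Wv):
    # Plain (graph-agnostic) single-head self-attention over all nodes.
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    return softmax(Q @ K.T / np.sqrt(K.shape[1])) @ V

rng = np.random.default_rng(0)
n, d = 5, 8
A = np.zeros((n, n))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:          # toy path graph
    A[u, v] = A[v, u] = 1.0
X = rng.normal(size=(n, d))
Wg, Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))

# Type (1): Transformer block on top of a GNN block --
# local structure is encoded first, then global attention mixes all nodes.
Z = self_attention(gcn_layer(A, X, Wg), Wq, Wk, Wv)
print(Z.shape)  # (5, 8)
```

Types (2) and (3) would simply interleave or run these two calls in parallel (summing or concatenating their outputs) instead of composing them.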
Improved Positional Embeddings from Graphs
1、GA methods require heavy hyper-parameter searching; this motivates exploring a graph-encoding strategy that needs no adjustment of the Transformer architecture (only the embedding part is modified; the Transformer itself is left untouched)
2、compress the graph structure into positional embedding (PE) vectors and add them to the input before it is fed to the actual Transformer model
When constructing PE from the graph structure, basic graph algorithms are most commonly used, e.g., Laplacian eigenvectors or SVD-based vectors, though learnable encodings also exist
3、examples
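The Laplacian-eigenvector PE mentioned above can be sketched in a few lines of NumPy. This is a generic illustration under simplifying assumptions (connected graph, no isolated nodes); the projection matrix `Wp` and the toy 4-cycle are invented for the demo.

```python
import numpy as np

def laplacian_pe(A, k):
    # Positional embeddings from eigenvectors of the symmetric-normalized Laplacian.
    d = A.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(A.shape[0]) - d_inv_sqrt @ A @ d_inv_sqrt
    vals, vecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    # Skip the trivial constant eigenvector; eigenvector signs are ambiguous,
    # so implementations often randomly flip them during training.
    return vecs[:, 1:k + 1]

# Toy 4-cycle graph
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    A[u, v] = A[v, u] = 1.0

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))               # node features
pe = laplacian_pe(A, k=2)                 # (4, 2) structural coordinates
Wp = rng.normal(size=(2, 8))              # project PE into the feature dimension
X_in = X + pe @ Wp                        # add to the input before the Transformer
print(pe.shape)  # (4, 2)
```

Each node thus carries a "coordinate" reflecting its structural position, which the otherwise permutation-invariant attention layers can exploit.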
Improved Attention Matrices from Graphs
1、the process of compressing graph structure into fixed-sized vectors suffers from information loss, which might limit their effectiveness
2、improve the attention matrix computation based on graph information
These methods mainly modify how the attention matrix A is computed, injecting the graph's structural information into the computation of A in different ways
3、examples
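One common instance of modifying the attention computation is to add a bias indexed by the shortest-path distance between two nodes (in the spirit of Graphormer's spatial encoding). Below is a hedged NumPy sketch: the `bias` values, weight matrices, and toy graph are all invented for illustration, and in practice the per-distance biases would be learned.

```python
import numpy as np
from collections import deque

def shortest_path_distances(A):
    # All-pairs shortest-path distances via BFS on an unweighted graph.
    n = A.shape[0]
    spd = np.full((n, n), n, dtype=int)   # distance n == "unreachable" sentinel
    for s in range(n):
        spd[s, s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in np.nonzero(A[u])[0]:
                if spd[s, v] > spd[s, u] + 1:
                    spd[s, v] = spd[s, u] + 1
                    q.append(v)
    return spd

def graph_biased_attention(X, Wq, Wk, Wv, bias_table, spd):
    # Each attention score gets a scalar bias looked up by shortest-path distance.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1]) + bias_table[spd]
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return (e / e.sum(axis=-1, keepdims=True)) @ V

rng = np.random.default_rng(0)
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3)]:     # toy path graph
    A[u, v] = A[v, u] = 1.0
spd = shortest_path_distances(A)

d = 8
X = rng.normal(size=(4, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
# One scalar per distance value (index 4 covers the unreachable sentinel).
# A very negative bias effectively masks a pair, i.e. "attend only to neighbors".
bias = np.array([0.0, 0.5, -1.0, -2.0, -1e9])
out = graph_biased_attention(X, Wq, Wk, Wv, bias, spd)
print(spd[0, 3], out.shape)  # 3 (4, 8)
```

Setting all biases beyond distance 1 to a large negative value recovers the neighbor-masking strategy mentioned in the introduction.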
Comparisons
1、the evaluated graph-specific modules of Transformer lead to better performance (compared with the first row, the plain Transformer)
2、The improvement on graph-level tasks is more significant than that on node-level tasks (node-level tasks are performed on a large-scale graph, but not on the original graph — on sampled subgraphs instead)
3、GA and AT methods bring more benefits than PE methods
(1)PE does not preserve the complete graph information
(2)since PE is only fed into the input layer of the network, the graph-structural information decays layer by layer across the model
4、different kinds of graph tasks favor different groups of models
(1)GA:node-level tasks
GA methods are able to better encode the local information of the sampled induced subgraphs in node-level tasks
(2)AT:graph-level tasks
AT methods are suitable for modeling the global information of the single graphs in graph-level tasks
Future Directions
1、New paradigm of incorporating the graph and the Transformer
not just takes graphs as a prior, but also better reflects the properties of graphs
2、Extending to other kinds of graphs
existing work mostly focuses on homogeneous graphs; their potential on other forms of graphs, such as heterogeneous graphs and hypergraphs, remains to be explored
3、Extending to large-scale graphs