Transformer for Graphs: An Overview from Architecture Perspective (survey notes, to be updated)

Transformer for Graphs

Introduction

1、GA: GNNs as Auxiliary Modules

GNN blocks are inserted directly into the Transformer framework; the methods are then categorized by the relative position of the GNN with respect to the Transformer layers.

2、PE: Improved Positional Embedding from Graphs

Before the Transformer model is used, the graph's structural information is compressed into positional vectors (typically with graph algorithms, sometimes with learnable modules) and added to the Transformer's input.

3、AT: Improved Attention Matrices from Graphs

Graph information is injected into the computation of the attention coefficient matrix, e.g., by adding structural biases or by restricting each node to attend only to its neighbors.

Transformer Architecture for Graphs

GNNs as Auxiliary Modules in Transformer


1、three types

(1) building Transformer blocks on top of GNN blocks (a minimal sketch of this type is given after the list)

(2) alternately stacking GNN blocks and Transformer blocks

(3) parallelizing GNN blocks and Transformer blocks
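As a concrete illustration of type (1), the following is a minimal, hypothetical sketch (single-head attention, NumPy only): node features are first aggregated by a GCN-style layer to capture local structure, and the result is then fed to a standard self-attention block for global interactions. The function and variable names are my own and do not correspond to any specific paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gnn_block(adj, x, w):
    """One GCN-style layer: row-normalized neighborhood aggregation + linear map + ReLU."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    a_norm = a_hat / a_hat.sum(axis=1, keepdims=True)
    return np.maximum(a_norm @ x @ w, 0.0)

def attention_block(x, wq, wk, wv):
    """One single-head self-attention layer over all nodes."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

# Type (1): Transformer blocks stacked on top of GNN blocks.
# h = gnn_block(adj, x, w_gnn)          # local structure first
# out = attention_block(h, wq, wk, wv)  # then global interactions
```

Types (2) and (3) reuse the same two kinds of blocks but interleave them or run them in parallel and merge their outputs.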

2、examples

Improved Positional Embeddings from Graphs

1、Motivation: GA-style architectural modifications require heavy hyper-parameter searching, so PE methods explore a graph-encoding strategy that needs no adjustment of the Transformer architecture (only the embedding part is changed; the Transformer itself is left untouched)

2、compress the graph structure into positional embedding (PE) vectors and add them to the input before it is fed to the actual Transformer model

When constructing the PE from the graph structure, most methods rely on basic graph algorithms such as Laplacian eigenvectors or SVD-based vectors, though learnable approaches exist as well.
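
Below is a minimal sketch, assuming the Laplacian-eigenvector approach mentioned above: the k non-trivial eigenvectors of the symmetric normalized Laplacian serve as node positional embeddings and are added to the node features before the Transformer input. The function name and the projection matrix in the usage comment are hypothetical, not from any specific paper.

```python
import numpy as np

def laplacian_pe(adj, k):
    """Laplacian-eigenvector positional embeddings (hypothetical sketch).

    adj : (n, n) symmetric adjacency matrix
    k   : number of eigenvectors kept as the PE dimension
    """
    adj = adj.astype(float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # Eigenvectors sorted by ascending eigenvalue; drop the trivial constant one
    _, eigvecs = np.linalg.eigh(lap)
    return eigvecs[:, 1:k + 1]

# Usage: added to the node features before the Transformer,
#   x_in = x + laplacian_pe(adj, k) @ w_pe
# where w_pe is a (hypothetical) learned (k, d) projection matching the feature dimension d.
```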

3、examples

Improved Attention Matrices from Graphs

1、Motivation: the process of compressing the graph structure into fixed-size vectors suffers from information loss, which might limit the effectiveness of PE methods

2、improve the attention matrix computation based on graph information

These methods mainly change how the attention matrix A is computed, injecting the graph's structural information into the computation of A in different ways (see the sketch below).
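
The sketch below illustrates two common flavors of this idea in one (hypothetical, single-head) function: adding a structural bias indexed by shortest-path distance to the raw score matrix, and masking the scores so that each node attends only to itself and its neighbors. The names, the NumPy formulation, and the combination of both tricks in one function are my own simplifications.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_attention(x, adj, wq, wk, wv, spd=None, spd_bias=None):
    """Self-attention whose score matrix is modified by graph information.

    adj      : (n, n) adjacency matrix; zero entries are masked out, so each
               node attends only to itself and its neighbors
    spd      : optional (n, n) integer shortest-path-distance matrix
    spd_bias : optional 1-D array of (learnable) biases indexed by distance
    """
    n = x.shape[0]
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    if spd is not None and spd_bias is not None:
        scores = scores + spd_bias[spd]          # structural bias on the scores
    mask = (adj + np.eye(n)) > 0
    scores = np.where(mask, scores, -1e9)        # neighbor-only attention
    return softmax(scores) @ v
```

In practice a method usually picks one of the two: bias-style methods keep full attention and only reweight it, while mask-style methods hard-restrict the receptive field to the graph neighborhood.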

3、examples

Comparisons


1、The evaluated graph-specific Transformer modules lead to better performance (compared with the vanilla Transformer baseline in the first row of the paper's results table)

2、The improvement on graph-level tasks is more significant than that on node-level tasks (node-level tasks are run on a large-scale graph, but not on the original graph directly; sampled subgraphs are used instead)

3、GA and AT methods bring more benefits than PE methods

(1)PE does not contain intact graph information

(2)since PE is only fed into the input layer of the network, the graph-structural information decays layer by layer across the model

4、Different kinds of graph tasks favor different groups of models

(1)GA:node-level tasks

GA methods are able to better encode the local information of the sampled induced subgraphs in node-level tasks

(2)AT:graph-level tasks

AT methods are suitable for modeling the global information of the single graphs in graph-level tasks

Future Directions

1、New paradigm of incorporating the graph and the Transformer

one that not only takes graphs as a prior, but also better reflects the properties of graphs

2、Extending to other kinds of graphs

existing works mostly focus on homogeneous graphs; their potential on other forms of graphs, such as heterogeneous graphs and hypergraphs, remains to be explored

3、Extending to large-scale graphs
