Transformer for Graphs: An Overview from Architecture Perspective (survey notes, to be updated)

Transformer for Graphs

Introduction

1、GA: GNNs as Auxiliary Modules

GNN blocks are inserted directly into the Transformer framework; the methods are then categorized by the relative position of the GNN with respect to the Transformer layers.

2、PE: Improved Positional Embedding from Graphs

Before the Transformer model is used, the graph's structural information is compressed into positional vectors (typically with graph algorithms, sometimes with learnable modules) and added to the Transformer's input.

3、AT: Improved Attention Matrices from Graphs

Graph information is injected into the computation of the attention coefficient matrix, e.g., by adding structural biases or by restricting each node to attend only to its neighbors.

Transformer Architecture for Graphs

GNNs as Auxiliary Modules in Transformer


1、three types

(1) building Transformer blocks on top of GNN blocks (a minimal sketch of this type is given after the list)

(2) alternately stacking GNN blocks and Transformer blocks

(3) parallelizing GNN blocks and Transformer blocks
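As a concrete illustration of type (1), the following is a minimal, hypothetical sketch (single-head attention, NumPy only): node features are first aggregated by a GCN-style layer to capture local structure, and the result is then fed to a standard self-attention block for global interactions. The function and variable names are my own and do not correspond to any specific paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gnn_block(adj, x, w):
    """One GCN-style layer: row-normalized neighborhood aggregation + linear map + ReLU."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    a_norm = a_hat / a_hat.sum(axis=1, keepdims=True)
    return np.maximum(a_norm @ x @ w, 0.0)

def attention_block(x, wq, wk, wv):
    """One single-head self-attention layer over all nodes."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

# Type (1): Transformer blocks stacked on top of GNN blocks.
# h = gnn_block(adj, x, w_gnn)          # local structure first
# out = attention_block(h, wq, wk, wv)  # then global interactions
```

Types (2) and (3) reuse the same two kinds of blocks but interleave them or run them in parallel and merge their outputs.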

2、examples

Improved Positional Embeddings from Graphs

1、Motivation: GA-style architectural modifications require heavy hyper-parameter searching, so PE methods explore a graph-encoding strategy that needs no adjustment of the Transformer architecture (only the embedding part is changed; the Transformer itself is left untouched)

2、compress the graph structure into positional embedding (PE) vectors and add them to the input before it is fed to the actual Transformer model

When constructing the PE from the graph structure, most methods rely on basic graph algorithms such as Laplacian eigenvectors or SVD-based vectors, though learnable approaches exist as well.
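
Below is a minimal sketch, assuming the Laplacian-eigenvector approach mentioned above: the k non-trivial eigenvectors of the symmetric normalized Laplacian serve as node positional embeddings and are added to the node features before the Transformer input. The function name and the projection matrix in the usage comment are hypothetical, not from any specific paper.

```python
import numpy as np

def laplacian_pe(adj, k):
    """Laplacian-eigenvector positional embeddings (hypothetical sketch).

    adj : (n, n) symmetric adjacency matrix
    k   : number of eigenvectors kept as the PE dimension
    """
    adj = adj.astype(float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # Eigenvectors sorted by ascending eigenvalue; drop the trivial constant one
    _, eigvecs = np.linalg.eigh(lap)
    return eigvecs[:, 1:k + 1]

# Usage: added to the node features before the Transformer,
#   x_in = x + laplacian_pe(adj, k) @ w_pe
# where w_pe is a (hypothetical) learned (k, d) projection matching the feature dimension d.
```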

3、examples

Improved Attention Matrices from Graphs

1、Motivation: the process of compressing the graph structure into fixed-size vectors suffers from information loss, which might limit the effectiveness of PE methods

2、improve the attention matrix computation based on graph information

These methods mainly change how the attention matrix A is computed, injecting the graph's structural information into the computation of A in different ways (see the sketch below).
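
The sketch below illustrates two common flavors of this idea in one (hypothetical, single-head) function: adding a structural bias indexed by shortest-path distance to the raw score matrix, and masking the scores so that each node attends only to itself and its neighbors. The names, the NumPy formulation, and the combination of both tricks in one function are my own simplifications.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_attention(x, adj, wq, wk, wv, spd=None, spd_bias=None):
    """Self-attention whose score matrix is modified by graph information.

    adj      : (n, n) adjacency matrix; zero entries are masked out, so each
               node attends only to itself and its neighbors
    spd      : optional (n, n) integer shortest-path-distance matrix
    spd_bias : optional 1-D array of (learnable) biases indexed by distance
    """
    n = x.shape[0]
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    if spd is not None and spd_bias is not None:
        scores = scores + spd_bias[spd]          # structural bias on the scores
    mask = (adj + np.eye(n)) > 0
    scores = np.where(mask, scores, -1e9)        # neighbor-only attention
    return softmax(scores) @ v
```

In practice a method usually picks one of the two: bias-style methods keep full attention and only reweight it, while mask-style methods hard-restrict the receptive field to the graph neighborhood.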

3、examples

Comparisons


1、The evaluated graph-specific Transformer modules lead to better performance (compared with the vanilla Transformer baseline in the first row of the paper's results table)

2、The improvement on graph-level tasks is more significant than that on node-level tasks (node-level tasks are run on a large-scale graph, but not on the original graph directly; sampled subgraphs are used instead)

3、GA and AT methods bring more benefits than PE methods

(1)PE does not contain intact graph information

(2)since PE is only fed into the input layer of the network, the graph-structural information decays layer by layer across the model

4、Different kinds of graph tasks favor different groups of models

(1)GA:node-level tasks

GA methods are able to better encode the local information of the sampled induced subgraphs in node-level tasks

(2)AT:graph-level tasks

AT methods are suitable for modeling the global information of the single graphs in graph-level tasks

Future Directions

1、New paradigm of incorporating the graph and the Transformer

one that not only takes graphs as a prior, but also better reflects the properties of graphs

2、Extending to other kinds of graphs

existing works mostly focus on homogeneous graphs; their potential on other forms of graphs, such as heterogeneous graphs and hypergraphs, remains to be explored

3、Extending to large-scale graphs
