Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

Paper information
Title: Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting
Venue: NeurIPS (NIPS) 2019
Link: https://arxiv.org/abs/1907.00235

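Problems addressed

locality-agnostics: the original Transformer computes similarity in self-attention with a point-wise dot product, which makes it insensitive to local context;
memory-bottleneck: the Transformer's space complexity grows quadratically with the input length.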
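To make the memory bottleneck concrete: full self-attention scores every position against every other position, so one head stores L × L scores for an input of length L. The helper below is a hypothetical illustration (the name is not from the paper) that only counts those scores.

```python
def full_attention_scores(seq_len: int) -> int:
    """Number of query-key scores that one full self-attention head stores."""
    return seq_len * seq_len


# Doubling the input length quadruples the number of stored scores.
for n in (256, 512, 1024):
    print(n, full_attention_scores(n))
```

This count covers a single head in a single layer; the total grows further with the number of heads and layers.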

Method

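Enhancing the locality of Transformer
Causal convolution is used to produce Q, K, and V, with different kernel sizes: Q and K use a kernel of size k, while V keeps kernel size 1. Each query and key therefore carries some preceding history, so self-attention can pay more attention to local information.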
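A minimal sketch of the causal convolution idea, assuming a scalar time series and hand-picked (not learned) weights. The function name `causal_conv1d` is hypothetical, and a real model would apply learned kernels to multi-dimensional inputs. The point is only that output position t sees x[t] and the k-1 values before it, never a future value.

```python
from typing import List, Sequence


def causal_conv1d(x: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Causal 1-D convolution with stride 1.

    Output position t is a weighted sum of x[t-k+1], ..., x[t], where
    k = len(weights). Positions before the start are zero-padded, so no
    future value leaks into the output.
    """
    k = len(weights)
    padded = [0.0] * (k - 1) + list(x)  # left padding only
    return [
        sum(w * padded[t + j] for j, w in enumerate(weights))
        for t in range(len(x))
    ]


series = [1.0, 2.0, 4.0, 8.0]
# Kernel size 2: each output mixes the current value with the previous one.
print(causal_conv1d(series, [0.5, 0.5]))  # [0.5, 1.5, 3.0, 6.0]
# Kernel size 1 reduces to an ordinary point-wise projection.
print(causal_conv1d(series, [1.0]))       # [1.0, 2.0, 4.0, 8.0]
```

With kernel size 1 the result depends on a single time step, which is the locality-agnostic case; a larger kernel lets the dot product compare short local shapes instead of isolated points.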

Breaking the memory bottleneck of Transformer
A subset of the cells before the current one is selected, which shortens the input each cell attends to and thereby eases the Transformer's memory bottleneck. The indices of that subset are:
$$I^k_l=\{l-2^{\lfloor \log_2 l \rfloor},\ l-2^{\lfloor \log_2 l \rfloor-1},\ l-2^{\lfloor \log_2 l \rfloor-2},\ \dots,\ l-2^0,\ l \}$$
For example, with $l=1024=2^{10}$:
$$I_l^k=\{1024-2^{10},\ 1024-2^9,\ 1024-2^8,\ \dots,\ 1024-2^0,\ 1024\}$$
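The index set above can be computed directly. The sketch below assumes a positive integer cell index l and uses a hypothetical function name, `logsparse_indices`. It follows the formula as written in this post, so for l = 1024 the first index is 1024 - 2^10 = 0.

```python
from typing import List


def logsparse_indices(l: int) -> List[int]:
    """Return [l - 2^floor(log2 l), ..., l - 2^1, l - 2^0, l] for l >= 1."""
    if l < 1:
        raise ValueError("l must be a positive integer")
    top = l.bit_length() - 1  # floor(log2(l)) without floating-point error
    return [l - 2 ** e for e in range(top, -1, -1)] + [l]


print(logsparse_indices(10))
# [2, 6, 8, 9, 10]
print(logsparse_indices(1024))
# [0, 512, 768, 896, 960, 992, 1008, 1016, 1020, 1022, 1023, 1024]
```

Each cell attends to only floor(log2 l) + 2 indices (12 for l = 1024) rather than to all l earlier cells, which is where the memory saving comes from.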

Experiments

Evaluation metric
(Figure not preserved: the metric definition, copied directly from the paper.)

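Results on synthetic data

The baseline is a model built from a 3-layer LSTM, and synthetic data is constructed as follows:
(Figure not preserved: definition of the synthetic signal.)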
$A_1$, $A_2$, and $A_3$ are randomly generated, and $A_4=\max(A_1, A_2)$, as defined in the paper.
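In this data, the larger $t_0$ is, the harder the prediction becomes, because more has to be remembered.
The results show that, compared with the LSTM, the Transformer maintains accuracy better over long distances.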

Experiments on real-world data

Long-term and short-term forecasting: the proposed method performs well in both long-term and short-term forecasting.

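Convolutional self-attention: an ablation over the kernel size k of the causal convolution. The main difference is between the two datasets. The electricity dataset is relatively simple, so results are reasonable regardless of k; the traffic dataset is more challenging, so the improvement is more pronounced as k grows.
In addition, the models that use causal convolution converge faster, and this holds consistently on both datasets.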

Sparse attention: kernel size = 6, $L_{e_1}=768$ (for sparse), $L_{e_2}=293$ (for full); $L_{t_1}=576$, $L_{t_2}=254$. The authors argue that their method performs well on both electricity and traffic. One reason it is stronger on traffic is that the data itself has stronger long-range dependencies; under the same memory budget, full attention performs worse than sparse attention.
However, when full and sparse use the same length, full performs better on electricity in every case; on traffic, the authors argue that sparse attention is better suited to data with long-range dependencies. Yet without convolution, sparse is still worse than full, and even with convolution the improvement is not obvious.

Conclusion

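On their own synthetic dataset, the authors verify that the Transformer is effective at long-range dependencies, but the dataset is not general, or not particularly challenging.
On the public datasets, the improvement is not especially large either.
Most of the gain likely comes from adding causal convolution, which supplies more historical information. The sparse attention matrix does not deliver the expected effect, and in memory usage in particular no large saving is visible.
I therefore regard the causal convolution part of this paper as an innovation (relative to 2019), while for the memory problem it does offer a feasible approach for later work (debatable?).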

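For more detail, see this blog post, which also covers the appendix; the assessment by an anonymous expert on Zhihu is also worth consulting.

My abilities are limited, so please forgive any shortcomings.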
