Revisiting alignment-based and attention-based approaches in the encoder-decoder framework

loveqiong2746

Category: Algorithms

Article link: https://blog.csdn.net/u011334375/article/details/102609467


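Take the ATIS dataset as an example (Figure 1): slot filling is an explicit alignment task, because every input word receives exactly one output label. Translation, by contrast, is a non-aligned task. In translation, the attention mechanism was introduced to keep the semantics of the input and the output more consistent, and this attention mechanism is essentially a soft alignment.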
[Figure 1: an example from the ATIS dataset]
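
To make "explicit alignment" concrete, here is a minimal sketch in Python. The utterance and its slot labels are made up for illustration in the ATIS style (IOB tags); they are not copied from the paper's figure. The only point is that output position i labels input position i, so the two sequences always have the same length.

```python
# Hypothetical ATIS-style utterance (an assumption for illustration only).
tokens = ["flights", "from", "boston", "to", "denver"]
# One IOB slot label per token; "O" marks a token outside any slot.
slots = ["O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]

# Explicit alignment: label i belongs to token i, so the lengths must match.
assert len(tokens) == len(slots)

aligned = list(zip(tokens, slots))
for token, slot in aligned:
    print(f"{token:8s} -> {slot}")
```

A translation pair has no such constraint: the source and target sentences can differ in length and word order, which is why translation needs a learned, soft alignment instead.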
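The paper "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling" looks at the alignment-based and attention-based approaches and explores how combining the information from both can improve results. Let's look at the results first.
On this explicit alignment task, the attention-only model reaches an F1 of only 81.64%. This is arguably also why purely attention-based models such as BERT are often combined with chain-structured layers such as LSTM and CRF for sequence labeling. With aligned inputs alone, F1 reaches 95.72%; adding attention raises it to 95.78%.
Next, let's see how the three structures (attention only, aligned only, and attention plus aligned) differ.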

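[Figure 2: the three decoder variants: (a) attention only, (b) aligned inputs only, (c) aligned inputs plus attention]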
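As Figure 2 shows, (a) the attention-only decoder takes only c1, c2, c3, c4 as input; (b) the aligned-only decoder takes only h1, h2, h3, h4; and (c) the attention-plus-aligned decoder takes both c and h. Since slot filling is an explicit alignment task, it is not surprising that aligned-only beats attention-only. It is worth adding that the authors also explain how attention complements alignment: when the model labels the current word, the attention context vector c carries the information from the surrounding context that is useful for that decision.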
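
The difference between the three decoder inputs can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the paper's implementation: the attention scores are random stand-ins for a learned scoring function (in a real model they would depend on the decoder state at each step), and combining h and c by concatenation is my assumption about how "both" could be fed to the decoder. The names `softmax`, `context`, and `decoder_inputs` are hypothetical helpers.

```python
import math
import random

def softmax(row):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def context(alpha_row, h):
    """c_i = sum_j alpha[i][j] * h[j]: a weighted average of encoder states."""
    dim = len(h[0])
    return [sum(a * h_j[k] for a, h_j in zip(alpha_row, h)) for k in range(dim)]

def decoder_inputs(h, c, mode):
    """Per-step decoder input for the three variants in Figure 2."""
    if mode == "attention":            # (a) only the context vectors c_i
        return [list(c_i) for c_i in c]
    if mode == "aligned":              # (b) only the aligned encoder states h_i
        return [list(h_i) for h_i in h]
    if mode == "aligned+attention":    # (c) both; concatenation is an assumption
        return [list(h_i) + list(c_i) for h_i, c_i in zip(h, c)]
    raise ValueError(f"unknown mode: {mode}")

random.seed(0)
T, dim = 4, 3  # 4 input words, 3-dimensional encoder states (toy sizes)
h = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(T)]
# Stand-in for learned attention scores e[i][j] (assumption: random values).
scores = [[random.gauss(0, 1) for _ in range(T)] for _ in range(T)]
alpha = [softmax(row) for row in scores]
c = [context(row, h) for row in alpha]

# Hard alignment is the special case where step i puts all weight on input i.
one_hot = [[1.0 if j == i else 0.0 for j in range(T)] for i in range(T)]
hard_c = [context(row, h) for row in one_hot]  # equals h, row by row

for mode in ("attention", "aligned", "aligned+attention"):
    print(mode, len(decoder_inputs(h, c, mode)[0]))
```

The one-hot case shows why attention can be read as a soft alignment: with all weight on position i the context vector is exactly h_i, and with spread-out weights c_i mixes in the other positions, which is the extra context that variant (c) adds on top of h_i.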