Applications of the Attention Model in Natural Language Processing

YoungDreamNJU

Category: Paper Notes. Tags: natural language processing, Attention, papers, NLP

Article link: https://blog.csdn.net/YoungDreamNJU/article/details/54895783

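This article examines the applications of the Attention Model in natural language processing in detail, covering tasks such as machine translation, word embedding, question answering, document classification, language-to-logical-form conversion, and summarization. Classic works such as Neural Machine Translation By Jointly Learning To Align And Translate and Show, Attend and Tell show how Attention improves model performance. The article also introduces the differences between global and local Attention and their effects.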

Summary generated automatically by CSDN.

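This article introduces the applications of the Attention Model in natural language processing. It is organized as follows: it first presents two classic papers, one on NMT and one on image captioning, and then covers applications of Attention across different NLP tasks, some in detail and some only briefly.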

Classic Works

Two papers are widely cited by work on Attention, so they are singled out here:

NEURAL MACHINE TRANSLATION BY JOINTLY LEARNING TO ALIGN AND TRANSLATE

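NMT typically uses approaches from the encoder-decoder family, which encode the source sentence into a fixed-length vector and then decode that vector into the translation. The authors conjecture that the fixed-length vector is a bottleneck to improving the performance of the encoder-decoder architecture, so they let the model automatically search for the parts of the source sentence that are relevant to predicting the next target word.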
In the encoder, the authors use a bidirectional RNN to annotate the source sequence.
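To make the idea of "searching for the relevant parts of the source sentence" concrete, here is a minimal sketch of one decoder step of additive attention, in the form usually attributed to this paper: each encoder annotation is scored against the previous decoder state, the scores are normalized with a softmax, and the context vector is the weighted sum of the annotations. The function and variable names (`additive_attention`, `W_a`, `U_a`, `v_a`) and all the toy numbers are assumptions made for illustration; this is not the authors' implementation, and a real model would learn the parameters and use a tensor library.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def additive_attention(prev_state, annotations, W_a, U_a, v_a):
    """Return (weights, context) for one decoder step.

    score_j = v_a . tanh(W_a @ prev_state + U_a @ h_j)
    weights = softmax(scores)
    context = sum_j weights[j] * h_j
    """
    projected_state = matvec(W_a, prev_state)
    scores = []
    for h_j in annotations:
        projected_h = matvec(U_a, h_j)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(v * t for v, t in zip(v_a, hidden)))
    weights = softmax(scores)
    dim = len(annotations[0])
    context = [sum(w * h[k] for w, h in zip(weights, annotations)) for k in range(dim)]
    return weights, context

# Toy, made-up inputs: 3 source positions, annotation size 2, decoder state size 2.
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prev_state = [0.5, -0.5]
W_a = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical parameters (identity for readability)
U_a = [[1.0, 0.0], [0.0, 1.0]]
v_a = [1.0, 1.0]

weights, context = additive_attention(prev_state, annotations, W_a, U_a, v_a)
```

Because the weights sum to 1, the context vector is a convex combination of the annotations, which is what lets the decoder focus on different source positions at each step instead of relying on a single fixed-length vector.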

Presentation slides for the paper:
http://www.iclr.cc/lib/exe/fetch.php?media=iclr2015:bahdanau-iclr2015.pdf

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

The task in this paper is to generate a caption for an image. I made a one-page slide summarizing the paper's approach:

[Figure: one-page slide summarizing Show, Attend and Tell]

Next, we look at how Attention is applied in various natural language processing tasks.

Attention in Word Embedding

Not All Contexts Are Created Equal: Better Word Representations with Variable Attention

The general intuition of the model is that some words are only relevant for predicting local context (e.g. function words), while other words are more suited for determining global context, such as the topic of the document.
In CBOW: