Fine-Grained Attention Mechanism for Neural Machine Translation


  1. In this paper the authors propose a fine-grained attention mechanism: each dimension of the context vector receives its own attention score.
  2. The authors show empirically that this design is justified.

Inspired by Choi et al. (2017), who contextualized the dimensions of word embeddings and found that each dimension of a word vector plays a different role depending on the context, the authors (Choi, Cho, and Bengio) propose this fine-grained attention mechanism.

Background: Attention-based Neural Machine Translation

This part reviews the attention mechanism proposed by Bahdanau et al. (2015).

Given a source sentence $X=(w_1^x,w_2^x,\ldots,w_T^x)$,

the objective is to model $p(Y=(w_1^y,w_2^y,\ldots,w_T^y) \mid X)$,
i.e., the probability of producing a particular target sequence given the source sentence.

The model consists of three parts: an encoder, a decoder, and an attention mechanism.

1. Encoder

The encoder is usually implemented as a bidirectional recurrent neural network. Before encoding starts, each source word $w_t^x$ is mapped into a continuous vector space through the source-side embedding matrix $E^x$:
$x_t = E^x[\cdot, w_t^x]$
Shown as a diagram:

![这里写图片描述](https://img-blog.csdn.net/20180601155037397?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3pob3VrYWl5aW5faHphdQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)
The word vectors are fed into the encoder one step at a time:
$\overrightarrow{h}_t = \overrightarrow{\Phi}(\overrightarrow{h}_{t-1}, x_t)$,
$\overleftarrow{h}_t = \overleftarrow{\Phi}(\overleftarrow{h}_{t+1}, x_t)$,

Here $\overrightarrow{\Phi}$ and $\overleftarrow{\Phi}$ can be an LSTM (Hochreiter and Schmidhuber, 1997) or a GRU (Cho et al., 2014).
At every time step $t$, the forward and backward states are concatenated:

$h_t = [\overrightarrow{h}_t; \overleftarrow{h}_t]$

This yields the set of annotation vectors:
$C=\{h_1, h_2, \ldots, h_T\}$
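
To make the encoder concrete, here is a minimal numpy sketch of the embedding lookup and the bidirectional recurrent pass described above. The plain tanh cell, the sizes, and all names (`E_x`, `W_f`, `W_b`, `rnn_step`, `encode`) are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper).
vocab_size, emb_dim, hid_dim, T = 1000, 8, 16, 5

E_x = rng.normal(size=(emb_dim, vocab_size))           # source embedding matrix E^x
W_f = rng.normal(size=(hid_dim, hid_dim + emb_dim))    # forward recurrence weights
W_b = rng.normal(size=(hid_dim, hid_dim + emb_dim))    # backward recurrence weights

def rnn_step(W, h_prev, x_t):
    """One simple tanh recurrent step, standing in for the LSTM/GRU cell Phi."""
    return np.tanh(W @ np.concatenate([h_prev, x_t]))

def encode(word_ids):
    """Return the annotation vectors C = {h_1, ..., h_T}, each h_t = [h_fwd; h_bwd]."""
    xs = [E_x[:, w] for w in word_ids]                  # x_t = E^x[., w_t^x]
    h_f = np.zeros(hid_dim)
    h_b = np.zeros(hid_dim)
    fwd, bwd = [], [None] * len(xs)
    for t in range(len(xs)):                            # left-to-right pass
        h_f = rnn_step(W_f, h_f, xs[t])
        fwd.append(h_f)
    for t in reversed(range(len(xs))):                  # right-to-left pass
        h_b = rnn_step(W_b, h_b, xs[t])
        bwd[t] = h_b
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

C = encode(rng.integers(0, vocab_size, size=T))
print(len(C), C[0].shape)                               # T annotation vectors of size 2*hid_dim
```

The later sketches below reuse `C`, `rng`, `hid_dim`, and `emb_dim` from this block.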
2. Decoder

The decoder can be modeled as:

$p(w_{t'}^y \mid w_{<t'}^y, X) = g(z_{t'}, y_{t'-1}, c_{t'})$

where $g$ is a nonlinear function that produces a distribution over the next target word given the previously generated words, the decoder hidden state $z_{t'}$, the previous target word embedding $y_{t'-1}$, and the context vector $c_{t'}$ defined below.
3. Attention mechanism

The decoder maintains a hidden state $z_{t'}$. At each step $t'$, the model first uses the attention mechanism $f_{Att}$ to select, or weight, the vectors in $C$. In Bahdanau's attention mechanism, $f_{Att}$ is a feed-forward network that takes the decoder hidden state from the previous step and one of the vectors in $C$ as inputs; tanh is commonly used as its activation function.

$e_{t',t}=f_{Att}(z_{t'-1},h_t)$
The scores $e_{t',t}$ are normalized with a softmax:
$a_{t',t}=\frac{\exp(e_{t',t})}{\sum_{k=1}^{T} \exp(e_{t',k})}$
The annotation vectors are then combined as a weighted sum:
$c_{t'}=\sum_{t=1}^{T} a_{t',t}\, h_t$

Unlike on the encoder side, $y_{t'-1}$ here is the embedding vector of the previous target word.

The context vector $c_{t'}$ then serves as an additional input when updating the decoder state:

$z_{t'} = \Phi_z(z_{t'-1}, y_{t'-1}, c_{t'})$
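
Below is a minimal sketch of one decoder step with the Bahdanau-style attention described above, reusing `C`, `rng`, `hid_dim`, and `emb_dim` from the encoder sketch. The attention network (`W_a`, `v_a`), the decoder update (`W_z`), and all sizes and names are illustrative assumptions rather than the paper's exact configuration.

```python
# Reuses C, rng, hid_dim and emb_dim from the encoder sketch above.
ctx_dim = 2 * hid_dim                                    # size of each annotation vector h_t
y_dim, z_dim, att_dim = emb_dim, hid_dim, 12             # illustrative sizes

W_a = rng.normal(size=(att_dim, z_dim + ctx_dim))        # attention MLP, hidden layer
v_a = rng.normal(size=att_dim)                           # attention MLP, output projection
W_z = rng.normal(size=(z_dim, z_dim + y_dim + ctx_dim))  # decoder state update

def f_att(z_prev, h_t):
    """Bahdanau-style score e_{t',t}: a small feed-forward net with tanh activation."""
    return v_a @ np.tanh(W_a @ np.concatenate([z_prev, h_t]))

def decoder_step(z_prev, y_prev, C):
    e = np.array([f_att(z_prev, h_t) for h_t in C])      # scores e_{t',t}
    a = np.exp(e - e.max())
    a /= a.sum()                                         # softmax -> a_{t',t}
    c = sum(a_t * h_t for a_t, h_t in zip(a, C))         # c_{t'} = sum_t a_{t',t} h_t
    z = np.tanh(W_z @ np.concatenate([z_prev, y_prev, c]))  # z_{t'} from z_{t'-1}, y_{t'-1}, c_{t'}
    return z, c, a

z0, y0 = np.zeros(z_dim), np.zeros(y_dim)                # initial decoder state and target embedding
z1, c1, a1 = decoder_step(z0, y0, C)
print(a1.round(3), a1.sum())                             # one weight per source position, summing to 1
```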

Variants of Attention Mechanism

Jean et al. (2015a), Chung et al. (2016a), and Luong et al. (2016) made some improvements to the attention mechanism of Bahdanau et al. (2015).

One variant modifies the score function to $e_{t',t} = f_{AttY}(z_{t'-1}, h_t, y_{t'-1})$,
i.e., the target word produced at the previous step is added as an input to the scoring function.
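
Under the same illustrative setup as the sketch above, this variant only changes the inputs of the score function; the hypothetical `f_att_y` and `W_ay` below simply append the previous target embedding $y_{t'-1}$:

```python
# Same setup as above; the score now also sees the previous target embedding y_{t'-1}.
W_ay = rng.normal(size=(att_dim, z_dim + ctx_dim + y_dim))

def f_att_y(z_prev, h_t, y_prev):
    """Variant score: e_{t',t} = f_AttY(z_{t'-1}, h_t, y_{t'-1})."""
    return v_a @ np.tanh(W_ay @ np.concatenate([z_prev, h_t, y_prev]))
```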

Fine-Grained Attention Mechanism

In both Bahdanau's original attention and the later variants above, a given query yields only a single attention score for each context vector, as shown in part (a) of the figure below.
The model proposed by the authors instead corresponds to part (b).

[Figure: (a) conventional attention, one score per context vector; (b) fine-grained attention, one score per dimension]

$e_{t',t}^d = f_{AttY2D}^d(z_{t'-1},h_t,y_{t'-1})$
$a_{t',t}^d = \frac{\exp(e_{t',t}^d)}{\sum_{k=1}^{T} \exp(e_{t',k}^d)}$
$c_{t'}=\sum_{t=1}^{T} a_{t',t} \odot h_t$

The change here is that the alignment scores, after a per-dimension softmax, are combined with each annotation vector via a Hadamard (element-wise) product, so that every dimension of the annotation vector receives its own attention weight instead of sharing a single one.
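
Below is a sketch of this fine-grained (2D) attention under the same illustrative setup, reusing `W_ay`, `C`, `z0`, and `y0` from the earlier sketches; `U_a`, `f_att_2d`, and `fine_grained_context` are hypothetical names. The score function now returns one score per dimension of $h_t$, the softmax runs over source positions separately for each dimension, and the context vector is a sum of Hadamard products.

```python
# Fine-grained attention: one score per dimension d of each annotation vector h_t.
U_a = rng.normal(size=(ctx_dim, att_dim))                 # projects the MLP hidden layer to dim(h_t) scores

def f_att_2d(z_prev, h_t, y_prev):
    """Per-dimension scores e_{t',t}^d (a 2D variant of the y-conditioned score above)."""
    return U_a @ np.tanh(W_ay @ np.concatenate([z_prev, h_t, y_prev]))

def fine_grained_context(z_prev, y_prev, C):
    E = np.stack([f_att_2d(z_prev, h_t, y_prev) for h_t in C])  # shape (T, dim(h_t))
    A = np.exp(E - E.max(axis=0, keepdims=True))
    A /= A.sum(axis=0, keepdims=True)                     # softmax over t, separately for each dimension d
    H = np.stack(C)                                       # annotation vectors as a (T, dim(h_t)) matrix
    c = (A * H).sum(axis=0)                               # c_{t'} = sum_t a_{t',t} (Hadamard) h_t
    return c, A

c_2d, A = fine_grained_context(z0, y0, C)
print(A.shape, A.sum(axis=0)[:3])                         # one weight per (position, dimension); columns sum to 1
```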

[1] Fine-grained attention mechanism for neural machine translation
