Multi-source attention mechanism

I. Attention Strategies for Multi-Source Sequence-to-Sequence Learning (Libovický and Helcl, 2017)

This post considers the scenario of multiple encoders feeding a single RNN decoder. Three attention combination strategies are discussed:

1. Concatenation of the context vectors

A widely adopted technique for combining multiple attention models in a decoder is concatenation of the context vectors. This setting forces the model to attend to each encoder independently and leaves the attention combination to be resolved implicitly by the subsequent network layers.
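
To make this concrete, here is a minimal PyTorch sketch of the concatenation strategy (not the paper's code; module and variable names such as `BahdanauAttention`, `attn_src`, and `attn_img` are my own). Each encoder gets its own Bahdanau-style attention, and the resulting context vectors are simply concatenated before being fed to the decoder:

```python
import torch
import torch.nn as nn

class BahdanauAttention(nn.Module):
    """Additive attention over a single encoder (Bahdanau et al., 2015)."""
    def __init__(self, enc_dim: int, dec_dim: int, attn_dim: int):
        super().__init__()
        self.W_a = nn.Linear(dec_dim, attn_dim, bias=False)  # decoder-state projection
        self.U_a = nn.Linear(enc_dim, attn_dim, bias=False)  # encoder-state projection
        self.v_a = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state: torch.Tensor, enc_states: torch.Tensor) -> torch.Tensor:
        # dec_state: (batch, dec_dim); enc_states: (batch, src_len, enc_dim)
        energy = self.v_a(torch.tanh(
            self.W_a(dec_state).unsqueeze(1) + self.U_a(enc_states)))  # (batch, src_len, 1)
        alpha = torch.softmax(energy.squeeze(-1), dim=-1)              # attention weights
        return torch.bmm(alpha.unsqueeze(1), enc_states).squeeze(1)    # context: (batch, enc_dim)

# One independent attention per encoder; the contexts are simply concatenated.
attn_src = BahdanauAttention(enc_dim=256, dec_dim=512, attn_dim=128)
attn_img = BahdanauAttention(enc_dim=300, dec_dim=512, attn_dim=128)

s = torch.randn(8, 512)            # current decoder state
h_src = torch.randn(8, 20, 256)    # e.g. source-sentence encoder states
h_img = torch.randn(8, 49, 300)    # e.g. image-feature "encoder" states
context = torch.cat([attn_src(s, h_src), attn_img(s, h_img)], dim=-1)  # (8, 556)
```

Because the combination is a plain concatenation, the decoder layers that consume `context` are what implicitly learn how to weigh the two sources against each other.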

2. Flat Attention Combination

We let the decoder learn the attention distribution α_i jointly over the hidden states of all encoders.

The α coefficients are normalized over the states of all encoders, not within each encoder separately.

The attention energy term e is computed as in Bahdanau et al. (2015). Note that the parameters v_a and W_a are shared among the encoders, while U_a is different for each encoder and serves as an encoder-specific projection of the hidden states into a common vector space.

The states of the individual encoders occupy different vector spaces and can have different dimensionalities, so the context vector cannot be computed as their simple weighted sum. We therefore project them into a single space using linear projections:
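
Written out in LaTeX, the flat combination over N encoders (the k-th having T^(k) states) reads as follows. This is my reconstruction following the paper's additive-attention notation; exact symbols and indexing may differ slightly from the published equations:

```latex
% Energy: v_a, W_a shared across encoders; U_a^{(k)} encoder-specific.
% Normalization runs jointly over the states of all encoders.
\begin{align}
e_{ij}^{(k)} &= v_a^\top \tanh\bigl(W_a s_{i-1} + U_a^{(k)} h_j^{(k)}\bigr) \\
\alpha_{ij}^{(k)} &= \frac{\exp\bigl(e_{ij}^{(k)}\bigr)}
  {\sum_{n=1}^{N}\sum_{m=1}^{T^{(n)}} \exp\bigl(e_{im}^{(n)}\bigr)} \\
c_i &= \sum_{k=1}^{N}\sum_{j=1}^{T^{(k)}} \alpha_{ij}^{(k)}\, U_c^{(k)} h_j^{(k)}
\end{align}
```

The encoder-specific matrices U_c^(k) are the linear projections mentioned above: they map each encoder's states into a shared space so the weighted sum is well defined.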

3. Hierarchical Attention Combination

The hierarchical attention combination model computes every context vector independently, similarly to the concatenation approach. Instead of concatenation, a second attention mechanism is constructed over the context vectors. 

First, we compute the context vector for each encoder independently using Equation 3.

Second, we project the context vectors (and optionally the sentinel) into a common space (Equation 8), compute another distribution over the projected context vectors (Equation 9), and take their corresponding weighted average (Equation 10):
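
A reconstruction of the referenced equations in LaTeX (the numbering follows the prose above; the parameter names v_b, W_b, U_b, U_c are my reading of the paper's conventions and may not match it exactly):

```latex
% Equation 3 (per-encoder context, standard Bahdanau attention):
%   c_i^{(k)} = \sum_j \alpha_{ij}^{(k)} h_j^{(k)}
% Hierarchical combination over the N context vectors:
\begin{align}
e_i^{(k)} &= v_b^\top \tanh\bigl(W_b s_{i-1} + U_b^{(k)} c_i^{(k)}\bigr)
  && \text{(8: projection into a common space)} \\
\beta_i^{(k)} &= \frac{\exp\bigl(e_i^{(k)}\bigr)}{\sum_{n=1}^{N} \exp\bigl(e_i^{(n)}\bigr)}
  && \text{(9: distribution over encoders)} \\
c_i &= \sum_{k=1}^{N} \beta_i^{(k)}\, U_c^{(k)} c_i^{(k)}
  && \text{(10: weighted average of projected contexts)}
\end{align}
```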


Both alternatives (the flat and the hierarchical combination) allow us to explicitly compute a distribution over the encoders, and thus to interpret how much attention is paid to each encoder at every decoding step; the sketch below makes this concrete.
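
As an illustration of that interpretability, here is a minimal PyTorch sketch of the second attention level (my own module, `HierarchicalCombination`, not the authors' implementation). It consumes the per-encoder context vectors from Equation 3 and returns both the combined context and the encoder-level weights β, which can be logged at every decoding step:

```python
import torch
import torch.nn as nn

class HierarchicalCombination(nn.Module):
    """Second-level attention over per-encoder context vectors."""
    def __init__(self, ctx_dims, dec_dim, attn_dim, out_dim):
        super().__init__()
        self.W_b = nn.Linear(dec_dim, attn_dim, bias=False)
        self.U_b = nn.ModuleList([nn.Linear(d, attn_dim, bias=False) for d in ctx_dims])
        self.v_b = nn.Linear(attn_dim, 1, bias=False)
        self.U_c = nn.ModuleList([nn.Linear(d, out_dim, bias=False) for d in ctx_dims])

    def forward(self, dec_state, contexts):
        # dec_state: (batch, dec_dim); contexts: list of (batch, ctx_dims[k]) tensors
        energies = torch.cat(
            [self.v_b(torch.tanh(self.W_b(dec_state) + U(c)))
             for U, c in zip(self.U_b, contexts)], dim=-1)      # (batch, N)
        beta = torch.softmax(energies, dim=-1)                  # distribution over encoders
        projected = torch.stack(
            [U(c) for U, c in zip(self.U_c, contexts)], dim=1)  # (batch, N, out_dim)
        combined = (beta.unsqueeze(-1) * projected).sum(dim=1)  # weighted average
        return combined, beta

combine = HierarchicalCombination(ctx_dims=[256, 300], dec_dim=512,
                                  attn_dim=128, out_dim=256)
c_src, c_img = torch.randn(8, 256), torch.randn(8, 300)  # per-encoder contexts (Equation 3)
context, beta = combine(torch.randn(8, 512), [c_src, c_img])
print(beta[0])  # e.g. tensor([0.61, 0.39]): attention paid to each encoder this step
```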

In the paper's multi-source MT experiments, the hierarchical attention combination performed best.
