Dynamic Memory Networks for Visual and Textual Question Answering: Paper Reading Notes

yyyybupt

Category: nlp

Article link: https://blog.csdn.net/qq_41747565/article/details/102608908

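1. Highlights

Proposes a new input module built on a two-level encoder, a sentence reader followed by an input fusion layer, so that information can flow between sentences.
In the memory module, global knowledge over the facts is used to compute the attention gates that are incorporated into the existing GRU formulation.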

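2. Background

Memory networks can reason over facts expressed in natural language or as (subject, relation, object) triples.
Attention mechanisms have driven strong progress in machine translation and image models.
The dynamic memory network (DMN) is a neural network model with memory and attention that achieves strong results on question answering, sentiment analysis, and part-of-speech tagging.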

3. Model

3.1 Input Module

(1) sentence reader : encodes the words of each sentence into a single sentence vector (positional encoder)

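word tokens --> sentence encoding : $\lbrack\omega_1^i,\dots,\omega_{M_i}^i\rbrack\rightarrow f_i$

$f_i=\sum_{j=1}^{M}l_j\circ\omega_j^i$, with $l_{jd}=(1-j/M)-(d/D)(1-2j/M)$

Here $M$ is the number of words in sentence $i$, $d$ is the embedding index, and $D$ is the embedding dimension.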
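The positional encoder above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names are hypothetical, and it assumes that both the word position $j$ and the embedding index $d$ are 1-based.

```python
import numpy as np

def positional_encoding_matrix(M, D):
    """Return l with shape (M, D), where l[j-1, d-1] = (1 - j/M) - (d/D) * (1 - 2j/M)."""
    j = np.arange(1, M + 1, dtype=float)[:, None]  # word positions 1..M (assumed 1-based)
    d = np.arange(1, D + 1, dtype=float)[None, :]  # embedding indices 1..D (assumed 1-based)
    return (1.0 - j / M) - (d / D) * (1.0 - 2.0 * j / M)

def encode_sentence(word_vectors):
    """Compute f_i = sum_j l_j * w_j (element-wise) for one sentence of shape (M, D)."""
    word_vectors = np.asarray(word_vectors, dtype=float)
    M, D = word_vectors.shape
    return (positional_encoding_matrix(M, D) * word_vectors).sum(axis=0)

# Toy sentence: 3 words, 4-dimensional embeddings, all ones.
sentence = np.ones((3, 4))
f = encode_sentence(sentence)
```

Because the weights $l_j$ depend on the word position, two sentences containing the same words in a different order generally receive different encodings, unlike a plain bag-of-words sum.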

(2) input fusion layer : lets information interact across sentences (bi-directional GRU)

$\overrightarrow{f_i}=GRU_{fwd}(f_i,\overrightarrow{f_{i-1}})$
$\overleftarrow{f_i}=GRU_{bwd}(f_i,\overleftarrow{f_{i+1}})$
$\overleftrightarrow{f_i}=\overleftarrow{f_i}+\overrightarrow{f_i}$

GRU definition: $h_i=GRU(x_i,h_{i-1})$

$u_i=\sigma(W^{(u)}x_i+U^{(u)}h_{i-1}+b^{(u)})$
$r_i=\sigma(W^{(r)}x_i+U^{(r)}h_{i-1}+b^{(r)})$
$\widetilde{h_i}=\tanh(Wx_i+r_i\circ Uh_{i-1}+b^{(h)})$
$h_i=u_i\circ\widetilde{h_i}+(1-u_i)\circ h_{i-1}$
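The GRU step and the bi-directional fusion over sentence encodings can be sketched together as follows. This is a minimal NumPy sketch under stated assumptions: `init_gru_params`, the dictionary keys, the random initialisation, and the zero initial hidden states are all hypothetical choices made for illustration, not details taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru_params(input_dim, hidden_dim, rng):
    """Hypothetical helper: small random weights and zero biases for one GRU."""
    def mat(rows, cols):
        return rng.normal(scale=0.1, size=(rows, cols))
    return {
        "W_u": mat(hidden_dim, input_dim), "U_u": mat(hidden_dim, hidden_dim),
        "W_r": mat(hidden_dim, input_dim), "U_r": mat(hidden_dim, hidden_dim),
        "W": mat(hidden_dim, input_dim), "U": mat(hidden_dim, hidden_dim),
        "b_u": np.zeros(hidden_dim), "b_r": np.zeros(hidden_dim),
        "b_h": np.zeros(hidden_dim),
    }

def gru_step(p, x, h_prev):
    """One GRU step h_i = GRU(x_i, h_{i-1}), following the four equations above."""
    u = sigmoid(p["W_u"] @ x + p["U_u"] @ h_prev + p["b_u"])            # update gate
    r = sigmoid(p["W_r"] @ x + p["U_r"] @ h_prev + p["b_r"])            # reset gate
    h_tilde = np.tanh(p["W"] @ x + r * (p["U"] @ h_prev) + p["b_h"])    # candidate state
    return u * h_tilde + (1.0 - u) * h_prev

def input_fusion(facts, p_fwd, p_bwd):
    """Sum of a forward and a backward GRU pass over facts of shape (N, D)."""
    n_facts = facts.shape[0]
    hidden_dim = p_fwd["b_h"].shape[0]
    fwd = np.zeros((n_facts, hidden_dim))
    bwd = np.zeros((n_facts, hidden_dim))
    h = np.zeros(hidden_dim)  # assumed zero initial state
    for i in range(n_facts):
        h = gru_step(p_fwd, facts[i], h)
        fwd[i] = h
    h = np.zeros(hidden_dim)  # assumed zero initial state
    for i in reversed(range(n_facts)):
        h = gru_step(p_bwd, facts[i], h)
        bwd[i] = h
    return fwd + bwd

rng = np.random.default_rng(0)
facts = rng.normal(size=(5, 8))      # 5 sentence encodings of dimension 8
p_fwd = init_gru_params(8, 6, rng)
p_bwd = init_gru_params(8, 6, rng)
fused = input_fusion(facts, p_fwd, p_bwd)
```

Each row of `fused` depends on every sentence in the input, since the forward pass carries earlier sentences and the backward pass carries later ones.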

3.2 Episodic Memory Module

(1) Computing the attention gate

$z_i^t=\lbrack\overleftrightarrow{f_i}\circ q;\overleftrightarrow{f_i}\circ m^{t-1};\left|\overleftrightarrow{f_i}-q\right|;\left|\overleftrightarrow{f_i}-m^{t-1}\right|\rbrack$

$Z_i^t=W^{(2)}\tanh(W^{(1)}z_i^t+b^{(1)})+b^{(2)}$

$g_i^t=\frac{\exp(Z_i^t)}{\sum_{k=1}^{M_i}\exp(Z_k^t)}$
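The three equations above amount to a two-layer scoring network followed by a softmax. A minimal NumPy sketch, assuming the softmax is taken over all facts of a single input and using hypothetical shapes (`W1`: `(A, 4H)`, `W2`: `(1, A)`), might look like this:

```python
import numpy as np

def attention_gates(facts, q, m_prev, W1, b1, W2, b2):
    """Return gates g of shape (N,) for facts of shape (N, H); q and m_prev have shape (H,)."""
    # Interaction features z_i^t: element-wise products and absolute differences.
    z = np.concatenate(
        [facts * q, facts * m_prev, np.abs(facts - q), np.abs(facts - m_prev)],
        axis=1,
    )  # shape (N, 4H)
    scores = np.tanh(z @ W1.T + b1) @ W2.T + b2  # shape (N, 1)
    scores = scores[:, 0]
    # Softmax over facts; subtracting the max only improves numerical stability.
    exp_scores = np.exp(scores - scores.max())
    return exp_scores / exp_scores.sum()

rng = np.random.default_rng(0)
N, H, A = 5, 4, 6  # number of facts, hidden size, attention hidden size (all hypothetical)
facts = rng.normal(size=(N, H))
q = rng.normal(size=H)
m_prev = q.copy()  # assumption: the memory starts out equal to the question vector
W1, b1 = rng.normal(size=(A, 4 * H)), np.zeros(A)
W2, b2 = rng.normal(size=(1, A)), np.zeros(1)
g = attention_gates(facts, q, m_prev, W1, b1, W2, b2)
```

The gates are positive and sum to one, so they can be read as a distribution over facts for the current pass.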

(2) Attention Mechanism : an attention-based GRU is used. The final hidden state of the attention-based GRU serves as $c^t$, which is used to update $m^t$

$h_i=g_i^t\circ\widetilde{h_i}+(1-g_i^t)\circ h_{i-1}$
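In the attention-based GRU, the scalar gate $g_i^t$ takes the place of the update gate $u_i$. The sketch below illustrates this in NumPy; the parameter dictionary, its keys, and the zero initial hidden state are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_gru(facts, gates, p):
    """Run the attention-based GRU over facts (N, H); return the final state as c^t."""
    hidden_dim = p["b_h"].shape[0]
    h = np.zeros(hidden_dim)  # assumed zero initial state
    for f_i, g_i in zip(facts, gates):
        r = sigmoid(p["W_r"] @ f_i + p["U_r"] @ h + p["b_r"])           # reset gate
        h_tilde = np.tanh(p["W"] @ f_i + r * (p["U"] @ h) + p["b_h"])   # candidate state
        h = g_i * h_tilde + (1.0 - g_i) * h  # attention gate replaces the update gate
    return h

rng = np.random.default_rng(0)
H = 4
params = {
    "W_r": rng.normal(scale=0.1, size=(H, H)), "U_r": rng.normal(scale=0.1, size=(H, H)),
    "W": rng.normal(scale=0.1, size=(H, H)), "U": rng.normal(scale=0.1, size=(H, H)),
    "b_r": np.zeros(H), "b_h": np.zeros(H),
}
facts = rng.normal(size=(3, H))
gates = np.array([0.0, 1.0, 0.0])  # toy gates that attend only to the second fact
c = attention_gru(facts, gates, params)
```

With the toy one-hot gates, facts with a gate of zero leave the hidden state untouched, so `c` is determined by the second fact alone.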

(3) Episode Memory Updates

Original DMN (a GRU update):

$m^t=GRU(c^t,m^{t-1})$

DMN+ (a ReLU layer with separate weights $W^t$ for each pass):

$m^t=ReLU(W^t\lbrack m^{t-1};c^t;q\rbrack+b)$
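The ReLU memory update can be sketched as follows. The function names are hypothetical, the toy weights are chosen only so that the result is easy to check by hand, and initialising the memory to the question vector is stated here as an assumption.

```python
import numpy as np

def relu_memory_update(m_prev, c, q, W_t, b):
    """m^t = ReLU(W^t [m^{t-1}; c^t; q] + b), with W_t of shape (H, 3H)."""
    return np.maximum(0.0, W_t @ np.concatenate([m_prev, c, q]) + b)

def run_passes(contexts, q, weights, b):
    """Apply one update per pass, using a separate weight matrix for each pass."""
    m = q  # assumption: the memory is initialised to the question vector
    for c_t, W_t in zip(contexts, weights):
        m = relu_memory_update(m, c_t, q, W_t, b)
    return m

H = 2
# Toy weights: three identity blocks, so that W [m; c; q] = m + c + q.
W_sum = np.hstack([np.eye(H), np.eye(H), np.eye(H)])
m1 = relu_memory_update(
    np.array([1.0, -2.0]), np.array([0.5, 0.5]), np.array([1.0, 1.0]),
    W_sum, np.zeros(H),
)
```

In the toy call, the pre-activation is `[2.5, -0.5]`, and the ReLU clips the negative entry to zero.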

Experimental Results

(1) Error rates of different model architectures on the bAbI-10k dataset

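ODMN: the original DMN model
DMN2: the input module uses the input fusion layer
DMN3: an attention-based GRU replaces soft attention
DMN+: $m^t$ is updated with untied weights (separate for each pass) and a linear layer with ReLU activation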

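Conclusions from the comparison:

The input fusion layer improves the interaction between distant facts
Adding the attention GRU in DMN3 provides complex positional and ordering information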

(2) Test error rates of different model architectures on the tasks of the bAbI English 10k dataset

The end-to-end memory network has explicit memory and a recurrent attention mechanism.

Positional encoding is used for the input module
RNN-style weights are used for the episode module
A ReLU non-linearity is used for the memory update

neural reasoner framework

A deep architecture performs the logical reasoning
Interaction-pooling handles the interaction between inputs
