Memory Network Study Notes

饥渴的小苹果

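Introduction

Memory Networks are a neural network framework proposed by Jason Weston et al. at Facebook. They add a long-term memory component to address the difficulty neural networks have with long-range memory. Many variants of Memory Networks have since been built on this framework.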

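Motivation

In seq2seq models, memory is provided by RNN or LSTM cells, but their memory capacity is quite limited: in practice they often retain information over only a dozen or so time steps. As sentences get longer, or when prior knowledge has to be added, seq2seq can no longer meet the needs of a dialogue system.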

Paper

Published at ICLR.
These are the authors' slides.

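A simple explanation

Memory networks were first introduced for QA (question answering) tasks. The structure of a memory network is shown below.

It consists of four units: the Input unit, the Generalization unit, the Output unit, and the Response unit, which are the blue units in the figure above.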

Input unit

Input converts incoming input text to the internal feature representation.

The Input unit converts text into a feature vector. Many tasks already have an established way of doing this, such as a Bi-LSTM or a CNN.
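
As a minimal sketch of what "text to feature vector" can mean, the example below uses a bag-of-words count vector instead of a Bi-LSTM or CNN. The helper names `build_vocab` and `input_unit`, the whitespace tokenizer, and the toy sentences are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def build_vocab(sentences):
    """Map each distinct lowercase token to an index, in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def input_unit(text, vocab):
    """I(x): turn raw text into a bag-of-words count vector."""
    counts = Counter(text.lower().split())
    vec = [0.0] * len(vocab)
    for token, n in counts.items():
        if token in vocab:  # tokens outside the vocabulary are ignored
            vec[vocab[token]] = float(n)
    return vec

# Toy story (hypothetical English rendering, for illustration only).
story = [
    "hong took the ball",
    "hong gave huang the ball",
    "hei took the ball from huang",
]
vocab = build_vocab(story)
features = input_unit("hei took the ball", vocab)
print(features)
```

A learned encoder would replace `input_unit` in a real system; the only property the other units rely on is that every input becomes a fixed-length vector.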

Generalization unit

Generalization takes the input feature vector and current memory and decides which slot to store the new input into. It can also modify/delete any earlier memory based on this new information, which can be seen as generalizing the stored knowledge as new pieces of information are encountered.

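The Generalization unit uses the feature vector to find a memory slot and writes the input information into that slot; it can also modify or delete information already stored in a slot. (The number of slots is usually fixed when the memory is set up, and you can think of each slot as a vector, for example of 1024 dimensions.)

You can think of it as an addressing and update process.
Take a rough example. The story is: Xiao Hong took the ball, Xiao Hong gave it to Xiao Huang, and then Xiao Hei snatched the ball from Xiao Huang.
Suppose there are 8 slots and the third slot stores who currently holds the ball.
When the third clause is converted into a feature, the memory network can locate the third slot, erase the information that the ball is with Xiao Huang, and write in the new information that the ball is with Xiao Hei.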
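
The addressing-and-update idea in this example can be sketched as follows. This is a toy under stated assumptions: the lookup table `SLOT_FOR_TOPIC` is a hypothetical stand-in for whatever addressing function a real model would learn, and slots hold plain strings rather than feature vectors.

```python
NUM_SLOTS = 8  # fixed when the memory is created, as in the example above

# Hypothetical addressing table standing in for a learned addressing function:
# facts about who holds the ball always go to the third slot (index 2).
SLOT_FOR_TOPIC = {"ball_holder": 2}

def generalization_unit(memory, topic, value):
    """G: pick the slot for this input and overwrite its contents."""
    slot = SLOT_FOR_TOPIC[topic]
    memory[slot] = value  # erase the old fact, store the new one
    return slot

memory = [None] * NUM_SLOTS
# One update per clause of the story.
for holder in ["Xiao Hong", "Xiao Huang", "Xiao Hei"]:
    generalization_unit(memory, "ball_holder", holder)

print(memory[2])  # Xiao Hei
```

After the three updates only the third slot is occupied, and it holds the most recent fact.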

Output unit

Output takes the question feature vector and current memory and generates a feature vector for the answer. This is where the inference must take place. In the simplest case, this can be implemented as a ranking function over all occupied memory slots, and the highest scoring supporting memory can be retrieved as:
$$o_1 = O_1(x, M) = \arg\max_i \, s_O(x, m_i)$$
where $s_O$ is a function that scores the match between a question and the contents of a memory slot.

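Continuing the example above, suppose the question is "Who has the ball now?"
The Output unit's job is to convert the question "Who has the ball now?" into a vector representation and then use it to find the most relevant slot in memory. How does it find it? You can view this as a ranking process: the highest-scoring slot is retrieved and used to produce the answer.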
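
The ranking step can be sketched as an argmax over the occupied slots. In this sketch the scoring function is a plain dot product, which is an assumption made for illustration (the quoted passage only says that a scoring function matches a question against a slot); the 3-dimensional feature vectors are made up.

```python
def score(question_vec, slot_vec):
    """Stand-in for the scoring function: a plain dot product (an assumption)."""
    return sum(q * m for q, m in zip(question_vec, slot_vec))

def output_unit(question_vec, memory):
    """O: return the index of the best-matching occupied slot."""
    occupied = [i for i, slot in enumerate(memory) if slot is not None]
    return max(occupied, key=lambda i: score(question_vec, memory[i]))

# Toy 3-dimensional features: [ball, weather, food] (made up for illustration).
memory = [
    [0.0, 1.0, 0.0],  # slot 0: a fact about the weather
    None,             # slot 1: empty
    [1.0, 0.0, 0.0],  # slot 2: a fact about the ball
    [0.0, 0.0, 1.0],  # slot 3: a fact about food
]
question = [0.9, 0.1, 0.0]  # "Who has the ball now?"
best = output_unit(question, memory)
print(best)  # 2
```

Empty slots are skipped before ranking, which mirrors the "all occupied memory slots" wording in the quoted passage.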

Response unit

Response takes the answer feature vector and generates a natural language statement, which is outputted by the system. Ideally, an RNN or LSTM network that outputs a sequence of text tokens should suffice for this component.

This unit produces the text answer from the retrieved slot and the question representation.
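
A much simpler stand-in for an RNN or LSTM decoder is to return a single word: rank every candidate word against the combined question and slot features and take the best one. The sketch below does that; summing the two vectors, the dot-product score, the 4-dimensional features, and the candidate list are all assumptions for illustration.

```python
def response_unit(question_vec, slot_vec, word_vecs):
    """R: pick the candidate word that best matches question + retrieved slot."""
    # Combine the two inputs by element-wise sum (an assumption of this sketch).
    combined = [q + m for q, m in zip(question_vec, slot_vec)]

    def score(word):
        return sum(c * w for c, w in zip(combined, word_vecs[word]))

    return max(word_vecs, key=score)

# Toy 4-dimensional features: [ball, hong, huang, hei] (made up for illustration).
question = [1.0, 0.0, 0.0, 0.0]        # "Who has the ball now?"
retrieved_slot = [1.0, 0.0, 0.0, 1.0]  # "the ball is with Xiao Hei"
candidates = {
    "Xiao Hong":  [0.0, 1.0, 0.0, 0.0],
    "Xiao Huang": [0.0, 0.0, 1.0, 0.0],
    "Xiao Hei":   [0.0, 0.0, 0.0, 1.0],
}
answer = response_unit(question, retrieved_slot, candidates)
print(answer)  # Xiao Hei
```

A sequence decoder would replace `response_unit` when the answer needs to be a full sentence rather than one word.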
