Paper reading: Extending Neural Generative Conversational Model using External Knowledge Sources

The title is ambitious, but the method is extremely simple and crude; the paper feels rather thin. Here I just summarize a few points worth thinking about.

Work on incorporating external knowledge has mostly concentrated on task-oriented dialogue, split into two lines: structured KBs and unstructured data. It is not used much in the open domain. I originally read this paper to see how it extracts knowledge from data, but the method turns out to be brute force: for each word of the context, retrieve matching entries from the external knowledge source, then simply average all the retrieved token embeddings.
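The retrieve-then-average step can be sketched in a few lines. This is my own toy illustration, assuming a simple word-overlap retrieval over a list of sentences and a random embedding table; the paper's actual lookup and embeddings may differ.

```python
import numpy as np

EMB_DIM = 8
rng = np.random.default_rng(0)

# Toy embedding table (word -> vector); a real system would use trained embeddings.
vocab = ["paris", "is", "the", "capital", "of", "france", "a", "city"]
emb = {w: rng.normal(size=EMB_DIM) for w in vocab}

# Toy external knowledge source: a list of raw sentences.
knowledge = [
    "paris is the capital of france",
    "paris is a city",
]

def knowledge_vector(context, source):
    """For each context word, retrieve overlapping sentences from the
    external source, then average ALL retrieved token embeddings into one
    fixed-size knowledge vector (the paper's brute-force step)."""
    ctx_words = set(context.split())
    tokens = []
    for sent in source:
        words = sent.split()
        if ctx_words & set(words):  # word-overlap retrieval, per context word
            tokens.extend(words)
    if not tokens:
        return np.zeros(EMB_DIM)
    return np.mean([emb[w] for w in tokens], axis=0)

vec = knowledge_vector("capital of france", knowledge)
print(vec.shape)  # (8,)
```

Note how crude this is: every token of every matched sentence is averaged with equal weight, which is exactly why the resulting vectors end up with low variance (see the discussion below).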

The paper has two interesting points:

1. For structures that provide information implicitly, such as concatenating the ensembled extra information to the input, the paper introduces an additional loss, L3. This forces the model to focus on the ensembled information. In my view, however, there should be a weight to balance L1 and L3.
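The suggested balancing can be sketched as follows. This is my own illustration, not the paper's code: I model L3 as an MSE pulling the decoder state toward the external knowledge vector (which is what forces the model to use it), and add the tunable weight `lam` that the note argues is missing.

```python
import numpy as np

def l3_loss(decoder_state, knowledge_vec):
    """Auxiliary loss: MSE between decoder state and the external
    knowledge vector, forcing the model to attend to the ensembled info.
    (My own formulation for illustration.)"""
    return np.mean((decoder_state - knowledge_vec) ** 2)

def total_loss(l1, l3, lam=0.5):
    """Weighted objective: lam balances the main generation loss L1
    against the auxiliary loss L3, as the note suggests."""
    return l1 + lam * l3

state = np.array([0.2, 0.4])   # toy decoder state
know = np.array([0.0, 0.0])    # toy knowledge vector
l1 = 1.5                       # toy main loss value
print(round(total_loss(l1, l3_loss(state, know), lam=0.5), 2))  # 1.55
```

Without `lam`, a large L3 could dominate training and hurt generation quality; with it, the trade-off becomes a tunable hyperparameter.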

2. On the variance of the external information (open questions remain).

Intuitively, the larger the variance, the more weight the information carries.

The paper wants two properties from the knowledge vectors:

- The knowledge vectors being away from the mean of their distribution.
- The knowledge vectors having high variance.

> We observed that the vectors not being spread out made them less useful than the encoded context itself in the initial experiments. We observed that the model learns to use the external context, but, as discussed in Section 4.2, the variance in the external context vectors constructed using the two knowledge sources was too low. To fix this, we scaled the external context vectors with N(4,1). This improved the variance in the knowledge, which subsequently improved the usefulness of these vectors.

Questions: how should the distribution of an n-dim vector be measured? The mean is easy to understand, but variance, is that the covariance matrix? Is this intuition even right? And what exactly does "scale the external context vector with N(4,1)" mean?
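One plausible reading of "scale with N(4,1)", which is my guess and not confirmed by the paper: multiply each external context vector by a scalar drawn from a normal distribution with mean 4 and standard deviation 1. For roughly zero-mean vectors this multiplies the variance by about E[s²] = 4² + 1 = 17, spreading the vectors out.

```python
import numpy as np

rng = np.random.default_rng(42)

# Low-variance knowledge vectors, as the paper observed after averaging.
vectors = rng.normal(scale=0.1, size=(1000, 16))

# Hypothetical scaling: one scalar per vector, drawn from N(4, 1).
scales = rng.normal(loc=4.0, scale=1.0, size=(1000, 1))
scaled = vectors * scales

# Variance grows roughly (4**2 + 1) = 17 times for zero-mean inputs.
print(vectors.var(), scaled.var())
```

This reading at least matches the stated effect ("improved the variance in the knowledge"). On the first question: for an n-dim vector one usually reports either the full covariance matrix or, more cheaply, the per-dimension variances; the paper's phrasing suggests the latter, scalar-style notion.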
