Paper reading: Extending Neural Generative Conversational Model using External Knowledge Sources

The title is ambitious, but the method is extremely simple and crude; the paper feels rather thin. Here I just summarize a few points worth thinking about.

Work on incorporating external knowledge has mostly concentrated on task-oriented dialogue, split into two lines: structured KBs and unstructured data. It is not used much in the open domain. I originally read this paper to see how it extracts knowledge from data, but the method turns out to be brute force: for each word of the context, retrieve matching entries from the external knowledge source, then simply average all the retrieved token embeddings.
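The retrieve-then-average step can be sketched in a few lines. This is my own toy illustration, assuming a simple word-overlap retrieval over a list of sentences and a random embedding table; the paper's actual lookup and embeddings may differ.

```python
import numpy as np

EMB_DIM = 8
rng = np.random.default_rng(0)

# Toy embedding table (word -> vector); a real system would use trained embeddings.
vocab = ["paris", "is", "the", "capital", "of", "france", "a", "city"]
emb = {w: rng.normal(size=EMB_DIM) for w in vocab}

# Toy external knowledge source: a list of raw sentences.
knowledge = [
    "paris is the capital of france",
    "paris is a city",
]

def knowledge_vector(context, source):
    """For each context word, retrieve overlapping sentences from the
    external source, then average ALL retrieved token embeddings into one
    fixed-size knowledge vector (the paper's brute-force step)."""
    ctx_words = set(context.split())
    tokens = []
    for sent in source:
        words = sent.split()
        if ctx_words & set(words):  # word-overlap retrieval, per context word
            tokens.extend(words)
    if not tokens:
        return np.zeros(EMB_DIM)
    return np.mean([emb[w] for w in tokens], axis=0)

vec = knowledge_vector("capital of france", knowledge)
print(vec.shape)  # (8,)
```

Note how crude this is: every token of every matched sentence is averaged with equal weight, which is exactly why the resulting vectors end up with low variance (see the discussion below).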

The paper has two interesting points:

1. For structures that provide information implicitly, such as concatenating the ensembled extra information to the input, the paper introduces an additional loss, L3. This forces the model to focus on the ensembled information. In my view, however, there should be a weight to balance L1 and L3.
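The suggested balancing can be sketched as follows. This is my own illustration, not the paper's code: I model L3 as an MSE pulling the decoder state toward the external knowledge vector (which is what forces the model to use it), and add the tunable weight `lam` that the note argues is missing.

```python
import numpy as np

def l3_loss(decoder_state, knowledge_vec):
    """Auxiliary loss: MSE between decoder state and the external
    knowledge vector, forcing the model to attend to the ensembled info.
    (My own formulation for illustration.)"""
    return np.mean((decoder_state - knowledge_vec) ** 2)

def total_loss(l1, l3, lam=0.5):
    """Weighted objective: lam balances the main generation loss L1
    against the auxiliary loss L3, as the note suggests."""
    return l1 + lam * l3

state = np.array([0.2, 0.4])   # toy decoder state
know = np.array([0.0, 0.0])    # toy knowledge vector
l1 = 1.5                       # toy main loss value
print(round(total_loss(l1, l3_loss(state, know), lam=0.5), 2))  # 1.55
```

Without `lam`, a large L3 could dominate training and hurt generation quality; with it, the trade-off becomes a tunable hyperparameter.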

2. On the variance of the external information (open questions remain).

Intuitively, the larger the variance, the more weight the information carries.

The paper wants two properties from the knowledge vectors:

- The knowledge vectors being away from the mean of their distribution.
- The knowledge vectors having high variance.

> We observed that the vectors not being spread out made them less useful than the encoded context itself in the initial experiments. We observed that the model learns to use the external context, but, as discussed in Section 4.2, the variance in the external context vectors constructed using the two knowledge sources was too low. To fix this, we scaled the external context vectors with N(4,1). This improved the variance in the knowledge, which subsequently improved the usefulness of these vectors.

Questions: how should the distribution of an n-dim vector be measured? The mean is easy to understand, but variance, is that the covariance matrix? Is this intuition even right? And what exactly does "scale the external context vector with N(4,1)" mean?
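One plausible reading of "scale with N(4,1)", which is my guess and not confirmed by the paper: multiply each external context vector by a scalar drawn from a normal distribution with mean 4 and standard deviation 1. For roughly zero-mean vectors this multiplies the variance by about E[s²] = 4² + 1 = 17, spreading the vectors out.

```python
import numpy as np

rng = np.random.default_rng(42)

# Low-variance knowledge vectors, as the paper observed after averaging.
vectors = rng.normal(scale=0.1, size=(1000, 16))

# Hypothetical scaling: one scalar per vector, drawn from N(4, 1).
scales = rng.normal(loc=4.0, scale=1.0, size=(1000, 1))
scaled = vectors * scales

# Variance grows roughly (4**2 + 1) = 17 times for zero-mean inputs.
print(vectors.var(), scaled.var())
```

This reading at least matches the stated effect ("improved the variance in the knowledge"). On the first question: for an n-dim vector one usually reports either the full covariance matrix or, more cheaply, the per-dimension variances; the paper's phrasing suggests the latter, scalar-style notion.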
