Paper: Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots

Paper link: https://arxiv.org/abs/1805.02333

This paper proposes scoring each (context, response) pair with a seq2seq model and using that score as a "soft" margin in a linear SVM (hinge) loss. This targets two problems that arise when sampling negative responses to train the matching model of a retrieval-based dialogue system: negatives that are semantically unrelated to the context cause an undesired decision boundary, and some sampled negatives are actually false negatives.

For retrieval-based conversation systems: a key step to response selection is measuring the matching degree between a response candidate and an input.

While existing research focuses on how to define a matching model with neural networks, little attention has been paid to how to learn such a model when few labeled data are available.

In short, everyone is building neural-network-based matching models, but training those models raises serious problems that no one has addressed. The problems are mainly:

A common practice is to transform the matching problem to a classification problem with human responses as positive examples and randomly sampled ones as negative examples. 

This strategy, however, oversimplifies the learning problem, as most of the randomly sampled responses are either far from the semantics of the messages or the contexts, or they are false negatives which pollute the training data as noise.

The current practice is to train the model as a classifier (cross-entropy loss), with the human response as the positive example and randomly sampled responses as negatives. But this oversimplifies the problem: most sampled responses are semantically very unrelated to the utterance, which causes an undesired decision boundary when training the classifier; worse, some of them are false negatives (they are in fact appropriate responses).

The authors therefore propose the following training objective, where y_i1 is the human response (so this is still supervised… what it really fixes is the poor quality of the sampled negatives). It can be read as a hinge loss (SVM loss): the matching degree between the utterance and each sampled response should be lower than that of the human response by a margin of at least s'_ij.

min_Θ Σ_i Σ_{j=2}^{n} max(0, s'_ij + m(x_i, y_ij) − m(x_i, y_i1))

s'_ij is a normalized weak signal defined as max(0, s_ij / s_i1 − 1). The normalization here eliminates bias from different x_i.
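As a sketch (not the paper's code), the objective with this weak-signal margin fits in a few lines of Python; `m_pos`, `m_neg`, and `margins` are hypothetical names for the human-response matching score, the sampled-response matching scores, and the corresponding weak signals s'_ij:

```python
# Hypothetical sketch of the weak-supervision hinge loss for one utterance x_i.
# m_pos:    matching degree m(x_i, y_i1) of the human response.
# m_neg:    matching degrees m(x_i, y_ij), j >= 2, of sampled responses.
# margins:  the normalized weak signals s'_ij for those samples.

def weak_hinge_loss(m_pos, m_neg, margins):
    """Sum of max(0, s'_ij + m(x_i, y_ij) - m(x_i, y_i1)) over samples."""
    return sum(max(0.0, s + mn - m_pos) for s, mn in zip(margins, m_neg))
```

With a margin near 0 (a false negative), the sample contributes no loss once it merely ties with the human response; a large margin (a semantically distant sample) forces its score far below the human response's.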

s_ij is the log-likelihood, output by a seq2seq model trained on human-human conversation data, of generating response j conditioned on utterance i. Note that a larger value means a better match, but the value is negative, so after dividing we are effectively comparing absolute values! Hence for each sampled example:

① If it is semantically far from the utterance, the margin is large.

② If it is a false negative, the margin is close to or equal to 0.

In summary, this solves the two sampling problems described above.
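The two cases can be checked numerically. A minimal sketch with made-up log-likelihoods (s_i1 for the human response, s_ij for a sampled one):

```python
def weak_signal(s_ij, s_i1):
    """Normalized weak signal s'_ij = max(0, s_ij / s_i1 - 1).

    Both inputs are seq2seq log-likelihoods, hence negative;
    the ratio therefore compares absolute values.
    """
    return max(0.0, s_ij / s_i1 - 1.0)

# Semantically distant sample: much lower log-likelihood -> large margin.
print(weak_signal(-50.0, -10.0))  # -> 4.0

# False negative: log-likelihood close to the human response's -> margin near 0.
print(weak_signal(-10.5, -10.0))
```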

Experiments: to be continued.
