Paper: Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots

Paper link: https://arxiv.org/abs/1805.02333

This paper proposes scoring each (context, response) pair with a seq2seq model and using that score as a "soft" margin in a linear SVM (hinge) loss. This targets two problems that arise when sampling negative responses to train the matching model of a retrieval-based dialogue system: negatives that are semantically unrelated to the context cause an undesired decision boundary, and some sampled negatives are actually false negatives.

For retrieval-based conversation systems: a key step to response selection is measuring the matching degree between a response candidate and an input.

While existing research focuses on how to define a matching model with neural networks, little attention has been paid to how to learn such a model when few labeled data are available.

In short, everyone is building neural-network-based matching models, but training those models raises serious problems that no one has addressed. The problems are mainly:

A common practice is to transform the matching problem to a classification problem with human responses as positive examples and randomly sampled ones as negative examples. 

This strategy, however, oversimplifies the learning problem, as most of the randomly sampled responses are either far from the semantics of the messages or the contexts, or they are false negatives which pollute the training data as noise.

The current practice is to train the model as a classifier (cross-entropy loss), with the human response as the positive example and randomly sampled responses as negatives. But this oversimplifies the problem: most sampled responses are semantically very unrelated to the utterance, which causes an undesired decision boundary when training the classifier; worse, some of them are false negatives (they are in fact appropriate responses).

The authors therefore propose the following training objective, where y_i1 is the human response (so this is still supervised… what it really fixes is the poor quality of the sampled negatives). It can be read as a hinge loss (SVM loss): the matching degree between the utterance and each sampled response should be lower than that of the human response by a margin of at least s'_ij.

min_Θ Σ_i Σ_{j=2}^{n} max(0, s'_ij + m(x_i, y_ij) − m(x_i, y_i1))

s'_ij is a normalized weak signal defined as max(0, s_ij / s_i1 − 1). The normalization here eliminates bias from different x_i.
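As a sketch (not the paper's code), the objective with this weak-signal margin fits in a few lines of Python; `m_pos`, `m_neg`, and `margins` are hypothetical names for the human-response matching score, the sampled-response matching scores, and the corresponding weak signals s'_ij:

```python
# Hypothetical sketch of the weak-supervision hinge loss for one utterance x_i.
# m_pos:    matching degree m(x_i, y_i1) of the human response.
# m_neg:    matching degrees m(x_i, y_ij), j >= 2, of sampled responses.
# margins:  the normalized weak signals s'_ij for those samples.

def weak_hinge_loss(m_pos, m_neg, margins):
    """Sum of max(0, s'_ij + m(x_i, y_ij) - m(x_i, y_i1)) over samples."""
    return sum(max(0.0, s + mn - m_pos) for s, mn in zip(margins, m_neg))
```

With a margin near 0 (a false negative), the sample contributes no loss once it merely ties with the human response; a large margin (a semantically distant sample) forces its score far below the human response's.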

s_ij is the log-likelihood, output by a seq2seq model trained on human-human conversation data, of generating response j conditioned on utterance i. Note that a larger value means a better match, but the value is negative, so after dividing we are effectively comparing absolute values! Hence for each sampled example:

① If it is semantically far from the utterance, the margin is large.

② If it is a false negative, the margin is close to or equal to 0.

In summary, this solves the two sampling problems described above.
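The two cases can be checked numerically. A minimal sketch with made-up log-likelihoods (s_i1 for the human response, s_ij for a sampled one):

```python
def weak_signal(s_ij, s_i1):
    """Normalized weak signal s'_ij = max(0, s_ij / s_i1 - 1).

    Both inputs are seq2seq log-likelihoods, hence negative;
    the ratio therefore compares absolute values.
    """
    return max(0.0, s_ij / s_i1 - 1.0)

# Semantically distant sample: much lower log-likelihood -> large margin.
print(weak_signal(-50.0, -10.0))  # -> 4.0

# False negative: log-likelihood close to the human response's -> margin near 0.
print(weak_signal(-10.5, -10.0))
```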

Experiments: to be continued.
