Lcyztf - CSDN Blog

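[Original] Paper reading: Extending Neural Generative Conversational Model using External Knowledge Sources

The title is grand and promising, but the method is very simple and crude, and the paper feels rather thin... Here I summarize a few points worth thinking about. Work on incorporating external knowledge is concentrated mainly in task-oriented settings and falls into two strands: structured KBs and unstructured data. It is rarely used in open-domain settings. I originally read this paper to see how it finds knowledge in the data... but the method is exceptionally...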

2018-09-18 11:41:11 668

Series summary: Sequence Generation via GAN

One of the works in the SeqGAN series from Shanghai Jiao Tong University. I. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. LSTM generator + CNN discriminator + policy gradient where action-value function is estimated via Monte-Carlo s...

2018-09-16 15:41:49 787

Paper reading: Neural Net Models of Open-domain Discourse Coherence (Jiwei Li)

This is the EMNLP 2017 paper by the great Jiwei Li.

2018-09-16 14:59:31 341

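[Original] Paper reading: Best of Both Worlds: Transferring Knowledge from D to G

First pretrain D and G, then fix D and let G repeatedly sample responses, updating G according to the supervision signal from D. Gumbel Softmax is used here to get around the non-differentiability problem. Starting from the generic and safe response problem of MLE (or equivalently CE), the authors point out that generative models trained with MLE easily "game" MLE and tend toward "av...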

2018-09-09 16:03:41 331
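
The Gumbel Softmax trick mentioned in the excerpt above can be sketched roughly as follows. This is a minimal pure-Python illustration of the forward sampling step only, not the paper's implementation: the function name `gumbel_softmax_sample`, the example logits, and the temperature values are all assumptions made for illustration. In an autodiff framework the same expression is differentiable with respect to the logits, which is what lets the gradient from D reach G.

```python
import math
import random


def gumbel_softmax_sample(logits, tau, rng):
    """Draw one relaxed (soft) one-hot sample from the categorical given by logits."""
    noisy = []
    for logit in logits:
        # Clamp away from 0 so that log(u) is always defined.
        u = max(rng.random(), 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits over a 3-token vocabulary; the same seed gives the same noise.
logits = [2.0, 0.5, -1.0]
soft = gumbel_softmax_sample(logits, tau=1.0, rng=random.Random(0))
sharp = gumbel_softmax_sample(logits, tau=0.1, rng=random.Random(0))
```

With identical noise, lowering `tau` pushes the sample closer to a hard one-hot vector, while a larger `tau` gives a smoother distribution.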

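[Original] Some thoughts on word embeddings

This grew out of my recent work on generative and retrieval-based dialogue systems, plus a well-named paper: When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation? Here I summarize my recent thinking about word embeddings. https://www.cnblogs.com/Determined22/p/5780305.htm...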

2018-09-08 23:49:25 1582

[Original] Multi-source attention mechanism

I. Attention Strategies for Multi-Source Sequence-to-Sequence Learning. This paper mainly considers the scenario of multiple encoders and a single RNN decoder. The discussion is split into the following three cases: 1. Concatenation of the context vectors. A widely adopted technique for combin...

2018-09-06 20:54:28 873

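[Original] Paper reading: Sequence Generation by Editing Prototype

I. Response Generation by Context-aware Prototype Editing. This is a retrieval -> edit vector -> conditional generation pipeline whose goal is to solve the safe response problem and make generated replies more informative and engaging. The intuition is to compare the c-c difference and then rewrite r. Two points to note: ① retri...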

2018-08-31 14:57:55 1970

[Original] Paper reading: Instance Weighting in Dialogue Systems

A summary of three instance weighting papers I read recently. I. Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models (SIGDIAL 18). The first to propose instance weighting. The idea worth noting is to treat this weighting model as a...

2018-08-29 17:25:15 955

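[Original] Paper reading: Deep contextualized word representations

NAACL 18 Best Paper. This paper reminds us once again that the essence of deep learning is representation, and that NLP has still not gotten its most fundamental representations right, namely embeddings and language models (nobody has put serious effort into them). A good low-level representation can bring significant improvement that goes beyond our w...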

2018-08-22 00:28:20 5844 2

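[Original] Paper reading: STC data set for single-turn short text conversation (Wang 2013, Noah's Ark Lab)

First, a complaint: the complete human-labelled data set is not public... This is a data set based on Sina Weibo. The posts were crawled from the Weibo accounts of senior Chinese NLP researchers (so the posts are of fairly high quality), but the comments (replies) can be written by anyone. I. The data set is constructed as follows: 1. Crawling the community of users. First, on Sina Weibo, identify 10...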

2018-07-25 01:12:46 1192

[Original] Paper reading: RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems

Core question: What makes a good reply in open-domain dialog systems? I. Observation. 1. Resembling the groundtruth generally implies a good reply. The more similar the generated reply is to the groundtruth, the better. This is a general assumption. We need to note: sh...

2018-07-22 01:40:06 1062

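[Original] Paper: Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots

Paper link: https://arxiv.org/abs/1805.02333 This paper proposes scoring each (context, response) pair with a seq2seq model and using that score as a "soft" margin when training with a linear SVM loss. It specifically addresses what current matching models for retrieval-based dialogue systems run into when sampling negative responses during training...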

2018-07-17 21:25:50 828
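
The "soft" margin idea in the excerpt above can be illustrated with a plain hinge (linear SVM) loss whose margin varies per training pair instead of being a fixed constant. This is only a sketch under stated assumptions: the excerpt does not say how the margin is derived from the seq2seq score, so the margin is simply passed in as a number here, and the name `soft_margin_hinge` and all scores are hypothetical.

```python
def soft_margin_hinge(pos_score, neg_score, margin):
    """Hinge loss: zero once the positive outscores the negative by at least `margin`."""
    return max(0.0, margin - pos_score + neg_score)


# Hypothetical matching-model scores for one context with two sampled negatives.
# A negative judged plausible by the weak annotator gets a small margin,
# while a clearly wrong negative gets a large one (assumed values).
pos_score = 0.8
negatives = [
    {"score": 0.7, "margin": 0.05},  # plausible negative: only a small gap required
    {"score": 0.5, "margin": 0.50},  # clearly bad negative: a large gap required
]
losses = [soft_margin_hinge(pos_score, n["score"], n["margin"]) for n in negatives]
mean_loss = sum(losses) / len(losses)
```

With a fixed margin, both negatives would be pushed away equally hard. A per-pair margin lets a sampled negative that is actually a reasonable reply contribute little or no loss.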

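[Original] CS231n Notes: Linear Classification

1. Linear Classifier. Mathematically it is the following expression: Each example is a column vector, and each row of the matrix W can be viewed as a classifier for one class. There are (intuitively) two ways to understand W and b: (1) hyperplane: linearly separating the data points in a high-dimensional space. (2) template matching: Each r...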

2018-07-17 16:36:28 320
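
The row-per-class view of W in the excerpt above can be made concrete with a small sketch. It assumes the standard linear score function s = Wx + b, since the formula itself is missing from the excerpt. The weights, bias, and input are made-up numbers, and `linear_scores` is a hypothetical helper name.

```python
def linear_scores(W, b, x):
    """Return one score per class: row i of W dotted with x, plus b[i]."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(W, b)
    ]


# Three classes, two input features: each row of W acts as one class's classifier.
W = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, -1.0],
]
b = [0.0, 0.5, 0.0]
x = [2.0, 1.0]

scores = linear_scores(W, b, x)
# The predicted class is the row whose score is highest.
predicted = max(range(len(scores)), key=scores.__getitem__)
```

Here the scores are 2.0, 1.5, and -3.0, so class 0 is predicted. Under the template matching reading, row 0 of W is the template that best matches x.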
