Hybrid Code Networks: a domain-specific dialog system

Paper title: Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning
Link to the original paper

Authors:
Jason D. Williams, Microsoft Research, jason.williams@microsoft.com
Kavosh Asadi, Brown University, kavosh@brown.edu
Geoffrey Zweig, Microsoft Research, g2zweig@gmail.com

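Brief review: the paper proposes a domain-specific dialog system that can be trained with reinforcement learning on limited data. As I understand it, "Hybrid Code" means mixing several ways of encoding an utterance, adding domain-specific named entities and action templates, and finally combining them with weights to encode the utterance, which is what tailors the model to one domain. For the same reason, the method would probably break down in an open-domain setting because the amount of data becomes too large (the named entities and action-response mechanisms across all domains are enormous and prone to anomalies).
This post only summarizes the overview of the model. I will come back to the implementation details when I run into related problems.
PS: the paper builds on the earlier work "End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning" and adds several improvements.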

1. Introduction

This paper presents a model for end-to-end learning, called Hybrid Code Networks (HCNs) which addresses these problems. In addition to learning an RNN, HCNs also allow a developer to express domain knowledge via software and action templates. Experiments show that, compared to existing recurrent end-to-end techniques, HCNs achieve the same performance with considerably less training data, while retaining the key benefit of end-to-end trainability. Moreover, the neural network can be trained with supervised learning or reinforcement learning, by changing the gradient update applied.

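In short: HCNs are an end-to-end model that combines a learned RNN with domain knowledge the developer supplies as software and action templates. Compared with existing recurrent end-to-end approaches, they reach the same performance with considerably less training data while remaining end-to-end trainable. The same network can be trained with either supervised learning or reinforcement learning; only the gradient update changes.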
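To make "changing the gradient update" concrete, here is a minimal sketch in plain Python. It is not the paper's code: the three-action policy, the learning rate, and all function names are assumptions for illustration, and the logits are updated directly instead of going through an RNN. The supervised step raises the log-likelihood of the labelled action; the reinforcement step uses the same gradient, scaled by the return minus a baseline (a REINFORCE-style policy gradient).

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grad_log_prob(logits, action):
    """Gradient of log softmax(logits)[action] with respect to the logits."""
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]

def supervised_update(logits, target_action, lr=0.1):
    """Supervised step: gradient ascent on the log-likelihood of the label."""
    g = grad_log_prob(logits, target_action)
    return [x + lr * gi for x, gi in zip(logits, g)]

def policy_gradient_update(logits, sampled_action, ret, baseline=0.0, lr=0.1):
    """Reinforcement step: the same gradient, scaled by (return - baseline)."""
    g = grad_log_prob(logits, sampled_action)
    scale = ret - baseline
    return [x + lr * scale * gi for x, gi in zip(logits, g)]

# Toy policy over three actions (hypothetical starting point: uniform).
logits = [0.0, 0.0, 0.0]
sl = supervised_update(logits, target_action=1)
rl_good = policy_gradient_update(logits, sampled_action=1, ret=1.0)
rl_bad = policy_gradient_update(logits, sampled_action=1, ret=-1.0)

print(softmax(sl))       # probability of action 1 goes up
print(softmax(rl_good))  # same direction as the supervised step
print(softmax(rl_bad))   # probability of action 1 goes down
```

With a return of 1 and no baseline, the reinforcement step coincides with the supervised step on the sampled action, which is why the two training modes can share one network.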

2. Model description

Operational loop (taken from the original paper)
Figure 1: Operational loop. Trapezoids refer to programmatic code provided by the software developer, and shaded boxes are trainable components. Vertical bars under “6” represent concatenated vectors which form the input to the RNN.

The model has four components:
1. a recurrent neural network (RNN);
2. domain-specific software (for example, WeatherBot in the figure);
3. domain-specific action templates;
4. a conventional entity extraction module for identifying entity mentions in text.
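The four components above can be wired together as in the toy sketch below. Everything here is an assumption made for illustration: the templates, the city gazetteer, and the function names are hypothetical, and `score_actions` returns fixed logits as a stand-in for the trained RNN. The masking step follows the action-mask idea in the paper, where developer code can rule out actions that make no sense in the current state.

```python
import math

# 3. Domain-specific action templates (hypothetical examples).
TEMPLATES = [
    "Which city do you want the weather for?",
    "Here is the forecast for {city}.",
]

KNOWN_CITIES = {"seattle", "boston"}  # toy gazetteer, an assumption

def extract_entities(utterance):
    """4. Entity extraction: a string match stands in for a real tagger."""
    words = utterance.lower().replace("?", " ").replace(".", " ").split()
    return {"city": w.title() for w in words if w in KNOWN_CITIES}

def domain_software(state, entities):
    """2. Developer code: track entities and return an action mask."""
    state.update(entities)
    has_city = "city" in state
    # The forecast template is only allowed once a city is known.
    return [1.0, 1.0 if has_city else 0.0]

def score_actions(state):
    """1. Placeholder for the trained RNN: fixed, hand-picked logits."""
    return [0.5, 1.0]

def choose_action(logits, mask):
    """Softmax over logits, zero out masked actions, renormalise, argmax."""
    m = max(logits)
    exps = [math.exp(x - m) * k for x, k in zip(logits, mask)]
    total = sum(exps)
    probs = [e / total for e in exps]
    return max(range(len(probs)), key=probs.__getitem__)

def respond(state, utterance):
    """One turn of the loop: extract, track, score, mask, fill the template."""
    entities = extract_entities(utterance)
    mask = domain_software(state, entities)
    action = choose_action(score_actions(state), mask)
    return TEMPLATES[action].format(**state)

state = {}
print(respond(state, "What is the weather?"))  # asks for the city
print(respond(state, "In Seattle."))           # fills the forecast template
```

Although the placeholder scorer always prefers the forecast template, the mask forces the system to ask for a city first. This is the kind of domain knowledge that would otherwise have to be learned from data.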

Steps: