Reinforcement Learning
sinat_34080511
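Dialog Management
"Generating Text with Deep Reinforcement Learning" introduces an architecture that uses a DQN for seq-to-seq learning, decoding the output sequence iteratively. The goal is to let the decoder "first tackle easier portions of the sequences". It uses an LSTM encoder-decoder network. Many practical problems can be framed as seq-to-seq learning problems, including speech recognition,
Original · 2016-09-24 21:55:27 · 1477 views · 0 comments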
RL in NLP
Optimizing RNN architectures: Neural Architecture Search with Reinforcement Learning. Dialog management: Jason D. Williams, Geoffrey Zweig. End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning. End-to-E...
Original · 2018-03-29 21:33:04 · 474 views · 0 comments
13 Policy Gradient
In this chapter we consider methods that instead learn a parameterized policy that can select actions without consulting a value function. A value function may still be used to learn the policy weights
Original · 2017-06-13 22:45:01 · 217 views · 0 comments
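The Policy Gradient excerpt above describes a parameterized policy that picks actions directly, with no value function consulted at selection time. As a rough illustration only, here is a minimal sketch of that idea using a REINFORCE-style update without a baseline. The two-armed bandit, the softmax parameterization, and every name in the code (`softmax`, `train_bandit_policy`, `mean_rewards`, `noise`) are assumptions made for this example and do not come from the post.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences (the policy parameters) into probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit_policy(mean_rewards, steps=2000, lr=0.1, noise=0.1, seed=0):
    """REINFORCE-style training of a softmax policy on a hypothetical bandit.

    mean_rewards[k] is the assumed mean reward of arm k; rewards are drawn
    from a Gaussian with standard deviation `noise` around that mean.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(mean_rewards)  # one preference per action
    for _ in range(steps):
        probs = softmax(theta)
        # Action selection uses only the policy, never a value estimate.
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = rng.gauss(mean_rewards[action], noise)
        # For a softmax policy: d log pi(a) / d theta_k = 1[k == a] - pi(k).
        for k in range(len(theta)):
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * reward * grad_log
    return softmax(theta)


final_probs = train_bandit_policy([0.0, 1.0])
print(final_probs)
```

With these assumed settings the policy should shift most of its probability onto the second arm, which has the higher mean reward. A learned value function could be subtracted from `reward` as a baseline to reduce variance, but action selection itself would still not depend on it.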