Reinforcement Learning Paper Notes
超级超级小天才
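Very Comprehensive! A List of Papers Worth Reading in Deep Reinforcement Learning
Source: https://spinningup.openai.com/en/latest/spinningup/keypapers.html
I strongly recommend reading the original page directly; every paper there is linked. The following are papers worth reading in deep reinforcement learning. The list is far from exhaustive, but it should give a useful starting point to anyone hoping to do research in the field.
Model-Free RL
Deep Q-Learning
[1] Playing Atari with Deep Reinforcement Le.
Repost · 2021-06-15 20:50:13 · 1924 views · 0 comments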
[TRPO] Trust Region Policy Optimization
Paper: http://proceedings.mlr.press/v37/schulman15
Citation: Schulman J, Levine S, Abbeel P, et al. Trust region policy optimization[C]//International conference on machine learning. PMLR, 2015: 1889-1897.
Overview
The Trust Region Policy Optimization (TRPO) algorithm is a model-free…
Original · 2021-06-04 20:10:52 · 753 views · 0 comments
[DDPG] Continuous Control with Deep Reinforcement Learning
Paper: https://arxiv.org/abs/1509.02971
Citation: Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning[J]. arXiv preprint arXiv:1509.02971, 2015.
Overview
Deep Deterministic Policy Gradient (DDPG) is the DPG algorithm combined with deep learning. It is a model-free, o…
Original · 2021-06-02 09:35:31 · 2266 views · 0 comments
[DQN] Playing Atari with Deep Reinforcement Learning
Paper: https://arxiv.org/abs/1312.5602
Citation: Mnih V, Kavukcuoglu K, Silver D, et al. Playing atari with deep reinforcement learning[J]. arXiv preprint arXiv:1312.5602, 2013.
Overview
Deep Q-Network (DQN) is a model-free, off-policy reinforcement learning algorithm that uses a deep neural network as a nonlinear function…
Original · 2021-06-01 18:04:41 · 633 views · 2 comments