Continuous control with deep reinforcement learning

https://arxiv.org/abs/1509.02971

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion and car driving. Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives. We further demonstrate that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
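The core idea named in the abstract — an actor-critic method built on the deterministic policy gradient — can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's implementation: the actor and critic below are linear (the paper uses deep networks with target networks and a replay buffer), and all names and shapes are made up for illustration. It shows only the actor update, which ascends `grad_a Q(s, mu(s)) * grad_theta mu(s)` averaged over states.

```python
import numpy as np

# Toy deterministic-policy-gradient actor update (illustrative only;
# linear actor/critic stand in for the paper's deep networks).
rng = np.random.default_rng(0)
state_dim, action_dim = 3, 2

W_actor = rng.normal(size=(action_dim, state_dim))  # policy mu(s) = W_actor @ s
w_q = rng.normal(size=(state_dim + action_dim,))    # critic Q(s, a) = w_q . [s; a]

def mu(s):
    return W_actor @ s

def grad_a_Q(s, a):
    # For this linear critic, dQ/da is just the action slice of w_q.
    return w_q[state_dim:]

def actor_update(states, lr=1e-2):
    # Deterministic policy gradient: average over states of
    # grad_a Q(s, mu(s)) @ grad_theta mu(s); for mu(s) = W s the
    # per-state gradient w.r.t. W is outer(dQ/da, s).
    global W_actor
    g = np.zeros_like(W_actor)
    for s in states:
        g += np.outer(grad_a_Q(s, mu(s)), s)
    W_actor += lr * g / len(states)  # gradient *ascent* on Q

states = rng.normal(size=(8, state_dim))
q_before = np.mean([w_q @ np.concatenate([s, mu(s)]) for s in states])
actor_update(states)
q_after = np.mean([w_q @ np.concatenate([s, mu(s)]) for s in states])
print(q_after > q_before)  # the step should raise the average critic value
```

Because the toy objective is linear in the actor weights, one ascent step along its gradient is guaranteed to increase the batch-averaged Q value; in the actual algorithm the same update direction is followed with nonlinear function approximators.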
Comments: 10 pages + supplementary
Subjects: Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1509.02971 [cs.LG]
  (or arXiv:1509.02971v5 [cs.LG] for this version)

Submission history

From: Jonathan Hunt
[v1] Wed, 9 Sep 2015 23:01:36 GMT (344kb,D)
[v2] Wed, 18 Nov 2015 17:34:41 GMT (338kb,D)
[v3] Thu, 7 Jan 2016 19:09:07 GMT (338kb,D)
[v4] Tue, 19 Jan 2016 20:30:47 GMT (339kb,D)
[v5] Mon, 29 Feb 2016 18:45:53 GMT (339kb,D)
