Asynchronous Methods for Deep Reinforcement Learning (A3C) (ICML, 2016)
1. Introduction
- Online DRL
  - Problems
    - strongly correlated samples lead to unstable training
  - Solution
    - experience replay (ER)
      - only usable with off-policy algorithms
      - high memory and computation cost
- Asynchronous methods (this paper)
  - replace ER: parallel actor-learners decorrelate the data, avoiding ER's problems
4. Asynchronous RL Framework
- Design goal
  - train deep neural network policies reliably and without large resource requirements
- Main ideas
  - asynchronous actor-learners
    - run in parallel (to explore different parts of the environment)
- Techniques (a code sketch follows this list)
  - accumulate gradients over multiple steps before each asynchronous update
  - n-step bootstrapped returns
  - parameter sharing (actor-learners update a shared network; in A3C the policy and value function also share non-output layers)
  - exploration
    - a different ϵ per actor-learner (for the value-based methods)
    - entropy regularization of the policy (for A3C)
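
A minimal single-thread sketch of one actor-learner update, assuming PyTorch; `ActorCritic`, `a3c_update`, the rollout format, the hidden size, and the loss coefficients are illustrative assumptions, not the paper's implementation. It shows the n-step bootstrapped return, the entropy bonus, gradient accumulation over the rollout, and pushing the gradients into a shared network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ActorCritic(nn.Module):
    """Policy and value heads on a shared body (the 'parameter sharing'
    between policy and value function; sizes here are illustrative)."""

    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # logits of pi(a|s)
        self.value_head = nn.Linear(hidden, 1)            # V(s)

    def forward(self, obs):
        h = self.body(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)


def a3c_update(local_net, shared_net, optimizer, rollout, gamma=0.99,
               value_coef=0.5, entropy_coef=0.01):
    """One update from an n-step rollout collected by a single actor-learner.

    rollout["steps"]: list of (obs_tensor, action_int, reward_float)
    rollout["bootstrap_value"]: 0.0 if the episode ended, else V(s_{t+n})
    Gradients are accumulated over the whole rollout and applied to the
    shared parameters in one asynchronous step (hypothetical rollout format).
    """
    obs = torch.stack([s[0] for s in rollout["steps"]])
    actions = torch.tensor([s[1] for s in rollout["steps"]])
    rewards = [s[2] for s in rollout["steps"]]

    logits, values = local_net(obs)
    log_probs = F.log_softmax(logits, dim=-1)

    # n-step bootstrapped returns, computed backwards from the bootstrap value
    R = rollout["bootstrap_value"]
    returns = []
    for r in reversed(rewards):
        R = r + gamma * R
        returns.append(R)
    returns = torch.tensor(list(reversed(returns)), dtype=torch.float32)

    advantages = returns - values.detach()
    chosen_log_probs = log_probs[torch.arange(len(actions)), actions]

    policy_loss = -(chosen_log_probs * advantages).sum()
    value_loss = F.mse_loss(values, returns, reduction="sum")
    entropy = -(log_probs.exp() * log_probs).sum()  # entropy bonus -> exploration

    loss = policy_loss + value_coef * value_loss - entropy_coef * entropy

    local_net.zero_grad()
    loss.backward()
    # Copy the accumulated gradients to the shared network and step it,
    # then pull the fresh shared parameters back into the local copy.
    for lp, sp in zip(local_net.parameters(), shared_net.parameters()):
        sp.grad = lp.grad
    optimizer.step()
    local_net.load_state_dict(shared_net.state_dict())
```

In a full implementation, many such actor-learners would run this loop in parallel threads, each with its own environment instance (and, for the value-based variants, its own ϵ), all updating the one shared network.
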
5. Experiments
- Atari
  - n-step methods learn faster than one-step methods
  - A3C outperforms the other methods
- Scalability
  - speedup improves with the number of actor-learners
  - n-step methods show larger speedup ratios
    - perhaps due to a reduction of bias
- Robustness
  - there is usually a broad range of learning rates that give good results