Discrete vs. Continuous Control
Discrete
Continuous
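DQN outputs one dimension per action, so it cannot be applied directly to continuous control, where there are infinitely many actions
Policy Network likewise outputs one dimension per action, so it cannot be applied directly to continuous control
To use DQN for continuous control anyway, the continuous action space must first be discretized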
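A minimal sketch of the discretization idea, assuming a box-shaped action space with the same number of grid points on every dimension. The function name discretize_action_space and its parameters are hypothetical, chosen only for illustration. It shows why this approach scales poorly: the number of discrete actions, and therefore the number of DQN outputs, grows exponentially with the action dimension.

```python
import itertools


def discretize_action_space(lows, highs, bins_per_dim):
    """Return every grid point of a box-shaped action space as a tuple.

    lows, highs: per-dimension bounds of the continuous action space.
    bins_per_dim: grid points per dimension (assumed >= 2).
    """
    axes = []
    for low, high in zip(lows, highs):
        step = (high - low) / (bins_per_dim - 1)
        axes.append([low + i * step for i in range(bins_per_dim)])
    # Cartesian product: one discrete action per combination of grid values.
    return list(itertools.product(*axes))


# 2 dimensions with 3 points each gives 3**2 discrete actions.
actions_2d = discretize_action_space([0.0, 0.0], [1.0, 1.0], 3)
# 3 dimensions with 10 points each gives 10**3 discrete actions.
actions_3d = discretize_action_space([0.0] * 3, [1.0] * 3, 10)
print(len(actions_2d), len(actions_3d))
```

Each returned tuple would correspond to one output unit of the DQN, so a grid that is fine enough for precise control quickly becomes too large once the action space has more than a few dimensions.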
Better Approaches to Continuous Control
Deterministic Policy Network
Updating Value Network by TD
Updating Policy Network by DPG
Improvement: Using Target Networks
Improvement Methods
Stochastic Policy for Continuous Control
Policy Network
Univariate Normal Distribution
Multivariate Normal Distribution
Function Approximation
Training Policy Network
Auxiliary Network
Policy Gradient Methods