Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
[Brief Paper Analysis] SAC: Soft Actor-Critic Part 1 [1801.01290] (bilibili)
Soft Actor Critic