David Silver Deep Reinforcement Learning
Introduction to RL
ttangzr
(David Silver Deep Reinforcement Learning) - Lecture 2 - Markov Decision Processes
David Silver's deep reinforcement learning course (2019). Notes for documentation and discussion. Lecture 2: Markov Decision Processes. Ⅰ Markov Processes (Markov Chain). 1. Introduction to MDPs: An MDP describes the environment in RL, and that environment is fully observable. The state in an MDP completely describes almost all of the process's … Original · 2020-06-27 17:47:59 · 412 views · 1 comment
(David Silver Deep Reinforcement Learning) - Lecture 1: Introduction to RL
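David Silver's deep reinforcement learning course (2019). Notes for documentation and discussion. Lecture 1: Introduction. Outline. Ⅰ The RL Problem. 1. Reward: the reward $R_t$ is a scalar feedback signal that indicates how well the agent is doing at each step. The agent's goal is to maximize cumulative reward. The reward hypothesis proposed in the course: All goals can be described by … Original · 2020-06-24 10:43:11 · 424 views · 0 comments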