RL
fo-in
Can't do anything yet, but wants to try everything
Planning and Learning with Tabular Methods: Part 3
Contents: Heuristic Search; Rollout Algorithm; Monte Carlo Tree Search; References
Heuristic Search: Decision-time planning methods, collectively known as heuristic search, are classical state-space planning methods in AI. The approximate value function is applied...
Original · 2021-08-02 14:56:43 · 140 views · 0 comments
Planning and Learning with Tabular Methods: Part 2
Contents: Expected vs. Sample Updates; Is it better devoted to a few expected updates or to b times as many sample updates?; Trajectory Sampling; Real-time Dynamic Programming; Planning at Decision Time
Expected vs. Sample Updates: Sample updates can in many cases...
Original · 2021-07-24 16:14:25 · 171 views · 0 comments
Planning and Learning with Tabular Methods: Part 1
Contents: Models and Planning; Planning and learning methods; Random-sample one-step tabular Q-planning; Dyna: Integrated Planning, Acting, and Learning; The reason to put forward Dyna-Q; Model learning and direct RL; Dyna-Q
Methods of reinforcement learning fall into t...
Original · 2021-07-18 17:12:52 · 264 views · 0 comments
n-step Bootstrapping: Part 1
Original · 2021-07-17 19:20:33 · 151 views · 0 comments
Q-learning, Expected Sarsa, Double Learning
Original · 2021-04-21 23:58:49 · 509 views · 0 comments
Sarsa: One of the classical algorithms of RL
Contents: What is TD learning?; On-policy and Off-policy; A brief introduction of Sarsa; A simple implementation
What is TD learning? "TD learning" means "temporal-difference learning", which is a combination of Monte Carlo ideas (MC...
Original · 2021-04-10 21:53:18 · 2150 views · 13 comments
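As a rough illustration of the temporal-difference idea named in the Sarsa entry above, the sketch below applies a single tabular Sarsa update. The function name `sarsa_update`, the dictionary-based Q-table, the state and action labels, and the step-size and discount values are all assumptions chosen for illustration; they are not taken from the post itself.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one Sarsa (on-policy TD) update to q in place; return the TD error.

    q maps (state, action) pairs to estimated action values.
    alpha is the step size and gamma the discount factor (both hypothetical defaults).
    """
    # Bootstrapped target: observed reward plus discounted value of the
    # action actually chosen in the next state (this is what makes it on-policy).
    td_target = reward + gamma * q[(next_state, next_action)]
    td_error = td_target - q[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-state example: unseen pairs default to a value of 0.0.
q_table = defaultdict(float)
q_table[("s1", "a1")] = 1.0
error = sarsa_update(q_table, "s0", "a0", reward=1.0,
                     next_state="s1", next_action="a1",
                     alpha=0.5, gamma=0.9)
print(error, q_table[("s0", "a0")])
```

The update uses the value of the next action that was actually selected, rather than the maximum over actions, which is the usual distinction between Sarsa and Q-learning.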