RL
fo-in
Knows nothing, wants to do everything
Planning and Learning with Tabular Methods: Part 3
Contents: Heuristic Search; Rollout Algorithm; Monte Carlo Tree Search; References. Heuristic Search: Decision-time planning methods, collectively known as heuristic search, are classical state-space planning methods in AI. The approximate value function is applied…
Original · 2021-08-02 14:56:43 · 126 views · 0 comments
Planning and Learning with Tabular Methods: Part 2
Contents: Expected vs. Sample Updates; Is it better devoted to a few expected updates or to b times as many sample updates?; Trajectory Sampling; Real-time Dynamic Programming; Planning at Decision Time. Expected vs. Sample Updates: Sample updates can in many cases…
Original · 2021-07-24 16:14:25 · 164 views · 0 comments
Planning and Learning with Tabular Methods: Part 1
Contents: Models and Planning; Planning and learning methods; Random-sample one-step tabular Q-planning; Dyna: Integrated Planning, Acting, and Learning; The reason to put forward Dyna-Q; Model learning and direct RL; Dyna-Q. Methods of reinforcement learning fall into t…
Original · 2021-07-18 17:12:52 · 251 views · 0 comments
n-step Bootstrapping: Part 1
Contents: Level 1 heading; Level 2 heading; Level 3 heading.
Original · 2021-07-17 19:20:33 · 143 views · 0 comments
Q-learning, Expected Sarsa, Double Learning
Contents: Level 1 heading; Level 2 heading; Level 3 heading.
Original · 2021-04-21 23:58:49 · 492 views · 0 comments
Sarsa: One of the classical algorithms of RL
Contents: What is TD learning?; On-policy and Off-policy; A brief introduction to Sarsa; A simple implementation. What is TD learning? "TD learning" means "temporal-difference learning", which is a combination of Monte Carlo ideas (MC…
Original · 2021-04-10 21:53:18 · 2138 views · 13 comments