Summary of *Reinforcement Learning: An Introduction*, Chapter 8: "Planning and Learning with Tabular Methods"

This post summarizes Chapter 8 of *Reinforcement Learning: An Introduction*, covering models and planning, Dyna-Q, the effects of an incorrect model, and prioritized sweeping. The focus is on how to build a model from limited experience, how to improve the policy through simulation, and how to handle the exploration-exploitation trade-off when the model is inaccurate.

A new colleague has joined our group and needs an introduction to RL, so I am starting him on Silver's course.

For myself, I am adding the requirement of a careful read through *Reinforcement Learning: An Introduction*.

My previous reading was not very thorough; this time I want to be more careful and write a short summary of each topic as I go.





8.1 Models and Planning

By a model of the environment we mean anything that an agent can use to predict how the environment will respond to its actions.


The word planning is used in several different ways in different fields. We use the term to refer to any computational process that takes a model as input and produces or improves a policy for interacting with the modeled environment.

The difference is that whereas planning uses simulated experience generated by a model, learning methods use real experience generated by the environment. Of course this difference leads to a number of other differences.
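The idea that planning is just "learning from simulated experience" can be made concrete with random-sample one-step tabular Q-planning, which Section 8.1 uses as its example: repeatedly sample a state-action pair, query the model for a simulated next state and reward, and apply an ordinary Q-learning update to that simulated transition. Below is a minimal sketch on a toy deterministic chain MDP; the environment, its size, and the hyperparameters are illustrative assumptions, not from the book.

```python
import random
from collections import defaultdict

N_STATES = 5          # chain states 0..4; reaching state 4 yields reward 1
ACTIONS = [-1, +1]    # move left or right along the chain
ALPHA, GAMMA = 0.5, 0.9

def model(s, a):
    """Sample model of the environment: returns (next_state, reward)."""
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, 1.0 if s2 == N_STATES - 1 else 0.0

Q = defaultdict(float)
random.seed(0)

for _ in range(2000):
    # 1. Select a state-action pair at random (state 4 is terminal).
    s = random.randrange(N_STATES - 1)
    a = random.choice(ACTIONS)
    # 2. Ask the model for a simulated next state and reward.
    s2, r = model(s, a)
    # 3. Apply one-step tabular Q-learning to the simulated transition.
    Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
                          - Q[(s, a)])

# The greedy policy should move right, toward the rewarding terminal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

Note that the agent never touches the real environment inside the loop: every update is driven by the model's simulated experience, which is exactly what distinguishes this planning method from direct Q-learning on real transitions.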
