Summary of Chapter 9, "On-policy Prediction with Approximation", of Reinforcement Learning: An Introduction

Latest recommended article published on 2021-09-24 21:32:31

mmc2015



Article link: https://blog.csdn.net/mmc2015/article/details/76833734


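A new student has joined our group and I need to help him get started with RL, so we chose to begin with Silver's course.

For myself, I am adding one more requirement: a careful reading of Reinforcement Learning: An Introduction.

I did not read it very carefully before, so this time I hope to be more thorough and to write a brief summary of the key points along the way.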

9.1 Value-function Approximation
9.2 The Prediction Objective (MSVE)
9.3 Stochastic-gradient and Semi-gradient Methods
9.4 Linear Methods
9.5 Feature Construction for Linear Methods
9.5.1 Polynomials
9.5.2 Fourier Basis
9.5.3 Coarse Coding
9.5.4 Tile Coding
9.5.5 Radial Basis Functions
9.6 Nonlinear Function Approximation: Artificial Neural Networks
9.7 Least-Squares TD
9.8 Summary

Not all function approximation methods are equally well suited for use in reinforcement learning. A suitable method must:

==> learn efficiently from incrementally acquired data

==> handle nonstationary target functions
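To make the two requirements concrete, here is a minimal sketch of an incremental update that keeps tracking a target that changes over time. It is not taken from the book: the linear estimator, the feature vector `x`, the step size, and the target that jumps from 1.0 to 3.0 are all illustrative assumptions.

```python
import numpy as np

def sgd_step(w, x, target, alpha):
    """One incremental update of the linear estimate w @ x toward target."""
    error = target - w @ x
    return w + alpha * error * x

rng = np.random.default_rng(0)
w = np.zeros(2)
x = np.array([1.0, 0.5])  # hypothetical feature vector for a single state

for t in range(2000):
    # The target shifts halfway through, so it is nonstationary.
    true_value = 1.0 if t < 1000 else 3.0
    target = true_value + rng.normal(scale=0.1)  # noisy sample, seen one at a time
    w = sgd_step(w, x, target, alpha=0.1)

estimate = w @ x  # ends up close to the most recent target value
```

A constant step size is used here on purpose: it never stops weighting recent samples, which is what lets the estimate follow the shift. A batch method that fits once to a fixed data set would need to be rerun to do the same.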

With genuine approximation, an update at one state affects many others, and it is not possible to get all states exactly correct.

By assumption we have far more states than weights, so making one state’s estimate more accurate invariably means making others’ less accurate.
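The effect can be seen in a small sketch, assuming the simplest possible approximator: state aggregation with 10 states sharing 2 weights. The grouping, the target, and the step size are hypothetical choices made only for illustration.

```python
import numpy as np

n_states, n_weights = 10, 2

# Feature matrix: states 0-4 share weight 0, states 5-9 share weight 1.
features = np.zeros((n_states, n_weights))
features[:5, 0] = 1.0
features[5:, 1] = 1.0

def values(w):
    """Approximate value of every state under weights w."""
    return features @ w

w = np.zeros(n_weights)
before = values(w)

# Update toward a target of 1.0 at state 2 only.
s, target, alpha = 2, 1.0, 0.5
w = w + alpha * (target - values(w)[s]) * features[s]

after = values(w)
changed = np.flatnonzero(after != before)  # states whose estimate moved
```

Although only state 2 was updated, `changed` contains all of states 0 through 4, because they share a weight. If the true values of those five states differ, no setting of that single weight can match them all, which is why a trade-off between states is unavoidable.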

We obtain a natural objective function, the Mean Squared Value Error, or MSVE:
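The formula itself is not included in these notes. As a reference, the standard definition can be written as below; the symbols are an assumption about the edition being read (weight vector θ and state weighting d(s) here, while the published second edition writes w and μ(s) and calls the objective VE).

```latex
\mathrm{MSVE}(\theta) \doteq \sum_{s \in \mathcal{S}} d(s)\,\bigl[v_\pi(s) - \hat{v}(s,\theta)\bigr]^2
```

Here d(s) ≥ 0 says how much we care about the error in state s. This is what resolves the trade-off noted above: since not every state can be exactly right, the weighting decides which states matter most. In the on-policy setting it is usually taken to be the fraction of time spent in s under the policy π.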
