Reinforcement Learning
阳光什锦
Chapter 4 Dynamic Programming: Study Notes
Chapter 4 Dynamic Programming. Preface:
Original · 2020-05-29 09:12:36 · 522 views · 0 comments
Chapter 3 Finite Markov Decision Processes
Chapter 3 Finite Markov Decision Processes. MDPs are a classical formalization of sequential decision making. MDPs are meant to be a straightforward framing of the problem of learning from interaction to achieve a goal. 3.1 The Agent–Environment Int.
Original · 2020-05-22 10:22:14 · 417 views · 0 comments
Chapter 2 Multi-armed Bandits: Study Summary
Preface: In this chapter we study the evaluative aspect of reinforcement learning in a simplified setting, one that does not involve learning to act in more than one situation. This nonassociative setting is the one in which most prior work involving eval
Original · 2020-05-15 15:54:11 · 433 views · 0 comments