silver slides
MrTriste
Machine Learning & Data Mining
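Silver-Slides Chapter 1 - Introduction to Reinforcement Learning: Basic Concepts
Key points: Machine learning = supervised learning + unsupervised learning + reinforcement learning. What makes RL different: there is no supervisor, only a reward signal; feedback is delayed, not instantaneous; time really matters (sequential, non-i.i.d. data); the agent's actions a... Original · 2018-04-10 11:47:32 · 278 views · 0 comments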
Silver-Slides Chapter 2 - Reinforcement Learning: Markov Decision Processes (MDP)
Markov Processes. An MDP describes a fully observable environment for reinforcement learning. Almost all RL problems can be described as MDPs: optimal control primarily deals with continuous MDPs, partially observable problems can be converted into MDPs, and bandits are MDPs with one s... Original · 2018-04-10 20:19:23 · 578 views · 0 comments
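Silver-Slides Chapter 3 - Reinforcement Learning: Dynamic Programming (DP)
Chapter 3 - DP. Introduction: dynamic programming breaks a problem into subproblems. MDPs satisfy the two properties dynamic programming requires, optimal substructure and overlapping subproblems. DP is used for the planning problem in an MDP. All of these methods can be viewed as attempts to achieve much the same e... Original · 2018-04-11 15:13:44 · 429 views · 0 comments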
Silver-Slides Chapter 4 - Monte Carlo (MC) Methods and Temporal-Difference (TD) Learning
Chapter 4 - MC-TD. Introduction. Last lecture: planning by dynamic programming (solve a known MDP). This lecture: model-free prediction (estimate the value function of an unknown MDP). N... Original · 2018-04-12 18:03:38 · 1665 views · 0 comments