RL_by_DavidSilver_notes
Study notes on David Silver's RL course
Ricky050
https://hrxweb.github.io/
Reference
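These notes draw on many sources, all of which are listed below. If anything here infringes your rights, please contact me promptly and I will remove it. Thanks again to the creators of the following resources! GitHub easyRL; Hung-yi Lee (李宏毅), Reinforcement Learning; RL by David Silver
Original · 2021-09-17 17:28:53 · 64 views · 0 comments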
Lect6_Value_Function_Approximation
Contents: Value Function Approximation; Introduction; Why need?; Types of Value Function Approximation; Which Function Approximator?; Incremental Methods; Value Function Approx. by SGD; Linear Function Approximation; Incremental Prediction Algorithms; Control with Value Function Appr…
Original · 2021-10-20 16:03:24 · 171 views · 0 comments
Lect5_Model_free_Control
Contents: Model Free Control; On-Policy Monte-Carlo Control; Generalised Policy Iteration; Monte-Carlo Policy Iteration; Pseudocode; Monte-Carlo Control; GLIE Monte-Carlo Control; On-Policy Temporal-Difference Learning; On-Policy Control With Sarsa; Sarsa(λ); Forward View S…
Original · 2021-10-12 22:53:38 · 118 views · 0 comments
Lect4_MC_TD_Model_free_prediction
Contents: Model-Free Prediction; Monte-Carlo Learning; Monte-Carlo Policy Evaluation; First-Visit Monte-Carlo Policy Evaluation; Incremental Monte-Carlo; Temporal-Difference Learning; MC vs. TD; Unified View; Dynamic Programming Backup; Monte-Carlo Backup; Temporal-Difference Backu…
Original · 2021-10-06 14:33:59 · 150 views · 0 comments
Lect3_Dynamic_Programming
Contents: Planning by Dynamic Programming; Introduction; Requirements for DP; DP used for planning in an MDP; Policy Evaluation; Iterative Policy Evaluation; Example; Policy Iteration; Policy improvement; Value Iteration; Principle of Optimality; Deterministic Value Iteration; A live d…
Original · 2021-09-18 23:31:23 · 139 views · 0 comments
Lect2_MDPs
Contents: Markov Decision Processes; Markov Processes; Definition; Markov Property; State Transition Matrix; Markov Reward Process; Definition; Return; Why discount; Value Function; Bellman Equation; Markov Decision Processes; Definition; Policy; Value Function; Bellman Expectation Equation; O…
Original · 2021-09-17 18:13:06 · 102 views · 0 comments
Lect1_Intro_RL
Contents: Introduction to Reinforcement Learning; The RL Problem; state; Inside An RL Agent; Policy; Value Function; Model; Problems within RL; Learning and Planning; Exploration and Exploitation; Prediction and Control
Original · 2021-09-17 17:45:46 · 77 views · 0 comments