September 2019 - lh15123as


[Original] Chapter 7. n-step Bootstrapping

Contents: 7.1 n-step TD Prediction; 7.2 n-step Sarsa; 7.3 n-step Off-policy Learning by Importance Sampling. 7.1 n-step TD Prediction. Input: a policy $\pi$. Algorithm parameters: step size $\alpha \in (0,1]$, a positive integer $n$. For $s \in S$ ...

2019-09-05 17:00:32 212

[Original] Chapter 6. Temporal-Difference Learning

Contents: 6.1 TD Prediction. 6.1 TD Prediction

2019-09-05 14:25:50 109

[Original] Chapter 5. Monte Carlo Methods

Contents: 5.1 Monte Carlo Prediction; 5.2 Monte Carlo Estimation of Action Values; 5.3 Monte Carlo Control; 5.4 Monte Carlo Control without Exploring Starts; 5.5 Off-policy Prediction via Importance Sampling; 5.6 Inc...

2019-09-04 23:56:08 189

[Original] Chapter 4. Dynamic Programming

Reinforcement Learning. Chapter 4. Dynamic Programming. $v_*(s) = \max_a \mathbb{E}[R_{t+1} + \gamma v_*(S_{t+1}) \mid S_t = s, A_t = a] = \max_a \sum_{s', r} p(s', r \mid s, a)\,[r + \gamma v_*(s')]$ (4.1) ...

2019-09-03 11:59:24 195

DeeplearningAI Neural Networks and Deep Learning: Week 2 Assignment

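Typed up by following someone else's version. It runs as-is, and the images and dataset are all included, so there is no need to find them yourself.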

2017-11-08
