Reinforcement Learning Study Notes
leetteel
Master's degree, Xi'an Jiaotong University
Sarsa-Lambda
    from maze_env import Maze
    from RL_brain import SarsaLambdaTable

    def update():
        for episode in range(100):
            # initial observation
            observation = env.reset()
            # RL choose action based on observation
            action = RL.choose_actio…

Original · 2021-09-11 20:59:46 · 90 views · 0 comments
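The excerpt above is cut off before the learning step, so here is a minimal sketch of what a tabular Sarsa(λ) loop can look like. It does not reproduce `SarsaLambdaTable` or `Maze` from the post: the 1-D corridor environment, the `step`, `choose_action`, and `sarsa_lambda` names, and all hyperparameters are assumptions made for illustration. It uses replacing eligibility traces, which is one common variant.

```python
import numpy as np

N_STATES = 6        # hypothetical 1-D corridor; the rightmost cell is terminal
ACTIONS = (-1, +1)  # index 0: step left, index 1: step right


def step(state, action_idx):
    """Hypothetical environment: reward 1 on reaching the right end, else 0."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def choose_action(Q, state, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))
    best = np.flatnonzero(Q[state] == Q[state].max())
    return int(rng.choice(best))


def sarsa_lambda(episodes=200, alpha=0.1, gamma=0.9, lam=0.9,
                 epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(episodes):
        E = np.zeros_like(Q)          # eligibility traces, reset every episode
        s = 0
        a = choose_action(Q, s, epsilon, rng)
        done = False
        while not done:
            s2, r, done = step(s, a)
            a2 = choose_action(Q, s2, epsilon, rng)
            # On-policy target: bootstrap from the action actually chosen next
            target = r if done else r + gamma * Q[s2, a2]
            delta = target - Q[s, a]
            E[s] = 0.0                # replacing trace: clear this state's row
            E[s, a] = 1.0
            Q += alpha * delta * E    # credit all recently visited pairs
            E *= gamma * lam          # decay every trace
            s, a = s2, a2
    return Q
```

Calling `sarsa_lambda()` returns a `(6, 2)` table in which the "step right" value of the cell next to the goal approaches 1. Setting `lam=0` reduces the update to one-step Sarsa, since only the current state-action pair keeps a nonzero trace.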
Q Learning
    import numpy as np
    import random

    R = np.ones((12,12))
    R = R*-1
    R[0,3]=R[3,0]=0
    R[1,2]=R[2,1]=0
    R[2,5]=R[5,2]=0
    R[6,7]=R[7,6]=0
    R[3,6]=R[6,3]=0
    R[3,4]=R[4,3]=0
    R[1,4]=R[4,1]=0
    R[7,4]=R[4,7]=0
    R[5,8]=R[8,5]=0
    R[8,9]=R[9,8]=0
    R[9,10]=R[10,9]=0
    R[10,11]=R[11.…

Original · 2020-10-19 22:52:07 · 292 views · 1 comment
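The excerpt stops while the reward matrix is still being built, so the following is only a sketch of how Q-learning could proceed over such a matrix, where `-1` marks "no connection" and `0` marks a passable edge. The edge list copies the pairs visible above; the symmetric 10-11 edge, the goal state `11`, the goal reward of `100`, and the `build_rewards`, `train`, and `greedy_path` helpers are assumptions, not taken from the post.

```python
import numpy as np

# Edges visible in the excerpt. The last pair is truncated in the source,
# so treating 10-11 as a symmetric edge is an assumption.
EDGES = [(0, 3), (1, 2), (2, 5), (6, 7), (3, 6), (3, 4), (1, 4),
         (7, 4), (5, 8), (8, 9), (9, 10), (10, 11)]
GOAL = 11          # assumed goal state
GOAL_REWARD = 100  # assumed reward for stepping into the goal


def build_rewards(n_states=12):
    """R[s, a]: -1 = no edge, 0 = edge, GOAL_REWARD = edge entering GOAL."""
    R = -np.ones((n_states, n_states))
    for a, b in EDGES:
        R[a, b] = R[b, a] = 0
    for s in range(n_states):
        if R[s, GOAL] == 0:
            R[s, GOAL] = GOAL_REWARD
    return R


def train(R, gamma=0.8, alpha=1.0, episodes=300, seed=0):
    """Off-policy Q-learning driven by a uniformly random walk over valid edges."""
    rng = np.random.default_rng(seed)
    Q = np.zeros_like(R)
    for _ in range(episodes):
        s = int(rng.integers(R.shape[0]))
        while s != GOAL:
            a = int(rng.choice(np.flatnonzero(R[s] >= 0)))  # action = next state
            # Target uses the best next value, regardless of the action taken next
            target = R[s, a] + gamma * Q[a].max()
            Q[s, a] += alpha * (target - Q[s, a])
            s = a
    return Q


def greedy_path(Q, start, max_steps=50):
    """Follow the highest-valued action from start until GOAL is reached."""
    path = [start]
    while path[-1] != GOAL and len(path) <= max_steps:
        path.append(int(np.argmax(Q[path[-1]])))
    return path
```

With these assumptions, `greedy_path(train(build_rewards()), 0)` walks from state 0 to state 11 along the shortest route through the graph. A learning rate of `alpha=1.0` is safe here only because the transitions are deterministic; a stochastic environment would need a smaller value.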