Notes I took while following Morvan's (莫烦) reinforcement learning tutorial. Original post: https://mofanpy.com/tutorials/machine-learning/reinforcement-learning/
Sarsa
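Differences from Q-Learning:
The update rule is different.
Q-Learning updates with the best estimated action for the next state, but the agent does not necessarily take that action. Sarsa updates with the next action it has already chosen, and it always takes that action.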
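To make the difference concrete, here is a minimal sketch of the two update targets. It is not the tutorial's code: it assumes a plain NumPy array as the Q table, and the helper names (q_learning_target, sarsa_target, td_update) and the toy values are made up for illustration.

```python
import numpy as np

def q_learning_target(q, reward, next_state, gamma=0.9):
    """Off-policy target: uses the best estimated action in next_state,
    whether or not the agent ends up taking it."""
    return reward + gamma * np.max(q[next_state])

def sarsa_target(q, reward, next_state, next_action, gamma=0.9):
    """On-policy target: uses the action already chosen for next_state,
    which the agent will then actually take."""
    return reward + gamma * q[next_state, next_action]

def td_update(q, state, action, target, lr=0.01):
    """Move Q(state, action) a fraction lr toward the target, in place."""
    q[state, action] += lr * (target - q[state, action])

# Toy table: 2 states x 2 actions (values chosen only for illustration).
q = np.array([[0.0, 0.0],
              [1.0, 5.0]])

# Suppose exploration picked action 0 in the next state, not the greedy action 1.
reward, next_state, next_action = 1.0, 1, 0

ql = q_learning_target(q, reward, next_state)          # 1 + 0.9 * 5
sa = sarsa_target(q, reward, next_state, next_action)  # 1 + 0.9 * 1
```

When the chosen next action happens to be the greedy one, the two targets are the same. They only differ on exploratory steps, which is why Sarsa tends to learn more cautious behavior.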
import numpy as np
import pandas as pd
class RL(object):
    def __init__(self, action_space, learning_rate=0.01, reward_decay=0.9, e_greedy=0.9):
        self.actions = action_space  # a list
        self.lr = learning_rate
        self.gamma = reward_decay
        self.epsilon = e_greedy
        self.q_table = pd.