12.4.10 Performance Visualization
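The file plot.py generates charts that visualize the performance of agents trained with two TRPO variants (NN-TRPO and TRLRPO) in three environments (Pendulum, Acrobot, and Mountain Car). The implementation of plot.py is shown below.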
import pickle
import numpy as np
import matplotlib.pyplot as plt

# Load the pickled training results. Judging by the file names, the *_nn files
# hold the NN-TRPO runs and the *_lr files hold the TRLRPO runs.
# Pendulum
res_nn_pend = pickle.load(open('results/pend_nn.pkl','rb'))
res_lr_pend = pickle.load(open('results/pend_lr.pkl','rb'))
# Acrobot
res_nn_acro = pickle.load(open('results/acro_nn.pkl','rb'))
res_lr_acro = pickle.load(open('results/acro_lr.pkl','rb'))
# Mountain Car
res_nn_mount = pickle.load(open('results/mount_nn.pkl','rb'))
res_lr_mount = pickle.load(