How to plot reinforcement learning reward/loss curves with top-journal-quality comparisons

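Write the reward values produced during training to a text file, one value per line. In general, the more times you repeat training and the more experimental data you collect, the better the plot looks, because seaborn averages the repeated runs into one curve and shades the spread around it. Below are the code and the resulting plot for a single curve. The number of epochs is 25.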
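The plotting code expects one reward per line in each text file. A minimal sketch of how such a file could be written at the end of a training run is shown here; `save_rewards`, `demo_rewards`, and the file name `examp_demo.txt` are illustrative assumptions, and the reward values are placeholders rather than real results.

```python
import os
import tempfile


def save_rewards(rewards, path):
    """Write one reward per line so each line can be parsed with float()."""
    with open(path, "w") as file:
        for reward in rewards:
            file.write(f"{reward}\n")


# Placeholder rewards standing in for one short training run (not real data).
demo_rewards = [-21.0, -18.5, -12.0, -7.25, -3.0]

# Written to the system temp directory here; in practice, use a path such as
# 'examp1.txt' next to the plotting script.
demo_path = os.path.join(tempfile.gettempdir(), "examp_demo.txt")
save_rewards(demo_rewards, demo_path)
```

Saving each training run to its own file (`examp1.txt`, `examp2.txt`, and so on) keeps the runs separate, which is what the single-curve plotting code that follows relies on.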

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def load_rewards(path):
    # Read one reward per line, skipping blank lines.
    with open(path, 'r') as file:
        return [float(line) for line in file if line.strip()]


# Each file holds the per-episode rewards of one training run.
rewards1 = load_rewards('examp1.txt')
rewards2 = load_rewards('examp2.txt')

# Stack the runs so every episode index appears once per run;
# seaborn then draws the mean over runs with a confidence band.
rewards = np.concatenate((rewards1, rewards2))
episode = np.concatenate((np.arange(len(rewards1)), np.arange(len(rewards2))))

sns.set_style('white')
sns.lineplot(x=episode, y=rewards, color='blue', label='example')

plt.xlabel("episode")
plt.ylabel("reward")

# Save before calling show(), so the saved file is not an empty figure.
plt.savefig('episode_reward.png')
plt.show()
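The reason more runs give a better plot is that `sns.lineplot`, when the same x value occurs several times, by default draws the mean of the y values and shades a 95% confidence interval around it. The sketch below computes the per-episode mean by hand to make that aggregation concrete. It assumes all runs have the same length; `mean_and_std` is an illustrative name, and the standard deviation it returns is only a simple measure of spread, not the bootstrapped interval seaborn draws.

```python
import numpy as np


def mean_and_std(runs):
    """Per-episode mean and standard deviation over runs of equal length."""
    data = np.asarray(runs, dtype=float)  # shape: (n_runs, n_episodes)
    return data.mean(axis=0), data.std(axis=0)


# Two placeholder runs of three episodes each (not real training data).
mean, std = mean_and_std([[1, 2, 3], [3, 4, 5]])
print(mean)  # per-episode average over the two runs
print(std)   # per-episode spread over the two runs
```

With only one run per curve there is nothing to average, so the line is just the raw, noisy reward trace and no band is drawn.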

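To produce a comparison plot, save the data for each method being compared and follow the example below. Here are the code and the resulting plot comparing two reward curves. The number of epochs is 25. Comparisons of more than two curves work the same way.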

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def load_rewards(path):
    # Read one reward per line, skipping blank lines.
    with open(path, 'r') as file:
        return [float(line) for line in file if line.strip()]


# examp1/examp2 are treated as two runs of the first method,
# examp3/examp4 as two runs of the second method.
rewards1 = load_rewards('examp1.txt')
rewards2 = load_rewards('examp2.txt')
rewards3 = load_rewards('examp3.txt')
rewards4 = load_rewards('examp4.txt')

rewards = np.concatenate((rewards1, rewards2))
rewardss = np.concatenate((rewards3, rewards4))

# Episode indices restart at 0 for every run, so runs of one method overlap.
episode = np.concatenate((np.arange(len(rewards1)), np.arange(len(rewards2))))
episodes = np.concatenate((np.arange(len(rewards3)), np.arange(len(rewards4))))

sns.set_style('white')
# One lineplot call per method; each draws its own mean curve and band.
sns.lineplot(x=episode, y=rewards, color='blue', label='example1')
sns.lineplot(x=episodes, y=rewardss, color='red', label='example2')

plt.xlabel("episode")
plt.ylabel("reward")

# Save before calling show(), so the saved file is not an empty figure.
plt.savefig('episode_reward_comparison.png')
plt.show()
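The same pattern extends to any number of curves. The sketch below is one possible way to do it, assuming each method's runs are already loaded as lists of floats. `stack_runs`, `plot_comparison`, the `methods` dictionary, and the output file name are illustrative names, not part of the code above.

```python
import numpy as np


def stack_runs(runs):
    """Concatenate several runs into matching (episode, reward) arrays.

    The episode index restarts at 0 for each run, so seaborn sees every
    episode once per run and can average over the runs.
    """
    episode = np.concatenate([np.arange(len(run)) for run in runs])
    rewards = np.concatenate([np.asarray(run, dtype=float) for run in runs])
    return episode, rewards


def plot_comparison(methods, out_path="episode_reward_comparison.png"):
    """Draw one mean curve with a confidence band per method.

    `methods` maps a legend label to a (color, list_of_runs) pair.
    """
    # Imported here so stack_runs can be used without a plotting backend.
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.set_style("white")
    for label, (color, runs) in methods.items():
        episode, rewards = stack_runs(runs)
        sns.lineplot(x=episode, y=rewards, color=color, label=label)
    plt.xlabel("episode")
    plt.ylabel("reward")
    plt.savefig(out_path)
    plt.show()
```

A call might look like `plot_comparison({"example1": ("blue", [rewards1, rewards2]), "example2": ("red", [rewards3, rewards4]), "example3": ("green", [rewards5, rewards6])})`, where `rewards5` and `rewards6` would be two further runs loaded the same way as the others.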
