PTAN, a High-Level Package for Deep Reinforcement Learning, Part 1: Agent and Experience

安迪AI

Category: Deep Reinforcement Learning. Tags: reinforcement learning, machine learning, artificial intelligence

Article link: https://blog.csdn.net/HJJ19881016/article/details/105743835

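PTAN is an open-source reinforcement learning library that provides prepackaged Agent classes such as DQNAgent and PolicyAgent, which makes code easier to reuse and reduces development work. An Agent class produces actions from observations: DQNAgent suits discrete actions, while PolicyAgent samples actions from a probability distribution produced by the network. The Experience classes capture the interaction between the agent and the environment; ExperienceSource and ExperienceSourceFirstLast record that interaction in different ways. ExperienceSourceFirstLast in particular accumulates the reward over multiple steps and exposes only the first and last states.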

Introduction to PTAN

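ptan is an open-source RL wrapper package (hosted on GitHub) that packages commonly used RL code to improve reuse and reduce development effort. It can be installed as follows: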

Install from PyPI: pip install ptan
Install from GitHub: pip install git+https://github.com/Shmuma/ptan.git

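The commonly used wrapper classes in ptan include:

Agent: produces an action from the input observation
Experience: captures the information from the interaction between the agent and the environment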
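
To make the Experience idea more concrete, here is a rough, library-free sketch of how several consecutive steps might be collapsed into a single record that keeps only the first state, the first action, the accumulated reward, and the last state. This is not ptan's code: the names Step, FirstLast, and collapse_first_last are hypothetical, and the discount factor gamma and the use of None for a finished episode are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record types for this sketch (not imported from ptan)
Step = namedtuple("Step", ["state", "action", "reward", "done"])
FirstLast = namedtuple("FirstLast", ["state", "action", "reward", "last_state"])


def collapse_first_last(steps, next_state, gamma=1.0):
    """Collapse consecutive steps into one first-last record.

    state and action come from the first step, reward is the discounted
    sum over all steps, and last_state is the state reached afterwards
    (None if the final step ended the episode; an assumption here).
    """
    total = 0.0
    # Walk backwards so each earlier reward gets one less discount factor
    for step in reversed(steps):
        total = step.reward + gamma * total
    last_state = None if steps[-1].done else next_state
    return FirstLast(steps[0].state, steps[0].action, total, last_state)


steps = [
    Step(state=0, action=1, reward=1.0, done=False),
    Step(state=1, action=0, reward=2.0, done=False),
]
record = collapse_first_last(steps, next_state=2, gamma=0.5)
print(record)
```

With gamma=0.5 the accumulated reward is 1.0 + 0.5 * 2.0 = 2.0, and the intermediate state 1 no longer appears in the record.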

Agent

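The internal workflow of the Agent wrapper class is as follows:

ptan already ships some commonly used Agent classes: DQNAgent and PolicyAgent. To define a custom agent, inherit from the base class BaseAgent.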
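
The observation-to-action flow of an agent can be pictured roughly with the plain-Python sketch below. It is only an illustration under stated assumptions, not ptan's implementation: SketchAgent, argmax_selector, and toy_model are hypothetical names, and the three stages (optional preprocessing, model, action selection) and the (actions, agent_states) return value are assumed for this sketch.

```python
from typing import Callable, List, Optional, Sequence


def argmax_selector(scores: Sequence[Sequence[float]]) -> List[int]:
    """Pick the index of the highest score in every row."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


class SketchAgent:
    """Hypothetical agent: observations -> model scores -> selected actions."""

    def __init__(self, model: Callable, selector: Callable = argmax_selector,
                 preprocessor: Optional[Callable] = None):
        self.model = model
        self.selector = selector
        self.preprocessor = preprocessor

    def __call__(self, observations, agent_states=None):
        # One (possibly empty) internal state per observation in the batch
        if agent_states is None:
            agent_states = [None] * len(observations)
        if self.preprocessor is not None:
            observations = self.preprocessor(observations)
        scores = self.model(observations)   # one row of scores per observation
        actions = self.selector(scores)     # one action index per observation
        return actions, agent_states


def toy_model(batch):
    """Stand-in for a network: row i gives action (i % 3) the top score."""
    return [[1.0 if a == i % 3 else 0.0 for a in range(3)]
            for i in range(len(batch))]


agent = SketchAgent(toy_model)
actions, states = agent([[0.0, 0.0], [0.0, 0.0]])
print(actions)
```

Keeping the selector separate from the model means the same model can be paired with a greedy choice, as here, or with a different selection rule without touching the agent itself.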

DQNAgent

Mainly used for the DQN family of algorithms, where the network outputs one value per discrete action. It is used as follows:

import torch.nn as nn
import torch.nn.functional as F
import torch
import ptan

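# Build a simple stand-in model
class DQNModel(nn.Module):
    def __init__(self, n_actions):
        super().__init__()
        self.n_actions = n_actions

    def forward(self, x):
        # Ignore the input values and return a (batch_size, n_actions) matrix
        # with ones on the diagonal, so the output is easy to predict
        return torch.eye(x.size()[0], self.n_actions)
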
net = DQNModel(n_actions=5)
x = torch.zeros(3, 9)
print("input is:\n", x)
print("output is:\n", net(x))

-- Output --
input is:
 tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0.]])
output is:
 tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0.