To use the RL baselines with a custom environment, the environment only needs to follow the gym interface.
That is to say, your environment must implement the following methods (and inherit from the OpenAI Gym class):
If you are using images as input, the input values must be in [0, 255], because the observation is normalized (divided by 255, so that values fall in [0, 1]) when using CNN policies.
import gym
import numpy as np
from gym import spaces

class CustomEnv(gym.Env):
    """Custom Environment that follows gym interface"""
    metadata = {'render.modes': ['human']}

    def __init__(self, arg1, arg2, ...):
        super(CustomEnv, self).__init__()
        # Define action and observation space
        # They must be gym.spaces objects
        # Example when using discrete actions:
        self.action_space = spaces.Discrete(N_DISCRETE_ACTIONS)
        # Example for using image as input:
        self.observation_space = spaces.Box(low=0, high=255,
                                            shape=(HEIGHT, WIDTH, N_CHANNELS), dtype=np.uint8)

    def step(self, action):
        # Execute one time step within the environment
        # Must return: observation, reward, done, info
        ...

    def reset(self):
        # Reset the environment to an initial state
        # Must return the initial observation
        ...

    def render(self, mode='human', close=False):
        # Render the environment (optional)
        ...
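For illustration, a minimal concrete environment could look like the sketch below. Everything in it (the GoLeftEnv name, the grid size, the reward scheme) is a made-up toy example, not part of the library:

import gym
import numpy as np
from gym import spaces

class GoLeftEnv(gym.Env):
    """Toy 1-D environment: the agent starts at the right end of a grid
    and is rewarded when it reaches cell 0."""
    metadata = {'render.modes': ['human']}

    def __init__(self, grid_size=10):
        super(GoLeftEnv, self).__init__()
        self.grid_size = grid_size
        self.agent_pos = grid_size - 1
        # Two discrete actions: 0 = move left, 1 = move right
        self.action_space = spaces.Discrete(2)
        # Observation: the agent position as a single float
        self.observation_space = spaces.Box(low=0, high=grid_size,
                                            shape=(1,), dtype=np.float32)

    def reset(self):
        self.agent_pos = self.grid_size - 1
        return np.array([self.agent_pos], dtype=np.float32)

    def step(self, action):
        self.agent_pos += -1 if action == 0 else 1
        self.agent_pos = int(np.clip(self.agent_pos, 0, self.grid_size - 1))
        done = self.agent_pos == 0
        reward = 1.0 if done else 0.0
        # step() must return: observation, reward, done, info
        return np.array([self.agent_pos], dtype=np.float32), reward, done, {}

    def render(self, mode='human', close=False):
        print('.' * self.agent_pos + 'A' + '.' * (self.grid_size - 1 - self.agent_pos))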
Then you can define and train an RL agent with it:
from stable_baselines.common.policies import CnnPolicy
from stable_baselines.common.vec_env import DummyVecEnv
from stable_baselines import A2C

# Instantiate and wrap the env
env = DummyVecEnv([lambda: CustomEnv(arg1, ...)])

# Define and train the agent
model = A2C(CnnPolicy, env).learn(total_timesteps=1000)
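Once learn() returns, the trained model can be used for prediction. A short usage sketch, assuming the model and env variables defined above:

# Run the trained agent for a few steps
obs = env.reset()
for _ in range(100):
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()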
There is an online tutorial on creating a custom Gym environment.
Optionally, you can also register the environment with gym, which allows users to create the RL agent in one line (and instantiate the environment with gym.make()).
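A minimal sketch of such a registration, assuming your class lives in a module named custom_env (both the id 'CustomEnv-v0' and the entry point below are hypothetical names):

import gym
from gym.envs.registration import register

# Register the environment under an id of your choice
# (the id and the entry point here are hypothetical names)
register(
    id='CustomEnv-v0',
    entry_point='custom_env:CustomEnv',
)

# The environment can now be instantiated in one line
env = gym.make('CustomEnv-v0')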