# Written against the classic gym API, where step() returns a 4-tuple.
import gym

env = gym.make('CartPole-v0')
for episode in range(20):
    observation = env.reset()  # reset the environment, get the initial observation
    for timestep in range(100):
        env.render()  # visualize the current state
        # print(observation)
        action = env.action_space.sample()  # sample a random action
        observation, reward, done, info = env.step(action)  # one interaction step
        if done:
            print(observation)
            print("Episode {} finished after {} timesteps".format(episode, timestep + 1))
            break
env.close()
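1. The gym programming paradigm
2. The environment object env
Attributes and methods:
env.observation_space: the observation (state) space
env.action_space: the action space
observation = env.reset(): resets the environment and returns the initial observation
env.render(): renders (visualizes) the environment
observation, reward, done, info = env.step(action): advances the environment by one step
env.close(): closes the environment
env.seed(): seeds the environment's random number generator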
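To make the reset / step / close cycle listed above concrete without depending on an installed gym version, here is a minimal sketch of a toy environment that mimics the same interface in pure Python. ToyDiscrete, ToyWalkEnv and run_episode are hypothetical names invented for this illustration. They are not part of gym, and the reward rule is an arbitrary assumption.

```python
import random


class ToyDiscrete:
    """Hypothetical stand-in for a discrete space with actions 0..n-1 (not gym's class)."""

    def __init__(self, n, rng):
        self.n = n
        self._rng = rng

    def sample(self):
        # Draw one action uniformly at random.
        return self._rng.randrange(self.n)

    def contains(self, x):
        return isinstance(x, int) and 0 <= x < self.n


class ToyWalkEnv:
    """Hypothetical 1-D walk: start at 0, the episode ends when |position| reaches 3."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.action_space = ToyDiscrete(2, self._rng)  # 0 = left, 1 = right
        self.position = 0

    def reset(self):
        # Return the initial observation, as in the classic reset() call.
        self.position = 0
        return self.position

    def step(self, action):
        assert self.action_space.contains(action)
        self.position += 1 if action == 1 else -1
        done = abs(self.position) >= 3
        reward = 1.0 if self.position >= 3 else 0.0  # assumed rule: reward only at the right edge
        return self.position, reward, done, {}

    def close(self):
        pass  # nothing to release in this toy example


def run_episode(env, max_steps=100):
    """Run one episode with random actions; mirrors the loop structure used with gym."""
    observation = env.reset()
    total_reward = 0.0
    for timestep in range(max_steps):
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return observation, total_reward, timestep + 1


env = ToyWalkEnv(seed=0)
final_obs, total_reward, steps = run_episode(env)
env.close()
print(final_obs, total_reward, steps)
```

The agent loop only touches action_space.sample(), reset(), step() and close(), which is why the same loop shape carries over to a real gym environment.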
3. gym.spaces: observation and action spaces in detail
Defines the observation space and the action