Play with OpenAI Gym in Ubuntu 16.04: Hello World

止于至玄

OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms.

Install

git clone https://github.com/openai/gym
cd gym
sudo pip install -e .

That is the minimal install. For the full install, which pulls in the dependencies for every environment, run pip install -e '.[all]' instead.

A Demo

import gym
env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000):
    env.render()
    env.step(env.action_space.sample())  # take a random action
env.close()  # close the render window when finished

This demo only verifies that your Gym installation works: it runs CartPole for 1000 steps with random actions and renders each frame.

Observation

import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break

The environment’s step function returns exactly what we need. In fact, step returns four values. These are:

observation (object): An environment-specific object representing your observation of the environment, i.e. the state.
reward (float): The amount of reward achieved by the previous action.
done (boolean): Whether the episode has terminated. Once it is True, call reset() before stepping again.
info (dict): Diagnostic information useful for debugging.
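To make the role of the four return values concrete, here is a minimal sketch that accumulates the reward over one episode. So that it runs without a Gym installation, it uses CoinFlipEnv, a hypothetical toy environment invented for this example (it is not part of Gym) that follows the same reset/step contract described above. The run_episode helper is likewise an assumption of this sketch, not a Gym function.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment (not part of Gym).

    It follows the reset/step contract described above: step returns
    (observation, reward, done, info).
    """

    def __init__(self, max_steps=10, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0  # reward for guessing the coin
        done = self.t >= self.max_steps          # episode ends after max_steps
        info = {"coin": coin}                    # diagnostic data only
        return coin, reward, done, info


def run_episode(env, policy):
    """Run one episode and return (total_reward, number_of_steps)."""
    observation = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


env = CoinFlipEnv(max_steps=10, seed=0)
total_reward, steps = run_episode(env, policy=lambda obs: 1)  # always guess 1
print("total reward:", total_reward, "steps:", steps)
```

The same loop should work with a real Gym environment that returns four values from step, since only reset and step are used.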

Every environment comes with first-class Space objects that describe the valid actions and observations.

print(env.action_space)
#> Discrete(2)
print(env.observation_space)
#> Box(4,)
print(env.observation_space.high)
#> array([ 2.4       ,         inf,  0.20943951,         inf])
print(env.observation_space.low)
#> array([-2.4       ,        -inf, -0.20943951,        -inf])

The Discrete space allows a fixed range of non-negative integers: Discrete(n) contains 0, 1, ..., n-1, so in this case the valid actions are 0 and 1. The Box space represents an n-dimensional box, so valid observations are arrays of 4 numbers, each bounded by the corresponding entries of low and high. Box and Discrete are the most common Spaces. You can sample from a Space or check that something belongs to it:

from gym import spaces
space = spaces.Discrete(8) # Set with 8 elements {0, 1, 2, ..., 7}
x = space.sample()
assert space.contains(x)
assert space.n == 8
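The snippet above covers Discrete. To illustrate what "an n-dimensional box" means for sampling and membership, here is a rough pure-Python sketch. SimpleBox is a hypothetical stand-in written for this post, not Gym's Box class, and it assumes finite bounds (CartPole's real observation space has infinite bounds on two dimensions).

```python
import random


class SimpleBox:
    """Illustrative stand-in for a bounded box space; NOT Gym's Box class."""

    def __init__(self, low, high, seed=None):
        assert len(low) == len(high), "low and high must have the same length"
        self.low = list(low)
        self.high = list(high)
        self.rng = random.Random(seed)

    def sample(self):
        # Draw each coordinate uniformly between its own lower and upper bound.
        return [self.rng.uniform(lo, hi) for lo, hi in zip(self.low, self.high)]

    def contains(self, x):
        # A point belongs to the box if it has the right number of
        # dimensions and every coordinate lies within its bounds.
        return len(x) == len(self.low) and all(
            lo <= v <= hi for lo, v, hi in zip(self.low, x, self.high)
        )


box = SimpleBox(low=[-2.4, -0.21], high=[2.4, 0.21], seed=0)
point = box.sample()
assert box.contains(point)
assert not box.contains([3.0, 0.0])  # first coordinate is out of bounds
assert not box.contains([0.0])       # wrong number of dimensions
```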

Environment

from gym import envs
print(envs.registry.all())

This gives you a list of EnvSpec objects, one for each registered environment.

Record & Upload

Wrap your environment with the Monitor wrapper, which records results to the given directory:

import gym
from gym import wrappers
env = gym.make('CartPole-v0')
env = wrappers.Monitor(env, '/tmp/cartpole-experiment-1')
for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break

You may need to install ffmpeg first, since the Monitor uses it to encode videos:

sudo apt-get install ffmpeg

You can then upload your results to OpenAI Gym:

import gym
gym.upload('/tmp/cartpole-experiment-1', api_key='YOUR_API_KEY')
