The CarRacing environment in Gymnasium

References:

  1. https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/box2d/car_racing.py
  2. https://gymnasium.farama.org/environments/box2d/car_racing/

Creating the CarRacing environment with gymnasium.make()

CarRacing is a top-down racing environment and the simplest control task to learn from pixels.
What happens if you run the following in PowerShell?

    python gymnasium/envs/box2d/car_racing.py

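Some indicators are shown at the bottom of the window along with the state RGB buffer. From left to right: true speed, four ABS sensors, steering wheel position, and gyroscope.
[Game screenshot]
Remember: it's a powerful rear-wheel drive car, so don't press the accelerator and turn at the same time.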

Action Space

If continuous, there are 3 actions:
- 0: steering, -1 is full left, +1 is full right
- 1: gas
- 2: braking

If discrete, there are 5 actions:
- 0: do nothing
- 1: steer left
- 2: steer right
- 3: gas
- 4: brake
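
The two action formats above can be sketched in plain Python without the library. This is a minimal illustration, not Gymnasium's implementation: the names `DiscreteAction` and `clip_continuous_action` are hypothetical, and the [0, 1] range for gas and braking is an assumption based on the Gymnasium documentation rather than on the list above.

```python
from enum import IntEnum


class DiscreteAction(IntEnum):
    """Hypothetical names for the five discrete action indices listed above."""
    DO_NOTHING = 0
    STEER_LEFT = 1
    STEER_RIGHT = 2
    GAS = 3
    BRAKE = 4


def clip_continuous_action(steering, gas, braking):
    """Clamp a continuous action to its valid range.

    steering is limited to [-1, 1] (-1 is full left, +1 is full right);
    gas and braking are assumed to lie in [0, 1].
    """
    steering = max(-1.0, min(1.0, float(steering)))
    gas = max(0.0, min(1.0, float(gas)))
    braking = max(0.0, min(1.0, float(braking)))
    return [steering, gas, braking]


action = clip_continuous_action(-1.7, 0.4, 2.0)
print(action)  # [-1.0, 0.4, 1.0]
print(int(DiscreteAction.GAS))  # 3
```

In a real training loop the clipped list would typically be converted to a NumPy array before being passed to `env.step()`.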

Observation Space

A top-down 96x96 RGB image of the car and race track.

Rewards

The reward is -0.1 every frame and +1000/N for every track tile visited, where N is the total number of tiles in the track. For example, if you visit every tile and finish in 732 frames, your reward is 1000 - 0.1*732 = 926.8 points.
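
The reward rule can be checked with a few lines of arithmetic. The sketch below only restates the formula from the text; the function name `episode_reward` and the tile count of 300 are made up for illustration, and the off-track penalty is not modelled.

```python
def episode_reward(frames, tiles_visited, total_tiles):
    """Total reward: +1000/N per visited tile, -0.1 per frame."""
    if total_tiles <= 0:
        raise ValueError("total_tiles must be positive")
    tile_bonus = 1000.0 * tiles_visited / total_tiles
    frame_penalty = 0.1 * frames
    return tile_bonus - frame_penalty


# Every tile visited in 732 frames (the tile count 300 is a made-up value).
print(round(episode_reward(732, 300, 300), 1))  # 926.8
```

Because the tile bonus is normalized by N, a complete lap is always worth 1000 points before the per-frame penalty, regardless of track length.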

Starting State

The car starts at rest in the center of the road.

Episode Termination

The episode finishes when all the tiles are visited. The car can also go outside the playfield, that is, far off the track, in which case it receives a reward of -100 and dies.

Arguments

```python
>>> import gymnasium as gym
>>> env = gym.make("CarRacing-v2", render_mode="rgb_array", lap_complete_percent=0.95, domain_randomize=False, continuous=False)
>>> env
<TimeLimit<OrderEnforcing<PassiveEnvChecker<CarRacing<CarRacing-v2>>>>>
```
  • lap_complete_percent=0.95 sets the fraction of tiles the agent must visit
    before a lap is considered complete. Here the agent must pass over 95% of the track tiles.

  • domain_randomize=False disables the domain-randomized variant of the environment.
    When it is set to True, the background and track colours are different on every reset, which can help improve generalization.

  • continuous=False converts the environment to use a discrete action space.
    The discrete action space has 5 actions: [do nothing, left, right, gas, brake]. With the default, continuous=True, the environment uses the continuous action space.
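
To make the meaning of lap_complete_percent concrete, here is a simplified sketch of a lap-completion check. It is an assumption-laden illustration, not the environment's code: `lap_is_complete` is a hypothetical helper, and the real environment may differ in detail (for example in how it detects that the car has crossed the start line again, or in whether the comparison is strict).

```python
def lap_is_complete(tiles_visited, total_tiles, lap_complete_percent=0.95):
    """Return True once the visited fraction reaches the threshold.

    Simplified: only the fraction of visited tiles is checked.
    """
    if total_tiles <= 0:
        raise ValueError("total_tiles must be positive")
    return tiles_visited / total_tiles >= lap_complete_percent


# The track length of 300 tiles is a made-up value.
print(lap_is_complete(290, 300))  # True
print(lap_is_complete(200, 300))  # False
```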
