References:
- https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/box2d/car_racing.py
- https://gymnasium.farama.org/environments/box2d/car_racing/
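Understanding the CarRacing environment in Gymnasium
Creating the CarRacing environment with gymnasium.make()
CarRacing is the simplest control task to learn from pixels: a top-down racing environment.
Running the module directly, for example from PowerShell, lets you drive the car yourself with the keyboard: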
python gymnasium/envs/box2d/car_racing.py
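Some indicators are shown at the bottom of the window along with the state RGB buffer. From left to right: true speed, four ABS sensors, steering wheel position, and gyroscope.
Remember: it's a powerful rear-wheel drive car, so don't press the accelerator and turn at the same time.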
Action Space
If continuous, there are 3 actions:
- 0: steering, -1 is full left, +1 is full right
- 1: gas
- 2: braking
If discrete, there are 5 actions:
- 0: do nothing
- 1: steer left
- 2: steer right
- 3: gas
- 4: brake
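
The two action formats above can be sketched in plain Python. This is only an illustration: the helper names are hypothetical, and the [0, 1] range for gas and braking is an assumption (the text above only states the steering range). It is not the environment's own code.

```python
# Hypothetical helpers illustrating the two action formats described above.
# Not part of Gymnasium; the [0, 1] range for gas and braking is an assumption.

DISCRETE_ACTIONS = {
    0: "do nothing",
    1: "steer left",
    2: "steer right",
    3: "gas",
    4: "brake",
}


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def make_continuous_action(steering, gas, brake):
    """Build a 3-element continuous action: [steering, gas, braking]."""
    return [
        clamp(float(steering), -1.0, 1.0),  # -1 is full left, +1 is full right
        clamp(float(gas), 0.0, 1.0),
        clamp(float(brake), 0.0, 1.0),
    ]


def describe_discrete_action(action):
    """Return the meaning of a discrete action index (0 to 4)."""
    if action not in DISCRETE_ACTIONS:
        raise ValueError(f"discrete action must be in 0..4, got {action}")
    return DISCRETE_ACTIONS[action]
```

For example, `make_continuous_action(0.3, 1.0, 0.0)` would represent a gentle right turn at full throttle, while `describe_discrete_action(4)` looks up the braking action.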
## Observation Space
A top-down 96x96 RGB image of the car and race track.
Rewards
The reward is -0.1 every frame and +1000/N for every track tile visited, where N is the total number of tiles
visited in the track. For example, if you have finished in 732 frames, your reward is 1000 - 0.1*732 = 926.8 points.
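
The reward rule can be checked with a few lines of arithmetic. The function below is a minimal sketch, not the environment's implementation; its name, its parameters, and the track length of 300 tiles are made up for illustration.

```python
def episode_reward(frames, tiles_visited, total_tiles):
    """Total episode reward: +1000/N per visited tile, -0.1 per frame.

    Hypothetical helper; N is total_tiles.
    """
    per_tile = 1000.0 / total_tiles
    return per_tile * tiles_visited - 0.1 * frames


# Assumed track of 300 tiles, all visited in 732 frames.
full_lap = episode_reward(frames=732, tiles_visited=300, total_tiles=300)
print(round(full_lap, 1))  # 926.8, matching the worked example above
```

Visiting only part of the track scales the positive term proportionally, while the per-frame penalty keeps accumulating regardless of progress.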
Starting State
The car starts at rest in the center of the road.
Episode Termination
The episode finishes when all the tiles are visited. The car can also go outside the playfield -
that is, far off the track, in which case it will receive -100 reward and die.
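
The two ways an episode can end can be summarized as a small decision function. This is a sketch under stated assumptions: the function and argument names are hypothetical, and how the real environment detects leaving the playfield is not shown here.

```python
def step_outcome(tiles_visited, total_tiles, off_playfield):
    """Return (extra_reward, terminated) for the current step.

    Hypothetical helper, not Gymnasium's implementation.
    """
    if off_playfield:
        # Far off the track: -100 reward and the episode ends.
        return -100.0, True
    if tiles_visited >= total_tiles:
        # Every tile has been visited: the episode finishes.
        return 0.0, True
    # Otherwise the episode continues with no extra reward.
    return 0.0, False
```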
Arguments
```python
>>> import gymnasium as gym
>>> env = gym.make("CarRacing-v2", render_mode="rgb_array", lap_complete_percent=0.95, domain_randomize=False, continuous=False)
>>> env
<TimeLimit<OrderEnforcing<PassiveEnvChecker<CarRacing<CarRacing-v2>>>>>
```
- lap_complete_percent=0.95: dictates the percentage of tiles that must be visited by the agent before a lap is considered complete. With the default of 0.95, the agent must pass over 95% of the track tiles.
- domain_randomize=False: off by default. Setting it to True enables the domain randomized variant of the environment, in which the background and track colours are different on every reset. This can be used to improve generalization.
- continuous=True: the default, which uses the continuous action space. Setting it to False converts the environment to use the discrete action space, which has 5 actions: [do nothing, left, right, gas, brake].
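
To make the meaning of lap_complete_percent concrete, here is a minimal sketch of the threshold check. The function name and the tile counts are hypothetical, and the inclusive comparison is an assumption; the environment's own check may differ in detail.

```python
def lap_is_complete(tiles_visited, total_tiles, lap_complete_percent=0.95):
    """Return True once the visited fraction reaches the threshold.

    Hypothetical helper; the inclusive comparison (>=) is an assumption.
    """
    if total_tiles <= 0:
        raise ValueError("total_tiles must be positive")
    return tiles_visited / total_tiles >= lap_complete_percent


# Assumed track of 300 tiles.
print(lap_is_complete(290, 300))  # about 96.7% of tiles visited
print(lap_is_complete(200, 300))  # about 66.7% of tiles visited
```

Lowering lap_complete_percent makes a lap count as complete even when the agent cuts corners and misses more tiles.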