[gym] [Discrete Action Space] [Mountain Car]

资源存储库

Last modified 2024-08-25 12:39:46

Category: Notes; Tags: Notes

First published 2024-08-25 12:37:32

Article link: https://blog.csdn.net/wq6qeg88/article/details/141527761

Mountain Car

Description

Observation Space

Action Space

Transition Dynamics

Version History

Mountain Car

This environment is part of the Classic Control environments. Please read that page first for general information.


Action Space	Discrete(3)
Observation Shape	(2,)
Observation High	[0.6 0.07]
Observation Low	[-1.2 -0.07]
Import	`gym.make("MountainCar-v0")`
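
As a rough illustration of what the table specifies, the bounds can be mirrored in plain Python without installing gym. This is a sketch, not Gym's API: the names `OBS_LOW`, `OBS_HIGH`, `ACTIONS`, `in_observation_space` and `in_action_space` are hypothetical, and the ordering (position, velocity) and the action labels are assumed from Gym's MountainCar-v0 conventions rather than stated in the table.

```python
# Bounds copied from the table; the (position, velocity) ordering is an assumption.
OBS_LOW = (-1.2, -0.07)
OBS_HIGH = (0.6, 0.07)

# Discrete(3) means three integer actions; the labels are assumed Gym conventions.
ACTIONS = {0: "accelerate left", 1: "no acceleration", 2: "accelerate right"}


def in_observation_space(obs):
    """Return True if obs has shape (2,) and lies within the table's bounds."""
    return len(obs) == 2 and all(
        low <= value <= high for value, low, high in zip(obs, OBS_LOW, OBS_HIGH)
    )


def in_action_space(action):
    """Return True if action is one of the three discrete actions."""
    return action in ACTIONS


if __name__ == "__main__":
    print(in_observation_space((-0.5, 0.0)))  # a point inside both bounds
    print(in_action_space(3))                 # outside Discrete(3)
```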

Description

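The Mountain Car MDP is a deterministic MDP that consists of a car placed stochastically at the bottom of a sinusoidal valley, with the only possible actions being the accelerations that can be applied to the car in either direction. The goal of the MDP is to strategically accelerate the car to reach the goal state on top of the right hill. There are two versions of the mountain car domain in gym: one with discrete actions and one with continuous actions.

Collisions at either end are inelastic, with the velocity set to 0 upon collision with the wall. Given an action, the velocity is updated as:

$$\text{velocity}_{t+1} = \text{velocity}_t + (\text{action} - 1) \cdot \text{force} - \cos(3 \cdot \text{position}_t) \cdot \text{gravity}$$

The goal is to reach the flag placed on top of the right hill as quickly as possible, so the agent receives a reward of -1 at every time step. The episode terminates when the car's position is greater than or equal to 0.5 (the goal position on top of the right hill). The car's starting position is assigned a uniform random value (the interval is not given in this excerpt).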
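
The velocity update, the inelastic walls, the -1 reward and the termination rule can be sketched in plain Python without installing gym. This is a minimal sketch under stated assumptions, not Gym's implementation: the constants `FORCE = 0.001` and `GRAVITY = 0.0025`, the position update `position + velocity`, and the speed clipping are borrowed from Gym's MountainCar-v0 rather than from this text, while the bounds come from the table at the top of the page. `momentum_policy` and `rollout` are hypothetical helpers added only to exercise the dynamics.

```python
import math

FORCE = 0.001      # assumed value, as in Gym's MountainCar-v0
GRAVITY = 0.0025   # assumed value, as in Gym's MountainCar-v0
MIN_POSITION, MAX_POSITION = -1.2, 0.6  # position bounds from the table
MAX_SPEED = 0.07                        # velocity bound from the table
GOAL_POSITION = 0.5                     # termination threshold from the text


def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def step(position, velocity, action):
    """Apply one transition; action is 0 (left), 1 (none) or 2 (right)."""
    velocity += (action - 1) * FORCE - math.cos(3 * position) * GRAVITY
    velocity = clip(velocity, -MAX_SPEED, MAX_SPEED)
    # Assumed position update: the new velocity is added to the position.
    position = clip(position + velocity, MIN_POSITION, MAX_POSITION)
    # Inelastic collision: hitting a wall while moving into it zeroes the velocity.
    if (position == MIN_POSITION and velocity < 0) or (
        position == MAX_POSITION and velocity > 0
    ):
        velocity = 0.0
    terminated = position >= GOAL_POSITION
    return position, velocity, -1.0, terminated


def momentum_policy(velocity):
    """Hypothetical heuristic: always push in the direction of motion."""
    return 2 if velocity >= 0 else 0


def rollout(start_position=-0.5, max_steps=200):
    """Run one episode from rest; return (steps, total_reward, reached_goal)."""
    position, velocity, total_reward = start_position, 0.0, 0.0
    for t in range(1, max_steps + 1):
        action = momentum_policy(velocity)
        position, velocity, reward, terminated = step(position, velocity, action)
        total_reward += reward
        if terminated:
            return t, total_reward, True
    return max_steps, total_reward, False


if __name__ == "__main__":
    steps, total_reward, reached = rollout()
    print(f"steps={steps} return={total_reward} reached_goal={reached}")
```

Because every step costs -1, the return of an episode is simply the negative of its length, which is why a policy that builds momentum by swinging back and forth scores better than one that only pushes right.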