Profiles of a Few MuJoCo Locomotion Tasks
🖋 Peng Zhenghao
🗓 2020.04.18
Notations:
FR = forward reward = forward_reward_weight * x_velocity
HR = healthy reward = healthy_reward_weight * is_healthy [bool]
CC = control cost = ctrl_cost_weight * sum(square(action))
TC = contact cost = contact_cost_weight * sum(square(contact_force))
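The four terms above can be sketched in a few lines of NumPy. This is only an illustration of the notation, not the environments' actual code: the function name `reward_terms` and the default weight values are hypothetical placeholders, and each environment uses its own weights (and may skip some terms).

```python
import numpy as np


def reward_terms(x_velocity, is_healthy, action, contact_force,
                 forward_reward_weight=1.0, healthy_reward_weight=1.0,
                 ctrl_cost_weight=0.5, contact_cost_weight=5e-4):
    """Compute FR, HR, CC and TC as defined in the notation list.

    The default weights are placeholders chosen for illustration only.
    """
    action = np.asarray(action, dtype=float)
    contact_force = np.asarray(contact_force, dtype=float)
    return {
        # FR: forward reward, proportional to velocity along x
        "FR": forward_reward_weight * x_velocity,
        # HR: healthy reward, paid only while the agent is healthy
        "HR": healthy_reward_weight * float(is_healthy),
        # CC: control cost, penalizes large actions
        "CC": ctrl_cost_weight * np.sum(np.square(action)),
        # TC: contact cost, penalizes large contact forces
        "TC": contact_cost_weight * np.sum(np.square(contact_force)),
    }


terms = reward_terms(x_velocity=2.0, is_healthy=True,
                     action=[0.5, -0.5], contact_force=[1.0, 2.0])
```

Returning the terms separately, rather than a single scalar, makes it easier to see which term dominates in a given environment.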
A general formulation of the reward in these tasks is as follows (though some environments do not compute every term):
reward = (
forward_reward_weight * x_velocity
+ healthy_reward_weight * is_healthy