
# Imitation Learning algorithms and Co-training for Mobile ALOHA


#### Project Website: https://mobile-aloha.github.io/

This repo contains the implementation of ACT, Diffusion Policy and VINN, together with 2 simulated environments: Transfer Cube and Bimanual Insertion. You can train and evaluate them in sim or real. For real, you would also need to install [Mobile ALOHA](https://github.com/MarkFzp/mobile-aloha). This repo is forked from the [ACT repo](https://github.com/tonyzhaozh/act).

### Updates:

You can find all scripted/human demos for the simulated environments [here](https://drive.google.com/drive/folders/1gPR03v05S1xiInoVJn7G7VJ9pDCnxq9O?usp=share_link).

### Repo Structure

- ``imitate_episodes.py`` Train and evaluate ACT
- ``policy.py`` An adaptor for ACT policy
- ``detr`` Model definitions of ACT, modified from DETR
- ``sim_env.py`` Mujoco + DM_Control environments with joint space control
- ``ee_sim_env.py`` Mujoco + DM_Control environments with EE (end-effector) space control
- ``scripted_policy.py`` Scripted policies for sim environments
- ``constants.py`` Constants shared across files
- ``utils.py`` Utils such as data loading and helper functions
- ``visualize_episodes.py`` Save videos from a .hdf5 dataset (see the episode-loading sketch after this list)
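Episodes are stored as ``.hdf5`` files. As a rough orientation, here is a minimal sketch of inspecting one episode with ``h5py``. The key layout below (``/observations/qpos``, ``/observations/images/<camera>``, ``/action``) is an assumption based on the common ALOHA episode format; check ``utils.py`` for the authoritative reader.

```
# Minimal sketch: inspect one recorded episode file.
# NOTE: the dataset keys below are assumptions based on the common ALOHA
# layout; the authoritative loading code lives in utils.py.
import h5py

with h5py.File('episode_0.hdf5', 'r') as f:
    qpos = f['/observations/qpos'][()]         # (T, joint_dim) joint positions
    actions = f['/action'][()]                 # (T, joint_dim) commanded actions
    cameras = list(f['/observations/images'])  # available camera names
    print(qpos.shape, actions.shape, cameras)
```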

### Installation

    conda create -n aloha python=3.8.10

    conda activate aloha

    pip install torchvision

    pip install torch

    pip install pyquaternion

    pip install pyyaml

    pip install rospkg

    pip install pexpect

    pip install mujoco==2.3.7

    pip install dm_control==1.0.14

    pip install opencv-python

    pip install matplotlib

    pip install einops

    pip install packaging

    pip install h5py

    pip install ipython

    cd act/detr && pip install -e .

- For Diffusion Policy, you also need to install [robomimic](https://github.com/ARISE-Initiative/robomimic/tree/r2d2) (note: the r2d2 branch) with ``pip install -e .``, for example as shown below.
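One way to do this (the clone command is illustrative; any checkout of the r2d2 branch works):

    git clone -b r2d2 https://github.com/ARISE-Initiative/robomimic.git
    cd robomimic && pip install -e .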

### Example Usages

To set up a new terminal, run:


    conda activate aloha

    cd <path to act repo>

### Simulated experiments (LEGACY table-top ALOHA environments)


We use the ``sim_transfer_cube_scripted`` task in the examples below. Another option is ``sim_insertion_scripted``.

To generate 50 episodes of scripted data, run:

    python3 record_sim_episodes.py --task_name sim_transfer_cube_scripted --dataset_dir <data save dir> --num_episodes 50

You can add the flag ``--onscreen_render`` to see real-time rendering.

To visualize the simulated episodes after they are collected, run:

    python3 visualize_episodes.py --dataset_dir <data save dir> --episode_idx 0

Note: to visualize data from the mobile-aloha hardware, use the ``visualize_episodes.py`` from https://github.com/MarkFzp/mobile-aloha.

To train ACT:

```

# Transfer Cube task

python3 imitate_episodes.py --task_name sim_transfer_cube_scripted --ckpt_dir <ckpt dir> --policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 --num_epochs 2000  --lr 1e-5 --seed 0

```

To evaluate the policy, run the same command but add ``--eval``. This loads the best validation checkpoint.

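For example, for the Transfer Cube task (the training command from above with ``--eval`` appended):

```
# Evaluate the Transfer Cube policy: same arguments as training, plus --eval
python3 imitate_episodes.py --task_name sim_transfer_cube_scripted --ckpt_dir <ckpt dir> --policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 --num_epochs 2000 --lr 1e-5 --seed 0 --eval
```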

The success rate should be around 90% for transfer cube, and around 50% for insertion.


To enable temporal ensembling, add flag ``--temporal_agg``.

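For intuition, here is a minimal sketch of the idea behind temporal ensembling, assuming the exponential weighting scheme described in the ACT paper. Names and values (``chunk_size``, ``m``, the buffer layout, ``policy``) are illustrative, not this repo's API.

```
# Minimal sketch of temporal ensembling (illustrative, not the repo's API).
# At each step the policy emits a chunk of the next `chunk_size` actions;
# all chunks that cover the current step are averaged with weights
# exp(-m * i), where i = 0 is the oldest prediction (weighted highest).
import numpy as np

chunk_size, action_dim, m = 100, 14, 0.01
max_steps = 400

# buffer[t, t2] = action for timestep t2 predicted by the chunk emitted at t
buffer = np.zeros((max_steps, max_steps + chunk_size, action_dim))

def ensembled_action(t, new_chunk):
    """Record the newest chunk, then blend all predictions for step t."""
    buffer[t, t:t + chunk_size] = new_chunk
    preds = buffer[:t + 1, t]                     # every prediction for step t
    preds = preds[np.abs(preds).sum(axis=1) > 0]  # drop empty slots
    w = np.exp(-m * np.arange(len(preds)))        # oldest first, highest weight
    return (w[:, None] * preds).sum(axis=0) / w.sum()

# Usage at control time (policy() is a hypothetical chunk-predicting call):
#   action = ensembled_action(t, policy(obs))
```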

Videos will be saved to ``<ckpt_dir>`` for each rollout.


You can also add ``--onscreen_render`` to see real-time rendering during evaluation.


For real-world data, where things can be harder to model, train for at least 5000 epochs, or 3-4 times the length after the loss has plateaued.

Please refer to [tuning tips](https://docs.google.com/document/d/1FVIZfoALXg_ZkYKaYVh-qOlaXveq5CtvJHXkY25eYhs/edit?usp=sharing) for more info.


### [ACT tuning tips](https://docs.google.com/document/d/1FVIZfoALXg_ZkYKaYVh-qOlaXveq5CtvJHXkY25eYhs/edit?usp=sharing)

TL;DR: if your ACT policy is jerky or pauses in the middle of an episode, just train for longer! Success rate and smoothness can improve way after loss plateaus.

