sim2real in Robotic Manipulation

This article contains details on sim2real in robotic manipulation for the following tasks:

  • Perception for manipulation (DOPE / SD-MaskRCNN).
  • Grasping (Dex-Net 3.0 / 6DOF GraspNet).
  • End-to-end policies (contact-rich manipulation tasks and in-hand manipulation of a Rubik’s Cube).
  • Guided domain randomization techniques (ADR / Sim-Opt).

The reality gap:

Over the years, DeepRL algorithms have mastered increasingly impressive skills in simulation (DQN / AlphaGo / OpenAI Five). Both deep learning and RL algorithms require huge amounts of data. Moreover, with RL algorithms there is risk to the environment or to the robot during the exploration phase. Simulation offers the promise of huge amounts of data (it can be run in parallel and much faster than real time, at minimal cost) and doesn’t break your robot during exploration. But policies trained entirely in simulation fail to generalize to the real robot. This gap between impressive performance in simulation and poor performance in reality is known as the reality gap.

Some of the ways to bridge the reality gap are:

[Figure: Ways to bridge the reality gap. Source: Lil’Log [1]]
  • System Identification: Identify the exact physical / geometrical / visual parameters of the environment relevant to the task and model them in simulation.

  • Domain Adaptation: Transfer-learning techniques for transferring / fine-tuning policies trained in simulation to reality.

  • Domain Randomization: Randomize the simulations so that reality is covered as one of the variations.

We’ll mainly be focusing on domain randomization techniques and their extensions, as used in some of the recent successful sim2real transfers in robotic manipulation.

Domain Randomization

Formally, domain randomization is defined as:

$$\theta^{*} = \arg\max_{\theta} \; \mathbb{E}_{\mu \sim P(\mu)} \Big[ \mathbb{E}_{\tau \sim (\pi_{\theta},\, P_{\mu})} \big[ R(\tau) \big] \Big]$$

P_{mu} is the randomized transition distribution. τ is the trajectory of samples under policy π in the environment P_{mu}.

So effectively, domain randomization tries to find common parameters θ for a policy π that work across a wide range of randomized simulations P_{mu}. The hope is that a policy that works across a wide range of randomizations also works in the real world, under the assumption that the real world is just another variation covered by the randomization distribution.
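
To make this concrete, below is a minimal, self-contained toy sketch of the idea (illustrative only, not taken from any of the papers discussed): a single proportional gain θ is trained to drive a point mass to the origin while the mass μ is randomized, standing in for the randomized transition distribution P_{mu}.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout_return(theta, mu, steps=20, dt=0.1):
    """Return R(tau) for a trajectory under policy pi_theta in environment P_mu."""
    x, v, ret = 1.0, 0.0, 0.0
    for _ in range(steps):
        u = -theta * x            # deterministic policy pi_theta(x)
        v += (u / mu) * dt        # randomized dynamics: acceleration = force / mass
        x += v * dt
        ret -= x ** 2             # reward: stay near the origin
    return ret

def dr_objective(theta, n_envs=256):
    """Monte-Carlo estimate of E_{mu ~ P(mu)} E_{tau ~ pi_theta, P_mu} [R(tau)]."""
    mus = rng.uniform(0.5, 2.0, size=n_envs)   # hand-chosen randomization range
    return float(np.mean([rollout_return(theta, mu) for mu in mus]))

# Hill-climb theta on the randomized objective: the single resulting policy
# has to work across the whole range of masses, not just one nominal simulator.
theta = 0.0
for _ in range(100):
    candidate = theta + rng.normal(0.0, 0.2)
    if dr_objective(candidate) > dr_objective(theta):
        theta = candidate
print("gain trained across randomized dynamics:", round(theta, 3))
```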

Based on how these simulation randomizations are chosen, there are two types:

  • Domain randomization: Fixed randomization distributions over ranges, often chosen by hand. We will see how this has been used in perception and grasping tasks for data efficiency.

  • Guided domain randomization: Either simulation or real-world experiments are used to change the randomization distribution; a minimal sketch of the idea follows this list. We will see how this has been used in training end2end policies for contact-rich and dexterous tasks. Some guided domain randomizations do resemble domain adaptation.
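
As a preview of how the guided variant works, here is a minimal sketch in the spirit of ADR: a randomization bound is widened whenever the current policy succeeds at its boundary, so the curriculum grows automatically. The threshold, step size, and evaluator below are hypothetical placeholders.

```python
def adr_update(evaluate_at, low, high, step=0.05, success_threshold=0.8):
    """Expand the randomization range [low, high] for one parameter when the
    policy's success rate at the boundary values exceeds the threshold."""
    if evaluate_at(high) >= success_threshold:
        high += step              # curriculum: harder (wider) randomizations
    if evaluate_at(low) >= success_threshold:
        low -= step
    return low, high

# Example with a stand-in evaluator: pretend the current policy handles
# parameter values in [0.7, 1.4]; repeated updates grow the sampling range
# toward exactly those limits and then stop.
fake_success_rate = lambda value: 1.0 if 0.7 <= value <= 1.4 else 0.0

low, high = 0.9, 1.1              # initial randomization range
for _ in range(10):
    low, high = adr_update(fake_success_rate, low, high)
print("expanded randomization range:", (round(low, 2), round(high, 2)))
```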

Domain randomization:

Some early examples of domain randomization were object localization for primitive shapes [2] and table-top pushing [3]. We will look at examples of more advanced tasks, such as segmentation and pose estimation, with emphasis on which randomizations were chosen and how well the transfer performs.

Domain Randomization in Perception:

SD Mask R-CNN: SD (Synthetic Data) Mask R-CNN trains category-agnostic instance segmentation entirely on a synthetic dataset, with performance superior to a Mask R-CNN fine-tuned from the COCO dataset.

[Figure: Data generation procedure for SD-Mask-RCNN. WISDOM (Warehouse Instance Segmentation Dataset for Object Manipulation).]

Simulator: pybullet

Randomizations: Since this network uses depth images as inputs, the randomizations needed are quite minimal (depth-realistic images are easy to generate compared to photo-realistic ones). The procedure is listed below, followed by a sketch of the generation loop.

  • Sample a number of objects n ∼ Poisson(λ = 5) and drop them into the bin using dynamic simulation. This samples different objects and different object poses.
  • Sample camera intrinsics K and camera extrinsics (R, t) ∈ SE(3) within a neighborhood of the real camera intrinsics and extrinsics setup.
  • Render both the depth image D and the foreground object masks M.
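
A minimal sketch of this generation loop in pybullet follows. The URDF file names and the exact sampling ranges are hypothetical placeholders; the actual WISDOM-sim pipeline differs in its details.

```python
import numpy as np
import pybullet as p

p.connect(p.DIRECT)                         # headless simulation
p.setGravity(0, 0, -9.8)
p.loadURDF("bin.urdf")                      # placeholder bin model

# 1. Sample n ~ Poisson(lambda = 5) objects and drop them into the bin.
n = max(1, np.random.poisson(lam=5))
for _ in range(n):
    urdf = np.random.choice(["obj_a.urdf", "obj_b.urdf"])   # placeholder meshes
    pos = [np.random.uniform(-0.1, 0.1), np.random.uniform(-0.1, 0.1), 0.5]
    orn = p.getQuaternionFromEuler(np.random.uniform(0, 2 * np.pi, 3).tolist())
    p.loadURDF(urdf, basePosition=pos, baseOrientation=orn)
for _ in range(500):                        # dynamic simulation: let objects settle
    p.stepSimulation()

# 2. Sample camera intrinsics / extrinsics near the real setup
#    (here the FOV perturbation stands in for intrinsics K).
fov = np.random.uniform(55.0, 65.0)         # perturbed vertical field of view (deg)
eye = (np.array([0.0, 0.0, 0.8]) + np.random.uniform(-0.02, 0.02, 3)).tolist()
view = p.computeViewMatrix(cameraEyePosition=eye,
                           cameraTargetPosition=[0, 0, 0],
                           cameraUpVector=[0, 1, 0])
proj = p.computeProjectionMatrixFOV(fov=fov, aspect=1.0, nearVal=0.01, farVal=2.0)

# 3. Render the depth image D and the per-object segmentation masks M.
_, _, _, depth, seg = p.getCameraImage(512, 512, viewMatrix=view,
                                       projectionMatrix=proj)
```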

The Mask R-CNN trained for instance segmentation entirely on synthetic data (SD-Mask R-CNN) is compared against a couple of baseline segmentation methods and against a Mask R-CNN pretrained on the COCO dataset and fine-tuned on WISDOM-real-train (FT Mask R-CNN). The test set used here, WISDOM-real-test, is a real-world dataset collected using high-res and low-res depth cameras, with hand-labelled segmentation masks.

[Figure: Performance of Mask R-CNN variants. For both AP (Average Precision) and AR (Average Recall), higher is better.]

From the ablation study, both metrics go up as the number of synthetic data samples is increased, indicating that more data could further improve performance.
