Paper Notes on Grasping: Learning 6-DOF Grasping Interaction via Deep Geometry-aware 3D Representations

Contributions:

  • (1) Learn a 6-DOF grasping network from RGBD input.
  • (2) Build a grasping dataset from demonstrations in virtual reality with rich sensory and interaction annotations, and propose a data augmentation strategy for effective learning.
  • (3) Demonstrate that the learned geometry-aware representation yields about a 10% relative performance improvement over the baseline CNN when grasping objects from the dataset.
  • (4) Demonstrate that the model generalizes to novel viewpoints and object instances.

1.Introduction

The approach has the following features:

  • (1) it performs 3D shape reconstruction as an auxiliary task;
  • (2) it hallucinates the local view using a learning-free physical projection operator;
  • (3) it explicitly reuses the learned geometry-aware representation for grasping outcome prediction.

Network:

  • a shape generation network

Learns to recognize and reconstruct the 3D geometry of the scene with an image encoder and a voxel decoder.

The image encoder transforms the RGBD input into a high-level geometry representation that captures the shape, location, and orientation of the object.

The voxel decoder takes in the geometry representation and outputs the occupancy grid of the object.

  • a grasping outcome prediction network

Predicts the grasping outcome (e.g., success or failure).

Database

101 everyday objects with around 150K grasping demonstrations in virtual reality, covering both human and augmented synthetic interactions.

Each object: 10-20 grasping attempts with a parallel-jaw gripper.

Each attempt records a pre-grasping status, which includes the location and orientation of the object and gripper, as well as the grasping outcome.
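To make the annotation layout concrete, here is a minimal sketch of what one record could look like. The class names, field names, and the quaternion encoding of orientation are all assumptions made for illustration; the notes only say that each record holds the object and gripper location and orientation plus the outcome.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # assumed (x, y, z, w) convention


@dataclass(frozen=True)
class Pose:
    position: Vec3     # location in the world frame
    orientation: Quat  # orientation as a quaternion (assumed representation)


@dataclass(frozen=True)
class GraspAttempt:
    object_id: str      # hypothetical identifier for one of the objects
    object_pose: Pose   # pre-grasp location and orientation of the object
    gripper_pose: Pose  # pre-grasp location and orientation of the gripper
    success: bool       # grasping outcome


def success_rate(attempts: Iterable[GraspAttempt]) -> float:
    """Fraction of successful attempts; 0.0 for an empty collection."""
    attempts = list(attempts)
    if not attempts:
        return 0.0
    return sum(a.success for a in attempts) / len(attempts)
```

Keeping the gripper pose as a full position plus orientation is what makes the grasp 6-DOF: three degrees of freedom for translation and three for rotation.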

2.Related Work

The authors’ approach features:

  • (1) providing a method to learn a 6D grasping network from RGBD input
  • (2) an end-to-end deep learning framework for generative 3D shape modeling and leveraging it for predictive 6D grasping interaction
  • (3) a learning-free projection layer that links the 2D observations with the 3D object shape, which allows learning the shape representation without explicit 3D volume supervision.

3.Multi-Objective Framework with Geometry-Aware Representation

A. Learning generative geometry-aware representation from RGBD input

Differences:

  • (1) it takes location and orientation into consideration
  • (2) it is invariant to camera viewpoint and distance

Input: an RGBD image \(I\).

Output: the corresponding 3D occupancy grid \(V\).

Functional mapping: \(f^V : I \to V\)

B. Depth supervision with in-network projection layer

Projection operation: \(f^D: V \times P \to D\)

It transforms a 3D shape into a 2D depth map using the camera transformation matrix \(P\).


The depth projection can be seen as:

  • (1) performing dense sampling from the input volume (in the 3D world frame) to the output volume (in normalized device coordinates)
  • (2) flattening the 3D spatial output across one dimension.
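The flattening step can be sketched in a few lines of NumPy. This is a simplified illustration, not the paper's layer: it assumes step (1) has already been done, so the grid is aligned with the camera and its last axis runs along the viewing ray. The function name `project_depth`, the threshold, and the `far` fill value are hypothetical.

```python
import numpy as np


def project_depth(occupancy, threshold=0.5, far=None):
    """Flatten a camera-aligned occupancy grid (H, W, D) into an (H, W) depth map.

    The depth of a pixel is the index of the first occupied voxel along the
    last axis. Rays that hit nothing receive `far` (default: D).
    """
    occ = np.asarray(occupancy) > threshold
    if far is None:
        far = occ.shape[-1]
    first_hit = occ.argmax(axis=-1)  # index of the first True, 0 if none
    hit_any = occ.any(axis=-1)       # distinguishes "hit at 0" from "no hit"
    return np.where(hit_any, first_hit, far).astype(np.float32)
```

Because it relies on a hard threshold and `argmax`, this version is not differentiable; it shows the geometry of the operation only, whereas a layer used for depth supervision inside the network has to let gradients flow back to the voxels.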

C. Viewpoint-invariant geometry-aware representation with multi-view supervision

  • (1) use the averaged identity units from multiple viewpoints as input to shape decoder network
  • (2) provide multiple projections for supervising the 3D shape reconstruction during training.

At test time, only RGBD input from a single viewpoint is provided.
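The averaging of identity units can be sketched as a plain mean over per-view codes. The function name `fuse_views` and the `(n_views, code_dim)` layout are assumptions for illustration; the notes do not specify how the codes are stored.

```python
import numpy as np


def fuse_views(view_codes):
    """Average per-view identity codes of shape (n_views, code_dim) into one code."""
    codes = np.asarray(view_codes, dtype=np.float32)
    if codes.ndim != 2 or codes.shape[0] == 0:
        raise ValueError("expected a non-empty (n_views, code_dim) array")
    return codes.mean(axis=0)
```

With a single view the mean returns that view's code unchanged, so the same decoder input path works for multi-view training and single-view testing.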

Given a series of n observations \(I_1, I_2, \cdots, I_n\) of the scene, the 3D reconstruction can be formulated as \(f^V: \{I_i\}_{i=1}^n \to V\)

The projection operator for the i-th viewpoint is \(f^D: V \times P_i \to D_i\), where \(D_i\) is the depth map and \(P_i\) is the camera transformation matrix.

Reconstruction loss \(L^{shape}\)
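The notes do not give the form of \(L^{shape}\), so the following is only one plausible instance: a per-pixel L1 penalty between projected and observed depth maps, averaged within each view and then across views with equal weight. Both the L1 choice and the equal weighting are assumptions, and `multiview_depth_loss` is a hypothetical name.

```python
import numpy as np


def multiview_depth_loss(predicted_depths, target_depths):
    """Mean absolute depth error over n views; inputs have shape (n, H, W)."""
    pred = np.asarray(predicted_depths, dtype=np.float64)
    tgt = np.asarray(target_depths, dtype=np.float64)
    if pred.shape != tgt.shape:
        raise ValueError("predicted and target depths must have the same shape")
    # Average the absolute error inside each view, then across the views.
    per_view = np.abs(pred - tgt).reshape(pred.shape[0], -1).mean(axis=1)
    return float(per_view.mean())
```

Each predicted map would come from projecting the same reconstructed volume \(V\) with a different \(P_i\), which is how several 2D views supervise a single 3D shape.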

D. Learning predictive grasping interaction with geometry-aware representation.

\(I\): input RGBD

\(a\): action

\(l\): outcome

Functional mapping: \(f_{baseline}^l: I \times a \to l\)

E. DGGN: Deep geometry-aware grasping network
