Paper Notes on Grasping: Learning 6-DOF Grasping Interaction via Deep Geometry-aware 3D Representations

eight_Jessen

Article link: https://blog.csdn.net/eight_Jessen/article/details/107945512

Contributions:

(1) learn a 6-DOF grasping network from RGBD input;
(2) build a grasping dataset from demonstrations in virtual reality with rich sensory and interaction annotations, and propose a data augmentation strategy for effective learning;
(3) demonstrate that the learned geometry-aware representation leads to about 10% relative performance improvement over the baseline CNN on grasping objects from the dataset;
(4) demonstrate that the model generalizes to novel viewpoints and object instances.

1. Introduction

The approach has the following features:

(1) it performs 3D shape reconstruction as an auxiliary task;
(2) it hallucinates the local view using a learning-free physical projection operator;
(3) it explicitly reuses the learned geometry-aware representation for grasping outcome prediction.

Network:

A shape generation network

learns to recognize and reconstruct the 3D geometry of the scene with an image encoder and a voxel decoder.

The image encoder transforms the RGBD input into a high-level geometry representation that captures the shape, location, and orientation of the object.

The voxel decoder takes in the geometry representation and outputs the occupancy grid of the object.

A grasping outcome prediction network

predicts the grasping outcome (e.g., success or failure).

Database

101 everyday objects with around 150K grasping demonstrations in virtual reality, covering both human and augmented synthetic interactions.

For each object: 10-20 grasping attempts with a parallel jaw gripper.

Each attempt records a pre-grasping status, which includes the location and orientation of the object and the gripper, as well as the grasping outcome.
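To make the per-attempt annotation concrete, here is a minimal sketch of how one record could be laid out. The class and field names (`Pose`, `GraspAttempt`, `object_id`) and the quaternion convention are assumptions for illustration; the notes only say that object pose, gripper pose, and outcome are stored.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # (w, x, y, z); assumed convention


@dataclass(frozen=True)
class Pose:
    """A 6-DOF pose: 3D position plus orientation."""
    position: Vec3
    orientation: Quat


@dataclass(frozen=True)
class GraspAttempt:
    """One recorded grasp attempt (hypothetical field names)."""
    object_id: str
    object_pose: Pose   # pre-grasp location and orientation of the object
    gripper_pose: Pose  # pre-grasp location and orientation of the gripper
    success: bool       # grasping outcome


def success_rate(attempts: List[GraspAttempt]) -> float:
    """Fraction of successful attempts; 0.0 for an empty list."""
    if not attempts:
        return 0.0
    return sum(a.success for a in attempts) / len(attempts)


# Two made-up attempts on one made-up object.
identity = (1.0, 0.0, 0.0, 0.0)
attempts = [
    GraspAttempt("mug_01", Pose((0.0, 0.0, 0.0), identity),
                 Pose((0.0, 0.0, 0.2), identity), True),
    GraspAttempt("mug_01", Pose((0.0, 0.0, 0.0), identity),
                 Pose((0.1, 0.0, 0.2), identity), False),
]
print(success_rate(attempts))  # 0.5
```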

2. Related Work

The authors’ approach features:

(1) a method to learn a 6-DOF grasping network from RGBD input;
(2) an end-to-end deep learning framework for generative 3D shape modeling, which is then leveraged for predictive 6-DOF grasping interaction;
(3) a learning-free projection layer that links the 2D observations with the 3D object shape, which allows learning the shape representation without explicit 3D volume supervision.

3. Multi-objective Framework with Geometry-aware Representation

A. Learning generative geometry-aware representation from RGBD input

Differences:

(1) it takes location and orientation into consideration;
(2) it is invariant to camera viewpoint and distance.

Input: an RGBD image \(I\);

Output: the corresponding 3D occupancy grid \(V\);

Functional mapping: \(f^V: I \to V\)

B. Depth supervision with in-network projection layer

Projection operation: \(f^D: V \times P \to D\)

transforms a 3D shape into a 2D depth map with the camera transformation matrix P

The depth projection can be seen as:

(1) performing dense sampling from the input volume (in the 3D world frame) to the output volume (in normalized device coordinates);
(2) flattening the 3D spatial output across one dimension.
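The flattening step (2) can be illustrated with a small NumPy sketch. It assumes step (1) has already been done, so the grid is aligned with the camera and axis 0 is the viewing direction. The function name `flatten_to_depth` and the hard threshold are assumptions for intuition only; an in-network layer would need a differentiable variant.

```python
import numpy as np


def flatten_to_depth(occupancy, threshold=0.5, background=None):
    """Flatten a camera-aligned occupancy grid (D, H, W) into an (H, W) depth map.

    Axis 0 is assumed to be the viewing direction (index 0 is nearest the camera).
    Each pixel gets the index of the first occupied voxel along its ray,
    or `background` (default: D) if the ray hits nothing.
    """
    occ = np.asarray(occupancy) > threshold
    if occ.ndim != 3:
        raise ValueError("expected a grid of shape (D, H, W)")
    if background is None:
        background = occ.shape[0]
    first_hit = occ.argmax(axis=0)  # first True along each ray (0 if none)
    hit_any = occ.any(axis=0)       # rays that hit at least one voxel
    return np.where(hit_any, first_hit, background)


grid = np.zeros((4, 2, 2))
grid[2, 0, 0] = 1.0  # pixel (0, 0): surface at depth index 2
grid[1, 1, 1] = 1.0  # pixel (1, 1): surface at depth index 1
grid[3, 1, 1] = 1.0  # hidden behind the voxel above
depth = flatten_to_depth(grid)
print(depth.tolist())  # [[2, 4], [4, 1]]
```

Empty rays fall back to the far plane (index `D`), which is one possible background convention among several.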

C. Viewpoint-invariant geometry-aware representation with multi-view supervision

(1) use the averaged identity units from multiple viewpoints as input to the shape decoder network;
(2) provide multiple projections for supervising the 3D shape reconstruction during training.

At test time, only an RGBD input from a single viewpoint is provided.
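A minimal sketch of the averaging in (1), assuming each viewpoint's encoder output is a fixed-length vector. The function name `average_identity_units` and the example numbers are made up; with a single view, as at test time, the average reduces to that view's own code.

```python
import numpy as np


def average_identity_units(per_view_codes):
    """Average per-view identity units of shape (n_views, code_dim) into one code."""
    codes = np.asarray(per_view_codes, dtype=float)
    if codes.ndim != 2 or codes.shape[0] == 0:
        raise ValueError("expected a non-empty array of shape (n_views, code_dim)")
    return codes.mean(axis=0)


# Training: several viewpoints of the same scene (made-up numbers).
train_codes = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
print(average_identity_units(train_codes).tolist())  # [2.0, 2.0]

# Testing: a single viewpoint, so the average is the code itself.
test_code = [[0.5, -1.0]]
print(average_identity_units(test_code).tolist())  # [0.5, -1.0]
```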

Given a series of \(n\) observations \(I_1, I_2, \cdots, I_n\) of the scene, the 3D reconstruction can be formulated as \(f^V: \{I_i\}_{i=1}^n \to V\).

The projection operator from the \(i\)-th viewpoint is \(f^D: V \times P_i \to D_i\), where \(D_i\) is the depth map and \(P_i\) the camera transformation matrix of that viewpoint.

Reconstruction loss \(L^{shape}\)
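The notes only name the loss, so the sketch below is one plausible form rather than the paper's definition: it assumes the loss compares each projected depth map \(f^D(V, P_i)\) with the observed \(D_i\) using a mean absolute error, averaged over views. The function name and the choice of L1 are assumptions.

```python
import numpy as np


def multiview_depth_loss(predicted_depths, observed_depths):
    """Mean absolute depth error, averaged over pixels and then over views.

    Both inputs have shape (n_views, H, W): predicted_depths[i] stands for the
    projection of the reconstructed volume into view i, observed_depths[i] for D_i.
    """
    pred = np.asarray(predicted_depths, dtype=float)
    obs = np.asarray(observed_depths, dtype=float)
    if pred.shape != obs.shape or pred.ndim != 3:
        raise ValueError("expected two arrays of identical shape (n_views, H, W)")
    per_view = np.abs(pred - obs).reshape(pred.shape[0], -1).mean(axis=1)
    return float(per_view.mean())


pred = np.zeros((2, 2, 2))
obs = np.zeros((2, 2, 2))
obs[0] = 1.0  # view 0 is off by 1 everywhere, view 1 matches exactly
loss = multiview_depth_loss(pred, obs)
print(loss)  # 0.5
```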

D. Learning predictive grasping interaction with geometry-aware representation.

\(I\): input RGBD image

\(a\): action

\(l\): grasping outcome

Functional mapping of the baseline: \(f_{baseline}^l: I \times a \to l\)
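The interface of this mapping can be sketched as a function that takes image features and an action and returns a success probability. The baseline in the paper is a CNN; the linear logistic scorer below is only a stand-in to show the signature, and the 6-number action encoding, feature size, and weights are all hypothetical.

```python
import numpy as np


def baseline_outcome_probability(image_features, action, weights, bias=0.0):
    """Toy stand-in for f_baseline: (I, a) -> probability that the grasp succeeds.

    image_features: any array derived from the RGBD input (flattened here).
    action: grasp parameters, e.g. 3 position + 3 rotation values (assumed encoding).
    weights: one weight per concatenated input value.
    """
    x = np.concatenate([np.ravel(image_features), np.ravel(action)]).astype(float)
    w = np.asarray(weights, dtype=float)
    if w.shape != x.shape:
        raise ValueError("weights must match the concatenated input length")
    logit = float(x @ w) + bias
    return 1.0 / (1.0 + np.exp(-logit))


features = np.zeros(4)  # placeholder image features
action = np.zeros(6)    # placeholder 6-DOF action
p = baseline_outcome_probability(features, action, np.zeros(10))
print(p)  # 0.5, since the logit is exactly zero
```

The key point of the interface is that the outcome depends jointly on the observation and the action, so the same image can score differently for different candidate grasps.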

E. DGGN: Deep geometry-aware grasping network
