Overview
Accepted: RAL/ICRA 2022
Project page
Article interpretation reference: https://blog.csdn.net/passer__jw767/article/details/137012261
Paper Analysis
Motivation
Starting from the idea of self-guided exploration, the goal of the manipulation policy π is defined as generating a sequence of actions that interact with a random articulated object so as to reach novel states that have not been visited before. This lets the system learn through a self-guided exploration process, without explicit human demonstrations [23], a scripted policy [27], or pre-defined goal conditions [28].
From single-step action prediction to variable-length trajectory prediction (6-DoF).
The paper introduces the concept of the Arrow of Time (AoT): an AoT label indicates whether an action will move the object state back toward the past or forward into the future.
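To make the AoT idea concrete, here is a toy illustration (not the paper's implementation): an action is labeled by whether it moves the object state away from the initial state ("forward", toward novelty) or back toward it ("backward"). The 1-D joint-angle state, the distance function, and the threshold are all illustrative assumptions.

```python
def aot_label(initial_state: float, before: float, after: float,
              eps: float = 1e-6) -> str:
    """Label an action by how it changes the distance from the initial state."""
    d_before = abs(before - initial_state)
    d_after = abs(after - initial_state)
    if d_after > d_before + eps:
        return "forward"   # moves into the future: a more novel state
    elif d_after < d_before - eps:
        return "backward"  # moves back toward an already-visited state
    return "neutral"       # no meaningful state change

# A door hinge opened from 0.0 to 0.3 rad; one action pushes it on to
# 0.5 rad (forward), another closes it back to 0.1 rad (backward).
print(aot_label(0.0, 0.3, 0.5))  # forward
print(aot_label(0.0, 0.3, 0.1))  # backward
```

In the paper this signal is learned by a network from observations rather than computed from ground-truth joint angles; the sketch only shows what the label means.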
Method Architecture
To explore novel object states, the system must be able to (a) choose the right position on the object to interact with, (b) select a proper action direction, and (c) consistently select actions in subsequent steps to keep reaching novel states. These requirements map onto the three components of the architecture: action position selection (a), action distance prediction (b), and Arrow-of-Time inference (c) for action direction selection.
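The composition of the three components above can be sketched as follows. All names (`select_action`, `position_score`, `aot_score`, `distance_net`) are hypothetical placeholders, not the paper's code: in the real system each scorer is a learned network, and candidates come from the observed point cloud.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Action:
    position: Vec3   # contact point on the object        -> component (a)
    direction: Vec3  # unit action direction              -> component (c)
    distance: float  # movement magnitude along direction -> component (b)

def select_action(
    candidate_positions: List[Vec3],
    candidate_directions: List[Vec3],
    position_score: Callable[[Vec3], float],       # (a) higher = better contact
    aot_score: Callable[[Vec3, Vec3], float],      # (c) > 0 means "forward" AoT
    distance_net: Callable[[Vec3, Vec3], float],   # (b) predicted magnitude
) -> Action:
    # (a) pick the most promising interaction position
    pos = max(candidate_positions, key=position_score)
    # (c) keep directions the AoT module predicts will push the state forward
    forward = [d for d in candidate_directions if aot_score(pos, d) > 0]
    direction = max(forward or candidate_directions,
                    key=lambda d: aot_score(pos, d))
    # (b) predict how far to move along the chosen direction
    return Action(pos, direction, distance_net(pos, direction))
```

With stub scorers, e.g. `position_score = lambda p: p[0]` and `aot_score = lambda p, d: d[2]`, the function picks the highest-scoring position and the most "forward" direction, which mirrors how the three modules divide the decision.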
- Input: RGB-D images of the initial and current states, $o_0, o_t \in \mathbb{R}^{W \times H \times 4}$