Paper Reading [TPAMI-2022] SimVODIS: Simultaneous Visual Odometry, Object Detection, and Instance Segmentation

智尊宝人工智能社区

Tags: computer vision, deep learning, machine learning, artificial intelligence, CVPR

Article link: https://blog.csdn.net/weixin_42155685/article/details/123983589

Paper search (studyai.com)

Search for the paper: SimVODIS: Simultaneous Visual Odometry, Object Detection, and Instance Segmentation

http://www.studyai.com/search/whole-site/?q=SimVODIS:+Simultaneous+Visual+Odometry,+Object+Detection,+and+Instance+Segmentation

Keywords

Semantics; Task analysis; Training; Intelligent agents; Feature extraction; Object detection; Instruction sets; Visual odometry (VO); data-driven VO; visual SLAM; semantic VO; semantic SLAM; semantic mapping; monocular video; depth map prediction; depth estimation; ego-m

Machine vision

Detection and segmentation; pose estimation; visual SLAM; visual odometry; depth estimation; Mask-RCNN

Abstract

Intelligent agents need to understand the surrounding environment to provide meaningful services to or interact intelligently with humans.

The agents should perceive geometric features as well as semantic entities inherent in the environment.

Contemporary methods in general provide one type of information regarding the environment at a time, making it difficult to conduct high-level tasks.

Moreover, running two types of methods and associating two resultant information requires a lot of computation and complicates the software architecture.

To overcome these limitations, we propose a neural architecture that simultaneously performs both geometric and semantic tasks in a single thread: simultaneous visual odometry, object detection, and instance segmentation (SimVODIS).

SimVODIS is built on top of Mask-RCNN which is trained in a supervised manner.

Training the pose and depth branches of SimVODIS requires unlabeled video sequences and the photometric consistency between input image frames generates self-supervision signals.
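To make the idea of photometric consistency concrete, here is a minimal sketch of a generic view-synthesis loss in plain Python. It is not the loss used in the paper, which the abstract does not specify. The function names, the pinhole intrinsics tuple `(fx, fy, cx, cy)`, and the plain L1 error are all assumptions made for illustration. Each target pixel is back-projected with its predicted depth, moved by the predicted relative pose, projected into the source frame, and compared with the intensity sampled there. A good depth and pose estimate yields a small error, which is what supplies the training signal without labels.

```python
import math


def bilinear_sample(img, x, y):
    """Sample a grayscale image (list of rows) at a sub-pixel location.

    Returns None when (x, y) lies outside the image.
    """
    h, w = len(img), len(img[0])
    if not (0.0 <= x <= w - 1 and 0.0 <= y <= h - 1):
        return None
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom


def photometric_loss(target, source, depth, intrinsics, rotation, translation):
    """Mean absolute intensity error between the target frame and the
    source frame sampled at the reprojected pixel locations.

    intrinsics: hypothetical (fx, fy, cx, cy) pinhole parameters.
    rotation, translation: relative pose taking target-camera
    coordinates to source-camera coordinates.
    """
    fx, fy, cx, cy = intrinsics
    total, count = 0.0, 0
    for v, row in enumerate(target):
        for u, intensity in enumerate(row):
            d = depth[v][u]
            # Back-project the pixel to a 3D point in the target camera.
            point = [(u - cx) / fx * d, (v - cy) / fy * d, d]
            # Apply the relative pose (R * p + t).
            moved = [
                sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
                for i in range(3)
            ]
            if moved[2] <= 0:
                continue  # point ends up behind the source camera
            # Project into the source image plane.
            us = fx * moved[0] / moved[2] + cx
            vs = fy * moved[1] / moved[2] + cy
            sampled = bilinear_sample(source, us, vs)
            if sampled is None:
                continue  # reprojection falls outside the source frame
            total += abs(intensity - sampled)
            count += 1
    return total / count if count else 0.0
```

In a real training setup the depth map and pose would come from network branches and the loss would be computed with differentiable tensor operations; the loops above only show the geometry.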

The performance of SimVODIS outperforms or matches the state-of-the-art performance in pose estimation, depth map prediction, object detection, and instance segmentation tasks while completing all the tasks in a single thread.

We expect SimVODIS would enhance the autonomy of intelligent agents and let the agents provide effective services to humans.