Paper Reading [TPAMI-2022] SpherePHD: Applying CNNs on 360° Images With Non-Euclidean Spherical PolyHeDron Representation

Article link: https://blog.csdn.net/weixin_42155685/article/details/124055314

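This work proposes a new way to represent omni-directional images, using a spherical polyhedron to reduce the distortion introduced during sampling. Convolution is implemented by stacking neighboring pixels and multiplying them with trainable parameters, which allows conventional CNN architectures to be applied directly. Experiments show that the method outperforms existing approaches on monocular depth estimation, and the paper also introduces a technique that uses object detection networks to fit bounding ellipses of arbitrary orientation in omni-directional images.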



Paper search (studyai.com)

Search for the paper: SpherePHD: Applying CNNs on 360° Images With Non-Euclidean Spherical PolyHeDron Representation

Search link: http://www.studyai.com/search/whole-site/?q=SpherePHD:+Applying+CNNs+on+360 ${}^\circ$ ∘+Images+With+Non-Euclidean+Spherical+PolyHeDron+Representation

Keywords

Distortion; Kernel; Two dimensional displays; Cameras; Convolution; Task analysis; Image representation; Omni-directional cameras; 360 Degree; convolutional neural network; detection network; semantic segmentation; depth estimation; icosahedron; non-euclidean deep learning

Machine Vision

Detection and segmentation; depth estimation

Abstract

Omni-directional images are becoming more prevalent for understanding the scene of all directions around a camera, as they provide a much wider field-of-view (FoV) compared to conventional images.

In this work, we present a novel approach to represent omni-directional images and suggest how to apply CNNs on the proposed image representation.

The proposed image representation method utilizes a spherical polyhedron to reduce distortion introduced inevitably when sampling pixels on a non-Euclidean spherical surface around the camera center.
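To make the distortion argument concrete, here is a minimal sketch comparing how unevenly an equirectangular image samples the sphere. The abstract itself gives no formulas, so everything below is an illustration under stated assumptions: the function names are hypothetical, the equirectangular layout is used only as a familiar baseline, and the icosahedron with 4-way triangle subdivision is assumed from the "icosahedron" keyword rather than taken from the paper's text.

```python
import numpy as np

def equirect_pixel_solid_angles(height, width):
    """Solid angle (in steradians) of one pixel in each row of an
    equirectangular image that covers the whole sphere."""
    # Latitude of the row boundaries, from the north pole down to the south pole.
    lat_edges = np.linspace(np.pi / 2, -np.pi / 2, height + 1)
    dlon = 2 * np.pi / width
    # Area on the unit sphere of a patch bounded by two latitudes and dlon.
    return dlon * (np.sin(lat_edges[:-1]) - np.sin(lat_edges[1:]))

def icosahedron_face_count(subdivisions):
    """Number of triangles when every face of an icosahedron (20 faces)
    is split into 4 smaller triangles at each subdivision step (assumed scheme)."""
    return 20 * 4 ** subdivisions

omega = equirect_pixel_solid_angles(512, 1024)
ratio = omega.max() / omega.min()   # equator pixel vs. pole pixel
total = omega.sum() * 1024          # should recover the full sphere, 4*pi

print(f"largest / smallest pixel solid angle: {ratio:.1f}")
print(f"faces after 0..3 subdivisions: {[icosahedron_face_count(n) for n in range(4)]}")
```

In this baseline an equator pixel covers several hundred times more of the sphere than a pixel next to a pole, which is the kind of non-uniform sampling a polyhedron with near-equal faces is meant to reduce.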

To apply convolution operation on our representation of images, we stack the neighboring pixels on top of each pixel and multiply with trainable parameters.
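The stacking idea can be sketched as a gather followed by a matrix multiplication. This is only a minimal NumPy illustration, not the paper's implementation: the function name `stacked_neighbor_conv`, the neighbor table, and the toy ring adjacency (each pixel with its left and right neighbor) are assumptions made for the example, while the real representation would use the adjacency of the polyhedron's faces.

```python
import numpy as np

def stacked_neighbor_conv(features, neighbor_idx, weights, bias):
    """Convolution on an irregular pixel layout.

    features:     (N, C_in)       one feature vector per pixel
    neighbor_idx: (N, K)          indices of the K pixels stacked for each pixel
    weights:      (K * C_in, C_out) trainable parameters
    bias:         (C_out,)
    """
    n_pixels = neighbor_idx.shape[0]
    stacked = features[neighbor_idx]          # gather -> (N, K, C_in)
    stacked = stacked.reshape(n_pixels, -1)   # stack  -> (N, K * C_in)
    return stacked @ weights + bias           # multiply with the parameters

rng = np.random.default_rng(0)
n_pixels, c_in, c_out, k = 6, 2, 3, 3

features = rng.normal(size=(n_pixels, c_in))
idx = np.arange(n_pixels)
# Hypothetical toy adjacency: pixels on a ring, kernel = (left, self, right).
neighbor_idx = np.stack([(idx - 1) % n_pixels, idx, (idx + 1) % n_pixels], axis=1)
weights = rng.normal(size=(k * c_in, c_out))
bias = np.zeros(c_out)

out = stacked_neighbor_conv(features, neighbor_idx, weights, bias)
print(out.shape)
```

Because the irregular geometry lives entirely in `neighbor_idx`, the rest of the layer looks like an ordinary dense operation, which is consistent with the abstract's claim that conventional CNN architectures can be reused.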

This approach enables us to apply the same CNN architectures used in conventional Euclidean 2D images on our proposed method in a straightforward manner.

Compared to the previous work, we additionally compare different designs of kernels that can be applied to our proposed method.

We also show that our method outperforms in monocular depth estimation task compared to other state-of-the-art representation methods of omni-directional images.

In addition, we propose a novel method to fit bounding ellipses of arbitrary orientation using object detection networks and apply it to an omni-directional real-world human detection dataset…
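The abstract does not describe how the ellipses are parameterized or regressed, so the following is only a sketch of what a bounding ellipse of arbitrary orientation means geometrically. The five-parameter form (center, two semi-axes, rotation angle) and the class name `OrientedEllipse` are assumptions for illustration, not the paper's formulation.

```python
import math
from dataclasses import dataclass

@dataclass
class OrientedEllipse:
    """Hypothetical 5-parameter ellipse: center, semi-axes, rotation in radians."""
    cx: float
    cy: float
    a: float      # semi-axis along the ellipse's own x direction
    b: float      # semi-axis along the ellipse's own y direction
    theta: float  # counter-clockwise rotation of the ellipse frame

    def contains(self, x, y):
        """True if (x, y) lies inside or on the ellipse."""
        dx, dy = x - self.cx, y - self.cy
        # Rotate the offset into the ellipse's own axis-aligned frame.
        xr = dx * math.cos(self.theta) + dy * math.sin(self.theta)
        yr = -dx * math.sin(self.theta) + dy * math.cos(self.theta)
        return (xr / self.a) ** 2 + (yr / self.b) ** 2 <= 1.0

    def axis_aligned_half_extents(self):
        """Half-width and half-height of the tightest axis-aligned box."""
        c, s = math.cos(self.theta), math.sin(self.theta)
        half_w = math.sqrt((self.a * c) ** 2 + (self.b * s) ** 2)
        half_h = math.sqrt((self.a * s) ** 2 + (self.b * c) ** 2)
        return half_w, half_h

# An ellipse with semi-axes 2 and 1, rotated by 90 degrees.
ellipse = OrientedEllipse(cx=0.0, cy=0.0, a=2.0, b=1.0, theta=math.pi / 2)
half_w, half_h = ellipse.axis_aligned_half_extents()
print(ellipse.contains(0.0, 1.5), ellipse.contains(1.5, 0.0))
```

The comparison with the axis-aligned extents shows why an oriented shape can be useful: a tilted object fills much less of its upright bounding box than of a rotated ellipse fitted around it.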