Convolutions for 360-Degree Cameras (Panoramic Images), Part 1: Equirectangular Convolutions

weixin_39867708

Corners for Layout: End-to-End Layout Recovery from 360 Images

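Recovering the 3D layout of indoor scenes has been a core research topic for more than a decade, yet several major challenges remain unsolved. Among the most relevant work, a large share of state-of-the-art methods make implicit or explicit assumptions about the scene, such as box-shaped or Manhattan layouts. In addition, current methods are computationally expensive and unsuitable for real-time applications such as robot navigation and AR/VR.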

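The paper proposes CFL (Corners for Layout), the first end-to-end model for 3D layout recovery from 360 images. The experiments show that the method outperforms the state of the art while making fewer assumptions than other work and at a lower cost. The paper also shows that, by using Equirectangular Convolutions, a convolution applied directly on the spherical projection, the model generalizes better to changes in camera position than conventional approaches.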

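This post focuses on the convolution. Readers interested in the layout work can refer to the original paper; that material is skipped here in favor of going straight to the convolution operation.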

Equirectangular Convolutions

Spherical images are receiving increasing attention due to the growing number of omnidirectional sensors in drones, robots and autonomous cars. A naïve application of convolutional networks to an equirectangular projection is not, in principle, a good choice due to the space-varying distortions introduced by such projection.

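Spherical images are attracting growing attention as omnidirectional sensors become more common on drones, robots and autonomous cars. The simplest approach is to apply a convolutional network directly to the equirectangular projection. In principle this is a poor choice, because the projection introduces space-varying distortion (the amount of distortion changes with position in the image).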

In this section we present a convolution that we name EquiConv, which is defined in the spherical domain instead of the image domain and it is implicitly invariant to equirectangular representation distortions. The kernel in EquiConvs is defined as a spherical surface patch –see Figure 4.

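The paper introduces a convolution named EquiConv. It is defined in the spherical domain rather than the image domain, and it is implicitly invariant to the distortions of the equirectangular representation.

The kernel in EquiConvs is defined as a spherical surface patch (see Figure 4).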

We parametrize its receptive field by the angles $\alpha_w$ and $\alpha_h$. Thus, we directly define a convolution over the field of view. The kernel is rotated and applied along the sphere and its position is defined by the spherical coordinates ($\phi$ and $\theta$ in the figure) of its center.

Unlike standard kernels, which are parameterized by their size $k_w \times k_h$, with EquiConvs we define the angular size ($\alpha_w \times \alpha_h$) and resolution ($r_w \times r_h$). In practice, we keep the aspect ratio, $\alpha_w / r_w = \alpha_h / r_h$, and we use square kernels, so we will refer to the field of view as $\alpha$ ($\alpha_w = \alpha_h$) and the resolution as $r$ ($r_w = r_h$) respectively from now on.

As we increase the resolution of the kernel, the angular distance between the elements decreases, with the intuitive upper limit of not giving more resolution to the kernel than the image itself. In other words, the kernel is defined in a sphere, its radius being less than or equal to the image sphere radius. EquiConvs can also be seen as a general model for spherical Atrous Convolutions [4, 5] where the kernel size is what we call resolution, and the rate is the field of view of the kernel divided by the resolution. An example of the differences of EquiConvs by modifying $\alpha$ and $r$ can be seen in Figure 5.

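The receptive field is parametrized by the angles $\alpha_w$ and $\alpha_h$, so the convolution is defined directly over a field of view. The kernel is rotated and applied along the sphere, and its position is given by the spherical coordinates of its center ($\phi$ and $\theta$ in the figure).

A standard kernel is parameterized by its size $k_w \times k_h$. An EquiConv kernel is instead described by an angular size ($\alpha_w \times \alpha_h$) and a resolution ($r_w \times r_h$). In practice the aspect ratio is kept, $\alpha_w / r_w = \alpha_h / r_h$, and the kernels are square, so from here on the field of view is written $\alpha$ ($\alpha_w = \alpha_h$) and the resolution $r$ ($r_w = r_h$).

Increasing the kernel resolution reduces the angular distance between its elements, with the intuitive upper limit that the kernel should not have a finer resolution than the image itself. Put differently, the kernel is defined on a sphere whose radius is less than or equal to that of the image sphere. EquiConvs can also be viewed as a general model for spherical atrous (dilated) convolutions [4, 5], where the kernel size corresponds to the resolution and the rate is the kernel's field of view divided by the resolution. Figure 5 shows an example of how EquiConvs differ as $\alpha$ and $r$ are varied.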
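The relation between field of view, resolution and atrous rate can be made concrete with a few lines of Python. This is only a sketch: the function names `atrous_rate` and `max_resolution` are made up for illustration, and the resolution cap is one possible reading of the "no more resolution than the image itself" limit, assuming an equirectangular image whose width spans $2\pi$ radians.

```python
import math


def atrous_rate(alpha: float, r: int) -> float:
    """Rate as described in the text: field of view divided by resolution.

    alpha is the kernel field of view in radians, r the kernel resolution.
    The result is the nominal angular step between kernel elements.
    """
    return alpha / r


def max_resolution(alpha: float, image_width: int) -> int:
    """Largest r whose angular step alpha / r is not finer than one pixel.

    Assumption (not stated in the source): one pixel of the equirectangular
    image covers 2*pi / image_width radians of longitude.
    """
    pixel_angle = 2.0 * math.pi / image_width
    # The small epsilon guards against floating-point round-off at exact ratios.
    return max(1, math.floor(alpha / pixel_angle + 1e-9))


if __name__ == "__main__":
    alpha = math.radians(30.0)
    for r in (3, 6, 12):
        print(r, math.degrees(atrous_rate(alpha, r)))
```

Doubling `r` at a fixed `alpha` halves the rate, which is the "angular distance decreases" behaviour the paragraph describes. Under the stated assumption, a 360-pixel-wide image has one pixel per degree, so a 10-degree kernel would support at most `r = 10`.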

EquiConvs Details

In [7], they introduce deformable convolutions by learning additional offsets from the preceding feature maps. Offsets are added to the regular kernel locations in the Standard Convolution enabling free form deformation of the kernel.

Inspired by this work, we deform the shape of the kernels according to the geometrical priors of the equirectangular image projection. To do that, we generate offsets that are not learned but fixed given the spherical distortion model and constant over the same horizontal locations. Here, we describe how to obtain the distorted pixel locations from the original ones.

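EquiConvs are inspired by deformable convolutions. The difference is that the offsets are not learned; they are computed directly from the geometric prior of the spherical projection. That prior has two parts: the spherical distortion model, and the fact that the offsets are constant over the same horizontal locations (that is, within one row of the equirectangular image the offsets are the same at every horizontal position).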

Let us define $(u_{0,0}, v_{0,0})$ as the pixel location on the equirectangular image where we apply the convolution operation (i.e. the image coordinate where the center of the kernel is located).

First, we define the coordinates for every element in the kernel and afterwards we rotate them to the point of the sphere where the kernel is being applied. We define each point of the kernel as

where i and j are integers in the range

and d is the distance from the center of the sphere to the kernel grid. In order to cover the field of view α,

We project each point into the sphere surface by normalizing the vectors, and rotate them to align the kernel center to the point where the kernel is applied.

where $R_a(\beta)$ stands for a rotation matrix of an angle $\beta$ around the $a$ axis. $\phi_{0,0}$ and $\theta_{0,0}$ are the spherical angles of the center of the kernel (see Figure 4), and are defined as

where W and H are, respectively, the width and height of the equirectangular image in pixels.
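The display equations (3) to (6) that this passage refers to can be written out as follows. This is a reconstruction from the definitions in the text rather than a verbatim copy. It writes $(u_{0,0}, v_{0,0})$ for the pixel location of the kernel center, assumes a kernel grid with unit spacing whose full width $r$ subtends the angle $\alpha$, and assumes the usual equirectangular mapping in which the image center has zero longitude and latitude and latitude grows toward the top of the image.

```latex
\begin{align}
\hat{p}_{ij} &=
  \begin{bmatrix} \hat{x}_{ij} \\ \hat{y}_{ij} \\ \hat{z}_{ij} \end{bmatrix}
  = \begin{bmatrix} i \\ j \\ d \end{bmatrix},
  \qquad i, j \in \left[ -\tfrac{r-1}{2},\ \tfrac{r-1}{2} \right] \tag{3} \\
d &= \frac{r}{2 \tan(\alpha / 2)} \tag{4} \\
p_{ij} &=
  \begin{bmatrix} x_{ij} \\ y_{ij} \\ z_{ij} \end{bmatrix}
  = R_y(\phi_{0,0})\, R_x(\theta_{0,0})\,
    \frac{\hat{p}_{ij}}{\lVert \hat{p}_{ij} \rVert} \tag{5} \\
\phi_{0,0} &= \left( u_{0,0} - \frac{W}{2} \right) \frac{2\pi}{W},
  \qquad
  \theta_{0,0} = -\left( v_{0,0} - \frac{H}{2} \right) \frac{\pi}{H} \tag{6}
\end{align}
```

With $r = 3$ the indices are $-1, 0, 1$, and Eq. (4) places the grid so that its outer edge at $\pm r/2$ sits at an angle $\alpha/2$ from the kernel axis, since $\tan(\alpha/2) = (r/2)/d$.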

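First, coordinates are defined for every element of the kernel, and then they are rotated to the point on the sphere where the kernel is applied.

Here $i$ and $j$ are integer indices over the kernel grid, and $d$ is the distance from the center of the sphere to the kernel grid; so that the kernel covers the field of view $\alpha$, $d$ is given by Eq. (4).

Each point is projected onto the sphere surface by normalizing its vector, and the points are then rotated so that the kernel center lines up with the point where the kernel is applied.

$R_a(\beta)$ denotes the rotation matrix for an angle $\beta$ about the $a$ axis; $\phi_{0,0}$ and $\theta_{0,0}$ are the spherical angles of the kernel center (see Figure 4) and are defined by Eq. (6).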

Finally, the rest of elements are back-projected to the equirectangular image domain.

First, we convert the unit sphere coordinates to latitude and longitude angles:

And then, to the original 2D equirectangular image domain:

In Figure 6 we show how these offsets are applied to a regular kernel; and in Figure 7 three kernel samples on the spherical and on the equirectangular images.
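The two conversions described above, Eqs. (7) and (8), can be reconstructed from the text as follows. The sketch assumes that the $y$ axis points up, so that $\arcsin(y_{ij})$ is a latitude that is positive above the horizon, and that image rows grow downward; with the opposite axis convention the sign of $\theta_{ij}$ flips. $W$ and $H$ are the image width and height in pixels, and the image center is taken to have zero longitude and latitude.

```latex
\begin{align}
\phi_{ij} &= \arctan\!\left( \frac{x_{ij}}{z_{ij}} \right),
  \qquad
  \theta_{ij} = \arcsin\left( y_{ij} \right) \tag{7} \\
u_{ij} &= \left( \frac{\phi_{ij}}{2\pi} + \frac{1}{2} \right) W,
  \qquad
  v_{ij} = \left( -\frac{\theta_{ij}}{\pi} + \frac{1}{2} \right) H \tag{8}
\end{align}
```

In code the arctangent is usually evaluated with a two-argument `atan2(x, z)` so that the longitude covers the full $(-\pi, \pi]$ range.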

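Finally, the remaining elements are back-projected to the equirectangular image domain.

The unit-sphere coordinates are first converted to latitude and longitude angles with Eq. (7), and then mapped to the original 2D equirectangular image domain with Eq. (8).

Figure 6 shows how these offsets are applied to a regular kernel, and Figure 7 shows three kernel samples on the sphere and on the equirectangular image.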
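The whole procedure (build the kernel grid, normalize onto the sphere, rotate to the kernel center, back-project to pixels) can be sketched in NumPy. This is an illustrative sketch, not the authors' implementation. The function name `equiconv_sampling_grid` is made up here, and several details are assumptions: the grid has unit spacing at distance $d = r / (2\tan(\alpha/2))$, the $y$ axis points up, the rotation about $x$ tilts the $+z$ axis toward $+y$, the rotation about $y$ turns $+z$ toward $+x$, and kernel rows are ordered top to bottom. The source does not spell out the rotation directions.

```python
import numpy as np


def equiconv_sampling_grid(u0, v0, W, H, alpha, r):
    """Sampling locations of an r x r kernel with field of view alpha (radians),
    centred at pixel (u0, v0) of a W x H equirectangular image.

    Returns two (r, r) arrays u, v of (generally non-integer) pixel coordinates.
    """
    # Kernel grid on a plane at distance d, with unit spacing between elements.
    idx = np.arange(r) - (r - 1) / 2.0
    # i grows to the right; j is negated so that row 0 is the top of the kernel
    # (assumed y-up convention).
    i, j = np.meshgrid(idx, -idx)
    d = r / (2.0 * np.tan(alpha / 2.0))
    p_hat = np.stack([i, j, np.full_like(i, d)], axis=-1)  # shape (r, r, 3)

    # Project every grid point onto the unit sphere.
    p_hat /= np.linalg.norm(p_hat, axis=-1, keepdims=True)

    # Spherical angles of the kernel centre (image centre = zero lon/lat).
    phi0 = (u0 - W / 2.0) * 2.0 * np.pi / W
    theta0 = -(v0 - H / 2.0) * np.pi / H

    # Rotation conventions are assumptions: Rx tilts +z toward +y,
    # Ry turns +z toward +x.
    ct, st = np.cos(theta0), np.sin(theta0)
    cp, sp = np.cos(phi0), np.sin(phi0)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, ct, st], [0.0, -st, ct]])
    Ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    p = p_hat @ (Ry @ Rx).T  # apply R = Ry Rx to every point

    # Back-project: unit sphere -> longitude/latitude -> pixel coordinates.
    phi = np.arctan2(p[..., 0], p[..., 2])
    theta = np.arcsin(np.clip(p[..., 1], -1.0, 1.0))
    u = (phi / (2.0 * np.pi) + 0.5) * W
    v = (-theta / np.pi + 0.5) * H
    return u, v


if __name__ == "__main__":
    u, v = equiconv_sampling_grid(256.0, 40.0, 512, 256, np.radians(20.0), 3)
    print(np.round(u - 256.0, 2))  # horizontal offsets from the kernel centre
    print(np.round(v - 40.0, 2))   # vertical offsets from the kernel centre
```

Subtracting the center `(u0, v0)` from the returned coordinates gives the fixed offsets. Away from the left and right image borders, those offsets depend only on the row `v0`, which matches the "constant over the same horizontal locations" property. The sketch does not wrap longitudes at the image border or handle kernels that cross a pole.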
