Convolutions for 360-Degree Cameras (Panoramic Images), Part 1: Equirectangular Convolutions

Corners for Layout: End-to-End Layout Recovery from 360 Images

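Recovering the 3D layout of indoor scenes has been a core research topic for more than a decade, yet several major challenges remain unsolved. Many state-of-the-art methods make implicit or explicit assumptions about the scene, such as box-shaped or Manhattan layouts. Current methods are also computationally expensive, which makes them unsuitable for real-time applications such as robot navigation and AR/VR.

The paper proposes CFL (Corners for Layout), the first end-to-end model for 3D layout recovery from 360 images. The experiments show that it outperforms the state of the art while making fewer assumptions about the scene than other work, and at a lower cost. The paper also shows that, by using Equirectangular Convolutions, a convolution applied directly on the spherical projection, the model generalizes better to changes in camera position than conventional approaches.

This post focuses on the convolution. For the layout side of the work, readers can refer to the original paper; that material is skipped here and we go straight to the convolution operation.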

Equirectangular Convolutions

Spherical images are receiving increasing attention due to the growing number of omnidirectional sensors in drones, robots and autonomous cars. A naïve application of convolutional networks to an equirectangular projection is not, in principle, a good choice due to the space-varying distortions introduced by such projection.

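With omnidirectional sensors becoming common on drones, robots and self-driving cars, spherical images are drawing more and more attention. The simple approach is to apply a convolutional network directly to the equirectangular projection. In principle that is not a good choice, because the projection introduces space-varying distortion (that is, how strongly a region is distorted depends on where it lies in the image).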
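To make "space-varying" concrete, here is a minimal sketch of how the horizontal stretch of an equirectangular image grows with latitude. It relies on the standard geometric fact that a circle of latitude θ is shorter than the equator by a factor cos θ while every image row still has the same number of pixels. The function name `horizontal_stretch` and the row-to-latitude convention (row H/2 is the equator, row 0 the north pole) are assumptions made for this illustration.

```python
import math

def horizontal_stretch(v, H):
    """Horizontal stretch of row v in an equirectangular image of height H.

    Assumed convention: latitude = -(v - H/2) * pi / H, so v = H/2 is the
    equator. The row covers a circle of length proportional to cos(latitude)
    but is stored with the full image width, hence the 1 / cos factor.
    """
    latitude = -(v - H / 2.0) * math.pi / H
    return 1.0 / math.cos(latitude)

H = 512
for v in (256, 128, 32):
    # The factor is 1 at the equator and grows towards the poles.
    print(v, horizontal_stretch(v, H))
```

A fixed 3 × 3 pixel kernel therefore covers a different angular footprint on every row, which is the mismatch the rest of the post addresses.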

In this section we present a convolution that we name EquiConv, which is defined in the spherical domain instead of the image domain and it is implicitly invariant to equirectangular representation distortions. The kernel in EquiConvs is defined as a spherical surface patch –see Figure 4.

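The paper introduces a convolution called EquiConv. It is defined in the spherical domain rather than the image domain, and it is implicitly invariant to the distortions of the equirectangular representation.

The kernel in EquiConvs is defined as a spherical patch, as shown in Figure 4: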

[Figure 4: the EquiConv kernel defined as a spherical surface patch]

We parametrize its receptive field by the angles αw and αh. Thus, we directly define a convolution over the field of view. The kernel is rotated and applied along the sphere and its position is defined by the spherical coordinates (φ and θ in the figure) of its center.

Unlike standard kernels, that are parameterized by their size kw × kh, with EquiConvs we define the angular size (αw × αh) and resolution (rw × rh). In practice, we keep the aspect ratio, αw/rw = αh/rh, and we use square kernels, so we will refer the field of view as α (αw = αh) and the resolution as r (rw = rh) respectively from now on.

As we increase the resolution of the kernel, the angular distance between the elements decreases, with the intuitive upper limit of not giving more resolution to the kernel than the image itself. In other words, the kernel is defined in a sphere, being its radius less or equal to the image sphere radius. EquiConvs can also be seen as a general model for spherical Atrous Convolutions [4, 5] where the kernel size is what we call resolution, and the rate is the field of view of the kernel divided by the resolution. An example of the differences of EquiConvs by modifying α and r can be seen in Figure 5.

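The receptive field is parameterized by the angles αw and αh, so the convolution is defined directly over the field of view. The kernel is rotated and applied along the sphere, and its position is given by the spherical coordinates of its center (φ and θ in the figure).

Standard kernels are parameterized by their size kw × kh; EquiConvs instead define an angular size (αw × αh) and a resolution (rw × rh). In practice we keep the aspect ratio, αw/rw = αh/rh, and use square kernels, so from now on the field of view is written α (αw = αh) and the resolution r (rw = rh).

As the resolution of the kernel increases, the angular distance between its elements decreases; the intuitive upper limit is that the kernel should not have more resolution than the image itself. In other words, the kernel is defined on a sphere whose radius is less than or equal to the radius of the image sphere. EquiConvs can also be seen as a general model for spherical atrous convolutions [4, 5], where the kernel size is what we call resolution and the rate is the field of view of the kernel divided by the resolution. Figure 5 shows an example of how EquiConvs differ as α and r are modified.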
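A small worked example of the two quantities may help. The rate follows the definition in the text (field of view divided by resolution). The upper limit is only one possible reading of "no more resolution than the image itself": it assumes the angular step of the kernel should not be finer than one pixel column at the equator, 360°/W. The function names `angular_rate` and `max_resolution` are made up for this sketch.

```python
import math

def angular_rate(alpha_deg, r):
    """Rate of an EquiConv kernel: field of view divided by resolution,
    i.e. the angular distance (in degrees) between neighbouring elements."""
    return alpha_deg / r

def max_resolution(alpha_deg, W):
    """Largest r whose angular step is not finer than one image pixel.

    Assumption: one pixel spans 360 / W degrees at the equator of a
    panorama that is W pixels wide.
    """
    return math.floor(alpha_deg * W / 360.0)

# A 3 x 3 kernel covering 30 degrees samples the sphere every 10 degrees.
print(angular_rate(30.0, 3))
# On a 1024-pixel-wide panorama, a 10-degree kernel cannot usefully go
# beyond the resolution returned here.
print(max_resolution(10.0, 1024))
```

Keeping r fixed and enlarging α spreads the same number of samples over a wider patch, which is what an atrous (dilated) convolution does on a regular grid.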

[Figure 5: effect of changing the field of view α and the resolution r of an EquiConv kernel]

EquiConvs Details

In [7], they introduce deformable convolutions by learning additional offsets from the preceding feature maps. Offsets are added to the regular kernel locations in the Standard Convolution enabling free form deformation of the kernel.

Inspired by this work, we deform the shape of the kernels according to the geometrical priors of the equirectangular image projection. To do that, we generate offsets that are not learned but fixed given the spherical distortion model and constant over the same horizontal locations. Here, we describe how to obtain the distorted pixel locations from the original ones.

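EquiConvs are inspired by deformable convolutions. The difference is that the offsets in EquiConvs are not learned; they are computed directly from geometric priors of the sphere. Those priors are the spherical distortion model and the fact that the offsets are constant over the same horizontal locations (that is, within one row of the equirectangular image, the offsets are the same at every horizontal position).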

Let us define (u0,0, v0,0) as the pixel location on the equirectangular image where we apply the convolution operation (i.e. the image coordinate where the center of the kernel is located).

First, we define the coordinates for every element in the kernel and afterwards we rotate them to the point of the sphere where the kernel is being applied. We define each point of the kernel as

$$\hat{p}_{ij} = \begin{bmatrix} \hat{x}_{ij} \\ \hat{y}_{ij} \\ \hat{z}_{ij} \end{bmatrix} = \begin{bmatrix} i \\ j \\ d \end{bmatrix} \tag{3}$$

where i and j are integers in the range $\left[-\frac{r-1}{2}, \frac{r-1}{2}\right]$ and d is the distance from the center of the sphere to the kernel grid. In order to cover the field of view α,

$$d = \frac{r}{2\tan(\alpha/2)} \tag{4}$$

We project each point into the sphere surface by normalizing the vectors, and rotate them to align the kernel center to the point where the kernel is applied.

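$$p_{ij} = \begin{bmatrix} x_{ij} \\ y_{ij} \\ z_{ij} \end{bmatrix} = R_y(\phi_{0,0})\, R_x(\theta_{0,0})\, \frac{\hat{p}_{ij}}{|\hat{p}_{ij}|} \tag{5}$$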

where Ra(β) stands for a rotation matrix of an angle β around the a axis. φ0,0 and θ0,0 are the spherical angles of the center of the kernel –see Figure 4, and are defined as

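$$\phi_{0,0} = \left(u_{0,0} - \frac{W}{2}\right)\frac{2\pi}{W}, \qquad \theta_{0,0} = -\left(v_{0,0} - \frac{H}{2}\right)\frac{\pi}{H} \tag{6}$$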

where W and H are, respectively, the width and height of the equirectangular image in pixels.

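First we define coordinates for every element of the kernel, and then we rotate them to the point on the sphere where the kernel is applied.

i and j are integers in the range [-(r-1)/2, (r-1)/2], d is the distance from the center of the sphere to the kernel grid, and, in order to cover the field of view α, d is given by Eq. (4).

We project each point onto the sphere surface by normalizing the vectors, and rotate the points so that the kernel center is aligned with the point where the kernel is applied.

Ra(β) is the rotation matrix for an angle β around the a axis; φ0,0 and θ0,0 are the spherical angles of the kernel center (see Figure 4) and are defined by Eq. (6).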
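The steps up to this point (kernel grid, distance d, normalization, rotation to the kernel center) can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are made up, d = r / (2 tan(α/2)) is used so that a grid of r unit-spaced elements spans the field of view α, and the signs of the two rotation matrices are chosen here so that the kernel center ends up at longitude φ0,0 and latitude θ0,0 with the y axis pointing up.

```python
import numpy as np

def rot_x(beta):
    # Assumed sign: a positive angle lifts the +z axis towards +y,
    # i.e. towards positive latitude.
    c, s = np.cos(beta), np.sin(beta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, s],
                     [0.0, -s, c]])

def rot_y(beta):
    # Assumed sign: a positive angle turns the +z axis towards +x,
    # i.e. towards positive longitude.
    c, s = np.cos(beta), np.sin(beta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def kernel_on_sphere(u0, v0, W, H, alpha, r):
    """Unit vectors, shape (r, r, 3), of an r x r kernel with field of view
    alpha (radians) centered at pixel (u0, v0) of a W x H panorama."""
    idx = np.arange(r) - (r - 1) / 2.0        # i, j in [-(r-1)/2, (r-1)/2]
    i, j = np.meshgrid(idx, idx)              # i along columns, j along rows
    d = r / (2.0 * np.tan(alpha / 2.0))       # distance of the kernel grid
    p_hat = np.stack([i, j, np.full_like(i, d)], axis=-1)
    p_hat /= np.linalg.norm(p_hat, axis=-1, keepdims=True)  # onto the sphere

    phi0 = (u0 - W / 2.0) * 2.0 * np.pi / W   # longitude of the kernel center
    theta0 = -(v0 - H / 2.0) * np.pi / H      # latitude of the kernel center
    R = rot_y(phi0) @ rot_x(theta0)
    return p_hat @ R.T                        # rotate every row vector by R

pts = kernel_on_sphere(u0=100, v0=60, W=400, H=200, alpha=np.deg2rad(30), r=3)
print(pts.shape)
print(pts[1, 1])  # the kernel center after rotation
```

Because only `rot_y` depends on u0, moving the kernel along an image row rotates the whole patch around the vertical axis without changing its shape, which matches the remark that the offsets are constant over the same horizontal locations.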

Finally, the rest of elements are back-projected to the equirectangular image domain.

First, we convert the unit sphere coordinates to latitude and longitude angles:

$$\phi_{ij} = \arctan\left(\frac{x_{ij}}{z_{ij}}\right), \qquad \theta_{ij} = \arcsin(y_{ij}) \tag{7}$$

And then, to the original 2D equirectangular image domain:

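$$u_{ij} = \left(\frac{\phi_{ij}}{2\pi} + \frac{1}{2}\right) W, \qquad v_{ij} = \left(-\frac{\theta_{ij}}{\pi} + \frac{1}{2}\right) H \tag{8}$$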

In Figure 6 we show how these offsets are applied to a regular kernel; and in Figure 7 three kernel samples on the spherical and on the equirectangular images.

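Finally, the remaining elements are back-projected to the equirectangular image domain.

First, Eq. (7) converts the unit sphere coordinates to latitude and longitude; then Eq. (8) maps them to the original 2D equirectangular image domain.

Figure 6 shows how these offsets are applied to a regular kernel, and Figure 7 shows three kernel samples on the spherical and on the equirectangular images.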
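The back-projection step can be sketched in the same way. The function names below are hypothetical, and the conventions are assumptions consistent with the description: longitude is measured from the +z axis towards +x, latitude is positive towards +y, the image center is the origin of both angles, and v grows downwards. `np.arctan2` is used in place of a plain arctangent so that longitudes behind the camera keep the correct sign.

```python
import numpy as np

def sphere_to_equirect(points, W, H):
    """Map unit vectors of shape (..., 3) to pixel coordinates (u, v)."""
    x, y, z = points[..., 0], points[..., 1], points[..., 2]
    phi = np.arctan2(x, z)                     # longitude in (-pi, pi]
    theta = np.arcsin(np.clip(y, -1.0, 1.0))   # latitude in [-pi/2, pi/2]
    u = (phi / (2.0 * np.pi) + 0.5) * W
    v = (-theta / np.pi + 0.5) * H
    return u, v

def pixel_to_angles(u, v, W, H):
    """Inverse mapping for a pixel: longitude and latitude of (u, v)."""
    phi = (u - W / 2.0) * 2.0 * np.pi / W
    theta = -(v - H / 2.0) * np.pi / H
    return phi, theta

def angles_to_vector(phi, theta):
    """Unit vector pointing at longitude phi and latitude theta."""
    return np.array([np.cos(theta) * np.sin(phi),
                     np.sin(theta),
                     np.cos(theta) * np.cos(phi)])

# Round trip: pixel -> angles -> unit vector -> pixel.
W, H = 400, 200
phi, theta = pixel_to_angles(123.0, 45.0, W, H)
print(sphere_to_equirect(angles_to_vector(phi, theta), W, H))
```

Applying `sphere_to_equirect` to the rotated kernel points gives the sampling position of every kernel element; subtracting the positions of a regular grid around the center would then give the fixed offsets. With the y axis pointing up as assumed here, a larger j lands on a smaller v, so the row order may need flipping to match an image-style kernel layout.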

[Figure 6: offsets applied to a regular kernel]

[Figure 7: three kernel samples on the spherical and on the equirectangular images]
