Hierarchical Features Driven Residual Learning for Depth Map Super-Resolution (TIP 2019): paper reading notes

Abstract

Rapid development of affordable and portable consumer depth cameras facilitates the use of depth information in many computer vision tasks such as intelligent vehicles and 3D reconstruction. However, depth maps captured by low-cost depth sensors (e.g., Kinect) usually suffer from low spatial resolution, which limits their potential applications. In this paper, we propose a novel deep network for depth map super-resolution (SR), called DepthSR-Net. The proposed DepthSR-Net automatically infers a high-resolution (HR) depth map from its low-resolution (LR) version by hierarchical features driven residual learning. Specifically, DepthSR-Net is built on a residual U-Net deep network architecture. Given an LR depth map, we first obtain the desired HR size by bicubic interpolation upsampling and then construct an input pyramid to achieve multiple levels of receptive fields. Next, we extract hierarchical features from the input pyramid, the intensity image, and the encoder-decoder structure of the U-Net. Finally, we learn the residual between the interpolated depth map and the corresponding HR one using the rich hierarchical features. The final HR depth map is obtained by adding the learned residual to the interpolated depth map. We conduct an ablation study to demonstrate the effectiveness of each component in the proposed network. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art methods. In addition, the potential usage of the proposed network in other low-level vision problems is discussed.

I. INTRODUCTION

II. RELATED WORK

A. Non Color-Guided Depth Map SR Method

B. Color-Guided Depth Map SR Method

C. Deep Learning-Based Color Image SR Method

Among the previous works, MSG-Net [4] is the most related one to the proposed DepthSR-Net.
However, DepthSR-Net differs from MSG-Net in the following aspects: 1) Instead of performing an early spectral decomposition, we learn the residual map, which avoids the spectral decomposition pre-processing and is more flexible and suitable for practical applications; 2) Different from directly applying the LR depth map as input, we first upscale it to the desired resolution by bicubic interpolation, which relaxes the constraint on the size of the output. In other words, the proposed DepthSR-Net can handle arbitrary scaling factors, while MSG-Net only generalizes to 2^N scaling factors due to the constraint of the automatic upsampling operation it utilizes; 3) Compared with MSG-Net, we make full use of the multi-level features extracted from the input pyramid to recover the HR depth map; 4) Although both MSG-Net and DepthSR-Net employ the intensity image as guidance, they extract intensity features with different network architectures. We acknowledge that the features extracted from the intensity image can boost the performance of depth map SR and further demonstrate this conclusion in our ablation study.
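The arbitrary-scale point in 2) can be illustrated with plain interpolation code. As a dependency-free sketch, bilinear interpolation stands in for the bicubic interpolation the paper uses (the arbitrary-scale argument is identical); the `upsample` helper is ours, not from the paper:

```python
import numpy as np

def upsample(depth, scale):
    """Bilinear upsampling to an arbitrary (non power-of-two) scale.
    The paper uses bicubic; bilinear is substituted here to keep the
    sketch dependency-free -- the point is that interpolation imposes
    no 2^N constraint on the output size."""
    h, w = depth.shape
    H, W = int(round(h * scale)), int(round(w * scale))
    # sample positions in the source grid for every output pixel
    ys = np.linspace(0, h - 1, H)
    xs = np.linspace(0, w - 1, W)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    # blend the four neighbouring depth samples
    top = depth[np.ix_(y0, x0)] * (1 - wx) + depth[np.ix_(y0, x1)] * wx
    bot = depth[np.ix_(y1, x0)] * (1 - wx) + depth[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

lr = np.arange(16, dtype=np.float32).reshape(4, 4)
hr = upsample(lr, 3.0)  # x3, a factor MSG-Net's 2^N design cannot target
```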

III. PROPOSED METHOD

In this part, we first briefly formulate the problem that this paper focuses on, and then illustrate the details of the proposed DepthSR-Net architecture. Finally, we present the loss function and the training and implementation details.

A. Problem Formulation

Following the conclusion proposed in [44], when the original mapping is close to an identity mapping, the residual mapping is much easier to optimize. Accordingly, we learn the residual between the interpolated depth map and the corresponding HR depth map, i.e., the high-frequency component missed in the process of bicubic interpolation upsampling.
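The residual formulation above can be sketched minimally. Here `predict_residual` is a placeholder standing in for DepthSR-Net (it simply returns zeros), not the paper's model; only the add-the-residual structure is the point:

```python
import numpy as np

def predict_residual(d_interp, intensity):
    """Placeholder for DepthSR-Net: the network would predict the
    high-frequency residual lost by bicubic upsampling. Returns zeros
    here, so the output below equals the interpolated input."""
    return np.zeros_like(d_interp)

def super_resolve(d_interp, intensity):
    # final HR depth = bicubic-interpolated depth + learned residual
    return d_interp + predict_residual(d_interp, intensity)

d_interp = np.full((8, 8), 2.0, dtype=np.float32)  # toy interpolated depth
y = np.ones((8, 8), dtype=np.float32)              # toy intensity guidance
hr = super_resolve(d_interp, y)
```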

B. Proposed DepthSR-Net Architecture

The overview of the proposed network architecture and parameter settings is shown in Figure 2.
• input pyramid branch, which achieves multiple levels of receptive fields and produces hierarchical representations;
• encoder branch, which concatenates the hierarchical features from the input pyramid and produces a set of hierarchical encoder features;
• hierarchical Y guidance branch, which extracts hierarchical intensity features to transfer useful structure to the final HR depth map;
• skip connections, which transmit the encoder features to the decoder path;
• decoder branch, which produces the residual map by fusing rich hierarchical concatenated features.

1) Input Pyramid Branch:

The input pyramid branch has the following advantages: (1) providing hierarchical feature representations extracted from the input depth map; (2) achieving multiple levels of receptive fields; (3) reducing the risk of over-fitting by providing an abstract form of the representation.
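A minimal sketch of such an input pyramid, assuming 2x average-pooling between levels (the downsampling operator is our assumption; the text only states that the pyramid supplies multi-level receptive fields):

```python
import numpy as np

def input_pyramid(depth, levels=3):
    """Build an input pyramid by repeated 2x average-pooling.
    Each level halves the spatial size, so a fixed-size convolution
    applied to it covers a larger region of the original depth map,
    i.e. a larger effective receptive field."""
    pyr = [depth]
    for _ in range(levels - 1):
        d = pyr[-1]
        h, w = d.shape
        d = d[:h - h % 2, :w - w % 2]          # crop to even size
        pooled = d.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyr.append(pooled)
    return pyr

depth = np.ones((16, 16), dtype=np.float32)
pyr = input_pyramid(depth)   # levels of size 16x16, 8x8, 4x4
```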

2) Encoder Branch:

Different from U-Net, the encoder path in our DepthSR-Net concatenates the hierarchical features extracted from the input pyramid branch, which fuses multi-level feature representations.

3) Hierarchical Y Guidance Branch:

Different from the multi-scale guidance utilized in [4], we use a fixed convolution kernel size (i.e., 3x3) to guide the reconstruction of the residual map in the decoder branch. Hierarchical intensity features extracted by the 3x3 convolution kernel can make full use of the discontinuities in the intensity image to locate the associated depth discontinuities during reconstruction, and also reduce the computational burden.
The hierarchical Y guidance branch has the following advantages: (1) transferring hierarchically useful structure of the intensity image to the final HR depth map; (2) increasing the network width of the decoder branch.
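To make the 3x3 guidance concrete: a small sketch of 3x3 filtering (cross-correlation) over an intensity image. The Laplacian kernel is our illustrative choice for responding to intensity discontinuities; the paper's guidance branch learns its kernels rather than fixing them:

```python
import numpy as np

def conv3x3(img, kernel):
    """Valid 3x3 filtering (cross-correlation). With a symmetric kernel
    such as the Laplacian this equals convolution, and it responds
    exactly at the intensity discontinuities the guidance exploits."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float32)
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

laplacian = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=np.float32)

flat = np.ones((5, 5), dtype=np.float32)   # no discontinuity
edges = conv3x3(flat, laplacian)           # zero response everywhere

step = np.zeros((5, 5), dtype=np.float32)  # vertical intensity edge
step[:, 3:] = 1.0
resp = conv3x3(step, laplacian)            # nonzero near the edge
```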

4) Skip Connections:

A skip connection is an extra connection between nodes in different layers of a neural network that skips one or more layers of nonlinear processing. Its purpose here is to transfer the corresponding features from the encoder branch to the decoder branch. The advantages of skip connections are as follows: (1) alleviating the vanishing gradient problem; (2) ensuring maximum information flow between layers; (3) encouraging feature reuse.
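In a U-Net-style network, a skip connection amounts to carrying encoder features forward and concatenating them with decoder features along the channel axis. A minimal channels-first sketch (all shapes are illustrative, not the paper's layer sizes):

```python
import numpy as np

# Encoder features saved at some resolution, and the decoder features
# produced at the same resolution on the way back up.
enc = np.ones((64, 32, 32), dtype=np.float32)    # 64 encoder channels
dec = np.zeros((64, 32, 32), dtype=np.float32)   # 64 decoder channels

# The skip connection: concatenate along the channel axis (axis 0,
# channels-first), giving the decoder direct access to encoder detail.
fused = np.concatenate([enc, dec], axis=0)       # 128 channels
```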

5) Decoder Branch:

In the decoder branch, we progressively fuse the decoder features with the hierarchical features from the other branches to predict the residual map.

C. Loss Function

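The loss-function equation in the original post was an image that did not survive extraction. As a hedged reconstruction consistent with the residual-learning formulation in Section III-A (the symbols below are our notation, not necessarily the paper's), a standard pixel-wise L2 objective over N training pairs would read:

```latex
\mathcal{L}(\Theta) = \frac{1}{N} \sum_{i=1}^{N}
  \left\| \mathcal{F}\!\left(D_i^{\uparrow}, Y_i; \Theta\right)
          + D_i^{\uparrow} - D_i^{hr} \right\|_2^2
```

where $D_i^{\uparrow}$ is the bicubically interpolated LR depth map, $Y_i$ the guidance intensity image, $D_i^{hr}$ the ground-truth HR depth map, and $\mathcal{F}(\cdot;\Theta)$ the residual predicted by DepthSR-Net.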

D. Network Training and Implementation

IV. EXPERIMENTS

A. Experiment on Middlebury Dataset

B. Experiment on Test-ToF Dataset

C. Experiment on Test-Ynoise Dataset

D. Experiment on Real Data

E. Running Time

F. Ablation Study

V. APPLICATION

VI. DISCUSSION AND CONCLUSION
