2018 MVSNet: Depth Inference for Unstructured Multi-view Stereo

Contents

MVSNet: Depth Inference for Unstructured Multi-view Stereo (ECCV 2018)

Abstract

1 Introduction

2 Related work



MVSNet: Depth Inference for Unstructured Multi-view Stereo (ECCV 2018)

Yao Yao1, Zixin Luo1, Shiwei Li1, Tian Fang2, and Long Quan1
The Hong Kong University of Science and Technology

        HomePage:  https://www.cse.ust.hk/~yyaoag/     Code: https://github.com/YoYo000/MVSNet

Online reading link:

        https://blog.csdn.net/john_xia/article/details/88100410

 

Abstract

    We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features, and then build the 3D cost volume upon the reference camera frustum via the differentiable homography warping. Next, we apply 3D convolutions to regularize and regress the initial depth map, which is then refined with the reference image to generate the final output. Our framework flexibly adapts arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature. The proposed MVSNet is demonstrated on the large-scale indoor DTU dataset. With simple post-processing, our method not only significantly outperforms previous state-of-the-arts, but also is several times faster in runtime. We also evaluate MVSNet on the complex outdoor Tanks and Temples dataset, where our method ranks first before April 18, 2018 without any fine-tuning, showing the strong generalization ability of MVSNet.

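    In short: MVSNet is an end-to-end network that infers a depth map from multi-view images in four steps: (1) extract deep image features; (2) build a 3D cost volume on the reference camera frustum via differentiable homography warping; (3) regularize the volume with 3D convolutions and regress an initial depth map; (4) refine that depth map with the reference image. A variance-based cost metric maps the features of all views into one cost feature, so the network accepts any number N of input views. On the large indoor DTU dataset, with simple post-processing, it significantly outperforms the previous state of the art and runs several times faster. On the complex outdoor Tanks and Temples dataset it ranked first before April 18, 2018 without any fine-tuning, which shows strong generalization.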
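The "regress" step in the abstract can be illustrated with a small sketch. It assumes, as the MVSNet paper describes, that depth is regressed as the expectation over a probability distribution along the depth axis (soft argmin). The function name `soft_argmin_depth`, the convention that a lower cost means a better match, and the toy numbers are assumptions made for illustration, not the authors' code.

```python
import math


def soft_argmin_depth(costs, depths):
    """Expected depth under softmax(-cost) taken along the depth axis.

    costs:  regularized matching cost for each depth hypothesis at one pixel
            (assumption: lower cost means a better match)
    depths: the depth value of each hypothesis
    """
    lowest = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - lowest)) for c in costs]
    total = sum(weights)
    return sum(w / total * d for w, d in zip(weights, depths))


# Toy values: three depth hypotheses, the middle one matches best.
depths = [1.0, 2.0, 3.0]
estimate = soft_argmin_depth([5.0, 0.0, 5.0], depths)
```

Because the result is a weighted average rather than a hard argmax, it is differentiable and can fall between the sampled depth hypotheses.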

 

1 Introduction

Traditional methods: they use hand-crafted similarity metrics and engineered regularizations (e.g., normalized cross correlation and semi-global matching [12]) to compute dense correspondences and recover 3D points.

        Drawback: low-textured, specular and reflective regions of the scene make dense matching intractable and thus lead to incomplete reconstructions.

Previous CNN-based methods:

       Two-view stereo. Drawback: fails to fully utilize the multi-view information, which leads to less accurate results.

       SurfaceNet and Learned Stereo Machine (LSM). Drawback: both methods use a volumetric representation on regular grids; the huge memory consumption of 3D volumes forces either a low volume resolution or a long running time.

Proposed method:

      (1) Computes one depth map at a time, rather than the whole 3D scene at once.

      (2) A differentiable homography warping operation, which builds the 3D cost volume from 2D image features and enables end-to-end training.

      (3) A variance-based metric, which maps multiple features into one cost feature in the volume and so adapts to an arbitrary number of source images in the input.
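To make point (3) concrete, here is a minimal sketch of a variance-based cost for a single voxel of the volume. It assumes each view contributes one feature vector at that voxel (the reference feature plus the source features already warped to it); the function name `variance_cost` and the toy values are hypothetical.

```python
def variance_cost(features):
    """Element-wise variance of per-view feature vectors at one voxel.

    features: list of N equal-length lists of floats, one list per view.
    Returns one cost vector of the same length, whatever N is.
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two views")
    channels = len(features[0])
    # Mean feature over all views, channel by channel.
    mean = [sum(f[c] for f in features) / n for c in range(channels)]
    # Average squared deviation from that mean.
    return [sum((f[c] - mean[c]) ** 2 for f in features) / n
            for c in range(channels)]


# Views that agree give zero cost; disagreement raises it.
same = variance_cost([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])  # three views
diff = variance_cost([[0.0, 2.0], [2.0, 2.0]])              # two views
```

The output size does not depend on the number of views, which is why this kind of metric can take an arbitrary number of source images.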

Differences between the proposed method and previous CNN methods:

      (1) Our 3D cost volume is built upon the camera frustum instead of the regular Euclidean space.

      (2) Our method decouples the MVS reconstruction into smaller problems of per-view depth map estimation, which makes large-scale reconstruction possible.
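A cost volume on the camera frustum can be pictured as a stack of fronto-parallel planes at sampled depths in the reference view, with each source view warped onto each plane by a homography. The sketch below uses the standard plane-induced homography, not the paper's exact notation. It assumes the camera convention x_cam = R * X_world + t, intrinsics without skew, and a plane z = depth in the reference camera frame; all function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def inv_intrinsics(K):
    """Invert K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] (no skew assumed)."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [[1 / fx, 0.0, -cx / fx], [0.0, 1 / fy, -cy / fy], [0.0, 0.0, 1.0]]


def plane_homography(K_ref, R_ref, t_ref, K_src, R_src, t_src, depth):
    """Homography taking reference pixels to source pixels for the
    plane z = depth in the reference camera frame."""
    # Relative pose: x_src = R_rel * x_ref + t_rel.
    R_rel = matmul(R_src, transpose(R_ref))
    Rt = matvec(R_rel, t_ref)
    t_rel = [t_src[i] - Rt[i] for i in range(3)]
    # R_rel + t_rel * n^T / depth with n = (0, 0, 1): only column 2 changes.
    M = [[R_rel[i][j] + (t_rel[i] / depth if j == 2 else 0.0)
          for j in range(3)] for i in range(3)]
    return matmul(matmul(K_src, M), inv_intrinsics(K_ref))


def warp_pixel(H, u, v):
    """Apply H to pixel (u, v) and dehomogenize."""
    x, y, w = matvec(H, [u, v, 1.0])
    return x / w, y / w


# Toy setup: identical intrinsics, source camera centre 0.1 units to the right.
K = [[100.0, 0.0, 50.0], [0.0, 100.0, 40.0], [0.0, 0.0, 1.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H = plane_homography(K, I3, [0.0, 0.0, 0.0], K, I3, [-0.1, 0.0, 0.0], 2.0)
u, v = warp_pixel(H, 50.0, 40.0)  # where the reference principal point lands
```

Evaluating such a homography once per depth hypothesis and sampling the source feature map at the warped positions would give one slice of the volume per depth. Each step is differentiable with a suitable sampler such as bilinear interpolation, which is the property that allows end-to-end training.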

 

2 Related work

By output form, MVS methods fall into three categories:

    (1) Direct point cloud reconstructions.

                   Drawback: the propagation of point clouds is difficult to fully parallelize and takes a long time.

    (2) Volumetric reconstructions.

                  Drawback: space discretization error and high memory consumption.

    (3) Depth map reconstructions.
