2018 MVSNet: Depth Inference for Unstructured Multi-view Stereo


chenmingwei



Article link: https://blog.csdn.net/chenmingwei/article/details/90724016


MVSNet: Depth Inference for Unstructured Multi-view Stereo (ECCV 2018)



Yao Yao1, Zixin Luo1, Shiwei Li1, Tian Fang2, and Long Quan1
The Hong Kong University of Science and Technology

HomePage: https://www.cse.ust.hk/~yyaoag/ Code: https://github.com/YoYo000/MVSNet

Related online reading:

https://blog.csdn.net/john_xia/article/details/88100410

Abstract

We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features, and then build the 3D cost volume upon the reference camera frustum via the differentiable homography warping. Next, we apply 3D convolutions to regularize and regress the initial depth map, which is then refined with the reference image to generate the final output. Our framework flexibly adapts arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature. The proposed MVSNet is demonstrated on the large-scale indoor DTU dataset. With simple post-processing, our method not only significantly outperforms previous state-of-the-arts, but also is several times faster in runtime. We also evaluate MVSNet on the complex outdoor Tanks and Temples dataset, where our method ranks first before April 18, 2018 without any fine-tuning, showing the strong generalization ability of MVSNet.

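In short: the paper proposes an end-to-end deep learning architecture that infers depth maps from multi-view images. The network first extracts deep image features, then builds a 3D cost volume on the reference camera frustum via differentiable homography warping. 3D convolutions regularize the volume and regress an initial depth map, which is then refined with the reference image to produce the final output. A variance-based cost metric maps multiple features into a single cost feature, so the framework accepts an arbitrary number N of input views. On the large-scale indoor DTU dataset, MVSNet with simple post-processing significantly outperforms previous state-of-the-art methods and runs several times faster. On the complex outdoor Tanks and Temples dataset, it ranked first before April 18, 2018 without any fine-tuning, which shows its strong generalization ability.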
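The "regress the initial depth map" step can be illustrated with a small sketch. The paper describes this step as a soft argmin: a softmax along the depth axis turns the regularized volume into a probability volume, and the depth of each pixel is the expectation over the depth hypotheses. The NumPy code below is only an illustration under assumed conventions: lower cost means a better match, and the function name `soft_argmin_depth` and the `(D, H, W)` layout are hypothetical, not taken from the authors' code.

```python
import numpy as np

def soft_argmin_depth(cost_volume, depth_values):
    """Regress a depth map as the expected depth over all hypotheses.

    cost_volume:  (D, H, W) array; lower cost = better match (assumed convention).
    depth_values: (D,) array with the depth of each hypothesis plane.
    Returns the (H, W) depth map and the (D, H, W) probability volume.
    """
    logits = -np.asarray(cost_volume, dtype=np.float64)
    # Subtract the per-pixel maximum for numerical stability of the softmax.
    logits = logits - logits.max(axis=0, keepdims=True)
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    # Expectation over the depth axis: sum_d depth_values[d] * prob[d].
    depth = np.tensordot(np.asarray(depth_values, dtype=np.float64), prob, axes=(0, 0))
    return depth, prob

# Toy example: 4 depth hypotheses, a 1x1 image, best match at the third plane.
depth_values = np.array([1.0, 2.0, 3.0, 4.0])
cost = np.full((4, 1, 1), 50.0)
cost[2] = 0.0
depth_map, prob_volume = soft_argmin_depth(cost, depth_values)
```

Because the output is an expectation rather than a hard argmax, it is differentiable and can take sub-plane depth values, which is what makes end-to-end training with a regression loss possible.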

1 Introduction

Traditional methods: use hand-crafted similarity metrics and engineered regularizations (e.g., normalized cross correlation and semi-global matching [12]) to compute dense correspondences and recover 3D points.

Drawback: low-textured, specular and reflective regions of the scene make dense matching intractable and thus lead to incomplete reconstructions.

Previous CNN-based methods:

Two-view stereo, drawback: fails to fully utilize the multi-view information, which leads to less accurate results.

SurfaceNet and Learned Stereo Machine (LSM), drawbacks: both methods use a volumetric representation on regular grids; the 3D volumes consume a huge amount of memory, which means either low volume resolution or long run times.

Proposed method:

(1) computes one depth map at a time, rather than the whole 3D scene at once.

(2) a differentiable homography warping operation, which builds the 3D cost volume from 2D image features and enables end-to-end training.

(3) a variance-based metric, which maps multiple features into one cost feature in the volume and therefore adapts to an arbitrary number of source images in the input.
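The variance-based metric in point (3) can be sketched in a few lines. In the paper, the cost at each volume element is the variance of the N warped feature volumes across views, i.e. the mean over views of the squared difference from the per-element mean. The NumPy code below is a minimal illustration of that formula; the function name `variance_cost` and the `(N, C, D, H, W)` layout are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def variance_cost(feature_volumes):
    """Aggregate N per-view feature volumes into one cost volume.

    feature_volumes: array of shape (N, C, D, H, W), one warped feature
                     volume per view (layout is an assumption of this sketch).
    Returns an array of shape (C, D, H, W): the variance across the N views.
    """
    volumes = np.asarray(feature_volumes, dtype=np.float64)
    mean_volume = volumes.mean(axis=0)
    # Population variance over the view axis: sum_i (V_i - mean)^2 / N.
    return ((volumes - mean_volume) ** 2).mean(axis=0)

# The output shape does not depend on the number of views N.
rng = np.random.default_rng(0)
cost_3_views = variance_cost(rng.normal(size=(3, 8, 4, 5, 6)))
cost_5_views = variance_cost(rng.normal(size=(5, 8, 4, 5, 6)))
```

Two properties follow directly from the formula: the result has the same shape for any N, so the rest of the network does not need to know how many source images were given, and the metric treats all views symmetrically instead of favoring the reference image.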

Differences between the proposed method and previous CNN-based methods:

(1) our 3D cost volume is built upon the camera frustum instead of the regular Euclidean space.

(2) our method decouples the MVS reconstruction into smaller problems of per-view depth map estimation, which makes large-scale reconstruction possible.
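Building the cost volume on the reference camera frustum means sampling fronto-parallel planes at a set of depth hypotheses and warping source-view features onto each plane with a homography. The sketch below uses the standard plane-induced homography, H(d) = K_src (R + t n^T / d) K_ref^{-1}, under the assumption that a point X_ref in the reference camera frame maps to the source frame as X_src = R X_ref + t and that the plane is z = d with normal n = (0, 0, 1). The paper writes its homography in its own notation; the function names and camera values here are hypothetical and chosen only for illustration.

```python
import numpy as np

def plane_homography(K_ref, K_src, R, t, depth):
    """Homography from reference pixels to source pixels for the plane z = depth.

    Assumes X_src = R @ X_ref + t (reference-to-source relative pose) and a
    fronto-parallel plane in the reference camera frame.
    """
    n = np.array([0.0, 0.0, 1.0])  # plane normal = reference principal axis
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_ref)

def warp_pixel(H, u, v):
    """Apply a homography to pixel (u, v) and dehomogenize."""
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]

# Hypothetical setup: identical intrinsics, pure sideways translation.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, 0.0, 0.0])

# Depth hypotheses sampled along the reference frustum.
depth_hypotheses = np.linspace(1.0, 4.0, 4)
warped = [warp_pixel(plane_homography(K, K, R, t, d), 320.0, 240.0)
          for d in depth_hypotheses]
```

In this toy setup the image center is shifted horizontally by 50/d pixels, so nearer planes produce a larger parallax. Since the homography is a smooth function of the camera parameters and the warped features are sampled by interpolation, the whole warping step stays differentiable.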

2 Related work

By output representation, MVS methods fall into three categories:

(1) direct point cloud reconstructions.

Drawbacks: the propagation of point clouds is difficult to fully parallelize, so processing takes a long time.

(2) volumetric reconstructions.

Drawbacks: space discretization error and high memory consumption.

(3) depth map reconstructions.
