A curated list of papers & resources linked to 3D reconstruction from images.
Note that:
- This list is not exhaustive,
- Tables use alphabetical order for fairness.
If you are looking for a more generic computer vision awesome list, please check this list.
Contents
Tutorials
SLAM Tutorial & survey
Micro Flying Robots: from Active Vision to Event-based Vision D. Scaramuzza.
ICRA 2016 Aerial Robotics - (Visual odometry) D. Scaramuzza
Simultaneous Localization And Mapping: Present, Future, and the Robust-Perception Age. C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, I. D. Reid, J. J. Leonard.
- "The paper summarizes the outcome of the workshop “The Problem of Mobile Sensors: Setting future goals and indicators of progress for SLAM” held during the Robotics: Science and System (RSS) conference (Rome, July 2015)."
Visual Odometry: Part I - The First 30 Years and Fundamentals, D. Scaramuzza and F. Fraundorfer, IEEE Robotics and Automation Magazine, Volume 18, issue 4, 2011
Visual Odometry: Part II - Matching, robustness, optimization, and applications, F. Fraundorfer and D. Scaramuzza, IEEE Robotics and Automation Magazine, Volume 19, issue 2, 2012
Large-scale, real-time visual-inertial localization revisited S. Lynen, B. Zeisl, D. Aiger, M. Bosse, J. Hesch, M. Pollefeys, R. Siegwart and T. Sattler. Arxiv 2019.
SfM tutorial
Open Source Structure-from-Motion. M. Leotta, S. Agarwal, F. Dellaert, P. Moulon, V. Rabaud. CVPR 2015 Tutorial (material).
Large-scale 3D Reconstruction from Images. T. Shen, J. Wang, T. Fang, L. Quan. ACCV 2016 Tutorial.
MVS tutorial
Multi-View Stereo: A Tutorial. Y. Furukawa, C. Hernández. Foundations and Trends® in Computer Graphics and Vision, 2015.
State of the Art 3D Reconstruction Techniques N. Snavely, Y. Furukawa, CVPR 2014 tutorial slides. Introduction MVS with priors - Large scale MVS
RGB-D mapping
3D indoor scene modeling from RGB-D data: a survey K. Chen, YK. Lai and SM. Hu. Computational Visual Media 2015.
State of the Art on 3D Reconstruction with RGB-D Cameras. M. Zollhöfer, P. Stotko, A. Görlitz, C. Theobalt, M. Nießner, R. Klein, A. Kolb. EUROGRAPHICS 2018.
All in one tutorial
Introduction of Visual SLAM, Structure from Motion and Multiple View Stereo. Yu Huang 2014.
Computer vision books
Computer Vision: Algorithms and Applications. R. Szeliski. 2010.
Papers
SLAM/VO
Visual odometry (image based only)
Real-time simultaneous localisation and mapping with a single camera. A. J. Davison. ICCV 2003.
Visual odometry. D. Nister, O. Naroditsky, and J. Bergen. CVPR 2004.
Real time localization and 3d reconstruction. E. Mouragnon, M. Lhuillier, M. Dhome, F. Dekeyser, and P. Sayd. CVPR 2006.
Parallel Tracking and Mapping for Small AR Workspaces. G. Klein, D. Murray. ISMAR 2007.
Real-Time 6-DOF Monocular Visual SLAM in a Large-Scale Environment. H. Lim, J. Lim, H. Jin Kim. ICRA 2014.
Direct Sparse Odometry, J. Engel, V. Koltun, D. Cremers, arXiv:1607.02565, 2016.
Visual SLAM algorithms: a survey from 2010 to 2016, T. Taketomi, H. Uchiyama, S. Ikeda, IPSJ T Comput Vis Appl 2017.
∇SLAM: Dense SLAM meets Automatic Differentiation. K. M. Jatavallabhula, G. Iyer, L. Paull. arXiv:1910.10672, 2019.
Direct Sparse Mapping J. Zubizarreta, I. Aguinaga and J. M. M. Montiel. arXiv:1904.06577, 2019.
OpenVSLAM: A Versatile Visual SLAM Framework. S. Sumikura, M. Shibuya, K. Sakurada. ACM International Conference on Multimedia 2019.
SfM papers
Incremental SfM
Photo Tourism: Exploring Photo Collections in 3D. N. Snavely, S. M. Seitz, and R. Szeliski. SIGGRAPH 2006.
Towards linear-time incremental structure from motion. C. Wu. 3DV 2013.
Structure-from-Motion Revisited. J. L. Schönberger, J.-M. Frahm. CVPR 2016.
Global SfM
Combining two-view constraints for motion estimation V. M. Govindu. CVPR, 2001.
Lie-algebraic averaging for globally consistent motion estimation. V. M. Govindu. CVPR, 2004.
Robust rotation and translation estimation in multiview reconstruction. D. Martinec and T. Pajdla. CVPR, 2007.
Non-sequential structure from motion. O. Enqvist, F. Kahl, and C. Olsson. ICCV OMNIVIS Workshops 2011.
Global motion estimation from point matches. M. Arie-Nachimson, S. Z. Kovalsky, I. Kemelmacher-Shlizerman, A. Singer, and R. Basri. 3DIMPVT 2012.
Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion. P. Moulon, P. Monasse and R. Marlet. ICCV 2013.
A Global Linear Method for Camera Pose Registration. N. Jiang, Z. Cui, P. Tan. ICCV 2013.
Global Structure-from-Motion by Similarity Averaging. Z. Cui, P. Tan. ICCV 2015.
Linear Global Translation Estimation from Feature Tracks Z. Cui, N. Jiang, C. Tang, P. Tan, BMVC 2015.
Hierarchical SfM
Structure-and-Motion Pipeline on a Hierarchical Cluster Tree. M. Farenzena, A. Fusiello, R. Gherardi. Workshop on 3-D Digital Imaging and Modeling, 2009.
Randomized Structure from Motion Based on Atomic 3D Models from Camera Triplets. M. Havlena, A. Torii, J. Knopp, and T. Pajdla. CVPR 2009.
Efficient Structure from Motion by Graph Optimization. M. Havlena, A. Torii, and T. Pajdla. ECCV 2010.
Hierarchical structure-and-motion recovery from uncalibrated images. R. Toldo, R. Gherardi, M. Farenzena and A. Fusiello. CVIU 2015.
Multi-Stage SfM
Parallel Structure from Motion from Local Increment to Global Averaging. S. Zhu, T. Shen, L. Zhou, R. Zhang, J. Wang, T. Fang, L. Quan. arXiv 2017.
Multistage SFM: Revisiting Incremental Structure from Motion. R. Shah, A. Deshpande, P. J. Narayanan. 3DV 2014. -> Multistage SFM: A Coarse-to-Fine Approach for 3D Reconstruction, arXiv 2016.
HSfM: Hybrid Structure-from-Motion. H. Cui, X. Gao, S. Shen and Z. Hu, ICCV 2017.
Non Rigid SfM
Robust Structure from Motion in the Presence of Outliers and Missing Data. G. Wang, J. S. Zelek, J. Wu, R. Bajcsy. 2016.
Viewing graph optimization
Skeletal graphs for efficient structure from motion. N. Snavely, S. Seitz, R. Szeliski. CVPR 2008
Optimizing the Viewing Graph for Structure-from-Motion. C. Sweeney, T. Sattler, M. Turk, T. Hollerer, M. Pollefeys. ICCV 2015
Graph-Based Consistent Matching for Structure-from-Motion. T. Shen, S. Zhu, T. Fang, R. Zhang, L. Quan. ECCV 2016.
Unordered feature tracking
Unordered feature tracking made fast and easy. P. Moulon and P. Monasse. CVMP 2012.
Point Track Creation in Unordered Image Collections Using Gomory-Hu Trees. Svärm, Simayijiang, Enqvist, Olsson. ICPR 2012.
Fast connected components computation in large graphs by vertex pruning. A. Lulli, E. Carlini, P. Dazzi, C. Lucchese, and L. Ricci. IEEE Transactions on Parallel and Distributed Systems 2016.
Large scale image matching for SfM
Video Google: A Text Retrieval Approach to Object Matching in Video. J. Sivic, F. Schaffalitzky and A. Zisserman. ICCV 2003.
Scalable Recognition with a Vocabulary Tree. Nister, Stewenius, CVPR 2006.
Building Rome in a Day. S. Agarwal, N. Snavely, I. Simon, S. M. Seitz, R. Szeliski. ICCV 2009.
Product quantization for nearest neighbor search. H. Jégou, M. Douze and C. Schmid. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011.
Fast and Accurate Image Matching with Cascade Hashing for 3D Reconstruction. J. Cheng, C. Leng, J. Wu, H. Cui, H. Lu. CVPR 2014.
Recent developments in large-scale tie-point matching. Hartmann, Havlena, Schindler. ISPRS 2016.
Graphmatch: Efficient Large-Scale Graph Construction for Structure from Motion. Q. Cui, V. Fragoso, C. Sweeney and P. Sen. 3DV 2017.
Localization
Real time localization in SfM reconstructions
Real-time Image-based 6-DOF Localization in Large-Scale Environments. Lim, Sinha, Cohen, Uyttendaele. CVPR 2012.
Get Out of My Lab: Large-scale, Real-Time Visual-Inertial Localization. Lynen, Sattler, Bosse, Hesch, Pollefeys, Siegwart. RSS 2015.
DSAC - Differentiable RANSAC for Camera Localization. E. Brachmann, A. Krull, S. Nowozin, J. Shotton, F. Michel, S. Gumhold, C. Rother. CVPR 2017.
Learning Less is More - 6D Camera Localization via 3D Surface Regression. E. Brachmann, C. Rother. CVPR 2018.
Multiple View Stereovision
Point cloud computation
Accurate, Dense, and Robust Multiview Stereopsis. Y. Furukawa, J. Ponce. CVPR 2007. PAMI 2010
State of the art in high density image matching. F. Remondino, M.G. Spera, E. Nocerino, F. Menna, F. Nex. The Photogrammetric Record 29(146), 2014.
Progressive prioritized multi-view stereo. A. Locher, M. Perdoch and L. Van Gool. CVPR 2016.
Pixelwise View Selection for Unstructured Multi-View Stereo. J. L. Schönberger, E. Zheng, M. Pollefeys, J.-M. Frahm. ECCV 2016.
TAPA-MVS: Textureless-Aware PAtchMatch Multi-View Stereo. A. Romanoni, M. Matteucci. ICCV 2019
Surface computation & refinements
Efficient Multi-View Reconstruction of Large-Scale Scenes using Interest Points, Delaunay Triangulation and Graph Cuts. P. Labatut, J-P. Pons, R. Keriven. ICCV 2007
Multi-View Stereo via Graph Cuts on the Dual of an Adaptive Tetrahedral Mesh. S. N. Sinha, P. Mordohai and M. Pollefeys. ICCV 2007.
Towards high-resolution large-scale multi-view stereo. H.-H. Vu, P. Labatut, J.-P. Pons, R. Keriven. CVPR 2009.
Refinement of Surface Mesh for Accurate Multi-View Reconstruction. R. Tylecek and R. Sara. IJVR 2010.
High Accuracy and Visibility-Consistent Dense Multiview Stereo. H.-H. Vu, P. Labatut, J.-P. Pons, R. Keriven. PAMI 2012.
Exploiting Visibility Information in Surface Reconstruction to Preserve Weakly Supported Surfaces M. Jancosek et al. 2014.
A New Variational Framework for Multiview Surface Reconstruction. B. Semerjian. ECCV 2014.
Photometric Bundle Adjustment for Dense Multi-View 3D Modeling. A. Delaunoy, M. Pollefeys. CVPR 2014.
Global, Dense Multiscale Reconstruction for a Billion Points. B. Ummenhofer, T. Brox. ICCV 2015.
Efficient Multi-view Surface Refinement with Adaptive Resolution Control. S. Li, S. Yu Siu, T. Fang, L. Quan. ECCV 2016.
Multi-View Inverse Rendering under Arbitrary Illumination and Albedo, K. Kim, A. Torii, M. Okutomi, ECCV 2016.
Shading-aware Multi-view Stereo, F. Langguth, K. Sunkavalli, S. Hadap, M. Goesele, ECCV 2016.
Scalable Surface Reconstruction from Point Clouds with Extreme Scale and Density Diversity, C. Mostegel, R. Prettenthaler, F. Fraundorfer and H. Bischof. CVPR 2017.
Multi-View Stereo with Single-View Semantic Mesh Refinement, A. Romanoni, M. Ciccone, F. Visin, M. Matteucci. ICCVW 2017
Machine Learning based MVS
Matchnet: Unifying feature and metric learning for patch-based matching, X. Han, Thomas Leung, Y. Jia, R. Sukthankar, A. C. Berg. CVPR 2015.
Stereo matching by training a convolutional neural network to compare image patches, J. Zbontar and Y. LeCun. JMLR 2016.
Efficient deep learning for stereo matching, W. Luo, A. G. Schwing, R. Urtasun. CVPR 2016.
Learning a multi-view stereo machine, A. Kar, C. Häne, J. Malik. NIPS 2017.
Learned multi-patch similarity, W. Hartmann, S. Galliani, M. Havlena, L. Van Gool, K. Schindler. ICCV 2017.
Surfacenet: An end-to-end 3d neural network for multiview stereopsis, M. Ji, J. Gall, H. Zheng, Y. Liu, L. Fang. ICCV 2017.
DeepMVS: Learning Multi-View Stereopsis, P. Huang, K. Matzen, J. Kopf, N. Ahuja, J. Huang. CVPR 2018.
RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials, D. Paschalidou, A. O. Ulusoy, C. Schmitt, L. Van Gool, A. Geiger. CVPR 2018.
MVSNet: Depth Inference for Unstructured Multi-view Stereo, Y. Yao, Z. Luo, S. Li, T. Fang, L. Quan. ECCV 2018.
Learning Unsupervised Multi-View Stereopsis via Robust Photometric Consistency, T. Khot, S. Agrawal, S. Tulsiani, C. Mertz, S. Lucey, M. Hebert. 2019.
DPSNet: End-to-end Deep Plane Sweep Stereo, Sunghoon Im, Hae-Gon Jeon, Stephen Lin, In So Kweon. 2019.
Point-based Multi-view Stereo Network, Rui Chen, Songfang Han, Jing Xu, Hao Su. ICCV 2019.
Multiple View Mesh Texturing
Seamless image-based texture atlases using multi-band blending. C. Allène, J-P. Pons and R. Keriven. ICPR 2008.
Let There Be Color! - Large-Scale Texturing of 3D Reconstructions. M. Waechter, N. Moehrle, M. Goesele. ECCV 2014.
UAV Trajectory Optimization for model completeness
Submodular Trajectory Optimization for Aerial 3D Scanning. M. Roberts, A. Truong, D. Dey, S. Sinha, A. Kapoor, N. Joshi, P. Hanrahan. 2017.
OpenSource resources
OpenSource SfM (Structure from Motion)
Project | Language | License |
---|---|---|
Bundler | C++ | GNU General Public License - contamination |
Colmap | C++ | BSD 3-clause license - Permissive |
TeleSculptor | C++ | BSD 3-Clause license - Permissive |
MicMac | C++ | CeCILL-B |
MVE | C++ | BSD 3-Clause license + parts under the GPL 3 license |
OpenMVG | C++ | MPL2 - Permissive |
OpenSfM | Python | Simplified BSD license - Permissive |
TheiaSfM | C++ | New BSD license - Permissive |
OpenSource Multiple View Geometry Library Solvers
Project | Language | License |
---|---|---|
OpenGV | C++ | BSD - permissive |
OpenSource MVS (Multiple View Stereovision)
Project | Language | License |
---|---|---|
Colmap | C++ CUDA | BSD 3-clause license - Permissive (Can use CGAL -> GNU General Public License - contamination) |
Gipuma + fusibile | C++ CUDA | GNU General Public License - contamination |
HPMVS | C++ | GNU General Public License - contamination |
MICMAC | C++ | CeCILL-B |
MVE | C++ | BSD 3-Clause license + parts under the GPL 3 license |
OpenMVS | C++ (CUDA optional) | AGPL3 |
PMVS | C++ CUDA | GNU General Public License - contamination |
SMVS Shading-aware Multi-view Stereo | C++ | BSD-3-Clause license |
OpenSource SLAM (Simultaneous Localization And Mapping)
Project | Language | License |
---|---|---|
COSLAM | C++ | GNU General Public License |
DSO-Direct Sparse Odometry | C++ | GPLv3 |
DTSLAM-Deferred Triangulation SLAM | C++ | modified BSD |
LSD-SLAM | C++/ROS | GNU General Public License |
MAPLAB-ROVIOLI | C++/ROS | Apache v2.0 |
OKVIS: Open Keyframe-based Visual-Inertial SLAM | C++ | BSD |
ORB-SLAM | C++ | GPLv3 |
REBVO - Realtime Edge Based Visual Odometry for a Monocular Camera | C++ | GNU General Public License |
SVO semi-direct Visual Odometry | C++/ROS | GNU General Public License |
Large scale image retrieval / CBIR (Content Based Image Retrieval)
Project | Language | License |
---|---|---|
DBoW2 | C++ | modified BSD License |
libvot | C++ | BSD 3-Clause License |
VocabTree2 | C++ | BSD License |
OpenSource minimization
Project | Language | License |
---|---|---|
CERES SOLVER | C++ | BSD License |
GTSAM | C++ | BSD License |
G2O | C++ | BSD License + L/GPL3 restriction |
NLOPT | C++ | LGPL |
Nearest Neighbor Search
Project | Language | License |
---|---|---|
ANN | C++ | GNU General Public License |
Annoy | C++ | Apache License |
FLANN | C++ | BSD License |
Libnabo | C++ | BSD License |
Nanoflann | C++ | BSD License |
Mesh storage processing
Project | Language | License |
---|---|---|
3DTK | C++ | GPLv3 |
CGAL | C++ | Module dependent GPL/LGPL |
InstantMesh Mesh Simplification | C++ | BSD License |
GEOGRAM | C++ | Revised BSD License |
libigl | C++ | MPL2 |
Mesh-processing-library | C++ | MIT License |
Open3D | C++ | MIT License |
OpenMesh | C++ | BSD 3 clause license |
PCL | C++ | 3-clause BSD license |
VCG | C++ | GPL |
Features
Features detection/Description
From handcrafted to deep local features. G. Csurka, C. R. Dance, M. Humenberger. 2018.
Project | Detection | Description |
---|---|---|
AKAZE | x | MSURF/MLDB |
DART | x | x |
KAZE | x | MSURF/MLDB |
LIOP/MIOP | | x |
LIFT (machine learning) | x | x |
MROGH | | x |
SIFT | x | x |
SURF | x | x |
SFOP | x | |
... |
"Real time" oriented methods
Project | Detection | Description |
---|---|---|
BRIEF | | x |
BRISK | x | x |
FAST | x | |
FREAK | | x |
FRIF | x | x |
HIPS | x | |
LATCH | | x |
MOPS | x | |
PhonySIFT | Multi-scale FAST | Reduced SIFT grid |
ORB | Multi-scale FAST | Oriented BRIEF |
Datasets with ground truth - Reproducible research
Feature detection/description repeatability
VGG Oxford 8 datasets with GT homographies + MATLAB code.
Hannover - Region Detector Evaluation Data Set Similar to the previous one (5 datasets). The datasets come in multiple image resolutions and with more precise GT homographies.
DTU - Robot Image Data Sets - Point Feature Data Set 60 scenes with known calibration & different illuminations.
Corresponding interest point patches for descriptor learning
Corresponding patches, saved with a canonical scale and orientation.
Multi-view Stereo Correspondence Dataset
HPatches Dataset linked to the ECCV16 workshop "Local Features: State of the art, open problems and performance evaluation"
Monocular odometry dataset
Mono dataset 50 real-world sequences. Dataset linked to the DSO Visual Odometry paper.
MVS - Point Cloud - Surface accuracy
Middlebury Multi-view Stereo See "A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms". CVPR 2006.
Dense MVS See "On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery". CVPR 2008.
DTU - Robot Image Data Sets - MVS Data Set See "Large Scale Multi-view Stereopsis Evaluation". CVPR 2014.
A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos in Unstructured Scenes, T. Schöps, J. L. Schönberger, S. Galliani, T. Sattler, K. Schindler, M. Pollefeys, A. Geiger. CVPR 2017.
Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction, A. Knapitsch, J. Park, Q.-Y. Zhou and V. Koltun. SIGGRAPH 2017.