Stereo techniques have witnessed tremendous progress over the last decades, yet some aspects of the problem remain challenging today. Striking examples are reflecting and textureless surfaces, which cannot easily be recovered using traditional local regularizers. In this work, we therefore propose to regularize over larger distances using object-category specific disparity proposals (displets), which we sample using inverse graphics techniques based on a sparse disparity estimate and a semantic segmentation of the image. The proposed displets encode the fact that objects of certain categories are not arbitrarily shaped but typically exhibit regular structures. We integrate them as a non-local regularizer for the challenging object class 'car' into a superpixel-based CRF framework and demonstrate its benefits on the KITTI stereo evaluation. Our approach currently ranks first across all KITTI stereo leaderboards. The figure above depicts the result of our method on a challenging image. The left figure shows the input image with the inferred object wireframe models overlaid. The right figure depicts the jointly inferred disparity map.
Introduction
Model
In the energy function of our CRF, i ~ j denotes the set of adjacent superpixels in S. In addition to the classic data term and pairwise constraints, we introduce long-range interactions into our model using displets: the displet unary potential (third term) encourages image regions with semantic class label c to be explained by a displet of the corresponding class. The last term ensures that each displet and its associated superpixels in the image are consistent.
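The four terms described above can be made concrete with a small toy sketch. It is not the formulation from the paper or the released Matlab/C++ code: the `Displet` class, the `energy` function, the weights, the absolute-difference penalties, and the use of one scalar disparity per superpixel are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Displet:
    """Hypothetical container: a disparity proposal for one object instance."""
    disparity: dict  # superpixel index -> proposed disparity
    active: bool     # whether the proposal is switched on


def energy(disp, observed, edges, is_car, displets,
           w_pair=1.0, w_displet=2.0, w_cons=5.0):
    """Evaluate a toy four-term energy (all weights are made-up values)."""
    # 1) Data term: keep each superpixel close to its sparse observation.
    data = sum(abs(disp[i] - observed[i]) for i in observed)
    # 2) Pairwise term: adjacent superpixels (i ~ j) should agree.
    pair = w_pair * sum(abs(disp[i] - disp[j]) for i, j in edges)
    # 3) Displet unary: reward active displets for covering 'car' superpixels.
    unary = -w_displet * sum(
        sum(1 for i in k.disparity if is_car[i])
        for k in displets if k.active)
    # 4) Consistency: superpixels under an active displet must match it.
    cons = w_cons * sum(
        abs(disp[i] - d)
        for k in displets if k.active
        for i, d in k.disparity.items())
    return data + pair + unary + cons


# Toy scene: three adjacent 'car' superpixels; superpixel 1 is reflective
# and therefore has no reliable sparse observation.
observed = {0: 10.0, 2: 10.0}
edges = [(0, 1), (1, 2)]
is_car = [True, True, True]
car = Displet(disparity={0: 10.0, 1: 10.0, 2: 10.0}, active=True)

smooth = [10.0, 10.0, 10.0]  # agrees with the displet
noisy = [10.0, 4.0, 10.0]    # wrong disparity on the reflective superpixel

print(energy(smooth, observed, edges, is_car, [car]))  # -6.0
print(energy(noisy, observed, edges, is_car, [car]))   # 36.0
```

In this toy setting the data term alone cannot tell the two solutions apart, because superpixel 1 has no observation. The active displet supplies the missing long-range constraint and makes the consistent solution clearly cheaper.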
Results
Below, we show three qualitative results in terms of the inferred displets as well as the influence of the displets on the geometry of the superpixels. The influence is encoded in the alpha channel: transparent = no influence, solid = large influence.
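An alpha encoding of this kind could be produced along the following lines. This is only a sketch: the function name `influence_to_rgba`, the linear mapping, and the clamping are assumptions, and the mapping used for the actual figures is not specified here.

```python
def influence_to_rgba(color, influence, max_influence):
    """Return an (R, G, B, A) tuple whose alpha grows linearly with influence.

    Assumed convention: A = 0 is fully transparent (no influence),
    A = 255 is fully solid (large influence).
    """
    if max_influence <= 0:
        return (*color, 0)
    # Clamp the normalized influence to [0, 1] before scaling to 8 bits.
    ratio = min(max(influence / max_influence, 0.0), 1.0)
    return (*color, round(255 * ratio))


red = (255, 0, 0)
print(influence_to_rgba(red, 0.0, 1.0))  # (255, 0, 0, 0)
print(influence_to_rgba(red, 1.0, 1.0))  # (255, 0, 0, 255)
```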
Below, we visualize our results in terms of inferred disparity maps on three different examples. Top-left: input image, top-right: semantic segments from ALE, bottom-left: input disparity map obtained via semi-global matching (SGM), bottom-right: our result.
Videos
Changelog
- 18.05.2016: Updated code and data files to work with MC-CNN-accurate and to produce results on KITTI 2015.
- 28.08.2015: First version online!
Download
- Paper (pdf, 4 MB)
- Extended Abstract (pdf, 1 MB)
- Supplementary Material (pdf, 10 MB)
- Poster (pdf, 8 MB)
- Slides (pdf, 11 MB)
- Displets Code (Matlab/C++, 5 MB)
- Displets Data (3D Models/pre-processed data, required to run the displets demo, 3 GB)
- Semantic and instance labels for all cars in the KITTI stereo 2012 training set (optional, 1 MB)
- Semi-convex Hull Code for creating low-res motion models (optional)
Citation
@INPROCEEDINGS{Guney2015CVPR,
author = {Fatma Güney and Andreas Geiger},
title = {Displets: Resolving Stereo Ambiguities using Object Knowledge},
booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2015}
}