1 Introduction
The paper introduces:
• a metric for the textureness of each image pixel, which serves as a proxy for how reliable the photoconsistency metric is.
• a subdivision of the image into superpixels: at each iteration of the optimization procedure, one plane is fitted to each superpixel; for each pixel, the resulting depth-normal hypothesis is integrated into the optimization together with the textureness defined above, taking into account the likelihood of the plane fitting procedure.
• a novel depth refinement method that filters the depth and normal maps and fills each missing estimate with an approximate bilateral weighted median of its neighbors.
2 Patch-Match for Multiview Stereo
PatchMatch defines the pixel-level consistency between patches of two images and uses a collaborative search that exploits local coherency, i.e., initialization → propagation → update.
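The initialization → propagation → update loop can be illustrated with a toy example. The sketch below is a simplified 1D version over a discrete set of hypotheses, not the plane-based multi-view formulation; the function name `patchmatch_1d` and the cost table `cost[x, d]` are assumptions made for illustration.

```python
import numpy as np

def patchmatch_1d(cost, iterations=4, seed=0):
    """Toy PatchMatch: cost[x, d] is the matching cost of hypothesis d at pixel x."""
    rng = np.random.default_rng(seed)
    n, m = cost.shape
    # Initialization: one random hypothesis per pixel.
    labels = rng.integers(0, m, size=n)
    for it in range(iterations):
        # Alternate the sweep direction so good hypotheses travel both ways.
        forward = it % 2 == 0
        order = range(n) if forward else range(n - 1, -1, -1)
        step = -1 if forward else 1  # neighbor already visited in this sweep
        for x in order:
            # Update: keep the current hypothesis or try a random one.
            candidates = [labels[x], rng.integers(0, m)]
            # Propagation: also try the hypothesis of the visited neighbor.
            if 0 <= x + step < n:
                candidates.append(labels[x + step])
            labels[x] = min(candidates, key=lambda d: cost[x, d])
    return labels
```

Because neighboring pixels tend to share the same good hypothesis, a single lucky guess spreads along the sweep instead of being rediscovered at every pixel.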
Heise et al. produce smoother depth estimates while preserving edge discontinuities, by regularizing the estimate with quadratic relaxation.
https://www.cv-foundation.org/openaccess/content_iccv_2013/papers/Heise_PM-Huber_PatchMatch_with_2013_ICCV_paper.pdf
Shen estimates a depth map for each selected pair of cameras. The algorithm refines the depth maps by enforcing consistency among multiple views, and finally merges the depth maps into a point cloud. Galliani et al. aggregate, for each reference camera, a set of matching costs computed from different source images. One of the major drawbacks of these approaches is that depth estimation and camera pair selection are decoupled.
Q1: Why is coupling depth estimation and camera pair selection said to be a drawback?
A1: It is the other way round: the decoupled depth estimation and camera pair selection is the drawback.
Rather than considering the whole set of images to compute the matching costs, Zheng et al. [31] proposed an elegant method to deal with view selection. They framed the joint depth estimation and pixel-wise view selection problem into a variational approximation framework. Following a generalized Expectation Maximization paradigm, they alternate between depth update with PatchMatch, keeping the view selection fixed, and pixel-wise view inference with the forward-backward algorithm, keeping the depth fixed. I did not fully understand this part, but the method estimates only depth. Schönberger et al. therefore introduced normal estimation to avoid the fronto-parallel assumption.
They then add view-dependent priors to select the views that are more likely to yield a robust matching cost computation.
The paper proposes two proxies to improve the reconstruction where untextured areas appear: (1) extend the probabilistic framework to explicitly detect and handle untextured regions by extending the set of PatchMatch hypotheses; (2) complete the depth estimation with a refinement procedure that fills the missing depth estimates.
3 Review of the COLMAP framework
KL divergence: https://www.cnblogs.com/hxsyl/p/4910218.html
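As a reminder of what the KL divergence measures, here is a minimal sketch for discrete distributions. It uses the standard definition KL(p || q) = Σ p_i log(p_i / q_i); the function name `kl_divergence` is an assumption, and the code assumes q is positive wherever p is.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as arrays summing to 1."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))
```

The value is zero when the two distributions coincide and grows as they diverge; it is not symmetric in p and q.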
4 Textureness-Aware Joint PatchMatch and View Selection
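Segment images into superpixels such that each superpixel spans a region of the image with mostly homogeneous texture and is likely to stop at an image edge. The depth/normal estimates from the regions around edges, where photoconsistency is stable, are propagated to the whole superpixel. We assume that the first iteration of the framework presented in Section 3 has been run, so that a first estimate of the depth map is available; it is reliable only in correspondence with highly textured regions.
Q2: How does a superpixel "span" and "stop"?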
A2: See M. Van den Bergh, X. Boix, G. Roig, and L. Van Gool. SEEDS: Superpixels extracted via energy-driven sampling. International Journal of Computer Vision, 111(3):298–314, 2015.
SEEDS algorithm overview: https://blog.csdn.net/qq_36386112/article/details/77851111
Superpixel segmentation: https://www.zhihu.com/question/27623988
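4.1 Piecewise planar hypotheses generation
Method: augment the set of PatchMatch depth hypotheses.
step1 Extract the superpixels Sk of each image by means of the SEEDS algorithm.
step2 After running the first iteration of depth estimation, filter out the small isolated speckles from the resulting depth map. In the filtered map, the region covered by Sk usually contains a set of reliable 3D points Pinl; in a weakly textured region these points tend to lie on the border of the region.
step3 Compute the proportion of inliers within Pinl (points closer than 10 cm to the plane fitted to the superpixel). If the inlier ratio exceeds Vran, update the current hypothesis with the tentative depth hypothesis; that is, hypotheses are propagated from superpixels with a good inlier ratio to their neighbors.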
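The plane fitting and inlier ratio of step3 can be sketched as follows. This is a minimal RANSAC sketch under stated assumptions: the function name `fit_plane_ransac`, the iteration count and the seed are hypothetical, points are assumed to be in meters (so the 10 cm threshold is 0.10), and the notes do not give the value of Vran.

```python
import numpy as np

def fit_plane_ransac(points, threshold=0.10, iterations=200, seed=0):
    """Fit a plane n.x + d = 0 to an (N, 3) array; return (n, d, inlier_ratio)."""
    rng = np.random.default_rng(seed)
    best = (None, None, 0.0)
    for _ in range(iterations):
        # Three distinct points define a candidate plane.
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:  # collinear sample, no plane defined
            continue
        normal = normal / norm
        d = -normal.dot(sample[0])
        # Point-to-plane distance for every reliable 3D point.
        distances = np.abs(points @ normal + d)
        ratio = float(np.mean(distances < threshold))
        if ratio > best[2]:
            best = (normal, d, ratio)
    return best
```

The returned ratio is what would be compared against Vran to decide whether the plane of the superpixel is trusted as a source of hypotheses.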
Q3: "we sample from Nk proportionally to the Bhattacharyya distance among the RGB histograms of sk and the elements of Nk (the set of neighboring superpixels)"
A3: The Bhattacharyya distance measures the similarity of two discrete or continuous probability distributions.
Color histograms: https://www.cnblogs.com/qing1991/p/10165743.html
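For two normalized histograms, the Bhattacharyya distance can be computed as in the sketch below. It uses the standard definition D = -ln Σ sqrt(p_i q_i); some libraries expose a differently normalized variant under the same name, so this is not necessarily the exact form used in the paper. The function name and the `eps` guard are assumptions.

```python
import numpy as np

def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance between two normalized histograms."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Bhattacharyya coefficient: 1 for identical histograms, 0 for disjoint ones.
    coefficient = np.sum(np.sqrt(p * q))
    # Guard against log(0) when the histograms do not overlap at all.
    return float(-np.log(max(coefficient, eps)))
```

A distance of 0 means the two superpixels have identical color distributions; the value grows as the histograms overlap less.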
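With a small number of superpixels (coarse), large areas of the image are well covered, but weakly textured regions are merged improperly. Conversely, with a large number of superpixels (fine), large areas are underestimated. Two new hypotheses are therefore introduced: the depth and normal estimates from the coarse segmentation and from the fine segmentation.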
4.2 Textureness-Aware Hypotheses Integration
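Texture strength is measured by the variance. For weak texture, tx is close to 1.
Weak texture: w+ is close to 1.0. Strong texture: w+ is close to 0.8.
Weak texture: w- is close to 0.8. Strong texture: w- is close to 1.0.
So in weakly textured regions, Cphoto in (4) is multiplied by a gain close to 0.8, while the coarse and fine hypotheses are multiplied by a gain close to 1. The optimization in weakly textured regions therefore favors the new coarse and fine hypotheses over the previous six hypotheses.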
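The behavior of the two weights can be sketched numerically. The notes record only the end values (0.8 and 1.0), not the formula from the paper, so the linear blend below is an assumption made for illustration; the function name `textureness_weights` is hypothetical, and tx follows the convention of these notes (close to 1 for weak texture).

```python
def textureness_weights(t_x, low=0.8, high=1.0):
    """Return (w_plus, w_minus) for a textureness value t_x in [0, 1].

    Assumed linear blend: only the end values come from the notes.
    """
    # w+ scales the coarse and fine planar hypotheses: high where texture is weak.
    w_plus = low + (high - low) * t_x
    # w- scales the photometric term Cphoto: low where texture is weak.
    w_minus = high - (high - low) * t_x
    return w_plus, w_minus
```

With tx = 1 this gives (1.0, 0.8) and with tx = 0 it gives (0.8, 1.0), matching the end values listed above.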
5 Joint Depth and Normal Depth Refinement
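step1 Apply a classical speckle filter to remove small blobs containing non-continuous depth values.
step2 Recover the black dots in (b). As in the bilateral NCC computation, a bilaterally weighted average is used, with coefficients that depend on how close each neighboring pixel is to the missing pixel in image space and in color space (a better approach is using a weighted median of depth).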
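A missing depth value can be filled with a bilaterally weighted median as sketched below. This is an exact weighted median over a small window, not the approximate variant the paper mentions; the function names, the window radius, the two sigmas, the use of a grayscale image and the NaN encoding of missing depths are all assumptions.

```python
import numpy as np

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cumulative = np.cumsum(w)
    return v[np.searchsorted(cumulative, 0.5 * cumulative[-1])]

def fill_missing_depth(depth, gray, y, x, radius=2, sigma_s=2.0, sigma_c=10.0):
    """Estimate depth[y, x] from valid neighbors; missing depths are NaN."""
    h, w = depth.shape
    values, weights = [], []
    for j in range(max(0, y - radius), min(h, y + radius + 1)):
        for i in range(max(0, x - radius), min(w, x + radius + 1)):
            if np.isnan(depth[j, i]):
                continue  # skip the missing pixel itself and other holes
            spatial = (j - y) ** 2 + (i - x) ** 2
            color = (float(gray[j, i]) - float(gray[y, x])) ** 2
            # Bilateral weight: near in the image and similar in intensity.
            weight = np.exp(-spatial / (2 * sigma_s ** 2) - color / (2 * sigma_c ** 2))
            values.append(depth[j, i])
            weights.append(weight)
    if not values:
        return np.nan
    return weighted_median(np.array(values), np.array(weights))
```

Unlike a weighted average, the median returns one of the observed depths, so it does not create intermediate values across a depth discontinuity.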
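Q4: The step2 passage above.
A4: Bins of a color histogram: in a color histogram the horizontal axis is the color space (the counterpart of the gray level in a grayscale histogram) and the vertical axis is the number of pixels with that color. This gives too many values on the horizontal axis, many of them with very few pixels, so the histogram is very sparse. The color space is therefore divided into small color intervals: for each color channel (R, G, B), every 32 values are grouped into one bin, so each channel has 8 bins, i.e., each channel can take only 8 values, which gives 8^3 = 512 combinations in total. Does this mean that bilateral filtering is applied only to the points that fall in the bin holding the most frequent depth value in the neighborhood of the missing point?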
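The binning arithmetic in A4 can be checked with a short sketch. The function name `color_bin_index` and the row-major ordering of the three channels are assumptions; only the bin width of 32 and the total of 512 bins come from the notes.

```python
def color_bin_index(rgb, bin_width=32):
    """Map an 8-bit (R, G, B) triple to a single bin index in [0, 512)."""
    bins = 256 // bin_width  # 8 bins per channel when bin_width is 32
    r, g, b = (int(c) // bin_width for c in rgb)
    # Combine the three per-channel bins into one index: 8 * 8 * 8 = 512 values.
    return (r * bins + g) * bins + b
```

Counting how many pixels fall on each index gives the 512-bin color histogram described above.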