【立体匹配之二】SGBM

最新推荐文章于 2024-04-14 08:59:10 发布

北络

最新推荐文章于 2024-04-14 08:59:10 发布

阅读量8.6k

点赞数 3

分类专栏：立体校正文章标签： SGBM

本文链接：https://blog.csdn.net/wangbaodong070411209/article/details/87894062

版权

立体校正专栏收录该内容

4 篇文章 1 订阅

订阅专栏

概念

立体匹配主要是利用多幅图像还原三维世界的空间信息。通常，立体稠密匹配可以分为4个基本步骤：

Matching cost computation，计算左图一个像素和右图一个像素之间的代价。
Cost aggregation: connects the matching cost within a certain neighborhood，一般基于点之间的匹配很容易受噪声的影响，往往真实匹配的像素的代价并不是最低。所以有必要在点的周围建立一个区域，让像素周围之间进行比较，到底一个综合的结果。
Disparity computation: selects the disparity with the lowest matching cost，这一步可以区分局部算法与全局算法，局部算法直接优化代价聚合模型，输出一个粗略的视差图。全局算法，要建立一个能量函数，能量函数的数据项往往就是代价聚合公式，例如DoubleBP。
Disparity refinement: removing peaks, interpolating gaps or increasing the accuracy by sub-pixel interpolation，对上一步得到的粗估计的视差图进行精确计算，策略有很多，例如plane fitting，插值，BP，动态规划等。
上述只是立体匹配的一个框架，SGM论文也同样按照这样来组织的，真正的实现还要靠具体实现。下面就按照论文的顺序来详细了解下SGM算法。

SGBM源码分析

opencv sgbm修改自Heiko Hirschmuller的《Stereo Processing by Semi-global Matching and Mutual Information》，与原方法不同点：

没有实现原文中基于互信息的匹配代价计算，而是采用BT算法（“Depth Discontinuities by Pixel-to-Pixel Stereo” by S. Birchfield and C. Tomasi）；
默认运行单通道DP算法，只用了5个方向，而fullDP使能时则使用8个方向（可能需要占用大量内存）；
增加了一些BM算法中的预处理和后处理程序；

SGBM主要配置参数说明：
minDisparity：最小视差，默认为0。此参数决定左图中的像素点在右图匹配搜索的起点，int 类型；
numDisparities：视差搜索范围长度，其值必须为16的整数倍。最大视差 maxDisparity = minDisparity + numDisparities -1；
blockSize：SAD代价计算窗口大小，默认为5。窗口大小为奇数，一般在33 到2121之间；
P1、P2：能量函数参数，P1是相邻像素点视差增/减 1 时的惩罚系数；P2是相邻像素点视差变化值大于1时的惩罚系数。P2必须大于P1。需要指出，在动态规划时，P1和P2都是常数。
一般建议：P1 = 8cnsgbm.SADWindowSizesgbm.SADWindowSize；
P2 = 32cnsgbm.SADWindowSizesgbm.SADWindowSize；

代码片段一

typedef cv::Point_<short> Point2s;
// SGBM匹配流程
void StereoSGBM::operator ()( InputArray _left, InputArray _right,
                             OutputArray _disp )
{
    Mat left = _left.getMat(), right = _right.getMat();
    CV_Assert( left.size() == right.size() && left.type() == right.type() &&
              							left.depth() == DataType<PixType>::depth );
	// 视差图输出格式
    _disp.create( left.size(), CV_16S );
    // getMat()是一种获取矩阵的Mat的常用方法，不用额外的复制矩阵的数据
    Mat disp = _disp.getMat();
	// 核心函数，计算视差图
    computeDisparitySGBM( left, right, disp, *this, buffer );
    // 中值滤波
    medianBlur(disp, disp, 3);
	// 剔除小连连通区
    if( speckleWindowSize > 0 )
        filterSpeckles(disp, (minDisparity - 1)*DISP_SCALE, speckleWindowSize, DISP_SCALE*speckleRange, buffer);
}

上述代码调用了几个子函数，其中，computeDisparitySGBM是基于SGM+BT方法的视差图计算流程。

代码片段二

该函数又调用了calcPixelCostBT，也就是Birchfeld-Tomasi metric 来计算cost，具体函数如下：

/*
 For each pixel row1[x], max(maxD, 0) <= minX <= x < maxX <= width - max(0, -minD),
 and for each disparity minD<=d<maxD the function
 computes the cost (cost[(x-minX)*(maxD - minD) + (d - minD)]), depending on the difference between
 row1[x] and row2[x-d]. The subpixel algorithm from
 "Depth Discontinuities by Pixel-to-Pixel Stereo" by Stan Birchfield and C. Tomasi
 is used, hence the suffix BT.

 the temporary buffer should contain width2*2 elements
 */
static void calcPixelCostBT( const Mat& img1, const Mat& img2, int y,
                            int minD, int maxD, CostType* cost,
                            PixType* buffer, const PixType* tab,
                            int tabOfs, int , int xrange_min = 0, int xrange_max = DEFAULT_RIGHT_BORDER )
{	
	// default cn == 1
    int x, c, width = img1.cols, cn = img1.channels();
    int minX1 = std::max(maxD, 0), maxX1 = width + std::min(minD, 0); // 视差计算范围
    int D = maxD - minD, width1 = maxX1 - minX1;
    //This minX1 & maxX2 correction is defining which part of calculatable line must be calculated
    //That is needs of parallel algorithm
    xrange_min = (xrange_min < 0) ? 0: xrange_min;
    xrange_max = (xrange_max == DEFAULT_RIGHT_BORDER) || (xrange_max > width1) ? width1 : xrange_max;
    maxX1 = minX1 + xrange_max;
    minX1 += xrange_min;
    width1 = maxX1 - minX1;
    int minX2 = std::max(minX1 - maxD, 0), maxX2 = std::min(maxX1 - minD, width);
    int width2 = maxX2 - minX2;
	// Mat最直接的访问方法是通过.ptr<>函数得到一行的指针，并用[]操作符访问某一列的像素值
    const PixType *row1 = img1.ptr<PixType>(y), *row2 = img2.ptr<PixType>(y);
    PixType *prow1 = buffer + width2*2, *prow2 = prow1 + width*cn*2; // buffer后留下两行后面用
	// 查表
    tab += tabOfs;
	// initial
    for( c = 0; c < cn*2; c++ )
    {
        prow1[width*c] = prow1[width*c + width-1] = prow2[width*c] = prow2[width*c + width-1] = tab[0];
    }
	// 此次 img1.step 就是图像 width
    int n1 = y > 0 ? -(int)img1.step : 0, s1 = y < img1.rows-1 ? (int)img1.step : 0; // n1 是左图的上一行，s1 是左图的下一行
    int n2 = y > 0 ? -(int)img2.step : 0, s2 = y < img2.rows-1 ? (int)img2.step : 0;

    int minX_cmn = std::min(minX1,minX2)-1;
    int maxX_cmn = std::max(maxX1,maxX2)+1;
    minX_cmn = std::max(minX_cmn, 1);
    maxX_cmn = std::min(maxX_cmn, width - 1);
	// 这里我们假设输入的就是单通道
    if( cn == 1 )
    {
        for( x = minX_cmn; x < maxX_cmn; x++ )
        {
			// 梯度值
            prow1[x] = tab[(row1[x+1] - row1[x-1])*2 + row1[x+n1+1] - row1[x+n1-1] + row1[x+s1+1] - row1[x+s1-1]]; // x 方向sobel滤波
            prow2[width-1-x] = tab[(row2[x+1] - row2[x-1])*2 + row2[x+n2+1] - row2[x+n2-1] + row2[x+s2+1] - row2[x+s2-1]]; // 右图逆序存储
			// 像数值
            prow1[x+width] = row1[x]; // 第二行对应位置存储灰度值
            prow2[width-1-x+width] = row2[x];
        }
    }

    memset( cost + xrange_min*D, 0, width1*D*sizeof(cost[0]) );

    buffer -= width-1-maxX2; // 回退minD - 1步，不过要乘上视差数
    cost -= (minX1-xrange_min)*D + minD; // simplify the cost indices inside the loop

    for( c = 0; c < cn*2; c++, prow1 += width, prow2 += width )
    {
        int diff_scale = c < cn ? 0 : 2;

        // precompute
        //   v0 = min(row2[x-1/2], row2[x], row2[x+1/2]) and
        //   v1 = max(row2[x-1/2], row2[x], row2[x+1/2]) and
        // 其实应该是：v0 = min((row2[x-1] + row2[x])/2, row2[x], (row2[x] + row2[x + 1])/2)
        // 和        v1 = max((row2[x-1] + row2[x])/2, row2[x], (row2[x] + row2[x + 1])/2)

        // 如官方注释，开始计算B.T. metrics，理论可以参考函数前注释给出的论文Depth Discontinuities by Pixel-to-Pixel Stereo
        // 简单来说，就是对于左图像素x和在某个视差值对应的右图像素x-d，不再如SAD一样单纯计算I_l(x) - I_r(x-d),
        // 而是先拟合下右图像素x-d左右 亚像素精度的视差值，即(row2[x-d-1] + row2[x-d])/2和(row2[x-d] + row2[x-d+1])/2，
        // 再计算左图像素与这三者的差并取最小值，即下面的c0；对于左图像素做同样的处理，得到下面的c1
        for( x = width-1-maxX2; x < width-1- minX2; x++ ) // 预先计算出右图的结果并保存
        {
            int v = prow2[x];
            int vl = x > 0 ? (v + prow2[x-1])/2 : v;
            int vr = x < width-1 ? (v + prow2[x+1])/2 : v;
            int v0 = std::min(vl, vr); v0 = std::min(v0, v);
            int v1 = std::max(vl, vr); v1 = std::max(v1, v);
            buffer[x] = (PixType)v0;          // 第一行存最小值
            buffer[x + width2] = (PixType)v1; // 第二行存最大值
        }

        for( x = minX1; x < maxX1; x++ ) // 针对左图像素依次计算cost
        {
            int u = prow1[x];
            int ul = x > 0 ? (u + prow1[x-1])/2 : u;
            int ur = x < width-1 ? (u + prow1[x+1])/2 : u;
            int u0 = std::min(ul, ur); u0 = std::min(u0, u);
            int u1 = std::max(ul, ur); u1 = std::max(u1, u);
            {
                for( int d = minD; d < maxD; d++ )
                {
                    int v = prow2[width-x-1 + d]; // 之前右图是逆序存储，现在需要逆序取出，并加上视差值
                    int v0 = buffer[width-x-1 + d]; // 最小值
                    int v1 = buffer[width-x-1 + d + width2]; // 最大值
                    int c0 = std::max(0, u - v1); c0 = std::max(c0, v0 - u);
                    int c1 = std::max(0, v - u1); c1 = std::max(c1, u0 - v);
                    // cost计算。与原论文不同的是，opencv的B.T. metrics包含了两个部分，
                    // 一部分为prow1和prow2第一行所存储的左右图的sobel滤波结果的B.T. metrics，
                    // 一部分为prow1和prow2第二行所存储的左右图的灰度值的B.T. metrics
                    cost[x*D + d] = (CostType)(cost[x*D+d] + (std::min(c0, c1) >> diff_scale));
                }
            }
        }
    }
}

总结：

blockSize(SADWindowSize) 越小，也就是匹配代价计算的窗口越小，视差图噪声越大；blockSize越大，视差图越平滑；太大的size容易导致过平滑，并且误匹配增多，体现在视差图中空洞增多；
惩罚系数控制视差图的平滑度，P2>P1，P2越大则视差图越平滑；
八方向动态规划较五方向改善效果不明显，主要在图像边缘能够找到正确的匹配；

参考文章

迷雾forest：https://blog.csdn.net/wsj998689aa/column/info/wsjcolumn-1
ethan_1990：https://blog.csdn.net/rs_lys/article/details/83302323
SGBM配置参数：https://blog.csdn.net/cxgincsu/article/details/74451940
SGBM流程：https://blog.csdn.net/zhubaohua_bupt/article/details/51866567

北络

关注

3
点赞
踩
44

收藏

觉得还不错? 一键收藏
2
评论
【立体匹配之二】SGBM

双目图像稠密匹配可以分为4个基本步骤：Matching cost computation;Cost aggregation: connects the matching cost within a certain neighborhood;Disparity computation: selects the disparity with the lowest matching cost;...
复制链接

扫一扫