SURF C++代码详细阅读（一）

Harry-Wang

已于 2022-10-10 17:45:35 修改

阅读量1.6k

点赞数 1

分类专栏： SURF 文章标签： c++ 计算机视觉图像处理

于 2022-10-10 16:55:52 首次发布

本文链接：https://blog.csdn.net/harry_wang211/article/details/127211299

版权

SURF 专栏收录该内容

3 篇文章 2 订阅

订阅专栏

SURF C++代码详细阅读（一）

SURF算法概述
代码阅读

SURF算法概述

SURF是尺度不变特征变换算法（SIFT）的加速版，比SIFT要快好几倍（≈4），SURF算法用来完成特征点检测并构造相应描述符，可以应用于物体识别、图像拼接、3D重构等领域，由于其速度快、鲁棒性高，尤其为实时应用（目标检测、目标跟踪、视频拼接、3D重建）提供了可能。

特征点和描述符检测效果如下：
可以看到检测到的特征点和相应描述符
蓝色代表亮背景的暗点，红色代表暗背景的亮点，线段方向为主方向
在这里插入图片描述
关于SURF的原理有很多，在此不作说明。

代码阅读

源码来自opensurf和surf图像拼接开源项目并稍作修改：
http://opensurf1.googlecode.com/files/OpenSURFcpp.zip
https://github.com/ziqiguo/CS205-ImageStitching/tree/master/surf_sequential

surfDetDes

1.实现main函数中的方法需要调用即surflib 中的surfDetDes函数，surfDetDes函数为surfDet和surfDes函数结合，检测特征点并构造描述符。

//! Library function builds vector of described interest points
inline void surfDetDes(IplImage *img,    /* image to find Ipoints in */
                                             std::vector<Ipoint> &ipts, /* reference to vector of Ipoints */
                                             int single_mem_cpy,
                                             bool upright = false, /* run in rotation invariant mode? */
                                             int octaves = OCTAVES, /* number of octaves to calculate */
                                             int intervals = INTERVALS, /* number of intervals per octave */
                                             int init_sample = INIT_SAMPLE, /* initial sampling step */
                                             float thres = THRES /* blob response threshold */)
{
    // Create integral-image representation of the image
    IplImage *int_img = Integral(img);
    
    // Create Fast Hessian Object
    FastHessian fh(int_img, ipts, single_mem_cpy, octaves, intervals, init_sample, thres);

    // Extract interest points and store in vector ipts
    clock_t start = clock();
    fh.getIpoints();
    clock_t end = clock();
    std::cout<< "Extract keypoints took: " << float(end - start) / CLOCKS_PER_SEC    << " seconds" << std::endl;

    // Create Surf Descriptor Object
    Surf des(int_img, ipts);
  
    // Extract the descriptors for the ipts
    start = clock();
    des.getDescriptors(upright);
    end = clock();
    std::cout<< "Extract descriptor took: " << float(end - start) / CLOCKS_PER_SEC    << " seconds" << std::endl;

    // Deallocate the integral image
    cvReleaseImage(&int_img);
}

以 inline 修饰的函数叫做内联函数，编译时C++编译器会在调用内联函数的地方展开，没有函数压栈的开销，提升程序运行的效率。
在之后金字塔构造时也会用到内联函数

主要包括以下模块：

创建积分图像 IplImage *int_img = Integral(img);
创建快速海森矩阵 FastHessian fh(int_img, ipts, single_mem_cpy, octaves, intervals, init_sample, thres);
提取特征点 fh.getIpoints();
创建描述符对象并得到描述符 des.getDescriptors(upright);

并通过clock()时钟函数以 float(end - start) / CLOCKS_PER_SEC返回每一步用时。

1.创建积分图像

目的：加速运算。每个像素的灰度都是它与坐标原点(0,0)形成的对角线的矩形内的所有像素的灰度值之和。在计算某个矩阵框内的像素灰度值之和时，就可以很快得出结果。

//! Computes the integral image of image img.    Assumes source image to be a 
//! 32-bit floating point.    Returns IplImage of 32-bit float form.

/* 计算图像的积分图 */
IplImage *Integral(IplImage *source)
{
    // convert the image to single channel 32f
    IplImage *img = getGray(source); //灰度图
    IplImage *int_img = cvCreateImage(cvGetSize(img), IPL_DEPTH_32F, 1);// 建一个宽高与原图一致的单精度浮点数（32位）单通道图像

    // set up variables for data access
    int height = img->height;
    int width  = img->width;
    int step   = img->widthStep/sizeof(float); //存储一行像素需要的字节数（4的倍数）/每一个像素的字节数4 = 每一行的像素数
    float *data     = (float *) img->imageData;    //图像像素的首指针
    float *i_data   = (float *) int_img->imageData;    

    // first row only
	// 第一行的情况
    float rs = 0.0f;
    for(int j=0; j<width; j++) 
    {
        rs += data[j]; 
        i_data[j] = rs;
    }

    // remaining cells are sum above and to the left
    //i++需要一个暂时变量，然后将i加1后，返回的是暂时变量。而++i就是自增后返回i。在空间损耗上，i++要略高于++i，因此，在不影响代码逻辑的前提下，要尽量使用++i。
    //其他行 
    for(int i=1; i<height; ++i) 
    {
        rs = 0.0f;
        for(int j=0; j<width; ++j) //后续每个像素块把上方左方求和 上方和即积分图上一行像素值 左方和遍历一次
        {
            rs += data[i*step+j]; 
            i_data[i*step+j] = rs + i_data[(i-1)*step+j];
        }
    }

    // release the gray image
    //cvReleaseImage后，会设置此时图片占用的内存数据为NULL，但是该内存地址不会被释放，当有新图片使用cvLoadImage读时，会放在这个被置位NULL的内存地址处
    cvReleaseImage(&img);

    // return the integral image
    return int_img;
}

widthStep：表示存储一行像素需要的字节数。在cv::Iplimage属性中，widthStep必须是4的倍数，实现字节对齐，有利于提高运算速度。

循环中的++i： i++需要一个暂时变量，然后将i加1后，返回的是暂时变量。而++i就是自增后返回i。在空间损耗上，i++要略高于++i，因此，在不影响代码逻辑的前提下，要尽量使用++i。

cvReleaseImage：设置此时图片占用的内存数据为NULL，但是该内存地址不会被释放，当有新图片使用cvLoadImage读时，会放在这个被置位NULL的内存地址处。
补程序四模块：初始化、运算、控制、释放。

创建灰度图（单精度浮点数（32位）单通道），首先将第一行相加，后续每个像素块把上方左方求和上方和即积分图上一行像素值左方和遍历一次。
积分函数会在之后的海森矩阵、盒滤波器计算中用到。

2.创建快速海森矩阵

FastHessian类主要包含：

构造函数
获取特征点 getIpoints() 是最主要的函数，其需要用到其他函数包括：
构建并计算海森矩阵行列式响应图（DoH） void buildResponseMap(); buildResponseLayer(ResponseLayer *r)
判断极值 isExtremum
插值函数（改编SIFT）interpolateExtremum

class FastHessian {
    
    public:
     
        //! Constructor without image
        FastHessian(std::vector<Ipoint> &ipts, 
                                const int single_mem_cpy,
                                const int octaves = OCTAVES, 
                                const int intervals = INTERVALS, 
                                const int init_sample = INIT_SAMPLE, 
                                const float thres = THRES);

        //! Constructor with image
        FastHessian(IplImage *img, 
                                std::vector<Ipoint> &ipts,
                                const int single_mem_cpy, 
                                const int octaves = OCTAVES, 
                                const int intervals = INTERVALS, 
                                const int init_sample = INIT_SAMPLE, 
                                const float thres = THRES);

        //! Destructor
        ~FastHessian();

        //! Save the parameters
        void saveParameters(const int octaves, 
                                                const int intervals,
                                                const int init_sample, 
                                                const float thres,
                                                const int single_mem_cpy);

        //! Set or re-set the integral image source
        void setIntImage(IplImage *img);

        //! Find the image features and write into vector of features
        void getIpoints();
        
    private:

        //---------------- Private Functions -----------------//

        //! Build map of DoH responses
        void buildResponseMap();

        //! Calculate DoH responses for supplied layer
        void buildResponseLayer(ResponseLayer *r);

        //! 3x3x3 Extrema test
        int isExtremum(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b);        
        
        //! Interpolation functions - adapted from Lowe's SIFT implementation
        void interpolateExtremum(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b);//, Ipoint* ipts_tmp, int* changed);
        void interpolateStep(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b,
                                                    double* xi, double* xr, double* xc );
        double** deriv3D(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b);
        double** hessian3D(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b);

        //---------------- Private Variables -----------------//

        //! Pointer to the integral Image, and its attributes 
        IplImage *img;
        int i_width, i_height;

        int single_mem_cpy;

        //! Reference to vector of features passed from outside 
        std::vector<Ipoint> &ipts;

        //! Response stack of determinant of hessian values
        std::vector<ResponseLayer *> responseMap;

        //! Number of Octaves
        int octaves;

        //! Number of Intervals per octave
        int intervals;

        //! Initial sampling step for Ipoint detection
        int init_sample;

        //! Threshold value for blob resonses
        float thresh;

};

2.1构造函数

构造一个关于点向量或图像的FastHessian类都要设置一次参数saveParameters
图像还要设置初始图像

注意img是积分图像不然会影响后边计算DoH的逻辑理解

//! Constructor without image
FastHessian::FastHessian(std::vector<Ipoint> &ipts, const int single_mem_cpy,
                                                 const int octaves, const int intervals, const int init_sample, 
                                                 const float thresh) 
                                                 : ipts(ipts), i_width(0), i_height(0)
{
    // Save parameter set
    saveParameters(octaves, intervals, init_sample, thresh, single_mem_cpy);
}

//-------------------------------------------------------

//! Constructor with image 注意初始化图像是积分图像
FastHessian::FastHessian(IplImage *img, std::vector<Ipoint> &ipts, const int single_mem_cpy,
                                                 const int octaves, const int intervals, const int init_sample, 
                                                 const float thresh) 
                                                 : ipts(ipts), i_width(0), i_height(0)
{
    // Save parameter set
    saveParameters(octaves, intervals, init_sample, thresh, single_mem_cpy);

    // Set the current image
    setIntImage(img);
}

每次设置参数（小写）超出范围则使用全局静态常量（大写）

默认构造金字塔层数5 每一层间隔（DoH响应图）数4 采样因子2 阈值0.0004

//! Save the parameters
//! 每次设置参数 超出范围则使用全局静态常量
void FastHessian::saveParameters(const int octaves, const int intervals, 
                                            const int init_sample, const float thresh, const int single_mem_cpy)
{
    // Initialise variables with bounds-checked values
    this->octaves = 
        (octaves > 0 && octaves <= 4 ? octaves : OCTAVES);
    this->intervals = 
        (intervals > 0 && intervals <= 4 ? intervals : INTERVALS);
    this->init_sample = 
        (init_sample > 0 && init_sample <= 6 ? init_sample : INIT_SAMPLE);
    this->thresh = (thresh >= 0 ? thresh : THRES);

    this->single_mem_cpy = single_mem_cpy;
}

2.2 获取特征点

//! Find the image features and write into vector of features
void FastHessian::getIpoints()
{
    // filter index map
    static const int filter_map [OCTAVES][INTERVALS] = {{0,1,2,3}, {1,3,4,5}, {3,5,6,7}, {5,7,8,9}, {7,9,10,11}};

    // Clear the vector of exisiting ipts
    ipts.clear();

    std::clock_t start;
    start = std::clock();

    // Build the response map
    buildResponseMap();

    std::cout << "buildResponseMap took: " << (std::clock() - start) / (double)(CLOCKS_PER_SEC / 1000) << " ms" << std::endl;

    start = std::clock();

    // Get the response layers 判断是否是极值点
    ResponseLayer *b, *m, *t;

    for (int o = 0; o < octaves; ++o) 
    {
        for (int i = 0; i <= 1; ++i)
        {
            b = responseMap.at(filter_map[o][i]);
            m = responseMap.at(filter_map[o][i+1]);
            t = responseMap.at(filter_map[o][i+2]);

            // Ipoint ipts_tmp[t->height][t->width];
            // int changed[t->height][t->width];

            // loop over middle response layer at density of the most 
            // sparse layer (always top), to find maxima across scale and space

            
            for (int r = 0; r < t->height; ++r)
            {
                for (int c = 0; c < t->width; ++c)
                {
                    if (isExtremum(r, c, t, m, b))
                    {
                        interpolateExtremum(r, c, t, m, b);//, &ipts_tmp[r][c], &changed[r][c]);
                    }
                }
            }

            // for (int r = 0; r < t->height; ++r)
            // {
            //     for (int c = 0; c < t->width; ++c)
            //     {
            //         if (changed[r][c] == 1)
            //         {
            //             ipts.push_back(ipts_tmp[r][c]);
            //         }
            //     }
            // }
        }
    }

     std::cout << "Ipoint vector took: " << (std::clock() - start) / (double)(CLOCKS_PER_SEC / 1000) << " ms" << std::endl;
}

其过程为：
初始化过滤后的图像（DoH）static const int filter_map [OCTAVES][INTERVALS]
创建并计算DoH响应图 buildResponseMap();
在临近的三幅DoH图（ResponseLayer *b, *m, *t）中寻找极值点并进行插值

内容多，分篇写。这一篇写到：计算海森矩阵行列式响应图（DoH）

2.2.1 创建DoH响应图 buildResponseMap()

//! Build map of DoH responses
void FastHessian::buildResponseMap()
{
    // Calculate responses for the first 4 octaves:
    // Oct1: 9,  15, 21, 27
    // Oct2: 15, 27, 39, 51
    // Oct3: 27, 51, 75, 99
    // Oct4: 51, 99, 147,195
    // Oct5: 99, 195,291,387

    // Deallocate memory and clear any existing response layers
    //删除分配内存和响应层
    for(unsigned int i = 0; i < responseMap.size(); ++i)    
        delete responseMap[i];
    responseMap.clear();

    // Get image attributes
    int w = (i_width / init_sample);
    int h = (i_height / init_sample);
    int s = (init_sample);//采样因子

    int rsize;

    // Calculate approximated determinant of hessian values
    //计算hessian值的近似行列式
    //参数为 宽、高、采样因子、盒滤波器大小
    if (octaves >= 1)
    {
        responseMap.push_back(new ResponseLayer(w,     h,     s,     9));
        responseMap.push_back(new ResponseLayer(w,     h,     s,     15));
        responseMap.push_back(new ResponseLayer(w,     h,     s,     21));
        responseMap.push_back(new ResponseLayer(w,     h,     s,     27));
        rsize = 4;
    }
 
    if (octaves >= 2)
    {
        responseMap.push_back(new ResponseLayer(w/2, h/2, s*2, 39));
        responseMap.push_back(new ResponseLayer(w/2, h/2, s*2, 51));
        rsize = 6;
    }

    if (octaves >= 3)
    {
        responseMap.push_back(new ResponseLayer(w/4, h/4, s*4, 75));
        responseMap.push_back(new ResponseLayer(w/4, h/4, s*4, 99));
        rsize = 8;
    }

    if (octaves >= 4)
    {
        responseMap.push_back(new ResponseLayer(w/8, h/8, s*8, 147));
        responseMap.push_back(new ResponseLayer(w/8, h/8, s*8, 195));
        rsize = 10;
    }

    if (octaves >= 5)
    {
        responseMap.push_back(new ResponseLayer(w/16, h/16, s*16, 291));
        responseMap.push_back(new ResponseLayer(w/16, h/16, s*16, 387));
        rsize = 12;
    }


    for (unsigned int i = 0; i < responseMap.size(); ++i)
    {
        buildResponseLayer(responseMap[i]);
    }
}

五层金字塔盒滤波器大小在注释中，用过ResponseLayer创建12个不同大小的盒滤波器，利用尺度（盒滤波器大小）的重复性，这12个盒滤波器就可以代表5层Octaves每层4个DoH共5*4=20个盒滤波器响应层。
接着每一个盒滤波器响应层通过函数buildResponseLayer(responseMap[i]);得到
这里responseMap是一个响应层数组，每个元素为ResponseLayer

std::vector<ResponseLayer *> responseMap;

ResponseLayer结构

ResponseLayer(int width, int height, int step, int filter)
    {
        assert(width > 0 && height > 0);
        
        this->width = width;
        this->height = height;
        this->step = step;//采样因子
        this->filter = filter;//盒滤波的大小

        responses = new float[width * height];//存放每个大小的filer下的图像像素点的海森矩阵的行列式值 
        //开辟一个存放float数的空间，返回一个指向该存储空间的地址
        laplacian = new unsigned char[width * height];//存放laplacian标识符，判断是暗背景上的亮斑还是亮背景的暗斑
        //开辟一个存放unsigned char的空间，返回一个指向该存储空间的地址

        memset(responses,0,sizeof(float)*width*height);//申请的内存进行初始化
        memset(laplacian,0,sizeof(unsigned char)*width*height);
    }

2.2.2 计算DoH响应层

SURF的金字塔和SIFT大致结构一样，主要区别在：且用盒滤波器直接近似了高斯滤波模糊和拉普拉斯二阶导求极值，所以盒滤波器过滤图，即Hessian矩阵行列式的图像(DoH)，就相当于SIFT算法中的高斯差分图像(DoG)，且一副原图生产一副DoH（用积分图计算），而SIFT金字塔通过不同尺度图像做差产生DoG。
附一张SIFT金字塔做对比

在这里插入图片描述

DoH的计算过程

//! Calculate DoH responses for supplied layer
void FastHessian::buildResponseLayer(ResponseLayer *rl)
{
    float *responses = rl->responses;                 // response storage
    unsigned char *laplacian = rl->laplacian; // laplacian sign storage
    int step = rl->step;                                            // step size for this filter
    int b = (rl->filter - 1) / 2 + 1;                 // border for this filter
    int l = rl->filter / 3;                                     // lobe for this filter (filter size / 3)
    int w = rl->filter;                                             // filter size
    float inverse_area = 1.f/(w*w);                     // normalisation factor
    float Dxx, Dyy, Dxy;

    int height = rl->height;
    int width = rl->width;

    int img_step = img->widthStep/sizeof(float); //每一行的像素个数
    int img_height = img->height;
    int img_width = img->width;

    float *img_data = (float *) img->imageData;

    int r, c;

    {
    for(int ar = 0; ar < height; ++ar) 
    {  
        for(int ac = 0; ac < width; ++ac) 
        {
            // get the image coordinates
            r = ar * step;//这里的step为盒滤波器的采样因子（构造函数的第三参数），不是图像每一行的像素个数
            c = ac * step; 

            // Compute response components
            // 利用积分图计算海森矩阵的行列式近似值
            //参数：图像首地址 r1 c1 r2与r1差 c2与c1差 每行像素个数 高 宽 
            Dxx = BoxIntegral_acc(img_data, r - l + 1, c - b, 2*l - 1, w, img_step, img_height, img_width)
                    - BoxIntegral_acc(img_data, r - l + 1, c - l / 2, 2*l - 1, l, img_step, img_height, img_width)*3;
            Dyy = BoxIntegral_acc(img_data, r - b, c - l + 1, w, 2*l - 1, img_step, img_height, img_width)
                    - BoxIntegral_acc(img_data, r - l / 2, c - l + 1, l, 2*l - 1, img_step, img_height, img_width)*3;
            Dxy = BoxIntegral_acc(img_data, r - l, c + 1, l, l, img_step, img_height, img_width)
                        + BoxIntegral_acc(img_data, r + 1, c - l, l, l, img_step, img_height, img_width)
                        - BoxIntegral_acc(img_data, r - l, c - l, l, l, img_step, img_height, img_width)
                        - BoxIntegral_acc(img_data, r + 1, c + 1, l, l, img_step, img_height, img_width);

            // Normalise the filter responses with respect to their size
            Dxx *= inverse_area;
            Dyy *= inverse_area;
            Dxy *= inverse_area;

            responses[ar*width + ac] = (Dxx * Dyy - 0.81 * Dxy * Dxy);//海森矩阵行列式 0.81为近似（积分图+盒滤波近似代替高斯平滑和二阶导）的修正
            laplacian[ar*width + ac] = (Dxx + Dyy >= 0 ? 1 : 0);
            //求得的 近似Hessian矩阵行列式的图像(DoH) 就相当于 SIFT算法中的高斯差分图像(DoG)
        }
    }

    }

}

注意step与img_step的区别
r = ar * step; //这里的step为盒滤波器的采样因子（构造函数的第三参数），不是图像每一行的像素个数

参数含义 b：滤波器中心离边缘的距离；l：滤波器大小/3 ； w：滤波器中心离边缘的距离

这里举例大小为9的盒滤波器，图像中每一点的Dxx,Dyy,Dxy（从左到后）分别由以下滤波器形状与原图像素卷积后形成，滤波器中白色和黑色总和区域系数为+1，黑色区域为-3，灰色区域为0

在这里插入图片描述

按其中Dxx计算举例如下：

Dxx = BoxIntegral_acc(img_data, r - l + 1, c - b, 2*l - 1, w, img_step, img_height, img_width)
                    - BoxIntegral_acc(img_data, r - l + 1, c - l / 2, 2*l - 1, l, img_step, img_height, img_width)*3;

BoxIntegral_acc参数：图像首地址 r1 c1 r2与r1差 c2与c1差每行像素个数高宽
将图像任意一点(r,c)的Dxx卷积计算过程如下，：

请添加图片描述边缘时积分图像会取图像的边缘，见BoxIntegral_acc代码
注意这里的img_data为积分图像的首地址，在构造FastHessian时的属性img即为积分图像。

inline float BoxIntegral_acc(float *img_data, int row, int col, int rows, int cols, int step, int height, int width) 
{
  
  // The subtraction by one for row/col is because row/col is inclusive.
  int r1 = std::min(row,          height) - 1;
  int c1 = std::min(col,          width)  - 1;
  int r2 = std::min(row + rows,   height) - 1;
  int c2 = std::min(col + cols,   width)  - 1;

  float A(0.0f), B(0.0f), C(0.0f), D(0.0f);
  if (r1 >= 0 && c1 >= 0) A = img_data[r1 * step + c1];
  if (r1 >= 0 && c2 >= 0) B = img_data[r1 * step + c2];
  if (r2 >= 0 && c1 >= 0) C = img_data[r2 * step + c1];
  if (r2 >= 0 && c2 >= 0) D = img_data[r2 * step + c2];

  return std::max(0.f, A - B - C + D);
}