Sift：(第二步）关键点位置的确定

zzz_zzzz_

已于 2023-09-07 19:25:27 修改

阅读量286

点赞数 1

分类专栏： sift 文章标签：图像处理

于 2023-08-12 16:14:35 首次发布

本文链接：https://blog.csdn.net/zzz_zzzz_/article/details/132244753

版权

sift 专栏收录该内容

7 篇文章 2 订阅

订阅专栏

上一步：Sift:（第一步）建立高斯差分金字塔_zzz_zzzz_的博客-CSDN博客

图片的关键点，往往是具有稳定性质的点，包含较多信息的点，往往是处于极值的位置

首先，进行阈值化处理

当前检测点的值要满足 $abs(val)> \frac{0.5*T}{s}$ 这里的T取经验值0.04，s是要从多少张图片中检测极值点（高斯金字塔每组层数为s+3，高斯金字塔每组层数为s+2)。如果小于我们设定的这个值，说明这个点不稳定，可认为是噪声，我们不保留，就不必进入下一步了。

第二，进行关键点的初步探查

关键点是由高斯差分金字塔(DOG)的局部极值点组成的，关键点的初步探查是通过同一组各DOG相邻两层图像之间比较完成的。在图像的尺度空间中，如下图，检测一个点是否是极值点，要和它同尺度的周围的8个极值点以及相邻两个尺度对应的9*2个极值点比较，也就是要比较26个点。

如果当前的检测点是它周围点中的极值点，那么恭喜，它进入了下一步。

第三，关键点的定位

以上方法检测到的极值点是离散空间的极值点，并不是真正的极值点，只是它附近的一个点。下图显示了离散空间得到的极值点和连续空间极值点的区别。、

那么在离散空间极值点已知的基础上，怎么来找到真正连续空间的极值点呢？这就想到了泰勒公式了。泰勒公式，用一个函数在某点的信息描述其附近取值的公式。可以用函数在某一点的各阶导数做系数来构建一个多项式来近似表达这个函数，然后对多项式进行求导，令导数等于0来找到连续空间的极值点。之前学的泰勒公式都是一个未知数，而这里有三个未知数（x,y和尺度 $\delta$ ），我们设检测到的局部极值点 $X_{0}(x_{0},y_{0},\delta _{0})$ ，在此处做三元二阶泰勒展开：

我们得到了局部极值点与精确极值点之间的偏移量。

第四，舍去低对比度的点

上一步计算得到了f(X)，判断 $f(X)< \frac{T}{s}$ ，T还是取经验值0.04，s还是“要从多少张图片中检测极值点”。如果小于这个值，我们认为是噪声的可能性大，把它舍去。

第五，边缘效应的去除

首先要明白，这里的边缘是什么？不是我们直观上理解的图像的边儿，而是（见下图）：

非边缘位置的特征：在x和y方向上的梯度差不多。

引入海森矩阵：

海森矩阵的特征值 $\alpha$ 和 $\beta$ 代表x和y方向的梯度。但是计算矩阵的特征值往往比较复杂，那么根据矩阵的性质，我们看矩阵的迹和行列式：

假设 $\alpha$ > $\beta$ ， $\alpha$ = $\gamma \beta$ ，则

上式左边是一个对勾函数，当 $\gamma$ >1，对勾函数单调递增。要想在x和y方向上的梯度差不多，对应海森矩阵的特征值 $\alpha$ 和 $\beta$ 相差不大，再对应 $\gamma$ 要小于一定的值（认为这个值取经验值10），再对应矩阵的 $\frac{Tr(H)^{2}}{Det(H)}$ 要小于 $\frac{(10+1)^{2}}{10}$ 。

也就是 $\frac{Tr(H)^{2}}{Det(H)}$ < $\frac{(10+1)^{2}}{10}$ 的留下。

以上都是这一大环节的理论部分。

还有重要的一部分是 【有限差分求导】

Lowe,D 的SIFT代码

//第二步：找特征点的大框架 
/*
  Detects features at extrema in DoG scale space.  Bad features are discarded
  based on contrast and ratio of principal curvatures.

  @param dog_pyr DoG scale space pyramid（高斯差分金字塔） 
  @param octvs octaves of scale space represented by dog_pyr（高斯金字塔的组数/大层数）
  @param intvls intervals per octave（想要从高斯差分金字塔的多少张图片中提取特征，高斯差分金字塔每组的层数intvls+2）
  @param contr_thr low threshold on feature contrast（阈值化用到的经验值0.04） 
  @param curv_thr high threshold on feature ratio of principal curvatures（边缘效应去除用到的经验值10） 
  @param storage memory storage in which to store detected features（存储特征点的内存存储器） 

  @return Returns an array of detected features whose scales, orientations,
    and descriptors are yet to be determined.
*/
static CvSeq* scale_space_extrema( IplImage*** dog_pyr, int octvs, int intvls,
				   double contr_thr, int curv_thr,
				   CvMemStorage* storage )
{
  CvSeq* features;//一个可增长的序列，不是固定序列，用来存放找到的特征点结构体的序列 
  double prelim_contr_thr = 0.5 * contr_thr / intvls;//阈值化用到的参数 
  struct feature* feat;//特征点 
  struct detection_data* ddata;
  int o, i, r, c;

  features = cvCreateSeq( 0, sizeof(CvSeq), sizeof(struct feature), storage );//创建特征点序列空间 
  for( o = 0; o < octvs; o++ )//每组 
    for( i = 1; i <= intvls; i++ ) //组内处于中间位置的层 
      for(r = SIFT_IMG_BORDER; r < dog_pyr[o][0]->height-SIFT_IMG_BORDER; r++)//SIFT_IMG_BORDER为5，检测点的位置不包括图像的边儿 
	for(c = SIFT_IMG_BORDER; c < dog_pyr[o][0]->width-SIFT_IMG_BORDER; c++)
	  /* perform preliminary check on contrast */
	  if( ABS( pixval32f( dog_pyr[o][i], r, c ) ) > prelim_contr_thr ) //阈值化，当前检测点的值大于0.5*0.04/intvls才会进入下一步 
	    if( is_extremum( dog_pyr, o, i, r, c ) )//调用函数 is_extremum，判断是否为局部极值点 
	    {
		feat = interp_extremum(dog_pyr, o, i, r, c, intvls, contr_thr); //调用函数interp_extremum，如果找到精确极值点返回特征点结构体类型，否则返回NULL 
		if( feat )
		  {
		    ddata = feat_detection_data( feat );
		    if( ! is_too_edge_like( dog_pyr[ddata->octv][ddata->intvl],
					    ddata->r, ddata->c, curv_thr ) ) //调用函数is_too_edge_like，是边缘返回1，不是边缘返回0 
		      {
                          cvSeqPush( features, feat );//将feat压入features序列中 
		      }
		    else //是边缘 
		      free( ddata );
		    free( feat );
		  }
	    }
  return features;
}

判断是否为局部极值点

//判断是否为局部极值点 
/*
  Determines whether a pixel is a scale-space extremum by comparing it to it's
  3x3x3 pixel neighborhood.

  @param dog_pyr DoG scale space pyramid（高斯差分金字塔） 
  @param octv pixel's scale space octave（检测点所在高斯差分金字塔的组号）
  @param intvl pixel's within-octave interval（检测点所在高斯差分金字塔的层号） 
  @param r pixel's image row（当前检测点的x坐标） 
  @param c pixel's image col（当前监测点的y坐标） 

  @return Returns 1 if the specified pixel is an extremum (max or min) among
    it's 3x3x3 pixel neighborhood.
*/
static int is_extremum( IplImage*** dog_pyr, int octv, int intvl, int r, int c )
{
  double val = pixval32f( dog_pyr[octv][intvl], r, c );//当前监测点的值 
  int i, j, k;

  /* check for maximum */
  if( val > 0 ) //大于0，判断是否为极大值 
    {
      for( i = -1; i <= 1; i++ )
	for( j = -1; j <= 1; j++ )
	  for( k = -1; k <= 1; k++ )
	    if( val < pixval32f( dog_pyr[octv][intvl+i], r + j, c + k ) )
	      return 0; //返回0则不是局部极值点 
    }

  /* check for minimum */
  else//否则，判断是否为极小值 
    {
      for( i = -1; i <= 1; i++ )
	for( j = -1; j <= 1; j++ )
	  for( k = -1; k <= 1; k++ )
	    if( val > pixval32f( dog_pyr[octv][intvl+i], r + j, c + k ) )
	      return 0; //返回0则不是局部极值点 
    }

  return 1;//返回1则是局部极值点 
}

由局部极值点寻找精确极值

//得到精确极值 
/*
  Interpolates a scale-space extremum's location and scale to subpixel
  accuracy to form an image feature.  Rejects features with low contrast.
  Based on Section 4 of Lowe's paper.  

  @param dog_pyr DoG scale space pyramid（高斯差分金字塔） 
  @param octv feature's octave of scale space（检测点所在高斯差分金字塔的组号）
  @param intvl feature's within-octave interval（检测点所在高斯差分金字塔的层号）
  @param r feature's image row（当前检测点的x坐标） 
  @param c feature's image column（当前检测点的y坐标） 
  @param intvls total intervals per octave（组数） 
  @param contr_thr threshold on feature contrast（0.04） 

  @return Returns the feature resulting from interpolation of the given
    parameters or NULL if the given location could not be interpolated or
    if contrast at the interpolated loation was too low.  If a feature is
    returned, its scale, orientation, and descriptor are yet to be determined.
*/
static struct feature* interp_extremum( IplImage*** dog_pyr, int octv,
					int intvl, int r, int c, int intvls,
					double contr_thr )
{
  struct feature* feat;
  struct detection_data* ddata;
  double xi, xr, xc, contr;
  int i = 0;
  
  while( i < SIFT_MAX_INTERP_STEPS ) // SIFT_MAX_INTERP_STEPS为5，迭代最大次数 
    {
      interp_step( dog_pyr, octv, intvl, r, c, &xi, &xr, &xc );//调用interp_step，用指针带回三个的偏移量（尺度上，y上，x上） 
      if( ABS( xi ) < 0.5  &&  ABS( xr ) < 0.5  &&  ABS( xc ) < 0.5 )  //如果三个偏移量都小于0.5（经验值） 
	break;//跳出while循环，不必进行迭代了 
    
      //当它在任一维度上的偏移量大于0.5时
	  //意味着插值中心已经偏移到它的邻近点上
	  //需要改变当前点的位置 
      c += cvRound( xc );
      r += cvRound( xr );
      intvl += cvRound( xi );
      
      if( intvl < 1  ||
	  intvl > intvls  ||
	  c < SIFT_IMG_BORDER  ||
	  r < SIFT_IMG_BORDER  ||
	  c >= dog_pyr[octv][0]->width - SIFT_IMG_BORDER  ||
	  r >= dog_pyr[octv][0]->height - SIFT_IMG_BORDER )//新位置的要求，尺度不能超过最上和最下sigma的范围，(x,y)不能超出划定的边界 
	{
	  return NULL; //偏移过了，就直接返回NULL（该局部极值点处，没有进一步找到精确极值点） 
	}
      
      i++;
    }
  
  /* ensure convergence of interpolation */
  if( i >= SIFT_MAX_INTERP_STEPS ) // SIFT_MAX_INTERP_STEPS为5 
    return NULL;//迭代次数大于等于5次，返回NULL 
  
  contr = interp_contr( dog_pyr, octv, intvl, r, c, xi, xr, xc ); //调用函数interp_contr，计算f(X) 
  if( ABS( contr ) < contr_thr / intvls ) //舍去低对比度的点 
    return NULL;

  feat = new_feature(); //调用函数new_feature
  ddata = feat_detection_data( feat );//头文件中有定义 
  //以下都是特征点相关参数的赋值 
  feat->img_pt.x = feat->x = ( c + xc ) * pow( 2.0, octv );//联系下采样 
  feat->img_pt.y = feat->y = ( r + xr ) * pow( 2.0, octv );//对应最底层x，y位置 
  ddata->r = r;
  ddata->c = c;
  ddata->octv = octv;
  ddata->intvl = intvl;
  ddata->subintvl = xi;

  return feat;
}

根据泰勒公式计算偏移量

//在当前检测点做三元二阶泰勒展开 
/*
  Performs one step of extremum interpolation.  Based on Eqn. (3) in Lowe's
  paper.

  @param dog_pyr difference of Gaussians scale space pyramid（高斯差分金字塔） 
  @param octv octave of scale space（检测点所在高斯差分金字塔的组号）
  @param intvl interval being interpolated（检测点所在高斯差分金字塔的层号）
  @param r row being interpolated（当前检测点的x坐标）
  @param c column being interpolated（当前检测点的y坐标）
  @param xi output as interpolated subpixel increment to interval(尺度上的偏移量） 
  @param xr output as interpolated subpixel increment to row（上的偏移量） 
  @param xc output as interpolated subpixel increment to col（上的偏移量） 
*/

static void interp_step( IplImage*** dog_pyr, int octv, int intvl, int r, int c,
			 double* xi, double* xr, double* xc )
{
  CvMat* dD, * H, * H_inv, X;
  double x[3] = { 0 };
  
  dD = deriv_3D( dog_pyr, octv, intvl, r, c );//调用deriv_3D，有限差分求导求当前点处的三个一阶偏导 
  H = hessian_3D( dog_pyr, octv, intvl, r, c );//调用hessian_3D，有限差分求导求当前点处的9个二阶偏导 
  H_inv = cvCreateMat( 3, 3, CV_64FC1 );//3行3列的矩阵 
  cvInvert( H, H_inv, CV_SVD );//求H的逆矩阵 
  cvInitMatHeader( &X, 3, 1, CV_64FC1, x, CV_AUTOSTEP );//参数：(指针指向要被初始化的矩阵头,行,列,元素类型,数据,行宽) 
  //直接带计算好了的公式（二阶泰勒展开，求导，令导等于0，求得） 
  cvGEMM( H_inv, dD, -1, NULL, 0, &X, 0 );//矩阵通用乘法函数,参数1与参数2相乘，系数为参数3，参数4和参数5为偏移量和偏移量系数，结果放在参数6中，参数7为转置标志 
  
  //释放 
  cvReleaseMat( &dD );
  cvReleaseMat( &H );
  cvReleaseMat( &H_inv );

  //通过指针带回调用函数 
  *xi = x[2];
  *xr = x[1];
  *xc = x[0];
}

（有限差分法）计算当前位置的三个一阶偏导

//检测点处的三个一阶偏导（有限差分求导） 
/*
  Computes the partial derivatives in x, y, and scale of a pixel in the DoG
  scale space pyramid.

  @param dog_pyr DoG scale space pyramid（高斯差分金字塔） 
  @param octv pixel's octave in dog_pyr（检测点所在高斯差分金字塔的组号）
  @param intvl pixel's interval in octv（检测点所在高斯差分金字塔的层号）
  @param r pixel's image row（当前检测点的x坐标）
  @param c pixel's image col（当前检测点的y坐标）

  @return Returns the vector of partial derivatives for pixel I
    { dI/dx, dI/dy, dI/ds }^T as a CvMat*
*/
static CvMat* deriv_3D( IplImage*** dog_pyr, int octv, int intvl, int r, int c )
{
  CvMat* dI;
  double dx, dy, ds;

  dx = ( pixval32f( dog_pyr[octv][intvl], r, c+1 ) -
	 pixval32f( dog_pyr[octv][intvl], r, c-1 ) ) / 2.0;
  dy = ( pixval32f( dog_pyr[octv][intvl], r+1, c ) -
	 pixval32f( dog_pyr[octv][intvl], r-1, c ) ) / 2.0;
  ds = ( pixval32f( dog_pyr[octv][intvl+1], r, c ) -
	 pixval32f( dog_pyr[octv][intvl-1], r, c ) ) / 2.0;
  //以上都是有限差分求导公式而来 
  
  dI = cvCreateMat( 3, 1, CV_64FC1 );//3行1列的矩阵 
  cvmSet( dI, 0, 0, dx );//(0,0)元素为dx 
  cvmSet( dI, 1, 0, dy );//(1,0)元素为dy 
  cvmSet( dI, 2, 0, ds );//(2,0)元素为ds 

  return dI;
}

（有限差分求导）计算当前位置的二阶偏导

//检测点处的六个二阶偏导（有限差分求导） 
/*
  Computes the 3D Hessian matrix for a pixel in the DoG scale space pyramid.

  @param dog_pyr DoG scale space pyramid（高斯差分金字塔） 
  @param octv pixel's octave in dog_pyr（检测点所在高斯差分金字塔的组号）
  @param intvl pixel's interval in octv（检测点所在高斯差分金字塔的层号）
  @param r pixel's image row（当前检测点的x坐标）
  @param c pixel's image col（当前检测点的y坐标）

  @return Returns the Hessian matrix (below) for pixel I as a CvMat*

  / Ixx  Ixy  Ixs \ <BR>
  | Ixy  Iyy  Iys | <BR>
  \ Ixs  Iys  Iss /
*/
static CvMat* hessian_3D( IplImage*** dog_pyr, int octv, int intvl, int r,
			  int c )
{
  CvMat* H;
  double v, dxx, dyy, dss, dxy, dxs, dys;
  
  v = pixval32f( dog_pyr[octv][intvl], r, c );
  dxx = ( pixval32f( dog_pyr[octv][intvl], r, c+1 ) + 
	  pixval32f( dog_pyr[octv][intvl], r, c-1 ) - 2 * v );
  dyy = ( pixval32f( dog_pyr[octv][intvl], r+1, c ) +
	  pixval32f( dog_pyr[octv][intvl], r-1, c ) - 2 * v );
  dss = ( pixval32f( dog_pyr[octv][intvl+1], r, c ) +
	  pixval32f( dog_pyr[octv][intvl-1], r, c ) - 2 * v );
  dxy = ( pixval32f( dog_pyr[octv][intvl], r+1, c+1 ) -
	  pixval32f( dog_pyr[octv][intvl], r+1, c-1 ) -
	  pixval32f( dog_pyr[octv][intvl], r-1, c+1 ) +
	  pixval32f( dog_pyr[octv][intvl], r-1, c-1 ) ) / 4.0;
  dxs = ( pixval32f( dog_pyr[octv][intvl+1], r, c+1 ) -
	  pixval32f( dog_pyr[octv][intvl+1], r, c-1 ) -
	  pixval32f( dog_pyr[octv][intvl-1], r, c+1 ) +
	  pixval32f( dog_pyr[octv][intvl-1], r, c-1 ) ) / 4.0;
  dys = ( pixval32f( dog_pyr[octv][intvl+1], r+1, c ) -
	  pixval32f( dog_pyr[octv][intvl+1], r-1, c ) -
	  pixval32f( dog_pyr[octv][intvl-1], r+1, c ) +
	  pixval32f( dog_pyr[octv][intvl-1], r-1, c ) ) / 4.0;
	//以上都是有限差分求导公式而来 
  
  H = cvCreateMat( 3, 3, CV_64FC1 );//3行3列的矩阵
  cvmSet( H, 0, 0, dxx );//(0,0)元素为dxx 
  cvmSet( H, 0, 1, dxy );//(0,1)元素为dxy 
  cvmSet( H, 0, 2, dxs );//(0,2)元素为dxs 
  cvmSet( H, 1, 0, dxy );//(1,0)元素为dxy 
  cvmSet( H, 1, 1, dyy );//(1,1)元素为dyy 
  cvmSet( H, 1, 2, dys );//(1,2)元素为dys 
  cvmSet( H, 2, 0, dxs );//(2,0)元素为dxs 
  cvmSet( H, 2, 1, dys );//(2,1)元素为dys 
  cvmSet( H, 2, 2, dss );//(2,2)元素为dss 

  return H;
}

为舍去对比度低的点计算f(X)

//计算f(X) 
/*
  Calculates interpolated pixel contrast.  Based on Eqn. (3) in Lowe's
  paper.

  @param dog_pyr difference of Gaussians scale space pyramid（高斯差分金字塔） 
  @param octv octave of scale space（当前点在高斯差分金字塔的组号）
  @param intvl within-octave interval（当前点在高斯差分金字塔的层号）
  @param r pixel row（当前点的x坐标）
  @param c pixel column（当前点的y坐标）
  @param xi interpolated subpixel increment to interval(尺度上的偏移量） 
  @param xr interpolated subpixel increment to row（y上的偏移量） 
  @param xc interpolated subpixel increment to col（x上的偏移量） 

  @param Returns interpolated contrast.
*/
static double interp_contr( IplImage*** dog_pyr, int octv, int intvl, int r,
			    int c, double xi, double xr, double xc )
{
  CvMat* dD, X, T;
  double t[1], x[3] = { xc, xr, xi };

  cvInitMatHeader( &X, 3, 1, CV_64FC1, x, CV_AUTOSTEP );//三行一列 （xc,xr,xi)^T 
  cvInitMatHeader( &T, 1, 1, CV_64FC1, t, CV_AUTOSTEP );//一行一列 
  dD = deriv_3D( dog_pyr, octv, intvl, r, c );//调用函数deriv_3D，求得dD={dx dy ds}^T 
  cvGEMM( dD, &X, 1, NULL, 0, &T,  CV_GEMM_A_T );//矩阵通用乘法函数，CV_GEMM_A_T表示将第一个参数转置 
  cvReleaseMat( &dD );//释放 

  return pixval32f( dog_pyr[octv][intvl], r, c ) + t[0] * 0.5; //带f(X)公式 
}

为feature分配空间并初始化

//分配空间并初始化feature 
/*
  Allocates and initializes a new feature

  @return Returns a pointer to the new feature
*/
static struct feature* new_feature( void )
{
  struct feature* feat;
  struct detection_data* ddata;

  feat = malloc( sizeof( struct feature ) );
  memset( feat, 0, sizeof( struct feature ) );//feat指向的struct feature大小的空间用0初始化 
  ddata = malloc( sizeof( struct detection_data ) );
  memset( ddata, 0, sizeof( struct detection_data ) );
  feat->feature_data = ddata;
  feat->type = FEATURE_LOWE;

  return feat;
}

边缘效应的去除

//边缘效应的去除 
/*
  Determines whether a feature is too edge like to be stable by computing the
  ratio of principal curvatures at that feature.  Based on Section 4.1 of
  Lowe's paper.

  @param dog_img image from the DoG pyramid in which feature was detected
  @param r feature row
  @param c feature col
  @param curv_thr high threshold on ratio of principal curvatures

  @return Returns 0 if the feature at (r,c) in dog_img is sufficiently
    corner-like or 1 otherwise.
*/
static int is_too_edge_like( IplImage* dog_img, int r, int c, int curv_thr )
{
  double d, dxx, dyy, dxy, tr, det;

  /* principal curvatures are computed using the trace and det of Hessian */
  d = pixval32f(dog_img, r, c);
  dxx = pixval32f( dog_img, r, c+1 ) + pixval32f( dog_img, r, c-1 ) - 2 * d;
  dyy = pixval32f( dog_img, r+1, c ) + pixval32f( dog_img, r-1, c ) - 2 * d;
  dxy = ( pixval32f(dog_img, r+1, c+1) - pixval32f(dog_img, r+1, c-1) -
	  pixval32f(dog_img, r-1, c+1) + pixval32f(dog_img, r-1, c-1) ) / 4.0;
  tr = dxx + dyy;//海森矩阵的迹 = 特征值相加 
  det = dxx * dyy - dxy * dxy;//海森矩阵的行列式 = 特征值相乘 

  /* negative determinant -> curvatures have different signs; reject feature */
  if( det <= 0 ) //两个特征值异号 
    return 1; //是边缘 

  if( tr * tr / det < ( curv_thr + 1.0 )*( curv_thr + 1.0 ) / curv_thr ) //curv_thr为10，tr * tr / det对勾函数 
    return 0; //不是边缘 
  return 1;
}

这一步骤看上去代码很长，但很多都是直接代入公式来的。

下一步：sift：（第三步）为关键点赋予方向_zzz_zzzz_的博客-CSDN博客