Dense Scale Invariant Feature Transform (DSIFT)

Author
Andrea Vedaldi
Brian Fulkerson

dsift.h implements a dense version of SIFT. This is an object that can quickly compute descriptors for densely sampled keypoints with identical size and orientation. It can be reused for multiple images of the same size.

Overview

See also
The SIFT module, Technical details

This module implements a fast algorithm for the calculation of a large number of SIFT descriptors of densely sampled features of the same scale and orientation. See the SIFT module for an overview of SIFT.

The feature frames (keypoints) are indirectly specified by the sampling steps (vl_dsift_set_steps) and the sampling bounds (vl_dsift_set_bounds). The descriptor geometry (number and size of the spatial bins and number of orientation bins) can be customized (vl_dsift_set_geometry, VlDsiftDescriptorGeometry).

Figure: Dense SIFT descriptor geometry (dsift-geom.png)

By default, SIFT uses a Gaussian windowing function that discounts contributions of gradients further away from the descriptor centers. This function can be changed to a flat window by invoking vl_dsift_set_flat_window. In this case, gradients are accumulated using only bilinear interpolation; instead of being reweighted by a Gaussian window, they are all weighted equally. After the gradients have been accumulated into a spatial bin, however, the whole bin is reweighted by the average of the Gaussian window over the spatial support of that bin. This approximation substantially improves speed with little or no loss of performance in applications.
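The per-bin reweighting used with the flat window can be illustrated with a minimal Python sketch. The function name, the parameter names and the choice of expressing the Gaussian standard deviation in units of bins are assumptions made for illustration; this is not the library's code. The sketch averages a 1-D Gaussian window, centered on the descriptor center, over the pixel support of one bilinear bin.

```python
import math

def bin_window_mean(bin_size, num_bins, bin_index, window_sigma_bins):
    """Average of a 1-D Gaussian window over the support of one spatial bin.

    bin_size          -- bin side in pixels (hypothetical parameter name)
    num_bins          -- number of bins along this axis
    bin_index         -- index of the bin, 0 .. num_bins - 1
    window_sigma_bins -- Gaussian standard deviation in units of bins (assumption)
    """
    # Offset of this bin's center from the descriptor center, in pixels.
    delta = bin_size * (bin_index - 0.5 * (num_bins - 1))
    sigma = bin_size * window_sigma_bins
    # A bilinear (triangular) bin covers the integer offsets -bin_size+1 .. bin_size-1.
    offsets = range(-bin_size + 1, bin_size)
    total = sum(math.exp(-0.5 * ((delta + x) / sigma) ** 2) for x in offsets)
    return total / len(offsets)

# One weight per bin along an axis; a 2-D bin weight is the product of its x and y weights.
weights = [bin_window_mean(4, 4, i, 2.0) for i in range(4)]
```

The weights are symmetric about the descriptor center and larger for the inner bins, which is the qualitative behavior of the Gaussian window they stand in for.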

Keypoints are sampled in such a way that the centers of the spatial bins are at integer coordinates within the image boundaries. For instance, the top-left bin of the top-left descriptor is centered on the pixel (0,0), and the bin immediately to its right is centered on (binSizeX,0), where binSizeX is a parameter in the VlDsiftDescriptorGeometry structure. vl_dsift_set_bounds can be used to further restrict sampling to keypoints within a rectangular region of the image.

Usage

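DSIFT is implemented by a VlDsiftFilter object that can be used to process a sequence of images of a given geometry. To use the DSIFT filter:

  • Initialize a new DSIFT filter object with vl_dsift_new. Customize the descriptor parameters with vl_dsift_set_steps, vl_dsift_set_geometry, etc.
  • Process an image with vl_dsift_process.
  • Retrieve the number of keypoints (vl_dsift_get_keypoint_num), the keypoints (vl_dsift_get_keypoints), and their descriptors (vl_dsift_get_descriptors).
  • Optionally repeat for more images.
  • Delete the DSIFT filter with vl_dsift_delete.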

Technical details

This section extends the SIFT descriptor section and specializes it to the case of dense keypoints.

Dense descriptors

When computing descriptors for many keypoints differing only by their position (and with null rotation), further simplifications are possible. In this case, in fact,

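\[
\mathbf{x} = m\sigma\hat{\mathbf{x}} + T, \qquad
h(t,i,j) = m\sigma \int g_{\sigma_{\mathrm{win}}}(\mathbf{x} - T)\,
w_{\mathrm{ang}}(\angle J(\mathbf{x}) - \theta_t)\,
w\!\left(\frac{x - T_x}{m\sigma} - \hat x_i\right)
w\!\left(\frac{y - T_y}{m\sigma} - \hat y_j\right)
|J(\mathbf{x})|\, d\mathbf{x}.
\]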

Since many different values of T are sampled, this is conveniently expressed as a separable convolution. First, we translate by \( \mathbf{x}_{ij} = m\sigma(\hat x_i, \hat y_j)^\top \) and we use the symmetry of the various binning and windowing functions to write

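\[
h(t,i,j) = m\sigma \int g_{\sigma_{\mathrm{win}}}(T' - \mathbf{x} - \mathbf{x}_{ij})\,
w_{\mathrm{ang}}(\angle J(\mathbf{x}) - \theta_t)\,
w\!\left(\frac{T'_x - x}{m\sigma}\right)
w\!\left(\frac{T'_y - y}{m\sigma}\right)
|J(\mathbf{x})|\, d\mathbf{x}, \qquad
T' = T + \mathbf{x}_{ij} = T + m\sigma \begin{bmatrix} \hat x_i \\ \hat y_j \end{bmatrix}.
\]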

Then we define kernels

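\[
k_i(x) = \frac{1}{\sqrt{2\pi}\,\sigma_{\mathrm{win}}}
\exp\!\left(-\frac{1}{2}\frac{(x - x_i)^2}{\sigma_{\mathrm{win}}^2}\right)
w\!\left(\frac{x}{m\sigma}\right), \qquad
k_j(y) = \frac{1}{\sqrt{2\pi}\,\sigma_{\mathrm{win}}}
\exp\!\left(-\frac{1}{2}\frac{(y - y_j)^2}{\sigma_{\mathrm{win}}^2}\right)
w\!\left(\frac{y}{m\sigma}\right),
\]

where \( x_i = m\sigma\hat x_i \) and \( y_j = m\sigma\hat y_j \) are the components of \( \mathbf{x}_{ij} \) in pixels,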

and obtain

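\[
h(t,i,j) = m\sigma\,(k_i k_j * \bar J_t)\!\left(T + m\sigma \begin{bmatrix} \hat x_i \\ \hat y_j \end{bmatrix}\right), \qquad
\bar J_t(\mathbf{x}) = w_{\mathrm{ang}}(\angle J(\mathbf{x}) - \theta_t)\,|J(\mathbf{x})|.
\]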
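As an illustration of the orientation channels \( \bar J_t \), the following Python sketch splits the gradient at a single pixel into angular bins. It assumes that \( w_{\mathrm{ang}} \) is a triangular (bilinear) window one bin wide and that \( \theta_t = 2\pi t / N_t \); both are assumptions of this sketch, and the function and parameter names are hypothetical.

```python
import math

def orientation_channels(gx, gy, num_ori=8):
    """Split the gradient (gx, gy) at one pixel into num_ori orientation channels.

    Channel t receives w_ang(angle - theta_t) * |J|, with theta_t = 2*pi*t/num_ori
    and w_ang a triangular window of one bin half-width (assumption).
    """
    mag = math.hypot(gx, gy)
    angle = math.atan2(gy, gx) % (2.0 * math.pi)
    # Fractional bin position of the gradient angle.
    pos = angle / (2.0 * math.pi) * num_ori
    lo = int(math.floor(pos))
    frac = pos - lo
    channels = [0.0] * num_ori
    # Bilinear split between the two nearest orientation bins (wrapping around).
    channels[lo % num_ori] += (1.0 - frac) * mag
    channels[(lo + 1) % num_ori] += frac * mag
    return channels

# A gradient at 45 degrees falls on bin 1 when num_ori = 8.
print(orientation_channels(1.0, 1.0))
```

With this window the channels at a pixel always sum to the gradient magnitude, so the spatial convolution that follows only redistributes that mass among bins.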

Furthermore, if we use a flat rather than Gaussian windowing function, the kernels do not depend on the bin, and we have

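\[
k(z) = \frac{1}{\sigma_{\mathrm{win}}}\, w\!\left(\frac{z}{m\sigma}\right), \qquad
h(t,i,j) = m\sigma\,(k(x)k(y) * \bar J_t)\!\left(T + m\sigma \begin{bmatrix} \hat x_i \\ \hat y_j \end{bmatrix}\right),
\]

(here \( \sigma_{\mathrm{win}} \) is the side of the flat window).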
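The separability of the flat-window case can be checked numerically. The Python sketch below assumes a triangular binning function \( w(z) = \max(0, 1 - |z|) \) sampled at integer pixels, zero padding at the image border, and drops the constant \( 1/\sigma_{\mathrm{win}} \) factor; all function names are hypothetical. It filters a small image once with the 2-D kernel \( k(x)k(y) \) and once with two 1-D passes, and the two results agree.

```python
def tri_kernel(bin_size):
    """Samples of w(z / bin_size) = max(0, 1 - |z| / bin_size) at integer z."""
    return [1.0 - abs(z) / bin_size for z in range(-bin_size + 1, bin_size)]

def filter_1d(signal, kernel):
    """Filter a 1-D signal with a symmetric kernel, using zero padding."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for c in range(n):
        acc = 0.0
        for d, kv in enumerate(kernel):
            p = c + d - radius
            if 0 <= p < n:
                acc += kv * signal[p]
        out.append(acc)
    return out

def filter_separable(img, kernel):
    """Apply the kernel along rows, then along columns."""
    rows = [filter_1d(row, kernel) for row in img]
    cols = [filter_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def filter_direct(img, kernel):
    """Apply the 2-D kernel kernel[dy] * kernel[dx] directly, for comparison."""
    radius = len(kernel) // 2
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for dy, ky in enumerate(kernel):
                for dx, kx in enumerate(kernel):
                    py, px = y + dy - radius, x + dx - radius
                    if 0 <= py < height and 0 <= px < width:
                        acc += ky * kx * img[py][px]
            out[y][x] = acc
    return out

img = [[float((3 * x + 5 * y * y) % 7) for x in range(9)] for y in range(8)]
kernel = tri_kernel(4)
fast, slow = filter_separable(img, kernel), filter_direct(img, kernel)
max_diff = max(abs(a - b) for ra, rb in zip(fast, slow) for a, b in zip(ra, rb))
```

For a kernel with K taps the direct form costs on the order of K squared operations per pixel, the separable form on the order of 2K.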

Note
In this case the binning functions \( k(z) \) are triangular, and the convolution can be computed in time independent of the filter (i.e. descriptor bin) support size by using integral signals (cumulative sums).
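One way to obtain a cost independent of the bin size is sketched below in Python. It relies on the fact that a sampled triangular kernel of half-width B is the convolution of two box filters of length B divided by B, and that each box filter can be evaluated from a cumulative sum in constant time per sample. Zero padding at the borders and the function names are assumptions of this sketch; it is not presented as the library's implementation.

```python
def box_sum(signal, start_offset, length):
    """out[c] = sum of signal[c + start_offset : c + start_offset + length],
    with zero padding, computed from a cumulative sum (integral signal)."""
    n = len(signal)
    prefix = [0.0]
    for v in signal:
        prefix.append(prefix[-1] + v)

    def clamp(i):
        return min(max(i, 0), n)

    return [prefix[clamp(c + start_offset + length)] - prefix[clamp(c + start_offset)]
            for c in range(n)]

def tri_filter_integral(signal, bin_size):
    """Triangular filtering via two box passes; cost per sample does not depend on bin_size."""
    pad = bin_size - 1
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    # First box: sum over the bin_size samples ending at each position.
    u = box_sum(padded, -(bin_size - 1), bin_size)
    # Second box: sum over the bin_size samples starting at each position.
    v = box_sum(u, 0, bin_size)
    # Two boxes of length B give a triangle with peak B; divide to get peak 1.
    return [v[c + pad] / bin_size for c in range(len(signal))]

def tri_filter_direct(signal, bin_size):
    """Reference implementation: explicit sum over the triangular kernel."""
    n = len(signal)
    out = []
    for c in range(n):
        acc = 0.0
        for z in range(-bin_size + 1, bin_size):
            if 0 <= c + z < n:
                acc += (1.0 - abs(z) / bin_size) * signal[c + z]
        out.append(acc)
    return out

signal = [float((7 * i * i + 3) % 11) for i in range(20)]
agree = all(abs(a - b) < 1e-9
            for a, b in zip(tri_filter_integral(signal, 4), tri_filter_direct(signal, 4)))
```

The direct version performs about 2B multiplications per sample, while the integral-signal version performs a fixed number of additions and subtractions regardless of B.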

Sampling

To avoid resampling and dealing with special boundary conditions, we impose some mild restrictions on the geometry of the descriptors that can be computed. In particular, we impose that the bin centers \( T + m\sigma(\hat x_i, \hat y_j) \) are always at integer coordinates within the image boundaries. This eliminates the need for costly interpolation. This condition amounts to (expressed in terms of the x coordinate, and equally applicable to y)

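\[
\{0,\dots,W-1\} \ni T_x + m\sigma \hat x_i
= T_x + m\sigma\left(i - \frac{N_x - 1}{2}\right)
= \bar T_x + m\sigma i, \qquad i = 0,\dots,N_x - 1.
\]

Notice that for this condition to be satisfied, the descriptor center \( T_x \) needs to be either integer or fractional depending on \( m\sigma(N_x - 1) \) being even or odd (for an odd bin size, this depends on \( N_x \) being odd or even). To eliminate this complication, it is simpler to use as a reference not the descriptor center T, but the coordinates of the upper-left bin \( \bar T \). Thus we sample the latter on a regular (integer) grid with sampling steps \( \Delta_x, \Delta_y \) and integer indices \( p, q \geq 0 \)

\[
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\leq
\bar T = \begin{bmatrix} \bar T_x^{\min} + p\Delta_x \\ \bar T_y^{\min} + q\Delta_y \end{bmatrix}
\leq
\begin{bmatrix} W - 1 - m\sigma(N_x - 1) \\ H - 1 - m\sigma(N_y - 1) \end{bmatrix},
\qquad
\bar T = \begin{bmatrix} T_x - m\sigma\frac{N_x - 1}{2} \\ T_y - m\sigma\frac{N_y - 1}{2} \end{bmatrix}
\]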

and we impose that the bin size \( m\sigma \) is integer as well.
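The sampling rule can be made concrete along one axis with a short Python sketch. It enumerates the upper-left bin coordinates by requiring that every bin center stays inside the image, which is how this sketch interprets the condition above; the function names and the example values are hypothetical.

```python
def frame_origins(extent, bin_size, num_bins, step, t_min=0):
    """Upper-left bin coordinates along one axis of length `extent` pixels.

    The largest admissible origin keeps the last bin center,
    origin + bin_size * (num_bins - 1), at or below extent - 1.
    """
    last = extent - 1 - bin_size * (num_bins - 1)
    return list(range(t_min, last + 1, step))

def bin_centers(origin, bin_size, num_bins):
    """Integer bin centers of the descriptor whose upper-left bin is at `origin`."""
    return [origin + bin_size * i for i in range(num_bins)]

def descriptor_center(origin, bin_size, num_bins):
    """Descriptor center; fractional when bin_size * (num_bins - 1) is odd."""
    return origin + bin_size * (num_bins - 1) / 2

# Example: 32-pixel axis, 4 bins of 4 pixels, sampling step of 5 pixels.
origins = frame_origins(32, 4, 4, 5)
centers = [bin_centers(o, 4, 4) for o in origins]
```

Because the origins, the step and the bin size are all integers, every bin center is an integer pixel coordinate and no interpolation of the filtered channels is needed, even when the descriptor center itself is fractional.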

 
from: http://www.vlfeat.org/api/dsift.html