Dense SIFT: a fast variant of the SIFT image feature descriptor

VLFeat implements a fast dense version of SIFT, called vl_dsift. The function is roughly equivalent to running SIFT on a dense grid of locations at a fixed scale and orientation. This type of feature descriptor is often used for object categorization.

Dense SIFT as a faster SIFT

The main advantage of using vl_dsift over vl_sift is speed. To see this, load a test image:

I = vl_impattern('roofs1') ;
I = single(vl_imdown(rgb2gray(I))) ;

To check the equivalence of vl_dsift and vl_sift, it is necessary to understand in detail how the parameters of the two descriptors are related.

  • Bin size vs keypoint scale. DSIFT specifies the descriptor size by a single parameter, size, which controls the size of a SIFT spatial bin in pixels. In the standard SIFT descriptor, the bin size is related to the SIFT keypoint scale by a multiplier, denoted magnif below, which defaults to 3. As a consequence, a DSIFT descriptor with bin size equal to 5 corresponds to a SIFT keypoint of scale 5/3 ≈ 1.67.

  • Smoothing. The SIFT descriptor smooths the image according to the scale of the keypoints (Gaussian scale space). By default, the smoothing is equivalent to a convolution by a Gaussian of variance s^2 - 0.25, where s is the scale of the keypoint and 0.25 is a nominal adjustment that accounts for the smoothing induced by the camera CCD.
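The two relations above can be worked through with plain MATLAB arithmetic. This is a minimal sketch for the bin size 5 example; the variable names scale and sigma are chosen here for illustration and are not VLFeat identifiers.

```matlab
% Relate a DSIFT bin size to the equivalent SIFT keypoint scale and smoothing.
binSize = 5 ;                      % DSIFT spatial bin size, in pixels
magnif  = 3 ;                      % SIFT multiplier between scale and bin size

scale = binSize / magnif ;         % equivalent SIFT keypoint scale, about 1.67
sigma = sqrt(scale^2 - 0.25) ;     % standard deviation of the matching Gaussian

fprintf('scale = %.4f, sigma = %.4f\n', scale, sigma) ;
```

The value sigma is a standard deviation (the square root of the variance s^2 - 0.25), which is the quantity one would pass to vl_imsmooth before calling vl_dsift.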

Thus the following code produces equivalent descriptors using either DSIFT or SIFT:

binSize = 8 ;
magnif = 3 ;
Is = vl_imsmooth(I, sqrt((binSize/magnif)^2 - .25)) ;

[f, d] = vl_dsift(Is, 'size', binSize) ;
f(3,:) = binSize/magnif ;
f(4,:) = 0 ;
[f_, d_] = vl_sift(I, 'frames', f) ;

The difference, of course, is that DSIFT is much faster.
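One way to observe the speed difference is to time the calls with tic and toc. The following is a rough sketch, assuming VLFeat is already on the MATLAB path; the timing variable names are illustrative, and the measured times will depend on the machine and image size.

```matlab
% Assumes VLFeat is on the MATLAB path (for example after running vl_setup).
I = vl_impattern('roofs1') ;
I = single(vl_imdown(rgb2gray(I))) ;

binSize = 8 ;
magnif  = 3 ;
Is = vl_imsmooth(I, sqrt((binSize/magnif)^2 - .25)) ;

% Dense SIFT with the default settings.
tic ; [f, d] = vl_dsift(Is, 'size', binSize) ; tDense = toc ;

% Dense SIFT with the 'fast' flag (an approximation that trades accuracy for speed).
tic ; [fFast, dFast] = vl_dsift(Is, 'size', binSize, 'fast') ; tFast = toc ;

% Standard SIFT descriptors computed at the same dense frames.
f(3,:) = binSize/magnif ;          % keypoint scale
f(4,:) = 0 ;                       % keypoint orientation
tic ; [f_, d_] = vl_sift(I, 'frames', f) ; tSift = toc ;

fprintf('vl_dsift: %.3f s, vl_dsift (fast): %.3f s, vl_sift: %.3f s\n', ...
        tDense, tFast, tSift) ;
```

A single tic/toc measurement is noisy, so repeating each call several times and averaging gives a more reliable comparison.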

 
Left: accuracy of the slow and fast dense SIFT implementations in vl_dsift compared to the SIFT baseline from vl_sift. Right: speedup. The fast version is less similar to the original SIFT descriptors but from 30 to 70 times faster than SIFT. Notice that the equivalence of the descriptors does not necessarily indicate that one would work better than the other in applications.

PHOW descriptors

The PHOW features [1] are a variant of dense SIFT descriptors, extracted at multiple scales. A color version, named PHOW-color, extracts descriptors on the three HSV image channels and stacks them up. A combination of vl_dsift and vl_imsmooth can be used to easily and efficiently compute such features.

VLFeat includes a simple wrapper, vl_phow, that does exactly this:

im = vl_impattern('roofs1') ;
[frames, descrs] = vl_phow(im2single(im)) ;

Note that this typically generates a very large number of features. In this example, there are 162,574 features.
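To illustrate how vl_dsift and vl_imsmooth combine into multi-scale dense features, here is a simplified sketch for a grayscale image. It is not the vl_phow implementation: the bin sizes, magnif, and step values are assumptions picked for illustration, and the color variant is omitted.

```matlab
% Assumes VLFeat is on the MATLAB path.
im = vl_impattern('roofs1') ;
I  = im2single(rgb2gray(im)) ;     % single-precision grayscale input

binSizes = [4 6 8 10] ;            % illustrative bin sizes, one per scale
magnif   = 6 ;                     % illustrative bin-size-to-scale multiplier
step     = 2 ;                     % illustrative sampling stride, in pixels

frames = cell(1, numel(binSizes)) ;
descrs = cell(1, numel(binSizes)) ;
for k = 1:numel(binSizes)
  sigma = binSizes(k) / magnif ;   % smoothing matched to the current bin size
  Is = vl_imsmooth(I, sigma) ;
  [f, d] = vl_dsift(Is, 'size', binSizes(k), 'step', step) ;
  frames{k} = [f ; binSizes(k) * ones(1, size(f, 2))] ;   % rows: x, y, bin size
  descrs{k} = d ;
end
frames = cat(2, frames{:}) ;       % 3 x N
descrs = cat(2, descrs{:}) ;       % 128 x N
```

Each scale contributes its own dense grid of descriptors, which is why the total number of features grows quickly with the number of bin sizes and with a smaller step.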

References

  • [1] A. Bosch, A. Zisserman, and X. Munoz. Image classification using random forests and ferns. In Proc. ICCV, 2007.

from: http://www.vlfeat.org/overview/dsift.html
