Implementing RootSIFT in Python and OpenCV


Still using the original, plain ole’ implementation of SIFT by David Lowe?

Well, according to Arandjelovic and Zisserman in their 2012 paper, Three things everyone should know to improve object retrieval, you’re selling yourself (and your accuracy) short by using the original implementation.

Instead, you should be utilizing a simple extension to SIFT, called RootSIFT, that can dramatically improve object recognition, descriptor quantization, and retrieval accuracy.

Whether you’re matching descriptors of regions surrounding keypoints, clustering SIFT descriptors using k-means, or building a bag of visual words model, the RootSIFT extension can be used to improve your results.

Best of all, the RootSIFT extension sits on top of the original SIFT implementation and does not require changes to the original SIFT source code.

You do not have to recompile or modify your favorite SIFT implementation to utilize the benefits of RootSIFT.

So if you’re using SIFT regularly in your computer vision applications, but have yet to level-up to RootSIFT, read on.

This blog post will show you how to implement RootSIFT in Python and OpenCV without (1) changing a single line of code in the original OpenCV SIFT implementation or (2) compiling the entire library.

Sound interesting? Check out the rest of this blog post to learn how to implement RootSIFT in Python and OpenCV.


OpenCV and Python versions:
In order to run this example, you’ll need Python 2.7 and OpenCV 2.4.X.

Why RootSIFT?

It is well known that when comparing histograms, the Euclidean distance often yields inferior performance compared to the chi-squared distance or the Hellinger kernel [Arandjelovic et al. 2012].

And if this is the case, why do we often use the Euclidean distance to compare SIFT descriptors when matching keypoints? Or when clustering SIFT descriptors to form a codebook? Or when quantizing SIFT descriptors to form a bag of visual words?

Remember, while the original SIFT papers discuss comparing descriptors using the Euclidean distance, SIFT is still a histogram itself — and wouldn’t other distance metrics offer greater accuracy?

It turns out, the answer is yes. And instead of comparing SIFT descriptors using a different metric we can instead modify the 128-dim descriptor returned from SIFT directly.

You see, Arandjelovic et al. suggest a simple algebraic extension to the SIFT descriptor itself, called RootSIFT, that allows SIFT descriptors to be “compared” using a Hellinger kernel while still utilizing the Euclidean distance.
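To make that claim concrete, here is the identity behind it, written for two L1-normalized SIFT descriptors x and y with non-negative entries. The notation (x, y, H) is chosen for this note and is not copied from the paper:

```latex
H(x, y) = \sum_{i=1}^{128} \sqrt{x_i \, y_i},
\qquad
\left\lVert \sqrt{x} - \sqrt{y} \right\rVert_2^2
  = \sum_i x_i + \sum_i y_i - 2 \sum_i \sqrt{x_i \, y_i}
  = 2 - 2 \, H(x, y)
```

Since both sums equal 1 after L1 normalization, a smaller Euclidean distance between the square-rooted vectors always means a larger Hellinger kernel value, so ranking by one is equivalent to ranking by the other.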

Here is the simple algorithm to extend SIFT to RootSIFT:

  • Step 1: Compute SIFT descriptors using your favorite SIFT library.
  • Step 2: L1-normalize each SIFT vector.
  • Step 3: Take the square-root of each element in the SIFT vector.
  • Step 4: L2-normalize the resulting vector. (This step is not necessary).
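The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the listing from rootsift.py: the function name to_rootsift and the toy 4-dim input are assumptions made for the example, and Step 1 is replaced by a hand-written array so the snippet runs without OpenCV.

```python
import numpy as np


def to_rootsift(descs, eps=1e-7):
    """Convert an (N, D) array of SIFT descriptors to RootSIFT (hypothetical helper)."""
    # work on a float copy so the caller's array is left untouched
    descs = np.array(descs, dtype=np.float32)

    # Step 2: L1-normalize each row. SIFT entries are non-negative, so the
    # row sum is the L1 norm; eps guards against all-zero rows.
    descs /= (descs.sum(axis=1, keepdims=True) + eps)

    # Step 3: take the element-wise square root
    return np.sqrt(descs)


# Stand-in for Step 1: two toy 4-dim "descriptors" instead of real 128-dim SIFT output
sift = np.array([[1.0, 3.0, 0.0, 12.0],
                 [4.0, 4.0, 4.0, 4.0]])
root = to_rootsift(sift)

# each row of `root` now has (approximately) unit L2 norm
norms = np.linalg.norm(root, axis=1)
```

With real data, descs would be the (N, 128) float array returned by your SIFT library, and the output has the same shape.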

UPDATE: From Arandjelovic et al.’s paper and presentation, it’s a little ambiguous whether the final L2 normalization should be performed. This step is not mentioned in their paper, but it is mentioned on Slide 10 of their presentation. As Chris pointed out in the comments section, explicitly performing the L2 normalization is not needed: L1-normalizing a vector and then taking the square root of each element already yields a vector with unit L2 norm, so further normalization is redundant.
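The reason is a one-line calculation. For a descriptor x with non-negative entries (the symbol x is introduced here for illustration):

```latex
\left\lVert \sqrt{\frac{x}{\lVert x \rVert_1}} \right\rVert_2^2
  = \sum_{i=1}^{128} \frac{x_i}{\lVert x \rVert_1}
  = \frac{\sum_i x_i}{\sum_i x_i}
  = 1
\qquad (x_i \ge 0)
```

In code, the small eps added to the denominator makes the norm fall very slightly below 1, which is harmless in practice.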


That’s it!

It’s a simple extension. But this little modification can dramatically improve results. Whether you’re matching keypoints, clustering SIFT descriptors, or quantizing to form a bag of visual words, Arandjelovic et al. have shown that RootSIFT can easily be used in all scenarios that SIFT is, while improving results.

In the rest of this blog post I’ll show you how to implement RootSIFT using Python and OpenCV. Using this implementation, you’ll be able to incorporate RootSIFT into your own applications — and improve your results!

Implementing RootSIFT in Python and OpenCV

Open up your favorite editor, create a new file and name it rootsift.py, and let’s get started:

The first thing we’ll do is import our necessary packages. We’ll use NumPy for numerical processing and cv2 for our OpenCV bindings.

We then define our RootSIFT class on Line 5 and the constructor on Lines 6-8. The constructor simply initializes the OpenCV SIFT descriptor extractor.

The compute function on Line 10 then handles the computation of the RootSIFT descriptor. This function requires two arguments and an optional third argument.

The first argument to the compute function is the image that we want to extract RootSIFT descriptors from. The second argument is the list of keypoints, or local regions, from where the RootSIFT descriptors will be extracted. And finally, an epsilon variable, eps, is supplied to prevent any divide-by-zero errors.

From there, we extract the original SIFT descriptors on Line 12.

We make a check on Lines 15 and 16 — if there are no keypoints or descriptors, we simply return an empty tuple.

Converting the original SIFT descriptors to RootSIFT descriptors takes place on Lines 20-22.

We first L1-normalize each vector in the descs array (Line 20).

From there, we take the square-root of each element in the SIFT vector (Line 21).

And we finally L2-normalize the resulting SIFT vectors (Line 22).

Lastly, all we have to do is return the tuple of keypoints and RootSIFT descriptors to the calling function on Line 25.
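The walkthrough above can be sketched roughly as follows. This is a reconstruction under stated assumptions rather than the original listing, so the line numbers cited in the text do not map onto it. The optional extractor argument and the FakeExtractor class are hypothetical additions that let the sketch run without OpenCV; the ([], None) return value for the empty case is also a choice made here so that callers can still unpack the result.

```python
import numpy as np


class RootSIFT(object):
    def __init__(self, extractor=None):
        if extractor is None:
            # imported lazily so the class can also wrap a non-OpenCV extractor
            import cv2
            if hasattr(cv2, "SIFT_create"):
                # newer OpenCV builds expose SIFT here
                extractor = cv2.SIFT_create()
            else:
                # OpenCV 2.4.X, the version this post targets
                extractor = cv2.DescriptorExtractor_create("SIFT")
        self.extractor = extractor

    def compute(self, image, kps, eps=1e-7):
        # compute the plain SIFT descriptors first
        (kps, descs) = self.extractor.compute(image, kps)

        # nothing to convert: return an empty result that still unpacks
        if descs is None or len(kps) == 0:
            return ([], None)

        descs = np.asarray(descs, dtype=np.float32)
        # L1-normalize each descriptor, then take the element-wise square root
        descs = descs / (descs.sum(axis=1, keepdims=True) + eps)
        descs = np.sqrt(descs)
        # L2-normalize; redundant in exact arithmetic (see the note on Step 4)
        descs = descs / (np.linalg.norm(descs, axis=1, keepdims=True) + eps)

        return (kps, descs)


class FakeExtractor(object):
    """Hypothetical stand-in for an OpenCV extractor, used only for this demo."""

    def compute(self, image, kps):
        # one constant 128-dim descriptor per keypoint
        return (kps, np.ones((len(kps), 128), dtype=np.float32))


(kps, descs) = RootSIFT(FakeExtractor()).compute(None, ["kp0", "kp1"])
(no_kps, no_descs) = RootSIFT(FakeExtractor()).compute(None, [])
```

With OpenCV installed, calling RootSIFT() with no argument and passing a grayscale image plus detected keypoints to compute should behave the same way; some OpenCV versions expose SIFT under a different name, in which case you can construct the extractor yourself and pass it in.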

Running RootSIFT

To actually see RootSIFT in action, open up a new file, name it driver.py, and we’ll explore how to extract SIFT and RootSIFT descriptors from images:

On Lines 1 and 2 we import our RootSIFT descriptor along with our OpenCV bindings.

We then load our example image, convert it to grayscale, and detect Difference of Gaussian keypoints on Lines 7-12.

From there, we extract the original SIFT descriptors on Lines 15-17.

And we extract the RootSIFT descriptors on Lines 20-22.

To execute our script, simply issue the following command:

Your output should look like this:

[Figure: rootsift_extracted_example]

As you can see, we have extracted 1,006 DoG keypoints. And for each keypoint we have extracted 128-dim SIFT and RootSIFT descriptors.

From here, you can take this RootSIFT implementation and apply it to your own applications, including keypoint and descriptor matching, clustering descriptors to form centroids, and quantizing to create a bag of visual words model — all of which we will cover in future posts.

Summary

In this blog post I showed you how to extend OpenCV’s implementation of David Lowe’s original SIFT descriptor to create the RootSIFT descriptor, a simple extension suggested by Arandjelovic and Zisserman in their 2012 paper, Three things everyone should know to improve object retrieval.

The RootSIFT extension does not require you to modify the source of your favorite SIFT implementation — it simply sits on top of the original implementation.

The simple 3-step process (plus an optional fourth step) to compute RootSIFT is:

  • Step 1: Compute SIFT descriptors using your favorite SIFT library.
  • Step 2: L1-normalize each SIFT vector.
  • Step 3: Take the square-root of each element in the SIFT vector.
  • Step 4: L2-normalize the resulting vector. (This step is not necessary.)

No matter if you are using SIFT to match keypoints, form cluster centers using k-means, or quantize SIFT descriptors to form a bag of visual words, you should definitely consider utilizing RootSIFT rather than the original SIFT to improve your object retrieval accuracy.



from: http://www.pyimagesearch.com/2015/04/13/implementing-rootsift-in-python-and-opencv/
