Implementing RootSIFT in Python and OpenCV

by Adrian Rosebrock on April 13, 2015 in Image Descriptors, Tutorials

Still using the original, plain ole’ implementation of SIFT by David Lowe?

Well, according to Arandjelovic and Zisserman in their 2012 paper, Three things everyone should know to improve object retrieval, you’re selling yourself (and your accuracy) short by using the original implementation.

Instead, you should be utilizing a simple extension to SIFT, called RootSIFT, that can dramatically improve object recognition accuracy, descriptor quantization, and retrieval accuracy.

Whether you’re matching descriptors of regions surrounding keypoints, clustering SIFT descriptors using k-means, or building a bag of visual words model, the RootSIFT extension can be used to improve your results.

Best of all, the RootSIFT extension sits on top of the original SIFT implementation and does not require changes to the original SIFT source code.

You do not have to recompile or modify your favorite SIFT implementation to utilize the benefits of RootSIFT.

So if you’re using SIFT regularly in your computer vision applications, but have yet to level-up to RootSIFT, read on.

This blog post will show you how to implement RootSIFT in Python and OpenCV without (1) changing a single line of code in the original OpenCV SIFT implementation or (2) recompiling the entire library.

Sound interesting? Check out the rest of this blog post to learn how to implement RootSIFT in Python and OpenCV.

OpenCV and Python versions:
In order to run this example, you’ll need Python 2.7 and OpenCV 2.4.X.

Why RootSIFT?

It is well known that when comparing histograms, the Euclidean distance often yields inferior performance compared to the chi-squared distance or the Hellinger kernel [Arandjelovic et al. 2012].
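To make the three measures concrete, here is a minimal NumPy sketch. The function names and the toy histograms are illustrative assumptions rather than code from the paper, and the chi-squared formula shown is just one common convention (a factor of 0.5 plus a small eps to avoid division by zero). Note that the Hellinger kernel is a similarity, so larger means more alike.

```python
import numpy as np

def euclidean_distance(a, b):
    # straight-line (L2) distance between two histograms
    return np.sqrt(np.sum((a - b) ** 2))

def chi_squared_distance(a, b, eps=1e-10):
    # one common chi-squared convention; eps guards against empty bins
    return 0.5 * np.sum(((a - b) ** 2) / (a + b + eps))

def hellinger_kernel(a, b):
    # similarity (not a distance) between two L1-normalized histograms
    return np.sum(np.sqrt(a * b))

# two toy, L1-normalized histograms (made-up values)
h1 = np.array([0.5, 0.3, 0.2])
h2 = np.array([0.2, 0.3, 0.5])

print(euclidean_distance(h1, h2))
print(chi_squared_distance(h1, h2))
print(hellinger_kernel(h1, h2))
```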

And if this is the case why do we often use the Euclidean distance to compare SIFT descriptors when matching keypoints? Or clustering SIFT descriptors to form a codebook? Or quantizing SIFT descriptors to form a bag of visual words?

Remember, while the original SIFT papers discuss comparing descriptors using the Euclidean distance, SIFT is still a histogram itself — and wouldn’t other distance metrics offer greater accuracy?

It turns out, the answer is yes. And instead of comparing SIFT descriptors using a different metric we can instead modify the 128-dim descriptor returned from SIFT directly.

You see, Arandjelovic et al. suggest a simple algebraic extension to the SIFT descriptor itself, called RootSIFT, that allows SIFT descriptors to be “compared” using a Hellinger kernel while still utilizing the Euclidean distance.
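The reason this works can be written out in two lines. As a sketch, assume x and y are L1-normalized SIFT vectors with non-negative entries (the symbols are mine, chosen for illustration), and write the Hellinger kernel as H(x, y) = Σ √(x_i y_i). The squared Euclidean distance between the element-wise square roots is then:

```latex
\begin{aligned}
\lVert \sqrt{x} - \sqrt{y} \rVert_2^2
  &= \sum_{i=1}^{128} x_i + \sum_{i=1}^{128} y_i - 2 \sum_{i=1}^{128} \sqrt{x_i y_i} \\
  &= 2 - 2\,H(x, y)
\end{aligned}
```

Both sums of x_i and y_i equal 1 because the vectors are L1-normalized, so a small Euclidean distance between square-rooted vectors corresponds directly to a large Hellinger kernel between the original ones.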

Here is the simple algorithm to extend SIFT to RootSIFT:

Step 1: Compute SIFT descriptors using your favorite SIFT library.
Step 2: L1-normalize each SIFT vector.
Step 3: Take the square-root of each element in the SIFT vector.
~~Step 4: L2-normalize the resulting vector.~~ (This step is not necessary).

UPDATE: From Arandjelovic et al.’s paper and presentation, it’s a little ambiguous whether the final L2 normalization should be performed. The step is not mentioned in their paper, but it is mentioned on Slide 10 of their presentation. As Chris pointed out in the comments section, explicitly performing the L2 normalization is not needed. After L1-normalizing and taking the square root, the squared elements of the vector sum to 1, so the feature vector already has unit L2 norm and further normalization is not needed.
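The three steps, and the claim that no explicit L2 normalization is needed, can be checked on synthetic data. In this sketch the random array merely stands in for real SIFT descriptors, and the function name to_rootsift is my own:

```python
import numpy as np

def to_rootsift(descs, eps=1e-7):
    # Step 2: L1-normalize each row (SIFT values are non-negative,
    # so the row sum equals the L1 norm)
    descs = descs / (descs.sum(axis=1, keepdims=True) + eps)
    # Step 3: element-wise square root
    return np.sqrt(descs)

# stand-in for Step 1: five fake 128-dim "descriptors" with non-negative entries
rng = np.random.RandomState(42)
fake_sift = rng.randint(0, 256, size=(5, 128)).astype("float32")

root = to_rootsift(fake_sift)

# each row should already have an L2 norm of (almost exactly) 1
print(np.linalg.norm(root, axis=1))
```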

That’s it!

It’s a simple extension, but this little modification can dramatically improve results. Whether you’re matching keypoints, clustering SIFT descriptors, or quantizing to form a bag of visual words, Arandjelovic et al. have shown that RootSIFT can easily be used in every scenario where SIFT is used, while improving results.

In the rest of this blog post I’ll show you how to implement RootSIFT using Python and OpenCV. Using this implementation, you’ll be able to incorporate RootSIFT into your own applications — and improve your results!

Implementing RootSIFT in Python and OpenCV

Open up your favorite editor, create a new file named rootsift.py, and let’s get started:

    # import the necessary packages
    import numpy as np
    import cv2

    class RootSIFT:
        def __init__(self):
            # initialize the SIFT feature extractor
            self.extractor = cv2.DescriptorExtractor_create("SIFT")

        def compute(self, image, kps, eps=1e-7):
            # compute SIFT descriptors
            (kps, descs) = self.extractor.compute(image, kps)

            # if there are no keypoints or descriptors, return an empty tuple
            if len(kps) == 0:
                return ([], None)

            # apply the Hellinger kernel by first L1-normalizing and taking the
            # square-root
            descs /= (descs.sum(axis=1, keepdims=True) + eps)
            descs = np.sqrt(descs)
            #descs /= (np.linalg.norm(descs, axis=1, ord=2) + eps)

            # return a tuple of the keypoints and descriptors
            return (kps, descs)

The first thing we’ll do is import our necessary packages. We’ll use NumPy for numerical processing and cv2 for our OpenCV bindings.

We then define our RootSIFT class on Line 5 and the constructor on Lines 6-8. The constructor simply initializes the OpenCV SIFT descriptor extractor.

The compute function on Line 10 then handles the computation of the RootSIFT descriptor. This function requires two arguments and an optional third argument.

The first argument to the compute function is the image that we want to extract RootSIFT descriptors from. The second argument is the list of keypoints, or local regions, from where the RootSIFT descriptors will be extracted. And finally, an epsilon variable, eps , is supplied to prevent any divide-by-zero errors.
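To see why eps matters, consider a descriptor whose entries are all zero. This is a contrived case built by hand for illustration, not something taken from the original post:

```python
import numpy as np

# a single all-zero 128-dim "descriptor" (contrived for illustration)
zero_desc = np.zeros((1, 128), dtype="float32")
row_sum = zero_desc.sum(axis=1, keepdims=True)

# without eps: 0 / 0 yields NaN (warnings silenced here on purpose)
with np.errstate(divide="ignore", invalid="ignore"):
    without_eps = zero_desc / row_sum

# with eps: the division is well defined and the row stays at zero
with_eps = zero_desc / (row_sum + 1e-7)

print(np.isnan(without_eps).all())
print((with_eps == 0).all())
```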

From there, we extract the original SIFT descriptors on Line 12.

We make a check on Lines 15 and 16: if there are no keypoints or descriptors, we simply return an empty list of keypoints and None for the descriptors.

Converting the original SIFT descriptors to RootSIFT descriptors takes place on Lines 20-22.

We first L1-normalize each vector in the descs array (Line 20).

From there, we take the square-root of each element in the SIFT vector (Line 21).

~~And we finally L2-normalize the resulting SIFT vectors (Line 22).~~

Lastly, all we have to do is return the tuple of keypoints and RootSIFT descriptors to the calling function on Line 25.
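The listing above targets OpenCV 2.4.X. In later OpenCV releases the string-based factory cv2.DescriptorExtractor_create is no longer available, and from OpenCV 4.4 onward SIFT is created with cv2.SIFT_create(). The following is a hedged sketch of how the same class might look there; it is not the author's code. The optional extractor argument and the lazy import are my own additions, so the class can also be exercised with a stand-in extractor when OpenCV is not installed.

```python
import numpy as np

class RootSIFT:
    def __init__(self, extractor=None):
        # assumption: OpenCV >= 4.4, where SIFT lives in the main module;
        # any object with a compatible compute(image, kps) method also works
        if extractor is None:
            import cv2  # imported lazily so a custom extractor needs no OpenCV
            extractor = cv2.SIFT_create()
        self.extractor = extractor

    def compute(self, image, kps, eps=1e-7):
        # compute plain SIFT descriptors at the given keypoints
        (kps, descs) = self.extractor.compute(image, kps)

        # nothing to describe: mirror the original "empty" return value
        if descs is None or len(kps) == 0:
            return ([], None)

        # L1-normalize each row, then take the element-wise square root
        descs = descs / (descs.sum(axis=1, keepdims=True) + eps)
        return (kps, np.sqrt(descs))
```

With OpenCV available, RootSIFT().compute(gray, kps) is called the same way as in the original version.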

Running RootSIFT

To actually see RootSIFT in action, open up a new file, name it driver.py, and we’ll explore how to extract SIFT and RootSIFT descriptors from images:

    # import the necessary packages
    from rootsift import RootSIFT
    import cv2

    # load the image we are going to extract descriptors from and convert
    # it to grayscale
    image = cv2.imread("example.png")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # detect Difference of Gaussian keypoints in the image
    detector = cv2.FeatureDetector_create("SIFT")
    kps = detector.detect(gray)

    # extract normal SIFT descriptors
    extractor = cv2.DescriptorExtractor_create("SIFT")
    (kps, descs) = extractor.compute(gray, kps)
    print "SIFT: kps=%d, descriptors=%s " % (len(kps), descs.shape)

    # extract RootSIFT descriptors
    rs = RootSIFT()
    (kps, descs) = rs.compute(gray, kps)
    print "RootSIFT: kps=%d, descriptors=%s " % (len(kps), descs.shape)

On Lines 2 and 3 we import our RootSIFT descriptor along with our OpenCV bindings.

We then load our example image, convert it to grayscale, and detect Difference of Gaussian keypoints on Lines 7-12.

From there, we extract the original SIFT descriptors on Lines 15-17.

And we extract the RootSIFT descriptors on Lines 20-22.

To execute our script, simply issue the following command:

    $ python driver.py

Your output should look like this:

    SIFT: kps=1006, descriptors=(1006, 128)
    RootSIFT: kps=1006, descriptors=(1006, 128)

As you can see, we have extracted 1,006 DoG keypoints. And for each keypoint we have extracted 128-dim SIFT and RootSIFT descriptors.

From here, you can take this RootSIFT implementation and apply it to your own applications, including keypoint and descriptor matching, clustering descriptors to form centroids, and quantizing to create a bag of visual words model — all of which we will cover in future posts.
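As a taste of the first of those applications, here is a minimal brute-force matching sketch in NumPy. It is not from the original post: the function name match_nearest and the toy descriptors are assumptions for illustration, and a real pipeline would typically also filter out ambiguous matches.

```python
import numpy as np

def match_nearest(descs_a, descs_b):
    # pairwise squared Euclidean distances, shape (len(a), len(b))
    diffs = descs_a[:, np.newaxis, :] - descs_b[np.newaxis, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # for each descriptor in A, the index of its nearest neighbor in B
    return np.argmin(dists, axis=1)

# toy RootSIFT-like descriptors: rows are non-negative with unit L2 norm
rng = np.random.RandomState(0)
raw = rng.rand(4, 128)
descs_b = np.sqrt(raw / raw.sum(axis=1, keepdims=True))
descs_a = descs_b[[2, 0, 3, 1]]  # the same descriptors in shuffled order

print(match_nearest(descs_a, descs_b))
```

Because RootSIFT descriptors are compared with the plain Euclidean distance, nothing in the matching code needs to know that a Hellinger kernel is involved.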

Summary

In this blog post I showed you how to extend OpenCV’s implementation of David Lowe’s original SIFT descriptor to create the RootSIFT descriptor, a simple extension suggested by Arandjelovic and Zisserman in their 2012 paper, Three things everyone should know to improve object retrieval.

The RootSIFT extension does not require you to modify the source of your favorite SIFT implementation — it simply sits on top of the original implementation.

The simple ~~4-step~~ 3-step process to compute RootSIFT is:

Step 1: Compute SIFT descriptors using your favorite SIFT library.
Step 2: L1-normalize each SIFT vector.
Step 3: Take the square-root of each element in the SIFT vector.
~~Step 4: L2-normalize the resulting vector.~~

No matter if you are using SIFT to match keypoints, form cluster centers using k-means, or quantize SIFT descriptors to form a bag of visual words, you should definitely consider utilizing RootSIFT rather than the original SIFT to improve your object retrieval accuracy.

from: http://www.pyimagesearch.com/2015/04/13/implementing-rootsift-in-python-and-opencv/
