SIFT Step 1: Creating the scale space

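SIFT builds a scale space to capture image features at different scales. The original image is blurred progressively, usually with a Gaussian blur, so that no spurious detail is introduced. SIFT generates several octaves, each containing images at different levels of blur; the blur of each image is set by increasing the "scale" parameter (the amount of blur). The scale space is used in later steps to generate Difference of Gaussian images, on which the scale-invariant feature transform is built.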

Real-world objects are meaningful only at a certain scale. You can see a sugar cube on a table perfectly well, but when you look at the entire Milky Way, it effectively does not exist. This multi-scale nature of objects is quite common in nature, and a scale space attempts to replicate it on digital images.

Scale spaces

Do you want to look at a leaf or the entire tree? If it's the tree, you intentionally get rid of some detail in the image (the leaves, twigs, and so on).

While getting rid of these details, you must ensure that you do not introduce new false details. Under several reasonable assumptions, it has been shown mathematically that the Gaussian blur is the only way to do that.

So to create a scale space, you take the original image and generate progressively blurred out images. Here’s an example:

Look at how the cat's helmet loses detail. So do its whiskers.
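One rough way to see progressive blurring at work without an image is to try it on a 1D signal. The sketch below is only an illustration: the test signal, the 3σ kernel cutoff, the edge handling, and the use of "number of local extrema" as a stand-in for "amount of detail" are all assumptions made for this example.

```python
import math

def kernel_1d(sigma):
    """Sampled, normalised 1D Gaussian (cut off at 3 sigma, an assumption)."""
    r = max(1, math.ceil(3 * sigma))
    k = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-r, r + 1)]
    total = sum(k)
    return [v / total for v in k]

def blur(signal, sigma):
    """Convolve the signal with the Gaussian, repeating the edge samples."""
    k = kernel_1d(sigma)
    r = len(k) // 2
    n = len(signal)
    return [sum(k[i + r] * signal[min(max(x + i, 0), n - 1)]
                for i in range(-r, r + 1))
            for x in range(n)]

def count_extrema(signal):
    """Count strict local maxima and minima (interior samples only)."""
    return sum(1 for a, b, c in zip(signal, signal[1:], signal[2:])
               if (b > a and b > c) or (b < a and b < c))

# Coarse structure (slow wave, the "tree") plus fine detail (fast ripple, the "leaves").
signal = [math.sin(2 * math.pi * i / 64) + 0.3 * math.sin(2 * math.pi * i / 5)
          for i in range(128)]

print("original", count_extrema(signal))
for sigma in (1.0, 2.0, 4.0):
    print("sigma", sigma, count_extrema(blur(signal, sigma)))
```

As the blur grows, the fast ripple fades and the extrema count should drop towards that of the slow wave alone.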

Scale spaces in SIFT

SIFT takes scale spaces to the next level. You take the original image and generate progressively blurred-out images. Then you resize the original image to half its size and generate blurred-out images again. You keep repeating this.

Here’s what it would look like in SIFT:

Images of the same size (arranged vertically) form an octave. Above are four octaves, each with 5 images. The individual images within an octave are produced by increasing the "scale" (the amount of blur).
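The octave structure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference SIFT implementation: the function names, the starting blur `sigma0`, the factor `k`, the 3σ kernel cutoff, and halving by dropping every second pixel are all assumptions chosen to keep the example short.

```python
import math

def kernel_1d(sigma):
    """Sampled, normalised 1D Gaussian (cut off at 3 sigma, an assumption)."""
    r = max(1, math.ceil(3 * sigma))
    k = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-r, r + 1)]
    total = sum(k)
    return [v / total for v in k]

def blur_rows(img, k):
    """Blur every row with kernel k, repeating the edge pixels."""
    r = len(k) // 2
    w = len(img[0])
    return [[sum(k[i + r] * row[min(max(x + i, 0), w - 1)]
                 for i in range(-r, r + 1))
             for x in range(w)]
            for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, sigma):
    """Separable 2D Gaussian blur: blur the rows, then the columns."""
    k = kernel_1d(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, k)), k))

def halve(img):
    """Half-size image by keeping every second pixel (a crude resize)."""
    return [row[::2] for row in img[::2]]

def build_scale_space(img, octaves=4, scales=5, sigma0=1.0, k=math.sqrt(2)):
    """Return a list of octaves; each octave is a list of blurred images."""
    pyramid = []
    base = img
    for _ in range(octaves):
        pyramid.append([gaussian_blur(base, sigma0 * k ** s)
                        for s in range(scales)])
        base = halve(base)  # the next octave starts from a half-size image
    return pyramid

# Synthetic 32x32 checkerboard standing in for a real photograph.
img = [[float(((x // 4) + (y // 4)) % 2) for x in range(32)] for y in range(32)]
space = build_scale_space(img)
for o, octave in enumerate(space):
    print(f"octave {o}: {len(octave)} images of "
          f"{len(octave[0])}x{len(octave[0][0])}")
```

With a 32x32 input, the four octaves hold images of 32x32, 16x16, 8x8 and 4x4 pixels, five per octave. A real implementation would normally use a proper resize and an optimised blur routine instead of these pure-Python loops.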

The technical details

Now that you know things the intuitive way, I’ll get into a few technical details.

Octaves and Scales

The number of octaves and scales depends on the size of the original image. When programming SIFT, you have to decide for yourself how many octaves and scales you want. However, the creator of SIFT suggests that 4 octaves and 5 blur levels work well for the algorithm.

The first octave

If the original image is doubled in size and antialiased a bit (by blurring it), the algorithm produces roughly four times as many keypoints. The more keypoints, the better!

Blurring

Mathematically, “blurring” is the convolution of the Gaussian operator with the image. The Gaussian blur has a particular expression, or “operator”, that is applied to each pixel. The result is the blurred image:

L(x, y, σ) = G(x, y, σ) * I(x, y)

The symbols:

  • L is a blurred image
  • G is the Gaussian Blur operator
  • I is an image
  • x,y are the location coordinates
  • σ is the “scale” parameter. Think of it as the amount of blur. The greater the value, the greater the blur.
  • The * is the convolution operation in x and y. It “applies” the Gaussian blur G to the image I.

This is the actual Gaussian blur operator:

G(x, y, σ) = (1 / (2πσ²)) · exp(-(x² + y²) / (2σ²))
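To make the convolution concrete, here is a minimal sketch that samples the Gaussian on a small grid and applies it to an image stored as a list of lists. The function names, the 3σ cutoff for the kernel size, and the edge handling are assumptions made for the example; the kernel is normalised by its own sum, so the constant factor in front of the exponential is not written out.

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Sample exp(-(x^2 + y^2) / (2 sigma^2)) on a grid and normalise it."""
    if radius is None:
        radius = max(1, math.ceil(3 * sigma))  # 3-sigma cutoff: an assumption
    kernel = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
              for y in range(-radius, radius + 1)]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]

def convolve(image, kernel):
    """Compute L = G * I, repeating the edge pixels outside the image."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(-r, r + 1):
                for i in range(-r, r + 1):
                    yy = min(max(y - j, 0), h - 1)
                    xx = min(max(x - i, 0), w - 1)
                    acc += kernel[j + r][i + r] * image[yy][xx]
            out[y][x] = acc
    return out

# A single bright pixel in a 7x7 image spreads into a smooth blob.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
blurred = convolve(image, gaussian_kernel(sigma=1.0))
for row in blurred:
    print(" ".join(f"{v:.3f}" for v in row))
```

The printed blob is brightest at the centre and falls off symmetrically. In practice a library routine such as OpenCV's `cv2.GaussianBlur` would be used instead of these nested loops.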

Amount of blurring

The amount of blurring in each image is important. It goes like this: assume the amount of blur in a particular image is σ. Then the amount of blur in the next image is k·σ, where k is a constant you choose.

This is a table of σ’s for my current example. See how each σ differs from the previous one by a factor of sqrt(2).
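A table of this kind can be generated with a few lines of Python. The values below are not the article's own table: the starting blur `base_sigma` is a hypothetical choice, and starting each octave at twice the previous octave's first σ is an assumption of this sketch. Only the rule "next σ = k·σ" with k = sqrt(2) comes from the text.

```python
import math

def sigma_table(base_sigma=1.0, k=math.sqrt(2), octaves=4, scales=5):
    """Return one row of sigma values per octave.

    Within a row, each sigma is k times the previous one. The first
    sigma of each octave is double that of the previous octave
    (an assumption made for this sketch).
    """
    table = []
    for o in range(octaves):
        first = base_sigma * (2 ** o)
        table.append([first * k ** s for s in range(scales)])
    return table

for o, row in enumerate(sigma_table()):
    print(f"octave {o}: " + "  ".join(f"{s:8.4f}" for s in row))
```

Because k² = 2, the third σ in each row equals the first σ of the next row under these assumptions.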

Summary

In the first step of SIFT, you generate several octaves from the original image. The images in each octave are half the size of those in the previous octave. Within an octave, images are progressively blurred using the Gaussian blur operator.

In the next step, we’ll use all these octaves to generate Difference of Gaussian images.

This article is a part of the article series on SIFT: Scale Invariant Feature Transform.
