Computer Vision: The SIFT Algorithm - CSDN Blog

Article link: https://blog.csdn.net/Jiawen_/article/details/115282038

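Contents

Preface
I. Characteristics of the SIFT Algorithm
II. The Essence of the SIFT Algorithm
III. Main Steps of SIFT Feature Matching
IV. Visualizing the Extracted Keypoints
V. Matching Geotagged Images
- 1. Source code
- 2. Results
Summary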

Preface

SIFT (Scale-Invariant Feature Transform) is a local feature descriptor used in image processing. It detects keypoints in an image and describes them in a way that is invariant to scale.

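I. Characteristics of the SIFT Algorithm

1. Stability and invariance: the features tolerate rotation, scaling, and brightness changes, and are robust to a certain degree against viewpoint changes, affine transformations, and noise.
2. Distinctiveness: features can be matched quickly and accurately against a very large feature database.
3. Quantity: even a single object yields a large number of feature vectors.
4. Speed: feature vectors can be matched quickly.
5. Extensibility: the features can be combined with other kinds of feature vectors.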

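II. The Essence of the SIFT Algorithm

SIFT searches for keypoints across different scale spaces and computes an orientation for each keypoint.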

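III. Main Steps of SIFT Feature Matching

1. Extracting keypoint location and scale

This step relies mainly on the image pyramid and the image scale space.
1. Image pyramid
The image is repeatedly downsampled. Stacked by size, the resulting images look like a pyramid, hence the name.
2. Image scale space
The image is convolved with Gaussian kernels of different scales (σ), producing a version of the image at each Gaussian scale (σ).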
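
The pyramid and the scale space described above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the VLFeat implementation: the function names (`gaussian_kernel`, `gaussian_blur`, `build_scale_space`) and the parameter values (`sigma0=1.6`, four scales per octave, three octaves) are chosen only for the example, and each octave's σ values are relative to that octave's own image.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image, sigma):
    """Blur a 2-D grayscale image with a separable Gaussian (edge-padded)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(np.asarray(image, dtype=float), radius, mode="edge")
    # Convolve every row, then every column, with the same 1-D kernel.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def build_scale_space(image, num_octaves=3, scales_per_octave=4, sigma0=1.6):
    """Return a list of octaves; each octave is a list of (sigma, blurred image)."""
    # Assumed spacing: sigma doubles from the first to the last scale of an octave.
    k = 2.0 ** (1.0 / (scales_per_octave - 1))
    octaves = []
    current = np.asarray(image, dtype=float)
    for _ in range(num_octaves):
        octave = [(sigma0 * k ** s, gaussian_blur(current, sigma0 * k ** s))
                  for s in range(scales_per_octave)]
        octaves.append(octave)
        # Pyramid step: halve the most blurred image to start the next octave.
        current = octave[-1][1][::2, ::2]
    return octaves
```

For a 32×32 input, `build_scale_space(image)` would return three octaves holding images of 32×32, 16×16, and 8×8 pixels, each blurred at four values of σ.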

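The procedure is as follows:
1. Subtract grayscale images of the same size at adjacent Gaussian scales to obtain difference-of-Gaussian (DoG) images, in which edge features stand out clearly.
2. Detect keypoints on the DoG images.

A pixel is taken as a keypoint if its value is larger than, or smaller than, all 26 of its neighbours: the 8 surrounding pixels in its own DoG image plus the 18 corresponding pixels (9 each) in the two adjacent DoG images.
The scale of a keypoint comes from two things: (1) the size ratio between the DoG image in which it was detected and the original image, and (2) the Gaussian scale of that DoG image.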
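
The two steps above, subtraction followed by the 26-neighbour test, can be sketched as follows. This is an illustrative sketch, assuming the blurred images of one octave are already stacked into an array of shape (scales, height, width); `difference_of_gaussians` and `find_extrema` are hypothetical helper names, and the contrast and edge filtering that a full SIFT implementation applies afterwards is left out.

```python
import numpy as np


def difference_of_gaussians(blurred):
    """Subtract adjacent same-size blurred images: (S, H, W) -> (S - 1, H, W)."""
    blurred = np.asarray(blurred, dtype=float)
    return blurred[1:] - blurred[:-1]


def find_extrema(dog):
    """Return (scale, row, col) of pixels that are strictly larger or strictly
    smaller than all 26 neighbours in the 3 x 3 x 3 block around them."""
    dog = np.asarray(dog, dtype=float)
    keypoints = []
    scales, height, width = dog.shape
    for s in range(1, scales - 1):
        for r in range(1, height - 1):
            for c in range(1, width - 1):
                block = dog[s - 1:s + 2, r - 1:r + 2, c - 1:c + 2]
                center = dog[s, r, c]
                # Index 13 of the flattened 27-element block is the center itself.
                neighbours = np.delete(block.ravel(), 13)
                if center > neighbours.max() or center < neighbours.min():
                    keypoints.append((s, r, c))
    return keypoints
```

The triple loop is written for readability rather than speed. The returned scale index identifies the DoG image, which, together with the octave, gives the keypoint's scale as described above.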

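2. Extracting keypoint orientation

Once the keypoints have been selected, their orientation must also be extracted. Within a square region around the keypoint, whose side length is a coefficient × the Gaussian scale (σ), the gradient direction at each pixel is computed.
The directions are accumulated into 8 bins at 45° intervals, and the bin with the most votes gives the keypoint's dominant orientation. (Lowe's original SIFT uses a finer histogram of 36 bins, 10° each, for this step.)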
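
A minimal sketch of this orientation vote is shown below. It assumes the square region has already been cut out as `patch`; the function name `dominant_orientation` is hypothetical, and weighting each vote by the gradient magnitude is an assumption of this sketch (the text above only speaks of counting directions).

```python
import numpy as np


def dominant_orientation(patch, num_bins=8):
    """Return (center angle in degrees of the strongest bin, histogram)."""
    patch = np.asarray(patch, dtype=float)
    dy, dx = np.gradient(patch)  # derivatives along rows, then along columns
    magnitude = np.hypot(dx, dy)
    angle = np.degrees(np.arctan2(dy, dx)) % 360.0
    bin_width = 360.0 / num_bins
    # Shift by half a bin so that bin 0 is centered on 0 degrees.
    bins = np.floor((angle + bin_width / 2.0) / bin_width).astype(int) % num_bins
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=num_bins)
    return np.argmax(hist) * bin_width, hist
```

With `num_bins=8` the bins are 45° wide, matching the description above; passing `num_bins=36` gives 10° bins.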

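3. Feature extraction summary

With the location, scale, and orientation of each keypoint known, SIFT describes the keypoint relative to its scale and orientation, so the resulting feature vector is invariant to scale and rotation.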

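4. Feature description

Each keypoint is described relative to its scale and orientation.

Scale: it determines the size of the region being described, which is a coefficient × the scale.

Orientation:

The keypoint's dominant orientation is taken as the X axis of the description frame. Each of the four quadrants of this coordinate system is divided into 2×2 cells, and the directions of gray-value change (gradient directions) within each cell are accumulated.

For each cell, the directions are binned into 8 directions and the counts are normalized, giving an 8-dimensional vector.

The four quadrants contain 4×4 = 16 cells in total, and 16 × 8 = 128, so each SIFT keypoint is finally represented by a 128-dimensional vector.

(When the description frame, rotated to the dominant orientation, does not line up with the pixel grid, for example at a dominant orientation of 45°, the gray values and directions at some positions have to be obtained by interpolation.)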
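
The 4×4×8 = 128 layout can be illustrated with a simplified sketch. It assumes a square patch (16×16 here) already scaled to the keypoint, and it only measures gradient angles relative to the dominant orientation instead of rotating and interpolating the sampling grid, so it is not a full SIFT descriptor. The name `describe_patch` and its parameters are hypothetical.

```python
import numpy as np


def describe_patch(patch, main_orientation_deg=0.0, grid=4, num_bins=8):
    """Build a grid * grid * num_bins descriptor (4 * 4 * 8 = 128) from a patch."""
    patch = np.asarray(patch, dtype=float)
    dy, dx = np.gradient(patch)
    magnitude = np.hypot(dx, dy)
    # Measure gradient directions relative to the keypoint's dominant orientation.
    angle = (np.degrees(np.arctan2(dy, dx)) - main_orientation_deg) % 360.0
    bins = np.floor(angle / (360.0 / num_bins)).astype(int) % num_bins

    cell = patch.shape[0] // grid  # side length of one cell in pixels
    histograms = []
    for i in range(grid):
        for j in range(grid):
            rows = slice(i * cell, (i + 1) * cell)
            cols = slice(j * cell, (j + 1) * cell)
            # One 8-bin histogram per cell, weighted by gradient magnitude.
            hist = np.bincount(bins[rows, cols].ravel(),
                               weights=magnitude[rows, cols].ravel(),
                               minlength=num_bins)
            histograms.append(hist)
    descriptor = np.concatenate(histograms)
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor
```

Normalizing the whole vector to unit length, rather than each cell separately, is a choice made in this sketch; it keeps the descriptor comparable under uniform contrast changes.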

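5. Feature matching

Because the elements of a feature vector are floating-point numbers, the distance between two vectors is measured with the Euclidean distance.

When the distance falls below a threshold, the two keypoints are considered a match.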
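
The thresholded Euclidean matching described above could look like the sketch below. The function name `match_by_distance` and the default `max_distance=0.5` are assumptions for illustration; a suitable threshold depends on how the descriptors are normalized.

```python
import numpy as np


def match_by_distance(desc1, desc2, max_distance=0.5):
    """For each row of desc1, return the index of the nearest row of desc2,
    or -1 when the nearest Euclidean distance is not below max_distance."""
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    # Pairwise differences via broadcasting: shape (len(desc1), len(desc2), dim).
    diff = desc1[:, None, :] - desc2[None, :, :]
    distances = np.sqrt((diff ** 2).sum(axis=2))
    nearest = distances.argmin(axis=1)
    nearest_distance = distances[np.arange(len(desc1)), nearest]
    return np.where(nearest_distance < max_distance, nearest, -1)


desc2 = [[1.0, 0.0], [0.0, 1.0]]
desc1 = [[0.9, 0.1], [0.0, 1.0], [5.0, 5.0]]
print(match_by_distance(desc1, desc2))  # [ 0  1 -1]
```

Using -1 for "no match" keeps index 0 available as a legitimate match.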

IV. Visualizing the Extracted Keypoints

import os

from numpy import (arange, arccos, argsort, array, concatenate, cos, dot,
                   linalg, loadtxt, pi, sin, zeros)
from PIL import Image
from pylab import axis, figure, gray, imshow, plot, show


def process_image(imagename, resultname, params="--edge-thresh 10 --peak-thresh 5"):
    """Process an image and save the results to a file."""

    if imagename[-3:] != 'pgm':
        # Create a pgm file
        im = Image.open(imagename).convert('L')
        im.save('tmp.pgm')
        imagename = 'tmp.pgm'

    cmmd = str("C:/Users/Administrator/Desktop/vlfeat-0.9.20-bin/vlfeat-0.9.20/bin/win64/sift.exe " + imagename + " --output=" + resultname + " " + params)
    os.system(cmmd)
    print('processed', imagename, 'to', resultname)


def read_features_from_file(filename):
    """Read feature properties and return them in matrix form."""

    f = loadtxt(filename)
    return f[:, :4], f[:, 4:]  # feature locations, descriptors

def plot_features(im, locs, circle=False):
    """Show an image with its features.
        Input: im (image as an array), locs (location, scale, and orientation angle of each feature)."""

    def draw_circle(c,r):
        t = arange(0,1.01,.01)*2*pi
        x = r*cos(t) + c[0]
        y = r*sin(t) + c[1]
        plot(x,y,'b',linewidth=2)

    imshow(im)
    if circle:
        for p in locs:
            draw_circle(p[:2],p[2])
    else:
        plot(locs[:,0],locs[:,1],'ob')
    axis('off')
    return

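def match(desc1, desc2):
    """For each descriptor in the first image, select its match in the second image.
        Input: desc1 (descriptors of the first image), desc2 (descriptors of the second image)."""

    # Normalize every descriptor to unit length
    desc1 = array([d/linalg.norm(d) for d in desc1])
    desc2 = array([d/linalg.norm(d) for d in desc2])

    dist_ratio = 0.6
    desc1_size = desc1.shape

    matchscores = zeros((desc1_size[0],1), 'int')
    desc2t = desc2.T    # precompute the matrix transpose
    for i in range(desc1_size[0]):
        dotprods = dot(desc1[i,:], desc2t) # vector dot products
        dotprods = 0.9999*dotprods
        # Inverse cosine gives the angles; argsort returns the indices of
        # the features in the second image, nearest first
        angles = arccos(dotprods)
        index = argsort(angles)

        # Check whether the nearest neighbor's angle is less than dist_ratio times the second nearest neighbor's angle
        if angles[index[0]] < dist_ratio * angles[index[1]]:
            matchscores[i] = int(index[0])

    return matchscores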

def match_twosided(desc1, desc2):
    """Two-sided symmetric version of match()."""

    matches_12 = match(desc1, desc2)
    matches_21 = match(desc2, desc1)

    ndx_12 = matches_12.nonzero()[0]

    # Remove matches that are not symmetric
    for n in ndx_12:

        if matches_21[int(matches_12[n, 0]), 0] != n:
            matches_12[n] = 0

    return matches_12

def appendimages(im1, im2):
    """Return a new image that places the two images side by side."""

    # Select the image with the fewest rows and pad it with enough empty rows
    row1 = im1.shape[0]
    row2 = im2.shape[0]

    if row1 < row2:
        im1 = concatenate((im1,zeros((row2-row1,im1.shape[1]))), axis=0)
    elif row1 > row2:
        im2 = concatenate((im2,zeros((row1-row2,im2.shape[1]))), axis=0)

    # Otherwise the two images have the same number of rows and no padding is needed

    return concatenate((im1,im2), axis=1)


if __name__ == '__main__':
    imname = 'image/yankui1.jpg'
    siftname = 'image/yankui1.sift'  # write and read the same feature file
    im1 = array(Image.open(imname).convert('L'))
    process_image(imname, siftname)
    l1, d1 = read_features_from_file(siftname)

    figure()
    gray()
    plot_features(im1, l1, circle=True)
    show()

V. Matching Geotagged Images

1. Source code

The source code is as follows (example):

from pylab import *
from PIL import Image
from PCV.localdescriptors import sift
from PCV.tools import imtools
import pydot

""" This is the example graph illustration of matching images from Figure 2-10.
To download the images, see ch2_download_panoramio.py."""


download_path = "./image"  # set this to the path where you downloaded the panoramio images
path = "./image"  # path to save thumbnails (pydot needs the full system path)


imlist = imtools.get_imlist(download_path)
nbr_images = len(imlist)


featlist = [imname[:-3] + 'sift' for imname in imlist]
for i, imname in enumerate(imlist):
    sift.process_image(imname, featlist[i])

matchscores = zeros((nbr_images, nbr_images))

for i in range(nbr_images):
    for j in range(i, nbr_images):  # only compute upper triangle
        print('comparing ', imlist[i], imlist[j])
        l1, d1 = sift.read_features_from_file(featlist[i])
        l2, d2 = sift.read_features_from_file(featlist[j])
        matches = sift.match_twosided(d1, d2)
        nbr_matches = sum(matches > 0)
        print('number of matches = ', nbr_matches)
        matchscores[i, j] = nbr_matches
print("The match scores are: \n", matchscores)

# copy values
for i in range(nbr_images):
    for j in range(i + 1, nbr_images):  # no need to copy diagonal
        matchscores[j, i] = matchscores[i, j]

# visualization

threshold = 2  # min number of matches needed to create link

g = pydot.Dot(graph_type='graph')  # don't want the default directed graph

for i in range(nbr_images):
    for j in range(i + 1, nbr_images):
        if matchscores[i, j] > threshold:
            # first image in pair
            im = Image.open(imlist[i])
            im.thumbnail((100, 100))
            filename = path + str(i) + '.png'
            im.save(filename)  # need temporary files of the right size
            g.add_node(pydot.Node(str(i), fontcolor='transparent', shape='rectangle', image=filename))

            # second image in pair
            im = Image.open(imlist[j])
            im.thumbnail((100, 100))
            filename = path + str(j) + '.png'
            im.save(filename)  # need temporary files of the right size
            g.add_node(pydot.Node(str(j), fontcolor='transparent', shape='rectangle', image=filename))

            g.add_edge(pydot.Edge(str(i), str(j)))
g.write_png('whitehouse.png')