Programming Computer Vision with Python, Chapter 2: Local Image Descriptors

LuoY

Last modified: 2022-09-28 11:28:10

First published: 2022-09-25 14:08:15

Article link: https://blog.csdn.net/qq_59026468/article/details/126347093

Chapter 2: Local Image Descriptors

2.1 Harris Corner Detector
2.2 SIFT (Scale-Invariant Feature Transform)
2.3 Matching Geotagged Images

2.1 Harris Corner Detector

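The main idea of the algorithm is that if the neighborhood of a pixel contains edges in more than one direction, we treat that pixel as an interest point, and such a point is called a corner.
We define the symmetric positive semi-definite matrix $M_{1}=M_{1}(x)$ at a point x in the image domain as
$M_{1}=\nabla I\,\nabla I^{T}=\begin{bmatrix} I_{x}\\ I_{y} \end{bmatrix}\begin{bmatrix} I_{x} &I_{y} \end{bmatrix}=\begin{bmatrix} I_{x}^{2} &I_{x}I_{y} \\ I_{x}I_{y} & I_{y}^{2} \end{bmatrix}$
where $\nabla I$ is the image gradient containing the derivatives $I_{x}$ and $I_{y}$. By construction, $M_{1}$ has rank 1, with eigenvalues $\lambda_{1}=\left|\nabla I\right|^{2}$ and $\lambda_{2}=0$. This matrix can be computed at every pixel of the image.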
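The rank and eigenvalue claim is easy to check numerically. The snippet below is a small sketch with a made-up gradient value (I_x = 3, I_y = 4 are my own illustration, not taken from any image):

```python
import numpy as np

# Hypothetical gradient at a single pixel: I_x = 3, I_y = 4
grad = np.array([3.0, 4.0])

# M_1 is the outer product of the gradient with itself
M1 = np.outer(grad, grad)

# eigvalsh handles symmetric matrices and returns eigenvalues in ascending order
eigenvalues = np.linalg.eigvalsh(M1)
rank = np.linalg.matrix_rank(M1)

print(M1)
print("eigenvalues:", eigenvalues)        # close to 0 and |grad|^2 = 25
print("rank:", rank)
```

One eigenvalue equals the squared gradient magnitude and the other vanishes, which is why a single pixel cannot reveal a corner and the local averaging step is needed.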
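Choosing a weight matrix W (typically a Gaussian filter $G_{\sigma}$), we obtain the convolution:
$\overline{M_{1}}=W*M_{1}$
The purpose of this convolution is to obtain a local average of $M_{1}$ over the neighboring pixels. The resulting matrix $\overline{M_{1}}$ is also called the Harris matrix. The width of W determines the region of interest around pixel x.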
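To make the averaging step concrete, here is a minimal sketch of a Harris-style response built with `scipy.ndimage.gaussian_filter`. It is my own illustration rather than the PCV implementation: the function name `harris_response`, the quotient det/trace used as the corner indicator, the small `eps` term and the synthetic test image are all assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(im, sigma=3.0, eps=1e-12):
    """Return det/trace of the Gaussian-averaged matrix at every pixel."""
    im = im.astype(float)
    # Image derivatives from Gaussian derivative filters
    ix = gaussian_filter(im, sigma, order=(0, 1))  # derivative along columns (x)
    iy = gaussian_filter(im, sigma, order=(1, 0))  # derivative along rows (y)
    # Entries of the averaged matrix W * M_1, with W = G_sigma
    wxx = gaussian_filter(ix * ix, sigma)
    wxy = gaussian_filter(ix * iy, sigma)
    wyy = gaussian_filter(iy * iy, sigma)
    det = wxx * wyy - wxy ** 2
    trace = wxx + wyy
    return det / (trace + eps)  # eps avoids division by zero in flat regions

# Synthetic test image: a bright square on a dark background
im = np.zeros((64, 64))
im[20:44, 20:44] = 1.0

response = harris_response(im, sigma=2.0)
row, col = np.unravel_index(np.argmax(response), response.shape)
print("strongest response at row", row, "col", col)
```

On this test image the strongest response lands near a corner of the square, while points along a straight edge score far lower because only one gradient direction is present there.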
Code:

from pylab import *
from PIL import Image
from PCV.localdescriptors import harris

"""
Example of detecting Harris corner points (Figure 2-1 in the book).
"""

# Load the image and convert it to grayscale
im = array(Image.open(r'JMU1.jpg').convert('L'))

# Compute the Harris response image
harrisim = harris.compute_harris_response(im)

# Invert the response values for display
harrisim1 = 255 - harrisim

figure()
gray()

# Plot the original image and the Harris response image
subplot(231)
imshow(im), title('original')
axis('off')
axis('equal')

subplot(233)
imshow(harrisim1),title('harrisim')
axis('off')
axis('equal')

threshold = [0.01, 0.05, 0.1]
for i, thres in enumerate(threshold):
    filtered_coords = harris.get_harris_points(harrisim, 6, thres)
    subplot(2, 3, i+4)
    imshow(im),title(threshold[i])
    print (im.shape)
    plot([p[1] for p in filtered_coords], [p[0] for p in filtered_coords], '*')
    axis('off')

# The book uses the plotting helper from PCV's harris module instead:
# harris.plot_harris_points(im, filtered_coords)

# plot only 200 strongest
# harris.plot_harris_points(im, filtered_coords[:200])

show()

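Result:
(Figure: the original image, the Harris response image, and the corners detected with thresholds 0.01, 0.05 and 0.1.)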

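Analysis:
In the classic Harris response $R=\det(\overline{M_{1}})-\alpha\,\mathrm{trace}(\overline{M_{1}})^{2}$, increasing α lowers the corner response R, which reduces the sensitivity of the detector and the number of corners found; decreasing α raises R, which increases sensitivity and the number of corners found. Separately, raising the detection threshold also removes corners: with thresholds 0.01, 0.05 and 0.1, the number of detected corners decreases in that order.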
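The effect of α can be seen with a few numbers. This sketch assumes the classic response R = det(M) − α·trace(M)² written in terms of the two eigenvalues of the Harris matrix; the eigenvalues and the α values are made up for illustration.

```python
def harris_r(lambda1, lambda2, alpha):
    """Classic Harris response expressed through the eigenvalues of the Harris matrix."""
    det = lambda1 * lambda2
    trace = lambda1 + lambda2
    return det - alpha * trace ** 2

# Hypothetical eigenvalues of a corner-like point (both large)
lambda1, lambda2 = 10.0, 8.0

alphas = [0.04, 0.06, 0.10, 0.20]
responses = [harris_r(lambda1, lambda2, a) for a in alphas]

for a, r in zip(alphas, responses):
    print("alpha = %.2f  ->  R = %.2f" % (a, r))
```

R falls steadily as α grows, so for a fixed threshold fewer points survive as corners.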

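Finding corresponding points between images:
The Harris corner detector only finds interest points in an image; it does not provide a way to compare interest points across images to find matching corners. We need to attach a descriptor to each point and define a way to compare these descriptors.
An interest point descriptor is a vector assigned to an interest point that describes the image appearance around it. The better the descriptor, the better the correspondences found. By point correspondences we mean the pixels formed in different images by the same object or scene point.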
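One simple way to compare two patch descriptors is normalized cross-correlation (NCC). The sketch below is my own minimal version, not the PCV code; the function name `ncc` and the random test patches are assumptions for illustration.

```python
import numpy as np

def ncc(patch1, patch2):
    """Normalized cross-correlation of two equally sized patches, in [-1, 1]."""
    d1 = (patch1 - patch1.mean()) / patch1.std()
    d2 = (patch2 - patch2.mean()) / patch2.std()
    return float(np.mean(d1 * d2))

rng = np.random.default_rng(0)
patch = rng.random((11, 11))       # hypothetical 11x11 image patch
brighter = 2.0 * patch + 0.3       # same patch with changed contrast and brightness
other = rng.random((11, 11))       # an unrelated patch

print("same patch, different brightness:", ncc(patch, brighter))
print("unrelated patch:", ncc(patch, other))
```

Subtracting the mean and dividing by the standard deviation makes the score insensitive to linear brightness changes, but nothing in it compensates for rotation or scale.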
Code:

from pylab import *
from PIL import Image
from PCV.localdescriptors import harris
from PCV.tools.imtools import imresize

im1 = array(Image.open("JMU1.jpg").convert("L"))
im2 = array(Image.open("JMU1.1.jpg").convert("L"))

# Halve the image size to speed up matching
im1 = imresize(im1, (im1.shape[1] // 2, im1.shape[0] // 2))
im2 = imresize(im2, (im2.shape[1] // 2, im2.shape[0] // 2))

wid = 5
harrisim = harris.compute_harris_response(im1, 5)
filtered_coords1 = harris.get_harris_points(harrisim, wid + 1)
d1 = harris.get_descriptors(im1, filtered_coords1, wid)

harrisim = harris.compute_harris_response(im2, 5)
filtered_coords2 = harris.get_harris_points(harrisim, wid + 1)
d2 = harris.get_descriptors(im2, filtered_coords2, wid)

print('starting matching')
matches = harris.match_twosided(d1, d2)

figure()
gray()
harris.plot_matches(im1, im2, filtered_coords1, filtered_coords2, matches)
show()

Result:
(Figure: matches found between the two images using Harris corners and patch descriptors.)

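Analysis:
The result contains many incorrect matches. This is because cross-correlation of image patches is a weak descriptor compared with more modern methods (discussed below). In practice, more robust methods are usually used to handle such correspondences. These descriptors have another problem: they are not invariant to scale or rotation, and the size of the pixel patch used in the algorithm also affects the matching result.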
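Two-sided matching is one inexpensive way to discard some of the bad matches: a pair is kept only if each point is the other's best match. The sketch below shows the idea on a small made-up score matrix; the function name `mutual_matches` and the use of -1 for "no match" are my own choices, not the PCV conventions.

```python
import numpy as np

def mutual_matches(scores):
    """scores[i, j] is the similarity between descriptor i of image 1 and
    descriptor j of image 2. Return, for each i, its match j or -1."""
    forward = np.argmax(scores, axis=1)    # best j for every i
    backward = np.argmax(scores, axis=0)   # best i for every j
    matches = np.full(len(forward), -1)
    for i, j in enumerate(forward):
        if backward[j] == i:               # keep only mutual best matches
            matches[i] = j
    return matches

# Hypothetical similarity scores between 3 descriptors in each image
scores = np.array([[0.9, 0.1, 0.2],
                   [0.8, 0.3, 0.1],
                   [0.2, 0.1, 0.7]])

print(mutual_matches(scores))  # [ 0 -1  2]
```

Descriptor 1 of the first image also prefers descriptor 0 of the second image, but that descriptor prefers someone else, so the one-sided match is dropped.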

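2.2 SIFT (Scale-Invariant Feature Transform)

SIFT features include both an interest point detector and a descriptor. The SIFT descriptor is very robust, which is largely why SIFT features became so successful and popular. Since SIFT appeared, many other methods that use essentially the same descriptor have followed. Today the SIFT descriptor is often combined with many different interest point detectors (in some cases region detectors), and is sometimes even applied densely across the whole image. SIFT features are invariant to scale, rotation and brightness, so they can be matched reliably under changes of 3D viewpoint and in the presence of noise.
Steps of SIFT feature detection: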

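1. Scale-space extrema detection: search over all scales and image locations, using a difference-of-Gaussians function to identify candidate interest points that are invariant to scale and rotation.
2. Keypoint localization: at each candidate location, fit a detailed model to determine position and scale; keypoints are selected according to how stable they are.
3. Orientation assignment: assign one or more orientations to each keypoint location based on the local image gradient directions. All subsequent operations are performed relative to the keypoint's orientation, scale and position, which provides invariance to these transformations.
4. Keypoint description: measure the local image gradients in a neighborhood around each keypoint at the selected scale, and transform them into a representation that tolerates fairly large local shape distortion and illumination changes.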

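Properties of the SIFT algorithm:

1. It is a local image feature that is invariant to rotation, scaling and brightness changes, and remains reasonably stable under viewpoint changes, affine transformations and noise.
2. Distinctiveness: features are information-rich and suited to fast, accurate matching against very large feature databases.
3. Abundance: even a few objects can produce a large number of SIFT features.
4. Speed: an optimized SIFT matching algorithm can even run in real time.
5. Extensibility: SIFT features can easily be combined with other feature vectors.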

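2.2.1 Interest Points

SIFT features use the difference-of-Gaussians function to locate interest points:
$D(X,\sigma )=[G_{k\sigma }(X)-G_{\sigma }(X)]*I(X)=[G_{k\sigma }-G_{\sigma }]*I=I_{k\sigma }-I_{\sigma }$
where $G_{\sigma }$ is the 2D Gaussian kernel introduced in the previous chapter, $I_{\sigma }$ is the grayscale image blurred with $G_{\sigma }$, and $k$ is a constant that determines the separation in scale. Interest points are the maxima and minima of $D(X,\sigma )$ across both image location and scale. These candidate locations are then filtered to remove unstable points: based on criteria such as low contrast or lying on an edge, some candidates are rejected as interest points.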
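The difference-of-Gaussians image for one pair of scales can be sketched in a few lines. This is an illustration only: it handles a single scale pair rather than a full scale-space pyramid, and the function name, the choice k = √2 and the synthetic blob are my own assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def difference_of_gaussians(im, sigma, k=np.sqrt(2)):
    """D = I_{k*sigma} - I_{sigma} for one pair of scales."""
    im = im.astype(float)
    return gaussian_filter(im, k * sigma) - gaussian_filter(im, sigma)

# Synthetic image: one bright Gaussian blob centred at (32, 32)
y, x = np.mgrid[0:65, 0:65]
blob = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (2 * 4.0 ** 2))

dog = difference_of_gaussians(blob, sigma=4.0)

# For a bright blob the stronger blur lowers the peak, so D has a minimum there
peak = np.unravel_index(np.argmin(dog), dog.shape)
print("DoG extremum at", peak)
```

A full detector would repeat this over many scales and keep points that are extrema with respect to their neighbors in both position and scale.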

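2.2.2 Descriptor

The interest point locator discussed above gives the position and scale of each interest point. To achieve rotation invariance, the SIFT descriptor introduces a reference direction based on the direction and magnitude of the image gradient around each point. The SIFT descriptor uses the dominant direction as the reference direction, and the dominant direction is measured with an orientation histogram (weighted by gradient magnitude).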
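A magnitude-weighted orientation histogram can be sketched as follows. This is a simplified illustration: it omits the Gaussian weighting window and the peak interpolation that a full SIFT implementation would use, and the function name, the 36-bin choice and the test patch are assumptions of mine.

```python
import numpy as np

def dominant_orientation(patch, nbr_bins=36):
    """Return the centre (in degrees) of the strongest bin of the
    gradient-orientation histogram, weighted by gradient magnitude."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)                  # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    hist, edges = np.histogram(angle, bins=nbr_bins,
                               range=(0.0, 360.0), weights=magnitude)
    best = np.argmax(hist)
    return 0.5 * (edges[best] + edges[best + 1])

# Test patch: intensity grows equally along x and y, so the gradient points at 45 degrees
ramp = np.add.outer(np.arange(16.0), np.arange(16.0))
theta = dominant_orientation(ramp)
print("dominant orientation:", theta, "degrees")
```

Rotating the patch rotates this reference direction with it, which is what allows the descriptor to be computed in a rotation-normalized frame.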

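2.2.3 Detecting Interest Points

We use the binaries from the open-source VLFeat package to compute SIFT features for an image. VLFeat can be downloaded from http://www.vlfeat.org/, and the binaries run on all major platforms. The VLFeat library is written in C, but we can use the command-line interface it provides.
Code: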

# -*- coding: utf-8 -*-
from PIL import Image
from pylab import *
from PCV.localdescriptors import sift
from PCV.localdescriptors import harris

imname = 'JMU1.jpg'
im = array(Image.open(imname).convert('L'))
sift.process_image(imname, 'empire.sift')
l1, d1 = sift.read_features_from_file('empire.sift')

figure()
gray()
subplot(131)
sift.plot_features(im, l1, circle=False)
title('(a) SIFT features')
subplot(132)
sift.plot_features(im, l1, circle=True)
title('(b) SIFT feature scales shown as circles')

# Detect Harris corners for comparison
harrisim = harris.compute_harris_response(im)

subplot(133)
filtered_coords = harris.get_harris_points(harrisim, 6, 0.1)
imshow(im)
plot([p[1] for p in filtered_coords], [p[0] for p in filtered_coords], '*')
axis('off')
title('(c) Harris corners')

show()

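Result:
(Figure: (a) SIFT features; (b) SIFT feature scales shown as circles; (c) Harris corners.)
Analysis:
The results show that SIFT detection and Harris corner detection behave differently and select different interest points. SIFT detects keypoints with DoG and computes a feature vector over the region around each keypoint. It consists of two operations, detection and description, which return the keypoint information and the descriptors; the keypoints are then drawn on the image.
When running the program you may hit an error saying that empire.sift cannot be found; see http://t.csdn.cn/um5b5 for a fix. If the error persists after following those steps, try the sift.exe file from the w32 folder instead. That is what finally worked for me, although I do not yet know why.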
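As far as I can tell, a missing .sift file usually means the external sift program never ran, so it helps to check first whether the executable can be found at all. The sketch below uses only the standard library; the executable name `sift` and the assumption that it is looked up on the system PATH are mine, so adjust them to your setup.

```python
import shutil

def find_sift_binary(name="sift"):
    """Return the full path of the executable if it is on PATH, otherwise None."""
    return shutil.which(name)

sift_path = find_sift_binary()
if sift_path is None:
    print("sift executable not found on PATH; the .sift file cannot be created")
else:
    print("using sift binary at", sift_path)
```

If this prints that the executable is missing, adding the folder containing sift.exe to PATH (or copying the binary next to the script) is worth trying before anything else.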

2.2.4 Matching Descriptors

Code:

from PIL import Image
from pylab import *
import sys
from PCV.localdescriptors import sift


if len(sys.argv) >= 3:
  im1f, im2f = sys.argv[1], sys.argv[2]
else:
  im1f = 'wangzhe1.jpg'
  im2f = 'wangzhe1.1.jpg'
im1 = array(Image.open(im1f))
im2 = array(Image.open(im2f))

sift.process_image(im1f, 'out_sift_1.txt')
l1, d1 = sift.read_features_from_file('out_sift_1.txt')
figure()
gray()
subplot(121)
sift.plot_features(im1, l1, circle=False)

sift.process_image(im2f, 'out_sift_2.txt')
l2, d2 = sift.read_features_from_file('out_sift_2.txt')
subplot(122)
sift.plot_features(im2, l2, circle=False)

#matches = sift.match(d1, d2)
matches = sift.match_twosided(d1, d2)
print ('{} matches'.format(len(matches.nonzero()[0])))

figure()
gray()

sift.plot_matches(im1, im2, l1, l2, matches, show_below=True)
show()

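Result (different scales):
(Figure: SIFT matches between the original and the enlarged image.)

Result (rotation):
(Figure: SIFT matches between the original and the rotated image.)

Analysis:
The results show that the character model from a mobile game is matched successfully after both rotating and enlarging the model, consistent with SIFT features being invariant to scale, rotation and brightness. (In fact the matches for the rotated case are not ideal: most of the matched descriptors lie in the parts of the picture that were not rotated, since only the hero model itself was rotated. After some research I learned that the hero has animated effects, so its appearance can vary slightly from moment to moment, which hurts the matching. Try photographing the same object from different angles; the results should be considerably better.)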
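SIFT matching is commonly made more robust with a ratio test: a match is accepted only when the nearest descriptor is clearly closer than the second nearest. The sketch below illustrates the idea with Euclidean distance on tiny made-up descriptors; the function name, the ratio of 0.6 and the use of -1 for "no match" are my own assumptions and may differ from what `sift.match` does internally.

```python
import numpy as np

def match_ratio_test(desc1, desc2, ratio=0.6):
    """For each row of desc1, return the index of its match in desc2, or -1
    when the nearest neighbour is not clearly better than the second nearest."""
    matches = np.full(len(desc1), -1)
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        if dists[nearest] < ratio * dists[second]:
            matches[i] = nearest
    return matches

# Hypothetical 3-dimensional descriptors (real SIFT descriptors have 128 dimensions)
desc2 = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
desc1 = np.array([[0.9, 0.1, 0.0],    # clearly closest to row 0 of desc2
                  [0.5, 0.5, 0.0]])   # equally close to rows 0 and 1: ambiguous

matches = match_ratio_test(desc1, desc2)
print(matches)  # [ 0 -1]
```

The ambiguous descriptor is rejected, which is the behavior that removes many false matches in repetitive or low-texture regions.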

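2.3 Matching Geotagged Images

Because Panoramio, the photo-sharing service provided by Google, can no longer be accessed, I chose some images of my own for this exercise.
(Figure: the images selected for this exercise.)

First the environment needs to be configured, which is important.
Step 1: Install Graphviz
Download it from the official site https://graphviz.org/download/ and install it into the project folder. Then open the bin folder of the installation, find dot.exe, and copy its path. Install the graphviz module in PyCharm. (Because of version differences, errors may still occur even when the code is correct; in that case, try running the program with different versions. This mysterious problem troubled me for a long time!)
Step 2: Install pydot
Install the pydot module in PyCharm, then find and open the pydot.py file in the project folder and replace the Graphviz path in its code with the path of the Graphviz bin folder copied above.

Step 3: Run the code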

from pylab import *
from PIL import Image
from PCV.localdescriptors import sift
from PCV.tools import imtools
import pydot

""" This is the example graph illustration of matching images from Figure 2-10.
To download the images, see ch2_download_panoramio.py."""

#download_path = "panoimages"  # set this to the path where you downloaded the panoramio images
#path = "/FULLPATH/panoimages/"  # path to save thumbnails (pydot needs the full system path)

download_path = r"D:\CV\picture"  # set this to the path where you downloaded the panoramio images
path = r"D:\CV\picture\Graphviz\bin"  # path to save thumbnails (pydot needs the full system path)

imlist = imtools.get_imlist(download_path)
nbr_images = len(imlist)

featlist = [imname[:-3] + 'sift' for imname in imlist]
for i, imname in enumerate(imlist):
    sift.process_image(imname, featlist[i])

matchscores = zeros((nbr_images, nbr_images))

for i in range(nbr_images):
    for j in range(i, nbr_images):  # only compute upper triangle
        print('comparing ', imlist[i], imlist[j])
        l1, d1 = sift.read_features_from_file(featlist[i])
        l2, d2 = sift.read_features_from_file(featlist[j])
        matches = sift.match_twosided(d1, d2)
        nbr_matches = sum(matches > 0)
        print('number of matches = ', nbr_matches)
        matchscores[i, j] = nbr_matches
print("The match scores is: \n", matchscores)
for i in range(nbr_images):
    for j in range(i + 1, nbr_images):  # no need to copy diagonal
        matchscores[j, i] = matchscores[i, j]

# Visualization

threshold = 2  # min number of matches needed to create link

g = pydot.Dot(graph_type='graph')  # don't want the default directed graph

for i in range(nbr_images):
    for j in range(i + 1, nbr_images):
        if matchscores[i, j] > threshold:
            # first image in pair
            im = Image.open(imlist[i])
            im.thumbnail((100, 100))
            filename = path + str(i) + '.png'
            im.save(filename)  # need temporary files of the right size
            g.add_node(pydot.Node(str(i), fontcolor='transparent', shape='rectangle', image=filename))

            # second image in pair
            im = Image.open(imlist[j])
            im.thumbnail((100, 100))
            filename = path + str(j) + '.png'
            im.save(filename)  # need temporary files of the right size
            g.add_node(pydot.Node(str(j), fontcolor='transparent', shape='rectangle', image=filename))
            g.add_edge(pydot.Edge(str(i), str(j)))
g.write_png('result.png')

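Result:
(Figure: graph of the linked images with threshold 2.)

Result after changing the threshold from 2 to 5:
(Figure: graph of the linked images with threshold 5.)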

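Analysis:
The dataset contains three scenes photographed under varying illumination, scale and viewing angle. The results show that SIFT, using local descriptors, links images of the same scene fairly successfully, which supports the claim that SIFT features are invariant to scale, rotation and brightness.
However, both the experiment and its results reveal a weakness of SIFT: the larger or more complex the image, the longer the program runs. In this experiment, matching ten images took nearly a minute, which is too slow for real-time use.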
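The effect of the link threshold can be explored without Graphviz by listing which image pairs would be connected. The score matrix below is made up for illustration (it is not the output of my experiment), and `linked_pairs` is a hypothetical helper that mirrors the `matchscores[i, j] > threshold` test in the code above.

```python
import numpy as np

def linked_pairs(matchscores, threshold):
    """Return the image pairs (i, j), i < j, whose match count exceeds threshold."""
    n = matchscores.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if matchscores[i, j] > threshold]

# Hypothetical symmetric match-count matrix for four images
scores = np.array([[50, 12,  3,  0],
                   [12, 60,  4,  1],
                   [ 3,  4, 40,  9],
                   [ 0,  1,  9, 55]])

print(linked_pairs(scores, 2))  # [(0, 1), (0, 2), (1, 2), (2, 3)]
print(linked_pairs(scores, 5))  # [(0, 1), (2, 3)]
```

Raising the threshold removes the weak links first, which is why the graph splits into tighter groups when the threshold goes from 2 to 5.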