第二章：局部图像描述子

Joyrich.

已于 2024-08-26 20:56:38 修改

阅读量921

点赞数 27

文章标签：深度学习计算机视觉人工智能

于 2024-08-23 18:23:10 首次发布

本文链接：https://blog.csdn.net/qq_63029071/article/details/141392929

版权

1 Harris角点检测算法

1.1 Harris角点检测器

Harris角点检测算法（也称Harris & Stephens角点检测器）是一个极为简单的角点检测算法。该算法的主要思想是，如果像素周围显示存在多于一个方向的边，则认为该点为兴趣点，该点也就是称为角点。角点就是极值点，即在某方面属性特别突出的点，是在某些属性上强度最大或者最小的孤立点、线段的终点。

在图像域中点x上的对称半正定矩阵可以定义为：

$M_l=\bigtriangledown I \bigtriangledown I^T=\begin{bmatrix} I_x\\ I_y \end{bmatrix}\begin{bmatrix} I_x & I_y\end{bmatrix}=\begin{bmatrix} I_x^2 & I_xI_y\\ I_xI_y & I_y^2 \end{bmatrix}$

其中▽I为包含导数Ix和Iy的图像梯度。

选择权重矩阵W（通常为高斯滤波器G），可以得到卷积：

$\overline{M}_l=W*M_l$

该卷积的目的是得到M1在周围像素上的局部平均。计算出的矩阵称为Harris矩阵。W的宽度决定了在像素x周围的感兴趣区域。像这样在区域附近对Harris矩阵取平均的原因是特征值会依赖于局部图像特性而变化。如果图像的梯度在该区域变化，那么Harris的第二个特征值将不再是0。如果图像的梯度没有变化，Harris的特征值也不会变化。

取决于该区域▽I的值，Harris矩阵的特征值有三种情况：

如果λ1和λ2都是很大的正数，则该x点为角点；
如果λ1很大，λ2≈0，则该区域内存在一个边，该区域的平均M1的特征值不会变化太大；
如果λ1≈λ2≈0，则该区域内为空。
在不需要实际计算特征值的情况下，引入指示函数：

$det(\overline{M}_l)-ktrace(\overline{M}_l)^2$

为了去除加权常数k，通常使用商数：

$\frac{det(\overline{M}_l)}{trace\overline{M}_l^2}$

作为指示器。

算法的核心是利用局部窗口在图像上进行移动，判断灰度是否发生较大的变化。如果窗口内的灰度值（在梯度图上）都有较大的变化，那么这个窗口所在区域就存在角点。
这样就可以将 Harris 角点检测算法分为以下三步：

当窗口（局部区域）同时向 xx （水平）和 yy（垂直）两个方向移动时计算窗口内部的像素值变化量 E(x,y)E(x,y) ；
对于每个窗口，都计算其对应的一个角点响应函数R；
然后对该函数进行阈值处理，如果 R>threshold，表示该窗口对应一个角点特征。

1.2 Harris角点检测代码

在这里需要使用到scipy.ndimage.filters模块中的高斯导数滤波器来计算导数，因为需要在角点检测过程中抑制噪声强度。

首先将角点响应函数添加到harris.py文件中，该函数使用高斯导数实现。同样地，参数σ定义了使用的高斯滤波器的尺度大小。

def compute_harris_response(im,sigma=3):
    #计算导数
    imx=zeros(im.shape)
    filters.gaussian_filter(im,(sigma,sigma),(0,1),imx)
    imy=zeros(im.shape)
    filters.gaussian_filter(im,(sigma,sigma),(1,0),imy)
    #计算Harris矩阵的分量
    Wxx=filters.gaussian_filter(imx*imx,sigma)
    Wxy=filters.gaussian_filter(imx*imy,sigma)
    Wyy=filters.gaussian_filter(imy*imy,sigma)
    #计算特征值和迹
    Wdet=Wxx*Wyy-Wxy**2
    Wtr=Wxx+Wyy

    return Wdet/Wtr

上面的函数返回像素值为Harris响应函数值的一幅图像。然后需要从中挑选出像素值高于阈值的所有图像点，再加上额外的限制，即角点之间的间隔必须大于设定的最小距离。这种方法会产生很好的角点检测结果。

为了实现该算法，我们获取所有的候选像素点，以角点响应值递减的顺序排序，然后将距离已标记为角点位置过近的区域从候选像素点中删除。将下面的函数添加到harris.py中：

def get_harris_points(harrisim,min_dist=10,threshold=0.1):

    corner_threshold=harrisim.max()*threshold
    harrisim_t=(harrisim>corner_threshold)*1

    coords=array(harrisim_t.nonzero()).T

    candidate_values=[harrisim[c[0],c[1]] for c in coords]

    index=argsort(candidate_values)

    allowed_locations=zeros(harrisim.shape)
    allowed_locations[min_dist:-min_dist,min_dist:-min_dist]=1

    filtered_coords=[]
    for i in index:
        if allowed_locations[coords[i,0],coords[i,1]]==1:
            filtered_coords.append(coords[i])
            allowed_locations[(coords[i,0]-min_dist):(coords[i,0]+min_dist),
            (coords[i,1]-min_dist):(coords[i,1]+min_dist)]=0
    return filtered_coords

为了显示图像中的角点，可以使用Matplotlib模块绘制函数

def plot_harris_points(image, filtered_coords)：
	figure()
	gray()
	imshow(image)
	plot([p[1] for p in filtered_coords], [p[0] for p in filtered_coords], '*')
	axis('off')
	show()

这样准备工作就已经完成，接下来运行下边这段代码，打开灰度图，计算响应函数并基于响应值选择角点，最后在原始图像中覆盖绘制检测出的角点

im=array(Image.open('1.jpg').convert('L'))
harrisim=imtool.compute_harris_response(im)
filtered_coords = imtool.get_harris_points(harrisim,6)
imtool.plot_harris_points(im,filtered_coords)

1.3 在图像间寻找对应点

Harris角点检测器仅仅能够检测出图像中的兴趣点，但是没有给出通过比较图像间的兴趣点来寻找匹配角点的方法。所以就需要在每个点添加一个描述子并给出一个比较这些描述子的方法。

兴趣点描述子是分配给兴趣点的一个向量，描述该点附近的图像的表观信息。描述子越好，寻找到的对应点越好。我们用对应点或者点的对应来描述相同物体和场景点在不同图像上形成的像素点。

Harris角点的描述子通常是由周围图像像素块的灰度值，以及用于比较的归一化互相关矩阵构成的。图像的像素块由以该像素点为中心的周围矩形部分图像构成。

为了获取图像像素块，并使用归一化的互相关矩阵来比较它们，需要将以下两个函数添加到harris.py中：

def get_descriptors(image,filtered_coords,wid=5):

    desc = []
    for coords in filtered_coords:
        patch = image[coords[0]-wid:coords[1]+wid+1,
                coords[1]-wid:coords[1]+wid+1].flatten()
        desc.append(patch)
    return desc

def match(desc1,desc2,threshold=0.5):

    n=len(desc1[0])

    d=-ones((len(desc1),len(desc2)))
    for i in range(len(desc1)):
        for j in range(len(desc2)):
            d1=(desc1[i]-mean(desc1[i]))/std(desc1[i])
            d2=(desc2[i]-mean(desc2[i]))/std(desc2[i])
            ncc_value=sum(d1*d2)/(n-1)
            if ncc_value>threshold:
                d[i,j]=ncc_value
    ndx=argsort(-d)
    matchscores=ndx[:,0]
    
    return matchscores

第一个函数的参数为奇数大小长度的方形灰度图像块，该图像块的中心为处理的像素点。该函数将图像块像素值压平成一个向量，然后添加到描述子列表中。第二个函数使用归一化的互相关矩阵，将每个描述子匹配到另一个图像中的最优的候选点。由于数值较高的距离代表两个点能够更好地匹配，所以在排序之前，对距离取相反数。为了获得更稳定的匹配，我们从第二幅图像向第一幅图像匹配，然后过滤掉在两种方法中不都是最好的匹配。用下边的函数实现

def match_twosided(desc1,desc2,threshold=0.5):
	
    matches_12 = match(desc1,desc2,threshold)
    matches_21 = match(desc2,desc1,threshold)

    ndx_12 = where(matches_12 >= 0)[0]

    for n in ndx_12:
        if matches_21[matches_12[n]] != n:
            matches_12[n] = -1
            
    return matches_12

这些匹配可以通过在两边分别绘制出图像，使用线段连接匹配的像素点来直观地可视化。可视化过程由以下代码实现：

def appendimages(im1,im2):

    rows1 = im1.shape[0]
    rows2 = im2.shape[0]

    if rows1 < rows2:
        im1 = concatenate((im1,zeros((rows2-rows1,im1,shape[1]))),axis=0)
    elif rows1>rows2:
        im2 = concatenate((im2,zeros((rows1-rows2,im2.shape[1]))),axis=0)

    return concatenate((im1,im2),axis=1)

def plot_matches(im1,im2,locs1,locs2,matchscores,show_below=True):

    im3 = appendimages(im1,im2)
    if show_below:
        im3 = vstack((im3,im3))

    imshow(im3)

    cols1 = im1.shape[1]
    for i,m in enumerate(matchscores):
        if m>0:
            plot([locs1[i][1],locs2[m][1]+cols1],[locs1[i][0],locs2[m][0]],'c')
    axis('off')

使用归一化互相关矩阵寻找对应点的实例：

im1=array(Image.open('1.jpg').convert('L'))
im2 = array(Image.open('2 .jpg').convert('L'))

wid =5

harrisim= imtool.compute_harris_response(im1,5)
filtered_coords1=imtool.get_harris_points(harrisim,wid+1)
d1=imtool.get_descriptors(im1,filtered_coords1,wid)

harrisim=imtool.compute_harris_response(im2,5)
filtered_coords2=imtool.get_harris_points(harrisim,wid+1)
d2=imtool.get_descriptors(im2,filtered_coords2,wid)

print("start matching")
matches = imtool.match_twosided(d1,d2)

figure()
gray()
imtool.plot_matches(im1,im2,filtered_coords1,filtered_coords2,matches)
show()

从上图可以看出，算法的结果存在一些不正确匹配，因为图像像素块的互相关矩阵具有较弱的描述性，而且描述符不具有尺度不变性和旋转不变性，算法中像素块的大小也会影响匹配的结果。

2 SIFT（尺度不变特征变换）

2.1 兴趣点

SIFT特征使用高斯差分函数来定位兴趣点：

$D(x,\sigma)=[G_{x\sigma}(x)-G_{\sigma}(x)]*I(x)=[G_{x\sigma}-G_{\sigma}]*I=I_{x\sigma}-I_{\sigma}$

其中，Gσ是使用Gσ模糊的灰度图像，κ是决定相差尺度的常数。兴趣点是在图像位置和尺度变化下D(x, σ)的最大值和最小值点。这些候选位置点通过滤波去除不稳定点。基于一些准则，比如认为低对比度和位于边上的点不是兴趣点，可以去除一些候选兴趣点。

2.2 描述子

SIFT描述子使用主方向描述参考方向。主方向使用方向直方图来度量。

为了对图像亮度具有稳定性，SIFT描述子使用图像梯度。SIFT描述子在每个像素点附近选取子区域网格，在每个子区域内计算图像梯度方向直方图。每个子区域的直方图拼接起来自称描述子向量。SIFT描述子的标准设置使用4*4的子区域，每个子区域使用8个小区间的方向直方图，会产生共128个小区间的直方图。

2.3 检测兴趣点

创建sift.py文件，将下面调用可执行文件的函数添加到该文件中：

def process_image(imagename, resultname, params="--edge-thresh 10 --peak-thresh 5"):
    """处理一幅图像，然后将结果保存在文件中"""
    
    if imagename[-3:] != 'pgm':
        #创建一个pgm文件
        im = Image.open(imagename).convert('L')
        im.save('tmp.pgm')
        imagename = 'tmp.pgm'
        
    cmmd= str("sift"+imagename+"--output="+resultname+" "+params)
    os.system(cmmd)
    print('processed',imagename,'to',resultname)

由于该二进制文件需要的图像格式为灰度.pgm，所以如果图像为其他各是，我们需要首先将其转换成.pgm格式文件。其中数据的每一行前4个数值依次表示兴趣点的坐标、尺度和方向角度，后面紧跟着的是对应描述符的128维向量。

下面函数用于从上面输出文件中，将特征读取到Numpy数组中的函数。

def read_features_from_file(filename):
	f = loadtxt(filename)
	return f[:, :4], f[:, 4:]

读取特征后，通过在图像上绘制出它们的位置，可以将其可视化。下面的函数可以实现这个功能：

def plot_features(im, locs, circle=False):
    """显示带有特征的图像
        输入：im（数组图像），locs（每个特征的行、列、尺度和方向角度）"""

    def draw_circle(c,r):
        t = arange(0,1.01,.01)*2*pi
        x = r*cos(t) + c[0]
        y = r*sin(t) + c[1]
        plot(x,y,'b',linewidth=2)

    imshow(im)
    if circle:
        for p in locs:
            draw_circle(p[:2],p[2])
    else:
        plot(locs[:,0],locs[:,1],'ob')
    axis('off')
    return

我们通过下面的命令绘制SIFT特征位置的图像：

imname='1.jpg'
im1=array(Image.open(imname).convert('L'))
sift.process_image(imname,'t.sift')
l1,d1=sift.read_features_from_file('t.sift')
figure()
gray()
sift.plot_features(im1,l1,circle=True)
show()

2.4 匹配描述子

对于将一幅图像中的特征匹配到另一幅图像的特征，一种稳健的准则（同样是由Lowe提出的）是使用者两个特征距离和两个最匹配特征距离的比率。相比于图像中的其他特征，该准则保证能够找到足够相似的唯一特征。使用该方法可以使错误的匹配数降低。下面的代码实现了匹配函数。

def match(desc1, desc2):
    """对于第一幅图像的每个描述子，选取其在第二幅图像中的匹配
        输入：desc1（第一幅图像中的描述子），desc2（第二幅图像中的描述子）"""

    desc1 = array([d/linalg.norm(d) for d in desc1])
    desc2 = array([d/linalg.norm(d) for d in desc2])

    dist_ratio = 0.6
    desc1_size = desc1.shape

    matchscores = zeros((desc1_size[0],1), 'int')
    desc2t = desc2.T    #预先计算矩阵转置
    for i in range(desc1_size[0]):
        dotprods = dot(desc1[i,:], desc2t) #向量点乘
        dotprods = 0.9999*dotprods
        # 反余弦和反排序，返回第二幅图像中特征的索引
        index = argsort(arccos(dotprods))

        # 检查最近邻的角度是否小于dist_ratio乘以第二近邻的角度
        if arccos(dotprods)[index[0]] < dist_ratio * arccos(dotprods)[index[1]]:
            matchscores[i] = int(index[0])

    return matchscores

def match_twosided(desc1,decs2):
    """双向对称版本的match"""

    matches_12 = match(desc1, decs2)
    matches_21 = match(decs2, decs2)

    ndx_12 = matches_12.nonzero()[0]

    # 去除不对称匹配
    for n in ndx_12:

        if matches_21[int(matches_12[n])] != n:
            matches_12[n] = 0

    return matches_12

def appendimages(im1, im2):
    """返回将两幅图像并排拼接成的一幅新图像"""

    # 选取具有最少行数的图像，然后填充足够的空行
    row1 = im1.shape[0]
    row2 = im2.shape[0]

    if row1 < row2:
        im1 = concatenate((im1,zeros((row2-row1,im1.shape[1]))), axis=0)
    elif row1 > row2:
        im2 = concatenate((im2,zeros((row1-row2,im2.shape[1]))), axis=0)

    # 如果这些情况都没有，那么他们的行数相同，不需要进行填充

    return concatenate((im1,im2), axis=1)

def plot_matches(im1, im2, locs1, locs2, matchscores, show_below=True):
    """显示一幅带有连接匹配之间连线的图片
        输入：im1，im2（数组图像），locs1，locs2（特征位置），matchscores（match的输出），
        show_below（如果图像应该显示再匹配下方）"""

    im3 = appendimages(im1,im2)
    if show_below:
        im3 = vstack((im3,im3))

    imshow(im3)

    cols1 = im1.shape[1]
    for i in range(len(matchscores)):
        if matchscores[i] > 0:
            plot([locs1[i, 0], locs2[matchscores[i, 0], 0] + cols1], [locs1[i, 1], locs2[matchscores[i, 0], 1]], 'c')
    axis('off')

两张图片检测到的特征点：

得到的匹配结果如下图所示

通过实验对比可以发现sift相比Harris的检测效果更好，匹配效果也更好，sift匹配除去个别点，其余特征点匹配的都很正确，但是Harris匹配的特征点较为杂乱

3 匹配地理标记图像

3.1 下载地理标记图像

从Panoramio下载图像，其提供了一个API接口

import os
import urllib, urlparse 
import json

# query for images
url = 'http://www.panoramio.com/map/get_panoramas.php?order=popularity&set=public&from=0&to=20&minx=-77.037564&miny=38.896662&maxx=-77.035564&maxy=38.898662&size=medium'
c = urllib.urlopen(url)

# get the urls of individual images from JSON
j = json.loads(c.read()) 
imurls = []
for im in j['photos']:
    imurls.append(im['photo_file_url'])

# download images
for url in imurls:
    image = urllib.URLopener()
    image.retrieve(url, os.path.basename(urlparse.urlparse(url).path)) 
    print 'downloading:', url

3.2 使用局部描述子匹配

对图像进行SIFT特征处理后，将特征保存，使用下列代码进行逐个匹配

import sift
nbr_images = len(imlist)
matchscores = zeros((nbr_images,nbr_images))
for i in range(nbr_images): 
 for j in range(i,nbr_images): # 仅仅计算上三角
 print 'comparing ', imlist[i], imlist[j]
 l1,d1 = sift.read_features_from_file(featlist[i])
 l2,d2 = sift.read_features_from_file(featlist[j])
 matches = sift.match_twosided(d1,d2)
 nbr_matches = sum(matches > 0)
 print 'number of matches = ', nbr_matches
 matchscores[i,j] = nbr_matches
 # 复制值
 for i in range(nbr_images):
 for j in range(i+1,nbr_images): # 不需要复制对角线
 matchscores[j,i] = matchscores[i,j]

3、可视化连接的图像

使用pydot工具包进行可视化连接。

创建一个图，图表示深度为2的树，具有5个分支，将分支添加到分支节点上

当匹配的数目高于一个阈值，使用边来连接相应的图像节点。

from pylab import *
from numpy import *
from PIL import Image

from PCV.localdescriptors import sift
from PCV.tools import imtools

import pydot

download_path = "panoimages" # set this to the path where you downloaded the panoramio images
path = "/FULLPATH/panoimages/" # path to save thumbnails (pydot needs the full system path)

# list of downloaded filenames
imlist = imtools.get_imlist(download_path)
nbr_images = len(imlist)

# extract features
featlist = [imname[:-3]+'sift' for imname in imlist]
for i,imname in enumerate(imlist):
    sift.process_image(imname, featlist[i])

matchscores = zeros((nbr_images,nbr_images))

for i in range(nbr_images):
    for j in range(i,nbr_images): # only compute upper triangle
        print 'comparing ', imlist[i], imlist[j]
        l1,d1 = sift.read_features_from_file(featlist[i]) 
        l2,d2 = sift.read_features_from_file(featlist[j])
        matches = sift.match_twosided(d1,d2)
        nbr_matches = sum(matches > 0)
        print 'number of matches = ', nbr_matches 
        matchscores[i,j] = nbr_matches

# copy values
for i in range(nbr_images):
    for j in range(i+1,nbr_images): # no need to copy diagonal
        matchscores[j,i] = matchscores[i,j]

threshold = 2 # min number of matches needed to create link

g = pydot.Dot(graph_type='graph') # don't want the default directed graph 

for i in range(nbr_images):
    for j in range(i+1,nbr_images):
        if matchscores[i,j] > threshold:
            # first image in pair
            im = Image.open(imlist[i])
            im.thumbnail((100,100))
            filename = path+str(i)+'.png'
            im.save(filename) # need temporary files of the right size 
            g.add_node(pydot.Node(str(i),fontcolor='transparent',shape='rectangle',image=filename))
    
            # second image in pair
            im = Image.open(imlist[j])
            im.thumbnail((100,100))
            filename = path+str(j)+'.png'
            im.save(filename) # need temporary files of the right size 
            g.add_node(pydot.Node(str(j),fontcolor='transparent',shape='rectangle',image=filename)) 
            
            g.add_edge(pydot.Edge(str(i),str(j)))

g.write_png('whitehouse.png')