【python计算机视觉编程——2.局部图像描述子】

方寸星河yu

于 2024-08-26 19:00:01 发布

阅读量1.5k

点赞数 15

分类专栏： python计算机视觉编程文章标签： python 计算机视觉

本文链接：https://blog.csdn.net/m0_53694086/article/details/141571142

版权

python计算机视觉编程专栏收录该内容

10 篇文章 1 订阅

订阅专栏

python计算机视觉编程——2.局部图像描述子

2.局部图像描述子

2.局部图像描述子

2.1 Harris角点检测器

算法步骤
1. 计算图像梯度：首先，计算图像在 x 和 y 方向上的梯度。
2. 构建矩阵：利用梯度信息构建一个矩阵，通常是一个 2x2 的矩阵，来描述局部图像的变化。
3. 计算响应值：通过该矩阵计算角点响应值，用来评估图像中每个点的角点强度。
4. 应用阈值：根据计算出的响应值，筛选出角点，即那些具有较高响应值的点。

from scipy.ndimage import filters
from PIL import Image
from numpy import *
from pylab import *
from scipy.ndimage import filters
import matplotlib.gridspec as gridspec
matplotlib.rcParams['font.family'] = 'sans-serif'
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # 黑体字体

角点响应图像:图像中每个像素的值反映了其角点的强度或重要性。高响应值的区域通常被认为是更重要的角点，实现了一个计算 Harris 角点检测响应的函数。Harris 角点检测是一种用于检测图像中的角点（即局部区域的显著特征点）的方法。

compute_harris_response（）：使用高斯滤波计算图像的梯度，然后利用这些梯度计算 Harris 矩阵的元素，最终计算并返回角点响应图像。响应值高的区域通常是图像中的角点，这些点在角点检测中非常有用。

def compute_harris_response(im,sigma=3):
    imx=zeros(im.shape)  # 创建一个与输入图像 im 形状相同的零数组 imx。这个数组将用于存储图像在 x 方向上的梯度信息。
    # (0,1):表示滤波器在x方向上应用，而在y方向上不应用。这使得 filters.gaussian_filter 计算图像在 x 方向上的梯度。梯度计算结果被存储在 imx 中。
    filters.gaussian_filter(im,(sigma,sigma),(0,1),imx)
    imy=zeros(im.shape)
    filters.gaussian_filter(im,(sigma,sigma),(1,0),imy)

    Wxx=filters.gaussian_filter(imx*imx,sigma) # 计算I_x^2的高斯平滑结果
    Wxy=filters.gaussian_filter(imx*imy,sigma) # 计算I_x\cdot I_y的高斯平滑结果
    Wyy=filters.gaussian_filter(imy*imy,sigma) # 计算I_y^2的高斯平滑结果
    # Wxx,Wxy和Wyy用于构建 Harris 矩阵。
    
    Wdet=Wxx*Wyy-Wxy**2  #Harris矩阵的行列式
    Wtr=Wxx+Wyy          #Harris矩阵的迹
    
    return Wdet/Wtr  # 返回角点响应图像,Harris矩阵的比值,用于表示每个像素点的角点响应强度

get_harris_points（）：用于从角点响应图像中提取角点。角点检测是一种常用的特征点检测方法，用于识别图像中的角点。

def get_harris_points(harrisim, min_dist = 10, threshold = 0.1):
    # 从一幅Harris响应图像中返回角点.min_dist为分割角点和图像边界的最少像素数目
    
    corner_threshold = harrisim.max() * threshold   # 计算角点的响应阈值,阈值由Harris响应图像的最大值和设定的threshold比例决定.这个阈值用于筛选出响应较强的角点
    harrisim_t = (harrisim > corner_threshold) * 1  # 通过比较每个像素的响应值与阈值,生成一个二值图像 harrisim_t,其中响应值大于阈值的点标记为 1，其余点标记为 0。
    
    #提取图像中响应值大于阈值的点的坐标。
    # harrisim_t.nonzero()的结果是(array([621, 621, 621, ..., 926, 927, 927], dtype=int64), array([611, 612, 613, ..., 653, 652, 653], dtype=int64))
    # 第一个数组存的是角点的y坐标,第二个数组存的是角点的x坐标
    coords = array(harrisim_t.nonzero()).T     # nonzero():会返回一个包含非零值的位置索引。(具体看下面例子)

    # 根据角点响应值对角点进行排序,以便优先处理响应值较大的角点
    candidate_values = [harrisim[c[0], c[1]] for c in coords]  # c[1]为x横坐标,c[0]为y纵坐标,(具体看下面例子)
    index = argsort(candidate_values)     #argsort函数返回一个数组,这个数组包含原数组中元素的索引,并按照这些元素的升序排列。

    # 创建一个掩码 allowed_locations,用于记录允许角点存在的位置.最小距离 min_dist 确保角点之间有足够的间隔。
    allowed_locations = zeros(harrisim.shape)
    allowed_locations[min_dist: -min_dist, min_dist: -min_dist] = 1 # 从第 10 行到倒数第 10 行，并从第 10 列到倒数第 10 列的子区域，并将这些区域的所有值设置为 1


    # 遍历排序后的角点坐标，检查每个角点是否在允许的区域内。如果在，则将其添加到 filtered_coords 中，并更新掩码以防止在该角点附近再次选择角点。
    filtered_coords = []
    for i in index:
        if allowed_locations[coords[i, 0], coords[i, 1]] == 1:
            filtered_coords.append(coords[i])
            allowed_locations[(coords[i, 0] - min_dist) : (coords[i, 0] + min_dist), (coords[i, 1] - min_dist) : (coords[i, 1] + min_dist)] = 0
    
    return filtered_coords  #返回角点坐标

def plot_harris_points(image, filtered_coords,ax,threshold=0.1):
    gray()
    ax.imshow(image)
    ax.plot([p[1] for p in filtered_coords], [p[0] for p in filtered_coords], '*')
    ax.axis('off')
    ax.set_title('使用阈值%.2f检测出的角点'%threshold,fontsize=30)
	# show()

im = array(Image.open('sun.jpg').convert('L'))
harrisim = compute_harris_response(im)
# 角点响应图像
figure()
imshow(harrisim)


figure(figsize=(30, 10))
# 创建 GridSpec 对象
gs = gridspec.GridSpec(1,3,width_ratios=[1,1,1])  # width_ratios=[2, 1]:第一列的宽度是第二列的两倍。height_ratios=[1, 2]:第二行的高度是第一行的两倍。
ax1 = subplot(gs[0])
ax2 = subplot(gs[1])
ax3 = subplot(gs[2])

# 阈值为0.01
filtered_coords=get_harris_points(harrisim,6,0.01)
plot_harris_points(im,filtered_coords,ax1,0.01)

# 阈值为0.05
# subplot(132)
filtered_coords=get_harris_points(harrisim,6,0.05)
plot_harris_points(im,filtered_coords,ax2,0.05)

# 阈值为0.1
# subplot(133)
filtered_coords = get_harris_points(harrisim,6)
plot_harris_points(im,filtered_coords,ax3)

在这里插入图片描述

例子：nonzero（）的使用

import numpy as np
# 创建一个示例 Harris 角点响应图像
harrisim_t = np.array([
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0]
])

# 获取非零元素的坐标
coords = np.array(harrisim_t.nonzero()).T

# 打印坐标
for coord in coords:
    print(f"坐标: (y={coord[0]}, x={coord[1]})")
print(harrisim_t[0,3])  # 需要注意的是，原点位于左上角

在这里插入图片描述

在图像间寻找对应点

从给定的图像和角点坐标中提取描述符。它通过从图像中提取以每个特征点为中心的窗口区域，并将这些区域展平成一维数组来生成描述符。

初始化一个空列表desc用于存储描述符。
遍历每个特征点坐标coords
1. 从图像中提取一个以coords为中心的窗口区域。
2. 将窗口区域展平成一维数组。
3. 将展平的数组添加到desc列表中。
返回包含所有描述符的列表 desc。

def get_descriptors(image,filtered_coords,wid=5):
    desc=[]
    height,width = image.shape
    for coords in filtered_coords:
        # 但是当特征点位于图像边缘时，提取窗口可能会超出图像边界。所以需要通过添加边界检查来解决
        y, x = coords
        
        # 计算窗口的边界
        y_min = max(0, y - wid)
        y_max = min(height, y + wid + 1)
        x_min = max(0, x - wid)
        x_max = min(width, x + wid + 1)

        patch=image[y_min:y_max,x_min:x_max].flatten()
        
        desc.append(patch)
    return desc

match（）：用于匹配两个描述符集合的特征点。函数基于归一化互相关 (NCC) 计算匹配度。

def match(desc1,desc2,threshold=0.5):
    n=len(desc1[0])   #每个描述符的长度
    d=-ones((len(desc1),len(desc2)))   # 用于存储每队描述符之间的NCC值的矩阵,初始化为-1(表示未匹配)
    for i in range(len(desc1)):
        for j in range(len(desc2)):
            #对每个描述符进行归一化处理:减去均值并除以标准差
            d1=(desc1[i]-mean(desc1[i]))/std(desc1[i])
            d2=(desc2[j]-mean(desc2[j]))/std(desc2[j])

            #对于每队描述符(i,j),计算其归一化后的NCC值
            ncc_value=sum(d1*d2)/(n-1)
            if ncc_value>threshold:#如果NCC值大于阈值,则将其存储在d矩阵中
                d[i,j]=ncc_value
    #对每一行的NCC值进行排序，取最匹配的描述符（即 NCC 值最大的那个）
    ndx=argsort(-d)  #argsort函数返回一个数组,这个数组包含数组-d(因为加了负号,最大变为了最小,所以下一行代码就直接取第1列元素即可)中元素的索引,并按照这些元素的升序排列。（具体看下面例子）
    matchscores=ndx[:,0]   #每一行记为i,matchscores[i][0]里的元素表示与desc1的第i个描述符最匹配的描述符是desc2中的哪一个
    return matchscores  #包含与 desc1 中每个描述符最匹配的 desc2 中的描述符索引。

match_twosided（）：用于进行双向匹配，以提高匹配的准确性。它首先使用 match 函数从 desc1 匹配到 desc2，然后再从 desc2 匹配回 desc1，确保匹配的对称性。

def match_twosided(desc1,desc2,threshold=0.5):
    matches_12=match(desc1,desc2,threshold)
    matches_21=match(desc2,desc1,threshold)
    ndx_12=where(matches_12>=0)[0]   #where():获取matches_12中所有非负值的索引(因为负值表示的是未匹配,所以不考虑),也就是获取的是与desc2匹配的desc1的索引值（具体看下面例子）
    for n in ndx_12:
        if matches_21[matches_12[n]]!=n:  # matches_12[n]是desc1中第n个描述符在desc2中的匹配索引
            matches_12[n]=-1
    return matches_12

def appendimages(im1,im2):
    rows1=im1.shape[0]
    rows2=im2.shape[0]
    if rows1<rows2:
        im1=concatenate((im1,zeros((rows2-rows1,im1.shape[1]))),axis=0)
    elif rows1>rows2:
        im2=concatenate((im2,zeros((rows1-rows2,im2.shape[1]))),axis=0)
    return concatenate((im1,im2),axis=1)

def plot_matches(im1,im2,locs1,locs2,matchscores,show_below=True):
    im3=appendimages(im1,im2)  # 将两张图像水平拼接成一张新图像
    if show_below:
        im3=vstack((im3,im3))  # 如果 show_below 为 True，将拼接后的图像在垂直方向上再拼接一次
    figure(figsize=(20, 10))
    imshow(im3)
    cols1=im1.shape[1]  # 存储 im1 的宽度，用于计算绘制线条时的水平偏移量。
    for i,m in enumerate(matchscores):  # 会返回一个由索引和值组成的元组
        if m>0:
            plot([locs1[i][1],locs2[m][1]+cols1],[locs1[i][0],locs2[m][0]],'c')
    axis('off')

im1 = array(Image.open('sun1.jpg').convert('L'))
im2 = array(Image.open('sun2.jpg').convert('L'))

wid=5
harrisim=compute_harris_response(im1,5)  # 返回角点响应图像
filtered_coords1=get_harris_points(harrisim,wid+1)  # 返回角点坐标
d1=get_descriptors(im1,filtered_coords1,wid)  #以角点为中心,展开区域生成描述符

harrisim=compute_harris_response(im2,5)
filtered_coords2=get_harris_points(harrisim,wid+1)
d2=get_descriptors(im2,filtered_coords2,wid)

print('starting matching')
matches=match_twosided(d1,d2)

figure()
gray()
plot_matches(im1,im2,filtered_coords1,filtered_coords2,matches)
show()

在这里插入图片描述

例子1：np.where（）的使用

matches_12 = np.array([1, -1, 3, 5, -1])
ndx_12 = np.where(matches_12 >= 0)[0]
print(np.where(matches_12 >= 0))
print(ndx_12)

在这里插入图片描述

例子2：argsort（）的使用

d=array([[1,2,10,7,3],[3,2,11,20,100]])
print(argsort(-d))

在这里插入图片描述

2.2 SIFT（尺度不变特征变换）

2.2.3 检测兴趣点

from PIL import Image
from numpy import *
from pylab import *
import os
import subprocess
matplotlib.rcParams['font.family'] = 'sans-serif'
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # 黑体字体

需要在"https://www.vlfeat.org/download/“上下载"vlfeat-0.9.20-bin.tar.gz”( $\color{red}{一定要下载0.9.20的版本，使用0.9.21的版本生成的sift文件为空！！！}$ )并把目录"bin/win64"中三个文件(sift.exe,vl.dll,vl.lib)拖入与本文件同级的目录中

在这里插入图片描述

处理pgm文件，转为sift文件

def process_image(imagename, resultname, params="--edge-thresh 10 --peak-thresh 5"):
    if imagename[-3:] != 'pgm':
        im = Image.open(imagename).convert('L')
        im.save('tmp.pgm')
        imagename = 'tmp.pgm'
         
    cmmd = str(".\sift.exe " + imagename + " --output=" + resultname + " " + params)
    os.system(cmmd)
    print('processed', imagename, 'to', resultname)

读取兴趣点的坐标、尺度、方向角度和描述子

def read_features_from_file(filename):
    f=loadtxt(filename)
    print("Array shape:", f.shape)  # (1071, 132)
    #前四列是兴趣点的坐标,尺度和方向角度,后128列是用整数数值表示的描述子
    return f[:,:4],f[:,4:]

def write_features_to_file(filename,locs,desc):
    savetxt(filename,hstack((locs,desc)))

用于可视化图像特征点

def plot_features(im,locs,circle=False):
    def draw_circle(c,r):  # 用于绘制圆形。c 是圆心坐标，r 是半径
        t=arange(0,1.01,.01)*2*pi  # 创建角度值，从0到2π（完整的圆）
        x=r*cos(t)+c[0]  # 计算圆形边界的x坐标
        y=r*sin(t)+c[1]  # 计算圆形边界的y坐标
        plot(x,y,'b',linewidth=2)  # 绘制圆形的边界，使用蓝色和线宽2
        imshow(im)
    if circle:
        for p in locs:
            draw_circle(p[:2],p[2])
    else:
        plot(locs[:,0],locs[:,1],'ob')
    axis('off')

imname='sun.jpg'
im1=array(Image.open(imname).convert('L'))
process_image(imname,'sun.sift')
l1,d1=read_features_from_file('sun.sift')

figure(figsize=(30, 10))
gray()
subplot(121)
imshow(im1)
plot_features(im1,l1)

subplot(122)
gray()
plot_features(im1,l1,circle=True)
show()

在这里插入图片描述

2.2.4 匹配描述子

imname1='sun1.jpg'
imname2='sun2.jpg'
im1 = array(Image.open(imname1).convert('L'))
process_image(imname1,'sun1.sift')

im2 = array(Image.open(imname2).convert('L'))
process_image(imname2,'sun2.sift')


l1,d1=read_features_from_file('sun1.sift')
l2,d2=read_features_from_file('sun2.sift')

figure(figsize=(20,10))
gray()
subplot(121)
imshow(im1)
plot_features(im1,l1)
subplot(122)
imshow(im2)
plot_features(im2,l2)

在这里插入图片描述

def match(desc1,desc2):
    desc1=array([d/linalg.norm(d) for d in desc1])
    desc2=array([d/linalg.norm(d) for d in desc2])
    dist_ratio=0.6
    desc1_size=desc1.shape
    matchscores=zeros((desc1_size[0],1),'int')
    desc2t=desc2.T
    for i in range(desc1_size[0]):
        dotprods=dot(desc1[i,:],desc2t)
        dotprods=0.9999*dotprods
        indx=argsort(arccos(dotprods))
        if arccos(dotprods)[indx[0]]<dist_ratio*arccos(dotprods)[indx[1]]:
            matchscores[i]=int(indx[0])
    return matchscores

def match_twosided(desc1,desc2):
    matches_12=match(desc1,desc2)
#     print(matches_12.shape)
    matches_21=match(desc2,desc1)
    ndx_12=matches_12.nonzero()[0] #获取的是与desc2匹配的desc1的索引值
    for n in ndx_12:
        if matches_21[int(matches_12[n])]!=n:  # matches_12[n]是desc1中第n个描述符在desc2中的匹配索引
            matches_12[n]=0
    return matches_12

def appendimages(im1,im2):
    rows1=im1.shape[0]
    rows2=im2.shape[0]
    if rows1<rows2:
        im1=concatenate((im1,zeros((rows2-rows1,im1.shape[1]))),axis=0)
    elif rows1>rows2:
        im2=concatenate((im2,zeros((rows1-rows2,im2.shape[1]))),axis=0)
    return concatenate((im1,im2),axis=1)

def plot_matches(im1,im2,locs1,locs2,matchscores,show_below=True):
    print(locs1.shape,locs2.shape)
    im3=appendimages(im1,im2)  # 将两张图像水平拼接成一张新图像
    if show_below:
        im3=vstack((im3,im3))  # 如果 show_below 为 True，将拼接后的图像在垂直方向上再拼接一次
    figure(figsize=(20, 10))
    imshow(im3)
    cols1=im1.shape[1]  # 存储im1的宽度，用于计算绘制线条时的水平偏移量。
#     print(matchscores)
    for i,m in enumerate(matchscores):  # 会返回一个由索引和值组成的元组
        value=m[0]
        if value>0:
            plot([locs1[i][0],locs2[value][0]+cols1],[locs1[i][1],locs2[value][1]],'c')
    axis('off')

figure()
gray()
print('starting matching')
matches=match_twosided(d1,d2)
plot_matches(im1,im2,l1,l2,matches)
show()

在这里插入图片描述

2.3 匹配地理标记图像

首先需要前往官网graphviz.gitlab.io下载graphviz安装包，安装期间勾选自动添加进环境变量，之后到conda的环境中下载（因为我是在conda的虚拟环境中执行）

conda install graphviz
conda install pydot

import os
from PIL import Image
from numpy import *
import pydot

def get_imlist(path,endIdentifier):
    return [os.path.join(path,f) for f in os.listdir(path) if f.endswith(endIdentifier)]
# 因为文件夹里面有png和jpg文件，统一一起整理
imlist1=get_imlist('panoimages','.png')
imlist2=get_imlist('panoimages','.jpg')
imlist=imlist1+imlist2
# print(len(imlist))

featlist=[]#特征列表
for i in range(0,len(imlist)):
    imagename=imlist[i]
    if imagename.endswith('.png') or imagename.endswith('.jpg'):
        siftname = imagename[:-4] + '.sift'
#     print(imagename,siftname)
    process_image(imagename,siftname)
    featlist.append(siftname)

初始化：设置匹配分数矩阵。
计算分数：遍历所有图像对，计算每对图像的匹配分数。
对称化：确保矩阵的对称性，使得matchscores矩阵反映了图像对之间的对称匹配关系。

通过这些步骤，能够生成一个反映图像匹配情况的对称矩阵。

nbr_images=len(imlist)
matchscores=zeros((nbr_images,nbr_images))
for i in range(nbr_images):
    for j in range(i,nbr_images):
        print('comparing',imlist[i],imlist[j])
        l1,d1=read_features_from_file(featlist[i])
        l2,d2=read_features_from_file(featlist[j])
        matches=match_twosided(d1,d2)
        nbr_matches=sum(matches>0)
        print('number of matched =',nbr_matches)
        matchscores[i,j]=nbr_matches
for i in range(nbr_images):
    for j in range(i+1,nbr_images):
        matchscores[j,i]=matchscores[i,j]

生成一个图片匹配关系的可视化图，将每张图片作为图中的一个节点，根据图片之间的匹配分数决定是否连接这些节点，并将图保存为 PNG 文件。

path="..."#存放缩略图的路径
threshold=2
g=pydot.Dot(graph_type='graph') # 创建一个 pydot.Dot 对象g，用于生成无向图（graph_type='graph'）


# 遍历所有图片对，i 和 j 分别表示图片的索引。
# matchscores[i, j] 是图片 i 和图片 j 之间的匹配分数。
# 如果匹配分数大于 threshold，则进行下一步操作
for i in range(nbr_images):
    for j in range(i+1,nbr_images):
        if matchscores[i,j]>threshold:
            
            im=Image.open(imlist[i])
            im.thumbnail((100,100))  #将图片缩略为 100x100 像素
            filename=str(i)+'.png'
            im.save(filename)
            # 将图片添加到图形 g 中，作为一个节点
            g.add_node(pydot.Node(str(i),fontcolor='transparent',
                                 shape='rectangle',image=path+filename))  
            
            im=Image.open(imlist[j])
            im.thumbnail((100,100))
            filename=str(j)+'.png'
            im.save(filename)
            g.add_node(pydot.Node(str(j),fontcolor='transparent',
                                 shape='rectangle',image=path+filename))
            g.add_edge(pydot.Edge(str(i),str(j)))
g.write_png('whitehouse.png')
# im = array(Image.open('whitehouse.png').convert('L'))
# gray()
# imshow(im)