2024.9.14 Python与图像处理新国大EE5731课程大作业单应性矩阵和RANSAC，拼接透视图

RaidenQ

已于 2024-09-14 15:53:54 修改

阅读量1.4k

点赞数 15

文章标签： python 图像处理课程设计算法

于 2024-09-14 15:53:09 首次发布

本文链接：https://blog.csdn.net/RaidenMessi/article/details/142251428

版权

1.SIFT匹配

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg
import random as rm
import math
import cv2
from tkinter import *
from tkinter import messagebox
from PIL import Image, ImageTk
import random

# 读取图像
im1 = cv2.imread('im01.jpg', cv2.IMREAD_COLOR)
im2 = cv2.imread('im02.jpg', cv2.IMREAD_COLOR)

# 将BGR格式转换为RGB格式
img1 = cv2.cvtColor(im1, cv2.COLOR_BGR2RGB)
img2 = cv2.cvtColor(im2, cv2.COLOR_BGR2RGB)

# 拼接图像
input_images = np.hstack((img1, img2))

# 使用Matplotlib显示图像
plt.imshow(input_images)
plt.axis('off')  # 隐藏坐标轴
plt.show()

代码逻辑：
引入库，以颜色模式加载图片

# mini square
def mini_square(des1,des2):
    [p1,n] = des1.shape
    [p2,n] = des2.shape
    
    # find mini distance position
    # return the matches relationship of both sides
    matches1 = []
    for j in range(0,p1):
        dis = np.zeros([p2,1])
        for k in range(0,p2):
            v = des1[j]-des2[k]
            dis[k] = np.dot(v,v.T)
            
        # find the mini position
        pos = np.argmin(dis)
        # _distance should not be float
        tmp = int(dis[pos])
        
        # store the position in DMatch
        matches1.append(cv2.DMatch(_distance=tmp,_queryIdx=j,_trainIdx=pos,_imgIdx=0))

    matches2 = []
    for j in range(0,p2):
        dis = np.zeros([p1,1])
        for k in range(0,p1):
            v = des2[j]-des1[k]
            dis[k] = np.dot(v,v.T)
            
        # find the mini position
        pos = np.argmin(dis)
        tmp = int(dis[pos])
        
        # store the position in DMatch
        matches2.append(cv2.DMatch(_distance=tmp,_queryIdx=j,_trainIdx=pos,_imgIdx=0))
    
    # matches points that are same in query index and train index
    matches = []
    for i in matches1:
        for j in matches2:
            if i.queryIdx == j.trainIdx and i.trainIdx == j.queryIdx:
                matches.append(i)
    
    return matches

这段代码实现了一个匹配算法 mini_square，用于比较两个特征描述符集合 des1 和 des2，并根据特征之间的距离找到最匹配的点对。该算法的基本思路是通过最小距离匹配，确定哪些特征点在两组特征描述符中相互匹配
代码逻辑：
1.[p1, n] = des1.shape [p2, n] = des2.shape ：获取描述符的维度，其中p1是个数，n为匹配符维度，
2. 找到 des1 中每个描述符到 des2 中所有描述符的最小距离
3. 对于 des1 中的每个描述符 des1[j]，计算它与 des2 中所有描述符 des2[k] 的欧几里得距离的平方（通过向量差的内积计算）。
然后，使用 np.argmin(dis) 找到距离最小的那个描述符的索引 pos，并将对应的距离和匹配关系存入 matches1 列表中。
4.对des2做同样的操作，然后进行双向匹配验证，如果某个匹配对的 queryIdx 和 trainIdx 在两者之间是相互一致的（即 des1 和 des2 之间的匹配是对称的），就将该匹配对存入 matches 列表中。

# Find SIFT and return Homography Matrix
def SIFT(im1, im2):
    # im1 and im2 are grayscale image

    # Initialize SIFT 
    sift=cv2.SIFT_create() 
    
    # turn to grayscale
    im1 = cv2.cvtColor(im1,cv2.COLOR_BGR2GRAY)
    im2 = cv2.cvtColor(im2,cv2.COLOR_BGR2GRAY)

    # Extract keypoints and descriptors
    kp1,des1=sift.detectAndCompute(im1,None)
    kp2,des2=sift.detectAndCompute(im2,None)

    # match the points
    # using DMatch instead of matrix. matrix not so powerful and easily went wrong
    matches = mini_square(des1,des2)
    
    # Mimnum number of matches
    min_matches=8
    if len(matches)>min_matches:
        return matches,kp1,kp2
    else:
        print('Error: Not enough matches')
        exit()

初始化 SIFT，将图像转换为灰度，提取关键点和描述符，用刚才定义的mini_square匹配描述符，如果匹配值小于8个，那就报错g。

# Use SIFT to find keypoints and match the keypoints
[kpm,kp1,kp2] = SIFT(im1, im2)

match_res = cv2.drawMatches(img1, kp1, img2, kp2, kpm, None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
plt.figure( figsize=(15,15) )
plt.imshow(match_res)
plt.show()

# method 2
# plot matching line
mat_images=np.hstack((im1,im2))
line_num = len(kpm)
_,y_com,_ = im1.shape

line_size = 1
thickness = 1 #  0, 2, 4, 8

# draw lines between matching pairs of points
for index in kpm:
    # random color
    bgr = np.random.randint(0, 255, 3, dtype=np.int32)

    # ps-the start of the line
    # pe-the end of the line
    ps = int(index.queryIdx)
    pe = int(index.trainIdx)

    # draw line
    cv2.line(mat_images,[int(kp1[ps].pt[0]),int(kp1[ps].pt[1])],
             [int(kp2[pe].pt[0]+y_com),int(kp2[pe].pt[1])],(int(bgr[0]),int(bgr[1]),int(bgr[2])),thickness)

# show the matching points
cv2.imshow('Matches points Images',mat_images)
cv2.waitKey(0)
cv2.destroyWindow('Matches points Images')
cv2.waitKey(1)

1.通过 SIFT 函数提取两个图像 im1 和 im2 的关键点和描述符，并将匹配结果返回。kpm 是匹配点的集合，kp1 和 kp2 分别是两个图像的关键点列表。
2.绘制匹配点 (方法1：cv2.drawMatches)
3.自定义绘制匹配点连线 (方法2)
在这里插入图片描述
可以看到，大部分还行，但是少部分的匹配是纯在糊弄人，也就是之后我们需要用RANSAC来优化的部分。

2.RANSAC

Random Sample Consensus）是一种用于估计模型参数的迭代方法，特别适合于数据包含大量噪声或离群点的情况。它在计算机视觉、图像处理、机器人定位等领域非常常用。
RANSAC的主要思想：
随机采样：从数据集中随机抽取一个小样本（最少数量的点），用于拟合模型。
拟合模型：使用选取的样本来计算模型的参数。
验证模型：通过将该模型应用于整个数据集，判断有多少点符合这个模型（称为内点），这些内点是接近模型的样本点。
迭代：多次重复上述步骤，每次都可能生成不同的模型，并统计模型的内点数量。
选择最佳模型：最终，选择具有最多内点的模型作为结果。
接下来我们就使用RANSAC来处理我们刚才那些出问题的点。
题目要求：
Compute the best homography matrix using RANSAC, and show all the inlier matches (matches that support your best homography matrix):

# homography computing
def hmat(points):
    row = len(points)
    n_mat = int(row/2)
    A = np.zeros([row,9])
    
    for i in range(0,n_mat):
        h1_x = points[i][0]
        h1_y = points[i][1]
        h2_x = points[i+n_mat][0]
        h2_y = points[i+n_mat][1]
        A[2*i][:] = [h1_x,h1_y,1,0,0,0,-h2_x*h1_x,-h2_x*h1_y,-h2_x]
        A[2*i+1][:] = [0,0,0,h1_x,h1_y,1,-h2_y*h1_x,-h2_y*h1_y,-h2_y]

    _, _, vt = linalg.svd(A)
    H = vt[-1].reshape(3,3)
    H = H / H[2,2] # H(3,3) = 1

    return H

计算单应性矩阵，大概逻辑可以理解为，构建方程矩阵 A，然后对A进行SVD奇异值分解，然后解向量重塑后做标准化。

# count inliers
def inlier(H,mp,matches,threshold):
    row = len(mp)
    p1 = mp[:,0:2]
    p2 = mp[:,2:4]
    
    x1 = np.concatenate((p1,np.ones([row,1])),axis = 1)
    # transfer 
    x2 = np.dot(H,x1.T)
    x2 = x2[0:2,:]/x2[2,:]
    es_p2 = x2[0:2,:].T
    
    # distance
    diff = p2 - es_p2
    sq = diff * diff
    dis = np.sqrt(sq[:,0] + sq[:,1])
    
    # calculate the number of points that distance < threshold
    # inliers' position
    match = []
    for i in range(0,row):
        if dis[i] < threshold:
            match.append(matches[i])
    
    return match

这段代码定义了一个 inlier 函数，用来计算在给定的单应性矩阵（Homography Matrix） H 下，哪些匹配点是内点（inliers）。即，这些点在通过单应性变换后，其投影误差小于指定的阈值 threshold。函数返回那些符合条件的匹配点。inlier 函数的目标是通过给定的单应性矩阵 H，计算变换后点对之间的距离。若距离小于某个阈值，则认为该匹配点对是一个内点，否则是外点（outlier）。
代码逻辑：
1.为图像1的点添加齐次坐标，为图像1中的点 p1 添加一个维度，使得每个点变为齐次坐标形式，即 (x, y, 1)，方便进行矩阵乘法。
2.np.dot(H, x1.T)：通过单应性矩阵 H 变换图像1的点 x1，计算其在图像2中的对应点 x2。
x2[0:2, :] / x2[2, :]：由于点变换后仍然是齐次坐标，需要将前两维除以第三维，将其转回到二维坐标。
es_p2：表示由单应性矩阵 H 变换得到的图像2中的估计点集。
3.计算图像2中的真实点与估计点之间的距离
4.对每个匹配点，检查其误差是否小于给定的阈值 threshold。阈值是输入的。之后要通过对比不断降低阈值。
如果距离小于阈值，说明该点是一个内点，将该匹配点加入 match 列表。
5.返回内点

# find whether there is same value in a matrix
def judgequale(a):
    count = 0
    # sort the matrix by the 1st col
    a = np.sort(a)
    while True:
        if a[count,0] == a[count+1,0]:
            return True
        elif count == len(a)-2:
            return False
        else:
            count += 1

目标：该函数旨在检查矩阵的第一列是否存在重复值。
逻辑：
1.通过 np.sort 将矩阵按列排序（应改为 axis=0）。
2.遍历每一行的第一列，逐个比较相邻行的第一列值是否相等。
3.如果找到相同值，则返回 True；否则，遍历结束后返回 False。
重点：
接下来这个 homo_ransac 函数使用 RANSAC 算法来估计图像之间的单应性矩阵（Homography Matrix），并返回最佳的单应性矩阵以及对应的内点（inliers）

# RANSAC + Homography
def homo_ransac(kp1,kp2,matches,repeat):
    # n the pairs of keypoints selected randomly (n >= 4)
    n = 4
    # loop times
    loop = repeat
    # HH the largest inliers H
    HH = []
    # threshold of the inliers distance
    threshold = 1
    # inliers box
    num_in = 0
    
    # change the kp1 and kp2 from tuple to list
    N = len(matches)
    k1 = np.zeros([N,2])
    k2 = np.zeros([N,2])

    # Get the coordinate of keypoint
    for i in range(N):
        for j in range(2):
            k1[i,j] = kp1[matches[i].queryIdx].pt[j]
            k2[i,j] = kp2[matches[i].trainIdx].pt[j]
    
    # matched points
    mp = np.concatenate((k1,k2),axis = 1)
    
    for i in range(0,loop):
        # Use four random pair of keypoints to compute the homography matrix
        rdm = random.sample(range(0,N),4)
        points = np.array([[k1[rdm[0]][0],k1[rdm[0]][1]],[k1[rdm[1]][0],k1[rdm[1]][1]],
                      [k1[rdm[2]][0],k1[rdm[2]][1]],[k1[rdm[3]][0],k1[rdm[3]][1]],
                            [k2[rdm[0]][0],k2[rdm[0]][1]],[k2[rdm[1]][0],k2[rdm[1]][1]],
                      [k2[rdm[2]][0],k2[rdm[2]][1]],[k2[rdm[3]][0],k2[rdm[3]][1]]])
        
        # compute H
        H = hmat(points)
        
        # count inliers
        match = inlier(H,mp,matches,threshold)
        
        # keep the max inliers parameter
        if len(match) > num_in:
            num_in = len(match)
            HH = H
            match_best = match

    return HH,match_best

参数
kp1：图像 1 的关键点列表，每个关键点包含坐标信息。
kp2：图像 2 的关键点列表，每个关键点包含坐标信息。
matches：关键点匹配的列表，每个匹配包括两个图像中的关键点索引。
repeat：RANSAC 算法的迭代次数。
n = 4：计算单应性矩阵所需的最小点对数（至少 4 对）。
loop = repeat：RANSAC 的迭代次数，定义了算法的重复次数。
HH：用于存储当前找到的最优单应性矩阵。
threshold：用于确定内点的距离阈值，距离小于该值的点被认为是内点。
num_in：当前找到的最大内点数。
N：匹配点对的总数。
1.k1 和 k2：初始化数组，用于存储图像 1 和图像 2 中匹配点的坐标。
2.通过循环填充 k1 和 k2 数组，其中 k1 存储图像 1 中的匹配点坐标，k2 存储图像 2 中的匹配点坐标。
3.mp：将 k1 和 k2 合并，形成一个包含所有匹配点坐标的矩阵。每一行包含一个匹配点对的四个坐标值：[x1, y1, x2, y2]。
4.循环 for i in range(0,loop)：进行 repeat 次迭代。
random.sample(range(0,N), 4)：从所有匹配点中随机选择 4 对点，计算单应性矩阵。
points：将选择的点对提取并整理成矩阵。
H = hmat(points)：计算单应性矩阵 H。
match = inlier(H, mp, matches, threshold)：计算在单应性矩阵 H 下的内点。
if len(match) > num_in：如果当前找到的内点数大于之前的最大值，则更新最佳单应性矩阵 HH 和最佳内点 match_best。
5.HH：最佳的单应性矩阵。
match_best：与最佳单应性矩阵 HH 对应的内点集合。
通过多次迭代尝试找到最佳模型（单应性矩阵）来适应大部分数据点（内点），从而提高估计的准确性。

match_bestres = cv2.drawMatches(img1, kp1, img2, kp2, match_best, None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
plt.figure( figsize=(15,15) )
plt.imshow(match_bestres)
plt.show()

在这里插入图片描述

3.拼接透视图

# recalculate H matrix with all inliers

# change the inliers from tuple to list
N = len(match_best)
in1 = np.zeros([N,2])
in2 = np.zeros([N,2])

# Get the coordinate of keypoint
for i in range(N):
    for j in range(2):
        in1[i] = [kp1[match_best[i].queryIdx].pt[1],kp1[match_best[i].queryIdx].pt[0]]
        in2[i] = [kp2[match_best[i].trainIdx].pt[1],kp2[match_best[i].trainIdx].pt[0]]

# compare the Homography compute from the random points and the inliers
in_points = np.concatenate((in1,in2),axis = 0)
H_best = hmat(in_points)
print(H_best)
print(HH)

所有通过 RANSAC 找到的内点（匹配的点对）重新计算更精确的单应性矩阵

# transform h1 to h2
def get_size(h1,H):
    [row,col,c] = h1.shape
    
    # 4 cornor of the image h1
    lt = np.array([[0,0,1]])
    rt = np.array([[0,col,1]])
    lb = np.array([[row,0,1]])
    rb = np.array([[row,col,1]])
    
    # edge matrix
    edge = np.concatenate((lt,rt,lb,rb),axis = 0).T
    
    T_edge = np.dot(H,edge)
    # normalize
    T_edge = T_edge[0:2,:]/T_edge[2,:]
    T_edge = T_edge
    
    return np.max(T_edge[0,:]),np.min(T_edge[0,:]),np.max(T_edge[1,:]),np.min(T_edge[1,:])

通过给定的单应性矩阵
𝐻，将图像 ℎ1的四个角点变换到新的坐标系中，并计算变换后的坐标范围（最大和最小的 x、y 值），从而获得变换后图像的尺寸范围。

# build a canvas
def get_canvas(im1,im2,Hi):
    # Hi is im2 to im1
    [row1,col1,c] = im1.shape
    [row2,col2,c] = im2.shape

    # get the edge
    [x_max,x_min,y_max,y_min] = get_size(im2,Hi)
    print([x_min,x_max,y_min,y_max])
    x_minus = int(np.min([0,x_min])-10)
    x_plus = int(np.max([0,x_max-row1])+10)
    y_minus = int(np.min([0,y_min])-10)
    y_plus = int(np.max([0,y_max-col1])+10)
    
    # x direction adding space
    add_x_minus = np.zeros([-x_minus,col1,c])
    add_x_plus = np.zeros([x_plus,col1,c])
    
    # y direction adding space
    add_y_minus = np.zeros([-x_minus+row1+x_plus,-y_minus,c])
    add_y_plus = np.zeros([-x_minus+row1+x_plus,y_plus,c])
    
    # concatenate the adding space with im1
    canvas = np.concatenate((add_x_minus,im1,add_x_plus),axis = 0)
    canvas = np.concatenate((add_y_minus,canvas,add_y_plus),axis = 1)
    
    return canvas.astype(int),[x_minus,y_minus]

这个 get_canvas 函数的目标是为图像拼接准备一个足够大的画布（canvas），以容纳变换后的两张图像。它会根据图像变换后的边界调整画布大小，并在图像周围添加适当的空白区域，使图像能够完整地显示在同一个画布上。
代码逻辑：
1.获取输入图像的尺寸,计算第二张图像通过单应性矩阵 𝐻𝑖变换后的边界
2.根据边界确定在画布中需要添加的空白区域
3.为画布的 x 方向和 y 方向添加空白区域
4.将原图与空白区域拼接，构建画布
5.返回画布和空白区域的位移量

# im2 to im1
Hi = linalg.inv(H_best)

# get canvas and axis compensation
canvas,compensation = get_canvas(img1,img2,Hi)

# show the canvas
print(compensation)
plt.figure( figsize=(15,15) )
plt.imshow(canvas)

def trans2im1(canvas,im2,Hi,compensation):
    # the wight of the im1 when overlay
    w = 0.6
    
    # step of the transmitting 1/step
    step = 3
    [row,col,c] = im2.shape
    
    # draw im2 on the canvas
    for j in range(0,row*step):
        for k in range(0,col*step):
            # positon: points of im2 in im1 plane 
            p = np.array([[j/step,k/step,1]]).T
            Tp = np.dot(Hi,p)

            # normalize and move the image
            x2 = int(round(Tp[0,0]/Tp[2,0])-compensation[0])
            y2 = int(round(Tp[1,0]/Tp[2,0])-compensation[1])
            
            # position in im1
            x1 = int(j/step)
            y1 = int(k/step)

            if np.any(canvas[x2,y2] > 0):
                canvas[x2,y2] = w * canvas[x2,y2][:] + (1 - w) * im2[x1,y1][:]
            else:
                canvas[x2,y2] = im2[x1,y1][:]
    
    return canvas.astype(int)
# transfer im2 to im1 plane
Tim1 = trans2im1(canvas,img2,Hi,compensation)

# show the sitching image
plt.figure( figsize=(15,15) )
plt.imshow(Tim1)