计算机视觉第三章图像到图像的映射

鲸~屿

已于 2024-08-27 23:56:13 修改

阅读量720

点赞数 11

文章标签：计算机视觉人工智能

于 2024-08-27 22:37:29 首次发布

本文链接：https://blog.csdn.net/qq_53034510/article/details/141591962

版权

3.1 单应性变换

单应性变换是将一个平面内的点映射到另一个平面内的二维投影变换。单应性变换可用于图像配准、图像纠正和纹理扭曲，以及创建全景图像。实际上，单应性变换 $H$ ，按照下面的方程映射二维中的点：

$\begin{bmatrix} x'\\ y'\\ w' \end{bmatrix}=\begin{bmatrix} h_{1}& h_{2} & h_{3}\\ h_{4}& h_{5} & h_{6}\\ h_{7}& h_{8} & h_{9} \end{bmatrix}\begin{bmatrix} x\\ y\\ w \end{bmatrix}$ 或 x'=Hx

对点进行归一化和转换齐次坐标的功能实现代码如下所示

from numpy import *
def normalize(points):
    """ 在齐次坐标意义下，对点集进行归一化，使最后一行为1 """
    for row in points:
      row /= points[-1]
    return points
def make_homog(points):
    """ 将点集（dim×n的数组）转换为齐次坐标表示 """
    return vstack((points,ones((1,points.shape[1]))))

在这些投影变换中，有一些特别重要的变换。

（1）仿射变换

$\begin{bmatrix} x'\\ y'\\ 1 \end{bmatrix}=\begin{bmatrix} a_{1}& a_{2} & t_{x}\\ a_{3}& a_{4} & t_{y}\\ 0& 0 & 1 \end{bmatrix}\begin{bmatrix} x\\ y\\ 1 \end{bmatrix}$ 或 $x'=\begin{bmatrix} A & t \\ 0 & 1 \end{bmatrix} x$

（2）相似变换

$\begin{bmatrix} x'\\ y'\\ 1 \end{bmatrix}=\begin{bmatrix} scos(\theta )& -ssin(\theta ) & t_{x}\\ ssin(\theta )& scos(\theta ) & t_{y}\\ 0& 0 & 1 \end{bmatrix}\begin{bmatrix} x\\ y\\ 1 \end{bmatrix}$ 或 $x'=\begin{bmatrix} sR & t \\ 0 & 1 \end{bmatrix} x$

3.1.1 直接线性变换算法

直接线性变换（DLT）算法是一种用于计算从一个平面（或空间）到另一个平面（或空间）的变换矩阵的方法。它给定４个或者更多对应点对矩阵，来计算单应性矩阵H的算法。将单应性矩阵H作用在对应点对上，重新写出该方程，我们可以得到下面的方程：

或者Ah=0，其中A是一个具有对应点对二倍数量行数的矩阵。将这些对应点对方程的系数堆叠到一个矩阵中，我们可以使用奇异值分解算法找到H的最小二乘解。我们可以用以下代码来实现：

def H_from_points(fp,tp):
    """使用线性DLT方法，计算单应性矩阵H，使fp映射到tp。点自动进行归一化"""
    if fp.shape != tp.shape:
        raise RuntimeError('number of points do not match')
    # 对点进行归一化（对数值计算很重要）
    # ---映射起始点--
    m = mean(fp[:2], axis=1)
    maxstd = max(std(fp[:2], axis=1)) + 1e-9
    C1 = diag([1/maxstd, 1/maxstd, 1])
    C1[0][2] = -m[0]/maxstd
    C1[1][2] = -m[1]/maxstd
    fp = dot(C1,fp)
    
    # ---映射对应点--
    m = mean(tp[:2], axis=1)
    maxstd = max(std(tp[:2], axis=1)) + 1e-9
    C2 = diag([1/maxstd, 1/maxstd, 1])
    C2[0][2] = -m[0]/maxstd
    C2[1][2] = -m[1]/maxstd
    tp = dot(C2,tp)
    
    #创建用于线性方法的矩阵，对于每个对应对，在矩阵中会出现两行数值
    nbr_correspondences = fp.shape[1]
    A = zeros((2*nbr_correspondences,9))
    for i in range(nbr_correspondences):
        A[2*i] = [-fp[0][i],-fp[1][i],-1,0,0,0,
           tp[0][i]*fp[0][i],tp[0][i]*fp[1][i],tp[0][i]]
        A[2*i+1] = [0,0,0,-fp[0][i],-fp[1][i],-1,
           tp[1][i]*fp[0][i],tp[1][i]*fp[1][i],tp[1][i]]
    U,S,V = linalg.svd(A)
    H = V[8].reshape((3,3))
    
    #反归一化
    H = dot(linalg.inv(C2),dot(H,C1))
    
    #归一化，然后返  回
    return H / H[2,2]

3.1.2 仿射变换

由于仿射变换具有6个自由度，因此我们需要三个对应点对来估计矩阵H。假设 $h_{7} =h_{8} =0$ ，我们可以用以下代码实现：

def Haffine_from_points(fp,tp):
    """ 计算 H，仿射变换，使得tp是fp经过仿射变换H得到的""" 
    if fp.shape != tp.shape:
        raise RuntimeError('number of points do not match')
    # 对点进行归一化
    # --- 映射起始点--
    m = mean(fp[:2], axis=1)
    maxstd = max(std(fp[:2], axis=1)) + 1e-9
    C1 = diag([1/maxstd, 1/maxstd, 1])
    C1[0][2] = -m[0]/maxstd
    C1[1][2] = -m[1]/maxstd
    fp_cond = dot(C1,fp)
    # --- 映射对应点--
    m = mean(tp[:2], axis=1)
    C2 = C1.copy() # 两个点集，必须都进行相同的缩放
    C2[0][2] = -m[0]/maxstd
    C2[1][2] = -m[1]/maxstd
    tp_cond = dot(C2,tp)
    # 因为归一化后点的均值为0，所以平移量为0
    A = concatenate((fp_cond[:2],tp_cond[:2]), axis=0)
    U,S,V = linalg.svd(A.T)
    # 如Hartley 和Zisserman 著的Multiple View Geometry in Computer, Scond Edition 所示，
    #创建矩阵B和C
    tmp = V[:2].T
    B = tmp[:2]
    C = tmp[2:4]
    #反归一化
    tmp2 = concatenate((dot(C,linalg.pinv(B)),zeros((2,1))), axis=1)
    H = vstack((tmp2,[0,0,1]))
    H = dot(linalg.inv(C2),dot(H,C1))
    return H / H[2,2]

同样地，类似于DLT算法，这些点需要经过预处理和去处理化操作。

3.2 图像扭曲

图像扭曲指的是通过几何变换（如旋转、缩放、平移）或非线性变换对图像进行修改，从而改变图像的形状或结构的过程。扭曲可以用于图像处理的多个方面，例如在图像增强、特效制作、视觉艺术等。扭曲操作可以使用 SciPy 工具包中的ndimage包来简单完成。其命令为

transformed_im = ndimage.affine_transform(im,A,b,size)

from scipy import ndimage
from PIL import Image
from pylab import *

im = array(Image.open('xiaozhou.jpg').convert('L'))
H = array([[1.4,0.05,-100],[0.05,1.5,-100],[0,0,1]])
im2 = ndimage.affine_transform(im,H[:2,:2],(H[0,2],H[1,2]))
figure()
gray()
imshow(im2)
show()

运行结果如下图所示，不难发现，输出图像结果中丢失的像素用零来填充。

3.2.1 图像中的图像

将图像或者图像的一部分放置在另一幅图像中，使得它们能够和指定的区域或者标记物对齐。实现代码如下：

import homography

def image_in_image(im1,im2,tp):
    """ 使用仿射变换将im1放置在im2上，使im1图像的角和tp尽可能的靠近
        tp 是齐次表示的，并且是按照从左上角逆时针计算的"""
    #扭曲的点
    m,n = im1.shape[:2]
    fp = array([[0,m,m,0],[0,0,n,n],[1,1,1,1]])
    # 计算仿射变换，并且将其应用于图像im1
    H = homography.Haffine_from_points(tp,fp)
    im1_t = ndimage.affine_transform(im1,H[:2,:2],
          (H[0,2],H[1,2]),im2.shape[:2])
    alpha = (im1_t > 0)
    return (1-alpha)*im2 + alpha*im1_t

def image_in_image(im1,im2,tp):
    """ 使用仿射变换将im1放置在im2上，使im1图像的角和tp尽可能的靠近
        tp 是齐次表示的，并且是按照从左上角逆时针计算的"""
    #扭曲的点
    m,n = im1.shape[:2]
    fp = np.array([[0,m,m,0],[0,0,n,n],[1,1,1,1]])
    # 计算仿射变换，并且将其应用于图像im1
    H = Haffine_from_points(tp,fp)
    im1_t = ndimage.affine_transform(im1,H[:2,:2],(H[0,2],H[1,2]),im2.shape[:2])
    alpha = (im1_t > 0)
    return (1 - alpha) * im2 + alpha * im1_t

im1 = np.array(Image.open('xiaozhou.jpg').convert('L'))
im2 = np.array(Image.open('touxiang.jpg').convert('L'))
tp = np.array([[250,500,500,250],[40,40,600,600],[1,1,1,1]])
im3 = image_in_image(im2,im1,tp)
figure()
gray()
imshow(im3)
axis('equal')
axis('off')
show()

运行结果如下图所示

需要注意，标记物的坐标tp是用齐次坐标意义下的坐标表示的，改变了tp的坐标值，那么原始图像2在原始图像1中的位置也不同。

3.2.2 分段仿射扭曲

分段仿射扭曲是图像处理中的一种技术，它将图像分成多个区域，用不同的仿射变换（如旋转、缩放、平移）对每个区域进行变形。给定任意图像的标记点，通过将这些点进行三角剖分，然后使用仿射扭曲来扭曲每个三角形，我们可以将图像和另一幅图像的对应标记点扭曲对应。为了三角化这些点，我们经常使用狄洛克三角剖分方法，方法如下：

x,y = array(np.random.standard_normal((2,100)))
tri = Delaunay(np.c_[x,y]).simplices
figure()
for t in tri:
    t_ext = [t[0], t[1], t[2], t[0]] # 将第一个点加入到最后
    plot(x[t_ext],y[t_ext],'r')
plot(x,y,'*')
axis('off')
show()

运行结果为：

现在让我们将该算法应用于一个例子，在该例子中，在5×6的网格上使用30个控制点，将一幅图像扭曲到另一幅图像中的非平坦区域。目标点是使用ginput()函数手工选取出来的，将结果保存在turningtorso_points.txt 文件中。

def triangulate_points(x,y):
  """ 二维点的Delaunay 三角剖分"""
  tri = Delaunay(np.c_[x,y]).simplices
  return tri

def pw_affine(fromim,toim,fp,tp,tri):
    """ 从一幅图像中扭曲矩形图像块
    fromim= 将要扭曲的图像
    toim= 目标图像
    fp= 齐次坐标表示下，扭曲前的点
    tp= 齐次坐标表示下，扭曲后的点
    tri= 三角剖分 """
    im = toim.copy()

    #检查图像是灰度图像还是彩色图象
    is_color = len(fromim.shape) == 3
    #创建扭曲后的图像（如果需要对彩色图像的每个颜色通道进行迭代操作，那么有必要这样做）
    im_t = np.zeros(im.shape, 'uint8')
    for t in tri:
    #计算仿射变换
        H = homography.Haffine_from_points(tp[:,t],fp[:,t])
    if is_color:
        for col in range(fromim.shape[2]):
            im_t[:, :, col] = ndimage.affine_transform(
                fromim[:, :, col], H[:2, :2], (H[0, 2], H[1, 2]), im.shape[:2])
        else:
            im_t = ndimage.affine_transform(
                fromim, H[:2, :2], (H[0, 2], H[1, 2]), im.shape[:2])
        # 三角形的alpha
        alpha = alpha_for_triangle(tp[:, t], im.shape[0], im.shape[1])
        # 将三角形加入到图像中
        im[alpha > 0] = im_t[alpha > 0]
    return im

def plot_mesh(x,y,tri):
  """ 绘制三角形"""
  for t in tri:
    t_ext = [t[0], t[1], t[2], t[0]] # 将第一个点加入到最后
    plot(x[t_ext],y[t_ext],'r')


 # 打开图像，并将其扭曲
fromim = np.array(Image.open('xiaozhou.jpg'))
x,y = np.meshgrid(range(5),range(6))
x = (fromim.shape[1]/4) * x.flatten()
y = (fromim.shape[0]/5) * y.flatten()
# 三角剖分
tri = triangulate_points(x,y)
 # 打开图像和目标点
im = np.array(Image.open('touxiang.jpg'))
tp = np.loadtxt('turningtorso_points.txt') # destination points
# 将点转换成齐次坐标
fp = np.vstack((y,x,ones((1,len(x)))))
tp = np.vstack((tp[:,1],tp[:,0],np.ones((1,len(tp)))))
 # 扭曲三角形
im1 = pw_affine(fromim,im,fp,tp,tri)

# 绘制图像
figure()
imshow(im)
plot_mesh(tp[1],tp[0],tri)
axis('off')
show()

运行效果如下所示

3.3 创建全景图

创建全景图通常涉及将多张重叠的图片拼接在一起，生成一张宽视野的图像。这一过程有多个步骤，主要包括特征检测、特征匹配、图像配准、变换应用和图像合成。

3.3.1 RANSAC

RANSAC（随机抽样一致性算法）是一种用于数据估计和模型拟合的迭代方法，特别适合于处理含有异常值的数据集。它通过从数据中随机选择小的子集来估计模型参数，然后评估所有数据点是否与该模型一致，最终选取最优模型。实现代码如下所示

import numpy as np
import matplotlib.pyplot as plt


def generate_data(n_inliers, n_outliers):
    np.random.seed(42)
    # 生成内点
    x_inliers = np.random.uniform(0, 1, n_inliers)
    y_inliers = 2 * x_inliers + 1 + np.random.normal(0, 0.1, n_inliers)  # 线性关系 + 噪声

    # 生成外点
    x_outliers = np.random.uniform(0, 1, n_outliers)
    y_outliers = np.random.uniform(0, 2, n_outliers)  # 随机分布的外点

    # 合并内点和外点
    x_data = np.concatenate((x_inliers, x_outliers))
    y_data = np.concatenate((y_inliers, y_outliers))

    # 打乱数据
    indices = np.arange(len(x_data))
    np.random.shuffle(indices)
    return x_data[indices], y_data[indices]


def ransac(x_data, y_data, n_iterations=100, threshold=0.1):
    best_inliers_count = 0
    best_line = None

    for _ in range(n_iterations):
        # 随机选择两个点
        indices = np.random.choice(len(x_data), 2, replace=False)
        x_sample = x_data[indices]
        y_sample = y_data[indices]

        # 计算直线方程 y = mx + b
        m = (y_sample[1] - y_sample[0]) / (x_sample[1] - x_sample[0]) if (x_sample[1] - x_sample[0]) != 0 else 0
        b = y_sample[0] - m * x_sample[0]

        # 计算所有点到该线的距离
        distances = np.abs(y_data - (m * x_data + b))  # 计算垂直距离
        inliers = distances < threshold

        # 更新最佳模型
        inliers_count = np.sum(inliers)
        if inliers_count > best_inliers_count:
            best_inliers_count = inliers_count
            best_line = (m, b)

    return best_line, inliers


if __name__ == "__main__":
    # 生成数据
    x_data, y_data = generate_data(n_inliers=100, n_outliers=20)

    # 使用 RANSAC 拟合直线
    best_line, inliers = ransac(x_data, y_data, n_iterations=1000, threshold=0.1)

    # 绘制结果
    plt.scatter(x_data, y_data, color='red', label='Data Points')
    plt.scatter(x_data[inliers], y_data[inliers], color='green', label='Inliers')

    # 绘制拟合直线
    x_fit = np.linspace(0, 1, 100)
    y_fit = best_line[0] * x_fit + best_line[1]
    plt.plot(x_fit, y_fit, color='blue', label='RANSAC Fit')

    plt.legend()
    plt.title('RANSAC Line Fitting')
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()

上述代码运行结果如下。可以看到，该算法只选择了和直线模型一致的数据点，成功地找到了正确的解。

3.3.2 稳健的单应性矩阵估计

稳健的单应性矩阵估计通常涉及使用鲁棒性的方法来处理特征点匹配中的误配。这通常是通过使用 RANSAC（随机抽样一致性算法）来实现的。RANSAC 可以帮助我们从在含有噪声和杂项数据的情况下估计单应性矩阵（Homography Matrix），并仅选择一致的数据点（即匹配的特征点）来进行计算。以下是如何在 Python 中使用 OpenCV 的 RANSAC 方法来稳健估计单应性矩阵的步骤和示例代码。

import cv2
import numpy as np

# 加载图像
img1 = cv2.imread('bule_sky1.jpg') 
img2 = cv2.imread('bule_sky2.jpg') 

# 转换为灰度图
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# 创建 ORB 检测器
orb = cv2.ORB_create()

# 找到关键点和描述子
keypoints1, descriptors1 = orb.detectAndCompute(gray1, None)
keypoints2, descriptors2 = orb.detectAndCompute(gray2, None)

# 创建 BFMatcher 对象
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

# 匹配描述子
matches = bf.match(descriptors1, descriptors2)

# 排序匹配结果
matches = sorted(matches, key=lambda x: x.distance)

# 选择良好的匹配点
good_matches = matches[:50]  # 选择前50个匹配点

# 获取匹配点的位置
points1 = np.zeros((len(good_matches), 2), dtype=np.float32)
points2 = np.zeros((len(good_matches), 2), dtype=np.float32)

for i, match in enumerate(good_matches):
    points1[i, :] = keypoints1[match.queryIdx].pt  # 图像1的关键点
    points2[i, :] = keypoints2[match.trainIdx].pt  # 图像2的关键点

# 使用 RANSAC 估计单应性矩阵
H, mask = cv2.findHomography(points1, points2, cv2.RANSAC)

# 选择一致的匹配点
matches_mask = mask.ravel().tolist()

# 绘制匹配结果
draw_params = dict(matchColor=(0, 0, 255), singlePointColor=(255, 0, 0), matchesMask=matches_mask, flags=0)
matched_img = cv2.drawMatches(img1, keypoints1, img2, keypoints2, good_matches, None, **draw_params)

# 调整显示结果的大小
scaled_matched_img = cv2.resize(matched_img, (1400, 500))  # 将图像缩放到800x600

# 显示结果
cv2.imshow("RANSAC Homography Matches", scaled_matched_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

# 输出单应性矩阵
print("Estimated Homography Matrix:")
print(H)

运行结果如下

3.3.3 拼接图像

图像拼接是将两幅或多幅图像合成一幅大的全景图像。使用已估计的单应性矩阵，可以通过透视变换将一幅图像映射到另一幅图像的坐标系中。以下是如何利用 OpenCV 将前面提到的两张图像拼接在一起的步骤。

import cv2
import numpy as np

# 加载图像
img1 = cv2.imread('bule_sky1.jpg')
img2 = cv2.imread('bule_sky2.jpg')

# 转换为灰度图
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# 创建 ORB 检测器
orb = cv2.ORB_create()

# 找到关键点和描述子
keypoints1, descriptors1 = orb.detectAndCompute(gray1, None)
keypoints2, descriptors2 = orb.detectAndCompute(gray2, None)

# 创建 BFMatcher 对象
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

# 匹配描述子
matches = bf.match(descriptors1, descriptors2)

# 排序匹配结果
matches = sorted(matches, key=lambda x: x.distance)

# 选择良好的匹配点
good_matches = matches[:50]  # 选择前50个匹配点

# 获取匹配点的位置
points1 = np.zeros((len(good_matches), 2), dtype=np.float32)
points2 = np.zeros((len(good_matches), 2), dtype=np.float32)

for i, match in enumerate(good_matches):
    points1[i, :] = keypoints1[match.queryIdx].pt  # 图像1的关键点
    points2[i, :] = keypoints2[match.trainIdx].pt  # 图像2的关键点

# 使用 RANSAC 估计单应性矩阵
H, mask = cv2.findHomography(points1, points2, cv2.RANSAC)

# 定义拼接的新图像大小
height, width = img1.shape[:2]
corners = np.array([[0, 0], [0, height-1], [width-1, 0], [width-1, height-1]], dtype='float32')
new_corners = cv2.perspectiveTransform(corners.reshape(-1, 1, 2), H)

# 计算拼接图像的全局范围
all_corners = np.vstack((new_corners, [[0, 0]], [[img2.shape[1], 0]], [[img2.shape[1], img2.shape[0]]], [[0, img2.shape[0]]]))
[x_min, y_min] = np.int32(all_corners.min(axis=0).flatten())
[x_max, y_max] = np.int32(all_corners.max(axis=0).flatten())

# 创建平移矩阵以安置新图像
trans_mat = np.array([[1, 0, -x_min], [0, 1, -y_min], [0, 0, 1]])
result_image = cv2.warpPerspective(img2, trans_mat @ H, (x_max - x_min, y_max - y_min))
result_image[-y_min:height - y_min, -x_min:width - x_min] = img1

# 显示结果
cv2.imshow("Stitched Image", result_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

上述代码只是展示了图像拼接的步骤和原理，还需要注意以下事项。首先，图像拼接可能需要进一步处理（例如，图像边缘平滑和色彩调整），以获得更好的融合效果。其次，在关键点匹配和单应性估计中，选择和过滤匹配点会直接影响拼接的结果。

鲸~屿

关注

11
点赞
踩
11

收藏

觉得还不错? 一键收藏
0
评论
计算机视觉第三章图像到图像的映射

给定任意图像的标记点，通过将这些点进行三角剖分，然后使用仿射扭曲来扭曲每个三角形，我们可以将图像和另一幅图像的对应标记点扭曲对应。图像扭曲指的是通过几何变换（如旋转、缩放、平移）或非线性变换对图像进行修改，从而改变图像的形状或结构的过程。现在让我们将该算法应用于一个例子，在该例子中，在5×6的网格上使用30个控制点，将一幅图像扭曲到另一幅图像中的非平坦区域。需要注意，标记物的坐标tp是用齐次坐标意义下的坐标表示的，改变了tp的坐标值，那么原始图像2在原始图像1中的位置也不同。上述代码运行结果如下。
复制链接

扫一扫