[Computer Vision] Image Mapping and Panorama Stitching


JMU-HZH


Article link: https://blog.csdn.net/qq_45603919/article/details/124647614


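1. Image mapping

Image mapping pipeline
1. Extract feature points and compute their descriptors
2. Match the features
3. Choose a transformation type that suits the way the image is deformed
4. Estimate the transformation, for example by computing a homography
5. Warp the image with forward or inverse mapping, using interpolation

Types of image transformation
1. Rigid transformation: changes only the position and orientation of an object, not its shape (e.g. translation, rotation)
2. Affine transformation: changes position and shape, but keeps straight lines straight and parallel lines parallel
3. Projective transformation: changes position and shape more freely; straight lines stay straight, but parallel lines need not stay parallel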
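The three transformation types can be compared on a unit square. The following is a minimal sketch; the matrices and the helper names (`apply_h`, `side_lengths`, `opposite_sides_parallel`) are made up for illustration and are not part of the original post.

```python
import numpy as np

def apply_h(H, pts):
    """Apply a 3x3 transform to an (n, 2) array of points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    out = homog @ H.T
    return out[:, :2] / out[:, 2:3]

def side_lengths(p):
    """Lengths of the four sides of a quadrilateral given as (4, 2) corners."""
    return np.linalg.norm(np.roll(p, -1, axis=0) - p, axis=1)

def opposite_sides_parallel(p):
    """True if both pairs of opposite sides are parallel."""
    e = np.roll(p, -1, axis=0) - p  # edge vectors e0..e3
    cross = lambda u, v: u[0] * v[1] - u[1] * v[0]
    return bool(np.isclose(cross(e[0], e[2]), 0) and np.isclose(cross(e[1], e[3]), 0))

theta = np.deg2rad(30)
# Illustrative matrices (assumed values, chosen only for the demo)
rigid = np.array([[np.cos(theta), -np.sin(theta), 2.0],
                  [np.sin(theta),  np.cos(theta), 1.0],
                  [0.0, 0.0, 1.0]])
affine = np.array([[1.0, 0.5, 2.0],
                   [0.0, 1.5, 1.0],
                   [0.0, 0.0, 1.0]])
projective = np.array([[1.0, 0.5, 2.0],
                       [0.0, 1.5, 1.0],
                       [0.2, 0.1, 1.0]])

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

for name, H in [("rigid", rigid), ("affine", affine), ("projective", projective)]:
    q = apply_h(H, square)
    print(name, "side lengths:", np.round(side_lengths(q), 3),
          "parallel sides kept:", opposite_sides_parallel(q))
```

The rigid transform keeps all side lengths, the affine one changes lengths but keeps opposite sides parallel, and the projective one breaks parallelism as well.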

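1.1 Homography

A homography is a 2D projective transformation that maps points in one plane to points in another plane. The key step is estimating the homography matrix; each transformation type needs a corresponding number of point correspondences to solve for the unknowns of the matrix.

Point correspondences needed for each transformation (each point pair contributes two equations, one for x and one for y)
Translation: 2 degrees of freedom (one point pair)
Similarity: 4 degrees of freedom (two point pairs)
Affine: 6 degrees of freedom (three point pairs)
Projective: 8 degrees of freedom (four point pairs)

Once the homography matrix has been solved from the point pairs, it can be applied to every pixel of the image to obtain the transformed image. The matrix is solved as follows: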

(1) Here x, y are the pixel coordinates in the original image and x', y' the pixel coordinates in the transformed image, both written in homogeneous form
$\left[\begin{array}{l} x^{\prime} \\ y^{\prime} \\ w^{\prime} \end{array}\right]=\left[\begin{array}{lll} h_{00} & h_{01} & h_{02} \\ h_{10} & h_{11} & h_{12} \\ h_{20} & h_{21} & h_{22} \end{array}\right]\left[\begin{array}{l} x \\ y \\ w \end{array}\right] \text { or } \mathbf{x}^{\prime}=\boldsymbol{H} \mathbf{x}$
(2) Set w = 1 to normalize the input point; the output is then defined up to a scale factor w'
$\left[\begin{array}{c} w^{\prime} x^{\prime} \\ w^{\prime} y^{\prime} \\ w^{\prime} \end{array}\right]=\left[\begin{array}{lll} h_{00} & h_{01} & h_{02} \\ h_{10} & h_{11} & h_{12} \\ h_{20} & h_{21} & h_{22} \end{array}\right]\left[\begin{array}{l} x \\ y \\ 1 \end{array}\right]$
(3) Rewrite this in the form Ah = 0. Dividing the first two rows by the third eliminates w'
$\begin{aligned} x_{i}^{\prime} &=\frac{h_{00} x_{i}+h_{01} y_{i}+h_{02}}{h_{20} x_{i}+h_{21} y_{i}+h_{22}} \\ y_{i}^{\prime} &=\frac{h_{10} x_{i}+h_{11} y_{i}+h_{12}}{h_{20} x_{i}+h_{21} y_{i}+h_{22}} \end{aligned}\quad\bold{(1)}$

$\begin{aligned} &x_{i}^{\prime}\left(h_{20} x_{i}+h_{21} y_{i}+h_{22}\right)=h_{00} x_{i}+h_{01} y_{i}+h_{02} \\ &y_{i}^{\prime}\left(h_{20} x_{i}+h_{21} y_{i}+h_{22}\right)=h_{10} x_{i}+h_{11} y_{i}+h_{12} \end{aligned}\quad\bold{(2)}$

$\left[\begin{array}{ccccccccc} x_{i} & y_{i} & 1 & 0 & 0 & 0 & -x_{i}^{\prime} x_{i} & -x_{i}^{\prime} y_{i} & -x_{i}^{\prime} \\ 0 & 0 & 0 & x_{i} & y_{i} & 1 & -y_{i}^{\prime} x_{i} & -y_{i}^{\prime} y_{i} & -y_{i}^{\prime} \end{array}\right]\left[\begin{array}{l} h_{00} \\ h_{01} \\ h_{02} \\ h_{10} \\ h_{11} \\ h_{12} \\ h_{20} \\ h_{21} \\ h_{22} \end{array}\right]=\left[\begin{array}{l} 0 \\ 0 \end{array}\right]\quad\bold{(3)}$

$\left[\begin{array}{ccccccccc} x_{1} & y_{1} & 1 & 0 & 0 & 0 & -x_{1}^{\prime} x_{1} & -x_{1}^{\prime} y_{1} & -x_{1}^{\prime} \\ 0 & 0 & 0 & x_{1} & y_{1} & 1 & -y_{1}^{\prime} x_{1} & -y_{1}^{\prime} y_{1} & -y_{1}^{\prime} \\ \cdots \cdots & & & & & & & & \\ x_{n} & y_{n} & 1 & 0 & 0 & 0 & -x_{n}^{\prime} x_{n} & -x_{n}^{\prime} y_{n} & -x_{n}^{\prime} \\ 0 & 0 & 0 & x_{n} & y_{n} & 1 & -y_{n}^{\prime} x_{n} & -y_{n}^{\prime} y_{n} & -y_{n}^{\prime} \end{array}\right]\left[\begin{array}{c} h_{00} \\ h_{01} \\ h_{02} \\ h_{10} \\ h_{11} \\ h_{12} \\ h_{20} \\ h_{21} \\ h_{22} \end{array}\right]=\left[\begin{array}{c} 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{array}\right]\quad\bold{(4)}$

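(4) Stacking the equations gives Ah = 0, where A has twice as many rows as there are point pairs (a projective transformation needs 4 pairs, so A has 2x4 = 8 rows). For n point pairs, chosen according to the transformation type, A is 2n x 9, h has 9 entries and the zero vector has 2n entries. The system is solved in the least-squares sense under the constraint ‖h‖ = 1:
$\hat{\mathbf{h}}=\underset{\|\mathbf{h}\|=1}{\operatorname{argmin}}\|A \mathbf{h}\|^{2}, \quad \text{i.e. } \hat{\mathbf{h}} \text{ is the eigenvector of } A^{T} A \text{ with the smallest eigenvalue}$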
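A short sketch of why the smallest eigenvector solves the least-squares problem, assuming the scale of h is fixed by the constraint ‖h‖ = 1 (h is only defined up to scale, and without a constraint the trivial solution h = 0 would always win):
$\min _{\mathbf{h}}\|A \mathbf{h}\|^{2} \quad \text { s.t. } \quad \mathbf{h}^{T} \mathbf{h}=1$
$L(\mathbf{h}, \lambda)=\mathbf{h}^{T} A^{T} A \mathbf{h}-\lambda\left(\mathbf{h}^{T} \mathbf{h}-1\right)$
$\frac{\partial L}{\partial \mathbf{h}}=2 A^{T} A \mathbf{h}-2 \lambda \mathbf{h}=0 \Rightarrow A^{T} A \mathbf{h}=\lambda \mathbf{h}$
$\|A \mathbf{h}\|^{2}=\mathbf{h}^{T} A^{T} A \mathbf{h}=\lambda \mathbf{h}^{T} \mathbf{h}=\lambda$
Every stationary point is therefore an eigenvector of $A^{T} A$, and its cost equals its eigenvalue, so the minimum is reached at the eigenvector with the smallest eigenvalue. In terms of the SVD $A=U \Sigma V^{T}$, this is the right singular vector belonging to the smallest singular value, which is the last row of the third array returned by NumPy's `linalg.svd`.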
(5) Code check

import numpy as np
from pylab import *
from PIL import Image

def H_from_points(fp,tp):
    """ Find homography H, such that fp is mapped to tp
        using the linear DLT method. Points are conditioned
        automatically. """
    
    if fp.shape != tp.shape:
        raise RuntimeError('number of points do not match')
        
    # condition points (important for numerical reasons)
    # --from points--
    m = mean(fp[:2], axis=1)
    maxstd = max(std(fp[:2], axis=1)) + 1e-9
    C1 = diag([1/maxstd, 1/maxstd, 1]) 
    C1[0][2] = -m[0]/maxstd
    C1[1][2] = -m[1]/maxstd
    fp = dot(C1,fp)
    
    # --to points--
    m = mean(tp[:2], axis=1)
    maxstd = max(std(tp[:2], axis=1)) + 1e-9
    C2 = diag([1/maxstd, 1/maxstd, 1])
    C2[0][2] = -m[0]/maxstd
    C2[1][2] = -m[1]/maxstd
    tp = dot(C2,tp)
    
    # create matrix for linear method, 2 rows for each correspondence pair
    nbr_correspondences = fp.shape[1]
    A = zeros((2*nbr_correspondences,9))
    for i in range(nbr_correspondences):        
        A[2*i] = [-fp[0][i],-fp[1][i],-1,0,0,0,
                    tp[0][i]*fp[0][i],tp[0][i]*fp[1][i],tp[0][i]]
        A[2*i+1] = [0,0,0,-fp[0][i],-fp[1][i],-1,
                    tp[1][i]*fp[0][i],tp[1][i]*fp[1][i],tp[1][i]]
    
    U,S,V = linalg.svd(A)
    H = V[8].reshape((3,3))    
    
    # decondition
    H = dot(linalg.inv(C2),dot(H,C1))
    
    # normalize and return
    return H / H[2,2]

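if __name__ == "__main__":
    # Four random correspondences in homogeneous coordinates (3 x 4, last row = 1).
    # A projective transformation has 8 degrees of freedom, so at least 4 pairs are needed.
    fp = np.vstack((np.random.randint(0, 500, size=(2, 4)), np.ones((1, 4))))
    tp = np.vstack((np.random.randint(0, 500, size=(2, 4)), np.ones((1, 4))))
    result = H_from_points(fp,tp)
    print(result)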

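(6) Result. Sample output from one run (the input points are random, so the numbers change between runs). The matrix is normalized so that its bottom-right entry is 1, which leaves the eight free parameters of a projective transformation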

[[-3.69744393e+00  5.73332427e+00 -3.60193467e+02]
 [-1.31628505e+00  2.40124692e+00 -1.83652536e+03]
 [-2.12511707e-03  3.04655705e-03  1.00000000e+00]]

(7) Visualization

from scipy import ndimage
from PIL import Image
from pylab import *

im = array(Image.open('sun.jpg').convert('L'))
H = array([[1.4,0.05,-100],[0.05,1.5,-100],[0,0,1]])
im2 = ndimage.affine_transform(im,H[:2,:2],(H[0,2],H[1,2]))

figure()
gray()
subplot(121)
axis('off')
imshow(im)
subplot(122)
axis('off')
imshow(im2)
show()


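1.2 Forward and inverse mapping

Once the homography matrix is known, the next step is to map pixel coordinates from one image to the other.

There are two ways to do this, forward mapping and inverse mapping; they differ only in which side of x' = Hx is treated as the known quantity.
Forward mapping: $x^{\prime}=h(x)$

Inverse mapping: $\boldsymbol{x}=\boldsymbol{h}^{-1}\left(\boldsymbol{x}^{\prime}\right)$
Pixel coordinates are integers, so with forward mapping the rounded results often send several source pixels to the same target pixel while other target pixels receive nothing; the unfilled pixels show up as black lines in the warped image. To avoid this, interpolation (nearest-neighbour or linear) is used, typically together with inverse mapping so that every target pixel gets a value.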
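The difference between the two mappings can be sketched on a tiny image that is scaled by a factor of 2. This is only an illustration; the function names `forward_map` and `inverse_map` and the test image are assumptions, not code from the original post.

```python
import numpy as np

def forward_map(src, H, out_shape):
    """Push every source pixel to its rounded target position."""
    dst = np.zeros(out_shape, dtype=src.dtype)
    filled = np.zeros(out_shape, dtype=bool)
    h, w = src.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    q = H @ pts
    u = np.rint(q[0] / q[2]).astype(int)
    v = np.rint(q[1] / q[2]).astype(int)
    ok = (u >= 0) & (u < out_shape[1]) & (v >= 0) & (v < out_shape[0])
    dst[v[ok], u[ok]] = src.ravel()[ok]
    filled[v[ok], u[ok]] = True
    return dst, filled

def inverse_map(src, H, out_shape):
    """Pull every target pixel from the source with bilinear interpolation."""
    h, w = src.shape
    ys, xs = np.mgrid[0:out_shape[0], 0:out_shape[1]]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    q = np.linalg.inv(H) @ pts
    x, y = q[0] / q[2], q[1] / q[2]
    inside = (x >= 0) & (x <= w - 1) & (y >= 0) & (y <= h - 1)
    # Top-left neighbour, clipped so that the 2x2 neighbourhood stays in the image
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    ax, ay = x - x0, y - y0
    val = ((1 - ax) * (1 - ay) * src[y0, x0] + ax * (1 - ay) * src[y0, x0 + 1]
           + (1 - ax) * ay * src[y0 + 1, x0] + ax * ay * src[y0 + 1, x0 + 1])
    dst = np.where(inside, val, 0.0).reshape(out_shape)
    return dst, inside.reshape(out_shape)

src = np.arange(1.0, 17.0).reshape(4, 4)   # 4x4 test image with values 1..16
H = np.diag([2.0, 2.0, 1.0])               # scale x and y by 2
fwd, fwd_filled = forward_map(src, H, (7, 7))
inv, inv_filled = inverse_map(src, H, (7, 7))

print("forward mapping filled", fwd_filled.sum(), "of", fwd_filled.size, "pixels")
print("inverse mapping filled", inv_filled.sum(), "of", inv_filled.size, "pixels")
```

Forward mapping only writes the 16 pixels that a source pixel happens to land on and leaves the rest of the 7x7 target empty, whereas inverse mapping computes a value for every target pixel.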

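2. Panorama stitching

Overall panorama stitching pipeline:
1. Match features across the given set of images
2. Estimate the transformation between images from the matched features
3. Warp the images with the estimated transformation
4. Refine the alignment of the overlaid images with an algorithm such as APAP
5. Select the seam automatically with a graph-cut algorithm
6. Blend the images with a multi-band blending strategy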

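2.1 RANSAC

Directly overlaying the images warped with the homography obtained above tends to produce ghosting. The reason is that mismatched feature points have a very large effect on a homography solved by least squares. Before solving for the homography we therefore filter the point pairs with RANSAC and only then solve for the matrix (reducing the noise as far as possible first).

RANSAC procedure:
1. Randomly select four matched feature pairs
2. Compute the homography H with DLT
3. Compute the mapping error for all matches
4. Determine the inliers with an error threshold
5. Repeat steps 1 to 4 and recompute H from the largest inlier set found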
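The five steps above can be sketched for the homography case as follows. This is a simplified illustration on synthetic data: the helper names (`dlt_homography`, `project`, `ransac_homography`), the threshold and the iteration count are assumptions, and the point conditioning used in `H_from_points` is left out for brevity.

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H (dst ~ H @ src) from (n, 2) point arrays, n >= 4, with DLT."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)  # defined up to scale

def project(H, pts):
    """Map (n, 2) points through H and return inhomogeneous coordinates."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    out = homog @ H.T
    return out[:, :2] / out[:, 2:3]

def ransac_homography(src, dst, thresh=3.0, iters=500, seed=0):
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=4, replace=False)        # step 1
        H = dlt_homography(src[idx], dst[idx])                   # step 2
        with np.errstate(all="ignore"):                          # degenerate samples
            err = np.linalg.norm(project(H, src) - dst, axis=1)  # step 3
        inliers = err < thresh                                   # step 4
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("not enough inliers to estimate a homography")
    H = dlt_homography(src[best], dst[best])                     # step 5
    return H / H[2, 2], best

# Synthetic matches: 45 exact correspondences plus 15 gross outliers
rng = np.random.default_rng(1)
H_true = np.array([[1.1, 0.05, 20.0],
                   [-0.03, 0.95, 10.0],
                   [1e-4, 2e-4, 1.0]])
src = rng.uniform(0, 200, size=(60, 2))
dst = project(H_true, src)
dst[:15] += rng.uniform(30, 80, size=(15, 2))

H_est, inliers = ransac_homography(src, dst)
print("inliers found:", int(inliers.sum()))
print(np.round(H_est, 4))
```

Because the final estimate uses only the inlier set, the 15 corrupted matches no longer influence the result.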

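RANSAC is a general robust model-fitting method; line fitting is its simplest illustration. The example below fits the best line, then uses a threshold to discard the noise points far from that line and keep the inliers.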

import numpy as np
import matplotlib.pyplot as plt
import random
import math

# Number of points on the line.
SIZE = 50
# Generate the data. np.linspace returns a 1-D array of SIZE evenly spaced
# values from 0 to 10.
X = np.linspace(0, 10, SIZE)
Y = 3 * X + 10

fig = plt.figure()
# One row and one column of plots; select the first (only) axes.
ax1 = fig.add_subplot(1,1, 1)
# Title
ax1.set_title("RANSAC")


# Build the scatter data: jittered points on the line plus pure noise.
random_x = []
random_y = []
# Points on the line with small uniform noise
for i in range(SIZE):
    random_x.append(X[i] + random.uniform(-0.5, 0.5)) 
    random_y.append(Y[i] + random.uniform(-0.5, 0.5)) 
# Uniformly random outliers
for i in range(SIZE):
    random_x.append(random.uniform(0,10))
    random_y.append(random.uniform(10,40))
RANDOM_X = np.array(random_x) # x-coordinates of the scatter plot
RANDOM_Y = np.array(random_y) # y-coordinates of the scatter plot

# Draw the scatter plot.
ax1.scatter(RANDOM_X, RANDOM_Y)
# x-axis label
ax1.set_xlabel("x")
# y-axis label
ax1.set_ylabel("y")

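# Estimate the model with RANSAC.
# Upper bound on the number of iterations; it is lowered whenever a better
# estimate is found.
iters = 100000
# Largest acceptable difference between a point and the model
sigma = 0.25
# Parameters and inlier count of the best model so far
best_a = 0
best_b = 0
pretotal = 0
# Desired probability of obtaining a correct model
P = 0.99
i = 0
# A while loop is used so that updating iters actually shortens the search
while i < iters:
    i += 1
    # Randomly pick two points to define a candidate line
    sample_index = random.sample(range(SIZE * 2),2)
    x_1 = RANDOM_X[sample_index[0]]
    x_2 = RANDOM_X[sample_index[1]]
    y_1 = RANDOM_Y[sample_index[0]]
    y_2 = RANDOM_Y[sample_index[1]]

    # Skip vertical lines, which y = ax + b cannot represent
    if x_2 == x_1:
        continue

    # Solve y = ax + b for a and b
    a = (y_2 - y_1) / (x_2 - x_1)
    b = y_1 - a * x_1

    # Count the inliers
    total_inlier = 0
    for index in range(SIZE * 2):
        y_estimate = a * RANDOM_X[index] + b
        if abs(y_estimate - RANDOM_Y[index]) < sigma:
            total_inlier = total_inlier + 1

    # Keep the model if it is better than the previous best
    if total_inlier > pretotal:
        # Iterations needed to draw, with probability P, at least one
        # sample of two inliers at the current inlier ratio
        inlier_ratio = total_inlier / (SIZE * 2)
        if inlier_ratio < 1:
            iters = math.log(1 - P) / math.log(1 - pow(inlier_ratio, 2))
        pretotal = total_inlier
        best_a = a
        best_b = b

    # Stop early once the model fits more than half of the points
    if total_inlier > SIZE:
        break

# Evaluate the best estimate for plotting
Y = best_a * RANDOM_X + best_b

# Draw the fitted line
ax1.plot(RANDOM_X, Y)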
plt.show()

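OpenCV has a built-in RANSAC implementation. The figure below visualizes how RANSAC filters the matches: red lines are matches rejected by RANSAC, green lines are the matches finally used to compute the homography.

[Figure: feature matches between the two images; matches rejected by RANSAC in red, inlier matches in green]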

import cv2
import numpy as np
from datetime import datetime
 
 
def detectAndCompute(image):
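    # Compute SIFT keypoints and descriptors
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # SIFT is in the main module since OpenCV 4.4; older contrib builds
    # expose it as cv2.xfeatures2d.SIFT_create()
    sift = cv2.SIFT_create()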
    (kps, features) = sift.detectAndCompute(image, None)
    kps = np.float32([kp.pt for kp in kps])
    return (kps, features)
 
 
def matchKeyPoints(kpsA, kpsB, featuresA, featuresB, ratio=0.75, reprojThresh=4.0):
    # Brute-force k-NN matching followed by the ratio test
    matcher = cv2.BFMatcher()
    rawMatches = matcher.knnMatch(featuresA, featuresB, 2)
    matches = []
    for m in rawMatches:
        if len(m) == 2 and m[0].distance < ratio * m[1].distance:
            matches.append((m[0].queryIdx, m[0].trainIdx))
    # Convert the matched keypoints to float32 arrays
    kpsA = np.float32([kpsA[m[0]] for m in matches])
    kpsB = np.float32([kpsB[m[1]] for m in matches])
    # OpenCV's built-in RANSAC; status marks each match as inlier (1) or outlier (0)
    (M, status) = cv2.findHomography(kpsA, kpsB, cv2.RANSAC, reprojThresh)
    return (M, matches, status)
 
def drawMatches(imgA, imgB, kpsA, kpsB, matches, status, imageA, imageB):
    (hA, wA) = imgA.shape[0:2]
    (hB, wB) = imgB.shape[0:2]
    # Note: the canvas needs 3 channels and dtype uint8
    drawImg = np.zeros((max(hA, hB), wA + wB, 3), 'uint8')
    drawImg[0:hB, 0:wB] = imageB
    drawImg[0:hA, wB:] = imageA
    for ((queryIdx, trainIdx), s) in zip(matches, status):
        # queryIdx indexes kpsA, trainIdx indexes kpsB
        pt1 = (int(kpsB[trainIdx][0]), int(kpsB[trainIdx][1]))
        pt2 = (int(kpsA[queryIdx][0]) + wB, int(kpsA[queryIdx][1]))
        # Green: inlier kept by RANSAC; red: match rejected by RANSAC
        color = (0, 255, 0) if s == 1 else (0, 0, 255)
        cv2.line(drawImg, pt1, pt2, color)
    cv2.imwrite("drawMatches.jpg", drawImg)
    return
 
 
if __name__ == '__main__':
    imageA = cv2.imread("test_image/1.jpg")
    imageB = cv2.imread("test_image/2.jpg")
    (kpsA, featuresA) = detectAndCompute(imageA)
    (kpsB, featuresB) = detectAndCompute(imageB)
    (M, matches, status) = matchKeyPoints(kpsA, kpsB, featuresA, featuresB)
    print(status)
    drawMatches(imageA, imageB, kpsA, kpsB, matches, status, imageA, imageB)

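2.2 APAP

Image mapping with a single homography, as above, is only valid when the scene lies in one plane or when the camera only rotates about a single point (its optical centre) between shots; otherwise ghosting appears easily. The APAP (As-Projective-As-Possible) algorithm, presented at CVPR 2013, uses local homographies so that the overlapping region of two images is aligned accurately.

In the figure, the points mark the coordinates of the feature points matched between the two images, and the red line is the estimated transformation. In (a) a single global homography is used, which leaves clearly visible local deviations; in (c) APAP removes the local errors while still extrapolating the global trend of the transformation.
[Figure: matched feature points with the fitted transformation in red; (a) global homography, (c) APAP]
In short, APAP differs from the global homography in that it divides the image into many patches and estimates a homography for each patch. Each patch is then mapped according to the scene and the feature points around it, which avoids local errors.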
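Derivation:

As with the global homography, we solve for a homography matrix, except that here H depends on the location.
$\tilde{\mathbf{x}}_{*}^{\prime}=\mathbf{H}_{*} \tilde{\mathbf{x}}_{*}\quad\bold{(1)}$
Here N is the number of matched point pairs and $\mathbf{a}_{i}$ holds the two rows that the i-th pair contributes to A. Each pair gets a weight w that depends on the patch being solved
$\mathbf{h}_{*}=\underset{\mathbf{h}}{\operatorname{argmin}} \sum_{i=1}^{N}\left\|w_{*}^{i} \mathbf{a}_{i} \mathbf{h}\right\|^{2}\quad\bold{(2)}$
The weight w is determined by the distance between the reference position $\mathbf{x}_{*}$ of the current patch (its top-left corner here) and the matched point $\mathbf{x}_{i}$; σ is a scale factor, and the farther a match lies from the patch, the smaller its weight. The weights are motivated by the global case, where four matched pairs are needed for a projective transformation: a patch may contain no matched pairs at all, so pairs from other patches have to be used, weighted by their distance. When every match is very far from a patch, the weights in (3) all become almost zero and the estimate is unstable, so the weights are bounded below by a small value γ, as in (4).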
$w_{*}^{i}=\exp \left(-\left\|\mathbf{x}_{*}-\mathbf{x}_{i}\right\|^{2} / \sigma^{2}\right)\quad\bold{(3)}$

$w_{*}^{i}=\max \left(\exp \left(-\left\|\mathbf{x}_{*}-\mathbf{x}_{i}\right\|^{2} / \sigma^{2}\right), \gamma\right)(0 <\gamma<1)\quad\bold{(4)}$

The local warp can therefore be written as follows, where W is a diagonal matrix.
$\mathbf{h}_{*}=\underset{\mathbf{h}}{\operatorname{argmin}}\left\|\mathbf{W}_{*} \mathbf{A h}\right\|^{2}\quad\bold{(5)}$

$\mathbf{W}_{*}=\operatorname{diag}\left(\left[w_{*}^{1} w_{*}^{1} \ldots w_{*}^{N} w_{*}^{N}\right]\right)\quad\bold{(6)}$
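Equations (4) to (6) amount to a weighted DLT that is solved once per patch position. The sketch below shows one such solve; the function names (`dlt_rows`, `moving_dlt`) and the values of `sigma` and `gamma` are assumptions made for illustration and are not taken from the APAP reference implementation.

```python
import numpy as np

def dlt_rows(src, dst):
    """Stack the two DLT rows of every correspondence into a (2N, 9) matrix A."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    return np.asarray(rows, dtype=float)

def moving_dlt(src, dst, x_star, sigma=50.0, gamma=0.05):
    """Local homography H_* for the position x_star, following (4)-(6)."""
    d2 = np.sum((src - x_star) ** 2, axis=1)
    w = np.maximum(np.exp(-d2 / sigma ** 2), gamma)    # eq. (4)
    W = np.diag(np.repeat(w, 2))                        # eq. (6): each weight twice
    _, _, vt = np.linalg.svd(W @ dlt_rows(src, dst))    # eq. (5)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2], w

# Synthetic matches generated by one exact homography (assumed values)
rng = np.random.default_rng(0)
H_true = np.array([[1.05, 0.02, 15.0],
                   [-0.01, 0.98, 5.0],
                   [1e-4, 1e-4, 1.0]])
src = rng.uniform(0, 200, size=(30, 2))
homog = np.hstack([src, np.ones((30, 1))]) @ H_true.T
dst = homog[:, :2] / homog[:, 2:3]

x_star = np.array([25.0, 25.0])   # position of the patch being solved
H_local, w = moving_dlt(src, dst, x_star)
print("weight range:", w.min(), w.max())
print(np.round(H_local, 4))
```

When the matches really come from a single homography, every local solve returns that same matrix; the weights only change the result when the data deviate from a global projective model, which is the case APAP targets.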

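2.3 Min-cut / max-flow

$M(s, t, \mathbf{A}, \mathbf{B})=\|\mathbf{A}(s)-\mathbf{B}(s)\|+\|\mathbf{A}(t)-\mathbf{B}(t)\|$
Every overlapping region contains pixels from both Patch A and Patch B. The overlap is represented as a graph whose nodes are pixels, and the edge between two adjacent pixels s and t is weighted by the formula above. The minimum cut, i.e. the cut with the smallest total weight, is then used as the seam. By the max-flow min-cut theorem, the value of the maximum flow equals the capacity of the minimum cut, so the seam can be found with a max-flow algorithm.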
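A small sketch of seam selection with this edge weight is given below. It assumes grayscale overlaps, ties the left column to A and the right column to B, and uses a plain Edmonds-Karp max-flow that is only suitable for tiny examples; the function name `seam_labels` and the test images are hypothetical.

```python
import numpy as np
from collections import deque

def seam_labels(A, B):
    """Label each overlap pixel 0 (take A) or 1 (take B) with a minimum cut."""
    h, w = A.shape
    diff = np.abs(A - B)            # ||A(s) - B(s)|| for scalar pixels
    src, sink = h * w, h * w + 1    # terminal nodes
    INF = 1e12
    cap = {}                        # cap[u][v] = residual capacity

    def add_edge(u, v, c):          # undirected edge with capacity c
        cap.setdefault(u, {})[v] = cap.get(u, {}).get(v, 0.0) + c
        cap.setdefault(v, {})[u] = cap.get(v, {}).get(u, 0.0) + c

    for y in range(h):
        add_edge(src, y * w, INF)               # left column comes from A
        add_edge(y * w + w - 1, sink, INF)      # right column comes from B
        for x in range(w):
            s = y * w + x
            if x + 1 < w:                       # horizontal neighbour t
                add_edge(s, s + 1, diff[y, x] + diff[y, x + 1])
            if y + 1 < h:                       # vertical neighbour t
                add_edge(s, s + w, diff[y, x] + diff[y + 1, x])

    def bfs_parents():
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: augment along shortest paths until the sink is unreachable
    while True:
        parent = bfs_parents()
        if sink not in parent:
            break
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push

    # Pixels still reachable from the source keep A; the rest take B
    labels = np.ones(h * w, dtype=int)
    for node in parent:
        if node < h * w:
            labels[node] = 0
    return labels.reshape(h, w)

A = np.full((4, 5), 10.0)
B = np.full((4, 5), 50.0)
B[:, 2] = 10.0    # the images agree in column 2
B[:, 3] = 20.0    # and almost agree in column 3
labels = seam_labels(A, B)
print(labels)
```

The cut falls between columns 2 and 3, where the two images differ least, so the visible difference along the seam is as small as possible.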

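2.4 Multi-band blending

After the seam has been found with graph cut, differences in illumination, noise and similar factors can make it look abrupt. Multi-band blending removes the visible edge by blending the images.

1. Build the Gaussian pyramid of each image and derive its Laplacian pyramid, giving the Laplacian pyramids of images A and B.
2. Blend Laplacian pyramid A and Laplacian pyramid B level by level with a weighted sum

3. Reconstruct the image from the blended pyramid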
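The three steps can be sketched with NumPy alone. This is a simplified illustration: the 5-tap binomial blur stands in for a proper Gaussian filter, image sides are assumed to be divisible by 2**levels, and all function names are hypothetical.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1]) / 16.0

def blur(img):
    """Separable 5-tap binomial blur with edge replication."""
    h, w = img.shape
    p = np.pad(img, 2, mode="edge")
    rows = sum(k * p[:, i:i + w] for i, k in enumerate(KERNEL))      # horizontal pass
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))   # vertical pass

def down(img):
    """Blur, then keep every second pixel."""
    return blur(img)[::2, ::2]

def up(img):
    """Double the resolution by pixel repetition, then blur."""
    return blur(img.repeat(2, axis=0).repeat(2, axis=1))

def gaussian_pyramid(img, levels):
    pyr = [img]
    for _ in range(levels):
        pyr.append(down(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    # Each band is the detail lost between two pyramid levels; the last entry is the coarsest image
    return [g[k] - up(g[k + 1]) for k in range(levels)] + [g[-1]]

def reconstruct(lap):
    img = lap[-1]
    for band in reversed(lap[:-1]):
        img = up(img) + band
    return img

def multiband_blend(a, b, mask, levels=3):
    """Blend a (where mask = 1) and b (where mask = 0) band by band."""
    la = laplacian_pyramid(a, levels)               # step 1
    lb = laplacian_pyramid(b, levels)
    gm = gaussian_pyramid(mask, levels)             # smooth weights for every band
    blended = [m * x + (1 - m) * y for m, x, y in zip(gm, la, lb)]   # step 2
    return reconstruct(blended)                     # step 3

a = np.full((16, 64), 100.0)     # image A: uniformly dark
b = np.full((16, 64), 200.0)     # image B: uniformly bright
mask = np.zeros((16, 64))
mask[:, :32] = 1.0               # seam in the middle: A on the left, B on the right
out = multiband_blend(a, b, mask)
print(np.round(out[0, ::8], 1))
```

Instead of jumping from 100 to 200 at the seam, the blended row rises gradually, because the low-frequency band is mixed over a wide region around the seam.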

2.5 Panorama stitching code

from pylab import *
from numpy import *
from PIL import Image

# If you have PCV installed, these imports should work

from PCV.geometry import homography, warp
from PCV.localdescriptors import sift
np.seterr(invalid='ignore')
"""
This is the panorama example from section 3.3.
"""

# Paths to the feature files and images in the data folder
featname = ['D:/desktop/cv_task/concate/photo/' + str(i + 1) + '.sift' for i in range(3)]
imname = ['D:/desktop/cv_task/concate/photo/' + str(i + 1) + '.jpg' for i in range(3)]

# Extract features with SIFT and match them
l = {}
d = {}
for i in range(3):
    sift.process_image(imname[i], featname[i])
    l[i], d[i] = sift.read_features_from_file(featname[i])

matches = {}
for i in range(2):
    matches[i] = sift.match(d[i + 1], d[i])

# Visualize the matches
for i in range(2):
    im1 = array(Image.open(imname[i]))
    im2 = array(Image.open(imname[i + 1]))
    figure()
    sift.plot_matches(im2, im1, l[i + 1], l[i], matches[i], show_below=True)


# Convert the matches to points in homogeneous coordinates
def convert_points(j):
    ndx = matches[j].nonzero()[0]
    fp = homography.make_homog(l[j + 1][ndx, :2].T)
    ndx2 = [int(matches[j][i]) for i in ndx]
    tp = homography.make_homog(l[j][ndx2, :2].T)

    # switch x and y - TODO this should move elsewhere
    fp = vstack([fp[1], fp[0], fp[2]])
    tp = vstack([tp[1], tp[0], tp[2]])
    return fp, tp


# Estimate the homographies
model = homography.RansacModel()

fp, tp = convert_points(1)
H_12 = homography.H_from_ransac(fp, tp, model)[0]  # im 1 to 2

fp, tp = convert_points(0)
H_01 = homography.H_from_ransac(fp, tp, model)[0]  # im 0 to 1

# Warp the images
delta = 2000  # for padding and translation

im1 = array(Image.open(imname[1]), "uint8")
im2 = array(Image.open(imname[2]), "uint8")
im_12 = warp.panorama(H_12, im1, im2, delta, delta)

im1 = array(Image.open(imname[0]), "f")
im_02 = warp.panorama(dot(H_12, H_01), im1, im_12, delta, delta)

figure()
imshow(array(im_02, "uint8"))
axis('off')
show()
