Programming Computer Vision, Chapter 2

Latest recommended article published 2024-09-06 10:05:08

_Taylor

Category: Programming Computer Vision. Tags: computer vision, artificial intelligence

Article link: https://blog.csdn.net/sketch_2314/article/details/132102517

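Chapter 2: Local Image Descriptors

Preface
2.1 Harris Corner Detector
2.2 SIFT (Scale-Invariant Feature Transform)
2.3 Matching Geotagged Images
- 2.3.1 Downloading Geotagged Images
- 2.3.2 Matching Local Descriptors and Visualizing the Result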

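Preface

This column follows the book Programming Computer Vision with Python by Jan Erik Solem. Chapter 2 covers local corner points in an image and corresponding points and regions between images; images are matched by matching the descriptors of these points. The main tools are the Harris corner detector and SIFT, both of which play an important role in panorama creation, object tracking, and 3D reconstruction.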

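2.1 Harris Corner Detector

The Harris corner detector finds corners in an image, that is, locations where the gray level changes markedly in more than one direction.

A corner is a point with a special property: when the image is translated or rotated, its position stays stable, because the pixel neighborhood contains edges in more than one direction. Such a point is also called an interest point. (The Harris detector is not invariant to scale changes, which is one motivation for SIFT in Section 2.2.)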

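At a pixel x in the image domain, the symmetric positive semi-definite matrix $M_I =M_I(x)$ is defined as $M_I = \nabla I\nabla I^T=\begin{bmatrix}I_x\\I_y\end{bmatrix}\begin{bmatrix}I_x&I_y\end{bmatrix}=\begin{bmatrix}I_x^2 & I_xI_y \\I_xI_y & I_y^2 \\\end{bmatrix}$ where $\nabla I$ is the image gradient defined in the previous chapter. By this definition $M_I$ has rank 1, with eigenvalues $\lambda_1=|\nabla I|^2$ and $\lambda_2=0$. Convolving each entry of $M_I$ with a weighting kernel (usually a Gaussian filter $G_\sigma$) gives a local average over the pixels around x, $\overline M_I = G_\sigma\ast M_I$, called the Harris matrix. Depending on the values of $\nabla I$ in the region, the eigenvalues of $\overline M_I$ fall into three cases:

$\lambda_1$ and $\lambda_2$ are both large positive numbers: x is a corner
$\lambda_1$ is large and $\lambda_2\approx0$: the region contains an edge
$\lambda_1\approx\lambda_2\approx0$: the region is flat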
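
The three cases above can be folded into a single score per pixel. The following is a minimal sketch, not the book's compute_harris_response: it uses NumPy finite differences instead of Gaussian derivatives, and takes the ratio det/trace of the smoothed matrix as the response (one common choice, assumed here). The names gaussian_kernel, smooth and harris_response are made up for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at about 3 sigma and normalised to sum to 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def smooth(im, sigma):
    """Separable Gaussian smoothing with edge padding (hypothetical helper)."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(im, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def harris_response(im, sigma=1.0, eps=1e-12):
    """Per-pixel det/trace of the Gaussian-averaged gradient matrix."""
    im = im.astype(float)
    Iy, Ix = np.gradient(im)       # derivatives along rows (y) and columns (x)
    Wxx = smooth(Ix * Ix, sigma)   # entries of the averaged matrix
    Wxy = smooth(Ix * Iy, sigma)
    Wyy = smooth(Iy * Iy, sigma)
    det = Wxx * Wyy - Wxy**2       # equals lambda_1 * lambda_2
    trace = Wxx + Wyy              # equals lambda_1 + lambda_2
    return det / (trace + eps)     # eps avoids 0/0 in flat regions

# Toy image: a bright square on a dark background
square = np.zeros((40, 40))
square[10:30, 10:30] = 1.0
R = harris_response(square)
print(R[10, 10], R[10, 20], R[2, 2])  # corner, edge midpoint, flat area
```

Because det equals $\lambda_1\lambda_2$ and trace equals $\lambda_1+\lambda_2$, the ratio is large only when both eigenvalues are large; it vanishes along a straight edge and in flat regions.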

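The corner response function compute_harris_response is implemented with Gaussian derivatives and returns an image of Harris response values; get_harris_points then extracts the corners from it. The code below first displays the response image and then uses a for loop to display the corners detected at three different thresholds.

from PIL import Image
from pylab import *

import harris  # assumed: the book's harris.py (compute_harris_response, get_harris_points) is on the import path

im = array(Image.open('filelist/PIL4.jpg').convert('L'))

threshold = [0.01, 0.05, 0.1]

# Compute the Harris response once and reuse it for every threshold
harrisim = harris.compute_harris_response(im)

# Show the inverted response image in the first panel
harrisim0 = 255 - harrisim
subplot(2, 2, 1)
gray()
imshow(harrisim0)
axis('off')

# Show the corners kept at each threshold (minimum distance of 10 pixels)
for i in range(3):
    filtered_coords = harris.get_harris_points(harrisim, 10, threshold[i])
    subplot(2, 2, i + 2)
    gray()
    imshow(im)
    plot([p[1] for p in filtered_coords], [p[0] for p in filtered_coords], '*')
    axis('off')

show()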

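[Figure: Harris response image and the corners detected at thresholds 0.01, 0.05 and 0.1]
As you can see, the number of detected corners drops as the threshold increases. The threshold sets the minimum response value: a pixel counts as a corner candidate only if its response exceeds the threshold (in get_harris_points, a fraction of the maximum response), so a higher threshold keeps only pixels whose surrounding window shows very strong gray-level changes.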
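
To make the role of the threshold concrete, here is a small sketch of the selection step in the spirit of get_harris_points: keep pixels whose response exceeds a fraction of the maximum, then accept them from strongest to weakest while suppressing neighbors within min_dist pixels. The function name select_corners and the toy response array are assumptions for illustration, not the book's code.

```python
import numpy as np

def select_corners(response, min_dist=10, threshold=0.1):
    """Return (row, col) points above threshold * max, at least min_dist apart.

    Assumes min_dist >= 1.
    """
    cutoff = threshold * response.max()
    mask = response > cutoff

    # Ignore a border of min_dist pixels around the image
    allowed = np.zeros(response.shape, dtype=bool)
    allowed[min_dist:-min_dist, min_dist:-min_dist] = True

    # Visit candidates from strongest to weakest response
    coords = np.argwhere(mask)                # row-major order, same as response[mask]
    order = np.argsort(response[mask])[::-1]
    selected = []
    for r, c in coords[order]:
        if allowed[r, c]:
            selected.append((int(r), int(c)))
            # Suppress everything within min_dist of the accepted point
            allowed[r - min_dist:r + min_dist + 1,
                    c - min_dist:c + min_dist + 1] = False
    return selected

# Toy response map with four peaks of different strength
toy = np.zeros((50, 50))
toy[20, 20] = 1.0   # strongest peak
toy[22, 22] = 0.8   # too close to the strongest peak, suppressed
toy[35, 35] = 0.5   # weaker but isolated
toy[2, 2] = 0.9     # inside the ignored border

low = select_corners(toy, min_dist=5, threshold=0.1)
high = select_corners(toy, min_dist=5, threshold=0.6)
print(low, high)
```

Raising the threshold from 0.1 to 0.6 drops the isolated weak peak, which is the same effect seen in the three corner plots.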

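The code above only detects corners within a single image. An interest point descriptor is a vector assigned to a corner that describes the image appearance around it; corresponding points are points in different images that depict the same part of an object or scene, and the better the descriptor, the better the correspondences that can be found. In general, the correlation of two equally sized image patches $I_1(x)$ and $I_2(x)$ is defined as $c(I_1,I_2)=\sum_xf(I_1(x),I_2(x))$, where the function $f$ depends on the correlation method. For cross-correlation (which measures how similar the two patches are), $f(I_1,I_2)=I_1\cdot I_2$, and the larger $c(I_1,I_2)$ is, the more similar $I_1(x)$ and $I_2(x)$ are.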
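
As a rough illustration of the correlation formula, the sketch below computes the plain cross-correlation of two patches and a normalized variant that subtracts each patch's mean and divides by its standard deviation. The normalized form is an addition here (it makes the score insensitive to brightness and contrast changes); the function names and the toy patches are assumptions.

```python
import numpy as np

def cross_correlation(p1, p2):
    """c(I1, I2) = sum over x of I1(x) * I2(x)."""
    return float(np.sum(p1 * p2))

def ncc(p1, p2, eps=1e-12):
    """Normalised cross-correlation, in the range [-1, 1]."""
    d1 = (p1 - p1.mean()) / (p1.std() + eps)
    d2 = (p2 - p2.mean()) / (p2.std() + eps)
    return float(np.mean(d1 * d2))

patch = np.arange(25, dtype=float).reshape(5, 5)
brighter = 2.0 * patch + 10.0   # same structure, different brightness and contrast
inverted = -patch               # opposite structure

print(cross_correlation(patch, patch), cross_correlation(patch, brighter))
print(ncc(patch, patch), ncc(patch, brighter), ncc(patch, inverted))
```

The raw score of the patch against its brightened copy is higher than against itself, which shows why the unnormalized sum is sensitive to intensity changes, while the normalized score treats both as a perfect match.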

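The following code finds corresponding points between two images; the helper functions have been added to the corresponding files. Harris corners are detected in both images, while the correspondences themselves are computed from SIFT descriptors with a FLANN matcher.

# Find corresponding points between two images
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load both images as grayscale
image1 = cv2.imread('filelist/SIFT1.jpg', 0)
image2 = cv2.imread('filelist/SIFT2.jpg', 0)

def detect_harris_corners(image):
    # blockSize=2, Sobel aperture ksize=3, Harris parameter k=0.04
    dst = cv2.cornerHarris(image, 2, 3, 0.04)
    # Dilate the response, then keep pixels above 1% of the maximum response
    dst = cv2.dilate(dst, None)
    corners = np.where(dst > 0.01 * dst.max())
    return list(zip(corners[1], corners[0]))  # (x, y) pairs

# Harris corners of each image (kept for reference; the matching below uses SIFT keypoints)
corners1 = detect_harris_corners(image1)
corners2 = detect_harris_corners(image2)

sift = cv2.SIFT_create()

# Compute SIFT keypoints and descriptors in both images
keypoints1, descriptors1 = sift.detectAndCompute(image1, None)
keypoints2, descriptors2 = sift.detectAndCompute(image2, None)

# Match features with the FLANN matcher (KD-tree index)
FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(descriptors1, descriptors2, k=2)

# Keep a match only if it passes the ratio test against the second-best candidate
good_matches = []
for m, n in matches:
    if m.distance < 0.7 * n.distance:
        good_matches.append(m)

# Extract the keypoint coordinates of the matched pairs
matched_points1 = [keypoints1[match.queryIdx].pt for match in good_matches]
matched_points2 = [keypoints2[match.trainIdx].pt for match in good_matches]

def draw_matches(image1, image2, matched_points1, matched_points2):
    # Place the images side by side (they must have the same height) and
    # convert to RGB so the colored lines and circles are visible
    combined_image = np.concatenate((image1, image2), axis=1)
    combined_image = cv2.cvtColor(combined_image, cv2.COLOR_GRAY2RGB)
    offset = image1.shape[1]
    for pt1, pt2 in zip(matched_points1, matched_points2):
        pt1 = (int(pt1[0]), int(pt1[1]))
        pt2 = (int(pt2[0] + offset), int(pt2[1]))
        cv2.line(combined_image, pt1, pt2, (0, 255, 0), 1)
        cv2.circle(combined_image, pt1, 5, (255, 0, 0), -1)
        cv2.circle(combined_image, pt2, 5, (255, 0, 0), -1)
    plt.imshow(combined_image)
    plt.axis('off')
    plt.show()

draw_matches(image1, image2, matched_points1, matched_points2)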

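[Figure: matches drawn between the two images]
In the figure, the edges of the mountain and the edge of the upright tree are matched well.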

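2.2 SIFT (Scale-Invariant Feature Transform)

SIFT (Scale-Invariant Feature Transform) is an algorithm for extracting image features. It detects keypoints that are invariant to scale and produces stable feature descriptors. SIFT finds keypoints by Gaussian-filtering the image at different scales and taking differences between neighboring scales, then builds a distinctive descriptor from the local image gradients and their orientations, which makes it possible to match and recognize images across different scenes.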

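2.2.1 Interest Points and Descriptors

Interest points correspond to particular visual structures in an image, such as edges, corners, and textures, and remain stable to some degree across images: even after scaling, rotation, or a change of viewpoint, the same interest points can still be put in correspondence. SIFT locates interest points using the difference-of-Gaussian function $D(x,\sigma)=[G_{k\sigma}(x)-G_{\sigma}(x)]\ast I(x)=[G_{k\sigma}-G_{\sigma}]\ast I=I_{k\sigma}-I_{\sigma}$ where $G_{\sigma}$ is the two-dimensional Gaussian kernel, $I_{\sigma}$ is the gray-level image blurred with $G_{\sigma}$, and $k$ is a constant that determines the separation in scale. Interest points are the maxima and minima of $D(x,\sigma)$ across both image location and scale.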
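
The difference-of-Gaussian formula can be tried out directly. Below is a minimal NumPy-only sketch that blurs an image at two scales and subtracts the results; it shows a single DoG layer, not SIFT's full scale-space pyramid. The helper names and the default values of sigma and k are assumptions for illustration.

```python
import numpy as np

def gaussian_blur(im, sigma):
    """Separable Gaussian blur with edge padding (hypothetical helper)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    padded = np.pad(im.astype(float), radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def difference_of_gaussians(im, sigma=1.6, k=2 ** 0.5):
    """D = I_{k sigma} - I_{sigma} for one pair of scales."""
    return gaussian_blur(im, k * sigma) - gaussian_blur(im, sigma)

# A single bright dot: the DoG response has its extremum at the dot
dot = np.zeros((41, 41))
dot[20, 20] = 1.0
D = difference_of_gaussians(dot)
peak = np.unravel_index(np.abs(D).argmax(), D.shape)
print(peak, D[20, 20])

# A constant image has no structure, so its DoG response is (numerically) zero
flat = difference_of_gaussians(np.ones((41, 41)))
```

The response at the dot is negative because the stronger blur spreads the intensity out more, so $I_{k\sigma}$ is smaller than $I_{\sigma}$ at the center; SIFT keeps both minima and maxima of this function as candidates.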

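A descriptor describes the image content around an interest point and is computed from nearby pixel values, gradients, or similar information. Its goal is to capture the local structure around the point so that similar interest points in different images get similar descriptors. It is computed by sampling the region around the interest point and extracting features from it, turning the local image information into a vector.

2.2.2 Detecting Interest Points

Here SIFT is used to detect interest points. SIFT detects stable feature points and is robust to changes in scale, rotation, and brightness. The code is shown below.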
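
To give a feel for how local gradients become a vector, here is a heavily simplified sketch: one histogram of gradient orientations over a patch, weighted by gradient magnitude and normalized to unit length. The real SIFT descriptor concatenates such 8-bin histograms over a 4x4 grid of subregions (128 values) and adds orientation normalization and Gaussian weighting, none of which is done here. The function name and toy patch are assumptions.

```python
import numpy as np

def orientation_histogram(patch, bins=8):
    """Magnitude-weighted histogram of gradient orientations, L2-normalised."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)                  # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)     # orientation in [0, 2*pi)
    hist, _ = np.histogram(angle, bins=bins, range=(0, 2 * np.pi),
                           weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# A horizontal intensity ramp: every gradient points along +x (angle 0)
ramp = np.tile(np.arange(8, dtype=float), (8, 1))
h_ramp = orientation_histogram(ramp)
h_shifted = orientation_histogram(ramp + 50.0)   # same patch, brighter
print(h_ramp)
```

Because only gradients enter the histogram, adding a constant brightness offset leaves the vector unchanged, which is one reason gradient-based descriptors tolerate lighting changes.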

import cv2
from PIL import Image
from pylab import *

import harris  # assumed: the book's harris.py is on the import path

def draw_sift_keypoints(image, keypoints):
    # Draw a circle for every small-scale keypoint (size < 10), with radius proportional to its scale
    for kp in keypoints:
        x, y = kp.pt
        scale = kp.size
        if scale < 10:
            radius = int(scale * 1.5)
            cv2.circle(image, (int(x), int(y)), radius, (0, 255, 0), 2)

# Panel 1: the original grayscale image
im = array(Image.open('filelist/PIL4.jpg').convert('L'))
subplot(1, 3, 1)
gray()
imshow(im)
axis('off')

# Panel 2: Harris corners at threshold 0.05
harrisim = harris.compute_harris_response(im)
filtered_coords = harris.get_harris_points(harrisim, 10, 0.05)
subplot(1, 3, 2)
gray()
imshow(im)
plot([p[1] for p in filtered_coords], [p[0] for p in filtered_coords], '*')
axis('off')

# Panel 3: SIFT keypoints; im is already 8-bit grayscale, so no color conversion is needed
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(im, None)
# Draw on an RGB copy so the colored circles are visible
image_with_keypoints = cv2.cvtColor(im, cv2.COLOR_GRAY2RGB)
draw_sift_keypoints(image_with_keypoints, keypoints)

subplot(1, 3, 3)
imshow(image_with_keypoints)
axis('off')

show()

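[Figure: original image (left), Harris corners (middle), SIFT features (right)]
As the figure shows, compared with the Harris corners (star markers, middle panel), the SIFT features (circles, right panel) differ in many regions: the two methods select feature points at different locations.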

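2.2.3 Matching Descriptors

As with interest point detection above, SIFT is used here to match corresponding feature descriptors. The main code is as follows.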

import cv2
import numpy as np

def draw_matches(img1, kp1, img2, kp2, matches):
    # Draw matches between two images
    h1, w1 = img1.shape[:2]
    h2, w2 = img2.shape[:2]
    vis = np.zeros((max(h1, h2), w1 + w2, 3), dtype=np.uint8)
    vis[:h1, :w1] = img1
    vis[:h2, w1:w1 + w2] = img2
    for m in matches:
        (x1, y1) = kp1[m.queryIdx].pt
        (x2, y2) = kp2[m.trainIdx].pt
        cv2.circle(vis, (int(x1), int(y1)), 4, (0, 255, 0), 1)
        cv2.circle(vis, (int(x2) + w1, int(y2)), 4, (0, 255, 0), 1)
        cv2.line(vis, (int(x1), int(y1)), (int(x2) + w1, int(y2)), (0, 255, 0), 1)
    return vis

def main():
    # Load the two images
    image1_path = 'filelist/SIFT1.jpg'
    image2_path = 'filelist/SIFT2.jpg'
    image1 = cv2.imread(image1_path)
    image2 = cv2.imread(image2_path)

    # Convert the images to grayscale
    gray_image1 = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)
    gray_image2 = cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY)

    # Create SIFT detector
    sift = cv2.SIFT_create()

    # Detect SIFT keypoints and descriptors for both images
    keypoints1, descriptors1 = sift.detectAndCompute(gray_image1, None)
    keypoints2, descriptors2 = sift.detectAndCompute(gray_image2, None)

    # Create a Brute-Force Matcher
    bf = cv2.BFMatcher()

    # Match descriptors of both images
    matches = bf.knnMatch(descriptors1, descriptors2, k=2)

    # Apply ratio test to get good matches
    good_matches = []
    for m, n in matches:
        if m.distance < 0.75 * n.distance:
            good_matches.append(m)

    # Draw and display the matches
    matched_image = draw_matches(image1, keypoints1, image2, keypoints2, good_matches)

    cv2.imshow('SIFT Matches', matched_image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()

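[Figure: SIFT matches between the two images]
The circles mark the SIFT features of each image and the thin lines connect corresponding features between the two images. Not every feature point finds a counterpart, but most of the features that do have one are connected.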
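
Why some features stay unmatched follows from the ratio test used in the code: a match is kept only when the nearest descriptor is clearly closer than the second nearest. The sketch below reimplements that filter with plain NumPy on toy 2-D descriptors; it illustrates the idea and is not OpenCV's matcher. The function name and the toy data are assumptions.

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.75):
    """Return (index in desc1, index in desc2) pairs that pass the ratio test."""
    # Pairwise Euclidean distances, shape (len(desc1), len(desc2))
    dists = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        nearest, second = np.argsort(row)[:2]
        # Keep the match only if the best candidate is clearly better than the runner-up
        if row[nearest] < ratio * row[second]:
            matches.append((i, int(nearest)))
    return matches

desc2 = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
desc1 = np.array([
    [0.1, 0.0],   # very close to desc2[0]: kept
    [5.0, 0.0],   # halfway between desc2[0] and desc2[1]: ambiguous, rejected
    [0.0, 9.9],   # very close to desc2[2]: kept
])
pairs = ratio_test_matches(desc1, desc2)
print(pairs)  # [(0, 0), (2, 2)]
```

The ambiguous descriptor is dropped rather than matched to an arbitrary neighbor, which trades a few missed correspondences for fewer false ones.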

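2.3 Matching Geotagged Images

2.3.1 Downloading Geotagged Images

The Panoramio.com site used in the book has shut down, so photos of the Statue of Liberty in the United States taken from various angles were collected from the web instead.

2.3.2 Matching Local Descriptors and Visualizing the Result

The book's process_image function could not export .sift files on my machine, which blocked the later steps, so this section computes and matches the descriptors with OpenCV's SIFT and visualizes the result with the networkx library.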

import os
import cv2
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

def compute_descriptors(image_path):
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(image, None)
    return keypoints, descriptors

def match_images(imlist):
    # Compute SIFT descriptors once per image instead of once per pair
    descriptors = [compute_descriptors(path)[1] for path in imlist]
    bf = cv2.BFMatcher()  # BFMatcher with default params

    matches = {}
    for i in range(len(imlist)):
        matches[i] = []
        for j in range(i + 1, len(imlist)):
            # knnMatch with k=2 needs descriptors in both images
            if descriptors[i] is None or descriptors[j] is None or len(descriptors[j]) < 2:
                matches[i].append((j, 0))
                continue

            matches_ij = bf.knnMatch(descriptors[i], descriptors[j], k=2)

            # Apply ratio test
            good_matches = [m for m, n in matches_ij if m.distance < 0.75 * n.distance]

            matches[i].append((j, len(good_matches)))

    return matches

def visualize_matches(imlist, matches):
    G = nx.Graph()

    for i, image_path in enumerate(imlist):
        image = cv2.imread(image_path)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        G.add_node(i, image=image)

    for i in range(len(imlist)):
        for j, num_matches in matches[i]:
            G.add_edge(i, j, weight=num_matches)

    pos = nx.spring_layout(G, seed=42)  # Positioning the nodes using a spring layout algorithm
    fig, ax = plt.subplots(figsize=(12, 8))
    nx.draw(G, pos, ax=ax, with_labels=False, node_size=2000, node_color="skyblue",
            edge_color="gray", width=1.5, alpha=0.7)

    # Adding images to the nodes: convert data coordinates to figure coordinates
    trans = ax.transData.transform
    trans2 = fig.transFigure.inverted().transform

    for node in G.nodes():
        (x, y) = pos[node]
        xx, yy = trans((x, y))       # data -> display (pixel) coordinates
        xa, ya = trans2((xx, yy))    # display -> figure-fraction coordinates
        image = G.nodes[node]["image"]
        # xa, ya are figure fractions, so the thumbnail axes is added to the figure itself
        imagebox = fig.add_axes([xa - 0.03, ya - 0.03, 0.06, 0.06])
        imagebox.imshow(image)
        imagebox.set_aspect('auto')
        imagebox.axis("off")

    plt.savefig("output_graph.png", bbox_inches="tight")
    plt.show()

if __name__ == "__main__":
    imlist_folder = "./imlist"
    # Sort the file names so that node numbering is reproducible across runs
    imlist = sorted(os.path.join(imlist_folder, filename) for filename in os.listdir(imlist_folder) if filename.endswith(".jpg"))

    matches = match_images(imlist)
    visualize_matches(imlist, matches)

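[Figure: graph of the images, with a thumbnail on each node]
Each node (image) is drawn with a thumbnail of the corresponding image. Each edge connects a pair of image nodes, and its weight is the number of local descriptor matches between the two images. These are the good matches kept after applying SIFT and filtering by the distance ratio between feature descriptors.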

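In the spring layout, a larger edge weight pulls the two nodes together more strongly, so image pairs with more matches, which are visually more similar, tend to be drawn closer together; pairs with fewer matches tend to end up farther apart.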
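
Since every pair of images gets an edge in the code above, even pairs with almost no matches, the graph can become cluttered. One possible refinement, sketched here as an assumption rather than part of the original code, is to keep only pairs with at least a minimum number of matches. The function name weighted_edges, the parameter min_matches, and the example counts are hypothetical.

```python
def weighted_edges(matches, min_matches=1):
    """Flatten the {i: [(j, count), ...]} dict into (i, j, count) edges.

    Pairs with fewer than min_matches good matches are dropped; the
    result is sorted from the strongest to the weakest link.
    """
    edges = []
    for i, neighbours in matches.items():
        for j, count in neighbours:
            if count >= min_matches:
                edges.append((i, j, count))
    return sorted(edges, key=lambda e: e[2], reverse=True)

# Made-up match counts in the same format that match_images returns
example = {0: [(1, 40), (2, 3)], 1: [(2, 12)], 2: []}
edges = weighted_edges(example, min_matches=10)
print(edges)  # [(0, 1, 40), (1, 2, 12)]
```

The resulting triples can be passed to networkx with G.add_weighted_edges_from(edges), which stores the count in the same weight attribute used above.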
