[Math in AI · Linear Algebra · Worked Examples with Python Implementations] Phantom Transformations: The Affine Art of Images

Article link: https://blog.csdn.net/l35633/article/details/145070901

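Chapter 3: Linear Algebra, Worked Examples

Section 10: Phantom Transformations: The Affine Art of Images

An affine transformation is a basic yet powerful geometric transformation in image processing and computer vision. It combines a linear map with a translation, which lets it rotate, scale, translate, and shear an image. In deep learning and AI applications, affine transformations are widely used for data augmentation, image rectification, object detection, and similar tasks. This section works through five practical cases; each includes a description, an analysis, the algorithm steps, and annotated Python code.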
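
For reference, the "linear map plus translation" described above can be written in standard notation (the symbols below are conventional choices, not taken from the original text). The right-hand form uses homogeneous coordinates, and its top two rows are the 2x3 matrix that functions such as cv2.warpAffine expect:

```latex
\begin{bmatrix} x' \\ y' \end{bmatrix}
=
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
+
\begin{bmatrix} t_x \\ t_y \end{bmatrix}
\quad\Longleftrightarrow\quad
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
```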

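Case 1: Affine Transformations in Image Data Augmentation

1. Case Description

When training a deep learning model, having enough data is critical to model performance, yet large labeled datasets are often difficult and expensive to obtain. Image data augmentation applies transformations to existing images to produce more varied training samples, which improves the model's ability to generalize. This case shows how to augment images with affine transformations, covering rotation, scaling, and translation.

2. Case Analysis

By increasing the diversity of the training data, augmentation helps the model learn the underlying features of the data and reduces overfitting. An affine transformation, that is, a linear map followed by a translation, can approximate how an image changes under different in-plane viewing conditions. Specifically:

Rotation: simulates viewing the image at different angles, making the model more robust to orientation changes.
Scaling: changes the image size, helping the model adapt to targets of different sizes.
Translation: shifts the image content, making the model more tolerant of changes in target position.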
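
As a minimal sketch of the three operations listed above, the following NumPy snippet builds each one as a 3x3 homogeneous matrix and applies it to the corners of a unit square. The helper names (rotation, scaling, translation, apply) are illustrative only and do not appear in the original code.

```python
import numpy as np

def rotation(theta_deg):
    """3x3 rotation about the origin (counter-clockwise in a y-up frame;
    it looks clockwise in image coordinates, where y points down)."""
    t = np.deg2rad(theta_deg)
    return np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t),  np.cos(t), 0.0],
                     [0.0,        0.0,       1.0]])

def scaling(sx, sy):
    """3x3 scaling about the origin."""
    return np.array([[sx, 0.0, 0.0],
                     [0.0, sy, 0.0],
                     [0.0, 0.0, 1.0]])

def translation(tx, ty):
    """3x3 translation by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def apply(M, pts):
    """Apply a 3x3 homogeneous matrix to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (homog @ M.T)[:, :2]

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rotated = apply(rotation(90), square)
scaled = apply(scaling(2, 3), square)
shifted = apply(translation(100, 50), square)
print(rotated, scaled, shifted, sep="\n")
```

Dropping the last row of any of these matrices gives the 2x3 form used by the OpenCV code in this case.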

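3. Algorithm Steps

Load the image: read the image to be augmented.
Define the affine transformation matrices:
- Compute the rotation matrix.
- Compute the scaling matrix.
- Compute the translation matrix.
Apply the affine transformation: apply each matrix to the image to produce the augmented image.
Display and save the results: show the original and augmented images, and save them locally if needed.

4. Python Code and Walkthrough

The following example uses Python's OpenCV library to augment an image by rotation, scaling, and translation.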

import cv2
import numpy as np
import matplotlib.pyplot as plt

# 1. Load the image
def load_image(path):
    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(f"Could not load image, please check the path: {path}")
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert BGR to RGB
    return image

# 2. Define the affine transformation matrices
def get_rotation_matrix(image, angle):
    (h, w) = image.shape[:2]
    center = (w / 2, h / 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    return M

def get_scaling_matrix(image, scale_x, scale_y):
    # Scales about the origin (top-left corner); the image argument is unused
    M = np.array([[scale_x, 0, 0],
                  [0, scale_y, 0]], dtype=np.float32)
    return M

def get_translation_matrix(tx, ty):
    M = np.array([[1, 0, tx],
                  [0, 1, ty]], dtype=np.float32)
    return M

# 3. Apply the affine transformation
def apply_affine_transform(image, M, output_size):
    transformed = cv2.warpAffine(image, M, output_size, borderMode=cv2.BORDER_REFLECT)
    return transformed

# 4. Display the images
def display_images(original, transformed, title):
    plt.figure(figsize=(10,5))
    plt.subplot(1,2,1)
    plt.imshow(original)
    plt.title('Original image')
    plt.axis('off')

    plt.subplot(1,2,2)
    plt.imshow(transformed)
    plt.title(title)
    plt.axis('off')

    plt.show()

# Main function
def main():
    # Set the image path
    image_path = 'path_to_your_image.jpg'  # replace with the actual image path

    # Load the original image
    original_image = load_image(image_path)

    # Rotation
    angle = 45  # rotation angle in degrees
    rotation_matrix = get_rotation_matrix(original_image, angle)
    rotated_image = apply_affine_transform(original_image, rotation_matrix, (original_image.shape[1], original_image.shape[0]))
    display_images(original_image, rotated_image, f'Rotated {angle}°')

    # Scaling
    scale_x, scale_y = 1.5, 1.5  # scale factors
    scaling_matrix = get_scaling_matrix(original_image, scale_x, scale_y)
    scaled_image = apply_affine_transform(original_image, scaling_matrix,
                                         (int(original_image.shape[1]*scale_x), int(original_image.shape[0]*scale_y)))
    display_images(original_image, scaled_image, f'Scaled X:{scale_x} Y:{scale_y}')

    # Translation
    tx, ty = 100, 50  # offsets in pixels
    translation_matrix = get_translation_matrix(tx, ty)
    translated_image = apply_affine_transform(original_image, translation_matrix, (original_image.shape[1], original_image.shape[0]))
    display_images(original_image, translated_image, f'Translated X:{tx} Y:{ty}')

    # Combined transformation: rotate, then translate.
    # Adding (tx, ty) to the last column is equivalent to left-multiplying by the translation.
    combined_matrix = rotation_matrix.copy()
    combined_matrix[0, 2] += tx
    combined_matrix[1, 2] += ty
    combined_image = apply_affine_transform(original_image, combined_matrix, (original_image.shape[1], original_image.shape[0]))
    display_images(original_image, combined_image, f'Rotated {angle}°, then translated X:{tx} Y:{ty}')

if __name__ == "__main__":
    main()

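Code Walkthrough

Loading the image:
- cv2.imread reads the image, which is then converted from BGR to RGB so that Matplotlib displays the colors correctly.
- If the image fails to load, an error is raised.
Defining the affine transformation matrices:
- Rotation matrix: cv2.getRotationMatrix2D computes it from the rotation center, the angle, and a scale factor (1.0 here, meaning no scaling).
- Scaling matrix: built from the scale factors along the x and y axes.
- Translation matrix: built from the offsets along the x and y axes.
Applying the affine transformation:
- cv2.warpAffine applies the matrix to the image, given the output size and the border mode (reflection padding here).
Displaying the images:
- Matplotlib shows the original and transformed images side by side for a direct visual comparison.
Main function flow:
1. Set the image path: replace the image_path variable with the actual image file path.
2. Load the original image: call load_image.
3. Apply rotation: choose the angle, compute the rotation matrix, apply it, and display the result.
4. Apply scaling: choose the scale factors, compute the scaling matrix, apply it, and display the result.
5. Apply translation: choose the offsets, compute the translation matrix, apply it, and display the result.
6. Combined transformation: rotate first and then translate, and display the result.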
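
The order of a combined transformation matters. The sketch below reproduces, in plain NumPy, the matrix formula documented for cv2.getRotationMatrix2D and checks that adding (tx, ty) to the last column, as the main function does, equals "rotate, then translate" and differs from "translate, then rotate". The function names rotation_about_center and to_3x3 and the center (320, 240) are assumptions made for illustration.

```python
import numpy as np

def rotation_about_center(center, angle_deg, scale=1.0):
    """2x3 matrix following the formula documented for cv2.getRotationMatrix2D."""
    cx, cy = center
    a = scale * np.cos(np.deg2rad(angle_deg))
    b = scale * np.sin(np.deg2rad(angle_deg))
    return np.array([[a,  b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

def to_3x3(M):
    """Append the row [0, 0, 1] so that 2x3 matrices can be multiplied."""
    return np.vstack([M, [0.0, 0.0, 1.0]])

R = rotation_about_center((320, 240), 45)
T = np.array([[1.0, 0.0, 100.0],
              [0.0, 1.0, 50.0]])

# The right-most matrix is applied to the points first.
rotate_then_translate = (to_3x3(T) @ to_3x3(R))[:2]
translate_then_rotate = (to_3x3(R) @ to_3x3(T))[:2]

# Shortcut used in the main function: add the offsets to the last column.
shortcut = R.copy()
shortcut[0, 2] += 100
shortcut[1, 2] += 50

print(np.allclose(shortcut, rotate_then_translate))
print(np.allclose(shortcut, translate_then_rotate))
```

Composing through 3x3 matrices is the more general approach, since the last-column shortcut only works when the translation is applied last.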

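Case 2: Affine Transformations in Face Image Alignment

1. Case Description

In face recognition and analysis, image alignment is an important step for improving recognition accuracy. Faces are often tilted or rotated when photographed; an affine transformation can bring every face into a common pose, which improves the performance of downstream recognition algorithms. This case shows how to align and correct a face image with an affine transformation.

2. Case Analysis

Face image alignment usually involves the following steps:

Face detection: locate the face region in the image.
Landmark detection: detect key facial points such as the eyes, nose, and mouth.
Compute the affine transformation matrix: from the landmarks, compute the matrix that maps the face to a standard position.
Apply the transformation: warp the image with the computed matrix to align the face.

Bringing faces into a common pose reduces the effect of in-plane pose variation on the recognition algorithm and improves accuracy.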

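3. Algorithm Steps

Load the image: read the image containing a face.
Face detection: use a pretrained face detector to locate the face in the image.
Landmark detection: within the detected face region, locate the key landmarks (such as the eye centers).
Compute the affine transformation matrix: from the landmarks, compute the matrix that aligns the face to the standard pose.
Apply the transformation: warp the image with the computed matrix to correct the face.
Display and save the results: show the images before and after alignment, and save them locally if needed.

4. Python Code and Walkthrough

The following example uses Python's OpenCV and Dlib libraries to align a face image.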

import cv2
import numpy as np
import dlib
import matplotlib.pyplot as plt

# 1. Load the image
def load_image(path):
    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(f"Could not load image, please check the path: {path}")
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert BGR to RGB
    return image

# 2. Face detection
def detect_faces(image, detector):
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    faces = detector(gray)
    return faces

# 3. Landmark detection
def get_facial_landmarks(image, face, predictor):
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    landmarks = predictor(gray, face)
    # Eye centers, named by their side of the image:
    # points 36-41 are the eye on the image's left (the subject's right eye),
    # points 42-47 are the eye on the image's right (the subject's left eye)
    left_eye = np.mean([(landmarks.part(n).x, landmarks.part(n).y) for n in range(36, 42)], axis=0)
    right_eye = np.mean([(landmarks.part(n).x, landmarks.part(n).y) for n in range(42, 48)], axis=0)
    return left_eye, right_eye

# 4. Compute the affine transformation matrix
def calculate_affine_matrix(left_eye, right_eye, desired_left_eye=(0.35, 0.35), desired_face_width=256, desired_face_height=256):
    # Angle of the line joining the two eye centers
    dY = right_eye[1] - left_eye[1]
    dX = right_eye[0] - left_eye[0]
    angle = np.degrees(np.arctan2(dY, dX))
    # Scale factor: ratio of the desired eye distance to the current one
    dist = np.sqrt((dX ** 2) + (dY ** 2))
    desired_dist = (1.0 - 2 * desired_left_eye[0]) * desired_face_width
    scale = desired_dist / dist
    # Rotation and scaling about the midpoint between the eyes
    # (cast to plain floats, which cv2.getRotationMatrix2D accepts on all versions)
    eyes_center = (float(left_eye[0] + right_eye[0]) / 2,
                   float(left_eye[1] + right_eye[1]) / 2)
    M = cv2.getRotationMatrix2D(eyes_center, float(angle), float(scale))
    # Translation that moves the eye midpoint to its target position
    tX = desired_face_width * 0.5
    tY = desired_face_height * desired_left_eye[1]
    M[0, 2] += (tX - eyes_center[0])
    M[1, 2] += (tY - eyes_center[1])
    return M

# 5. Apply the affine transformation
def apply_affine_transform(image, M, output_size=(256, 256)):
    aligned = cv2.warpAffine(image, M, output_size, flags=cv2.INTER_CUBIC)
    return aligned

# 6. Display the images
def display_images(original, aligned, title):
    plt.figure(figsize=(12,6))
    plt.subplot(1,2,1)
    plt.imshow(original)
    plt.title('Original image')
    plt.axis('off')

    plt.subplot(1,2,2)
    plt.imshow(aligned)
    plt.title(title)
    plt.axis('off')

    plt.show()

# Main function
def main():
    # Set the image path
    image_path = 'path_to_face_image.jpg'  # replace with the actual image path

    # Load the original image
    original_image = load_image(image_path)

    # Initialize Dlib's face detector and landmark predictor
    detector = dlib.get_frontal_face_detector()
    predictor_path = 'shape_predictor_68_face_landmarks.dat'  # download the file and set the correct path
    predictor = dlib.shape_predictor(predictor_path)

    # Detect faces
    faces = detect_faces(original_image, detector)
    if len(faces) == 0:
        print("No face detected")
        return
    face = faces[0]  # process only the first face

    # Get the landmarks
    left_eye, right_eye = get_facial_landmarks(original_image, face, predictor)

    # Compute the affine transformation matrix
    M = calculate_affine_matrix(left_eye, right_eye)

    # Apply the transformation
    aligned_image = apply_affine_transform(original_image, M)

    # Display the result
    display_images(original_image, aligned_image, 'Aligned face')

if __name__ == "__main__":
    main()

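Code Walkthrough

Loading the image:
- cv2.imread reads the image, which is then converted from BGR to RGB.
- If the image fails to load, an error is raised.
Face detection:
- Dlib's get_frontal_face_detector locates the face regions in the image.
- The image is converted to grayscale first, which speeds up detection.
Landmark detection:
- Dlib's pretrained 68-point facial landmark predictor (shape_predictor_68_face_landmarks.dat) locates the key points.
- The centers of the two eyes are extracted as the mean of each eye's six landmarks.
Computing the affine transformation matrix:
- The distance and angle between the eye centers determine the rotation angle and the scale factor.
- A rotation matrix is built and its translation adjusted so that the face lands at the standard position.
Applying the affine transformation:
- cv2.warpAffine applies the matrix to the image and produces the aligned face.
Displaying the images:
- Matplotlib shows the original and aligned images side by side for a direct visual comparison.
Main function flow:
1. Set the image path: replace the image_path variable with the actual face image file path.
2. Load the original image: call load_image.
3. Initialize the detector and predictor: load Dlib's face detector and landmark predictor (the shape_predictor_68_face_landmarks.dat file must be downloaded beforehand).
4. Detect faces: call detect_faces to locate the face regions.
5. Get the landmarks: extract the centers of the two eyes.
6. Compute the transformation matrix: derive the affine matrix from the landmarks.
7. Apply the transformation: call apply_affine_transform to align the face.
8. Display the results: show the images before and after alignment.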
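
To check the geometry of calculate_affine_matrix without OpenCV or Dlib, the sketch below rebuilds the same matrix in NumPy (using the formula documented for cv2.getRotationMatrix2D) and confirms where the eye centers end up. The eye coordinates are made-up values, and alignment_matrix and transform_point are hypothetical helper names.

```python
import numpy as np

def alignment_matrix(left_eye, right_eye, desired_left_eye=(0.35, 0.35), size=256):
    """NumPy re-derivation of the 2x3 alignment matrix used in this case."""
    left_eye = np.asarray(left_eye, dtype=float)
    right_eye = np.asarray(right_eye, dtype=float)
    dx, dy = right_eye - left_eye
    angle = np.arctan2(dy, dx)                        # tilt of the eye line (radians)
    scale = (1.0 - 2 * desired_left_eye[0]) * size / np.hypot(dx, dy)
    cx, cy = (left_eye + right_eye) / 2               # midpoint between the eyes
    a, b = scale * np.cos(angle), scale * np.sin(angle)
    M = np.array([[a,  b, (1 - a) * cx - b * cy],
                  [-b, a, b * cx + (1 - a) * cy]])
    M[0, 2] += size * 0.5 - cx                        # move the midpoint to its target x
    M[1, 2] += size * desired_left_eye[1] - cy        # ... and to its target y
    return M

def transform_point(M, p):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    return M @ np.array([p[0], p[1], 1.0])

left, right = (120.0, 150.0), (200.0, 180.0)  # made-up eye centers on a tilted face
M = alignment_matrix(left, right)
new_left = transform_point(M, left)
new_right = transform_point(M, right)
print(new_left, new_right)
```

With the default parameters, both eyes land on the same row (y = 0.35 × 256 = 89.6), at x = 0.35 × 256 and x = 0.65 × 256, which is what "aligned to a standard pose" means here.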

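Case 3: Affine Transformations in Image Stitching

1. Case Description

Image stitching combines several images with overlapping regions into a single panorama. During stitching, an affine transformation aligns images taken from different viewpoints so that the overlapping regions line up. This case shows how to align and stitch two images with an affine transformation.

2. Case Analysis

The main steps of image stitching are:

Feature detection and matching: find keypoints in the images and match them.
Compute the transformation matrix: estimate the affine matrix from the matched keypoints.
Apply the transformation: warp one image into the coordinate system of the other.
Stitch the images: merge the aligned images into a panorama.

The affine transformation is the key step: it places both images in a common coordinate frame so that they join naturally. An affine model is an approximation that works best when the camera motion between the two shots is mostly in-plane rotation, scaling, and translation.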

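3. Algorithm Steps

Load the images: read the two images to be stitched.
Feature detection and description: use SIFT or ORB to detect keypoints and compute their descriptors.
Feature matching: match the keypoints of the two images by their descriptors.
Filter the matches: use RANSAC to keep only the reliable matched pairs.
Compute the affine transformation matrix: estimate it from the matched pairs.
Apply the transformation: warp one image into the coordinate system of the other.
Stitch the images: merge the two aligned images into a panorama.
Display and save the results: show the original images and the panorama, and save them locally if needed.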
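
The "compute the affine transformation matrix" step can be illustrated with an ordinary least-squares fit in NumPy. This is a simplified sketch: it assumes every correspondence is correct, so it has none of the outlier rejection that RANSAC provides, and fit_affine is a hypothetical helper rather than an OpenCV function.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine matrix that maps src points onto dst points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if len(src) < 3:
        raise ValueError("need at least three point pairs")
    A = np.hstack([src, np.ones((len(src), 1))])   # N x 3 design matrix [x, y, 1]
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)    # 3 x 2 solution
    return X.T                                      # 2 x 3, same layout as OpenCV

# Synthetic correspondences generated from a known matrix
true_M = np.array([[0.9, -0.2, 30.0],
                   [0.2,  0.9, -10.0]])
src = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [50, 20]], dtype=float)
dst = src @ true_M[:, :2].T + true_M[:, 2]

estimated = fit_affine(src, dst)
print(np.round(estimated, 6))
```

Three non-collinear pairs determine the six unknowns exactly; additional pairs average out noise, and RANSAC repeats such fits on random subsets to discard mismatches.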

4. Python Code and Walkthrough

The following example uses Python's OpenCV library to stitch two images with an affine transformation.

import cv2
import numpy as np
import matplotlib.pyplot as plt

# 1. Load the images
def load_images(path1, path2):
    img1 = cv2.imread(path1)
    img2 = cv2.imread(path2)
    if img1 is None or img2 is None:
        raise FileNotFoundError("Could not load one of the images, please check the paths")
    img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
    img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
    return img1, img2

# 2. Feature detection and description
def detect_and_describe(image, detector):
    keypoints, descriptors = detector.detectAndCompute(image, None)
    return keypoints, descriptors

# 3. Feature matching
def match_features(desc1, desc2, matcher):
    matches = matcher.knnMatch(desc1, desc2, k=2)
    # Ratio test: keep a match only if it is clearly better than the runner-up
    good_matches = []
    for pair in matches:
        if len(pair) < 2:
            continue  # fewer than two neighbours were found for this descriptor
        m, n = pair
        if m.distance < 0.75 * n.distance:
            good_matches.append(m)
    return good_matches

# 4. Compute the affine transformation matrix
def compute_affine_matrix(kp1, kp2, matches):
    if len(matches) < 3:
        raise ValueError("Too few matches to compute the affine transformation")
    # stitch_images warps image 2 into the frame of image 1, so the source
    # points come from image 2 (trainIdx) and the targets from image 1 (queryIdx)
    src_pts = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1,1,2)
    dst_pts = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1,1,2)
    # estimateAffinePartial2D fits a restricted affine model
    # (rotation, uniform scale, translation) with RANSAC
    M, mask = cv2.estimateAffinePartial2D(src_pts, dst_pts, method=cv2.RANSAC, ransacReprojThreshold=5.0)
    return M, mask

# 5. Apply the affine transformation
def warp_image(image, M, dsize):
    warped = cv2.warpAffine(image, M, dsize, flags=cv2.INTER_LINEAR)
    return warped

# 6. Stitch the images
def stitch_images(img1, img2, M):
    # M maps image 2 coordinates into the frame of image 1
    h1, w1 = img1.shape[:2]
    h2, w2 = img2.shape[:2]
    # Bounding box of image 1 together with the warped corners of image 2
    corners_img2 = np.float32([[0,0], [0,h2], [w2,h2], [w2,0]]).reshape(-1,1,2)
    transformed_corners = cv2.transform(corners_img2, M)
    corners_img1 = np.float32([[0,0], [0,h1], [w1,h1], [w1,0]]).reshape(-1,1,2)
    all_corners = np.concatenate((transformed_corners, corners_img1), axis=0)
    xmin, ymin = np.floor(all_corners.min(axis=0).ravel()).astype(int)
    xmax, ymax = np.ceil(all_corners.max(axis=0).ravel()).astype(int)
    translation = (int(-xmin), int(-ymin))
    # Shift everything so that all canvas coordinates are non-negative
    M_translation = np.array([[1, 0, translation[0]],
                              [0, 1, translation[1]],
                              [0, 0, 1]], dtype=np.float64)
    M_full = (M_translation @ np.vstack([M, [0, 0, 1]]))[:2]
    # Warp image 2 onto the panorama canvas, then paste image 1 on top
    panorama = cv2.warpAffine(img2, M_full, (int(xmax - xmin), int(ymax - ymin)))
    panorama[translation[1]:h1+translation[1], translation[0]:w1+translation[0]] = img1
    return panorama

# 7. Display the images
def display_panorama(img1, img2, panorama):
    plt.figure(figsize=(20,10))
    plt.subplot(1,3,1)
    plt.imshow(img1)
    plt.title('Image 1')
    plt.axis('off')

    plt.subplot(1,3,2)
    plt.imshow(img2)
    plt.title('Image 2')
    plt.axis('off')

    plt.subplot(1,3,3)
    plt.imshow(panorama)
    plt.title('Panorama')
    plt.axis('off')

    plt.show()

# Main function
def main():
    # Set the image paths
    image_path1 = 'path_to_image1.jpg'  # replace with the actual image path
    image_path2 = 'path_to_image2.jpg'  # replace with the actual image path

    # Load the images
    img1, img2 = load_images(image_path1, image_path2)

    # Initialize the feature detector (ORB)
    detector = cv2.ORB_create(nfeatures=5000)

    # Detect and describe features
    kp1, desc1 = detect_and_describe(img1, detector)
    kp2, desc2 = detect_and_describe(img2, detector)

    # Initialize the matcher (BFMatcher)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=False)

    # Match features
    good_matches = match_features(desc1, desc2, matcher)
    print(f'Number of good matches: {len(good_matches)}')

    if len(good_matches) < 3:
        print("Too few matches to stitch the images")
        return

    # Compute the affine transformation matrix
    M, mask = compute_affine_matrix(kp1, kp2, good_matches)
    if M is None:
        print("Could not compute the affine transformation matrix")
        return

    # Stitch the images
    panorama = stitch_images(img1, img2, M)

    # Display the result
    display_panorama(img1, img2, panorama)

    # Save the panorama (optional)
    save_path = 'panorama_result.jpg'
    cv2.imwrite(save_path, cv2.cvtColor(panorama, cv2.COLOR_RGB2BGR))
    print(f'Saved panorama: {save_path}')

if __name__ == "__main__":
    main()

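Code Walkthrough

Loading the images:
- cv2.imread reads the two images to be stitched, which are then converted to RGB.
- If either image fails to load, an error is raised.
Feature detection and description:
- ORB (Oriented FAST and Rotated BRIEF) detects the keypoints in each image and computes their descriptors.
- ORB is an efficient detector and descriptor that is suitable for real-time applications.
Feature matching:
- A brute-force matcher (BFMatcher) matches the keypoints, using the Hamming distance as the metric.
- KNN (K-Nearest Neighbors) matching with a ratio test keeps only the good matches.
Computing the affine transformation matrix:
- A robust transformation is estimated from the good matches with RANSAC. cv2.estimateAffinePartial2D fits a restricted affine model (rotation, uniform scaling, and translation).
- If there are too few matches or the matrix cannot be computed, stitching stops.
Applying the affine transformation and stitching:
- The computed matrix warps the second image, adjusting its orientation and position.
- The bounds of the warped image are computed and a translation is added so that both images fit on the panorama canvas.
- The two aligned images are merged into the panorama; in the overlap, the pixels of the first image overwrite those of the second.
Displaying the images:
- Matplotlib shows the two original images and the panorama side by side for a direct visual comparison.
Main function flow:
1. Set the image paths: replace the image_path1 and image_path2 variables with the actual image file paths.
2. Load the images: call load_images to load both images.
3. Initialize the feature detector: use ORB to detect feature points.
4. Detect and describe features: call detect_and_describe to get the keypoints and descriptors.
5. Initialize the matcher: use the brute-force matcher.
6. Match features: call match_features to get the good matches.
7. Compute the transformation matrix: call compute_affine_matrix.
8. Stitch the images: call stitch_images to build the panorama.
9. Display the results: show the images before and after stitching.
10. Save the panorama: an optional step that writes the panorama to disk.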
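
The canvas-size computation inside stitch_images can be followed in isolation. The sketch below works on corner coordinates only (no images) and uses a made-up matrix that shifts the second image 300 pixels right and 20 pixels up; canvas_for is a hypothetical helper name.

```python
import numpy as np

def canvas_for(M, size_moving, size_fixed):
    """Return (canvas_width, canvas_height, shift) needed to hold the fixed
    image plus the moving image warped by the 2x3 matrix M."""
    w2, h2 = size_moving
    w1, h1 = size_fixed
    corners = np.array([[0, 0], [0, h2], [w2, h2], [w2, 0]], dtype=float)
    warped = corners @ M[:, :2].T + M[:, 2]          # corners of the warped image
    fixed = np.array([[0, 0], [0, h1], [w1, h1], [w1, 0]], dtype=float)
    all_pts = np.vstack([warped, fixed])
    xmin, ymin = np.floor(all_pts.min(axis=0)).astype(int)
    xmax, ymax = np.ceil(all_pts.max(axis=0)).astype(int)
    shift = (int(-xmin), int(-ymin))                 # moves the top-left corner to (0, 0)
    return int(xmax - xmin), int(ymax - ymin), shift

M = np.array([[1.0, 0.0, 300.0],
              [0.0, 1.0, -20.0]])
width, height, shift = canvas_for(M, (400, 300), (400, 300))
print(width, height, shift)  # 700 320 (0, 20)
```

The shift is what the translation matrix in stitch_images encodes: without it, any part of the warped image with negative coordinates would be cut off.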

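Case 4: Affine Transformations in Image Rotation Alignment

1. Case Description

Rotation alignment is a basic operation in many computer vision tasks. In document scanning, for example, a page photographed at an angle appears skewed; rotating it with an affine transformation makes the text horizontal and easier to process with OCR (optical character recognition). This case shows how to deskew an image automatically with an affine transformation.

2. Case Analysis

The main steps of rotation alignment are:

Text detection: find the text regions in the image.
Skew estimation: compute the skew angle of the text regions.
Compute the rotation matrix: build the affine rotation matrix from the estimated angle.
Apply the rotation: rotate the image with that matrix so the text is horizontal.
Display and save the results: show the images before and after correction, and save them locally if needed.

Automatic rotation alignment can noticeably improve the accuracy and efficiency of text recognition.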

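3. Algorithm Steps

Load the image: read the image to be corrected.
Text detection: locate the text regions using binarization and a morphological closing.
Skew estimation: estimate the overall skew angle from the detected text region.
Compute the rotation matrix: build it from the rotation center and the rotation angle.
Apply the rotation: rotate the image with the affine rotation matrix.
Display and save the results: show the images before and after correction, and save them locally if needed.

4. Python Code and Walkthrough

The following example uses Python's OpenCV library to deskew an image automatically; Tesseract OCR (through pytesseract) is used only for an optional check at the end.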

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Only needed for the optional OCR check at the end of main().
# Make sure Tesseract OCR is installed and the path is set correctly.
# import pytesseract
# pytesseract.pytesseract.tesseract_cmd = r'path_to_tesseract.exe'

# 1. Load the image
def load_image(path):
    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(f"Could not load image, please check the path: {path}")
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image

# 2. Text detection and skew estimation
def estimate_skew_angle(image):
    # Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    # Binarize (dark text becomes white foreground)
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY_INV)
    # Morphological closing merges characters into solid text regions
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (30, 5))
    closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
    # Find contours
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0.0
    # Pick the largest contour
    largest_contour = max(contours, key=cv2.contourArea)
    # Minimum-area bounding rectangle. The angle it reports follows different
    # conventions across OpenCV versions, so the skew is derived from the
    # rectangle's corner points instead.
    rect = cv2.minAreaRect(largest_contour)
    box = cv2.boxPoints(rect)
    dx, dy = box[1] - box[0]
    edge_angle = np.degrees(np.arctan2(dy, dx))
    # Fold into [-45, 45): adjacent edges differ by 90 degrees, and the skew
    # is assumed to be smaller than 45 degrees
    angle = (edge_angle + 45.0) % 90.0 - 45.0
    # Rotating by this angle with cv2.getRotationMatrix2D levels the edge
    return float(angle)

# 3. Compute the rotation matrix
def get_rotation_matrix(image, angle):
    (h, w) = image.shape[:2]
    center = (w / 2, h / 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    return M

# 4. Apply the rotation
def rotate_image(image, M):
    (h, w) = image.shape[:2]
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE)
    return rotated

# 5. Display the images
def display_images(original, rotated, angle):
    plt.figure(figsize=(12,6))
    plt.subplot(1,2,1)
    plt.imshow(original)
    plt.title('Original image')
    plt.axis('off')

    plt.subplot(1,2,2)
    plt.imshow(rotated)
    plt.title(f'After rotation correction (angle: {angle:.2f}°)')
    plt.axis('off')

    plt.show()

# Main function
def main():
    # Set the image path
    image_path = 'path_to_skewed_image.jpg'  # replace with the actual image path

    # Load the original image
    original_image = load_image(image_path)

    # Estimate the skew angle
    angle = estimate_skew_angle(original_image)
    print(f'Estimated skew angle: {angle:.2f}°')

    # Compute the rotation matrix
    M = get_rotation_matrix(original_image, angle)

    # Apply the rotation
    rotated_image = rotate_image(original_image, M)

    # Display the result
    display_images(original_image, rotated_image, angle)

    # Save the corrected image (optional)
    save_path = 'aligned_image.jpg'
    cv2.imwrite(save_path, cv2.cvtColor(rotated_image, cv2.COLOR_RGB2BGR))
    print(f'Saved corrected image: {save_path}')

    # Optional: verify with OCR (requires the pytesseract import above)
    # text = pytesseract.image_to_string(rotated_image)
    # print("Recognized text:")
    # print(text)

if __name__ == "__main__":
    main()

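Code Walkthrough

Loading the image:
- cv2.imread reads the image, which is then converted to RGB.
- If the image fails to load, an error is raised.
Text detection and skew estimation:
- The image is converted to grayscale for the later steps.
- Binarization separates the text from the background.
- A morphological closing fills in the text regions and strengthens their outlines.
- Contours are extracted and the largest one is taken as the main text region.
- The minimum-area bounding rectangle of that contour gives the skew angle.
- The angle is normalized to within ±45° so that the image is rotated in the correct direction.
Computing the rotation matrix:
- cv2.getRotationMatrix2D builds the rotation matrix from the estimated skew angle, the rotation center, and a scale factor (1.0 here, meaning no scaling).
Applying the rotation:
- cv2.warpAffine applies the rotation matrix to the image and produces the corrected image.
- The border mode replicates edge pixels, which avoids black borders.
Displaying the images:
- Matplotlib shows the original and corrected images side by side for a direct visual comparison.
Main function flow:
1. Set the image path: replace the image_path variable with the actual path of the skewed image.
2. Load the original image: call load_image.
3. Estimate the skew angle: call estimate_skew_angle.
4. Compute the rotation matrix: call get_rotation_matrix.
5. Apply the rotation: call rotate_image to deskew the image.
6. Display the results: show the images before and after correction.
7. Save the corrected image: an optional step that writes the corrected image to disk.
8. OCR verification: an optional step that runs Tesseract OCR on the corrected image to check the result.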
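
The angle normalization can be checked numerically without any image. The sketch below assumes the skew is under 45° and measures angles in image coordinates (y pointing down); fold_angle, edge_angle, and rotation_2x2 are hypothetical helpers, with rotation_2x2 reproducing the linear part of the matrix documented for cv2.getRotationMatrix2D.

```python
import numpy as np

def fold_angle(theta_deg):
    """Fold an edge angle into [-45, 45); perpendicular edges give the same skew."""
    return (theta_deg + 45.0) % 90.0 - 45.0

def edge_angle(p0, p1):
    """Angle of the segment p0 -> p1 in degrees, in image coordinates."""
    return np.degrees(np.arctan2(p1[1] - p0[1], p1[0] - p0[0]))

def rotation_2x2(angle_deg):
    """Linear part of cv2.getRotationMatrix2D for scale 1."""
    a, b = np.cos(np.deg2rad(angle_deg)), np.sin(np.deg2rad(angle_deg))
    return np.array([[a, b], [-b, a]])

# Two adjacent edges of a rectangle tilted by 10 degrees
t = np.deg2rad(10)
origin = np.array([0.0, 0.0])
long_edge_end = 200 * np.array([np.cos(t), np.sin(t)])
short_edge_end = 40 * np.array([-np.sin(t), np.cos(t)])

skew_from_long = fold_angle(edge_angle(origin, long_edge_end))
skew_from_short = fold_angle(edge_angle(origin, short_edge_end))

# Rotating by the estimated skew makes the long edge horizontal
leveled = rotation_2x2(skew_from_long) @ long_edge_end
print(skew_from_long, skew_from_short, leveled)
```

Both edges yield the same skew estimate, so the result does not depend on which rectangle edge happens to be listed first.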

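Case 5: Affine Transformations in Perspective Correction of Vehicle Images

1. Case Description

In autonomous driving and traffic monitoring systems, changing the viewpoint of a vehicle image is an important step in analyzing and understanding the road scene. Converting the camera view toward a bird's-eye view makes lane lines, obstacles, and similar information easier to detect. An affine transformation can only approximate this change of viewpoint, because it keeps parallel lines parallel; an exact bird's-eye view requires a full perspective (projective) transformation, for example cv2.getPerspectiveTransform with four point pairs followed by cv2.warpPerspective. This case shows the affine approximation applied to a vehicle image.

2. Case Analysis

A viewpoint transformation re-renders the image as if it were seen from another angle. In autonomous driving, a top-down view is commonly used to show the positions of the road and vehicles more clearly. The main steps are:

Choose source and target points: pick three non-collinear points in the image (for example, three corners of a lane region) and their positions in the target view.
Compute the affine transformation matrix: derive it from the source and target points.
Apply the transformation: warp the image into the target view with the affine matrix.
Display and save the results: show the images before and after correction, and save them locally if needed.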

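3. Algorithm Steps

Load the image: read the vehicle image to be corrected.
Choose source and target points: manually or automatically pick three key points in the image and their corresponding target positions.
Compute the affine transformation matrix: use cv2.getAffineTransform, which takes exactly three point pairs.
Apply the transformation: use cv2.warpAffine to warp the image into the target view.
Display and save the results: show the images before and after correction, and save them locally if needed.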
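
Three point pairs determine an affine matrix exactly, because each pair contributes two equations toward the six unknowns. The NumPy sketch below solves that system directly, with made-up coordinates, and then checks that parallel directions stay parallel, which is why an affine warp cannot undo perspective convergence. The helper names affine_from_three_points and apply are illustrative.

```python
import numpy as np

def affine_from_three_points(src, dst):
    """Solve for the 2x3 affine matrix that maps three src points onto dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.hstack([src, np.ones((3, 1))])    # singular if the points are collinear
    return np.linalg.solve(A, dst).T         # 2 x 3

def apply(M, p):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    return M @ np.array([p[0], p[1], 1.0])

src = [(50, 60), (250, 80), (40, 260)]       # made-up clicked points
dst = [(0, 0), (300, 0), (0, 300)]           # top-left, top-right, bottom-left
M = affine_from_three_points(src, dst)
mapped = np.array([apply(M, p) for p in src])

# The same direction vector, placed at two different locations,
# maps to the same output direction: parallelism is preserved.
d = np.array([1.0, 2.0])
img_d1 = apply(M, (10 + d[0], 20 + d[1])) - apply(M, (10, 20))
img_d2 = apply(M, (200 + d[0], 90 + d[1])) - apply(M, (200, 90))
print(np.round(mapped, 6), img_d1, img_d2)
```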

4. Python Code and Walkthrough

The following example uses Python's OpenCV library to apply an affine viewpoint correction to a vehicle image.

import cv2
import numpy as np
import matplotlib.pyplot as plt

# 1. Load the image
def load_image(path):
    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(f"Could not load image, please check the path: {path}")
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image

# 2. Choose the source and target points
def select_points(image):
    # Show the image and pick three source points by hand.
    # The click order must match dst_pts: top-left, top-right, bottom-left.
    plt.figure(figsize=(8,6))
    plt.imshow(image)
    plt.title('Click three source points: top-left, top-right, bottom-left')
    points = plt.ginput(3, timeout=0)  # timeout=0 waits until three clicks are made
    plt.close()
    if len(points) != 3:
        raise ValueError("Exactly three source points are required")
    src_pts = np.float32(points)

    # Define the target points (for example, in a standard coordinate frame).
    # This is only an example; adjust it to your needs.
    width, height = 300, 300
    dst_pts = np.float32([[0,0], [width,0], [0, height]])
    return src_pts, dst_pts, (width, height)

# 3. Compute the affine transformation matrix
def get_affine_transform_matrix(src_pts, dst_pts):
    M = cv2.getAffineTransform(src_pts, dst_pts)
    return M

# 4. Apply the transformation
def apply_affine_transform(image, M, output_size):
    transformed = cv2.warpAffine(image, M, output_size, flags=cv2.INTER_LINEAR)
    return transformed

# 5. Display the images
def display_images(original, transformed, title):
    plt.figure(figsize=(12,6))
    plt.subplot(1,2,1)
    plt.imshow(original)
    plt.title('Original image')
    plt.axis('off')

    plt.subplot(1,2,2)
    plt.imshow(transformed)
    plt.title(title)
    plt.axis('off')

    plt.show()

# Main function
def main():
    # Set the image path
    image_path = 'path_to_vehicle_image.jpg'  # replace with the actual image path

    # Load the original image
    original_image = load_image(image_path)

    # Choose the source and target points
    print("Click three source points in the image window that opens")
    src_pts, dst_pts, output_size = select_points(original_image)
    print(f'Source points: {src_pts}')
    print(f'Target points: {dst_pts}')

    # Compute the affine transformation matrix
    M = get_affine_transform_matrix(src_pts, dst_pts)

    # Apply the transformation
    transformed_image = apply_affine_transform(original_image, M, output_size)

    # Display the result
    display_images(original_image, transformed_image, 'Image after affine correction')

    # Save the corrected image (optional)
    save_path = 'perspective_corrected_image.jpg'
    cv2.imwrite(save_path, cv2.cvtColor(transformed_image, cv2.COLOR_RGB2BGR))
    print(f'Saved corrected image: {save_path}')

if __name__ == "__main__":
    main()

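Code Walkthrough

Loading the image:
- cv2.imread reads the vehicle image, which is then converted to RGB.
- If the image fails to load, an error is raised.
Choosing the source and target points:
- Matplotlib's plt.ginput lets the user click three source points on the image.
- The target points are three corners of a 300x300 standard frame (adjust as needed).
- The function returns the source points, the target points, and the output image size.
Computing the affine transformation matrix:
- cv2.getAffineTransform computes the matrix from the three source points and three target points.
Applying the transformation:
- cv2.warpAffine applies the affine matrix to the image and produces the corrected image.
- The output size and the interpolation method are specified explicitly.
Displaying the images:
- Matplotlib shows the original and corrected images side by side for a direct visual comparison.
Main function flow:
1. Set the image path: replace the image_path variable with the actual vehicle image file path.
2. Load the original image: call load_image.
3. Choose the source and target points: click three key points in the Matplotlib window; the corresponding target points are defined in code.
4. Compute the transformation matrix: call get_affine_transform_matrix.
5. Apply the transformation: call apply_affine_transform to perform the correction.
6. Display the results: show the images before and after correction.
7. Save the corrected image: an optional step that writes the corrected image to disk.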
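
For comparison, a true bird's-eye view needs a projective transformation (homography) determined by four point pairs, which is what cv2.getPerspectiveTransform computes. The NumPy sketch below solves the same kind of 8-unknown linear system directly; it is an illustration under assumed, made-up coordinates (a trapezoid standing in for a lane seen in perspective), not part of the original code, and homography_from_four_points and project are hypothetical helper names.

```python
import numpy as np

def homography_from_four_points(src, dst):
    """Solve the 8 unknowns of a 3x3 homography, fixing the last entry to 1."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def project(H, p):
    """Apply a homography to one point, including the perspective division."""
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return np.array([x / w, y / w])

# Hypothetical trapezoid mapped to a 300x300 square
src = [(220, 300), (420, 300), (600, 480), (40, 480)]
dst = [(0, 0), (300, 0), (300, 300), (0, 300)]
H = homography_from_four_points(src, dst)
mapped = np.array([project(H, p) for p in src])
print(np.round(mapped, 6))
print(H[2])  # a non-zero entry in the first two slots means H is not affine
```

The bottom row of an affine matrix is always [0, 0, 1]; the non-zero term that appears here is what allows converging lane edges to become parallel.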