How to Find the Orange Cat in an Image with OpenCV [Python]

Article link: https://blog.csdn.net/qq_51908093/article/details/142791436

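The picture used in this article shows an orange cat on grass. How can we get the computer to find it? Because the cat's color differs markedly from the color of the grass, we can try to detect the cat by its color.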

I. The HSV Color Space

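Images on a computer are usually stored using the RGB color model, but the relationship between the three component values and the resulting color is not intuitive. For color filtering it is easier to work in HSV instead of RGB. HSV is a way of representing the points of the RGB color model in a cylindrical coordinate system. The name stands for Hue, Saturation, Value; the model is also called HSB, where B stands for Brightness. Note that for 8-bit images OpenCV stores H in the range 0 to 179 (degrees divided by 2) and S and V in the range 0 to 255.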
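
To make these conventions concrete, the sketch below converts a few RGB colors to HSV with Python's standard colorsys module and rescales the result to the 8-bit convention OpenCV uses. The helper name to_opencv_hsv is hypothetical, OpenCV itself is not called, and the values should agree with cv2.cvtColor only up to rounding.

```python
import colorsys


def to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to HSV scaled like OpenCV's 8-bit images.

    Hypothetical helper for illustration; OpenCV itself is not used here.
    """
    # colorsys works with floats in [0, 1] and returns h, s, v in [0, 1]
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # OpenCV stores hue as degrees / 2 so that it fits into one byte
    return round(h * 180) % 180, round(s * 255), round(v * 255)


for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)),
                  ("blue", (0, 0, 255)), ("orange", (255, 165, 0))]:
    print(name, to_opencv_hsv(*rgb))
```

Pure orange comes out with a hue of about 19 on this scale, which is why an orange filter needs a fairly narrow hue band near the low end of the range.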

import cv2
import numpy as np


def show_hsv():
    # lower_bound and upper_bound are module-level globals set in the __main__ block
    global lower_bound
    global upper_bound
    # Create a blank 256x256 image
    h, w = 256, 256
    image = np.zeros((h, w, 3), dtype=np.uint8)

    # Fill the image with HSV colors
    for i in range(h):
        for j in range(w):
            # Hue runs from lower_bound to upper_bound, left to right;
            # saturation increases and value decreases from top to bottom
            hue = lower_bound[0] + (upper_bound[0] - lower_bound[0]) * j // w
            saturation = lower_bound[1] + (upper_bound[1] - lower_bound[1]) * i // h
            value = lower_bound[2] + (upper_bound[2] - lower_bound[2]) * (h - i) // h

            # Convert the HSV color to BGR
            hsv_color = np.uint8([[[hue, saturation, value]]])
            bgr_color = cv2.cvtColor(hsv_color, cv2.COLOR_HSV2BGR)
            image[i, j] = bgr_color

    # Show the generated color image
    cv2.imshow("Color Range", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


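if __name__ == '__main__':
    # Define the HSV range (for 8-bit images OpenCV uses H in 0..179, S and V in 0..255)
    lower_bound = np.array([0, 0, 0])
    upper_bound = np.array([179, 255, 255])
    show_hsv()

Running the script above displays a preview of the full HSV range: hue varies from left to right, while saturation increases and value decreases from top to bottom. The result is shown in the figure:

Now we need to change lower_bound and upper_bound so that the range covers only orange. After some testing, changing these lines in the code above

lower_bound = np.array([0, 0, 0])
upper_bound = np.array([179, 255, 255])

to

lower_bound = np.array([10, 90, 150])
upper_bound = np.array([30, 180, 230])

selects the orange cat's color fairly well. Running the code after this change gives the result shown in the figure: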

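II. Finding the Cat

Now let's start looking for the orange cat in the picture.
Save the orange cat picture above as test.jpg and place it in the same directory as the Python script. The script is as follows: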

import cv2
import numpy as np


def highlight_color_and_find_centers(frame, roi_area, do_paint_targets=True):
    # lower_bound and upper_bound are module-level globals set in the __main__ block
    global lower_bound
    global upper_bound
    y_start, y_end, x_start, x_end = roi_area
    # roi is a view into frame, so drawing on roi also modifies frame
    roi = frame[y_start:y_end, x_start:x_end]

    # Convert to the HSV color space
    hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

    # Build a mask that keeps only the pixels inside the color range
    mask = cv2.inRange(hsv_roi, lower_bound, upper_bound)

    # Find the contours in the mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    centers_with_area = []

    for contour in contours:
        area = cv2.contourArea(contour)
        print(f"area = {area}")
        # Skip tiny contours, which are most likely noise
        if area > 2:
            x, y, w, h = cv2.boundingRect(contour)
            print(f"w = {w}, h = {h}")
            # Compute the centroid from the contour moments
            M = cv2.moments(contour)
            if M['m00'] != 0:
                cX = int(M['m10'] / M['m00'])
                cY = int(M['m01'] / M['m00'])
                # Offset by the ROI origin to get full-frame coordinates
                centers_with_area.append(((cX + x_start, cY + y_start), area))

    # Sort the centers by contour area, largest first
    centers_with_area = sorted(centers_with_area, key=lambda item: item[1], reverse=True)
    centers = [center for center, area in centers_with_area]

    if do_paint_targets:
        # Outline the ROI in green and fill every matched contour in magenta
        cv2.rectangle(frame, (x_start, y_start), (x_end, y_end), (0, 255, 0), 2)
        for contour in contours:
            cv2.drawContours(roi, [contour], -1, (255, 0, 255), thickness=cv2.FILLED)

    return centers


def main():
    input_image_path = "test.jpg"
    output_image_path = "combined_image.jpg"
    roi_area = (0, 720, 0, 1280)  # y1, y2, x1, x2

    # Read the original image
    frame = cv2.imread(input_image_path)
    if frame is None:
        # cv2.imread returns None instead of raising when the file cannot be read
        raise FileNotFoundError(f"Cannot read image: {input_image_path}")

    # Process a copy so that the original image stays untouched
    processed_frame = frame.copy()
    highlight_color_and_find_centers(processed_frame, roi_area)

    # Scale factor
    scale_percent = 50  # scale the images to 50% of their original size
    width = int(frame.shape[1] * scale_percent / 100)
    height = int(frame.shape[0] * scale_percent / 100)
    dim = (width, height)

    # Resize both the original and the processed image
    resized_frame = cv2.resize(frame, dim, interpolation=cv2.INTER_AREA)
    resized_processed_frame = cv2.resize(processed_frame, dim, interpolation=cv2.INTER_AREA)

    # Stack the images from before and after processing vertically (original on top)
    combined_frame = cv2.vconcat([resized_frame, resized_processed_frame])  # vertical concatenation
    cv2.imwrite(output_image_path, combined_frame)

    # Show the concatenated image
    cv2.imshow("Before and After (Resized)", combined_frame)

    # Block until a key is pressed (waitKey returns the key code)
    if cv2.waitKey(0):
        print("Keyboard interrupt")

    # Close all windows
    cv2.destroyAllWindows()


if __name__ == '__main__':
    # Define the HSV range
    lower_bound = np.array([10, 90, 150])
    upper_bound = np.array([30, 180, 230])
    main()

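Running the script gives the following result:

The comparison image is saved as combined_image.jpg in the current working directory, which is the script's directory if you run the script from there.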

III. Code Walkthrough

The code above already completes the task. Below are some notes on how it works.

1. ROI

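In the code, roi (region of interest) is the area in which colors are filtered. Since the orange cat picture is 1280×720, the script sets roi to

roi_area = (0, 720, 0, 1280)  # y1, y2, x1, x2

so that the target color (orange in this article) is detected across the whole image. If you change roi, color detection is applied only inside the roi. For example, changing roi to

roi_area = (0, 360, 0, 640)  # y1, y2, x1, x2

restricts color detection to the rectangle from (0, 0) to (640, 360). The result is shown in the figure: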
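
Two details of the ROI handling are easy to miss: slicing a NumPy image returns a view, so drawing on the roi also changes the full frame, and coordinates found inside the roi must be shifted by the roi origin. The sketch below demonstrates both with plain NumPy; the helper name roi_to_frame is hypothetical.

```python
import numpy as np

# Stand-in for a 1280x720 BGR frame (all black)
frame = np.zeros((720, 1280, 3), dtype=np.uint8)

y_start, y_end, x_start, x_end = (0, 360, 0, 640)  # y1, y2, x1, x2
roi = frame[y_start:y_end, x_start:x_end]

print(roi.shape)                     # rows come first: (height, width, channels)
print(np.shares_memory(roi, frame))  # basic slicing returns a view, not a copy

# Writing into the view changes the underlying frame as well
roi[:] = (255, 0, 255)
print(frame[0, 0], frame[400, 700])  # inside the ROI vs. outside it


def roi_to_frame(point, roi_area):
    """Map an (x, y) point from ROI coordinates to full-frame coordinates."""
    y1, _, x1, _ = roi_area
    x, y = point
    return x + x1, y + y1


print(roi_to_frame((10, 20), (100, 460, 200, 840)))
```

This is the same offset the script applies when it adds x_start and y_start to each contour center.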

2. Scaling the Images

The following part of the code scales the original image (frame) and the processed image (processed_frame) to half their original size.

    # Scale factor
    scale_percent = 50  # scale the images to 50% of their original size
    width = int(frame.shape[1] * scale_percent / 100)
    height = int(frame.shape[0] * scale_percent / 100)
    dim = (width, height)

    # Resize both the original and the processed image
    resized_frame = cv2.resize(frame, dim, interpolation=cv2.INTER_AREA)
    resized_processed_frame = cv2.resize(processed_frame, dim, interpolation=cv2.INTER_AREA)
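
A common stumbling block here is the argument order: a NumPy image shape is (height, width, channels), while cv2.resize expects its target size as (width, height). The small helper below, whose name scaled_size is hypothetical, isolates that calculation so it can be checked without an image.

```python
def scaled_size(shape, scale_percent):
    """Return (width, height) for cv2.resize from a NumPy image shape.

    Hypothetical helper; shape is (height, width[, channels]).
    """
    height, width = shape[:2]
    # Clamp to 1 so that very small images never get a zero dimension
    new_width = max(1, int(width * scale_percent / 100))
    new_height = max(1, int(height * scale_percent / 100))
    # cv2.resize expects dsize as (width, height), the reverse of shape order
    return new_width, new_height


print(scaled_size((720, 1280, 3), 50))
```

As an alternative, cv2.resize also accepts scale factors directly, for example cv2.resize(frame, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA), which avoids computing the size by hand.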

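3. Concatenating the Images

The following part of the code concatenates the scaled-down original image with the scaled-down processed image and saves the result as combined_image.jpg.

    # Stack the images from before and after processing vertically
    combined_frame = cv2.vconcat([resized_frame, resized_processed_frame])  # vertical concatenation
    cv2.imwrite(output_image_path, combined_frame)

For horizontal concatenation, simply replace vconcat with hconcat.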

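4. Displaying the Image

The following part of the code displays the concatenated image.

    # Show the concatenated image
    cv2.imshow("Before and After (Resized)", combined_frame)

    # Block until a key is pressed (waitKey returns the key code)
    if cv2.waitKey(0):
        print("Keyboard interrupt")

    # Close all windows
    cv2.destroyAllWindows()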