Mastering Computer Vision with OpenCV and Python: A Deep Dive into Advanced Techniques with Code Demonstrations

小北的北

Article link: https://blog.csdn.net/weixin_38739735/article/details/137946629

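In the ever-evolving technology landscape, computer vision stands out as a transformative force, enabling machines to interpret and understand visual information. OpenCV (Open Source Computer Vision Library) is a cornerstone of the field, offering a rich set of tools and functions for image and video processing. In this article, we cover the basics of OpenCV and work through ten Python code examples that show its versatility and power.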

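Understanding OpenCV

OpenCV is an open-source computer vision and machine learning software library that provides tools for image and video analysis. It is written in C++ and also ships Python bindings. OpenCV supports a wide range of computer vision tasks, including image and video processing, object detection, and face recognition. This versatility has made it a popular choice among researchers, developers, and hobbyists.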

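Installing OpenCV

Before diving into the code examples, make sure OpenCV is installed. Install it into your Python environment with the command below; matplotlib is included because the examples use it to display images:

pip install opencv-python matplotlib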

Once installed, import OpenCV and matplotlib in your Python script or Jupyter notebook:

import cv2
import matplotlib.pyplot as plt

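1. Loading and Displaying an Image

Let's start with a simple example: loading an image and displaying it. The later examples assume the loaded image is stored in a variable named image.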

2. Converting an Image to Grayscale

Convert the image to grayscale with OpenCV:

# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)


# Display the grayscale image
plt.imshow(gray_image, cmap='gray')
plt.axis('off')
plt.show()
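To make the conversion less of a black box, the sketch below reproduces the weighted sum that OpenCV documents for RGB-to-gray conversion (Y = 0.299 R + 0.587 G + 0.114 B) in plain Python. The function names are hypothetical, and the result may differ from cv2.cvtColor by a rounding step, since OpenCV uses its own integer arithmetic internally.

```python
def bgr_pixel_to_gray(b, g, r):
    """Weighted sum of one pixel's channels; green contributes most, blue least."""
    return int(round(0.114 * b + 0.587 * g + 0.299 * r))


def bgr_image_to_gray(image):
    """Apply the weighted sum to every pixel of a nested-list BGR image."""
    return [[bgr_pixel_to_gray(*pixel) for pixel in row] for row in image]


# A 2x2 test image: pure blue, pure green, pure red, and white (BGR order)
tiny = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
gray = bgr_image_to_gray(tiny)
print(gray)
```

Note how the three pure colors land on very different gray levels even though each has one channel at full intensity.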

3. Blurring an Image

Apply a Gaussian blur to the image to reduce noise:

# Apply Gaussian blur
blurred_image = cv2.GaussianBlur(image, (5, 5), 0)


# Display the blurred image
plt.imshow(cv2.cvtColor(blurred_image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()
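The (5, 5) argument is the kernel size, and the 0 asks OpenCV to derive the standard deviation from it. As a rough illustration of what the blur does, the sketch below builds a normalized 1D Gaussian kernel and smooths a single row with it. The function names are hypothetical, default_sigma follows the formula given in the cv2.getGaussianKernel documentation, and the edge handling is simplified compared with OpenCV's default border mode.

```python
import math


def default_sigma(ksize):
    """Sigma derived from the kernel size, per the getGaussianKernel documentation."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8


def gaussian_kernel_1d(ksize, sigma):
    """Return ksize Gaussian weights that sum to 1 (ksize must be odd and positive)."""
    if ksize < 1 or ksize % 2 == 0:
        raise ValueError("ksize must be a positive odd number")
    center = ksize // 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row, kernel):
    """Smooth one row with the kernel, clamping indices at the edges for simplicity."""
    center = len(kernel) // 2
    last = len(row) - 1
    return [
        sum(kernel[k] * row[min(max(j + k - center, 0), last)] for k in range(len(kernel)))
        for j in range(len(row))
    ]


kernel = gaussian_kernel_1d(5, default_sigma(5))
blurred = blur_row([0, 0, 100, 0, 0], kernel)  # a single bright spike gets spread out
print([round(v, 2) for v in blurred])
```

A 2D Gaussian blur can be seen as applying such a kernel along the rows and then along the columns.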

4. Edge Detection

Use the Canny edge detection algorithm to highlight the edges in an image:

# Apply Canny edge detection
edges = cv2.Canny(gray_image, 50, 150)


# Display the edges
plt.imshow(edges, cmap='gray')
plt.axis('off')
plt.show()
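The two numbers passed to cv2.Canny (50 and 150) are the lower and upper hysteresis thresholds. The sketch below is a simplified 1D illustration of that idea only, not of the full algorithm: values above the upper threshold are kept, values below the lower threshold are dropped, and values in between survive only if they are connected to a kept value. The function name and the sample gradient magnitudes are made up for the illustration.

```python
def hysteresis_1d(magnitudes, low, high):
    """Return a list of booleans marking which positions are kept as edges."""
    n = len(magnitudes)
    strong = [m > high for m in magnitudes]
    weak = [low <= m <= high for m in magnitudes]
    keep = list(strong)

    # Grow outward from every strong position through adjacent weak positions
    stack = [i for i in range(n) if strong[i]]
    while stack:
        i = stack.pop()
        for j in (i - 1, i + 1):
            if 0 <= j < n and weak[j] and not keep[j]:
                keep[j] = True
                stack.append(j)
    return keep


magnitudes = [10, 60, 200, 120, 40, 100, 30]
edges_1d = hysteresis_1d(magnitudes, low=50, high=150)
print(edges_1d)
```

In this sample the value 100 lies between the thresholds but is dropped, because the 40 next to it cuts it off from the strong value 200.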

5. Object Detection

Use a pre-trained Haar cascade to detect faces in an image:

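# Load the pre-trained face cascade (adjust the path to wherever the XML file lives on your machine)
faceCascade = cv2.CascadeClassifier('./opencv-master/data/haarcascades/' + 'haarcascade_frontalface_default.xml')

# Detect faces in the grayscale image.
# minNeighbors=0 keeps every raw candidate window, so expect overlapping and false detections;
# raising it (example 8 uses 5) filters out weakly supported candidates.
faces = faceCascade.detectMultiScale(
    gray_image,
    scaleFactor=1.1,
    minNeighbors=0,
    minSize=(10, 10)
)
how_many_faces = len(faces)
print(f"Detected {how_many_faces} candidate face region(s)")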


# Draw rectangles around the detected faces
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x+w, y+h), (255, 0, 0), 2)


# Display the image with face detection
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

6. Image Histograms

Compute and plot a histogram for the image:

# Calculate the histogram of channel 0, which is the blue channel because OpenCV stores images as BGR
hist = cv2.calcHist([image], [0], None, [256], [0, 256])


# Plot the histogram
plt.plot(hist)
plt.title('Image Histogram')
plt.xlabel('Pixel Value')
plt.ylabel('Frequency')
plt.show()

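Reading the histogram:

The x-axis shows the pixel value (intensity level), from 0 to 255.
The y-axis shows how often each pixel value occurs in the image.
Peaks mark intensity values that many pixels share.
The histogram summarizes the distribution of pixel intensities, which helps in judging the overall brightness and contrast of the image.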

7. Image Stitching

Stitch multiple images together to create a panoramic view:

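import cv2

stitcher = cv2.Stitcher_create()
image1 = cv2.imread("./foo.png")
image2 = cv2.imread("./bar.png")

# stitch() returns a status code and the panorama; the panorama is only valid on success
status, panorama = stitcher.stitch([image1, image2])

if status == cv2.Stitcher_OK:
    cv2.imwrite("./result.jpg", panorama)
else:
    print(f"Stitching failed with status code {status}")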

8. Real-Time Face Detection with a Webcam

Use OpenCV and a webcam to detect faces in real time:

# Load the pre-trained face cascade (same file as in example 5)
face_cascade = cv2.CascadeClassifier('./opencv-master/data/haarcascades/haarcascade_frontalface_default.xml')

# Open a connection to the webcam
cap = cv2.VideoCapture(0)

while True:
    # Capture frame-by-frame; stop if no frame could be read
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale for face detection
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces in the frame
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))

    # Draw rectangles around the detected faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    # Display the frame
    cv2.imshow('Real-time Face Detection', frame)

    # Break the loop when 'q' key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam and close all windows
cap.release()
cv2.destroyAllWindows()

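9. Document Scanner

Build a document scanner that extracts text or images from a document or photo. This example shows how to apply a perspective transform to obtain a top-down view of the document: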

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image of the document
document_path = 'path/to/your/document.jpg'
document_image = cv2.imread(document_path)
# Convert the image to grayscale
gray_document = cv2.cvtColor(document_image, cv2.COLOR_BGR2GRAY)
# Apply Gaussian blur to reduce noise and improve edge detection
blurred_document = cv2.GaussianBlur(gray_document, (5, 5), 0)
# Use Canny edge detection to find edges in the image
edges_document = cv2.Canny(blurred_document, 50, 150)
# Find contours in the edged image
contours, _ = cv2.findContours(edges_document, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Pick the contour with the largest area (assuming it's the document)
largest_contour = max(contours, key=cv2.contourArea)
# Calculate the perimeter of the contour
perimeter = cv2.arcLength(largest_contour, True)
# Approximate the polygonal curves of the contour
approx = cv2.approxPolyDP(largest_contour, 0.02 * perimeter, True)
# Ensure the approximated contour has four points (a quadrilateral)
if len(approx) == 4:
    # getPerspectiveTransform requires float32 points. The corner order returned by
    # approxPolyDP is not guaranteed to match the destination order below (top-left,
    # top-right, bottom-right, bottom-left), so the result may come out rotated or mirrored.
    src_points = approx.reshape(4, 2).astype(np.float32)
    dst_points = np.float32([[0, 0], [800, 0], [800, 1200], [0, 1200]])
    matrix = cv2.getPerspectiveTransform(src_points, dst_points)
    # Apply perspective transformation to obtain a top-down view of the document
    transformed_document = cv2.warpPerspective(document_image, matrix, (800, 1200))
    # Display the original and transformed document side by side
    plt.figure(figsize=(10, 5))
    plt.subplot(1, 2, 1)
    plt.imshow(cv2.cvtColor(document_image, cv2.COLOR_BGR2RGB))
    plt.title('Original Document')
    plt.axis('off')
    plt.subplot(1, 2, 2)
    plt.imshow(cv2.cvtColor(transformed_document, cv2.COLOR_BGR2RGB))
    plt.title('Transformed Document')
    plt.axis('off')
    plt.show()
else:
    print('Could not find a four-cornered document outline.')
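The perspective transform maps the four source corners onto the four destination corners pairwise, so the source corners need to be in the same order as the destination ones. One common way to get a consistent order, shown here as a sketch rather than as part of the original example, is to sort by x + y and y - x. The helper name order_corners is hypothetical, and the trick assumes the document is not rotated close to 45 degrees.

```python
import numpy as np


def order_corners(points):
    """Return 4 points ordered top-left, top-right, bottom-right, bottom-left (float32)."""
    pts = np.asarray(points, dtype=np.float32).reshape(4, 2)
    sums = pts.sum(axis=1)         # x + y: smallest at top-left, largest at bottom-right
    diffs = pts[:, 1] - pts[:, 0]  # y - x: smallest at top-right, largest at bottom-left
    return np.array(
        [
            pts[np.argmin(sums)],
            pts[np.argmin(diffs)],
            pts[np.argmax(sums)],
            pts[np.argmax(diffs)],
        ],
        dtype=np.float32,
    )


# Corners of an 800x1200 page given in a scrambled order
shuffled = [[800, 1200], [0, 0], [0, 1200], [800, 0]]
ordered = order_corners(shuffled)
print(ordered)
```

In the scanner, passing order_corners(approx) to cv2.getPerspectiveTransform in place of the raw approx points would line the corners up with the destination rectangle.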

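10. Optical Character Recognition (OCR)

Perform optical character recognition with the Tesseract OCR engine through the pytesseract library. This example extracts text from an image: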

import cv2
import pytesseract
import matplotlib.pyplot as plt
from PIL import Image

# Path to the Tesseract executable (Windows example; replace with your path)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
# Load an image containing text
text_image_path = 'path/to/your/text_image.jpg'
text_image = cv2.imread(text_image_path)
# Convert the image to grayscale
gray_text_image = cv2.cvtColor(text_image, cv2.COLOR_BGR2GRAY)
# Use thresholding to emphasize the text
_, thresholded_text = cv2.threshold(gray_text_image, 150, 255, cv2.THRESH_BINARY)
# Use pytesseract to perform OCR on the thresholded image
text = pytesseract.image_to_string(Image.fromarray(thresholded_text))
# Display the original image and extracted text
plt.figure(figsize=(8, 6))
plt.imshow(cv2.cvtColor(text_image, cv2.COLOR_BGR2RGB))
plt.title('Original Image')
plt.axis('off')
plt.show()
print("Extracted Text:")
print(text)

Extracted Text:
Tesseract at UB Mannheim


The Mannheim University Library (UB Mannheim) uses Tesseract to perform text recognition (OCR = optical character
recognition) for historical German newspapers ( ' ). The latest
results with text from more than 700000 pages are available


Tesseract installer for Windows


Normally we run Tesseract on Debian GNU Linux, but there was also the need for a Windows version. That's why we have built
a Tesseract installer for Windows.


WARNING: Tesseract should be either installed in the directory which is suggested during the installation or in a new
directory. The uninstaller removes the whole installation directory. If you installed Tesseract in an existing directory, that
directory will be removed with all its subdirectories and files.

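These examples show the range of tasks OpenCV can handle, from document scanning to extracting text with OCR. By bringing these techniques into your own projects, you can use computer vision to solve real-world problems.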

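Conclusion

OpenCV gives developers and researchers the means to explore the wide world of computer vision. This article introduced OpenCV and walked through Python code examples covering image and video processing, object detection, and more. As the technology continues to evolve, OpenCV remains an invaluable tool for anyone looking to push the boundaries of computer vision.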