Implementing Canny Edge Detection with PyTorch

Article link: https://blog.csdn.net/qq_24951479/article/details/130354546

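What is edge detection?

Edge detection is a fundamental task in computer vision whose goal is to locate the edges of objects in an image. An edge is an object boundary or a region inside an object where the intensity changes sharply. Edge detection plays an important role in many applications, such as image segmentation, object recognition, and 3D reconstruction.

Steps of edge detection

The basic steps of edge detection are as follows:

Convert the image to grayscale so that each pixel has a single intensity value.
Filter the image to smooth it and suppress noise.
Compute the gradient at each pixel to find where the intensity changes.
Apply non-maximum suppression to keep only the local maxima along the gradient direction.
Apply double thresholding to classify edge pixels as strong or weak.
Apply connectivity analysis (hysteresis) to promote weak edges that touch strong edges and discard the rest.

Implementing edge detection

In this article, we implement edge detection with PyTorch using the Canny algorithm, a classic edge detection method.

Grayscale conversion

First, we convert the image to grayscale. This can be done with the following code: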

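import torch
import torchvision.transforms.functional as TF

def to_gray(image):
    # Convert a PIL image to a (1, 1, H, W) float tensor with values in [0, 255],
    # the layout that F.conv2d expects in the following steps.
    gray = TF.to_grayscale(image, num_output_channels=1)
    return TF.to_tensor(gray).unsqueeze(0) * 255.0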
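
For PIL images, grayscale conversion is a weighted sum of the red, green, and blue channels using the ITU-R BT.601 luma weights. The sketch below reproduces that weighted sum on a plain tensor to show what the conversion computes. The helper name rgb_to_gray and the constant LUMA_WEIGHTS are illustrative and not part of torchvision.

```python
import torch

# ITU-R BT.601 luma weights (the coefficients PIL uses for convert("L")).
LUMA_WEIGHTS = torch.tensor([0.299, 0.587, 0.114])

def rgb_to_gray(rgb):
    """Convert a (3, H, W) float tensor to a (1, H, W) grayscale tensor."""
    weights = LUMA_WEIGHTS.to(rgb.dtype).view(3, 1, 1)
    return (rgb * weights).sum(dim=0, keepdim=True)

# A 1x2 image: one pure-red pixel followed by one white pixel.
rgb = torch.tensor([[[1.0, 1.0]], [[0.0, 1.0]], [[0.0, 1.0]]])
gray = rgb_to_gray(rgb)
print(gray)  # red maps to 0.299, white maps to 1.0
```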

Filtering

Next, we filter the image to smooth it and suppress noise. We use a Gaussian filter, which can be implemented with the following code:

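import torch.nn.functional as F

def gaussian_kernel(size, sigma=1.5):
    # 1D Gaussian centered on the middle tap, normalized to sum to 1.
    x = torch.arange(size).float()
    k = torch.exp(-(x - size // 2)**2 / (2 * sigma**2))
    k = k / k.sum()
    # The outer product of the 1D kernel with itself gives the 2D kernel.
    return k[:, None] * k[None, :]

def gaussian_filter(image, size=5, sigma=1.5):
    # conv2d expects a weight of shape (out_channels, in_channels, kH, kW).
    kernel = gaussian_kernel(size, sigma).unsqueeze(0).unsqueeze(0)
    return F.conv2d(image, kernel, padding=size // 2)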
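
Because a 2D Gaussian is the outer product of two 1D Gaussians, the blur can also be applied as a horizontal pass followed by a vertical pass. The sketch below is one way to do this and checks that it matches a single pass with the full 2D kernel. The names gaussian_kernel_1d and separable_gaussian_filter are hypothetical helpers introduced only for this illustration.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel_1d(size, sigma=1.5):
    # 1D Gaussian centered on the middle tap, normalized to sum to 1.
    x = torch.arange(size).float() - size // 2
    k = torch.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def separable_gaussian_filter(image, size=5, sigma=1.5):
    """Blur a (N, 1, H, W) tensor with a horizontal pass, then a vertical pass."""
    k = gaussian_kernel_1d(size, sigma)
    pad = size // 2
    image = F.conv2d(image, k.view(1, 1, 1, size), padding=(0, pad))
    image = F.conv2d(image, k.view(1, 1, size, 1), padding=(pad, 0))
    return image

# Compare against a single pass with the full 2D kernel (outer product).
torch.manual_seed(0)
image = torch.rand(1, 1, 16, 16)
k = gaussian_kernel_1d(5)
kernel_2d = (k[:, None] * k[None, :]).view(1, 1, 5, 5)
full = F.conv2d(image, kernel_2d, padding=2)
separable = separable_gaussian_filter(image)
max_diff = (full - separable).abs().max().item()
print(max_diff)  # differs only by floating-point rounding
```

For a kernel of width k, the two 1D passes cost about 2k multiplications per pixel instead of k squared, which matters for larger kernels.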

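Computing the gradient

Next, we compute the gradient at each pixel of the image. We use the Sobel operator, which can be implemented with the following code: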

def sobel_filter(image):
    kernel_x = torch.tensor([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]).float().unsqueeze(0).unsqueeze(0)
    kernel_y = torch.tensor([[-1, -2, -1], [0, 0, 0], [1, 2, 1]]).float().unsqueeze(0).unsqueeze(0)
    gradient_x = F.conv2d(image, kernel_x, padding=1)
    gradient_y = F.conv2d(image, kernel_y, padding=1)
    gradient = torch.sqrt(gradient_x**2 + gradient_y**2)
    angle = torch.atan2(gradient_y, gradient_x)
    return gradient, angle
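
To make the output of the Sobel operator concrete, the following sketch applies the same two kernels to a small synthetic image with a vertical step edge. The 6x6 test image and the variable names are assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as F

sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
sobel_y = torch.tensor([[-1., -2., -1.], [0., 0., 0.], [1., 2., 1.]]).view(1, 1, 3, 3)

# 6x6 image: the left half is dark (0), the right half is bright (1).
image = torch.zeros(1, 1, 6, 6)
image[..., 3:] = 1.0

gx = F.conv2d(image, sobel_x, padding=1)
gy = F.conv2d(image, sobel_y, padding=1)
magnitude = torch.sqrt(gx**2 + gy**2)
angle_deg = torch.rad2deg(torch.atan2(gy, gx))

# Inspect interior pixels only; zero padding creates artificial edges at the border.
interior_magnitude = magnitude[0, 0, 2, 1:5]
interior_angle = angle_deg[0, 0, 2, 2:4]
print(interior_magnitude)  # nonzero only in the two columns next to the step
print(interior_angle)      # 0 degrees: the gradient points along +x, across the edge
```

The gradient direction is perpendicular to the edge itself, which is why non-maximum suppression compares each pixel with its neighbors along that direction.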

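Non-maximum suppression

Next, we apply non-maximum suppression, which keeps a pixel only if its gradient magnitude is a local maximum along the gradient direction. This can be implemented with the following code: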

import numpy as np

def non_maximum_suppression(gradient, angle):
    h, w = gradient.shape[-2:]
    suppressed = torch.zeros_like(gradient)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # Fold the direction into [0, 180): opposite directions share the same neighbors.
            a = (angle[0, 0, i, j].item() / np.pi * 180) % 180
            if a < 22.5 or a >= 157.5:   # horizontal gradient: compare left and right
                di, dj = 0, 1
            elif a < 67.5:               # about 45 degrees: compare along the main diagonal
                di, dj = 1, 1
            elif a < 112.5:              # vertical gradient: compare up and down
                di, dj = 1, 0
            else:                        # about 135 degrees: compare along the anti-diagonal
                di, dj = 1, -1
            g = gradient[0, 0, i, j]
            if g >= gradient[0, 0, i - di, j - dj] and g >= gradient[0, 0, i + di, j + dj]:
                suppressed[0, 0, i, j] = g
    return suppressed
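
The nested Python loops above are slow on large images. A possible vectorized variant is sketched below: it quantizes the direction into four sectors and compares each pixel with shifted copies of the gradient map. The function name nms_vectorized is hypothetical, and unlike the loop version it also evaluates border pixels, treating everything outside the image as zero.

```python
import math
import torch
import torch.nn.functional as F

def nms_vectorized(gradient, angle):
    """Non-maximum suppression on (1, 1, H, W) tensors without per-pixel loops."""
    # Fold the direction into [0, 180) and quantize it to 0, 45, 90 or 135 degrees.
    deg = torch.rad2deg(angle) % 180
    sector = torch.round(deg / 45).long() % 4
    # (row, col) step along the gradient for each sector; rows grow downward.
    offsets = [(0, 1), (1, 1), (1, 0), (1, -1)]
    padded = F.pad(gradient, (1, 1, 1, 1))  # zero border so every pixel has neighbors
    h, w = gradient.shape[-2:]
    keep = torch.zeros_like(gradient, dtype=torch.bool)
    for s, (di, dj) in enumerate(offsets):
        ahead = padded[..., 1 + di:1 + di + h, 1 + dj:1 + dj + w]
        behind = padded[..., 1 - di:1 - di + h, 1 - dj:1 - dj + w]
        keep |= (sector == s) & (gradient >= ahead) & (gradient >= behind)
    return gradient * keep

# A ridge profile [1, 3, 2] repeated on three rows.
gradient = torch.tensor([[1., 3., 2.]]).repeat(3, 1).view(1, 1, 3, 3)

# With a horizontal gradient direction, only the ridge column survives.
thin = nms_vectorized(gradient, torch.zeros(1, 1, 3, 3))

# With a vertical direction, every pixel ties with its vertical neighbors and is kept.
unchanged = nms_vectorized(gradient, torch.full((1, 1, 3, 3), math.pi / 2))
```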

Double thresholding

Next, we apply double thresholding to classify edge pixels as strong or weak. This can be implemented with the following code:

def double_threshold(suppressed, low_threshold=20, high_threshold=50):
    strong = (suppressed >= high_threshold).float()
    weak = (suppressed >= low_threshold).float() - strong
    return strong, weak
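
As a small worked example, the following sketch runs the same function on five hand-picked magnitudes with the default thresholds of 20 and 50. The input values are made up for illustration.

```python
import torch

def double_threshold(suppressed, low_threshold=20, high_threshold=50):
    # Strong: at or above the high threshold.
    strong = (suppressed >= high_threshold).float()
    # Weak: at or above the low threshold but below the high threshold.
    weak = (suppressed >= low_threshold).float() - strong
    return strong, weak

values = torch.tensor([[[[5., 20., 35., 50., 80.]]]])
strong, weak = double_threshold(values)
print(strong)  # 50 and 80 are strong
print(weak)    # 20 and 35 are weak; 5 is discarded
```

Every pixel ends up in exactly one of three groups: strong, weak, or discarded.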

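Connectivity analysis

Finally, we apply connectivity analysis: a weak edge pixel is promoted to a strong edge if it is connected to a strong edge, and discarded otherwise. This can be implemented with the following code: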

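def edge_tracking(strong, weak):
    h, w = strong.shape[-2:]
    changed = True
    # Repeat the scan until no weak pixel is promoted, so that chains of weak
    # pixels connected to a strong pixel are followed to their end.
    while changed:
        changed = False
        for i in range(1, h - 1):
            for j in range(1, w - 1):
                if weak[0, 0, i, j] and strong[0, 0, i - 1:i + 2, j - 1:j + 2].max() > 0:
                    strong[0, 0, i, j] = 1
                    weak[0, 0, i, j] = 0
                    changed = True
    return strong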
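
The same connectivity rule can be expressed without per-pixel loops. The sketch below is one possible approach, not the method used in this article's code: a 3x3 max-pool marks every pixel that has an edge pixel in its 8-neighborhood, and the growth step repeats until nothing changes. The function name hysteresis is hypothetical.

```python
import torch
import torch.nn.functional as F

def hysteresis(strong, weak):
    """Promote weak pixels that are 8-connected to a strong pixel, repeating until stable."""
    edges = strong.clone()
    while True:
        # 1 wherever the 3x3 neighborhood contains at least one current edge pixel.
        near_edge = F.max_pool2d(edges, kernel_size=3, stride=1, padding=1)
        grown = torch.max(edges, weak * near_edge)
        if torch.equal(grown, edges):
            return edges
        edges = grown

# One strong pixel, a chain of two weak pixels next to it, and an isolated weak pixel.
strong = torch.tensor([[[[1., 0., 0., 0., 0.]]]])
weak = torch.tensor([[[[0., 1., 1., 0., 1.]]]])
edges = hysteresis(strong, weak)
print(edges)  # the chain is promoted; the isolated weak pixel is dropped
```

Unlike the loop version, this sketch leaves its inputs unmodified and also considers border pixels.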

Complete code

The complete edge detection code is shown below:

import numpy as np
import torch
import torchvision.transforms.functional as TF
import torch.nn.functional as F

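def to_gray(image):
    # Convert a PIL image to a (1, 1, H, W) float tensor with values in [0, 255].
    gray = TF.to_grayscale(image, num_output_channels=1)
    return TF.to_tensor(gray).unsqueeze(0) * 255.0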

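def gaussian_kernel(size, sigma=1.5):
    # 1D Gaussian centered on the middle tap, normalized to sum to 1.
    x = torch.arange(size).float()
    k = torch.exp(-(x - size // 2)**2 / (2 * sigma**2))
    k = k / k.sum()
    # The outer product of the 1D kernel with itself gives the 2D kernel.
    return k[:, None] * k[None, :]

def gaussian_filter(image, size=5, sigma=1.5):
    # conv2d expects a weight of shape (out_channels, in_channels, kH, kW).
    kernel = gaussian_kernel(size, sigma).unsqueeze(0).unsqueeze(0)
    return F.conv2d(image, kernel, padding=size // 2)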

def sobel_filter(image):
    kernel_x = torch.tensor([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]).float().unsqueeze(0).unsqueeze(0)
    kernel_y = torch.tensor([[-1, -2, -1], [0, 0, 0], [1, 2, 1]]).float().unsqueeze(0).unsqueeze(0)
    gradient_x = F.conv2d(image, kernel_x, padding=1)
    gradient_y = F.conv2d(image, kernel_y, padding=1)
    gradient = torch.sqrt(gradient_x**2 + gradient_y**2)
    angle = torch.atan2(gradient_y, gradient_x)
    return gradient, angle

def non_maximum_suppression(gradient, angle):
    h, w = gradient.shape[-2:]
    suppressed = torch.zeros_like(gradient)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # Fold the direction into [0, 180): opposite directions share the same neighbors.
            a = (angle[0, 0, i, j].item() / np.pi * 180) % 180
            if a < 22.5 or a >= 157.5:   # horizontal gradient: compare left and right
                di, dj = 0, 1
            elif a < 67.5:               # about 45 degrees: compare along the main diagonal
                di, dj = 1, 1
            elif a < 112.5:              # vertical gradient: compare up and down
                di, dj = 1, 0
            else:                        # about 135 degrees: compare along the anti-diagonal
                di, dj = 1, -1
            g = gradient[0, 0, i, j]
            if g >= gradient[0, 0, i - di, j - dj] and g >= gradient[0, 0, i + di, j + dj]:
                suppressed[0, 0, i, j] = g
    return suppressed

def double_threshold(suppressed, low_threshold=20, high_threshold=50):
    strong = (suppressed >= high_threshold).float()
    weak = (suppressed >= low_threshold).float() - strong
    return strong, weak

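def edge_tracking(strong, weak):
    h, w = strong.shape[-2:]
    changed = True
    # Repeat the scan until no weak pixel is promoted, so that chains of weak
    # pixels connected to a strong pixel are followed to their end.
    while changed:
        changed = False
        for i in range(1, h - 1):
            for j in range(1, w - 1):
                if weak[0, 0, i, j] and strong[0, 0, i - 1:i + 2, j - 1:j + 2].max() > 0:
                    strong[0, 0, i, j] = 1
                    weak[0, 0, i, j] = 0
                    changed = True
    return strong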

def canny(image):
    gray = to_gray(image)
    filtered = gaussian_filter(gray)
    gradient, angle = sobel_filter(filtered)
    suppressed = non_maximum_suppression(gradient, angle)
    strong, weak = double_threshold(suppressed)
    strong = edge_tracking(strong, weak)
    return strong

# Example
import matplotlib.pyplot as plt
from PIL import Image

image = Image.open('example.jpg')
edges = canny(image)
plt.imshow(edges.squeeze(), cmap='gray')
plt.show()

Structure diagram

[Figure: structure diagram of the edge detection pipeline]

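The following sections introduce several common edge detection methods.

1. Canny edge detection

Canny edge detection is a classic method built from a sequence of image processing steps: Gaussian filtering, gradient computation, non-maximum suppression, and double thresholding. Gaussian filtering smooths the image, the gradient locates intensity changes, non-maximum suppression thins the edges to a width of about one pixel, and double thresholding decides which edge pixels to keep based on their strength.

The steps of Canny edge detection are as follows:

Apply a Gaussian filter to smooth the image and suppress noise.
Compute the image gradient to find candidate edges.
Apply non-maximum suppression to the gradient magnitude to thin the edges.
Apply double thresholding to classify pixels as strong or weak edges, then keep weak edges only where they connect to strong ones.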

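2. Sobel edge detection

Sobel edge detection is a gradient-based method. It uses two convolution kernels (Sobel X and Sobel Y) to compute the horizontal and vertical gradients of the image, then combines them to obtain the strength and direction of each edge.

The steps of Sobel edge detection are as follows:

Convert the image to grayscale so that the gradient can be computed.
Compute the horizontal and vertical gradients with the Sobel X and Sobel Y kernels.
Compute the gradient magnitude and direction.
Threshold the gradient magnitude to decide which pixels are edges.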
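
The Sobel steps above can be sketched as a single function. This is a minimal illustration under stated assumptions: the function name sobel_edges, the threshold value, and the choice of replicate padding (used here to avoid artificial edges at the image border) are not taken from the article.

```python
import torch
import torch.nn.functional as F

def sobel_edges(gray, threshold):
    """Return a binary edge map and the gradient magnitude for a (1, 1, H, W) tensor."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(-1, -2)  # Sobel Y is the transpose of Sobel X
    # Replicate the border pixels so the image edge does not look like a step.
    padded = F.pad(gray, (1, 1, 1, 1), mode="replicate")
    gx = F.conv2d(padded, kx)
    gy = F.conv2d(padded, ky)
    magnitude = torch.sqrt(gx**2 + gy**2)
    return (magnitude >= threshold).float(), magnitude

# 6x6 image with a vertical step between columns 2 and 3.
image = torch.zeros(1, 1, 6, 6)
image[..., 3:] = 1.0
edges, magnitude = sobel_edges(image, threshold=2.0)
print(edges[0, 0])  # ones in the two columns adjacent to the step, zeros elsewhere
```

Without non-maximum suppression, the detected edge is two pixels wide, which is the main visual difference from the Canny result.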

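3. Laplacian edge detection

Laplacian edge detection is based on the second derivative. It applies the Laplacian operator to compute the second derivative of the image and derives the edge strength from the response.

The steps of Laplacian edge detection are as follows:

Convert the image to grayscale so that the derivative can be computed.
Compute the second derivative of the image with the Laplacian operator.
Take the magnitude (absolute value) of the response.
Threshold the magnitude to decide which pixels are edges.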
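
The Laplacian steps above can be sketched in the same way. The 4-neighbor kernel below is one common discretization of the Laplacian; the function name laplacian_edges, the threshold, and the replicate padding are assumptions made for this illustration.

```python
import torch
import torch.nn.functional as F

# 4-neighbor discrete Laplacian kernel.
LAPLACIAN = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]]).view(1, 1, 3, 3)

def laplacian_edges(gray, threshold):
    """Return a binary edge map and the raw Laplacian response for a (1, 1, H, W) tensor."""
    padded = F.pad(gray, (1, 1, 1, 1), mode="replicate")
    response = F.conv2d(padded, LAPLACIAN)
    return (response.abs() >= threshold).float(), response

# 6x6 image with a vertical step between columns 2 and 3.
image = torch.zeros(1, 1, 6, 6)
image[..., 3:] = 1.0
edges, response = laplacian_edges(image, threshold=0.5)
print(response[0, 0, 0])  # positive on the dark side of the step, negative on the bright side
```

The sign change in the response marks the edge location. Because second derivatives amplify noise, the image is usually smoothed before the Laplacian is applied.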

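4. Edge detection in PyTorch

The transforms module of torchvision does not include edge detection transforms such as Canny, Sobel, or Laplacian. Sobel filtering can be written directly with torch.nn.functional.conv2d, as in sobel_filter above, and the same approach works for a Laplacian kernel. A ready-made Canny detector is available in OpenCV as cv2.Canny, and it can be plugged into a torchvision pipeline with transforms.Lambda.

The following example wraps cv2.Canny in a transforms pipeline:

import cv2
import numpy as np
import torchvision.transforms as transforms
from PIL import Image

# Load the image
img = Image.open('image.jpg')

# Define the Canny edge detection transform.
# cv2.Canny expects an 8-bit single-channel array; 100 and 200 are the low and high thresholds.
canny = transforms.Compose([
    transforms.Grayscale(),
    transforms.Lambda(lambda im: Image.fromarray(cv2.Canny(np.array(im), 100, 200)))
])

# Apply the Canny edge detection transform
edge = canny(img)

# Show the edge detection result
edge.show()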