Object Detection: YOLOv4

wydxry

Last modified: 2024-09-06 11:14:19

Category: Deep Learning. Tags: object detection, YOLO, object tracking

First published: 2024-09-06 10:05:08

Article link: https://blog.csdn.net/wydxry/article/details/141952198

Introduction to YOLOv4

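YOLOv4 is the fourth version in the YOLO series. It keeps the efficiency of YOLOv3 and, through a large number of optimizations, improves both accuracy and speed on object detection. Compared with YOLOv3, it upgrades the network architecture, the feature extraction stages, and the training strategy. It still runs in real time (the YOLOv4 paper reports 43.5% AP on MS COCO at about 65 FPS on a Tesla V100) while detecting noticeably better, especially in complex scenes.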

Improvements and Advantages over YOLOv3

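Improved backbone (CSPDarknet-53)
YOLOv4 uses CSPDarknet-53 as its backbone. CSPNet (Cross Stage Partial Network) splits the feature map of each stage into two parts: one part passes through the stage's residual blocks while the other bypasses them, and the two are merged at the end of the stage. This reduces duplicated gradient information, lowers computation and memory use, and keeps or improves accuracy, which makes the network lighter.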
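PANet (Path Aggregation Network)
In the neck, YOLOv4 uses PANet in place of the FPN (Feature Pyramid Network) used in YOLOv3. FPN fuses features only along a top-down path; PANet adds a bottom-up path on top of it, so that low-level localization detail also reaches the deeper levels. Features from different scales are aggregated more effectively, which strengthens the feature representation and helps small-object detection in particular.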
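
To make the two fusion paths described above concrete, here is a minimal sketch of a PANet-style neck. The class name `TinyPAN`, the common width of 128 channels, and the use of element-wise addition are assumptions made for illustration; the YOLOv4 paper states that it fuses by concatenation instead, and the real neck contains more convolutions than this.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyPAN(nn.Module):
    """Hypothetical, simplified PANet-style neck: top-down pass, then bottom-up pass."""

    def __init__(self, channels=(256, 512, 1024), width=128):
        super().__init__()
        # 1x1 lateral convolutions bring every scale to the same channel width
        self.lateral = nn.ModuleList([nn.Conv2d(c, width, 1) for c in channels])
        # Stride-2 convolutions used by the bottom-up path
        self.down = nn.ModuleList(
            [nn.Conv2d(width, width, 3, stride=2, padding=1) for _ in channels[:-1]]
        )

    def forward(self, feats):
        p3, p4, p5 = [conv(f) for conv, f in zip(self.lateral, feats)]
        # Top-down path (what FPN already does): upsample deep features and add
        p4 = p4 + F.interpolate(p5, size=p4.shape[-2:], mode="nearest")
        p3 = p3 + F.interpolate(p4, size=p3.shape[-2:], mode="nearest")
        # Bottom-up path (the PANet addition): downsample shallow features and add
        n3 = p3
        n4 = p4 + self.down[0](n3)
        n5 = p5 + self.down[1](n4)
        return n3, n4, n5

feats = (
    torch.randn(1, 256, 52, 52),
    torch.randn(1, 512, 26, 26),
    torch.randn(1, 1024, 13, 13),
)
outs = TinyPAN()(feats)
print([tuple(o.shape) for o in outs])
```

Each output keeps the spatial size of its input scale, but now carries information from both shallower and deeper levels.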
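Mish activation function
YOLOv4 replaces the Leaky ReLU of YOLOv3 with the Mish activation in its backbone, where Mish(x) = x * tanh(softplus(x)). Mish is smooth and non-monotonic, so gradients propagate more smoothly, which improves the model's learning ability and generalization.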
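Mosaic data augmentation
For data augmentation, YOLOv4 introduces Mosaic, which stitches four different training images into a single image. Each training sample then contains objects at varied scales, positions, and backgrounds, which improves generalization. Batch normalization also sees statistics from four images at once, which reduces the need for a large mini-batch.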
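
The stitching step described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the Darknet implementation: the function name `mosaic` is hypothetical, each image is resized into its quadrant with nearest-neighbour sampling rather than cropped, and the bounding-box labels (which would need the same scaling and offset) are left out.

```python
import numpy as np

def mosaic(images, out_size=416, rng=None):
    """Stitch four HxWx3 uint8 images into one out_size x out_size mosaic (hypothetical helper)."""
    assert len(images) == 4
    if rng is None:
        rng = np.random.default_rng()
    # Random split point, kept away from the borders so every quadrant is non-empty
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    # (y0, y1, x0, x1) for top-left, top-right, bottom-left, bottom-right
    regions = [
        (0, cy, 0, cx),
        (0, cy, cx, out_size),
        (cy, out_size, 0, cx),
        (cy, out_size, cx, out_size),
    ]
    for img, (y0, y1, x0, x1) in zip(images, regions):
        h, w = y1 - y0, x1 - x0
        # Nearest-neighbour resize by index sampling (avoids an OpenCV/PIL dependency)
        ys = np.arange(h) * img.shape[0] // h
        xs = np.arange(w) * img.shape[1] // w
        canvas[y0:y1, x0:x1] = img[ys][:, xs]
    return canvas, (cx, cy)

# Four constant-colour images of different sizes stand in for real training images
sizes = [(100, 120), (80, 60), (200, 300), (50, 50)]
imgs = [np.full((h, w, 3), v, dtype=np.uint8) for (h, w), v in zip(sizes, [10, 20, 30, 40])]
canvas, (cx, cy) = mosaic(imgs, out_size=416, rng=np.random.default_rng(0))
print(canvas.shape, (cx, cy))
```

In a real pipeline the same scale and offset applied to each image must also be applied to its box labels.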
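CIoU Loss
YOLOv4 uses CIoU (Complete IoU) loss for bounding-box regression, whereas YOLOv3 regressed box coordinates with a squared-error loss. CIoU accounts for the overlap area, the distance between box centers, and the consistency of aspect ratios, so localization is more precise.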
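
The three terms named above can be written out directly. The following is a minimal pure-Python sketch for a single pair of boxes in (x1, y1, x2, y2) format; the function name `ciou_loss` and the `eps` stabilizer are assumptions for illustration, and a training implementation would be vectorized over tensors.

```python
import math

def ciou_loss(box_pred, box_gt, eps=1e-9):
    """CIoU loss for two boxes given as (x1, y1, x2, y2). Hypothetical helper."""
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Term 1: overlap (plain IoU)
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Term 2: squared center distance over squared diagonal of the smallest enclosing box
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Term 3: aspect-ratio consistency
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v

same = ciou_loss((0, 0, 2, 2), (0, 0, 2, 2))
apart = ciou_loss((0, 0, 1, 1), (2, 2, 3, 3))
print(same, apart)
```

For identical boxes the loss is essentially zero. For the two disjoint boxes the IoU is zero, yet the center-distance term still contributes, so the loss keeps giving a useful signal where a plain IoU loss would be flat.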
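DropBlock Regularization
To reduce overfitting, YOLOv4 adopts DropBlock, a structured form of dropout for convolutional layers. Instead of zeroing individual activations at random, it zeroes contiguous square regions of the feature map, so the network cannot simply recover the dropped information from neighboring positions. This improves generalization.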
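
A rough sketch of the idea, assuming an odd `block_size` and PyTorch: block centers are sampled at random, then max pooling expands each center into a square block. The function name `drop_block` is hypothetical, and unlike the DropBlock paper this simplified version samples centers over the whole feature map rather than only where a full block fits.

```python
import torch
import torch.nn.functional as F

def drop_block(x, drop_prob=0.1, block_size=3, training=True):
    """Zero contiguous block_size x block_size regions of x with shape (N, C, H, W)."""
    if not training or drop_prob == 0.0:
        return x
    n, c, h, w = x.shape
    # Rate for block centers so that roughly drop_prob of all units end up dropped
    gamma = drop_prob / (block_size ** 2) * (h * w) / ((h - block_size + 1) * (w - block_size + 1))
    centers = (torch.rand(n, c, h, w, device=x.device) < gamma).float()
    # Max pooling grows every sampled center into a full block (odd block_size keeps H and W)
    block_mask = F.max_pool2d(centers, kernel_size=block_size, stride=1, padding=block_size // 2)
    keep = 1.0 - block_mask
    # Rescale so the expected activation magnitude stays roughly unchanged
    return x * keep * (keep.numel() / keep.sum().clamp(min=1.0))

torch.manual_seed(0)
x = torch.ones(2, 4, 16, 16)
y_train = drop_block(x, drop_prob=0.3, block_size=3, training=True)
y_eval = drop_block(x, drop_prob=0.3, block_size=3, training=False)
print(y_train.shape, (y_train == 0).float().mean().item())
```

As with ordinary dropout, the mask is applied only during training; at inference the input passes through unchanged.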
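SAM (Spatial Attention Module) and SAT (Self-Adversarial Training)
SAM adds an attention mechanism that lets the model focus on the most informative parts of the feature map. YOLOv4 also introduces Self-Adversarial Training, a two-stage augmentation: the network first alters the input image instead of its own weights, in effect performing an adversarial attack on itself, and is then trained in the usual way to detect objects in the altered image. This makes the model more robust.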
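
For readers unfamiliar with spatial attention, here is a minimal sketch in the style of the original CBAM spatial attention module: channel-wise average and max maps are convolved into a single attention map that rescales every position. The class name `SpatialAttention` and the 7x7 kernel are assumptions for illustration. Note that the YOLOv4 paper describes a modified, point-wise variant of SAM, which this sketch does not reproduce.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention (illustrative; not YOLOv4's modified SAM)."""

    def __init__(self, kernel_size=7):
        super().__init__()
        # Two input maps (channel-wise mean and max) produce one attention map
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.max(dim=1, keepdim=True).values
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        # Every spatial position is scaled by a weight in (0, 1)
        return x * attn

torch.manual_seed(0)
feat = torch.randn(2, 64, 26, 26)
out = SpatialAttention()(feat)
print(out.shape)
```

The output has the same shape as the input; only the relative emphasis of spatial positions changes.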

Core Code

The code below is a simplified PyTorch sketch of the core parts of YOLOv4: the CSPDarknet-53 backbone, a PANet-style neck, and the detection heads.

import torch
import torch.nn as nn

# 1. Mish activation function
class Mish(nn.Module):
    def forward(self, x):
        return x * torch.tanh(nn.functional.softplus(x))

# 2. Convolution block: convolution, batch normalization, and Mish activation
class ConvBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super(ConvBlock, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.activation = Mish()

    def forward(self, x):
        return self.activation(self.bn(self.conv(x)))

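# Residual block used inside each CSP stage: 1x1 conv, 3x3 conv, skip connection
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        self.conv1 = ConvBlock(channels, channels, 1, 1, 0)
        self.conv2 = ConvBlock(channels, channels, 3, 1, 1)

    def forward(self, x):
        return x + self.conv2(self.conv1(x))

# 3. CSP block: downsample, split into two branches, merge
class CSPBlock(nn.Module):
    def __init__(self, in_channels, out_channels, num_blocks):
        super(CSPBlock, self).__init__()
        half_channels = out_channels // 2
        # Stride-2 convolution halves the resolution at the start of every stage,
        # so a 416x416 input yields 104, 52, 26, and 13 after the four stages
        self.downsample = ConvBlock(in_channels, out_channels, 3, 2, 1)
        self.conv1 = ConvBlock(out_channels, half_channels, 1, 1, 0)
        self.conv2 = ConvBlock(out_channels, half_channels, 1, 1, 0)

        self.res_blocks = nn.Sequential(
            *[ResidualBlock(half_channels) for _ in range(num_blocks)]
        )

        self.conv3 = ConvBlock(half_channels * 2, out_channels, 1, 1, 0)

    def forward(self, x):
        x = self.downsample(x)
        x1 = self.res_blocks(self.conv1(x))  # branch that goes through the residual blocks
        x2 = self.conv2(x)                   # branch that bypasses them
        return self.conv3(torch.cat([x1, x2], dim=1))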

# 4. PANet downsampling module
class PANetDownsample(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(PANetDownsample, self).__init__()
        self.conv = ConvBlock(in_channels, out_channels, 3, 2, 1)  # stride 2 halves the spatial resolution

    def forward(self, x):
        return self.conv(x)

# 5. YOLOv4 backbone: CSPDarknet53
class CSPDarknet53(nn.Module):
    def __init__(self):
        super(CSPDarknet53, self).__init__()
        self.conv1 = ConvBlock(3, 32, 3, 1, 1)
        self.conv2 = ConvBlock(32, 64, 3, 2, 1)

        self.csp_block1 = CSPBlock(64, 128, 2)
        self.csp_block2 = CSPBlock(128, 256, 8)
        self.csp_block3 = CSPBlock(256, 512, 8)
        self.csp_block4 = CSPBlock(512, 1024, 4)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.csp_block1(x)
        x_52x52 = self.csp_block2(x)
        x_26x26 = self.csp_block3(x_52x52)
        x_13x13 = self.csp_block4(x_26x26)
        return x_52x52, x_26x26, x_13x13

# 6. YOLOv4 neck and heads: simplified PANet (bottom-up path only)
class PANet(nn.Module):
    def __init__(self, num_classes):
        super(PANet, self).__init__()
        self.num_classes = num_classes

        # Downsampling convolutions for the bottom-up path
        self.downsample_52x52 = PANetDownsample(256, 512)
        self.downsample_26x26 = PANetDownsample(512, 1024)

        # Final prediction layers (one YOLO head per scale)
        self.yolo_head_52x52 = YOLOHead(256, num_classes)
        self.yolo_head_26x26 = YOLOHead(512, num_classes)
        self.yolo_head_13x13 = YOLOHead(1024, num_classes)

    def forward(self, x_52x52, x_26x26, x_13x13):
        x_26x26 = self.downsample_52x52(x_52x52) + x_26x26
        x_13x13 = self.downsample_26x26(x_26x26) + x_13x13

        yolo_output_52x52 = self.yolo_head_52x52(x_52x52)
        yolo_output_26x26 = self.yolo_head_26x26(x_26x26)
        yolo_output_13x13 = self.yolo_head_13x13(x_13x13)

        return [yolo_output_52x52, yolo_output_26x26, yolo_output_13x13]

# 7. Complete YOLOv4 model
class YOLOv4(nn.Module):
    def __init__(self, num_classes):
        super(YOLOv4, self).__init__()
        self.backbone = CSPDarknet53()
        self.panet = PANet(num_classes)

    def forward(self, x):
        x_52x52, x_26x26, x_13x13 = self.backbone(x)
        return self.panet(x_52x52, x_26x26, x_13x13)

# YOLO head definition
class YOLOHead(nn.Module):
    def __init__(self, in_channels, num_classes):
        super(YOLOHead, self).__init__()
        self.conv = ConvBlock(in_channels, in_channels * 2, 3, 1, 1)
        self.pred = nn.Conv2d(in_channels * 2, 3 * (num_classes + 5), 1, 1, 0)

    def forward(self, x):
        x = self.conv(x)
        return self.pred(x)
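
Each head above emits `3 * (num_classes + 5)` channels per grid cell: three anchors, each with four box values, one objectness score, and the class scores. The sketch below shows one way such a raw tensor could be split. The function name `split_head_output` and the anchor-major channel layout are assumptions, and full decoding (grid offsets, anchor sizes, non-maximum suppression) is omitted.

```python
import torch

def split_head_output(pred, num_classes, num_anchors=3):
    """Split a raw head tensor (N, A*(C+5), H, W) into box, objectness, and class parts."""
    n, ch, h, w = pred.shape
    assert ch == num_anchors * (num_classes + 5)
    # Assumed layout: channels grouped per anchor as (x, y, w, h, objectness, classes...)
    pred = pred.view(n, num_anchors, num_classes + 5, h, w).permute(0, 1, 3, 4, 2)
    box = pred[..., :4]                  # raw box regression values
    obj = torch.sigmoid(pred[..., 4])    # objectness probability
    cls = torch.sigmoid(pred[..., 5:])   # independent per-class probabilities
    return box, obj, cls

# A zero tensor stands in for the 13x13 head output with 80 classes
raw = torch.zeros(2, 3 * (80 + 5), 13, 13)
box, obj, cls = split_head_output(raw, num_classes=80)
print(box.shape, obj.shape, cls.shape)
```

With 80 classes, each scale therefore has 255 output channels, which matches the `nn.Conv2d(in_channels * 2, 3 * (num_classes + 5), 1, 1, 0)` layer in `YOLOHead`.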