[YOLOv8 Improvement - Feature Fusion NECK] GFPN from GiraffeDet: a Generalized Feature Pyramid Network for Efficient Multi-Scale Feature Fusion

YOLOv8 Object Detection Innovations and Practical Case Studies Column

Column table of contents: YOLOv8 Effective Improvement Series and Project Practice Directory, covering innovations in convolutions, backbones, attention mechanisms, detection heads, and more, plus practical detection and segmentation project case studies

Column link: YOLOv8 Fundamentals + Innovations + Practical Cases

Introduction


Abstract

In conventional object detection frameworks, a backbone network inherited from image recognition models extracts deep latent features, and a neck module then fuses these features to capture information at different scales. Because the input resolution in object detection is far higher than in image recognition, the backbone usually dominates the total inference cost. This heavy-backbone design paradigm is largely a historical legacy of transferring image recognition models to object detection, rather than an end-to-end design optimized for detection. In this work we show that this paradigm indeed yields sub-optimal detection models. We therefore propose a novel heavy-neck paradigm, GiraffeDet, a giraffe-like network for efficient object detection. GiraffeDet pairs an extremely lightweight backbone with a very deep and large neck, encouraging dense information exchange across different spatial scales and different levels of latent semantics. This design lets the detector process high-level semantic information and low-level spatial information with equal priority even in the early stages of the network, making it more effective for detection tasks. Numerical evaluations on multiple popular object detection benchmarks show that GiraffeDet consistently outperforms previous SOTA models under various resource constraints. The source code is available at https://github.com/jyqi/GiraffeDet.

Article Links

Paper: paper link

Code: code link

Fundamentals

GiraffeDet: Principles and Components

GiraffeDet is an object detection framework designed around a lightweight backbone and a deep, large neck that enables efficient multi-scale information exchange and thereby improves detection performance. Its core components are a lightweight Space-to-Depth chain (S2D-chain) and the Generalized Feature Pyramid Network (GFPN), which together form a "giraffe-shaped" network.

1. Core principles
  • Lightweight backbone
    • GiraffeDet replaces the conventional CNN backbone with a lightweight Space-to-Depth chain (S2D-chain), cutting computational cost and mitigating the domain-shift problem.
    • The S2D-chain consists of two 3x3 convolution layers plus stacked S2D blocks; each block pairs an S2D layer with a 1x1 convolution, downsampling features by moving information from the spatial dimensions into the depth (channel) dimension (see the sketch after this list).
  • Generalized Feature Pyramid Network (GFPN)
    • GFPN fuses features across levels and scales; its "Queen-Fusion" exchanges information along paths shaped like a queen's moves in chess.
    • GFPN also includes skip-layer connections (log2n-link) that carry information from early nodes to later nodes efficiently while reducing redundancy.
2. Components
  • S2D-chain
    • An initial downsampling 3x3 convolution followed by stacked S2D blocks. An S2D block samples features at fixed intervals and regroups them, converting spatial resolution into channel depth.
  • GFPN
    • Built from layers whose depth and width are both tunable. Each layer fuses features of multiple scales and levels through skip-layer and cross-scale connections.
    • Queen-Fusion merges the features of the current level with those of adjacent levels, giving an efficient exchange between high- and low-level information.
  • Prediction network
    • Produces bounding boxes and class labels, using the rich features from GFPN for accurate detection.
3. The GiraffeDet family
  • Model variants
    • By scaling the depth and width of GFPN, GiraffeDet offers models for different compute budgets: Giraffe-D7, D11, D14, D16, D25, and D29.
    • Experiments show that GiraffeDet achieves high accuracy and efficiency at every FLOPs level.
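
As a concrete illustration of the S2D operation referenced above, here is a minimal sketch using F.pixel_unshuffle, which performs an equivalent spatial-to-channel rearrangement (up to channel ordering) to the unfold-based space_to_depth shown later in the core code:

import torch
import torch.nn.functional as F

# Space-to-depth: move 2x2 spatial neighborhoods into the channel dimension.
x = torch.randn(1, 3, 64, 64)
y = F.pixel_unshuffle(x, downscale_factor=2)
print(y.shape)  # torch.Size([1, 12, 32, 32]): resolution halves, channels x4,
                # so the downsampling is lossless, unlike pooling or strided conv.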

GFPN (Generalized Feature Pyramid Network) in Detail

GFPN is the key component of GiraffeDet, designed to fuse multi-scale features efficiently and thereby improve detection performance. By combining skip-layer connections and cross-scale connections, GFPN addresses the limitations of traditional feature pyramid network (FPN) designs and strengthens information exchange between feature levels.

GFPN design highlights
  1. Multi-scale feature fusion
    • GFPN aggregates features of different resolutions extracted by the backbone.
    • It builds on the classic FPN idea but introduces richer connectivity to strengthen information flow.
  2. Evolution of feature pyramid networks
    • FPN: introduces a top-down path to fuse multi-scale features.
    • PANet: adds a bottom-up path on top of FPN, enabling bidirectional information flow.
    • BiFPN: removes nodes that have only one input edge and adds an extra edge from the original input at the same level, further improving connectivity.
    • GFPN: combines skip-layer and cross-scale connections, optimizing information flow both horizontally (across layers) and vertically (across scales).
  3. Skip-layer connections
    • Designed to alleviate the vanishing-gradient problem in deep networks.
    • Two concrete variants: dense-link and log2n-link.
  4. Cross-scale connections
    • Designed to handle large scale variation across objects.
    • Earlier work only connects features of adjacent levels; GFPN proposes a new fusion scheme, Queen-fusion, that also considers features at the same level and at neighboring levels.
GFPN design details
  1. Skip-layer connections
    • dense-link: each layer receives the feature maps of all preceding layers and applies a convolution.
    • log2n-link: each layer receives at most log2(l) + 1 preceding feature maps, which lowers computational complexity while still propagating information effectively (see the sketch after this list).
  2. Queen-fusion
    • Like a queen's path in chess, it fuses the features of the current level with those of adjacent levels.
    • For example, the P5 node fuses the downsampled P4 and the upsampled P6 from the previous layer, the previous layer's P5, and the current layer's P4 (a sketch appears at the end of this section).
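
A minimal sketch of the log2n-link input pattern follows; the exact l - 2^k indexing is my reading of the paper's description and should be treated as an assumption:

def log2n_link_inputs(l):
    """Return the indices of preceding layers feeding layer `l`,
    following the pattern l-1, l-2, l-4, l-8, ..."""
    inputs, step = [], 1
    while l - step >= 0:
        inputs.append(l - step)
        step *= 2
    return inputs

print(log2n_link_inputs(8))  # [7, 6, 4, 0]: 4 = log2(8) + 1 predecessors,
                             # versus all 8 under a dense-link.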
Experimental results and performance
  • Experiments show that GFPN copes well with large scale variation and achieves higher accuracy and efficiency at every FLOPs level.
  • Connectivity analysis shows that the log2n-link transmits information more effectively than the dense-link, while Queen-fusion provides sufficient exchange between high- and low-level features.

GFPN's design gives GiraffeDet excellent detection performance, particularly on objects of widely varying scales. Through skip-layer and cross-scale connections, GFPN fuses information efficiently and improves both accuracy and efficiency.
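
To make the Queen-fusion example above concrete, here is a minimal sketch of a P5 fusion node. The resampling operators (max-pool down, nearest-neighbor up) and the concat-based fusion are assumptions; the paper only fixes which neighbors are fused, not the exact operators:

import torch
import torch.nn.functional as F

def queen_fusion_p5(p4_prev, p5_prev, p6_prev, p4_curr, fuse):
    # Previous layer's P4, downsampled to P5 resolution.
    down_p4_prev = F.max_pool2d(p4_prev, kernel_size=2, stride=2)
    # Previous layer's P6, upsampled to P5 resolution.
    up_p6_prev = F.interpolate(p6_prev, scale_factor=2, mode='nearest')
    # Current layer's P4, downsampled to P5 resolution.
    down_p4_curr = F.max_pool2d(p4_curr, kernel_size=2, stride=2)
    # `fuse` is any channel-matching module, e.g. concat followed by a 1x1 conv.
    return fuse(torch.cat([down_p4_prev, up_p6_prev, p5_prev, down_p4_curr], dim=1))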

Core Code

import logging
import warnings

import torch
import torch.nn as nn
from mmcv.cnn import ConvModule, constant_init, kaiming_init
from mmcv.cnn.bricks.activation import build_activation_layer
from mmcv.cnn.bricks.norm import build_norm_layer
from mmcv.runner import BaseModule, load_checkpoint
from mmcv.runner.fp16_utils import auto_fp16
from torch.nn.modules.batchnorm import _BatchNorm

from ..builder import BACKBONES


def space_to_depth(x, block_size):
    n, c, h, w = x.size()
    unfolded_x = torch.nn.functional.unfold(x, block_size, stride=block_size)
    return unfolded_x.view(n, c * block_size ** 2, h // block_size, w // block_size)
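# Example (illustrative shapes): with block_size=2, a (1, 3, 64, 64) input
# becomes (1, 12, 32, 32); spatial detail moves into channels, so this
# downsampling is lossless, unlike strided convolution or pooling.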


class Conv(ConvModule):
    # Standard convolution
    def __init__(self,
                 in_channels,
                 out_channels,
                 kernel_size=1,
                 stride=1,
                 padding=None,
                 groups=1,
                 norm_cfg=dict(type='BN'),
                 act_cfg=dict(type='Mish'),
                 **kwargs):
        super(Conv, self).__init__(
            in_channels,
            out_channels,
            kernel_size=kernel_size,
            stride=stride,
            padding=kernel_size // 2 if padding is None else padding,
            groups=groups,
            norm_cfg=norm_cfg,
            act_cfg=act_cfg)


class SimpleFocus(nn.Module):
    # Focus wh information into c-space
    def __init__(self,
                 in_channels,
                 out_channels,
                 b,
                 kernel_size=1,
                 stride=1,
                 groups=1,
                 init_cfg=None,
                 **kwargs):
        super(SimpleFocus, self).__init__()
        padding = kernel_size // 2
        self.b = b
        self.conv = Conv(in_channels, out_channels, kernel_size, stride, padding, groups, **kwargs)

    def forward(self, x):  # x(b, c, h, w) -> y(b, c*b^2, h/b, w/b)
        x = space_to_depth(x, self.b)
        return self.conv(x)


class Bottleneck(BaseModule):

    def __init__(self,
                 in_channels,
                 out_channels,
                 shortcut=True,
                 groups=1,
                 expansion=0.5,
                 init_cfg=None,
                 **kwargs):
        super(Bottleneck, self).__init__(init_cfg)
        hidden_channels = int(out_channels * expansion)  # hidden channels
        self.conv1 = Conv(
            in_channels, hidden_channels, kernel_size=1, **kwargs)
        self.conv2 = Conv(
            hidden_channels,
            out_channels,
            kernel_size=3,
            groups=groups,
            **kwargs)
        self.shortcut = shortcut and in_channels == out_channels

    def forward(self, x):
        if self.shortcut:
            return x + self.conv2(self.conv1(x))
        else:
            return self.conv2(self.conv1(x))


class BottleneckCSP(BaseModule):
    # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self,
                 in_channels,
                 out_channels,
                 repetition=1,
                 shortcut=True,
                 groups=1,
                 expansion=0.5,
                 csp_act_cfg=dict(type='Mish'),
                 init_cfg=None,
                 **kwargs):
        super(BottleneckCSP, self).__init__(init_cfg)
        hidden_channels = int(out_channels * expansion)  # hidden channels
        self.conv1 = Conv(
            in_channels, hidden_channels, kernel_size=1, **kwargs)
        self.conv2 = nn.Conv2d(in_channels, hidden_channels, 1, 1, bias=False)
        self.conv3 = nn.Conv2d(
            hidden_channels, hidden_channels, 1, 1, bias=False)
        self.conv4 = Conv(
            2 * hidden_channels, out_channels, kernel_size=1, **kwargs)
        csp_norm_cfg = kwargs.get('norm_cfg', dict(type='BN')).copy()
        self.bn = build_norm_layer(csp_norm_cfg, 2 * hidden_channels)[-1]
        csp_act_cfg_ = csp_act_cfg.copy()
        if csp_act_cfg_['type'] not in [
                'Tanh', 'PReLU', 'Sigmoid', 'HSigmoid', 'Swish'
        ]:
            csp_act_cfg_.setdefault('inplace', True)
        self.csp_act = build_activation_layer(csp_act_cfg_)
        self.bottlenecks = nn.Sequential(*[
            Bottleneck(
                hidden_channels,
                hidden_channels,
                shortcut,
                groups,
                expansion=1.0,
                **kwargs) for _ in range(repetition)
        ])

    def forward(self, x):
        y1 = self.conv3(self.bottlenecks(self.conv1(x)))
        y2 = self.conv2(x)
        return self.conv4(self.csp_act(self.bn(torch.cat((y1, y2), dim=1))))


class BottleneckCSP2(BaseModule):
    # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self,
                 in_channels,
                 out_channels,
                 repetition=1,
                 shortcut=False,
                 groups=1,
                 csp_act_cfg=dict(type='Mish'),
                 init_cfg=None,
                 **kwargs):
        super(BottleneckCSP2, self).__init__(init_cfg)
        hidden_channels = int(out_channels)  # hidden channels
        self.conv1 = Conv(
            in_channels, hidden_channels, kernel_size=1, **kwargs)
        self.conv2 = nn.Conv2d(
            hidden_channels, hidden_channels, 1, 1, bias=False)
        self.conv3 = Conv(
            2 * hidden_channels, out_channels, kernel_size=1, **kwargs)
        csp_norm_cfg = kwargs.get('norm_cfg', dict(type='BN')).copy()
        self.bn = build_norm_layer(csp_norm_cfg, 2 * hidden_channels)[-1]
        csp_act_cfg_ = csp_act_cfg.copy()
        if csp_act_cfg_['type'] not in [
                'Tanh', 'PReLU', 'Sigmoid', 'HSigmoid', 'Swish'
        ]:
            csp_act_cfg_.setdefault('inplace', True)
        self.csp_act = build_activation_layer(csp_act_cfg_)
        self.bottlenecks = nn.Sequential(*[
            Bottleneck(
                hidden_channels,
                hidden_channels,
                shortcut,
                groups,
                expansion=1.0,
                **kwargs) for _ in range(repetition)
        ])

    def forward(self, x):
        x1 = self.conv1(x)
        y1 = self.bottlenecks(x1)
        y2 = self.conv2(x1)
        return self.conv3(self.csp_act(self.bn(torch.cat((y1, y2), dim=1))))


class SPPV5(BaseModule):
    # Spatial pyramid pooling layer used in YOLOv3-SPP
    def __init__(self,
                 in_channels,
                 out_channels,
                 pooling_kernel_size=(5, 9, 13),
                 init_cfg=None,
                 **kwargs):
        super(SPPV5, self).__init__(init_cfg)
        hidden_channels = in_channels // 2  # hidden channels
        self.conv1 = Conv(
            in_channels, hidden_channels, kernel_size=1, **kwargs)
        self.conv2 = Conv(
            hidden_channels * (len(pooling_kernel_size) + 1),
            out_channels,
            kernel_size=1,
            **kwargs)
        self.maxpools = nn.ModuleList([
            nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2)
            for x in pooling_kernel_size
        ])

    def forward(self, x):
        x = self.conv1(x)
        return self.conv2(
            torch.cat([x] + [maxpool(x) for maxpool in self.maxpools], 1))


class SPPV4(BaseModule):
    # CSP SPP https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self,
                 in_channels,
                 out_channels,
                 expansion=0.5,
                 pooling_kernel_size=(5, 9, 13),
                 csp_act_cfg=dict(type='Mish'),
                 init_cfg=None,
                 **kwargs):
        super(SPPV4, self).__init__(init_cfg)
        hidden_channels = int(2 * out_channels * expansion)  # hidden channels
        self.conv1 = Conv(
            in_channels, hidden_channels, kernel_size=1, **kwargs)
        self.conv2 = nn.Conv2d(in_channels, hidden_channels, 1, 1, bias=False)
        self.conv3 = Conv(
            hidden_channels, hidden_channels, kernel_size=3, **kwargs)
        self.conv4 = Conv(
            hidden_channels, hidden_channels, kernel_size=1, **kwargs)
        self.maxpools = nn.ModuleList([
            nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2)
            for x in pooling_kernel_size
        ])
        self.conv5 = Conv(
            4 * hidden_channels, hidden_channels, kernel_size=1, **kwargs)
        self.conv6 = Conv(
            hidden_channels, hidden_channels, kernel_size=3, **kwargs)
        csp_norm_cfg = kwargs.get('norm_cfg', dict(type='BN')).copy()
        self.bn = build_norm_layer(csp_norm_cfg, 2 * hidden_channels)[-1]
        csp_act_cfg_ = csp_act_cfg.copy()
        if csp_act_cfg_['type'] not in [
                'Tanh', 'PReLU', 'Sigmoid', 'HSigmoid', 'Swish'
        ]:
            csp_act_cfg_.setdefault('inplace', True)
        self.csp_act = build_activation_layer(csp_act_cfg_)
        self.conv7 = Conv(
            2 * hidden_channels, out_channels, kernel_size=1, **kwargs)

    def forward(self, x):
        x1 = self.conv4(self.conv3(self.conv1(x)))
        y1 = self.conv6(
            self.conv5(
                torch.cat([x1] + [maxpool(x1) for maxpool in self.maxpools],
                          1)))
        y2 = self.conv2(x)
        return self.conv7(self.csp_act(self.bn(torch.cat((y1, y2), dim=1))))


class Focus(BaseModule):
    # Focus wh information into c-space
    # Implement with ordinary Conv2d with
    # doubled kernel/padding size & stride 2
    def __init__(self,
                 in_channels,
                 out_channels,
                 kernel_size=1,
                 stride=1,
                 groups=1,
                 init_cfg=None,
                 **kwargs):
        super(Focus, self).__init__(init_cfg)
        padding = kernel_size // 2
        kernel_size *= 2
        padding *= 2
        stride *= 2
        self.conv = Conv(
            in_channels,
            out_channels,
            kernel_size=kernel_size,
            stride=stride,
            padding=padding,
            groups=groups,
            **kwargs)

    def forward(self, x):
        return self.conv(x)


class CSPStage(BaseModule):

    def __init__(self,
                 in_channels,
                 out_channels,
                 repetition,
                 init_cfg=None,
                 **kwargs):
        super(CSPStage, self).__init__(init_cfg)
        self.conv_downscale = Conv(
            in_channels, out_channels, kernel_size=3, stride=2, **kwargs)
        self.conv_csp = BottleneckCSP(out_channels, out_channels, repetition,
                                      **kwargs)

    def forward(self, x):
        return self.conv_csp(self.conv_downscale(x))


class SPPV5Stage(BaseModule):

    def __init__(self,
                 in_channels,
                 out_channels,
                 repetition,
                 init_cfg=None,
                 **kwargs):
        super(SPPV5Stage, self).__init__(init_cfg)
        self.conv_downscale = Conv(
            in_channels, out_channels, kernel_size=3, stride=2, **kwargs)
        self.spp = SPPV5(
            out_channels, out_channels, pooling_kernel_size=(5, 9, 13))
        # self.conv_csp = BottleneckCSP(out_channels, out_channels, repetition,
        #                               **kwargs)

    def forward(self, x):
        # return self.conv_csp(self.spp(self.conv_downscale(x)))
        return self.spp(self.conv_downscale(x))


class SPPV4Stage(BaseModule):

    def __init__(self,
                 in_channels,
                 out_channels,
                 repetition,
                 init_cfg=None,
                 **kwargs):
        super(SPPV4Stage, self).__init__(init_cfg)
        self.conv_downscale = Conv(
            in_channels, out_channels * 2, kernel_size=3, stride=2, **kwargs)
        self.conv_csp = BottleneckCSP(out_channels * 2, out_channels * 2,
                                      repetition, **kwargs)
        self.spp = SPPV4(
            out_channels * 2, out_channels, pooling_kernel_size=(5, 9, 13))

    def forward(self, x):
        return self.spp(self.conv_csp(self.conv_downscale(x)))


class BottleneckStage(BaseModule):

    def __init__(self,
                 in_channels,
                 out_channels,
                 repetition,
                 init_cfg=None,
                 **kwargs):
        super(BottleneckStage, self).__init__(init_cfg)
        self.conv_downscale = Conv(
            in_channels, out_channels, kernel_size=3, stride=2, **kwargs)
        # Note: Bottleneck takes no repeat count; this third positional
        # argument lands on its `shortcut` parameter (kept as in the
        # reference implementation).
        self.conv_bottleneck = Bottleneck(out_channels, out_channels,
                                          repetition, **kwargs)

    def forward(self, x):
        return self.conv_bottleneck(self.conv_downscale(x))


@BACKBONES.register_module()
class DarknetCSP(BaseModule):
    """Darknet backbone.

    Args:
        scale (str | list): Architecture key in ``arch_settings``
            (e.g. 'v5s5p'), or a custom [stage, repetition, channels] triple.
        out_indices (Sequence[int]): Output from which stages.
        frozen_stages (int): Stages to be frozen (stop grad and set eval mode).
            -1 means not freezing any parameters. Default: -1.
        conv_cfg (dict): Config dict for convolution layer. Default: None.
        norm_cfg (dict): Dictionary to construct and config norm layer.
            Default: dict(type='BN', requires_grad=True)
        act_cfg (dict): Config dict for activation layer.
            Default: dict(type='Mish').
        norm_eval (bool): Whether to set norm layers to eval mode, namely,
            freeze running stats (mean and var). Note: Effect on Batch Norm
            and its variants only.
    """

    arch_settings = {
        'v4s5p': [['conv', 'bottleneck', 'csp', 'csp', 'csp', 'sppv4'],
                  [None, 1, 1, 3, 3, 1], [16, 32, 64, 128, 256, 256]],
        'v4m5p': [['conv', 'bottleneck', 'csp', 'csp', 'csp', 'sppv4'],
                  [None, 1, 1, 5, 5, 3], [24, 48, 96, 192, 384, 384]],
        'v4l5p': [['conv', 'bottleneck', 'csp', 'csp', 'csp', 'sppv4'],
                  [None, 1, 2, 8, 8, 4], [32, 64, 128, 256, 512, 512]],
        'v4x5p': [['conv', 'bottleneck', 'csp', 'csp', 'csp', 'sppv4'],
                  [None, 1, 3, 11, 11, 5], [40, 80, 160, 320, 640, 640]],
        'v4l6p': [['conv', 'csp', 'csp', 'csp', 'csp', 'csp', 'sppv4'],
                  [None, 1, 3, 15, 15, 7, 7],
                  [32, 64, 128, 256, 512, 1024, 512]],
        'v4x7p': [['conv', 'csp', 'csp', 'csp', 'csp', 'csp', 'csp', 'sppv4'],
                  [None, 1, 3, 15, 15, 7, 7, 7],
                  [40, 80, 160, 320, 640, 1280, 1280, 640]],
        'v5s5p': [['focus', 'csp', 'csp', 'csp', 'sppv5'], [None, 1, 3, 3, 1],
                  [32, 64, 128, 256, 512]],
        's2dcsp': [['focus', 'csp', 'csp', 'csp', 'sppv5'], [None, 1, 1, 1, 1],
                  [32, 64, 128, 256, 512]],
        'v5m5p': [['focus', 'csp', 'csp', 'csp', 'sppv5'], [None, 2, 6, 6, 2],
                  [48, 96, 192, 384, 768]],
        'v5l5p': [['focus', 'csp', 'csp', 'csp', 'sppv5'], [None, 3, 9, 9, 3],
                  [64, 128, 256, 512, 1024]],
        'v5x5p': [['focus', 'csp', 'csp', 'csp', 'sppv5'],
                  [None, 4, 12, 12, 4], [80, 160, 320, 640, 1280]],
    }

    def __init__(self,
                 scale='v5x5p',  # must be a key of arch_settings
                 out_indices=(3, 4, 5),
                 frozen_stages=-1,
                 norm_cfg=dict(
                     type='BN', requires_grad=True, eps=0.001, momentum=0.03),
                 act_cfg=dict(type='Mish'),
                 csp_act_cfg=dict(type='Mish'),
                 norm_eval=False,
                 pretrained=None,
                 init_cfg=None):
        super(DarknetCSP, self).__init__(init_cfg)

        if isinstance(scale, str):
            if scale not in self.arch_settings:
                raise KeyError(f'invalid scale {scale} for DarknetCSP')
            stage, repetition, channels = self.arch_settings[scale]
        else:
            stage, repetition, channels = scale

        self.out_indices = out_indices
        self.frozen_stages = frozen_stages

        cfg = dict(
            norm_cfg=norm_cfg,
            act_cfg=act_cfg,
            csp_act_cfg=csp_act_cfg,
            init_cfg=init_cfg)

        self.layers = []
        cin = 3
        for i, (stg, rep, cout) in enumerate(zip(stage, repetition, channels)):
            layer_name = f'{stg}{i}'
            self.layers.append(layer_name)
            if stg == 'conv':
                self.add_module(layer_name, Conv(cin, cout, 3, **cfg))
            elif stg == 'bottleneck':
                self.add_module(layer_name,
                                BottleneckStage(cin, cout, rep, **cfg))
            elif stg == 'csp':
                self.add_module(layer_name, CSPStage(cin, cout, rep, **cfg))
            elif stg == 'focus':
                self.add_module(layer_name, Focus(cin, cout, 3, **cfg))
            elif stg == 'sppv4':
                self.add_module(layer_name, SPPV4Stage(cin, cout, rep, **cfg))
            elif stg == 'sppv5':
                self.add_module(layer_name, SPPV5Stage(cin, cout, rep, **cfg))
            else:
                raise NotImplementedError
            cin = cout

        self.norm_eval = norm_eval

        self.fp16_enabled = False

        assert not (init_cfg and pretrained), \
            'init_cfg and pretrained cannot be set at the same time'
        if isinstance(pretrained, str):
            warnings.warn('DeprecationWarning: pretrained is deprecated, '
                          'please use "init_cfg" instead')
            self.init_cfg = dict(type='Pretrained', checkpoint=pretrained)
        elif pretrained is None:
            if init_cfg is None:
                self.init_cfg = [
                    dict(type='Kaiming', layer='Conv2d'),
                    dict(
                        type='Constant',
                        val=1,
                        layer=['_BatchNorm', 'GroupNorm'])
                ]
        else:
            raise TypeError('pretrained must be a str or None')

    @auto_fp16()
    def forward(self, x):
        outs = []
        for i, layer_name in enumerate(self.layers):
            layer = getattr(self, layer_name)
            x = layer(x)
            if i in self.out_indices:
                outs.append(x)

        return outs

    def init_weights(self, pretrained=None):
        if isinstance(pretrained, str):
            logger = logging.getLogger()
            load_checkpoint(self, pretrained, strict=False, logger=logger)
        elif pretrained is None:
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    kaiming_init(m)
                elif isinstance(m, (_BatchNorm, nn.GroupNorm)):
                    constant_init(m, 1)


    def _freeze_stages(self):
        if self.frozen_stages >= 0:
            for i in range(0, self.frozen_stages):
                m = getattr(self, self.layers[i])
                m.eval()
                for param in m.parameters():
                    param.requires_grad = False

    def train(self, mode=True):
        super(DarknetCSP, self).train(mode)
        self._freeze_stages()
        if mode and self.norm_eval:
            for m in self.modules():
                if isinstance(m, _BatchNorm):
                    m.eval()

Download the YOLOv8 Code

Direct download

GitHub repository


Git Clone

git clone https://github.com/ultralytics/ultralytics

Set Up the Environment

Enter the repository root and install the dependencies.


pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

In the latest versions, the requirements.txt file has been deprecated: all necessary code and dependencies are folded into the ultralytics package, so installing that single library provides all required functionality and environment dependencies.

pip install ultralytics

Adding the Code

Under ultralytics/nn/ in the repository root, create a new featureFusion directory, then create a .py file named GFPN inside it and copy the code below into it.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
 
 
def conv_bn(in_channels, out_channels, kernel_size, stride, padding, groups=1):
    '''Basic cell for rep-style block, including conv and bn'''
    result = nn.Sequential()
    result.add_module(
        'conv',
        nn.Conv2d(in_channels=in_channels,
                  out_channels=out_channels,
                  kernel_size=kernel_size,
                  stride=stride,
                  padding=padding,
                  groups=groups,
                  bias=False))
    result.add_module('bn', nn.BatchNorm2d(num_features=out_channels))
    return result
 
 
class RepConv(nn.Module):
    '''RepConv is a basic rep-style block, including training and deploy status
    Code is based on https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py
    '''
 
    def __init__(self,
                 in_channels,
                 out_channels,
                 kernel_size=3,
                 stride=1,
                 padding=1,
                 dilation=1,
                 groups=1,
                 padding_mode='zeros',
                 deploy=False,
                 act='relu',
                 norm=None):
        super(RepConv, self).__init__()
        self.deploy = deploy
        self.groups = groups
        self.in_channels = in_channels
        self.out_channels = out_channels
 
        assert kernel_size == 3
        assert padding == 1
 
        padding_11 = padding - kernel_size // 2
 
        if isinstance(act, str):
            self.nonlinearity = get_activation(act)
        else:
            self.nonlinearity = act
 
        if deploy:
            self.rbr_reparam = nn.Conv2d(in_channels=in_channels,
                                         out_channels=out_channels,
                                         kernel_size=kernel_size,
                                         stride=stride,
                                         padding=padding,
                                         dilation=dilation,
                                         groups=groups,
                                         bias=True,
                                         padding_mode=padding_mode)
 
        else:
            self.rbr_identity = None
            self.rbr_dense = conv_bn(in_channels=in_channels,
                                     out_channels=out_channels,
                                     kernel_size=kernel_size,
                                     stride=stride,
                                     padding=padding,
                                     groups=groups)
            self.rbr_1x1 = conv_bn(in_channels=in_channels,
                                   out_channels=out_channels,
                                   kernel_size=1,
                                   stride=stride,
                                   padding=padding_11,
                                   groups=groups)
 
    def forward(self, inputs):
        '''Forward process'''
        if hasattr(self, 'rbr_reparam'):
            return self.nonlinearity(self.rbr_reparam(inputs))
 
        if self.rbr_identity is None:
            id_out = 0
        else:
            id_out = self.rbr_identity(inputs)
 
        return self.nonlinearity(
            self.rbr_dense(inputs) + self.rbr_1x1(inputs) + id_out)
 
    def get_equivalent_kernel_bias(self):
        kernel3x3, bias3x3 = self._fuse_bn_tensor(self.rbr_dense)
        kernel1x1, bias1x1 = self._fuse_bn_tensor(self.rbr_1x1)
        kernelid, biasid = self._fuse_bn_tensor(self.rbr_identity)
        return kernel3x3 + self._pad_1x1_to_3x3_tensor(
            kernel1x1) + kernelid, bias3x3 + bias1x1 + biasid
 
    def _pad_1x1_to_3x3_tensor(self, kernel1x1):
        if kernel1x1 is None:
            return 0
        else:
            return torch.nn.functional.pad(kernel1x1, [1, 1, 1, 1])
 
    def _fuse_bn_tensor(self, branch):
        if branch is None:
            return 0, 0
        if isinstance(branch, nn.Sequential):
            kernel = branch.conv.weight
            running_mean = branch.bn.running_mean
            running_var = branch.bn.running_var
            gamma = branch.bn.weight
            beta = branch.bn.bias
            eps = branch.bn.eps
        else:
            assert isinstance(branch, nn.BatchNorm2d)
            if not hasattr(self, 'id_tensor'):
                input_dim = self.in_channels // self.groups
                kernel_value = np.zeros((self.in_channels, input_dim, 3, 3),
                                        dtype=np.float32)
                for i in range(self.in_channels):
                    kernel_value[i, i % input_dim, 1, 1] = 1
                self.id_tensor = torch.from_numpy(kernel_value).to(
                    branch.weight.device)
            kernel = self.id_tensor
            running_mean = branch.running_mean
            running_var = branch.running_var
            gamma = branch.weight
            beta = branch.bias
            eps = branch.eps
        std = (running_var + eps).sqrt()
        t = (gamma / std).reshape(-1, 1, 1, 1)
        return kernel * t, beta - running_mean * gamma / std
 
    def switch_to_deploy(self):
        if hasattr(self, 'rbr_reparam'):
            return
        kernel, bias = self.get_equivalent_kernel_bias()
        self.rbr_reparam = nn.Conv2d(
            in_channels=self.rbr_dense.conv.in_channels,
            out_channels=self.rbr_dense.conv.out_channels,
            kernel_size=self.rbr_dense.conv.kernel_size,
            stride=self.rbr_dense.conv.stride,
            padding=self.rbr_dense.conv.padding,
            dilation=self.rbr_dense.conv.dilation,
            groups=self.rbr_dense.conv.groups,
            bias=True)
        self.rbr_reparam.weight.data = kernel
        self.rbr_reparam.bias.data = bias
        for para in self.parameters():
            para.detach_()
        self.__delattr__('rbr_dense')
        self.__delattr__('rbr_1x1')
        if hasattr(self, 'rbr_identity'):
            self.__delattr__('rbr_identity')
        if hasattr(self, 'id_tensor'):
            self.__delattr__('id_tensor')
        self.deploy = True
 
 
class Swish(nn.Module):
    def __init__(self, inplace=True):
        super(Swish, self).__init__()
        self.inplace = inplace  # kept for API compatibility; see note below

    def forward(self, x):
        # x * sigmoid(x). The original in-place variant, x.mul_(F.sigmoid(x)),
        # overwrites a tensor that autograd still needs for the backward pass
        # (and F.sigmoid is deprecated), so the out-of-place form is used.
        return x * torch.sigmoid(x)
 
 
def get_activation(name='silu', inplace=True):
    if name is None:
        return nn.Identity()
 
    if isinstance(name, str):
        if name == 'silu':
            module = nn.SiLU(inplace=inplace)
        elif name == 'relu':
            module = nn.ReLU(inplace=inplace)
        elif name == 'lrelu':
            module = nn.LeakyReLU(0.1, inplace=inplace)
        elif name == 'swish':
            module = Swish(inplace=inplace)
        elif name == 'hardsigmoid':
            module = nn.Hardsigmoid(inplace=inplace)
        elif name == 'identity':
            module = nn.Identity()
        else:
            raise AttributeError('Unsupported act type: {}'.format(name))
        return module
 
    elif isinstance(name, nn.Module):
        return name
 
    else:
        raise AttributeError('Unsupported act type: {}'.format(name))
 
 
def get_norm(name, out_channels, inplace=True):
    if name == 'bn':
        module = nn.BatchNorm2d(out_channels)
    else:
        raise NotImplementedError
    return module
 
 
class ConvBNAct(nn.Module):
    """A Conv2d -> Batchnorm -> silu/leaky relu block"""
 
    def __init__(
            self,
            in_channels,
            out_channels,
            ksize,
            stride=1,
            groups=1,
            bias=False,
            act='silu',
            norm='bn',
            reparam=False,
    ):
        super().__init__()
        # same padding
        pad = (ksize - 1) // 2
        self.conv = nn.Conv2d(
            in_channels,
            out_channels,
            kernel_size=ksize,
            stride=stride,
            padding=pad,
            groups=groups,
            bias=bias,
        )
        if norm is not None:
            self.bn = get_norm(norm, out_channels, inplace=True)
        if act is not None:
            self.act = get_activation(act, inplace=True)
        self.with_norm = norm is not None
        self.with_act = act is not None
 
    def forward(self, x):
        x = self.conv(x)
        if self.with_norm:
            x = self.bn(x)
        if self.with_act:
            x = self.act(x)
        return x
 
    def fuseforward(self, x):
        return self.act(self.conv(x))
 
 
class BasicBlock_3x3_Reverse(nn.Module):
    def __init__(self,
                 ch_in,
                 ch_hidden_ratio,
                 ch_out,
                 act='relu',
                 shortcut=True):
        super(BasicBlock_3x3_Reverse, self).__init__()
        assert ch_in == ch_out
        ch_hidden = int(ch_in * ch_hidden_ratio)
        self.conv1 = ConvBNAct(ch_hidden, ch_out, 3, stride=1, act=act)
        self.conv2 = RepConv(ch_in, ch_hidden, 3, stride=1, act=act)
        self.shortcut = shortcut
 
    def forward(self, x):
        y = self.conv2(x)
        y = self.conv1(y)
        if self.shortcut:
            return x + y
        else:
            return y
 
 
class SPP(nn.Module):
    def __init__(
            self,
            ch_in,
            ch_out,
            k,
            pool_size,
            act='swish',
    ):
        super(SPP, self).__init__()
        self.pool = []
        for i, size in enumerate(pool_size):
            pool = nn.MaxPool2d(kernel_size=size,
                                stride=1,
                                padding=size // 2,
                                ceil_mode=False)
            self.add_module('pool{}'.format(i), pool)
            self.pool.append(pool)
        self.conv = ConvBNAct(ch_in, ch_out, k, act=act)
 
    def forward(self, x):
        outs = [x]
 
        for pool in self.pool:
            outs.append(pool(x))
        y = torch.cat(outs, dim=1)
 
        y = self.conv(y)
        return y
 
 
class CSPStage(nn.Module):
    def __init__(self,
                 ch_in,
                 ch_out,
                 n,
                 block_fn='BasicBlock_3x3_Reverse',
                 ch_hidden_ratio=1.0,
                 act='silu',
                 spp=False):
        super(CSPStage, self).__init__()
 
        split_ratio = 2
        ch_first = int(ch_out // split_ratio)
        ch_mid = int(ch_out - ch_first)
        self.conv1 = ConvBNAct(ch_in, ch_first, 1, act=act)
        self.conv2 = ConvBNAct(ch_in, ch_mid, 1, act=act)
        self.convs = nn.Sequential()
 
        next_ch_in = ch_mid
        for i in range(n):
            if block_fn == 'BasicBlock_3x3_Reverse':
                self.convs.add_module(
                    str(i),
                    BasicBlock_3x3_Reverse(next_ch_in,
                                           ch_hidden_ratio,
                                           ch_mid,
                                           act=act,
                                           shortcut=True))
            else:
                raise NotImplementedError
            if i == (n - 1) // 2 and spp:
                self.convs.add_module(
                    'spp', SPP(ch_mid * 4, ch_mid, 1, [5, 9, 13], act=act))
            next_ch_in = ch_mid
        self.conv3 = ConvBNAct(ch_mid * n + ch_first, ch_out, 1, act=act)
 
    def forward(self, x):
        y1 = self.conv1(x)
        y2 = self.conv2(x)
 
        mid_out = [y1]
        for conv in self.convs:
            y2 = conv(y2)
            mid_out.append(y2)
        y = torch.cat(mid_out, dim=1)
        y = self.conv3(y)
        return y
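
Before wiring CSPStage into YOLOv8, a standalone shape check helps confirm the block behaves as expected (illustrative numbers matching the yaml lines below: 512 input channels after a Concat, 256 output channels, 3 repeats):

import torch

m = CSPStage(512, 256, 3)
y = m(torch.randn(1, 512, 40, 40))
print(y.shape)  # torch.Size([1, 256, 40, 40])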

Registration

In ultralytics/nn/tasks.py, make the following changes:

Step 1:

from ultralytics.nn.featureFusion.GFPN import CSPStage

Step 2:

Modify def parse_model(d, ch, verbose=True): add CSPStage to the module tuple below (the other custom entries, such as EVCBlock, CloFormerAttnConv, and C2f_iAFF, come from this column's other improvements and are not required for GFPN):

        if m in (
            Classify,
            Conv,
            ConvTranspose,
            GhostConv,
            Bottleneck,
            GhostBottleneck,
            SPP,
            SPPF,
            DWConv,
            Focus,
            BottleneckCSP,
            C1,
            C2,
            C2f,
            C3,
            C3TR,
            C3Ghost,
            nn.ConvTranspose2d,
            DWConvTranspose2d,
            C3x,
            RepC3,
            EVCBlock,
            CloFormerAttnConv,
            C2f_iAFF,
            CSPStage,
           
        ):
            c1, c2 = ch[f], args[0]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(min(c2, max_channels) * width, 8)

            args = [c1, c2, *args[1:]]
            if m in (BottleneckCSP, C1, C2, C2f, C3, C3TR, C3Ghost, C3x, RepC3, C2f_deformable_LKA,Sea_AttentionBlock, C2f_iAFF,CSPStage):
                args.insert(2, n)  # number of repeats
                n = 1
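
For reference, here is roughly what these two branches do for a yaml entry such as [-1, 3, CSPStage, [512]] at scale s (width multiple 0.50); the numbers are illustrative, not actual parse_model output:

# c1 = ch[-1]                              # input channels from the previous layer
# c2 = make_divisible(min(512, max_channels) * 0.50, 8)  # -> 256 output channels
# args = [c1, 256]                         # args = [c1, c2, *args[1:]]
# args.insert(2, n); n = 1                 # repeats moved into the module args
# module = CSPStage(c1, 256, 3)            # repeats handled inside CSPStage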
                


Configure yolov8_GFPN.yaml (variant 1: stock YOLOv8 PAN head with CSPStage)

ultralytics/ultralytics/cfg/models/v8/yolov8_GFPN.yaml

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect
 
# Parameters
nc: 2 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024] # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768] # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512] # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512] # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs
 
# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
 
# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, CSPStage, [512]] # 12
 
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 3, CSPStage, [256]] # 15 (P3/8-small)
 
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]] # cat head P4
  - [-1, 3, CSPStage, [512]] # 18 (P4/16-medium)
 
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]] # cat head P5
  - [-1, 3, CSPStage, [1024]] # 21 (P5/32-large)
 
  - [[15, 18, 21], 1, Detect, [nc]] # Detect(P3, P4, P5)
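
This first variant keeps the stock YOLOv8 PAN wiring and only swaps the head's C2f fusion blocks for CSPStage. A quick way to confirm the yaml parses (assuming the registration steps above are in place and the path matches your checkout):

from ultralytics import YOLO

model = YOLO('ultralytics/cfg/models/v8/yolov8_GFPN.yaml')
model.info()  # should list CSPStage modules at layers 12, 15, 18 and 21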

Configure yolov8_GFPN.yaml (variant 2: DAMO-YOLO-style GFPN head)

 
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect
 
# Parameters
nc: 2  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs
 
# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9
 
# DAMO-YOLO GFPN Head
head:
  - [-1, 1, Conv, [512, 1, 1]] # 10
  - [6, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]]
  - [-1, 3, CSPStage, [512]] # 13
 
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']] #14
  - [4, 1, Conv, [256, 3, 2]] # 15
  - [[14, -1, 6], 1, Concat, [1]]
  - [-1, 3, CSPStage, [512]] # 17
 
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]
  - [-1, 3, CSPStage, [256]] # 20
 
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 17], 1, Concat, [1]]
  - [-1, 3, CSPStage, [512]] # 23
 
  - [17, 1, Conv, [256, 3, 2]] # 24
  - [23, 1, Conv, [256, 3, 2]] # 25
  - [[13, 24, -1], 1, Concat, [1]]
  - [-1, 3, CSPStage, [1024]] # 27
 
  - [[20, 23, 27], 1, Detect, [nc]]  # Detect(P3, P4, P5)

Configure yolov8_GFPN.yaml (variant 3: GFPN head with a four-scale Detect)

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect
 
# Parameters
nc: 80  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs
 
# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9
 
# DAMO-YOLO GFPN Head
head:
  - [-1, 1, Conv, [512, 1, 1]] # 10
  - [6, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]]
  - [-1, 3, CSPStage, [512]] # 13
 
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']] #14
  - [4, 1, Conv, [256, 3, 2]] # 15
  - [[14, -1, 6], 1, Concat, [1]]
  - [-1, 3, CSPStage, [512]] # 17
 
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]
  - [-1, 3, CSPStage, [256]] # 20
 
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 17], 1, Concat, [1]]
  - [-1, 3, CSPStage, [512]] # 23
 
  - [17, 1, Conv, [256, 3, 2]] # 24
  - [23, 1, Conv, [256, 3, 2]] # 25
  - [[13, 24, -1], 1, Concat, [1]]
  - [-1, 3, CSPStage, [1024]] # 27
 
  - [[17, 20, 23, 27], 1, Detect, [nc]]  # Detect over four feature maps (layers 17, 20, 23, 27)

Experiments

Script

from ultralytics import YOLO

# Path to the GFPN model configuration defined above
yaml = 'ultralytics/cfg/models/v8/yolov8_GFPN.yaml'

if __name__ == "__main__":
    # Initialize the YOLO model with the specified YAML file
    # (built inside the main guard so DataLoader workers do not re-run it)
    model = YOLO(yaml)

    # Print model information
    model.info()

    # Train the model with the specified parameters
    results = model.train(data='ultralytics/datasets/original-license-plates.yaml',
                          name='GFPN',
                          epochs=10,
                          workers=8,
                          batch=1)

Results

(Figure: training results screenshot.)
