YOLOv8 Lightweight Improvement: the Coordinate Attention Mechanism

Architectural improvements to YOLOv8 keep appearing. In this post I introduce the Coordinate Attention (CA) mechanism into YOLOv8 to improve detection performance on small objects.

For downloading and installing yolov8, see my earlier post:

A YOLOv8 training tutorial based on the SeaShips dataset (CSDN blog)

First, find the reference code for the CA attention module in the official repository:

Official repository: GitHub - houqb/CoordAttention: Code for our CVPR2021 paper coordinate attention

Next, modify the source files of the downloaded yolov8 code:

First is the modules.py file (in this version of the code it lives at ultralytics/nn/modules.py):

Add the CA attention class definition, together with the h_sigmoid/h_swish helper activations it depends on, at the bottom of the file:

import torch
import torch.nn as nn


class h_sigmoid(nn.Module):
    # hard sigmoid: ReLU6(x + 3) / 6, used by h_swish below
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)

    def forward(self, x):
        return self.relu(x + 3) / 6


class h_swish(nn.Module):
    # hard swish activation used in the original CoordAttention code
    def __init__(self, inplace=True):
        super(h_swish, self).__init__()
        self.sigmoid = h_sigmoid(inplace=inplace)

    def forward(self, x):
        return x * self.sigmoid(x)


class CA(nn.Module):
    # Coordinate Attention: factorizes attention into two 1D encodings along H and W
    def __init__(self, inp, oup, reduction=32):
        super(CA, self).__init__()
        torch.use_deterministic_algorithms(True, warn_only=True)  # only warn (do not error) on non-deterministic ops
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool along W -> (n, c, h, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool along H -> (n, c, 1, w)
        mip = max(8, inp // reduction)  # reduced channel width of the shared 1x1 conv

        self.conv1 = nn.Conv2d(inp, mip, kernel_size=1, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(mip)
        self.act = h_swish()

        self.conv_h = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0)
        self.conv_w = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        identity = x

        n, c, h, w = x.size()
        x_h = self.pool_h(x)                      # (n, c, h, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)  # (n, c, w, 1)

        # shared 1x1 conv + BN + activation over the concatenated directional features
        y = torch.cat([x_h, x_w], dim=2)
        y = self.conv1(y)
        y = self.bn1(y)
        y = self.act(y)

        # split back into the H branch and the W branch
        x_h, x_w = torch.split(y, [h, w], dim=2)
        x_w = x_w.permute(0, 1, 3, 2)

        a_h = self.conv_h(x_h).sigmoid()  # attention weights along height
        a_w = self.conv_w(x_w).sigmoid()  # attention weights along width

        out = identity * a_w * a_h

        return out

That completes the change to modules.py.
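
Before touching the rest of the framework, a quick sanity check that the new block runs is worthwhile. A minimal sketch (paste it temporarily below the definitions above, or run it in a shell where they are importable): push a random feature map through CA and confirm the output shape matches the input.

import torch

# quick shape check for the CA block defined above
x = torch.randn(1, 128, 160, 160)   # (batch, channels, H, W), e.g. a P2-level feature map
ca = CA(inp=128, oup=128)           # inp must equal oup, otherwise identity * a_w * a_h fails
out = ca(x)
print(out.shape)                    # expected: torch.Size([1, 128, 160, 160])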

Next, open tasks.py (ultralytics/nn/tasks.py) and register the CA module in the import block at the top of the file, using the following code:

from ultralytics.nn.modules import (C1, C2, C3, C3TR, SPP, SPPF, Bottleneck, BottleneckCSP, C2f, C3Ghost, C3x, Classify,
                                    Concat, Conv, ConvTranspose, Detect, DWConv, DWConvTranspose2d, Ensemble, Focus,
                                    GhostBottleneck, GhostConv, Segment, CA)

Then find the parse_model function in the same file (a global Ctrl+F search is the quickest way) and locate this line:

if m in (Classify, Conv, ConvTranspose, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, Focus,
                 BottleneckCSP, C1, C2, C2f, C3, C3TR, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x):

Simply append CA after C3x at the end of that tuple.
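
For reference, after the edit the condition should read roughly as follows (the module list varies a little between ultralytics versions, so edit your own line rather than copying this one verbatim):

if m in (Classify, Conv, ConvTranspose, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, Focus,
                 BottleneckCSP, C1, C2, C2f, C3, C3TR, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x, CA):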

Finally, open your yaml model configuration file:

Modify the backbone and the detection head, inserting the CA module wherever you need it. Keep the input and output channels of adjacent layers consistent, and remember that every inserted layer shifts the indices of all later layers, so the from fields in the head must be updated accordingly (a quick verification sketch follows the config). Here is my configuration file for reference:

# Parameters
nc: 6  # number of classes
depth_multiple: 1.00  # scales module repeats
width_multiple: 1.00  # scales convolution channels

# YOLOv8.0l backbone-only CA
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]   # 0-P1/2 [3,64,320,320]
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4 [64,128,160,160]
  - [-1, 3, C2f, [128, True]]   # 2      [128,128,160,160]
  - [-1, 1, CA, [128]]           # 3      [128,128,160,160]
  - [-1, 1, Conv, [256, 3, 2]]  # 4-P3/8 [128,256,80,80]
  - [-1, 6, C2f, [256, True]]   # 5      [256,256,80,80]
  - [-1, 1, CA, [256]]           # 6      [256,256,80,80]
  - [-1, 1, Conv, [512, 3, 2]]  # 7-P4/16 [256,512,40,40]
  - [-1, 6, C2f, [512, True]]   # 8       [512,512,40,40]
  - [-1, 1, CA, [512]]           # 9       [512,512,40,40]
  - [-1, 1, Conv, [512, 3, 2]]  # 10-P5/32 [512,512,20,20]
  - [-1, 3, C2f, [512, True]]   # 11       [512,512,20,20]
  - [-1, 1, CA, [512]]           # 12       [512,512,20,20]
  - [-1, 1, SPPF, [512, 5]]     # 13      [512,512,20,20]

# YOLOv8.0l head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]  # 14 [512,512,40,40]
  - [[-1, 8], 1, Concat, [1]]                   # 15 concat [512,512,40,40] + [512,512,40,40] → [1024,1024,40,40]
  - [-1, 3, C2f, [512]]                        # 16 [1024,512,40,40] → [512,512,40,40]
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]  # 17 [512,512,80,80]
  - [[-1, 5], 1, Concat, [1]]                   # 18 concat [512,512,80,80] + [256,256,80,80] → [768,768,80,80]
  - [-1, 3, C2f, [256]]                        # 19 [768,256,80,80]
  - [-1, 1, Conv, [256, 3, 2]]                  # 20 [256,256,40,40]
  - [[-1, 16], 1, Concat, [1]]                  # 21 concat [256,256,40,40] + [512,512,40,40] → [768,768,40,40]
  - [-1, 3, C2f, [512]]                        # 22 [768,512,40,40]
  - [-1, 1, Conv, [512, 3, 2]]                  # 23 [512,512,20,20]
  - [[-1, 13], 1, Concat, [1]]                  # 24 concat [512,512,20,20] + [512,512,20,20] → [1024,1024,20,20]
  - [-1, 3, C2f, [512]]                        # 25 [1024,512,20,20]
  - [[19, 22, 25], 1, Detect, [nc]]            # 26 Detect(P3, P4, P5)
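
Because the four CA layers shift every later index, it is worth building the model from this yaml once and checking that nothing complains before starting a long training run. A minimal sketch, assuming the modified ultralytics source is the installed one and the config above is saved as yolov8l-ca.yaml (a filename I chose for this post):

from ultralytics import YOLO
import torch

# build the network from the custom yaml; a wrong channel count or a stale layer
# index in the head will surface either here or in the dummy forward pass below
model = YOLO("yolov8l-ca.yaml")
model.info()                                  # layers / parameters / GFLOPs summary
_ = model.model(torch.zeros(1, 3, 640, 640))  # dummy forward pass as a shape check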

That completes all the modifications; now run the training command and check the results.
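
For reference, a minimal training call with the Python API; the dataset yaml and the hyperparameters below are placeholders, so substitute your own (for example the SeaShips data yaml from the post linked above):

from ultralytics import YOLO

model = YOLO("yolov8l-ca.yaml")   # the CA-modified config shown above
model.train(
    data="seaships.yaml",         # placeholder: your dataset yaml with the 6 classes
    epochs=100,
    imgsz=640,
    batch=16,
)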
