YOLOv8 Improvement: Introducing the MCA Attention Mechanism

1 Create an MCA.py file in the ultralytics/nn/attention directory

import torch
from torch import nn
import math

__all__ = ['MCALayer', 'MCAGate']


class StdPool(nn.Module):
    def __init__(self):
        super(StdPool, self).__init__()

    def forward(self, x):
        b, c, _, _ = x.size()

        std = x.view(b, c, -1).std(dim=2, keepdim=True)
        std = std.reshape(b, c, 1, 1)

        return std


class MCAGate(nn.Module):
    def __init__(self, k_size, pool_types=['avg', 'std']):
        """Constructs a MCAGate module.
        Args:
            k_size: kernel size
            pool_types: pooling type. 'avg': average pooling, 'max': max pooling, 'std': standard deviation pooling.
        """
        super(MCAGate, self).__init__()

        self.pools = nn.ModuleList([])
        for pool_type in pool_types:
            if pool_type == 'avg':
                self.pools.append(nn.AdaptiveAvgPool2d(1))
            elif pool_type == 'max':
                self.pools.append(nn.AdaptiveMaxPool2d(1))
            elif pool_type == 'std':
                self.pools.append(StdPool())
            else:
                raise NotImplementedError

        self.conv = nn.Conv2d(1, 1, kernel_size=(1, k_size), stride=1, padding=(0, (k_size - 1) // 2), bias=False)
        self.sigmoid = nn.Sigmoid()

        self.weight = nn.Parameter(torch.rand(2))

    def forward(self, x):
        feats = [pool(x) for pool in self.pools]

        if len(feats) == 1:
            out = feats[0]
        elif len(feats) == 2:
            # fuse the two pooled descriptors with learnable weights on top of their plain average
            weight = torch.sigmoid(self.weight)
            out = 1 / 2 * (feats[0] + feats[1]) + weight[0] * feats[0] + weight[1] * feats[1]
        else:
            assert False, "Feature Extraction Exception!"

        # move the channel axis to the last position, run the 1-D conv across it, then move it back
        out = out.permute(0, 3, 2, 1).contiguous()
        out = self.conv(out)
        out = out.permute(0, 3, 2, 1).contiguous()

        out = self.sigmoid(out)
        out = out.expand_as(x)

        return x * out


class MCALayer(nn.Module):
    def __init__(self, inp, no_spatial=False):
        """Constructs a MCA module.
        Args:
            inp: Number of channels of the input feature maps
            no_spatial: whether to build channel dimension interactions
        """
        super(MCALayer, self).__init__()

        # adaptive kernel size for the channel branch: derived from the number of
        # input channels and forced to be odd
        lambd = 1.5
        gamma = 1
        temp = round(abs((math.log2(inp) - gamma) / lambd))
        kernel = temp if temp % 2 else temp - 1

        self.h_cw = MCAGate(3)
        self.w_hc = MCAGate(3)
        self.no_spatial = no_spatial
        if not no_spatial:
            self.c_hw = MCAGate(kernel)

    def forward(self, x):
        # height branch: swap C and H so the gate attends along the height dimension
        x_h = x.permute(0, 2, 1, 3).contiguous()
        x_h = self.h_cw(x_h)
        x_h = x_h.permute(0, 2, 1, 3).contiguous()

        # width branch: swap C and W so the gate attends along the width dimension
        x_w = x.permute(0, 3, 2, 1).contiguous()
        x_w = self.w_hc(x_w)
        x_w = x_w.permute(0, 3, 2, 1).contiguous()

        if not self.no_spatial:
            # channel branch: gate applied directly on the (H, W) plane, then average all three branches
            x_c = self.c_hw(x)
            x_out = 1 / 3 * (x_c + x_h + x_w)
        else:
            x_out = 1 / 2 * (x_h + x_w)

        return x_out
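
The module can be sanity-checked on its own; the channel count of 64 and the tensor size below are arbitrary, and the output should keep the input shape:

if __name__ == '__main__':
    x = torch.rand(2, 64, 32, 32)
    mca = MCALayer(64)
    print(mca(x).shape)  # expected: torch.Size([2, 64, 32, 32])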

2 Modify the ultralytics/nn/tasks.py file

Add the following code to the parse_model function (inside its if/elif chain that matches module types):

        elif m in {MCALayer, MCAGate}:
            args = [ch[f], *args]
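
For MCALayer and MCAGate to resolve inside parse_model, they also have to be imported at the top of tasks.py. Assuming the file location from step 1 (and that ultralytics/nn/attention is importable as a package, e.g. it contains an __init__.py), an import along these lines should work:

from ultralytics.nn.attention.MCA import MCALayer, MCAGate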

3 Modify the YAML model config to introduce the MCA attention mechanism in the head
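
As a sketch only (adapted from the default yolov8.yaml; the insertion point and layer indices are assumptions and must be adapted to your own config), an MCALayer can be placed after one of the head's C2f blocks. Because inserting a layer shifts every following layer index, the 'from' indices of later layers such as Detect have to be updated accordingly:

head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]  # 10
  - [[-1, 6], 1, Concat, [1]]                   # 11 cat backbone P4
  - [-1, 3, C2f, [512]]                         # 12

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]  # 13
  - [[-1, 4], 1, Concat, [1]]                   # 14 cat backbone P3
  - [-1, 3, C2f, [256]]                         # 15 (P3/8-small)
  - [-1, 1, MCALayer, []]                       # 16 MCA attention on the P3 feature map

  - [-1, 1, Conv, [256, 3, 2]]                  # 17
  - [[-1, 12], 1, Concat, [1]]                  # 18 cat head P4
  - [-1, 3, C2f, [512]]                         # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]                  # 20
  - [[-1, 9], 1, Concat, [1]]                   # 21 cat head P5
  - [-1, 3, C2f, [1024]]                        # 22 (P5/32-large)

  - [[16, 19, 22], 1, Detect, [nc]]             # Detect(P3, P4, P5)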
