每日Attention学习6——Context Aggregation Module

模块出处

[link] [code] [IJCAI 22] Boundary-Guided Camouflaged Object Detection


模块名称

Context Aggregation Module (CAM)


模块作用

增大感受野,全局特征提取


模块结构

在这里插入图片描述


模块代码
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvBNR(nn.Module):
    def __init__(self, inplanes, planes, kernel_size=3, stride=1, dilation=1, bias=False):
        super(ConvBNR, self).__init__()

        self.block = nn.Sequential(
            nn.Conv2d(inplanes, planes, kernel_size, stride=stride, padding=dilation, dilation=dilation, bias=bias),
            nn.BatchNorm2d(planes),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        return self.block(x)


class Conv1x1(nn.Module):
    def __init__(self, inplanes, planes):
        super(Conv1x1, self).__init__()
        self.conv = nn.Conv2d(inplanes, planes, 1)
        self.bn = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        x = self.relu(x)

        return x
    

class CAM(nn.Module):
    def __init__(self, hchannel, channel):
        super(CAM, self).__init__()
        self.conv1_1 = Conv1x1(hchannel + channel, channel)
        self.conv3_1 = ConvBNR(channel // 4, channel // 4, 3)
        self.dconv5_1 = ConvBNR(channel // 4, channel // 4, 3, dilation=2)
        self.dconv7_1 = ConvBNR(channel // 4, channel // 4, 3, dilation=3)
        self.dconv9_1 = ConvBNR(channel // 4, channel // 4, 3, dilation=4)
        self.conv1_2 = Conv1x1(channel, channel)
        self.conv3_3 = ConvBNR(channel, channel, 3)

    def forward(self, lf, hf):
        if lf.size()[2:] != hf.size()[2:]:
            hf = F.interpolate(hf, size=lf.size()[2:], mode='bilinear', align_corners=False)
        x = torch.cat((lf, hf), dim=1)
        x = self.conv1_1(x)
        xc = torch.chunk(x, 4, dim=1)
        x0 = self.conv3_1(xc[0] + xc[1])
        x1 = self.dconv5_1(xc[1] + x0 + xc[2])
        x2 = self.dconv7_1(xc[2] + x1 + xc[3])
        x3 = self.dconv9_1(xc[3] + x2)
        xx = self.conv1_2(torch.cat((x0, x1, x2, x3), dim=1))
        x = self.conv3_3(x + xx)

        return x

    
if __name__ == '__main__':
    x1 = torch.randn([3, 256, 16, 16])
    x2 = torch.randn([3, 512, 8, 8])
    cam = CAM(hchannel=512, channel=256)
    out = cam(x1, x2)
    print(out.shape)  # 3, 256, 16, 16

原文表述

为了将多层次的融合特征整合到伪装物体预测中,我们设计了一个上下文聚合模块(CAM)来挖掘上下文语义,以增强物体检测,如图5所示。不同于BBSNet中的全局上下文模块不考虑各分支之间的语义关联,CAM考虑到跨尺度交互作用以增强特征表示。

YOLOv5中的context aggregation是通过CONTAINER(CONText AggregatIon NEtwoRK)模块来实现的。这个模块是一种用于多头上下文集成的广义构建模块,它可以通过神经网络集成空间上下文信息。\[2\]这种方法可以提升检测算法的性能,而且相对于使用超级复杂的算法来获得较高的检测精度,工业界更喜欢使用这种方法。\[1\]在YOLOv5中,context aggregation的实现方式是通过YOLOV7 head,它采用了pafpn的结构。具体来说,它首先对backbone最后输出的32倍降采样特征图C5进行处理,然后经过SPPCSP将通道数从1024减少到512。接着按照top down和C4、C3进行融合,得到P3、P4和P5,然后再按照bottom-up的方式与P4、P5进行融合。不同的是,YOLOv5中将CSP模块换成了ELAN-H模块,并且下采样变为了MP2层。\[3\]通过这种context aggregation的方式,YOLOv5可以更好地集成空间上下文信息,从而提升检测性能。 #### 引用[.reference_title] - *1* *3* [YOLO系列详解:YOLOv1、YOLOv2、YOLOv3、YOLOv4、YOLOv5、YOLOv6](https://blog.csdn.net/qq_40716944/article/details/114822515)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] - *2* [Yolov5涨点神器:注意力机制---多头上下文集成(Context Aggregation)的广义构建模块,助力小目标检测,...](https://blog.csdn.net/m0_63774211/article/details/130865001)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] [ .reference_list ]
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值