Attention Mechanisms: SGE Attention

果粒橙_LGC

Tags: deep learning, neural networks, artificial intelligence

Article link: https://blog.csdn.net/qq_38915354/article/details/130552516

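The paper proposes a module called Spatial Group-wise Enhance (SGE), which aims to improve semantic feature learning in convolutional neural networks (CNNs). SGE generates attention factors that adjust the importance of sub-features, strengthening the representation of each group and suppressing potential noise. The module is lightweight: the attention factors are guided only by feature similarity within each group, so it needs very few extra parameters.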

Paper

Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks

Paper link

Paper: Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks

Model architecture

[Figure: structure of the SGE module (image not available)]

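Main content of the paper

Convolutional neural networks (CNNs) build feature representations of complex objects by collecting semantic sub-features at different levels and for different parts. These sub-features are usually distributed in grouped form across the feature vector of each layer, and represent various semantic entities. However, the activations of these sub-features are often spatially affected by similar patterns and noisy backgrounds, which leads to incorrect localization and recognition. The paper proposes a Spatial Group-wise Enhance (SGE) module that adjusts the importance of each sub-feature by generating an attention factor for every spatial location in every semantic group, so that each group can autonomously enhance its learned representation and suppress possible noise. The attention factors are guided only by the similarity between the global and local feature descriptors within each group, so the SGE module is very lightweight and adds almost no extra parameters or computation.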
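
The computation described above can be written compactly for a single group. The notation here is chosen for illustration and is an assumption, not taken from the post: $x_i$ is the sub-feature vector of the group at spatial position $i$, $m = h \times w$ is the number of positions, $\epsilon$ is a small constant for numerical stability, and $\gamma$, $\beta$ are one learnable scale and shift per group (they correspond to `weight` and `bias` in the code that follows).

```latex
\begin{aligned}
g &= \frac{1}{m}\sum_{i=1}^{m} x_i
  && \text{(global descriptor: spatial average)} \\
c_i &= g \cdot x_i
  && \text{(similarity between global and local descriptor)} \\
\hat{c}_i &= \frac{c_i - \mu_c}{\sigma_c + \epsilon},
  \quad \mu_c = \frac{1}{m}\sum_{j=1}^{m} c_j
  && \text{(normalization over positions)} \\
a_i &= \gamma\,\hat{c}_i + \beta
  && \text{(per-group scale and shift)} \\
\hat{x}_i &= x_i \cdot \operatorname{sigmoid}(a_i)
  && \text{(gated output)}
\end{aligned}
```

Here $\sigma_c$ is the standard deviation of the $c_j$ over the $m$ positions. The code in this post computes it with `Tensor.std`, which applies Bessel's correction by default. Since only $\gamma$ and $\beta$ are learned, the module adds two parameters per group.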

import torch
from torch import nn
from torch.nn import init



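class SpatialGroupEnhance(nn.Module):
    """Spatial Group-wise Enhance (SGE) attention over channel groups."""

    def __init__(self, groups):
        super().__init__()
        # Number of groups; the channel count of the input must be divisible by it
        self.groups=groups
        # Global average pooling gives one descriptor per group
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # One learnable scale and shift per group (2 * groups parameters in total)
        self.weight=nn.Parameter(torch.zeros(1,groups,1,1))
        self.bias=nn.Parameter(torch.zeros(1,groups,1,1))
        self.sig=nn.Sigmoid()
        self.init_weights()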


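    def init_weights(self):
        # Generic initializer. This module contains no Conv2d, BatchNorm2d
        # or Linear submodules, so none of the branches below match here.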
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                init.constant_(m.weight, 1)
                init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                init.normal_(m.weight, std=0.001)
                if m.bias is not None:
                    init.constant_(m.bias, 0)

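    def forward(self, x):
        b, c, h, w = x.shape
        # Split the channels into groups: (b*g, c//g, h, w)
        x = x.view(b * self.groups, -1, h, w)
        # Multiply each position by the group's global average-pooled descriptor
        xn = x * self.avg_pool(x)  # (b*g, c//g, h, w)
        # Sum over channels: dot product of local and global descriptors
        xn = xn.sum(dim=1, keepdim=True)  # (b*g, 1, h, w)
        t = xn.view(b * self.groups, -1)  # (b*g, h*w)

        # Normalize the similarity map over the spatial positions
        t = t - t.mean(dim=1, keepdim=True)  # (b*g, h*w)
        std = t.std(dim=1, keepdim=True) + 1e-5
        t = t / std  # (b*g, h*w)
        t = t.view(b, self.groups, h, w)  # (b, g, h, w)

        # Per-group learnable scale and shift
        t = t * self.weight + self.bias  # (b, g, h, w)
        t = t.view(b * self.groups, 1, h, w)  # (b*g, 1, h, w)
        # Gate the features with the sigmoid attention map
        x = x * self.sig(t)
        x = x.view(b, c, h, w)

        return x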


if __name__ == '__main__':
    # Renamed from `input` to avoid shadowing the Python builtin
    x = torch.randn(50, 512, 7, 7)
    sge = SpatialGroupEnhance(groups=8)
    output = sge(x)
    print(output.shape)  # torch.Size([50, 512, 7, 7])
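
As a small sanity check on the listing above, the sketch below recomputes the normalized similarity map in a standalone function and looks at what the gate does at initialization. It is not part of the original post: the helper name `sge_normalized_map` and the tensor sizes are hypothetical, chosen only for illustration. It assumes the same zero initialization of `weight` and `bias` as the class above.

```python
import torch


def sge_normalized_map(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Hypothetical helper: normalized similarity map before scale and shift.

    Returns a tensor of shape (b, groups, h, w).
    """
    b, c, h, w = x.shape
    xg = x.reshape(b * groups, c // groups, h, w)
    # Global descriptor of each group: spatial average, shape (b*g, c//g, 1, 1)
    g = xg.mean(dim=(2, 3), keepdim=True)
    # Dot product between the global descriptor and each local descriptor
    sim = (xg * g).sum(dim=1)  # (b*g, h, w)
    t = sim.reshape(b * groups, -1)
    # Normalize over spatial positions, as in the forward pass above
    t = t - t.mean(dim=1, keepdim=True)
    t = t / (t.std(dim=1, keepdim=True) + 1e-5)
    return t.reshape(b, groups, h, w)


torch.manual_seed(0)
groups = 4
x = torch.randn(2, 16, 7, 7)
t = sge_normalized_map(x, groups)

# Zero-initialized scale and shift, mirroring the class above
weight = torch.zeros(1, groups, 1, 1)
bias = torch.zeros(1, groups, 1, 1)
gate = torch.sigmoid(t * weight + bias)  # sigmoid(0) = 0.5 at every position

# Apply the gate group by group and restore the original layout
out = (x.reshape(2 * groups, -1, 7, 7) * gate.reshape(2 * groups, 1, 7, 7)).reshape(x.shape)

print(torch.allclose(out, 0.5 * x))    # True: the input is simply halved
print(weight.numel() + bias.numel())   # 8, i.e. 2 * groups extra parameters
```

With both parameters at zero the attention map is a constant 0.5, so the module only starts to discriminate between positions once training moves `weight` away from zero.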