PyTorch implementations of SENet, CBAM, and dual pooling attention mechanisms

Latest recommended article published on 2024-08-01 22:26:03

eilot_c

Article link: https://blog.csdn.net/eilot_c/article/details/106858204

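I had originally written up the SENet attention mechanism myself, but while preparing code for other attention mechanisms I came across an article that summarizes them very well. I am reposting it here for my own reference, with my own understanding added.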

[TOC]

1. Implementing channel-wise weighting in SENet

The implementation follows senet.pytorch.
[Figure: senet]
The code is as follows:
SENet module

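from torch import nn


class SELayer(nn.Module):
    def __init__(self, channel, reduction=16):
        super(SELayer, self).__init__()
        # Squeeze: global average pooling down to 1x1 per channel
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: bottleneck MLP that outputs one weight in (0, 1) per channel
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)    # (b, c, 1, 1) -> (b, c)
        y = self.fc(y).view(b, c, 1, 1)    # (b, c) -> (b, c, 1, 1)
        return x * y.expand_as(x)          # rescale every channel of x by its weight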
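As a quick sanity check of the module above, the following sketch runs it on a random tensor. The layer is repeated only so that the snippet runs on its own; the input size (4, 64, 8, 8) and the seed are arbitrary assumptions, not values from the original article.

```python
import torch
from torch import nn


class SELayer(nn.Module):
    # Same layer as above, repeated only so this snippet is self-contained.
    def __init__(self, channel, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        return x * y.expand_as(x)


torch.manual_seed(0)
x = torch.randn(4, 64, 8, 8)            # assumed feature map: (batch, channel, H, W)
se = SELayer(channel=64, reduction=16)  # hidden width is 64 // 16 = 4

with torch.no_grad():
    out = se(x)

# Only the two bias-free Linear layers hold parameters: 64*4 + 4*64.
num_params = sum(p.numel() for p in se.parameters())

print(out.shape)    # same shape as the input
print(num_params)
```

The output keeps the input shape, and because every channel weight comes out of a sigmoid, no element of the output can be larger in magnitude than the corresponding input element.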

[Figure: senet2]
APIs used in the code above:

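AdaptiveAvgPool2d: adaptive average pooling. With output size (n, m), a feature map of spatial size (h, w) is pooled down to (n, m); with a single integer n, it is pooled to (n, n).
Sequential: a PyTorch container that holds layers and applies them in order.
Linear: linear (fully connected) layer with arguments (in, out); it maps in input features to out output features.
ReLU: activation layer; inplace=True performs the operation in place, which saves memory.
Sigmoid: activation layer; squashes its input into the range (0, 1).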
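The behaviour of these layers can be checked directly. The snippet below is a small sketch with arbitrarily chosen sizes (batch 2, 8 channels, 5x7 spatial, reduction 4), none of which come from the original article.

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.randn(2, 8, 5, 7)  # (batch, channel, height, width), sizes are arbitrary

# AdaptiveAvgPool2d takes the desired *output* spatial size.
pool_nm = nn.AdaptiveAvgPool2d((3, 4))  # tuple (n, m) -> output is n x m
pool_n = nn.AdaptiveAvgPool2d(1)        # single int n -> output is n x n
shape_nm = tuple(pool_nm(x).shape)
shape_n = tuple(pool_n(x).shape)

# Pooling down to 1x1 is just the mean over the two spatial dimensions.
same_as_mean = torch.allclose(
    pool_n(x), x.mean(dim=(2, 3), keepdim=True), atol=1e-6
)

# Sequential chains the layers; Linear changes the size of the last dimension.
fc = nn.Sequential(
    nn.Linear(8, 8 // 4, bias=False),   # 8 features -> 2 features
    nn.ReLU(inplace=True),
    nn.Linear(8 // 4, 8, bias=False),   # 2 features -> 8 features
    nn.Sigmoid(),                       # one weight per channel, between 0 and 1
)
with torch.no_grad():
    w = fc(pool_n(x).view(2, 8))

print(shape_nm, shape_n, same_as_mean, tuple(w.shape))
```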
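Walking through forward to see how the model is built:
x is the input feature map. Its dimensions are usually (batch size, channel, height, width), since PyTorch uses NCHW order. Here the batch size (b) and the channel count (c) are read from it.

After AdaptiveAvgPool2d(1), x has shape (batch size, channel, 1, 1); view(b, c) then reshapes it to (b, c).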

In [1]: import torch
In [2]:  x = torch.zeros((16,256,256,256))
In [3]:  import torch.nn as nn
In [4]: avg_pool = nn.AdaptiveAvgPool2d(1)
In [5]: avg_pool(x).shape
Out[5]: torch.Size([16, 256, 1, 1])
In [6]: avg_pool(x).view((16,256)).shape
Out[6]: torch.Size([16, 256])
In [7]: avg_pool(x).squeeze().shape # squeeze() also works: it removes every dimension of size 1
Out[7]: torch.Size([16, 256])
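Two details of this step are easy to miss, and the sketch below illustrates them with made-up sizes. First, squeeze() and view(b, c) only agree when the batch size is greater than 1. Second, the final expand_as(x) in forward gives the same result as plain broadcasting. The tensor y here is a random stand-in for the sigmoid output, not the output of a trained layer.

```python
import torch

torch.manual_seed(0)
b, c, h, w = 2, 4, 3, 3
x = torch.randn(b, c, h, w)

# 1) With a batch of one, squeeze() also drops the batch dimension.
pooled_single = torch.zeros(1, c, 1, 1)
squeezed_shape = tuple(pooled_single.squeeze().shape)    # batch dim is lost
viewed_shape = tuple(pooled_single.view(1, c).shape)     # batch dim is kept

# 2) Channel weights: a random stand-in for the sigmoid output, shape (b, c).
y = torch.rand(b, c)
y4 = y.view(b, c, 1, 1)

out_expand = x * y4.expand_as(x)   # explicit expansion, as in SELayer.forward
out_broadcast = x * y4             # implicit broadcasting over H and W

print(squeezed_shape, viewed_shape)
print(torch.equal(out_expand, out_broadcast))
```

Each (h, w) slice of the result is the corresponding slice of x multiplied by a single scalar, which is what channel-wise weighting means here.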