A Deep Dive into the Lightweight Modules SENet, cSE, sSE, scSE, and SKNet

The SE module was proposed by the Chinese autonomous-driving company Momenta (魔门塔); SENet won the ILSVRC 2017 image classification challenge, and the paper, "Squeeze-and-Excitation Networks", appeared at CVPR 2018.

cSE, sSE, and scSE are three variants of the SE module, proposed at MICCAI 2018 in the paper "Concurrent Spatial and Channel 'Squeeze & Excitation' in Fully Convolutional Networks".

SKNet is another variant of SENet, proposed at CVPR 2019 in the paper "Selective Kernel Networks".

1. First, let's understand what SENet is

Fig. 1. The Squeeze-and-Excitation module

Fig. 2. Computation flow of the SE-Inception and SE-ResNet modules

As Figure 1 shows, SENet explicitly models the interdependencies between feature channels, introducing an attention mechanism across channels. It consists of two operations: Squeeze and Excitation. Given an input feature map U of size (C, W, H), the Squeeze step (Figure 2) first applies GP (global pooling) along the spatial dimensions, shrinking each channel's W×H map down to 1×1, so the result can be read as C real numbers. The Excitation step follows: a fully connected (FC) layer typically reduces the dimension to 1/16 of the input, then a ReLU and a second FC layer restore the original dimension. Finally, a Sigmoid layer normalizes the weights into the range (0, 1), and a scale step multiplies each channel of the original features by its weight. That completes one pass of the SE module.
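To make the data flow concrete, here is a minimal PyTorch sketch of the SE block (the class name SEBlock and the layer layout are my own rendering of the description above, not the authors' code; assume the channel count is divisible by the reduction ratio):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: channel attention from GAP + a two-layer FC bottleneck."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)      # Squeeze: (B, C, H, W) -> (B, C, 1, 1)
        self.fc = nn.Sequential(                     # Excitation: reduce, then restore
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.avg_pool(x).view(b, c)    # squeeze away the spatial dimensions
        w = self.fc(w).view(b, c, 1, 1)    # one weight per channel
        return x * w                       # scale: reweight each channel of U
```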


2. The cSE, sSE, and scSE modules

Fig. 3. The cSE, sSE, and scSE modules

cSE module:

The cSE module is essentially identical to the SE module. The only difference is the reduction ratio in the FC bottleneck: it uses r = 2 rather than the r = 16 of the SENet paper. The authors settled on these values of the hyperparameter r (2 here, 16 in SENet) through extensive experiments, balancing model accuracy against computational cost.
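In code, the only change relative to the SEBlock sketch above is the reduction argument:

```python
# cSE: the SE block with reduction ratio r = 2 instead of 16
cse = SEBlock(channels=64, reduction=2)
out = cse(torch.randn(1, 64, 32, 32))   # output shape equals the input shape
```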

sSE module:

The sSE module introduces attention from the other direction: the spatial dimension. First, a 1×1 convolution reduces the C channels to one, and a Sigmoid activation produces a feature map of size 1×H×W. The features are then recalibrated: each spatial position of the original U is multiplied by the corresponding weight to obtain U^.
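A matching sketch for the spatial branch (again my own naming, not reference code):

```python
class SSEBlock(nn.Module):
    """Spatial SE: a 1x1 conv collapses C channels into one spatial attention map."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        q = torch.sigmoid(self.conv(x))   # (B, 1, H, W), weights in (0, 1)
        return x * q                      # broadcast over channels: reweight each location
```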

scSE module:

The scSE module combines the previous two, taking both channels and spatial positions into account: the outputs of the two modules are simply added element-wise.
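Putting the two sketches together:

```python
class SCSEBlock(nn.Module):
    """scSE: run cSE and sSE in parallel and add their recalibrated outputs."""
    def __init__(self, channels, reduction=2):
        super().__init__()
        self.cse = SEBlock(channels, reduction)
        self.sse = SSEBlock(channels)

    def forward(self, x):
        return self.cse(x) + self.sse(x)   # element-wise sum of the two branches
```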


3. SKNet (Selective Kernel Networks)

SKNet is likewise a lightweight, embeddable module. Its inspiration: when we look at objects of different sizes and distances, the receptive fields of neurons in the visual cortex adjust according to the stimulus. In a CNN, by contrast, the kernel size is generally fixed for a given task and model. Could a network instead adapt its receptive field to the multiple scales of its input? Based on this idea, the authors proposed Selective Kernel Networks (SKNet). The structure diagram is shown below.

SKNet consists of three operations: Split, Fuse, and Select.

Split:

The Split step applies two different transforms to a feature map X of size H'×W'×C': X → U~ and X → U^, with kernel sizes 3 and 5 respectively (this feels somewhat similar to the multi-scale fusion idea of the Inception networks). Each transform is a full convolution block, comprising efficient grouped/depthwise convolutions, Batch Normalization, and a ReLU. In the paper, the 5×5 kernel is implemented as a 3×3 convolution with dilation 2.
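A sketch of the Split step, assuming the channel count is divisible by the group count (SKSplit and its parameters are my own naming):

```python
class SKSplit(nn.Module):
    """Split: two parallel grouped-conv branches with different receptive fields.
    Per the paper, the 5x5 kernel is realized as a 3x3 conv with dilation 2."""
    def __init__(self, channels, groups=32):
        super().__init__()
        def branch(dilation):
            return nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3, padding=dilation,
                          dilation=dilation, groups=groups, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
        self.branch3 = branch(dilation=1)   # 3x3 receptive field -> U~
        self.branch5 = branch(dilation=2)   # effective 5x5 receptive field -> U^

    def forward(self, x):
        return self.branch3(x), self.branch5(x)
```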

Fuse:

This operation is much like SENet's. First the two branches are summed element-wise: U = U~ + U^. Then U passes through a GAP (global average pooling), which compresses each channel's spatial map into a single real number, followed by FC layers that first reduce the dimension and then expand it again, one head per branch. The resulting weight matrices a and b are normalized with a softmax across the branches, so with two branches b is redundant: b = 1 − a.
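A sketch of the Fuse step (the bottleneck width follows the paper's d = max(C/r, L) with L = 32; the rest of the naming is mine):

```python
import torch.nn.functional as F

class SKFuse(nn.Module):
    """Fuse: sum the branches, GAP, reduce with one FC, expand with one FC head
    per branch, then softmax across branches so the weights sum to 1 per channel."""
    def __init__(self, channels, reduction=16, num_branches=2):
        super().__init__()
        d = max(channels // reduction, 32)                     # bottleneck width
        self.reduce = nn.Sequential(nn.Linear(channels, d), nn.ReLU(inplace=True))
        self.expand = nn.Linear(d, channels * num_branches)
        self.num_branches = num_branches

    def forward(self, u3, u5):
        b, c, _, _ = u3.shape
        s = (u3 + u5).mean(dim=(2, 3))                         # U = U~ + U^, then GAP: (B, C)
        z = self.reduce(s)                                     # reduce dimension
        logits = self.expand(z).view(b, self.num_branches, c)  # one score per branch, channel
        return F.softmax(logits, dim=1)                        # a + b = 1 along the branch axis
```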

Select:

The Select operation resembles the scale step of SENet, except that it weights each of the two branches by its attention vector and sums them to produce the final feature map V. That is:

V_c = a_c · U~_c + b_c · U^_c,  where a_c + b_c = 1.
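Finally, a sketch of the Select step, with the three pieces wired together (sk_select is my own helper name):

```python
def sk_select(attn, u3, u5):
    """Select: weight each branch by its (B, C) attention vector and sum."""
    a = attn[:, 0].unsqueeze(-1).unsqueeze(-1)   # (B, C, 1, 1)
    b = attn[:, 1].unsqueeze(-1).unsqueeze(-1)
    return a * u3 + b * u5                       # V_c = a_c * U~_c + b_c * U^_c

split, fuse = SKSplit(channels=64), SKFuse(channels=64)
x = torch.randn(2, 64, 32, 32)
u3, u5 = split(x)
v = sk_select(fuse(u3, u5), u3, u5)              # same shape as x
```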

