1. Module Introduction
The Squeeze-and-Excitation (SE) block, the building unit of SENet, is not a complete network architecture but a sub-module that can be embedded into other classification, segmentation, or detection models. In the paper, the authors embed it into ResNet, VGG, Inception, and other architectures, and observe a consistent improvement in accuracy.
The core idea of SENet is to let the network learn per-channel feature weights from the loss, so that informative feature maps get large weights while uninformative or less useful feature maps get small weights, training the model to better results.
2. Detailed Description of Each Part
Squeeze: compress the features along the spatial dimensions, squeezing each feature map into a single real number. This number therefore has, in some sense, a global receptive field, and the output dimension matches the number of input channels.
Excitation: based on the correlations between feature channels, generate one weight per channel that represents the importance of that channel.
Reweight: treat the weights output by Excitation as per-channel importance scores and multiply them onto the original feature maps channel by channel, completing the importance recalibration of the original features along the channel dimension (see the sketch below).
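To make the three steps concrete, here is a minimal sketch of an SE layer in PyTorch. The class name `SELayer` and the exact layer choices are illustrative assumptions rather than the paper's reference code, but the operations follow the Squeeze/Excitation/Reweight recipe above:

```python
import torch.nn as nn


class SELayer(nn.Module):
    """Illustrative SE block: Squeeze -> Excitation -> Reweight."""

    def __init__(self, channels, reduction=16):
        super(SELayer, self).__init__()
        # Squeeze: global average pooling collapses each HxW feature map to one scalar
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: a bottleneck pair of FC layers models channel dependencies
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)    # Squeeze: (B, C, H, W) -> (B, C)
        y = self.fc(y).view(b, c, 1, 1)    # Excitation: (B, C) -> (B, C, 1, 1)
        return x * y                       # Reweight: broadcast multiply per channel
```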
Design details:
- Which pooling works best in Squeeze()? Global average pooling.
- What value of the reduction ratio r in Excitation works best? r = 16 (see the parameter-count sketch after this list).
- Which activation works best in Excitation? ==Sigmoid==
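Since the Excitation bottleneck consists of two FC layers of shapes C×(C/r) and (C/r)×C, an SE block adds roughly 2C²/r extra parameters. A hypothetical back-of-the-envelope check of why r = 16 keeps the overhead small:

```python
# Hypothetical helper: extra parameters added by one SE block
# (two bias-free FC layers) for C channels and reduction ratio r.
def se_extra_params(channels, reduction=16):
    return 2 * channels * (channels // reduction)

for c in (64, 256, 512):
    print(c, se_extra_params(c))  # 64 -> 512, 256 -> 8192, 512 -> 32768
```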
Code (the SENet class below is a ResNet-style backbone; the SE mechanism itself lives inside the `block` passed in):
```python
import math

import torch.nn as nn


class SENet(nn.Module):
    def __init__(self, block, layers, num_classes=1000):
        self.inplanes = 64
        super(SENet, self).__init__()
        # Standard ResNet stem: 7x7 conv, BN, ReLU, 3x3 max pooling
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        # Four stages of residual blocks; spatial size halves at each stage
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AvgPool2d(7, stride=1)
        self.fc = nn.Linear(512 * block.expansion, num_classes)
        # He initialization for convs, constant initialization for BN
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        # Project the identity branch when the output shape changes
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )
        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x
```
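The snippet above leaves `block` undefined. A minimal sketch of a compatible residual block, reusing the `SELayer` sketched in section 2, might look like this; the name `SEBottleneck` and its exact layout are illustrative assumptions, not the authors' reference implementation:

```python
import torch
import torch.nn as nn


class SEBottleneck(nn.Module):
    """Hypothetical ResNet-style bottleneck with an SE layer on its residual branch."""

    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None, reduction=16):
        super(SEBottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1,
                               bias=False)
        self.bn3 = nn.BatchNorm2d(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.se = SELayer(planes * self.expansion, reduction)
        self.downsample = downsample

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out = self.se(out)  # recalibrate channels before the residual addition
        return self.relu(out + identity)


# Usage: an SE-ResNet-50-style model (stage depths [3, 4, 6, 3])
model = SENet(SEBottleneck, [3, 4, 6, 3], num_classes=1000)
logits = model(torch.randn(1, 3, 224, 224))  # -> shape (1, 1000)
```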