Title: Going deeper with convolutions; Rethinking the Inception Architecture for Computer Vision
Links: https://arxiv.org/abs/1409.4842, https://arxiv.org/abs/1512.00567
Authors: Christian Szegedy et al.
Summary:
Inception-v1 is a rather playful paper: the module name "Inception" and the title's "go deeper" nod to the sci-fi classic Inception ("We need to go deeper"), and the name GoogLeNet pays homage to the pioneering LeNet.
Introduction
Inspired by Network in Network, GoogLeNet introduces $1\times 1$ convolutional layers to increase the network's representational power, which makes it possible to widen and deepen the network (22 layers). At the same time, GoogLeNet weighs parameter count (about $\frac{1}{12}$ of AlexNet's) against inference time, which gives it real practical value. The "deeper" in the title refers both to the literal increase in network depth and to digging deeper into a new kind of architecture, the Inception module.
Motivation
The most direct way to improve a DNN's performance is to increase its depth and width. Doing so, however, has two drawbacks: overfitting and inefficiency in computation and memory. The authors therefore propose moving from fully connected to sparse connections (even though convolution is itself a form of sparse connectivity). Since today's hardware handles numerical computation on sparse data structures inefficiently, the Inception module approximates the desired sparse structure with dense building blocks.
Network architecture
$1\times 1$ convolutions
First, a word on why a $1\times 1$ convolution is equivalent to a fully connected layer. For a $5\times 5$ feature map, multiplying its 25 flattened elements by a $25\times 10$ matrix yields a 10-dimensional vector; by the definition of matrix multiplication, each element of that vector is a linear combination of the 25 elements of the feature map. If we instead apply a $10\times 1\times 1$ convolution, we get a $10\times 5\times 5$ feature map, and summing within each channel yields the equivalent 10-dimensional vector. More generally, a $1\times 1$ convolution applies the same fully connected layer across the channel dimension at every spatial position, which is what lets it mix and reduce channels cheaply.
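A minimal numerical check of the per-pixel view of this equivalence (my own sketch, not from either paper): a $1\times 1$ convolution gives the same result as a linear layer applied over the channel dimension at each pixel, once the two share weights.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 25, 5, 5)                        # N x C_in x H x W
conv = nn.Conv2d(25, 10, kernel_size=1, bias=False)
fc = nn.Linear(25, 10, bias=False)
with torch.no_grad():
    fc.weight.copy_(conv.weight.view(10, 25))       # share the same weights

y_conv = conv(x)                                    # N x 10 x 5 x 5
y_fc = fc(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)  # per-pixel FC over channels
print(torch.allclose(y_conv, y_fc, atol=1e-6))      # True
```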
Auxiliary classifiers
In v1, the auxiliary classifiers were meant to strengthen feature extraction in the shallower layers during training; the follow-up paper, however, argues that their main effect is regularization, similar to dropout and batch normalization. Note that the PyTorch code below omits the ReLU and Dropout of the v1 head (a reconstruction of that head follows the code).
The figure above shows the v1 head; the code below is PyTorch's Inception v3 version (see fig. 8 of the Rethinking paper).
```python
class InceptionAux(nn.Module):

    def __init__(self, in_channels, num_classes):
        super(InceptionAux, self).__init__()
        self.conv0 = BasicConv2d(in_channels, 128, kernel_size=1)
        self.conv1 = BasicConv2d(128, 768, kernel_size=5)
        self.conv1.stddev = 0.01
        self.fc = nn.Linear(768, num_classes)
        self.fc.stddev = 0.001

    def forward(self, x):
        # N x 768 x 17 x 17
        x = F.avg_pool2d(x, kernel_size=5, stride=3)
        # N x 768 x 5 x 5
        x = self.conv0(x)
        # N x 128 x 5 x 5
        x = self.conv1(x)
        # N x 768 x 1 x 1
        # Adaptive average pooling
        x = F.adaptive_avg_pool2d(x, (1, 1))
        # N x 768 x 1 x 1
        x = x.view(x.size(0), -1)
        # N x 768
        x = self.fc(x)
        # N x 1000
        return x
```
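For reference, the v1 paper describes the auxiliary head as a $5\times 5$/3 average pool, a $1\times 1$ convolution with 128 filters, an FC layer with 1024 units, ReLU activations, 70% dropout, and a linear classifier. Below is a sketch of that head with the ReLU and Dropout restored, reconstructed by me from the paper's description (not official code).

```python
import torch.nn as nn
import torch.nn.functional as F

class InceptionAuxV1(nn.Module):
    # Assumes a 14x14 input grid, as at the two attachment points in GoogLeNet.
    def __init__(self, in_channels, num_classes=1000):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 128, kernel_size=1)
        self.fc1 = nn.Linear(128 * 4 * 4, 1024)
        self.fc2 = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = F.avg_pool2d(x, kernel_size=5, stride=3)    # N x C x 4 x 4
        x = F.relu(self.conv(x))                        # N x 128 x 4 x 4
        x = x.flatten(1)
        x = F.relu(self.fc1(x))                         # N x 1024
        x = F.dropout(x, p=0.7, training=self.training) # 70% dropout, per the paper
        return self.fc2(x)                              # N x num_classes
```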
Before the Inception modules
The figure comes from v1; the code comes from v3, where the large kernels are replaced by stacks of small kernels covering the same receptive field (see the parameter-count sketch after the code).
```python
def forward(self, x):
    if self.transform_input:
        x_ch0 = torch.unsqueeze(x[:, 0], 1) * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
        x_ch1 = torch.unsqueeze(x[:, 1], 1) * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
        x_ch2 = torch.unsqueeze(x[:, 2], 1) * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        x = torch.cat((x_ch0, x_ch1, x_ch2), 1)
    # N x 3 x 299 x 299
    x = self.Conv2d_1a_3x3(x)
    # N x 32 x 149 x 149
    x = self.Conv2d_2a_3x3(x)
    # N x 32 x 147 x 147
    x = self.Conv2d_2b_3x3(x)
    # N x 64 x 147 x 147
    x = F.max_pool2d(x, kernel_size=3, stride=2)
    # N x 64 x 73 x 73
    x = self.Conv2d_3b_1x1(x)
    # N x 80 x 73 x 73
    x = self.Conv2d_4a_3x3(x)
    # N x 192 x 71 x 71
    x = F.max_pool2d(x, kernel_size=3, stride=2)
    # N x 192 x 35 x 35
```
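As a quick check on that claim (my own sketch, not from either repo): two stacked $3\times 3$ convolutions cover the same $5\times 5$ receptive field as a single $5\times 5$ convolution but need only $18c^2$ instead of $25c^2$ weights, a 28% saving.

```python
import torch.nn as nn

c = 192  # channel width chosen arbitrarily for illustration
conv5x5 = nn.Conv2d(c, c, kernel_size=5, padding=2, bias=False)
conv3x3_stack = nn.Sequential(   # same 5x5 receptive field
    nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False),
)

def num_params(m):
    return sum(p.numel() for p in m.parameters())

print(num_params(conv5x5), num_params(conv3x3_stack))  # 921600 vs 663552 (18/25)
```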
The Inception modules in v3
All code below comes from https://github.com/pytorch/vision/blob/master/torchvision/models/inception.py (as of 2019-03-01).
TensorFlow code is also available:
https://github.com/tensorflow/models/tree/master/research/inception
```python
# InceptionA: parallel 1x1, 5x5, and double-3x3 convolution branches plus a
# pooled branch, concatenated along the channel dimension
def forward(self, x):
    branch1x1 = self.branch1x1(x)

    branch5x5 = self.branch5x5_1(x)
    branch5x5 = self.branch5x5_2(branch5x5)

    branch3x3dbl = self.branch3x3dbl_1(x)
    branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
    branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

    branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    branch_pool = self.branch_pool(branch_pool)

    outputs = [branch1x1, branch5x5, branch3x3dbl, branch_pool]
    return torch.cat(outputs, 1)
```
```python
# InceptionB: grid-size reduction; every branch uses stride 2, halving the
# spatial resolution while the channel count grows
def forward(self, x):
    branch3x3 = self.branch3x3(x)

    branch3x3dbl = self.branch3x3dbl_1(x)
    branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
    branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

    branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)

    outputs = [branch3x3, branch3x3dbl, branch_pool]
    return torch.cat(outputs, 1)
```
```python
# InceptionC: 7x7 convolutions factorized into asymmetric 1x7 and 7x1 pairs
def forward(self, x):
    branch1x1 = self.branch1x1(x)

    branch7x7 = self.branch7x7_1(x)
    branch7x7 = self.branch7x7_2(branch7x7)
    branch7x7 = self.branch7x7_3(branch7x7)

    branch7x7dbl = self.branch7x7dbl_1(x)
    branch7x7dbl = self.branch7x7dbl_2(branch7x7dbl)
    branch7x7dbl = self.branch7x7dbl_3(branch7x7dbl)
    branch7x7dbl = self.branch7x7dbl_4(branch7x7dbl)
    branch7x7dbl = self.branch7x7dbl_5(branch7x7dbl)

    branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    branch_pool = self.branch_pool(branch_pool)

    outputs = [branch1x1, branch7x7, branch7x7dbl, branch_pool]
    return torch.cat(outputs, 1)
```
```python
# InceptionD: another grid-size-reduction block (all branches stride 2)
def forward(self, x):
    branch3x3 = self.branch3x3_1(x)
    branch3x3 = self.branch3x3_2(branch3x3)

    branch7x7x3 = self.branch7x7x3_1(x)
    branch7x7x3 = self.branch7x7x3_2(branch7x7x3)
    branch7x7x3 = self.branch7x7x3_3(branch7x7x3)
    branch7x7x3 = self.branch7x7x3_4(branch7x7x3)

    branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)

    outputs = [branch3x3, branch7x7x3, branch_pool]
    return torch.cat(outputs, 1)
```
```python
# InceptionE: the widest module; each 3x3 stage is expanded into parallel
# 1x3 and 3x1 branches whose outputs are concatenated
def forward(self, x):
    branch1x1 = self.branch1x1(x)

    branch3x3 = self.branch3x3_1(x)
    branch3x3 = [
        self.branch3x3_2a(branch3x3),
        self.branch3x3_2b(branch3x3),
    ]
    branch3x3 = torch.cat(branch3x3, 1)

    branch3x3dbl = self.branch3x3dbl_1(x)
    branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
    branch3x3dbl = [
        self.branch3x3dbl_3a(branch3x3dbl),
        self.branch3x3dbl_3b(branch3x3dbl),
    ]
    branch3x3dbl = torch.cat(branch3x3dbl, 1)

    branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    branch_pool = self.branch_pool(branch_pool)

    outputs = [branch1x1, branch3x3, branch3x3dbl, branch_pool]
    return torch.cat(outputs, 1)
```
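To tie the pieces together, here is a minimal usage sketch of the assembled torchvision model (API as of the 2019-era releases; with aux_logits=True, a forward pass in training mode returns both the main and the auxiliary logits):

```python
import torch
from torchvision.models import inception_v3

model = inception_v3(pretrained=False, aux_logits=True)
model.train()

x = torch.randn(1, 3, 299, 299)        # Inception v3 expects 299x299 inputs
logits, aux_logits = model(x)          # aux head contributes only a training loss
print(logits.shape, aux_logits.shape)  # both torch.Size([1, 1000])
```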
Training
Stochastic gradient descent with momentum 0.9, a learning rate decreased on a fixed epoch schedule, and Polyak averaging of the parameters for the model used at inference time.
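Neither paper spells out the averaging in code, so here is a minimal sketch of Polyak averaging as it is commonly implemented, i.e. an exponential moving average of the weights; the decay constant is illustrative, not taken from the paper.

```python
import copy
import torch

def polyak_update(avg_model, model, decay=0.999):
    # avg <- decay * avg + (1 - decay) * current
    with torch.no_grad():
        for p_avg, p in zip(avg_model.parameters(), model.parameters()):
            p_avg.mul_(decay).add_(p, alpha=1.0 - decay)

# Usage: avg_model = copy.deepcopy(model); call polyak_update(avg_model, model)
# after each optimizer step, and evaluate with avg_model instead of model.
```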
Postscript:
I skimmed this paper the first time around, and it nagged at me ever since; sure enough, a SenseTime interview eventually asked about the model's details. Murphy's law at work, haha. Rereading it now alongside the code finally settles that old score, so failing that interview wasn't entirely in vain.
2019.3.1