Title: Going deeper with convolutions; Rethinking the Inception Architecture for Computer Vision
Links: https://arxiv.org/abs/1409.4842, https://arxiv.org/abs/1512.00567
Authors: Christian Szegedy et al.
Summary:
Inception-v1 is a rather playful paper: the module name "Inception" and the title's "go deeper" nod to the sci-fi classic Inception ("We need to go deeper"), and the name GoogLeNet pays homage to the pioneering LeNet.
Introduction
Inspired by Network in Network, GoogLeNet introduces $1\times 1$ convolutional layers to increase the network's representational power, which makes it possible to widen and deepen the network (22 layers). At the same time, GoogLeNet weighs parameter count (about $\frac{1}{12}$ of AlexNet's) against inference time, which gives it real practical value. The "deeper" in the title refers both to the literal increase in network depth and to digging deeper into a new kind of architecture, the Inception module.
Motivation
The most direct way to improve a DNN's performance is to increase its depth and width. Doing so, however, has two drawbacks: overfitting and inefficiency in computation and memory. The authors therefore propose moving from fully connected to sparse connections (even though convolution is itself a form of sparse connectivity). Since today's hardware handles numerical computation on sparse data structures inefficiently, the Inception module approximates the desired sparse structure with dense building blocks.
Network architecture
$1\times 1$ convolutions
First, a word on why a $1\times 1$ convolution is equivalent to a fully connected layer. For a $5\times 5$ feature map, multiplying its 25 flattened elements by a $25\times 10$ matrix yields a 10-dimensional vector; by the definition of matrix multiplication, each element of that vector is a linear combination of the 25 elements of the feature map. If we instead apply a $10\times 1\times 1$ convolution, we get a $10\times 5\times 5$ feature map, and summing within each channel yields the equivalent 10-dimensional vector. More generally, a $1\times 1$ convolution applies the same fully connected layer across the channel dimension at every spatial position, which is what lets it mix and reduce channels cheaply.
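A minimal numerical check of the per-pixel view of this equivalence (my own sketch, not from either paper): a $1\times 1$ convolution gives the same result as a linear layer applied over the channel dimension at each pixel, once the two share weights.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 25, 5, 5)                        # N x C_in x H x W
conv = nn.Conv2d(25, 10, kernel_size=1, bias=False)
fc = nn.Linear(25, 10, bias=False)
with torch.no_grad():
    fc.weight.copy_(conv.weight.view(10, 25))       # share the same weights

y_conv = conv(x)                                    # N x 10 x 5 x 5
y_fc = fc(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)  # per-pixel FC over channels
print(torch.allclose(y_conv, y_fc, atol=1e-6))      # True
```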
Auxiliary classifiers
In v1, the auxiliary classifiers were meant to strengthen feature extraction in the shallower layers during training; the follow-up paper, however, argues that their main effect is regularization, similar to dropout and batch normalization. Note that the PyTorch code below omits the ReLU and Dropout of the v1 head (a reconstruction of that head follows the code).
The figure above shows the v1 head; the code below is PyTorch's Inception v3 version (see fig. 8 of the Rethinking paper).
```python
class InceptionAux(nn.Module):

    def __init__(self, in_channels, num_classes):
        super(InceptionAux, self).__init__()
        self.conv0 = BasicConv2d(in_channels, 128, kernel_size=1)
        self.conv1 = BasicConv2d(128, 768, kernel_size=5)
        self.conv1.stddev = 0.01
        self.fc = nn.Linear(768, num_classes)
        self.fc.stddev = 0.001

    def forward(self, x):
        # N x 768 x 17 x 17
        x = F.avg_pool2d(x, kernel_size=5, stride=3)
        # N x 768 x 5 x 5
        x = self.conv0(x)
        # N x 128 x 5 x 5
        x = self.conv1(x)
        # N x 768 x 1 x 1
        # Adaptive average pooling
        x = F.adaptive_avg_pool2d(x, (1, 1))
        # N x 768 x 1 x 1
        x = x.view(x.size(0), -1)
        # N x 768
        x = self.fc(x)
        # N x 1000
        return x
```
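For reference, the v1 paper describes the auxiliary head as a $5\times 5$/3 average pool, a $1\times 1$ convolution with 128 filters, an FC layer with 1024 units, ReLU activations, 70% dropout, and a linear classifier. Below is a sketch of that head with the ReLU and Dropout restored, reconstructed by me from the paper's description (not official code).

```python
import torch.nn as nn
import torch.nn.functional as F

class InceptionAuxV1(nn.Module):
    # Assumes a 14x14 input grid, as at the two attachment points in GoogLeNet.
    def __init__(self, in_channels, num_classes=1000):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 128, kernel_size=1)
        self.fc1 = nn.Linear(128 * 4 * 4, 1024)
        self.fc2 = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = F.avg_pool2d(x, kernel_size=5, stride=3)    # N x C x 4 x 4
        x = F.relu(self.conv(x))                        # N x 128 x 4 x 4
        x = x.flatten(1)
        x = F.relu(self.fc1(x))                         # N x 1024
        x = F.dropout(x, p=0.7, training=self.training) # 70% dropout, per the paper
        return self.fc2(x)                              # N x num_classes
```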
Before the Inception modules
The figure comes from v1; the code comes from v3, where the large kernels are replaced by stacks of small kernels covering the same receptive field (see the parameter-count sketch after the code).
```python
def forward(self, x):
    if self.transform_input:
        x_ch0 = torch.unsqueeze(x[:, 0], 1) * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
        x_ch1 = torch.unsqueeze(x[:, 1], 1) * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
        x_ch2 = torch.unsqueeze(x[:, 2], 1) * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        x = torch.cat((x_ch0, x_ch1, x_ch2), 1)
    # N x 3 x 299 x 299
    x = self.Conv2d_1a_3x3(x)
    # N x 32 x 149 x 149
    x = self.Conv2d_2a_3x3(x)
    # N x 32 x 147 x 147
    x = self.Conv2d_2b_3x3(x)
    # N x 64 x 147 x 147
    x = F.max_pool2d(x, kernel_size=3, stride=2)
    # N x 64 x 73 x 73
    x = self.Conv2d_3b_1x1(x)
    # N x 80 x 73 x 73
    x = self.Conv2d_4a_3x3(x)
    # N x 192 x 71 x 71
    x = F.max_pool2d(x, kernel_size=3, stride=2)
    # N x 192 x 35 x 35
```
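As a quick check on that claim (my own sketch, not from either repo): two stacked $3\times 3$ convolutions cover the same $5\times 5$ receptive field as a single $5\times 5$ convolution but need only $18c^2$ instead of $25c^2$ weights, a 28% saving.

```python
import torch.nn as nn

c = 192  # channel width chosen arbitrarily for illustration
conv5x5 = nn.Conv2d(c, c, kernel_size=5, padding=2, bias=False)
conv3x3_stack = nn.Sequential(   # same 5x5 receptive field
    nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False),
)

def num_params(m):
    return sum(p.numel() for p in m.parameters())

print(num_params(conv5x5), num_params(conv3x3_stack))  # 921600 vs 663552 (18/25)
```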
The Inception modules in v3
All code below comes from https://github.com/pytorch/vision/blob/master/torchvision/models/inception.py (as of 2019-03-01).
TensorFlow code is also available:
https://github.com/tensorflow/models/tree/master/research/inception
```python
# InceptionA: parallel 1x1, 5x5, and double-3x3 convolution branches plus a
# pooled branch, concatenated along the channel dimension
def forward(self, x):
    branch1x1 = self.branch1x1(x)

    branch5x5 = self.branch5x5_1(x)
    branch5x5 = self.branch5x5_2(branch5x5)

    branch3x3dbl = self.branch3x3dbl_1(x)
    branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
    branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

    branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    branch_pool = self.branch_pool(branch_pool)

    outputs = [branch1x1, branch5x5, branch3x3dbl, branch_pool]
    return torch.cat(outputs, 1)
```
```python
# InceptionB: grid-size reduction; every branch uses stride 2, halving the
# spatial resolution while the channel count grows
def forward(self, x):
    branch3x3 = self.branch3x3(x)

    branch3x3dbl = self.branch3x3dbl_1(x)
    branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
    branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

    branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)

    outputs = [branch3x3, branch3x3dbl, branch_pool]
    return torch.cat(outputs, 1)
```
```python
# InceptionC: 7x7 convolutions factorized into asymmetric 1x7 and 7x1 pairs
def forward(self, x):
    branch1x1 = self.branch1x1(x)

    branch7x7 = self.branch7x7_1(x)
    branch7x7 = self.branch7x7_2(branch7x7)
    branch7x7 = self.branch7x7_3(branch7x7)

    branch7x7dbl = self.branch7x7dbl_1(x)
    branch7x7dbl = self.branch7x7dbl_2(branch7x7dbl)
    branch7x7dbl = self.branch7x7dbl_3(branch7x7dbl)
    branch7x7dbl = self.branch7x7dbl_4(branch7x7dbl)
    branch7x7dbl = self.branch7x7dbl_5(branch7x7dbl)

    branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    branch_pool = self.branch_pool(branch_pool)

    outputs = [branch1x1, branch7x7, branch7x7dbl, branch_pool]
    return torch.cat(outputs, 1)
```
```python
# InceptionD: another grid-size-reduction block (all branches stride 2)
def forward(self, x):
    branch3x3 = self.branch3x3_1(x)
    branch3x3 = self.branch3x3_2(branch3x3)

    branch7x7x3 = self.branch7x7x3_1(x)
    branch7x7x3 = self.branch7x7x3_2(branch7x7x3)
    branch7x7x3 = self.branch7x7x3_3(branch7x7x3)
    branch7x7x3 = self.branch7x7x3_4(branch7x7x3)

    branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)

    outputs = [branch3x3, branch7x7x3, branch_pool]
    return torch.cat(outputs, 1)
```
```python
# InceptionE: the widest module; each 3x3 stage is expanded into parallel
# 1x3 and 3x1 branches whose outputs are concatenated
def forward(self, x):
    branch1x1 = self.branch1x1(x)

    branch3x3 = self.branch3x3_1(x)
    branch3x3 = [
        self.branch3x3_2a(branch3x3),
        self.branch3x3_2b(branch3x3),
    ]
    branch3x3 = torch.cat(branch3x3, 1)

    branch3x3dbl = self.branch3x3dbl_1(x)
    branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
    branch3x3dbl = [
        self.branch3x3dbl_3a(branch3x3dbl),
        self.branch3x3dbl_3b(branch3x3dbl),
    ]
    branch3x3dbl = torch.cat(branch3x3dbl, 1)

    branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    branch_pool = self.branch_pool(branch_pool)

    outputs = [branch1x1, branch3x3, branch3x3dbl, branch_pool]
    return torch.cat(outputs, 1)
```
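To tie the pieces together, here is a minimal usage sketch of the assembled torchvision model (API as of the 2019-era releases; with aux_logits=True, a forward pass in training mode returns both the main and the auxiliary logits):

```python
import torch
from torchvision.models import inception_v3

model = inception_v3(pretrained=False, aux_logits=True)
model.train()

x = torch.randn(1, 3, 299, 299)        # Inception v3 expects 299x299 inputs
logits, aux_logits = model(x)          # aux head contributes only a training loss
print(logits.shape, aux_logits.shape)  # both torch.Size([1, 1000])
```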
Training
Stochastic gradient descent with momentum 0.9, a learning rate decreased on a fixed epoch schedule, and Polyak averaging of the parameters for the model used at inference time.
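Neither paper spells out the averaging in code, so here is a minimal sketch of Polyak averaging as it is commonly implemented, i.e. an exponential moving average of the weights; the decay constant is illustrative, not taken from the paper.

```python
import copy
import torch

def polyak_update(avg_model, model, decay=0.999):
    # avg <- decay * avg + (1 - decay) * current
    with torch.no_grad():
        for p_avg, p in zip(avg_model.parameters(), model.parameters()):
            p_avg.mul_(decay).add_(p, alpha=1.0 - decay)

# Usage: avg_model = copy.deepcopy(model); call polyak_update(avg_model, model)
# after each optimizer step, and evaluate with avg_model instead of model.
```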
Postscript:
I skimmed this paper the first time around, and it nagged at me ever since; sure enough, a SenseTime interview eventually asked about the model's details. Murphy's law at work, haha. Rereading it now alongside the code finally settles that old score, so failing that interview wasn't entirely in vain.
2019.3.1