Res2Net

00000cj

Category: Backbones. Tags: deep learning, computer vision, artificial intelligence

Article link: https://blog.csdn.net/ooooocj/article/details/122430069

Res2Net: A New Multi-scale Backbone Architecture

How It Works

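Two stacked 3×3 convolutions have the same receptive field as one 5×5 convolution, and three stacked 3×3 convolutions match one 7×7 convolution. The receptive field of a convolutional network therefore grows with depth, so such a network already has some ability to learn multi-scale features. The paper's contribution is a block structure that strengthens this ability by representing multi-scale features at a more granular level.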
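
The equivalence above is about receptive field, not about the learned function. As a minimal sketch, the standard receptive-field recurrence can be checked in a few lines of plain Python. The function names and the 64-channel figure are illustrative assumptions, not taken from the paper.

```python
def stacked_receptive_field(kernel_sizes, strides=None):
    """Receptive field (in input pixels) of a stack of conv layers."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between neighbouring outputs
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf


def conv_weights(kernel_size, channels):
    """Weights of one conv with `channels` in and out (bias ignored)."""
    return kernel_size * kernel_size * channels * channels


if __name__ == "__main__":
    print(stacked_receptive_field([3, 3]))     # same as one 5x5 conv
    print(stacked_receptive_field([3, 3, 3]))  # same as one 7x7 conv
    # Hypothetical 64-channel layers: two 3x3 convs vs one 5x5 conv.
    print(2 * conv_weights(3, 64), conv_weights(5, 64))
```

With equal channel counts, the stacked 3×3 layers reach the same receptive field with fewer weights, which is one reason deep stacks of small kernels are common.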

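For an input feature map with c channels, each convolution kernel must also have c channels to learn its features; if a layer has n such kernels, its output has n channels. In addition to width, depth, and cardinality, the authors propose a new dimension: scale. The input feature map is split evenly along the channel dimension into scale groups, as shown on the right of Fig. 2. One group goes through a 3×3 convolution; its output is concatenated into the block output and is also added to the next group, which then goes through its own 3×3 convolution, and so on. Finally, the concatenated result passes through a 1×1 convolution to produce the block's output.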
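
A rough way to see what the channel split does to the 3×3 part of the block is to count weights. The sketch below assumes equal-width groups and one group that skips the convolution. The helper names and the 104-channel, 4-group numbers are illustrative only. It says nothing about the 1×1 convolutions or about how the width is chosen in real models.

```python
def conv3x3_weights(in_ch, out_ch):
    """Weights of one 3x3 convolution (bias ignored)."""
    return 3 * 3 * in_ch * out_ch


def res2net_group_weights(channels, scales):
    """3x3 weights when `channels` are split into `scales` equal groups
    and every group except one gets its own 3x3 convolution."""
    if scales < 2:
        raise ValueError("need at least two groups")
    if channels % scales != 0:
        raise ValueError("channels must be divisible by scales")
    width = channels // scales
    return (scales - 1) * conv3x3_weights(width, width)


if __name__ == "__main__":
    channels, scales = 104, 4  # hypothetical: 4 groups of 26 channels
    print("single 3x3 conv :", conv3x3_weights(channels, channels))
    print("split into groups:", res2net_group_weights(channels, scales))
```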

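Stacking convolutions and downsampling across layers lets a network learn features of different scales at different layers. With this block, features of different scales can now be learned within a single layer as well, which greatly strengthens the network's multi-scale representation ability.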
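
To make "different scales within a single layer" concrete, the following sketch lists the receptive field of each output group of one block, measured relative to the block's split input. It assumes stride-1 3×3 convolutions and that each convolved group is added to the next one, so the k-th convolved group has passed through k chained convolutions. The function name and the `passthrough` parameter are hypothetical.

```python
def split_receptive_fields(scales, passthrough="last"):
    """Receptive field of every output group of one block.

    passthrough="last":  the last group skips the 3x3 conv.
    passthrough="first": the first group skips the 3x3 conv.
    """
    if scales < 2:
        raise ValueError("need at least two groups")
    n_convs = scales - 1
    # k chained 3x3 stride-1 convs give a (2k + 1) x (2k + 1) receptive field.
    convolved = [1 + 2 * k for k in range(1, n_convs + 1)]
    if passthrough == "last":
        return convolved + [1]
    if passthrough == "first":
        return [1] + convolved
    raise ValueError("passthrough must be 'first' or 'last'")


if __name__ == "__main__":
    print(split_receptive_fields(4, "last"))
    print(split_receptive_fields(4, "first"))
```

Either way, a single block with four groups outputs channels whose receptive fields are 1, 3, 5, and 7, whereas a plain 3×3 convolution gives every output channel the same receptive field of 3.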

Code Implementation

    def forward(self, x):
        """Forward function."""

        def _inner_forward(x):
            identity = x

            out = self.conv1(x)
            out = self.norm1(out)
            out = self.relu(out)

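            # Split the channels into groups of `self.width` channels each.
            spx = torch.split(out, self.width, 1)
            # The first group always goes through its own 3x3 conv.
            sp = self.convs[0](spx[0].contiguous())
            sp = self.relu(self.bns[0](sp))
            out = sp
            for i in range(1, self.scales - 1):
                if self.stage_type == 'stage':
                    # 'stage' block: no add across groups.
                    sp = spx[i]
                else:
                    # 'normal' block: add the previous group's output.
                    sp = sp + spx[i]
                sp = self.convs[i](sp.contiguous())
                sp = self.relu(self.bns[i](sp))
                out = torch.cat((out, sp), 1)

            # The last group skips the 3x3 conv: it is concatenated as-is
            # in 'normal' blocks and passed through self.pool in 'stage' blocks.
            if self.stage_type == 'normal' and self.scales != 1:
                out = torch.cat((out, spx[self.scales - 1]), 1)
            elif self.stage_type == 'stage' and self.scales != 1:
                out = torch.cat((out, self.pool(spx[self.scales - 1])), 1)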

            out = self.conv3(out)
            out = self.norm3(out)

            if self.downsample is not None:
                identity = self.downsample(x)

            out += identity

            return out

        if self.with_cp and x.requires_grad:
            out = cp.checkpoint(_inner_forward, x)
        else:
            out = _inner_forward(x)

        out = self.relu(out)

        return out

Points to note:

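As shown in Fig. 2, in the original paper the first split x1 skips the convolution and is used directly as the output y1. The paper also states: "To reduce the number of parameters, we omit the convolution for the first split, which can also be regarded as a form of feature reuse." In the mmcls implementation, however, it is the last split that skips the convolution and is concatenated directly onto the end of the output.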
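As in ResNet, Res2Net downsamples by setting stride=2 on the 3×3 convolution (the second convolution) of the first block in each stage. As the code above shows, the first block of each stage has self.stage_type='stage'. In that case the spatial size of each split changes after its stride-2 3×3 convolution, so the result can no longer be added to the next split.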
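
Both notes can be checked without PyTorch. The sketch below replays the split loop of the forward pass symbolically, with strings standing in for tensors, and uses the usual convolution output-size formula to show why the add is dropped in 'stage' blocks. `trace_block` and `conv_out_size` are hypothetical helpers, and padding=1 and the 56×56 input are assumptions for illustration.

```python
def trace_block(scales, stage_type):
    """Symbolically replay the split / conv / concat loop of the forward pass."""
    spx = [f"x{i}" for i in range(scales)]
    sp = f"K0({spx[0]})"
    out = [sp]
    for i in range(1, scales - 1):
        if stage_type == "stage":
            sp = spx[i]              # no add across groups
        else:
            sp = f"{sp}+{spx[i]}"    # add previous output to this group
        sp = f"K{i}({sp})"
        out.append(sp)
    if scales != 1:
        last = spx[scales - 1]       # the last group skips the 3x3 conv
        out.append(f"pool({last})" if stage_type == "stage" else last)
    return out


def conv_out_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


if __name__ == "__main__":
    print(trace_block(4, "normal"))
    print(trace_block(4, "stage"))
    # A stride-2 conv halves the map, so its output cannot be added
    # to the next, still full-size, group.
    print(conv_out_size(56, stride=2), "vs", 56)
```

In the 'normal' trace the last entry is the untouched group x3, which matches the first note: this implementation passes through the last split rather than the first.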
