Classic Convolutional Models (4): GoogLeNet Inception-V3 Code Walkthrough

Inception-V3

The network backbone is again built from Inception modules plus an auxiliary classifier. There are five kinds of Inception module (InceptionA through InceptionE), in addition to the InceptionAux auxiliary head.

BasicConv2d: the basic convolution block

BasicConv2d is a convolution followed by batch normalization and a ReLU.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, **kwargs):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
        self.bn = nn.BatchNorm2d(out_channels, eps=0.001)
    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        return F.relu(x, inplace=True)
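
For a quick feel of the block, a minimal sketch (assuming the imports above): the same 3→32, stride-2 configuration is used later as the network stem Conv2d_1a_3x3, and it halves the spatial resolution.

conv = BasicConv2d(3, 32, kernel_size=3, stride=2)
out = conv(torch.randn(1, 3, 299, 299))
print(out.shape)  # torch.Size([1, 32, 149, 149]): (299 - 3) // 2 + 1 = 149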

The Inception modules

PyTorch provides five basic Inception modules, InceptionA through InceptionE.

InceptionA

Output: spatial size unchanged, with 224 + pool_features channels. Suppose the input is a (35, 35, 192) tensor:

  • First branch:
    • branch1x1 applies 64 1×1 kernels, producing the first feature map (35, 35, 64);
  • Second branch:
    • branch5x5_1 first applies 48 1×1 kernels, giving (35, 35, 48),
    • then branch5x5_2 applies 64 5×5 kernels with padding 2; the spatial size is unchanged, so the second feature map ends up as (35, 35, 64);
  • Third branch:
    • branch3x3dbl_1 first applies 64 1×1 kernels, giving (35, 35, 64),
    • then branch3x3dbl_2 applies 96 3×3 kernels with padding 1; the spatial size is unchanged, giving (35, 35, 96),
    • finally branch3x3dbl_3 applies 96 3×3 kernels with padding 1; size and channel count are unchanged, so the third feature map ends up as (35, 35, 96);
  • Fourth branch:
    • avg_pool2d with a 3×3 kernel, stride 1, and padding 1 leaves size and channels unchanged, giving (35, 35, 192),
    • then branch_pool applies pool_features 1×1 kernels, so the fourth feature map ends up as (35, 35, pool_features);
  • Finally the four feature maps are concatenated along the channel dimension, yielding a (35, 35, 64+64+96+pool_features) feature map; see the shape check after the code below.
class InceptionA(nn.Module):
    def __init__(self, in_channels, pool_features):
        super(InceptionA, self).__init__()
        self.branch1x1 = BasicConv2d(in_channels, 64, kernel_size=1)
        self.branch5x5_1 = BasicConv2d(in_channels, 48, kernel_size=1)
        self.branch5x5_2 = BasicConv2d(48, 64, kernel_size=5, padding=2)
        self.branch3x3dbl_1 = BasicConv2d(in_channels, 64, kernel_size=1)
        self.branch3x3dbl_2 = BasicConv2d(64, 96, kernel_size=3, padding=1)
        self.branch3x3dbl_3 = BasicConv2d(96, 96, kernel_size=3, padding=1)
        self.branch_pool = BasicConv2d(in_channels, pool_features, kernel_size=1)
    def forward(self, x):
        branch1x1 = self.branch1x1(x)
        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)
        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)
        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)
        outputs = [branch1x1, branch5x5, branch3x3dbl, branch_pool]
        return torch.cat(outputs, 1)
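
As a sanity check of the shape walkthrough above, a minimal sketch (assuming BasicConv2d and the imports from earlier are in scope; the 192/32 configuration matches Mixed_5b in the full network below):

block = InceptionA(in_channels=192, pool_features=32)
out = block(torch.randn(1, 192, 35, 35))
print(out.shape)  # torch.Size([1, 256, 35, 35]): 64 + 64 + 96 + 32 = 256 channels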

InceptionB

Output: spatial size halved, channel count increased by 480. Suppose the input is (35, 35, 288):

  • First branch:
    • branch3x3 applies 384 3×3 kernels with stride 2: (35-3+2*0)/2+1 = 17, producing the first feature map (17, 17, 384);
  • Second branch:
    • branch3x3dbl_1 first applies 64 1×1 kernels; the spatial size is unchanged, giving (35, 35, 64);
    • then branch3x3dbl_2 applies 96 3×3 kernels with padding 1, giving (35, 35, 96),
    • then branch3x3dbl_3 applies 96 3×3 kernels with stride 2: (35-3+2*0)/2+1 = 17, so the second feature map is (17, 17, 96);
  • Third branch:
    • max_pool2d with a 3×3 kernel and stride 2 performs 2× max downsampling while keeping the channel count, so the third feature map is (17, 17, 288);
  • Finally the three feature maps are concatenated, yielding (17 (= Hin/2), 17 (= Win/2), 384+96+288 (= Cin) = 768); see the shape check after the code below.
class InceptionB(nn.Module):
    def __init__(self, in_channels):
        super(InceptionB, self).__init__()
        self.branch3x3 = BasicConv2d(in_channels, 384, kernel_size=3, stride=2)
        self.branch3x3dbl_1 = BasicConv2d(in_channels, 64, kernel_size=1)
        self.branch3x3dbl_2 = BasicConv2d(64, 96, kernel_size=3, padding=1)
        self.branch3x3dbl_3 = BasicConv2d(96, 96, kernel_size=3, stride=2)
    def forward(self, x):
        branch3x3 = self.branch3x3(x)
        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)
        branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)
        outputs = [branch3x3, branch3x3dbl, branch_pool]
        return torch.cat(outputs, 1)
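
A quick shape check for the walkthrough above (a sketch; the in_channels=288 configuration matches Mixed_6a below):

block = InceptionB(288)
out = block(torch.randn(1, 288, 35, 35))
print(out.shape)  # torch.Size([1, 768, 17, 17]): 384 + 96 + 288 = 768 channels, spatial size halved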

InceptionC

Output: spatial size unchanged, 768 channels. Suppose the input is (17, 17, 768):

  • The first branch, branch1x1, applies 192 1×1 kernels, producing the first feature map (17, 17, 192);
  • Second branch:
    • branch7x7_1 first applies c7 1×1 kernels, giving (17, 17, c7),
    • then branch7x7_2 applies c7 1×7 kernels with padding (0, 3); the spatial size is unchanged, giving (17, 17, c7),
    • then branch7x7_3 applies 192 7×1 kernels with padding (3, 0); the spatial size is unchanged, so the second feature map ends up as (17, 17, 192);
  • Third branch:
    • branch7x7dbl_1 first applies c7 1×1 kernels, giving (17, 17, c7),
    • then branch7x7dbl_2 applies c7 7×1 kernels with padding (3, 0), giving (17, 17, c7),
    • then branch7x7dbl_3 applies c7 1×7 kernels with padding (0, 3), giving (17, 17, c7),
    • then branch7x7dbl_4 applies c7 7×1 kernels with padding (3, 0), giving (17, 17, c7),
    • then branch7x7dbl_5 applies 192 1×7 kernels with padding (0, 3); the spatial size is unchanged, so the third feature map ends up as (17, 17, 192);
  • Fourth branch:
    • avg_pool2d with a 3×3 kernel, stride 1, and padding 1 keeps size and channels, giving (17, 17, 768),
    • then branch_pool applies 192 1×1 kernels, so the fourth feature map ends up as (17, 17, 192);
  • Finally the four feature maps are concatenated, yielding (17, 17, 192+192+192+192 = 768); see the shape check after the code below.
class InceptionC(nn.Module):
    def __init__(self, in_channels, channels_7x7):
        super(InceptionC, self).__init__()
        self.branch1x1 = BasicConv2d(in_channels, 192, kernel_size=1)
        c7 = channels_7x7
        self.branch7x7_1 = BasicConv2d(in_channels, c7, kernel_size=1)
        self.branch7x7_2 = BasicConv2d(c7, c7, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7_3 = BasicConv2d(c7, 192, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7dbl_1 = BasicConv2d(in_channels, c7, kernel_size=1)
        self.branch7x7dbl_2 = BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7dbl_3 = BasicConv2d(c7, c7, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7dbl_4 = BasicConv2d(c7, c7, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7dbl_5 = BasicConv2d(c7, 192, kernel_size=(1, 7), padding=(0, 3))
        self.branch_pool = BasicConv2d(in_channels, 192, kernel_size=1)
    def forward(self, x):
        branch1x1 = self.branch1x1(x)
        branch7x7 = self.branch7x7_1(x)
        branch7x7 = self.branch7x7_2(branch7x7)
        branch7x7 = self.branch7x7_3(branch7x7)
        branch7x7dbl = self.branch7x7dbl_1(x)
        branch7x7dbl = self.branch7x7dbl_2(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_3(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_4(branch7x7dbl)
        branch7x7dbl = self.branch7x7dbl_5(branch7x7dbl)
        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)
        outputs = [branch1x1, branch7x7, branch7x7dbl, branch_pool]
        return torch.cat(outputs, 1)
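
A quick shape check (a sketch; channels_7x7=128 matches Mixed_6b below, and any of the 128/160/192 settings yields the same 768-channel output):

block = InceptionC(768, channels_7x7=128)
out = block(torch.randn(1, 768, 17, 17))
print(out.shape)  # torch.Size([1, 768, 17, 17]): 192 * 4 = 768 channels, size unchanged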

InceptionD

Output: spatial size halved, channel count increased by 512. Suppose the input is (17, 17, 768):

  • First branch:
    • branch3x3_1 first applies 192 1×1 kernels, giving (17, 17, 192);
    • then branch3x3_2 applies 320 3×3 kernels with stride 2: (17-3+2*0)/2+1 = 8, so the first feature map ends up as (8, 8, 320);
  • Second branch:
    • branch7x7x3_1 first applies 192 1×1 kernels; the spatial size is unchanged, giving (17, 17, 192);
    • then branch7x7x3_2 applies 192 1×7 kernels with padding (0, 3), giving (17, 17, 192);
    • then branch7x7x3_3 applies 192 7×1 kernels with padding (3, 0), giving (17, 17, 192);
    • finally branch7x7x3_4 applies 192 3×3 kernels with stride 2, so the second feature map ends up as (8, 8, 192);
  • Third branch:
    • max_pool2d with a 3×3 kernel and stride 2 performs 2× max downsampling while keeping the channel count, so the third feature map is (8, 8, 768);
  • Finally the three feature maps are concatenated, yielding (8 (= Hin/2), 8 (= Win/2), 320+192+768 (= Cin) = 1280); see the shape check after the code below.
class InceptionD(nn.Module):
    def __init__(self, in_channels):
        super(InceptionD, self).__init__()
        self.branch3x3_1 = BasicConv2d(in_channels, 192, kernel_size=1)
        self.branch3x3_2 = BasicConv2d(192, 320, kernel_size=3, stride=2)
        self.branch7x7x3_1 = BasicConv2d(in_channels, 192, kernel_size=1)
        self.branch7x7x3_2 = BasicConv2d(192, 192, kernel_size=(1, 7), padding=(0, 3))
        self.branch7x7x3_3 = BasicConv2d(192, 192, kernel_size=(7, 1), padding=(3, 0))
        self.branch7x7x3_4 = BasicConv2d(192, 192, kernel_size=3, stride=2)
    def forward(self, x):
        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch7x7x3 = self.branch7x7x3_1(x)
        branch7x7x3 = self.branch7x7x3_2(branch7x7x3)
        branch7x7x3 = self.branch7x7x3_3(branch7x7x3)
        branch7x7x3 = self.branch7x7x3_4(branch7x7x3)
        branch_pool = F.max_pool2d(x, kernel_size=3, stride=2)
        outputs = [branch3x3, branch7x7x3, branch_pool]
        return torch.cat(outputs, 1)
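
A quick shape check (a sketch, matching the Mixed_7a configuration below):

block = InceptionD(768)
out = block(torch.randn(1, 768, 17, 17))
print(out.shape)  # torch.Size([1, 1280, 8, 8]): 320 + 192 + 768 = 1280 channels, spatial size halved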

InceptionE

Output: spatial size unchanged, 2048 channels. Suppose the input is (8, 8, 1280):

  • First branch:
    • branch1x1 applies 320 1×1 kernels, producing the first feature map (8, 8, 320);
  • Second branch:
    • branch3x3_1 first applies 384 1×1 kernels, giving (8, 8, 384),
      • sub-branch branch3x3_2a applies 384 1×3 kernels with padding (0, 1); the spatial size is unchanged, giving (8, 8, 384),
      • sub-branch branch3x3_2b applies 384 3×1 kernels with padding (1, 0); the spatial size is unchanged, giving (8, 8, 384),
    • so the second feature map is the concatenation of the two sub-branches: (8, 8, 384+384 = 768);
  • Third branch:
    • branch3x3dbl_1 first applies 448 1×1 kernels, giving (8, 8, 448),
    • then branch3x3dbl_2 applies 384 3×3 kernels with padding 1; the spatial size is unchanged, giving (8, 8, 384),
      • sub-branch branch3x3dbl_3a applies 384 1×3 kernels with padding (0, 1), giving (8, 8, 384),
      • sub-branch branch3x3dbl_3b applies 384 3×1 kernels with padding (1, 0), giving (8, 8, 384),
    • so the third feature map is the concatenation of the two sub-branches: (8, 8, 384+384 = 768);
  • Fourth branch:
    • avg_pool2d with a 3×3 kernel, stride 1, and padding 1 keeps size and channels, giving (8, 8, 1280),
    • then branch_pool applies 192 1×1 kernels, so the fourth feature map ends up as (8, 8, 192);
  • Finally the four feature maps are concatenated, yielding (8, 8, 320+768+768+192 = 2048); see the shape check after the code below.
class InceptionE(nn.Module):
    def __init__(self, in_channels):
        super(InceptionE, self).__init__()
        self.branch1x1 = BasicConv2d(in_channels, 320, kernel_size=1)
        self.branch3x3_1 = BasicConv2d(in_channels, 384, kernel_size=1)
        self.branch3x3_2a = BasicConv2d(384, 384, kernel_size=(1, 3), padding=(0, 1))
        self.branch3x3_2b = BasicConv2d(384, 384, kernel_size=(3, 1), padding=(1, 0))
        self.branch3x3dbl_1 = BasicConv2d(in_channels, 448, kernel_size=1)
        self.branch3x3dbl_2 = BasicConv2d(448, 384, kernel_size=3, padding=1)
        self.branch3x3dbl_3a = BasicConv2d(384, 384, kernel_size=(1, 3), padding=(0, 1))
        self.branch3x3dbl_3b = BasicConv2d(384, 384, kernel_size=(3, 1), padding=(1, 0))
        self.branch_pool = BasicConv2d(in_channels, 192, kernel_size=1)
    def forward(self, x):
        branch1x1 = self.branch1x1(x)
        branch3x3 = self.branch3x3_1(x)
        branch3x3 = [
            self.branch3x3_2a(branch3x3),
            self.branch3x3_2b(branch3x3),
        ]
        branch3x3 = torch.cat(branch3x3, 1)
        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = [
            self.branch3x3dbl_3a(branch3x3dbl),
            self.branch3x3dbl_3b(branch3x3dbl),
        ]
        branch3x3dbl = torch.cat(branch3x3dbl, 1)
        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)
        outputs = [branch1x1, branch3x3, branch3x3dbl, branch_pool]
        return torch.cat(outputs, 1)
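
A quick shape check (a sketch, matching the Mixed_7b configuration below):

block = InceptionE(1280)
out = block(torch.randn(1, 1280, 8, 8))
print(out.shape)  # torch.Size([1, 2048, 8, 8]): 320 + 768 + 768 + 192 = 2048 channels, size unchanged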

InceptionAux: the auxiliary classifier

An auxiliary classifier is attached to intermediate-layer features; its loss enters the final objective as a regularization term, which helps optimization and improves classification accuracy.
Structure: Pool -> 1x1 Conv -> 5x5 Conv -> FC. A shape check follows the code below.

class InceptionAux(nn.Module):
    def __init__(self, in_channels, num_classes):
        super(InceptionAux, self).__init__()
        self.conv0 = BasicConv2d(in_channels, 128, kernel_size=1)
        self.conv1 = BasicConv2d(128, 768, kernel_size=5)
        self.conv1.stddev = 0.01
        self.fc = nn.Linear(768, num_classes)
        self.fc.stddev = 0.001
    def forward(self, x):
        # 17 x 17 x 768
        x = F.avg_pool2d(x, kernel_size=5, stride=3)
        # 5 x 5 x 768
        x = self.conv0(x)
        # 5 x 5 x 128
        x = self.conv1(x)
        # 1 x 1 x 768
        x = x.view(x.size(0), -1)
        # 768
        x = self.fc(x)
        # 1000
        return x
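
A quick shape check of the head (a sketch; the 768-channel, 17×17 input matches where AuxLogits is attached in the full network):

aux = InceptionAux(768, num_classes=1000)
logits = aux(torch.randn(2, 768, 17, 17))
print(logits.shape)  # torch.Size([2, 1000])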

InceptionV3: the main network

  1. The input is a (299, 299, 3) image. It is first (optionally) re-normalized, then passed through 5 convolutions and 2 max-pooling layers. When transform_input is set, the per-channel affine transform below remaps an input normalized with mean 0.5 / std 0.5 to the ImageNet statistics (mean (0.485, 0.456, 0.406), std (0.229, 0.224, 0.225)).
    if self.transform_input: # 1
       x = x.clone()
       x[:, 0] = x[:, 0] * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
       x[:, 1] = x[:, 1] * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
       x[:, 2] = x[:, 2] * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
    # 299 x 299 x 3
    x = self.Conv2d_1a_3x3(x) # BasicConv2d(3, 32, kernel_size=3, stride=2)
    # 149 x 149 x 32
    x = self.Conv2d_2a_3x3(x) #BasicConv2d(32, 32, kernel_size=3)
    # 147 x 147 x 32
    x = self.Conv2d_2b_3x3(x) #BasicConv2d(32, 64, kernel_size=3, padding=1)
    # 147 x 147 x 64
    x = F.max_pool2d(x, kernel_size=3, stride=2)
    # 73 x 73 x 64
    x = self.Conv2d_3b_1x1(x)# BasicConv2d(64, 80, kernel_size=1)
    # 73 x 73 x 80
    x = self.Conv2d_4a_3x3(x)# BasicConv2d(80, 192, kernel_size=3)
    # 71 x 71 x 192
    x = F.max_pool2d(x, kernel_size=3, stride=2)
    # 35 x 35 x 192
    
  2. Next, the feature map passes through 3 InceptionA blocks, 1 InceptionB, 4 InceptionC, 1 InceptionD, and 2 InceptionE; the auxiliary classifier AuxLogits takes the output of the last InceptionC as its input.
    • InceptionA: spatial size unchanged, output channels = 224 + pool_features.
    • InceptionB: spatial size halved, channels increased by 480.
    • InceptionC: spatial size unchanged, 768 output channels.
    • InceptionD: spatial size halved, channels increased by 512.
    • InceptionE: spatial size unchanged, 2048 output channels.
    # 35 x 35 x 192
    x = self.Mixed_5b(x) # InceptionA(192, pool_features=32)
    # 35 x 35 x 256
    x = self.Mixed_5c(x) # InceptionA(256, pool_features=64)
    # 35 x 35 x 288
    x = self.Mixed_5d(x) # InceptionA(288, pool_features=64)
    # 35 x 35 x 288
    x = self.Mixed_6a(x) # InceptionB(288)
    # 17 x 17 x 768
    x = self.Mixed_6b(x) #InceptionC(768, channels_7x7=128)
    # 17 x 17 x 768
    x = self.Mixed_6c(x) # InceptionC(768, channels_7x7=160)
    # 17 x 17 x 768
    x = self.Mixed_6d(x) # InceptionC(768, channels_7x7=160)
    # 17 x 17 x 768
    x = self.Mixed_6e(x) # InceptionC(768, channels_7x7=192)
    # 17 x 17 x 768 
    if self.training and self.aux_logits:
        aux = self.AuxLogits(x) #InceptionAux(768, num_classes)
    # 17 x 17 x 768
    x = self.Mixed_7a(x) # InceptionD(768)
    # 8 x 8 x 1280
    x = self.Mixed_7b(x) # InceptionE(1280)
    # 8 x 8 x 2048
    x = self.Mixed_7c(x) # InceptionE(2048)
    # 8 x 8 x 2048
    
  3. The classification head: average pooling + dropout + flatten + fully connected output layer.
    # 8 x 8 x 2048
    x = F.avg_pool2d(x, kernel_size=8)
    # 1 x 1 x 2048
    x = F.dropout(x, training=self.training)
    # 1 x 1 x 2048
    x = x.view(x.size(0), -1)
    # 2048
    x = self.fc(x)
    # 1000 (num_classes)
    if self.training and self.aux_logits:
        return x, aux
    return x
    

Full code:

class InceptionV3(nn.Module):
    def __init__(self, num_classes=1000, aux_logits=True, transform_input=False):
        super(InceptionV3, self).__init__()
        self.aux_logits = aux_logits
        self.transform_input = transform_input
        self.Conv2d_1a_3x3 = BasicConv2d(3, 32, kernel_size=3, stride=2)
        self.Conv2d_2a_3x3 = BasicConv2d(32, 32, kernel_size=3)
        self.Conv2d_2b_3x3 = BasicConv2d(32, 64, kernel_size=3, padding=1)
        self.Conv2d_3b_1x1 = BasicConv2d(64, 80, kernel_size=1)
        self.Conv2d_4a_3x3 = BasicConv2d(80, 192, kernel_size=3)
        self.Mixed_5b = InceptionA(192, pool_features=32)
        self.Mixed_5c = InceptionA(256, pool_features=64)
        self.Mixed_5d = InceptionA(288, pool_features=64)
        self.Mixed_6a = InceptionB(288)
        self.Mixed_6b = InceptionC(768, channels_7x7=128)
        self.Mixed_6c = InceptionC(768, channels_7x7=160)
        self.Mixed_6d = InceptionC(768, channels_7x7=160)
        self.Mixed_6e = InceptionC(768, channels_7x7=192)
        if aux_logits:
            self.AuxLogits = InceptionAux(768, num_classes)
        self.Mixed_7a = InceptionD(768)
        self.Mixed_7b = InceptionE(1280)
        self.Mixed_7c = InceptionE(2048)
        self.fc = nn.Linear(2048, num_classes)
        for m in self.modules():
            if isinstance(m, nn.Conv2d) or isinstance(m, nn.Linear):
                import scipy.stats as stats
                stddev = m.stddev if hasattr(m, 'stddev') else 0.1
                X = stats.truncnorm(-2, 2, scale=stddev)
                values = torch.Tensor(X.rvs(m.weight.data.numel()))
                m.weight.data.copy_(values)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
    def forward(self, x):
        if self.transform_input: # 1
            x = x.clone()
            x[:, 0] = x[:, 0] * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
            x[:, 1] = x[:, 1] * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
            x[:, 2] = x[:, 2] * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        # 299 x 299 x 3
        x = self.Conv2d_1a_3x3(x)
        # 149 x 149 x 32
        x = self.Conv2d_2a_3x3(x)
        # 147 x 147 x 32
        x = self.Conv2d_2b_3x3(x)
        # 147 x 147 x 64
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        # 73 x 73 x 64
        x = self.Conv2d_3b_1x1(x)
        # 73 x 73 x 80
        x = self.Conv2d_4a_3x3(x)
        # 71 x 71 x 192
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        # 35 x 35 x 192
        x = self.Mixed_5b(x)
        # 35 x 35 x 256
        x = self.Mixed_5c(x)
        # 35 x 35 x 288
        x = self.Mixed_5d(x)
        # 35 x 35 x 288
        x = self.Mixed_6a(x)
        # 17 x 17 x 768
        x = self.Mixed_6b(x)
        # 17 x 17 x 768
        x = self.Mixed_6c(x)
        # 17 x 17 x 768
        x = self.Mixed_6d(x)
        # 17 x 17 x 768
        x = self.Mixed_6e(x)
        # 17 x 17 x 768
        if self.training and self.aux_logits:
            aux = self.AuxLogits(x)
        # 17 x 17 x 768
        x = self.Mixed_7a(x)
        # 8 x 8 x 1280
        x = self.Mixed_7b(x)
        # 8 x 8 x 2048
        x = self.Mixed_7c(x)
        # 8 x 8 x 2048
        x = F.avg_pool2d(x, kernel_size=8)
        # 1 x 1 x 2048
        x = F.dropout(x, training=self.training)
        # 1 x 1 x 2048
        x = x.view(x.size(0), -1)
        # 2048
        x = self.fc(x)
        # 1000 (num_classes)
        if self.training and self.aux_logits:
            return x, aux
        return x
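
Tying the three stages together, a minimal usage sketch (note that the truncated-normal weight init above requires scipy to be installed):

model = InceptionV3(num_classes=1000, aux_logits=True)
model.train()   # in training mode the forward pass returns (logits, aux)
logits, aux = model(torch.randn(2, 3, 299, 299))
print(logits.shape, aux.shape)  # torch.Size([2, 1000]) torch.Size([2, 1000])
model.eval()    # in eval mode the auxiliary head is skipped
print(model(torch.randn(2, 3, 299, 299)).shape)  # torch.Size([2, 1000])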

Reference

https://zhuanlan.zhihu.com/p/30172532
