Week J8 Check-in (Inception V1)

Latest recommended article published on 2024-07-02 21:42:18

Pinospot


Tags: deep learning, pytorch, artificial intelligence

Article link: https://blog.csdn.net/Pinospot/article/details/129839928


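🍨 This post is a study-log entry for the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp)
🍖 Original author: K同学啊 | tutoring and custom projects available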

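This week's tasks:

1. Understand and learn how the computation cost of the convolution layers in Figure 2 is calculated (background knowledge -> computing the cost of a convolution layer; my derivation is included, but it is best to derive it by hand first and read it afterwards) (done)

2. Understand and learn the parallel structure of the convolution layers and the 1x1 convolution kernel (key point) (done)

3. Try to write the corresponding PyTorch code from the model architecture diagram (I modified the Inception module, built the network directly with a for loop, and tested at the end that the code runs)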

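Characteristics of the Inception module

An Inception module consists of four parallel branches. Smaller convolution kernels cost less computation than large ones, and kernels of different sizes have different receptive fields, so the branches extract information at different scales. Fusing their outputs gives a better image representation.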
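To make the cost argument concrete, here is a small sketch that counts multiply-accumulate operations (MACs) for a bias-free convolution as in_channels * out_channels * k * k * H * W. It compares a direct 5x5 convolution with a 1x1 reduction followed by a 5x5 convolution. The channel counts (192 in, 16 after the reduction, 32 out) and the 28 x 28 feature map are illustrative assumptions, not values taken from this post's network.

```python
def conv_params(in_ch, out_ch, k):
    """Number of weights in a k x k convolution without bias."""
    return in_ch * out_ch * k * k


def conv_macs(in_ch, out_ch, k, h, w):
    """Multiply-accumulate operations needed for an h x w output feature map."""
    return conv_params(in_ch, out_ch, k) * h * w


# Illustrative values (assumptions for this example only)
in_ch, mid_ch, out_ch = 192, 16, 32
h = w = 28

# Direct 5x5 convolution: 192 -> 32 channels
direct = conv_macs(in_ch, out_ch, 5, h, w)

# 1x1 reduction (192 -> 16) followed by 5x5 convolution (16 -> 32)
reduced = conv_macs(in_ch, mid_ch, 1, h, w) + conv_macs(mid_ch, out_ch, 5, h, w)

print(direct)    # 120422400
print(reduced)   # 12443648
print(direct / reduced)
```

Under these assumptions the 1x1 reduction cuts the cost of the 5x5 branch by a factor of roughly 9.7.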

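Building the Inception module

I wrote four classes, one for each of four parts (basic convolution module -> feature-extraction network -> backbone -> classification network), and then assembled them in GoogleNet.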

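Build approach

In the original GoogleNet, the channel counts of the four Inception branches are set by hand and follow no explicit numerical relationship, which makes the network somewhat tedious to build. To simplify construction, I followed ResNet's approach and fixed the ratios between the four branches' channel counts myself, so the parameters of the resulting model differ from the original. The settings are:

1. The backbone is split into three parts at the positions of the pooling layers.

2. The first branch outputs 2 times the default output channel count.

3. The second branch outputs 3 times the default output channel count.

4. The third branch outputs 2 times the default output channel count.

5. The fourth branch outputs the default output channel count.

Given the default output channel count, one Inception module is therefore fully specified, and its output has 8 times that many channels.

Within each part, every Inception module except the first takes the previous module's output channels as its input channels, so a for loop builds each part quickly.

6. The last part has no pooling layer, so the layer-building loop needs one extra condition.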
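The channel bookkeeping above can be sketched in plain Python, without PyTorch. The helper names `inception_out_channels` and `stage_plan` are hypothetical and introduced only for this illustration; the stage settings (input channels 192/512/1024, default output channels 64/128/256, block counts 2/5/2) are the ones passed to `Google` in the test code of this post.

```python
def inception_out_channels(ouf):
    """Branch widths under the 2x / 3x / 2x / 1x rule, plus their sum."""
    branches = [2 * ouf, 3 * ouf, 2 * ouf, ouf]
    return branches, sum(branches)


def stage_plan(inf, ouf, block_num):
    """(in_channels, out_channels) of each Inception block in one stage."""
    plan = []
    for _ in range(block_num):
        _, out = inception_out_channels(ouf)
        plan.append((inf, out))
        inf = out  # the next block consumes the previous block's output
    return plan


for inf, ouf, n in zip([192, 512, 1024], [64, 128, 256], [2, 5, 2]):
    print(stage_plan(inf, ouf, n))
```

Each stage's output (512, 1024, 2048) matches the next stage's input, which is why only the default output channel count and the block count need to be chosen per stage.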

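Issues

1. The linear layers could be replaced by 1 * 1 convolutions. As written, the linear layer parameters have to be redesigned every time the input image size changes.

2. I did not run the network on real images, but the code-test section shows that the code runs.

3. Because every module multiplies the default channel count by 8, the parameter count is many times larger than the original's, but the main idea is unchanged.

4. Only the backbone is implemented; the auxiliary classifiers are not included.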
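One possible way to address the first issue is sketched below. It is an assumption-laden illustration, not the code used in this post: the class name `ConvHead` is hypothetical, it swaps the fixed-size average pool for `nn.AdaptiveAvgPool2d(1)` so that any feature-map size is reduced to 1 x 1, it uses 1x1 convolutions in place of the linear layers, and it returns raw logits without a softmax. Small channel counts are used to keep the example light.

```python
import torch
import torch.nn as nn


class ConvHead(nn.Module):
    """Hypothetical classification head: global pooling + 1x1 convolutions."""

    def __init__(self, in_channels, hidden_channels, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # any H x W -> 1 x 1
        self.drop = nn.Dropout(0.4)
        self.fc = nn.Sequential(
            nn.Conv2d(in_channels, hidden_channels, kernel_size=1),
            nn.ReLU(),
            nn.Conv2d(hidden_channels, num_classes, kernel_size=1),
        )

    def forward(self, x):
        out = self.pool(x)
        out = self.drop(out)
        out = self.fc(out)                       # (N, num_classes, 1, 1)
        return torch.flatten(out, start_dim=1)   # (N, num_classes)


head = ConvHead(in_channels=64, hidden_channels=32, num_classes=10).eval()
for size in (19, 7):
    logits = head(torch.ones(1, 64, size, size))
    print(size, tuple(logits.shape))
```

With this head the same weights accept both the 19 x 19 and the 7 x 7 feature map, so the classifier would no longer need to be redesigned when the input resolution changes.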

Code

Import packages

import torch
import torch.nn as nn

Build the modules

# Basic convolution module: Conv2d -> BatchNorm2d -> ReLU
class BaseConv(nn.Sequential):
    def __init__(self, inf, ouf, kernel_size=1, stride=1, padding=0):
        super(BaseConv, self).__init__()
        # Submodules assigned on an nn.Sequential are applied in assignment order
        self.conv = nn.Conv2d(inf, ouf, kernel_size=kernel_size, stride=stride, padding=padding, bias=False)
        self.bn = nn.BatchNorm2d(ouf)
        self.act = nn.ReLU()

# Feature-extraction module (stem); layers run in the order they are assigned
class feature_extra(nn.Sequential):
    def __init__(self):
        super(feature_extra, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.max1 = nn.MaxPool2d(kernel_size=3, stride=2)
        self.bn1 = nn.BatchNorm2d(64)
        # 1x1 convolution, so no padding is needed
        self.conv2 = nn.Conv2d(64, 64, kernel_size=1, stride=1, padding=0, bias=False)
        self.conv3 = nn.Conv2d(64, 192, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(192)
        self.max2 = nn.MaxPool2d(kernel_size=3, stride=2)

# Inception module: four parallel branches concatenated along the channel axis
class Inception(nn.Module):
    def __init__(self, inf, ouf):
        super(Inception, self).__init__()
        # branch1: 1x1 conv, 2 * ouf output channels
        ouf1 = ouf * 2
        self.branch1 = BaseConv(inf, ouf1)
        # branch2: 1x1 conv then 3x3 conv, 3 * ouf output channels
        ouf2 = ouf * 3
        self.branch2 = nn.Sequential(BaseConv(inf, ouf),
                                     BaseConv(ouf, ouf2, kernel_size=3, padding=1))
        # branch3: 1x1 conv then 5x5 conv, 2 * ouf output channels
        ouf3 = ouf * 2
        self.branch3 = nn.Sequential(BaseConv(inf, ouf),
                                     BaseConv(ouf, ouf3, kernel_size=5, padding=2))
        # branch4: 3x3 max pool then 1x1 conv, ouf output channels
        self.branch4 = nn.Sequential(nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                                     BaseConv(inf, ouf))

    def forward(self, x):
        x1 = self.branch1(x)
        x2 = self.branch2(x)
        x3 = self.branch3(x)
        x4 = self.branch4(x)

        # Concatenate on dim 1 (channels): 2 + 3 + 2 + 1 = 8 times ouf
        out = torch.cat([x1, x2, x3, x4], 1)

        return out

# Classification network
class classes(nn.Module):
    def __init__(self, inf, ouf, class_number):
        super(classes, self).__init__()
        # Fixed 19x19 average pool: expects a 19x19 feature map (as produced by a 640x640 input)
        self.av = nn.AvgPool2d(kernel_size=19, stride=1, padding=0)
        self.drop = nn.Dropout(0.4)
        self.cla = nn.Sequential(nn.Linear(inf, ouf),
                                 nn.ReLU(),
                                 nn.Linear(ouf, class_number),
                                 nn.Softmax(dim=1))

    def forward(self, x):
        out = self.av(x)
        out = self.drop(out)
        out = torch.flatten(out, start_dim=1)
        out = self.cla(out)
        return out

# GoogleNet
class Google(nn.Module):
    def __init__(self, block, inf, ouf, block_num, downsample=None, class_number=1000):
        super(Google, self).__init__()
        # Stem (feature extraction)
        self.feature_extra = feature_extra()

        # Backbone: three stages of Inception blocks; the last stage has no pooling layer
        self.Ince1 = self._make_layer(block, inf[0], ouf[0], block_num[0])
        self.Ince2 = self._make_layer(block, inf[1], ouf[1], block_num[1])
        self.Ince3 = self._make_layer(block, inf[2], ouf[2], block_num[2], mpl=False)

        # Classifier: the last Inception block outputs ouf[2] * 8 channels
        self.classes = classes(ouf[2] * 8, ouf[2] * 8, class_number=class_number)

    def _make_layer(self, block, inf, ouf, block_num, mpl=True):
        layers = []
        layers.append(block(inf, ouf))
        # Every block outputs ouf * 8 channels, which feed the next block
        inf = ouf * 8
        for _ in range(1, block_num):
            layers.append(block(inf, ouf))
            inf = ouf * 8

        # All stages except the last end with a max-pooling layer
        if mpl:
            layers.append(nn.MaxPool2d(kernel_size=3, stride=2))

        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.feature_extra(x)

        out = self.Ince1(out)
        out = self.Ince2(out)
        out = self.Ince3(out)

        out = self.classes(out)
        return out

Code test

If everything below runs, each module works (I ran it directly in Jupyter).

net1 = feature_extra()
# print(net1)

input_feature1 = torch.ones((1, 3, 28, 28))
out1 = net1(input_feature1)
print(out1.shape)

net2 = Inception(256, 64)
# print(net2)

input_feature2 = torch.ones((1, 256, 28, 28))
out2 = net2(input_feature2)
print(out2.shape)

net3 = classes(1024, 1024, 1000)
# print(net3)

input_feature3 = torch.ones((1, 1024, 19, 19))
out3 = net3(input_feature3)
print(out3.shape)

# Set the GoogleNet parameters
def google(classes_number=1000):
    inf = [192, 512, 1024]
    ouf = [64, 128, 256]
    block_number = [2, 5, 2]
    return Google(Inception, inf=inf, ouf=ouf, block_num=block_number, class_number=classes_number)

net4 = google()
# Print the detailed structure
print(net4)

# Named x rather than input to avoid shadowing the Python builtin
x = torch.ones((1, 3, 640, 640))

out = net4(x)
print(out.shape)
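The 19 x 19 feature map expected by the classification network can be checked by hand. The sketch below applies the standard output-size formula floor((size + 2 * padding - kernel) / stride) + 1 to the stride and kernel settings read off the code in this post. The helper names `out_size` and `final_feature_size` are hypothetical, and the padding of the 1x1 convolution in the stem is left as a parameter because the result for a 640 x 640 input is the same whether it is 0 or 1.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output side length of a conv/pool layer with floor rounding."""
    return (size + 2 * padding - kernel) // stride + 1


def final_feature_size(size, conv2_padding):
    """Side length of the feature map that reaches the classification network."""
    size = out_size(size, 7, 2, 3)              # conv1: 7x7, stride 2, padding 3
    size = out_size(size, 3, 2)                 # max1: 3x3, stride 2
    size = out_size(size, 1, 1, conv2_padding)  # conv2: 1x1
    size = out_size(size, 3, 1, 1)              # conv3: 3x3, padding 1
    size = out_size(size, 3, 2)                 # max2: 3x3, stride 2
    # Inception blocks keep the spatial size; the first two stages end with a max pool
    size = out_size(size, 3, 2)
    size = out_size(size, 3, 2)
    return size


for padding in (0, 1):
    print(padding, final_feature_size(640, padding), final_feature_size(224, padding))
```

For a 640 x 640 input the trace ends at 19, matching `AvgPool2d(kernel_size=19)`. For a 224 x 224 input it ends at 6, so the fixed 19 x 19 pooling window would no longer fit, which is the size dependence noted in the issues list.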

Network structure and parameters for reference

from torchinfo import summary

model = net4

summary(model)

The result is as follows
======================================================================
Layer (type:depth-idx)                        Param #
======================================================================
Google                                        --
├─feature_extra: 1-1                          --
│    └─Conv2d: 2-1                            9,408
│    └─MaxPool2d: 2-2                         --
│    └─BatchNorm2d: 2-3                       128
│    └─Conv2d: 2-4                            4,096
│    └─Conv2d: 2-5                            110,592
│    └─BatchNorm2d: 2-6                       384
│    └─MaxPool2d: 2-7                         --
├─Sequential: 1-2                             --
│    └─Inception: 2-8                         --
│    │    └─BaseConv: 3-1                     24,832
│    │    └─Sequential: 3-2                   123,392
│    │    └─Sequential: 3-3                   217,472
│    │    └─Sequential: 3-4                   12,416
│    └─Inception: 2-9                         --
│    │    └─BaseConv: 3-5                     65,792
│    │    └─Sequential: 3-6                   143,872
│    │    └─Sequential: 3-7                   237,952
│    │    └─Sequential: 3-8                   32,896
│    └─MaxPool2d: 2-10                        --
├─Sequential: 1-3                             --
│    └─Inception: 2-11                        --
│    │    └─BaseConv: 3-9                     131,584
│    │    └─Sequential: 3-10                  508,928
│    │    └─Sequential: 3-11                  885,504
│    │    └─Sequential: 3-12                  65,792
│    └─Inception: 2-12                        --
│    │    └─BaseConv: 3-13                    262,656
│    │    └─Sequential: 3-14                  574,464
│    │    └─Sequential: 3-15                  951,040
│    │    └─Sequential: 3-16                  131,328
│    └─Inception: 2-13                        --
│    │    └─BaseConv: 3-17                    262,656
│    │    └─Sequential: 3-18                  574,464
│    │    └─Sequential: 3-19                  951,040
│    │    └─Sequential: 3-20                  131,328
│    └─Inception: 2-14                        --
│    │    └─BaseConv: 3-21                    262,656
│    │    └─Sequential: 3-22                  574,464
│    │    └─Sequential: 3-23                  951,040
│    │    └─Sequential: 3-24                  131,328
│    └─Inception: 2-15                        --
│    │    └─BaseConv: 3-25                    262,656
│    │    └─Sequential: 3-26                  574,464
│    │    └─Sequential: 3-27                  951,040
│    │    └─Sequential: 3-28                  131,328
│    └─MaxPool2d: 2-16                        --
├─Sequential: 1-4                             --
│    └─Inception: 2-17                        --
│    │    └─BaseConv: 3-29                    525,312
│    │    └─Sequential: 3-30                  2,033,664
│    │    └─Sequential: 3-31                  3,540,480
│    │    └─Sequential: 3-32                  262,656
│    └─Inception: 2-18                        --
│    │    └─BaseConv: 3-33                    1,049,600
│    │    └─Sequential: 3-34                  2,295,808
│    │    └─Sequential: 3-35                  3,802,624
│    │    └─Sequential: 3-36                  524,800
├─classes: 1-5                                --
│    └─AvgPool2d: 2-19                        --
│    └─Dropout: 2-20                          --
│    └─Sequential: 2-21                       --
│    │    └─Linear: 3-37                      4,196,352
│    │    └─ReLU: 3-38                        --
│    │    └─Linear: 3-39                      2,049,000
│    │    └─Softmax: 3-40                     --
======================================================================
Total params: 30,533,288
Trainable params: 30,533,288
Non-trainable params: 0
======================================================================
