Implementing the residual network ResNet50 in PyTorch

_-周-_

Last modified 2022-05-12 23:18:55

Category: Deep Learning. Tags: cnn, artificial intelligence, neural networks

First published 2022-05-11 22:31:28

Article link: https://blog.csdn.net/qq_53345829/article/details/124713855


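ResNet50 architecture (stage 1 has 3 residual blocks; stage 2 has 4 residual blocks <only 3 drawn in the figure>; stage 3 has 6 residual blocks <only 3 drawn in the figure>; stage 4 has 3 residual blocks):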

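A summary of the ResNet50 architecture:

1. The input image is 3*224*224. After a 7*7 convolution with 64 output channels and stride=2, followed by a pooling layer with kernel_size=3 and stride=2, we get a 64*56*56 feature map.

2. Next come the 4 stages. Stage 1 has 3 residual blocks, and each residual block has 3 convolution layers.
  2.1 --- Stage 1
    2.1.1 --- In the first residual block, only the last convolution changes the channel count, from 64 --> 256; the middle of the three convolutions is a 3*3 convolution.
            In the second and third residual blocks, the first layer reduces the input to 64 channels and the third convolution raises it back to 256.
  2.2 --- Stage 2
    2.2.1 --- In the first residual block, the first convolution has stride 2 and halves the feature map. Every block in this stage uses 128 channels internally and outputs 512 channels.
    2.2.2 --- Note that the feature map has been halved here, so a downsample branch is needed to give residual and out the same dimensions.
 *** The other stages follow the same idea as stage 2.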
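
The shapes in the summary above can be checked with the usual output-size formula, floor((n + 2p - k) / s) + 1. The following is only a sketch: the helper name `out_size` is made up for illustration, and the padding values (3 for the 7*7 convolution, 1 for the pooling layer) are the ones used in this article's code, not something fixed by the formula.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial size after a convolution or pooling layer (hypothetical helper)."""
    return (n + 2 * padding - kernel) // stride + 1

# Stem: 7*7 convolution with stride 2, then 3*3 max pool with stride 2
after_conv = out_size(224, kernel=7, stride=2, padding=3)
after_pool = out_size(after_conv, kernel=3, stride=2, padding=1)

# (bottleneck width, output channels, number of blocks, stride of the first block)
stages = [(64, 256, 3, 1), (128, 512, 4, 2), (256, 1024, 6, 2), (512, 2048, 3, 2)]

size = after_pool
shapes = []
for width, out_ch, blocks, stride in stages:
    # Only the first block of a stage uses the stage stride (on its 1*1 convolution);
    # all other layers in the stage keep the spatial size unchanged.
    size = out_size(size, kernel=1, stride=stride)
    shapes.append((out_ch, size, size))

print(after_conv, after_pool)
print(shapes)
```

Under these assumptions the stem gives 112 and then 56, and the four stages end at 256*56*56, 512*28*28, 1024*14*14 and 2048*7*7.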

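As the ResNet50 diagram shows, each residual block consists of 3 convolution layers, and its input and output channel counts differ. To define the residual block we need the number of input channels, the number of output channels, and downsample (the optional branch that brings the two tensors to the same dimensions before they are added). In ResNet50 every residual block has stride 1, except the first residual block of some stages, which has stride=2; the different strides are handled in the makelayer function.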

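****** Very important ******

Our first residual block takes a 64*56*56 input and produces a 256*56*56 output, so the second residual block receives 256 channels and its first layer has to convert 256 -> 64. I therefore added a midchannel parameter for the number of channels entering a block's first layer: after the first block, midchannel is set to outchannel, so each following block takes the previous block's output as its input.

The downsample branch maps the block's input dimensions directly to its output dimensions.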
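
To make the rule concrete, the sketch below compares the shape produced by the three convolutions with the shape of the untouched input; the shortcut can only be added directly when the two match. The helper names `out_size`, `block_shapes` and `needs_downsample` are hypothetical, and the stride is assumed to sit on the first 1*1 convolution, as in this article's block.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial size after a convolution (hypothetical helper)."""
    return (n + 2 * padding - kernel) // stride + 1

def block_shapes(in_ch, size, out_ch, stride):
    """Return (main-path shape, raw shortcut shape) for one bottleneck block."""
    s = out_size(size, kernel=1, stride=stride)      # 1*1 convolution, carries the stride
    s = out_size(s, kernel=3, stride=1, padding=1)   # 3*3 convolution, size unchanged
    s = out_size(s, kernel=1, stride=1)              # 1*1 convolution, raises the channels
    return (out_ch, s, s), (in_ch, size, size)

def needs_downsample(in_ch, size, out_ch, stride):
    main, shortcut = block_shapes(in_ch, size, out_ch, stride)
    return main != shortcut

stage1_first = needs_downsample(64, 56, 256, stride=1)    # channels change 64 -> 256
stage1_second = needs_downsample(256, 56, 256, stride=1)  # same shape in and out
stage2_first = needs_downsample(256, 56, 512, stride=2)   # channels and size both change
print(stage1_first, stage1_second, stage2_first)
```

Note that the first block of stage 1 needs the branch even though its stride is 1, because its channel count changes.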

# Build one residual block
import torch
import torch.nn as nn


class Bottleneck(nn.Module):
    def __init__(self, inchannel, midchannel, outchannel, stride=1, downsample=None):
        super(Bottleneck, self).__init__()

        # Layer 1: 1*1 convolution from midchannel (the channels entering the block)
        # to inchannel (the bottleneck width); this layer carries the block's stride
        self.conv1 = nn.Conv2d(in_channels=midchannel, out_channels=inchannel, kernel_size=1, stride=stride)
        self.bn1 = nn.BatchNorm2d(inchannel)

        # Layer 2: 3*3 convolution with padding=1, so the feature-map size stays the same
        self.conv2 = nn.Conv2d(in_channels=inchannel, out_channels=inchannel, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(inchannel)

        # Layer 3: 1*1 convolution that changes the channel count to outchannel
        self.conv3 = nn.Conv2d(in_channels=inchannel, out_channels=outchannel, kernel_size=1, stride=1)
        self.bn3 = nn.BatchNorm2d(outchannel)

        self.downsample = downsample

        self.relu = nn.ReLU(inplace=True)

    # Layers applied in the forward pass
    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        # No ReLU here: the activation comes after the shortcut is added
        out = self.conv3(out)
        out = self.bn3(out)

        # If a downsample branch was given, use it so that residual has the same
        # shape as out before the two tensors are added
        if self.downsample is not None:
            residual = self.downsample(residual)

        out += residual
        out = self.relu(out)

        return out

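With the residual block defined, we can now use it to build our ResNet network.

As the ResNet50 diagram shows, there are 4 stages: the first stage has 3 residual blocks (Bottleneck), the second stage has 4 residual blocks <3 drawn in the figure>, the third stage has 6 residual blocks <6 drawn in the figure>, and the fourth stage has 3 residual blocks <3 drawn in the figure>. Next we define the structure of our ResNet model. After the code, the explanation focuses on the makelayer function.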
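
As a quick check of where the "50" comes from, the block counts above can be added up. This is only arithmetic under the usual counting convention (the stem convolution, the three convolutions of every block, and the final fully connected layer; the 1*1 convolutions on the shortcut branches are not counted).

```python
blocks_per_stage = [3, 4, 6, 3]   # residual blocks in stages 1 to 4
convs_per_block = 3               # each Bottleneck has three convolutions

# stem convolution + convolutions inside the blocks + final fully connected layer
layers = 1 + convs_per_block * sum(blocks_per_stage) + 1
print(layers)
```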

class restnet(nn.Module):
    def __init__(self):
        super(restnet, self).__init__()

        # Before the residual blocks, ResNet has one convolution and one max pool, so define them first.
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=7, stride=2, padding=3)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool1 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        # The output at this point is 64*56*56

        # Channels produced by the previous layer; makelayer reads and updates this value
        self.prev_channel = 64

        # This stage has 3 residual blocks, each with three convolutions: the first is 1*1 with 64 channels,
        # the second is 3*3 with 64 channels (size unchanged), the third is 1*1 with 256 channels
        self.stage1 = self.makelayer(inchannel=64, outchannel=256, block_num=3, stride=1)  # 256*56*56

        self.stage2 = self.makelayer(inchannel=128, outchannel=512, block_num=4, stride=2)  # 512*28*28

        self.stage3 = self.makelayer(inchannel=256, outchannel=1024, block_num=6, stride=2)  # 1024*14*14

        self.stage4 = self.makelayer(inchannel=512, outchannel=2048, block_num=3, stride=2)  # 2048*7*7

        self.avgpool = nn.AvgPool2d(7)
        self.fc = nn.Linear(512 * 4, 1000)

    def forward(self, x):
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.maxpool1(out)

        out = self.stage1(out)
        out = self.stage2(out)
        out = self.stage3(out)
        out = self.stage4(out)

        out = self.avgpool(out)

        out = torch.flatten(out, 1)
        out = self.fc(out)
        return out

    # Build one stage out of residual blocks with a custom makelayer method
    def makelayer(self, inchannel, outchannel, block_num, stride=1):

        block_list = []

        # The first block receives the output channels of the previous stage
        # (256 for stage 2), which is not the same as the bottleneck width inchannel
        midchannel = self.prev_channel

        # The shortcut needs a 1*1 convolution whenever the first block changes
        # the spatial size (stride 2) or the channel count
        downsample = None
        if stride != 1 or midchannel != outchannel:
            downsample = nn.Sequential(
                nn.Conv2d(midchannel, outchannel, kernel_size=1, stride=stride),
                nn.BatchNorm2d(outchannel)
            )

        # Conv_Block: first block of the stage
        conv_block = Bottleneck(inchannel, midchannel, outchannel, stride=stride, downsample=downsample)
        block_list.append(conv_block)

        # Identity Blocks: input channels equal the previous block's output channels
        midchannel = outchannel
        for i in range(block_num - 1):
            conv_block = Bottleneck(inchannel, midchannel, outchannel, stride=1, downsample=None)
            block_list.append(conv_block)

        # Remember this stage's output channels for the next stage
        self.prev_channel = outchannel

        return nn.Sequential(*block_list)
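
One consequence of the head used above, nn.AvgPool2d(7) followed by nn.Linear(512 * 4, 1000), is that the network expects 224*224 inputs. The sketch below traces only the spatial sizes in plain Python; `out_size` and `fc_features` are hypothetical helpers, and the strides and paddings are copied from the code above.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial size after a convolution or pooling layer (hypothetical helper)."""
    return (n + 2 * padding - kernel) // stride + 1

def fc_features(input_size):
    """Number of values per image that reach the fully connected layer."""
    n = out_size(input_size, kernel=7, stride=2, padding=3)   # conv1
    n = out_size(n, kernel=3, stride=2, padding=1)            # maxpool1
    for stride in (1, 2, 2, 2):                               # stage1 .. stage4
        n = out_size(n, kernel=1, stride=stride)
    n = out_size(n, kernel=7, stride=7)   # AvgPool2d(7): stride defaults to the kernel size
    return 2048 * n * n

features_224 = fc_features(224)
features_448 = fc_features(448)
print(features_224, features_448)
```

With a 224*224 input the pooled map is 1*1 and the flattened vector has 2048 = 512 * 4 values, which is what the fc layer expects; with a 448*448 input it would have 8192. If other input sizes matter, replacing the pooling layer with nn.AdaptiveAvgPool2d((1, 1)) is one common option, though that is a change to the article's model rather than part of it.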

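The makelayer function:

makelayer is a custom function that builds one stage. It takes 4 arguments: 1. inchannel, the stage's bottleneck width, that is, the channel count used by the first two convolutions of every block (for stage 1 this also equals the stage's input channels); 2. outchannel, the stage's output channels; 3. block_num, how many residual blocks the stage has; 4. stride, the stride of the stage's first residual block.

As the ResNet50 diagram shows, downsample is needed when the stride is s=2, because the feature map shrinks. It is also needed in the first block of stage 1, where the stride is 1 but the channel count changes from 64 to 256. So makelayer builds a downsample layer for the first block of the stage and passes it to Bottleneck, whose forward pass applies it to the shortcut; all remaining blocks receive downsample=None.

1. Create a list, block_list, to collect the blocks.

2. Build the downsample branch for the stage's first block, which is required whenever that block changes the stride or the channel count.

3. Define the first residual block of each stage separately, because it is the only block that may have stride 2 and the only one that receives downsample. The remaining block_num - 1 blocks are then added in a loop.

**** Note ****

There is a variable called midchannel. The first residual block takes a 64*56*56 input and produces a 256*56*56 output, so in the second residual block the first layer still outputs 64 channels, but its input now has 256 channels and is no longer inchannel=64. midchannel therefore records the output channels of the previous residual block.

In other words, midchannel is the channel count that enters a block, and the block's first layer converts midchannel to inchannel.

downsample works the same way: its input has midchannel channels and its output has outchannel channels.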
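
The bookkeeping in the note above can be written out as a small plain-Python plan, independent of PyTorch. This is a sketch, not the article's code: `plan_stage` and its `stage_in` argument (the channels actually arriving from the previous stage) are names introduced here for illustration.

```python
def plan_stage(stage_in, inchannel, outchannel, block_num, stride):
    """List (midchannel, inchannel, outchannel, stride, has_downsample) for each block."""
    plan = []
    midchannel = stage_in                      # channels entering the first block
    for i in range(block_num):
        block_stride = stride if i == 0 else 1
        # A shortcut projection is needed when the size or the channel count changes
        has_downsample = block_stride != 1 or midchannel != outchannel
        plan.append((midchannel, inchannel, outchannel, block_stride, has_downsample))
        midchannel = outchannel                # next block takes this block's output
    return plan

stage1 = plan_stage(64, 64, 256, block_num=3, stride=1)
stage2 = plan_stage(256, 128, 512, block_num=4, stride=2)
for row in stage1 + stage2:
    print(row)
```

In this plan only the first block of each stage gets a downsample branch, and from stage 2 onward the first block's midchannel (256) differs from the bottleneck width (128).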

[Figures: residual block one; residual block two]

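A residual network has two kinds of residual block.

The first is the Conv_Block, which changes the dimensions of its input, so it needs downsample to bring the shortcut to the same dimensions.

The second is the Identity_Block, which simply stacks the three convolutions. Its output has the same dimensions as its input (the channel count changes only inside the block), so the input can be added to the output directly.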