Notes on the ResNet Paper and Its Reproduction in Code (CIFAR-10 Dataset)

I. Paper Reading Notes

1. What Is BN?

Batch Normalization (BN) is a layer of the network in its own right, just like activation, convolution, fully connected, and pooling layers. In ResNet it takes the place of the LRN (Local Response Normalization) layer used in earlier architectures. Its purpose is to adjust the intermediate feature maps so that, within a batch, each channel follows a distribution with mean 0 and variance 1, which speeds up convergence.
The BN layer is normally placed between a convolution layer (Conv) and the activation layer (e.g. ReLU), and the convolution should not use a bias: the BN output y_i is identical with or without the bias, so keeping it only adds parameters and makes training harder.
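A minimal sketch of this placement (the layer sizes are illustrative): a Conv -> BN -> ReLU stack in which the convolution drops its bias, since BN subtracts the per-channel batch mean and then applies its own learnable shift, making a preceding constant bias redundant.

import torch
import torch.nn as nn

# Conv -> BN -> ReLU; bias=False because BN's mean subtraction (plus its own learnable shift)
# cancels any constant bias the convolution could add.
conv_bn_relu = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(64),   # normalizes each channel over the batch towards mean 0, variance 1, then rescales
    nn.ReLU(inplace=True),
)

x = torch.randn(8, 3, 32, 32)   # a batch of 8 RGB 32x32 images
print(conv_bn_relu(x).shape)    # torch.Size([8, 64, 32, 32])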

2. ResNet Parameter Configuration

(Figure: ResNet configuration table from the paper)
The input first goes through conv1, a 7×7, 64-filter convolution with stride 2, and then enters conv2_x, which begins with a 3×3, stride-2 max pooling. After that come the residual blocks of conv2_x, conv3_x, conv4_x and conv5_x; there is no pooling between these residual stages.
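As a quick sanity check of these strides (a sketch with a 224×224 ImageNet-sized input, as in the original paper):

import torch
import torch.nn as nn

stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),  # conv1: 7x7, 64 filters, stride 2
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),                  # conv2_x starts with a 3x3, stride-2 max pool
)

x = torch.randn(1, 3, 224, 224)
print(stem(x).shape)   # torch.Size([1, 64, 56, 56]): 224 -> 112 after conv1, -> 56 after the max pool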

3. The Two Residual Block Structures

(Figure: the two residual block structures; BasicBlock on the left, Bottleneck on the right)
In the figure above, the left-hand block is the residual structure used by networks with fewer than 50 layers (e.g. ResNet18/34), and the right-hand block is used by networks with 50 layers or more (e.g. ResNet50/101/152). In the 1x1 conv -> 3x3 conv -> 1x1 conv structure of ResNet50/101/152, the first 1x1 convolution reduces the channel count, the 3x3 convolution then extracts features, and the second 1x1 convolution expands the channels back (restores the dimension). The residual connection here is an add operation: two tensors of identical shape are summed element-wise.
Note: as the figure shows, the channel count changes inside the block. The 1x1 convolution layers adjust the number of channels of the feature map so that it can be added to the identity mapping x. Just as importantly, these 1x1 layers sharply reduce the number of parameters, which is why the deeper networks use the Bottleneck rather than the BasicBlock.
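A rough parameter count makes the savings concrete (a sketch; 256 channels is the output width of the conv2_x stage in ResNet50):

import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# what two plain 3x3 convolutions would cost at a width of 256 channels
basic_style = nn.Sequential(
    nn.Conv2d(256, 256, 3, padding=1, bias=False),
    nn.Conv2d(256, 256, 3, padding=1, bias=False),
)

# Bottleneck: 1x1 reduces 256 -> 64, the 3x3 works on 64 channels, 1x1 restores 64 -> 256
bottleneck_style = nn.Sequential(
    nn.Conv2d(256, 64, 1, bias=False),
    nn.Conv2d(64, 64, 3, padding=1, bias=False),
    nn.Conv2d(64, 256, 1, bias=False),
)

print(n_params(basic_style))       # 1179648
print(n_params(bottleneck_style))  # 69632, roughly 17x fewer parameters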
(Figure: a residual block; the identity shortcut X is added to the residual mapping F(X))
As shown in the figure, the input X splits into two paths: X itself is the identity mapping and F(X) is the residual mapping. The two are summed and then passed through the activation, so the output is ReLU(F(X)+X).
The role of the residual F(X) is to correct the error of the identity mapping X so that the network fits better. If X is already good enough, the parameters of the residual branch can all go to 0 so that F(X)=0; if X is not good enough, F(X) refines it.
Because of the identity mapping X, gradients can flow directly from deep layers back to shallow layers during backpropagation, which mitigates vanishing and exploding gradients; the small autograd example below makes this concrete.
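A minimal scalar sketch with autograd: because of the identity term, the gradient through the block is the residual branch's gradient plus 1, so it stays well away from zero even when the residual branch contributes almost nothing.

import torch

x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(1e-6)     # a "residual branch" with a tiny weight
y = w * x + x              # F(x) + x, with F(x) = w * x
y.backward()
print(x.grad.item())       # about 1.000001 = w + 1: the identity path contributes the constant 1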

4. ResNet50 Architecture Diagram (Reference Link)

(Figure: ResNet50 architecture diagram)

II. Code Reproduction Notes (PyTorch)

1. The ResNet Framework

import torch.nn as nn
import torch


# BasicBlock is used by ResNet18/34
class BasicBlock(nn.Module):  # two 3x3 convolutions; F(X) and X have the same number of channels
    # expansion is the factor by which F(X) expands the channel count relative to X
    expansion=1   # 1 means the residual mapping F(X) keeps the channel count unchanged (downsample=None)

    # downsample reshapes the identity branch so that it matches the residual branch and the two can be added
    def __init__(self,in_channel,out_channel,stride=1,downsample=None,**kwargs):
        super(BasicBlock,self).__init__()
        self.conv1=nn.Conv2d(in_channels=in_channel,out_channels=out_channel,kernel_size=3,stride=stride,padding=1,bias=False)
        self.bn1=nn.BatchNorm2d(out_channel)   # the BN layer sits between the conv and the ReLU

        self.conv2=nn.Conv2d(in_channels=out_channel,out_channels=out_channel,kernel_size=3,stride=1,padding=1,bias=False)
        self.bn2=nn.BatchNorm2d(out_channel)

        self.relu=nn.ReLU(inplace=True)   # inplace=True applies the activation in place, overwriting the input's memory instead of allocating a new tensor, which saves memory
        self.downsample=downsample

    def forward(self, x):
        identity=x
        if self.downsample is not None:
            identity=self.downsample(x)

        out=self.conv1(x)
        out=self.bn1(out)
        out=self.relu(out)

        out=self.conv2(out)
        out=self.bn2(out)
        # out=F(x)+x
        out+=identity
        out=self.relu(out)

        return out


# Bottleneck is the residual block used by ResNet50/101/152
class Bottleneck(nn.Module): # three convolutions; F(X) and X differ in channel count
    """
    Note: in the original paper, on the main branch of the dashed (downsampling) residual block,
    the first 1x1 convolution has stride 2 and the 3x3 convolution has stride 1.
    In the official PyTorch implementation, the first 1x1 convolution has stride 1 and the 3x3
    convolution has stride 2; this variant improves top-1 accuracy by roughly 0.5%.
    """
    # expansion is the factor by which F(X) expands the channel count relative to X
    expansion=4

    def __init__(self,in_channel,out_channel,stride=1,downsample=None):
        super(Bottleneck,self).__init__()

        self.conv1=nn.Conv2d(in_channels=in_channel,out_channels=out_channel,kernel_size=1,stride=1,bias=False)  # 1x1 conv: reduce the channel count
        self.bn1=nn.BatchNorm2d(out_channel)

        self.conv2=nn.Conv2d(in_channels=out_channel,out_channels=out_channel,kernel_size=3,stride=stride,padding=1,bias=False) # 3x3 conv: extract features (carries the block's stride)
        self.bn2=nn.BatchNorm2d(out_channel)

        self.conv3=nn.Conv2d(in_channels=out_channel,out_channels=out_channel*self.expansion,kernel_size=1,stride=1,bias=False) # 1x1 conv: restore (expand) the channel count
        self.bn3=nn.BatchNorm2d(out_channel*self.expansion)

        self.relu=nn.ReLU(inplace=True)
        self.downsample=downsample

    def forward(self, x):
        identity=x
        # downsample reshapes the identity branch so that it matches the residual branch and the two can be added
        if self.downsample is not None:
            identity=self.downsample(x)

        out=self.conv1(x)
        out=self.bn1(out)
        out=self.relu(out)

        out=self.conv2(out)
        out=self.bn2(out)
        out=self.relu(out)

        out=self.conv3(out)
        out=self.bn3(out)

        # out=F(X)+X
        out+=identity
        out=self.relu(out)

        return out
class ResNet(nn.Module):
    def __init__(self,
                 block,  # the residual block type to use
                 blocks_num,  # the number of residual blocks in each stage
                 num_classes=10,  # number of classes in the training set
                 include_top=True,  # whether to append pooling, fc and softmax after the residual stages
                 ):

        super(ResNet,self).__init__()
        self.include_top=include_top
        self.in_channel=64   # channel count output by the first convolution, and the input channel count of the stages that follow

        # the input image has 3 RGB channels, so the network's input has depth 3
        self.conv1=nn.Conv2d(3,self.in_channel,kernel_size=7,stride=2,padding=3,bias=False)
        self.bn1=nn.BatchNorm2d(self.in_channel)
        self.relu=nn.ReLU(inplace=True)

        self.maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1)

        # _make_layer(block type, filters in the block's first conv, number of blocks, stride) builds one stage of consecutive residual blocks
        self.layer1=self._make_layer(block,64,blocks_num[0])
        self.layer2=self._make_layer(block,128,blocks_num[1],stride=2)
        self.layer3=self._make_layer(block,256,blocks_num[2],stride=2)
        self.layer4=self._make_layer(block,512,blocks_num[3],stride=2)

        if self.include_top:  # defaults to True: append pooling, fc and softmax
            self.avgpool=nn.AdaptiveAvgPool2d((1,1)) # adaptive average pooling: whatever the input spatial size, the output is 1x1
            # the pooled feature map is flattened into a vector of length 512*block.expansion
            self.fc=nn.Linear(512*block.expansion,num_classes)  # fully connected layer: 512*block.expansion inputs, num_classes outputs

        for m in self.modules():  # weight initialization
            if isinstance(m,nn.Conv2d): # isinstance() checks whether the object m is an instance of the given type (here nn.Conv2d)
                nn.init.kaiming_normal_(m.weight,mode='fan_out',nonlinearity='relu')  # Kaiming normal initialization for conv layers

    # _make_layer() builds one stage of consecutive residual blocks: (block type, filters in the block's first conv, number of blocks, stride)
    def _make_layer(self,block,channel,block_num,stride=1):
        downsample=None

        # if the stride is not 1 or the channel count changes, F(X) and X end up with different shapes,
        # so a downsample branch is defined to bring X to the same shape
        if stride!=1 or self.in_channel!=channel*block.expansion:
            downsample=nn.Sequential(
                nn.Conv2d(self.in_channel,channel*block.expansion,kernel_size=1,stride=stride,bias=False),
                nn.BatchNorm2d(channel*block.expansion)
            )

        # layers stores the consecutive residual blocks of this stage in order
        # only the first block of a stage may need to downsample X; the remaining blocks do not
        layers=[]
        # add the first block, the one that may need to downsample X
        layers.append(block(self.in_channel,
                            channel,
                            downsample=downsample,
                            stride=stride
                            ))
        self.in_channel=channel*block.expansion
        # the remaining blocks do not downsample X
        for _ in range(1,block_num):
            layers.append(block(self.in_channel,
                                channel))
        # unpack the list into Sequential() so the blocks are chained into one stage
        return nn.Sequential(*layers)

    def forward(self, x):
        x=self.conv1(x)
        x=self.bn1(x)
        x=self.relu(x)

        x=self.maxpool(x)

        x=self.layer1(x)
        x=self.layer2(x)
        x=self.layer3(x)
        x=self.layer4(x)

        if self.include_top:  # usually True
            x=self.avgpool(x)
            x=torch.flatten(x,1)
            x=self.fc(x)

        return x
# This completes the basic ResNet framework.

# ResNets of different depths are defined below.

def ResNet18(num_classes=10,include_top=True):
    return ResNet(BasicBlock,[2,2,2,2],num_classes=num_classes,include_top=include_top)
#
# def ResNet34(num_classes=1000,include_top=True):
#     return ResNet(BasicBlock,[3,4,6,3],num_classes=num_classes,include_top=include_top)

# def ResNet50(num_classes=10,include_top=True):
#     return ResNet(Bottleneck,[3,4,6,3],num_classes=num_classes,include_top=include_top)

# def ResNet101(num_classes=1000,include_top=True):
#     return ResNet(Bottleneck,[3,4,23,3],num_classes=num_classes,include_top=include_top)
#
# def ResNet152(num_classes=1000,include_top=True):
#     return ResNet(Bottleneck,[3,8,36,3],num_classes=num_classes,include_top=include_top)
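# A quick sanity check of the framework above (a sketch): build a ResNet18 and push a
# fake CIFAR-10-sized batch through it; the output should contain one score per class.
if __name__ == '__main__':
    net = ResNet18(num_classes=10)
    dummy = torch.randn(4, 3, 32, 32)    # a fake batch of four 32x32 RGB images
    logits = net(dummy)
    print(logits.shape)                  # torch.Size([4, 10])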



2. ResNet_train (CIFAR-10 Dataset)

import os
import torch.nn as nn
import torch
import torchvision
import torch.optim as optim
from torchvision.datasets import CIFAR10
import torchvision.transforms as transforms
from torch.utils.data import DataLoader


# Select the device (GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The BasicBlock, Bottleneck, ResNet and ResNet18 definitions used here are identical to those
# in section 1 above and are omitted; reuse (or import) those definitions before running this script.




# Hyperparameters
num_epochs = 10
batch_size = 128
learning_rate = 0.001

# Data preprocessing
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))  # ImageNet channel mean/std; CIFAR-10's own statistics could also be used
])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

# Load the CIFAR-10 dataset
print("Loading dataset...")
train_dataset = CIFAR10(root='./dataset', train=True, download=True, transform=transform_train)
test_dataset = CIFAR10(root='./dataset', train=False, download=True, transform=transform_test)

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
print("数据集加载成功")

# Instantiate the model
model=ResNet18().to(device)

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Train the model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch + 1, num_epochs, i + 1, total_step, loss.item()))

# Evaluate the model on the test set
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the model on the test images: {} %'.format(100 * correct / total))

# Save the model (this pickles the whole model object)
os.makedirs('./models', exist_ok=True)   # make sure the output directory exists
torch.save(model, './models/ResNet18.ckpt')
print("Model saved")

3. test.py

import torch
import torchvision.transforms as transforms
from PIL import Image
import torch.nn as nn
import torchvision

# The BasicBlock, Bottleneck, ResNet and ResNet18 definitions used here are identical to those
# in section 1 above and are omitted; reuse (or import) those definitions before running this script.


# CIFAR-10 class labels
class_labels = [
    'airplane', 'automobile', 'bird', 'cat', 'deer',
    'dog', 'frog', 'horse', 'ship', 'truck'
]

# Select the device and load the trained ResNet18 model (saved by the training script as a whole model object)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load('./models/ResNet18.ckpt', map_location=device)
model = model.to(device)
model.eval()

# Image preprocessing: keep it consistent with the preprocessing used at training time
transform = transforms.Compose([
    transforms.Resize((32, 32)),   # the model was trained on 32x32 CIFAR-10 images
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

# Load the test image
image_path = './images/img.png'
image = Image.open(image_path).convert('RGB')   # convert('RGB') drops any alpha channel

# Preprocess the image and add a batch dimension
image = transform(image)
input_tensor = image.unsqueeze(0).to(device)

# Run the model on the image
with torch.no_grad():
    output = model(input_tensor)

# Read out the prediction
probabilities = torch.nn.functional.softmax(output[0], dim=0)
predicted_class_index = torch.argmax(probabilities).item()
predicted_class_label = class_labels[predicted_class_index]

# Print the prediction
print("Predicted class:", predicted_class_label)
print("Probabilities:", probabilities)

4. Results

(Figure: training and test results)
