J1: ResNet-50 Hands-On Practice and Analysis

1 Introduction
ResNet (Residual Neural Network) is a deep convolutional neural network architecture proposed in 2015 by Kaiming He and colleagues at Microsoft Research Asia. ResNet achieved enormous success in image classification and other computer vision tasks and has become one of the most widely used deep learning models today.

ResNet primarily addresses the "degradation problem" that arises when deep convolutional networks are made deeper.

Traditional deep convolutional neural networks have a problem: as the number of layers increases, performance degrades. The degradation shows up during training: adding more layers makes accuracy drop rather than rise. One contributing factor is that gradients tend to vanish during backpropagation as depth grows, making deep networks hard to train.

To solve this problem, ResNet introduces the idea of residual learning. The key is the "residual block" (residual block), which lets the network learn a residual with respect to an identity mapping. Concretely, a residual block is built by adding a shortcut connection that carries an identity mapping across a group of layers.

In a traditional convolutional neural network, layers are connected sequentially and each layer's output feeds directly into the next. In ResNet, the input to each residual block first passes through a series of convolutional layers and activation functions, is then added to the block's input, and finally goes through an activation function. This design lets the network learn the residual, which makes deep networks much easier to train.

The basic unit of ResNet is the residual block (Residual Block). A residual block consists of two convolutional layers, each followed by a batch normalization (Batch Normalization) layer and an activation function (usually ReLU). In addition, if the input and output feature maps differ in size, an extra convolutional layer is needed for dimension matching, as sketched below.
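As a concrete illustration, here is a minimal PyTorch sketch of such a basic residual block. This snippet is not part of the original post; the class name BasicResidualBlock and the parameter choices are illustrative only:

import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    # Two 3x3 conv layers, each followed by batch normalization; ReLU is
    # applied after adding the shortcut back in.
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # Extra 1x1 convolution to match dimensions when the input and
        # output feature maps differ, as described above.
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels))
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))  # residual: F(x) + x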

Beyond the basic residual block, ResNet comes in variants of different depths, including ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152. Owing to the properties of residual learning, the deeper variants can achieve better performance.

ResNet's success demonstrated the effectiveness of residual learning and inspired the design of later deep learning models. It is widely used in image classification, object detection, semantic segmentation, and other computer vision tasks, and has achieved excellent results in competitions and real-world applications.

The ResNet-50 model is included in torchvision.models; the following code prints its structure:
import torch
from torchvision import models
from torchsummary import summary

model = models.resnet50(pretrained=False)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

summary(model, (3, 224, 224))
[Figure: torchsummary output listing the layers of ResNet-50 with their output shapes and parameter counts]
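Note: pretrained=False reflects the older torchvision API; since torchvision 0.13 this flag is deprecated in favor of the weights argument, e.g. models.resnet50(weights=None).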
2 Preparation
This stage covers data handling, dataset splitting, and related steps. Since earlier posts in this series explain them in detail, only the code is shown here.
import torch
from torchvision import datasets, transforms
import torch.nn as nn
import time
import numpy as np
import matplotlib.pyplot as plt

import torchsummary as summary

from collections import OrderedDict

data_dir = '/Users/montylee/NJUPT/Learn/Github/deeplearning/CNN/J1/data/bird_photos'

def random_split_imagefolder(data_dir, transforms, random_split_rate=0.8):
    '''
    Randomly split the dataset.
    :param data_dir: dataset path
    :param transforms: preprocessing transforms
    :param random_split_rate: fraction used for training
    :return: total_data, train_datasets, test_datasets
    '''
    _total_data = datasets.ImageFolder(data_dir, transform=transforms)

    train_size = int(random_split_rate * len(_total_data))
    test_size = len(_total_data) - train_size

    _train_datasets, _test_datasets = torch.utils.data.random_split(_total_data, [train_size, test_size])

    return _total_data, _train_datasets, _test_datasets
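Note: torch.utils.data.random_split draws from PyTorch's global random number generator, so calling torch.manual_seed with a fixed value before random_split_imagefolder makes the split reproducible (random_split also accepts an explicit generator= argument).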

Compute the true per-channel mean and standard deviation of the dataset:

# First load the data with a plain ToTensor transform so that the image
# statistics can be computed (this initial load is implied but not shown
# in the original post).
total_data, _, _ = random_split_imagefolder(data_dir, transforms.ToTensor())

N_CHANNELS = 3  # RGB
mean = torch.zeros(N_CHANNELS)
std = torch.zeros(N_CHANNELS)

for inputs, labels in total_data:
    for i in range(N_CHANNELS):
        mean[i] += inputs[i, :, :].mean()
        std[i] += inputs[i, :, :].std()
mean.div_(len(total_data))
std.div_(len(total_data))
print(mean, std)
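Note that averaging per-image statistics is an approximation: the mean of the per-image standard deviations is not exactly the standard deviation over all pixels (and the per-image means are weighted equally even if image sizes differ), but it is close enough for normalization.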

Reload the data using the measured mean and standard deviation:

real_transforms = transforms.Compose(
    [
        transforms.Resize((224, 224)),  # resize to 224x224 (the model below expects a fixed input size)
        transforms.ToTensor(),          # convert to a tensor
        transforms.Normalize(mean, std)
    ])
total_data, train_datasets, test_datasets = random_split_imagefolder(data_dir, real_transforms, 0.8)

Record some class-related parameters:

class_names_dict = total_data.class_to_idx
print(total_data.class_to_idx)

{'Bananaquit': 0, 'Black Skimmer': 1, 'Black Throated Bushtiti': 2, 'Cockatoo': 3}

In practice, to look up the class name for a predicted index, it is more convenient to invert the mapping:

class_names_dict = dict(zip(class_names_dict.values(), class_names_dict.keys()))
N_classes = len(class_names_dict)
print(class_names_dict)

{0: 'Bananaquit', 1: 'Black Skimmer', 2: 'Black Throated Bushtiti', 3: 'Cockatoo'}

3 Residual Networks
3.1 What problem do residual networks solve?
Residual networks were designed to address the network degradation problem that appears when a neural network has many hidden layers. Degradation (degradation) means that as the number of hidden layers grows, the network's accuracy saturates and then degrades rapidly, and this degradation is not caused by overfitting.
Aside: the "two dark clouds" over deep neural networks

  • Vanishing/exploding gradients
    Simply put, when the network is too deep, training struggles to converge. This problem can be effectively controlled by careful weight initialization and normalization of intermediate layers. (At this stage it is enough to know this problem exists.)

  • Network degradation
    As depth increases, performance first improves until it saturates, then drops rapidly, and this drop is not caused by overfitting.
3.2 ResNet-50 overview
ResNet-50 is built from two basic block types, named the Conv Block and the Identity Block. The Identity Block keeps input and output shapes identical, so its shortcut is a plain identity; the Conv Block changes the channel count (and usually the spatial size), so its shortcut needs a convolution for dimension matching.
[Figure: structure of the Conv Block and the Identity Block]
4 Building the ResNet-50 Network Model
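The model below is assembled from ConvBlock and IdentityBlock, whose definitions do not appear in this excerpt. The following is a minimal sketch inferred from the call signatures used below, ConvBlock(in_channels, kernel_size, filters, stride=2) and IdentityBlock(in_channels, kernel_size, filters), following the standard ResNet bottleneck design; the author's original implementation may differ in detail.

class IdentityBlock(nn.Module):
    # Bottleneck block with an identity shortcut; requires
    # in_channels == filters[2] so the addition is shape-compatible.
    def __init__(self, in_channels, kernel_size, filters):
        super().__init__()
        f1, f2, f3 = filters
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, f1, kernel_size=1, bias=False),
            nn.BatchNorm2d(f1),
            nn.ReLU(inplace=True),
            nn.Conv2d(f1, f2, kernel_size=kernel_size, padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(f2),
            nn.ReLU(inplace=True),
            nn.Conv2d(f2, f3, kernel_size=1, bias=False),
            nn.BatchNorm2d(f3))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.conv(x) + x)

class ConvBlock(nn.Module):
    # Bottleneck block with a projection shortcut; stride=2 halves the
    # spatial size (stride=1 is used only in the first stage).
    def __init__(self, in_channels, kernel_size, filters, stride=2):
        super().__init__()
        f1, f2, f3 = filters
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, f1, kernel_size=1, stride=stride, bias=False),
            nn.BatchNorm2d(f1),
            nn.ReLU(inplace=True),
            nn.Conv2d(f1, f2, kernel_size=kernel_size, padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(f2),
            nn.ReLU(inplace=True),
            nn.Conv2d(f2, f3, kernel_size=1, bias=False),
            nn.BatchNorm2d(f3))
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_channels, f3, kernel_size=1, stride=stride, bias=False),
            nn.BatchNorm2d(f3))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.conv(x) + self.shortcut(x))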
class Resnet50_Model(nn.Module):
    def __init__(self):
        super(Resnet50_Model, self).__init__()
        self.in_channels = 3
        self.layers = [3, 4, 6, 3]  # number of blocks in each of the four stages
        # ============= stem layers
        # approach 1
        self.cov0 = nn.Conv2d(self.in_channels, out_channels=64, kernel_size=7, stride=2, padding=3)
        self.bn0 = nn.BatchNorm2d(num_features=64)
        self.relu0 = nn.ReLU(inplace=False)
        self.maxpool0 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.basic_layer = nn.Sequential(
            self.cov0,
            self.bn0,
            self.relu0,
            self.maxpool0
            )

        self.layer1 = nn.Sequential(
            ConvBlock(64, 3, [64, 64, 256], 1),
            IdentityBlock(256, 3, [64, 64, 256]),
            IdentityBlock(256, 3, [64, 64, 256]),
            )

        self.layer2 = nn.Sequential(
            ConvBlock(256, 3, [128, 128, 512]),
            IdentityBlock(512, 3, [128, 128, 512]),
            IdentityBlock(512, 3, [128, 128, 512]),
            IdentityBlock(512, 3, [128, 128, 512]),
            )

        self.layer3 = nn.Sequential(
            ConvBlock(512, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            )

        self.layer4 = nn.Sequential(
            ConvBlock(1024, 3, [512, 512, 2048]),
            IdentityBlock(2048, 3, [512, 512, 2048]),
            IdentityBlock(2048, 3, [512, 512, 2048]),
            )

        # output head
        self.avgpool = nn.AvgPool2d((7, 7))
        # classification layer: 2048 features remain after the 7x7 average pool.
        # The layer outputs raw logits because nn.CrossEntropyLoss (used in
        # section 5) applies log-softmax internally; adding nn.Softmax here
        # would apply softmax twice and hurt training.
        self.fc = nn.Linear(2048, N_classes)

    def forward(self, x):

        x = self.basic_layer(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)

        return x
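A quick sanity check of the output shape (this snippet is illustrative and not part of the original post):

model = Resnet50_Model()
dummy = torch.randn(1, 3, 224, 224)  # one fake RGB image
print(model(dummy).shape)            # expected: torch.Size([1, 4]) for N_classes = 4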

5 Training the Model
5.1 Training and testing function
def train_and_test(model, loss_func, optimizer, epochs=25):

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model.to(device)
    summary.summary(model, (3, 224, 224))

    # Batch the datasets with DataLoaders so inputs/labels arrive as batched
    # tensors (batch size 32 is an arbitrary choice, not from the original post).
    train_loader = torch.utils.data.DataLoader(train_datasets, batch_size=32, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test_datasets, batch_size=32, shuffle=False)
    train_data_size = len(train_datasets)
    test_data_size = len(test_datasets)

    record = []
    best_acc = 0.0
    best_epoch = 0

    for epoch in range(epochs):  # train for `epochs` rounds
        epoch_start = time.time()
        print("Epoch: {}/{}".format(epoch + 1, epochs))

        model.train()  # training phase

        train_loss = 0.0
        train_acc = 0.0
        valid_loss = 0.0
        valid_acc = 0.0

        for i, (inputs, labels) in enumerate(train_loader):
            inputs = inputs.to(device)
            labels = labels.to(device)
            # remember to zero the gradients
            optimizer.zero_grad()

            outputs = model(inputs)

            loss = loss_func(outputs, labels)

            loss.backward()

            optimizer.step()

            train_loss += loss.item() * inputs.size(0)
            if i % 10 == 0:
                print("train data: {:01d} / {:03d} outputs: {}".format(i, len(train_loader), outputs.data[0]))
            ret, predictions = torch.max(outputs.data, 1)
            correct_counts = predictions.eq(labels.data.view_as(predictions))

            acc = torch.mean(correct_counts.type(torch.FloatTensor))

            train_acc += acc.item() * inputs.size(0)

        with torch.no_grad():
            model.eval()  # validation phase

            for j, (inputs, labels) in enumerate(test_loader):
                inputs = inputs.to(device)
                labels = labels.to(device)

                outputs = model(inputs)

                loss = loss_func(outputs, labels)

                valid_loss += loss.item() * inputs.size(0)
                if j % 10 == 0:
                    print("val data: {:01d} / {:03d} outputs: {}".format(j, len(test_loader), outputs.data[0]))
                ret, predictions = torch.max(outputs.data, 1)
                correct_counts = predictions.eq(labels.data.view_as(predictions))

                acc = torch.mean(correct_counts.type(torch.FloatTensor))

                valid_acc += acc.item() * inputs.size(0)

        avg_train_loss = train_loss / train_data_size
        avg_train_acc = train_acc / train_data_size

        avg_valid_loss = valid_loss / test_data_size
        avg_valid_acc = valid_acc / test_data_size

        record.append([avg_train_loss, avg_valid_loss, avg_train_acc, avg_valid_acc])

        if avg_valid_acc > best_acc:  # track the model with the best validation accuracy
            best_acc = avg_valid_acc
            best_epoch = epoch + 1

        epoch_end = time.time()

        print("Epoch: {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}%, \n\t\tValidation: Loss: {:.4f}, Accuracy: {:.4f}%, Time: {:.4f}s".format(
                epoch + 1, avg_train_loss, avg_train_acc * 100, avg_valid_loss, avg_valid_acc * 100,
                epoch_end - epoch_start))
        print("Best Accuracy for validation : {:.4f} at epoch {:03d}".format(best_acc, best_epoch))

    return model, record

5.2 Test code
if __name__ == '__main__':

    epochs = 25
    model = Resnet50_Model()

    loss_func = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
    model, record = train_and_test(model, loss_func, optimizer, epochs)

    torch.save(model, './Best_Resnet50.pth')

    record = np.array(record)
    plt.plot(record[:, 0:2])
    plt.legend(['Train Loss', 'Valid Loss'])
    plt.xlabel('Epoch Number')
    plt.ylabel('Loss')
    plt.ylim(0, 1.5)
    plt.savefig('Loss.png')
    plt.show()

    plt.plot(record[:, 2:4])
    plt.legend(['Train Accuracy', 'Valid Accuracy'])
    plt.xlabel('Epoch Number')
    plt.ylabel('Accuracy')
    plt.ylim(0, 1)
    plt.savefig('Accuracy.png')
    plt.show()
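Since torch.save is given the entire model object, the saved file can later be restored with torch.load('./Best_Resnet50.pth'), provided the Resnet50_Model class (and its blocks) are importable at load time; recent PyTorch versions may additionally require passing weights_only=False for full-object loads.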