Deep Learning, Week J6: ResNeXt-50 in Practice

🍨 This article is a members-only, free-for-a-limited-time post from [🔗365天深度学习训练营] (copyright belongs to *K同学啊*)
🍖 Author: [K同学啊]

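Preface

ResNeXt is an image classification network proposed by Kaiming He's team at CVPR 2017. In the paper "Aggregated Residual Transformations for Deep Neural Networks", the authors take up a question that was widespread at the time: how can a model's accuracy be improved?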

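The usual approach is to make the network deeper or wider, but simply increasing depth or width makes the design harder and raises the computational cost. He's team therefore introduced the concept of cardinality: the convolution channels are split into groups, and each group is convolved separately.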
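To make the cost argument concrete, here is a minimal sketch that counts the weights of an ordinary 3x3 convolution and of a grouped one with the same input and output channels. The channel count (128) and cardinality (32) are illustrative assumptions, not values taken from the model in this article.

```python
import torch.nn as nn


def count_params(module: nn.Module) -> int:
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


# Illustrative sizes (assumptions for this sketch).
channels, cardinality = 128, 32

dense = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
grouped = nn.Conv2d(channels, channels, kernel_size=3, padding=1,
                    groups=cardinality, bias=False)

dense_params = count_params(dense)      # 128 * 128 * 3 * 3
grouped_params = count_params(grouped)  # 128 * (128 / 32) * 3 * 3
ratio = dense_params // grouped_params

print(dense_params, grouped_params, ratio)
```

With the same number of output channels, the grouped layer needs `cardinality` times fewer weights, which is why the width can be raised without the parameter count growing as fast.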

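Comparing ResNet and ResNeXt

ResNeXt uses grouped convolution: the feature maps are split into groups and each group is convolved on its own, so every convolution kernel processes only part of the channels.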
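The statement that each kernel only sees part of the channels can be checked directly. The sketch below, with made-up small sizes, compares `nn.Conv2d(..., groups=g)` against a manual version that splits the input and the kernels into `g` chunks, convolves each pair, and concatenates the results.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
groups, in_ch, out_ch = 4, 8, 16  # illustrative sizes (assumptions)

conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, groups=groups, bias=False)
x = torch.randn(2, in_ch, 10, 10)

# Manual grouped convolution: split channels and kernels, convolve, concatenate.
x_chunks = torch.chunk(x, groups, dim=1)            # each [2, in_ch/groups, 10, 10]
w_chunks = torch.chunk(conv.weight, groups, dim=0)  # each [out_ch/groups, in_ch/groups, 3, 3]
manual = torch.cat(
    [F.conv2d(xc, wc, padding=1) for xc, wc in zip(x_chunks, w_chunks)],
    dim=1,
)

weight_shape = tuple(conv.weight.shape)
matches = torch.allclose(conv(x), manual, atol=1e-5)
print(weight_shape, matches)
```

The weight tensor has only `in_ch / groups` input channels per kernel, which is the "part of the channels" each kernel works on.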

1. Grouped Convolution

We start from ResNet50 and modify it, first designing a Bottleneck module that supports grouped convolution.

class Bottleneck(nn.Module):  # residual block used by resnet50, resnet101 and resnet152
    expansion = 4  # channel expansion factor of the block's output

    def __init__(self, in_channel, out_channel, stride=1, downsample=None, groups=1, base_width=64):
        super(Bottleneck, self).__init__()  # initialize the parent nn.Module
        # Channel count of the middle (3x3) convolution in F(x).
        # NOTE: torchvision's reference Bottleneck computes this from out_channel rather than in_channel.
        width = int(in_channel * (base_width / 64.0)) * groups
        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=width, kernel_size=1, stride=1,
                               bias=False)  # 1x1 convolution
        self.bn1 = nn.BatchNorm2d(num_features=width)  # batch normalization
        self.conv2 = nn.Conv2d(in_channels=width, out_channels=width, groups=groups, kernel_size=3, stride=stride,
                               padding=1, bias=False)  # 3x3 convolution; a grouped convolution when groups > 1 (ResNeXt)
        self.bn2 = nn.BatchNorm2d(num_features=width)  # batch normalization
        self.conv3 = nn.Conv2d(in_channels=width, out_channels=out_channel * self.expansion, kernel_size=1, stride=1,
                               bias=False)  # 1x1 convolution that expands the channels
        self.bn3 = nn.BatchNorm2d(num_features=out_channel * self.expansion)  # batch normalization

        self.relu = nn.ReLU(inplace=True)  # ReLU activation
        self.downsample = downsample  # optional projection for the shortcut branch

    def forward(self, x):  # forward pass
        identity = x  # the original input x
        if self.downsample is not None:  # project the shortcut when a downsample module is given
            identity = self.downsample(x)  # the shortcut branch has a conv: x --> x'

        x = self.conv1(x)  # 1x1 convolution
        x = self.bn1(x)  # batch normalization
        x = self.relu(x)  # ReLU activation

        x = self.conv2(x)  # 3x3 (grouped) convolution
        x = self.bn2(x)  # batch normalization
        x = self.relu(x)  # ReLU activation

        x = self.conv3(x)  # 1x1 convolution
        x = self.bn3(x)  # batch normalization

        x += identity  # F(x) + x, or F(x) + x' when the shortcut is projected
        x = self.relu(x)  # ReLU activation

        return x

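Here downsample is the projection applied to the shortcut branch. It is built once per layer, for that layer's first block, whenever the stride is not 1 or the input and output channel counts differ: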

        if stride != 1 or self.channel != channel * block.expansion:  # stride is not 1, or channel counts differ before and after the block, so the original x must be transformed
            downsample = nn.Sequential(
                nn.Conv2d(in_channels=self.channel, out_channels=channel * block.expansion, kernel_size=1,  # 1x1 convolution
                          stride=stride, bias=False),
                nn.BatchNorm2d(num_features=channel * block.expansion)  # batch normalization
            )
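As a rough illustration of why this projection is needed, the sketch below builds the same 1x1 convolution plus batch normalization on its own and checks the shape it produces. The channel counts (256 to 512, stride 2) mirror the transition from layer1 to layer2 in this model; the 56x56 input size is an assumption chosen for the example.

```python
import torch
import torch.nn as nn

in_channels, out_channels, stride = 256, 512, 2  # layer1 output -> layer2 output

downsample = nn.Sequential(
    nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
    nn.BatchNorm2d(out_channels),
)
downsample.eval()  # use running statistics; only shapes matter here

x = torch.randn(1, in_channels, 56, 56)  # assumed input size for this sketch
with torch.no_grad():
    identity = downsample(x)

identity_shape = tuple(identity.shape)
print(identity_shape)  # channels doubled, height and width halved
```

Without this step the shortcut would still have 256 channels at 56x56 and could not be added to the block output.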

2. The ResNeXt50_Model

class ResNeXt50_Model(nn.Module):
    def __init__(self, in_channel=3, N_classes=1000):
        super(ResNeXt50_Model, self).__init__()
        self.in_channels = in_channel
        self.layers = [2, 3, 5, 2]  # number of Bottleneck blocks in layer1..layer4
        # ============= stem (base layers)
        # Option 1
        self.zeropadding2d = nn.ZeroPad2d(3)  # NOTE: cov0 below also pads by 3, so the input is padded twice
        self.cov0 = nn.Conv2d(self.in_channels, out_channels=64, kernel_size=7, stride=2, padding=3)
        self.bn0 = nn.BatchNorm2d(num_features=64)
        self.relu0 = nn.ReLU(inplace=False)
        self.maxpool0 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.channel = 64  # number of channels entering the next layer
        self.groups = 1  # cardinality; 1 means an ordinary 3x3 convolution (ResNeXt-50 32x4d uses groups=32)
        self.base_width = 64  # width per group (ResNeXt-50 32x4d uses 4)

        self.layer1 = self._make_layer(Bottleneck, 64, self.layers[0])  # first residual stage, built from Bottleneck blocks
        self.layer2 = self._make_layer(Bottleneck, 128, self.layers[1], stride=2)  # second residual stage
        self.layer3 = self._make_layer(Bottleneck, 256, self.layers[2], stride=2)  # third residual stage
        self.layer4 = self._make_layer(Bottleneck, 512, self.layers[3], stride=2)  # fourth residual stage

        # Output head
        self.avgpool = nn.AvgPool2d((7, 7))
        # classification layer
        # 2048 features remain after the 7x7 average pooling
        # NOTE: nn.CrossEntropyLoss expects raw logits; the Softmax is only needed when probabilities are wanted
        self.fc = nn.Sequential(nn.Linear(2048, N_classes),
                                nn.Softmax(dim=1))

        for m in self.modules():  # iterate over all submodules
            if isinstance(m, nn.Conv2d):  # for every convolution layer
                nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")  # Kaiming (He) initialization
    def basic_layer1(self, x):
        '''
        input:  x = tensor(3, 224, 224).unsqueeze(0)
         Layer (type)               Output Shape         Param #
      ================================================================
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 56, 56]               0
      ================================================================   
        '''
        x = self.zeropadding2d(x)
        x = self.cov0(x)
        x = self.bn0(x)
        x = self.relu0(x)
        x = self.maxpool0(x)
        
        return x
    
    def _make_layer(self, block, channel, blocks, stride=1):  # builds one stage of the model
        downsample = None  # by default the shortcut passes x through unchanged

        if stride != 1 or self.channel != channel * block.expansion:  # stride is not 1, or channel counts differ before and after the block, so the original x must be transformed
            downsample = nn.Sequential(
                nn.Conv2d(in_channels=self.channel, out_channels=channel * block.expansion, kernel_size=1,  # 1x1 convolution
                          stride=stride, bias=False),
                nn.BatchNorm2d(num_features=channel * block.expansion)  # batch normalization
            )
        layers = []  # list that collects the blocks of this stage

        layers.append(block(self.channel, channel, downsample=downsample, stride=stride, groups=self.groups,
                            base_width=self.base_width))  # first block, with the shortcut projection
        self.channel = channel * block.expansion  # update to the channel count after this block
        for _ in range(1, blocks):  # append the remaining blocks - 1 blocks
            layers.append(block(self.channel, channel, groups=self.groups, base_width=self.base_width))  # identity-shortcut block
        return nn.Sequential(*layers)  # return the stage as a Sequential

    def forward(self, x):
        
        x = self.basic_layer1(x)
        
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)

        return x
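The spatial sizes this model produces for a 224x224 input can be traced by hand with the standard output-size formula for convolution and pooling layers. The sketch below is plain Python arithmetic; the helper name `conv_out` is introduced here for illustration. It follows the layers as defined above, including the extra `nn.ZeroPad2d(3)` in front of the already padded 7x7 convolution.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a conv or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Stem without the extra ZeroPad2d, as in the docstring table of basic_layer1.
reference_stem = conv_out(224, 7, 2, 3)

# Stem as written in the model: ZeroPad2d(3) first, then cov0 with padding=3.
padded = 224 + 2 * 3
stem = conv_out(padded, 7, 2, 3)    # cov0: 7x7, stride 2
pooled = conv_out(stem, 3, 2, 1)    # maxpool0: 3x3, stride 2
layer1 = conv_out(pooled, 3, 1, 1)  # 3x3 conv, stride 1
layer2 = conv_out(layer1, 3, 2, 1)  # 3x3 conv, stride 2
layer3 = conv_out(layer2, 3, 2, 1)
layer4 = conv_out(layer3, 3, 2, 1)
pooled_out = conv_out(layer4, 7, 7, 0)  # AvgPool2d((7, 7)), stride defaults to 7

sizes = [stem, pooled, layer1, layer2, layer3, layer4, pooled_out]
print(reference_stem, sizes)
```

The values for layer2, layer3 and layer4 (29, 15, 8) agree with the shapes in the model summary printed later in this article, while the 112 in the docstring table corresponds to a stem without the additional zero padding.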

3. Testing the Model

Set epochs=100 and the early-stopping patience early_stop=10.

import torch
from torchvision import datasets, transforms
import torch.nn as nn
import time
import numpy as np
import matplotlib.pyplot as plt
import torch.nn.functional as F 
import torchsummary as summary
import copy
import os

data_dir = './J3-data'

def random_split_imagefolder(data_dir, transforms, random_split_rate=0.8):
    
    _total_data = datasets.ImageFolder(data_dir, transform=transforms)
    
    train_size = int(random_split_rate * len(_total_data))
    test_size = len(_total_data) - train_size
    
    _train_datasets, _test_datasets =  torch.utils.data.random_split(_total_data, [train_size, test_size])

    return _total_data, _train_datasets, _test_datasets

N_classes=2
batch_size = 32
mean = [0.4958, 0.4984, 0.4068]
std = [0.2093, 0.2026, 0.2170]
# Reload the data using the dataset's actual mean and standard deviation
real_transforms = transforms.Compose(
        [
        transforms.Resize([224, 224]),  # resize to 224x224 (no cropping)
        transforms.ToTensor(),  # convert to a tensor
        transforms.Normalize(mean, std)
])
total_data, train_datasets, test_datasets = random_split_imagefolder(data_dir, real_transforms, 0.8)

# Load the data in batches
train_data = torch.utils.data.DataLoader(train_datasets, batch_size=batch_size, shuffle=True, num_workers=8)
test_data = torch.utils.data.DataLoader(test_datasets, batch_size=batch_size, shuffle=True, num_workers=8)

train_data_size = len(train_datasets)
test_data_size = len(test_datasets)

def train_and_test(model, loss_func, optimizer, epochs=100, early_stop=10):
    
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model.to(device)
    summary.summary(model, (3, 224, 224))
    
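    record = []
    best_acc = 0.0
    best_epoch = 0
    stop_steps = 0
    best_param = copy.deepcopy(model.state_dict())  # fallback in case validation accuracy never improves
    for epoch in range(epochs):  # train for up to `epochs` epochs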
            epoch_start = time.time()
            print("Epoch: {}/{}".format(epoch + 1, epochs))
    
            model.train()  # training mode
    
            train_loss = 0.0
            train_acc = 0.0
            valid_loss = 0.0
            valid_acc = 0.0
    
            for i, (inputs, labels) in enumerate(train_data):
                inputs = inputs.to(device)
                labels = labels.to(device)
                #print(labels)
                # reset the gradients before each step
                optimizer.zero_grad()
    
                outputs = model(inputs)
    
                loss = loss_func(outputs, labels)
    
                loss.backward()
    
                optimizer.step()
    
                train_loss += loss.item() * inputs.size(0)
                if i%10==0:
                    print("train data: {:01d} / {:03d} outputs: {}".format(i, len(train_data), outputs.data[0]))
                ret, predictions = torch.max(outputs.data, 1)
                correct_counts = predictions.eq(labels.data.view_as(predictions))
    
                acc = torch.mean(correct_counts.type(torch.FloatTensor))
    
                train_acc += acc.item() * inputs.size(0)

            with torch.no_grad():
                model.eval()  # evaluation mode for validation
    
                for j, (inputs, labels) in enumerate(test_data):
                    inputs = inputs.to(device)
                    labels = labels.to(device)
    
                    outputs = model(inputs)
    
                    loss = loss_func(outputs, labels)
    
                    valid_loss += loss.item() * inputs.size(0)
                    if j%10==0:
                        print("val data: {:01d} / {:03d} outputs: {}".format(j, len(test_data), outputs.data[0]))
                    ret, predictions = torch.max(outputs.data, 1)
                    correct_counts = predictions.eq(labels.data.view_as(predictions))
    
                    acc = torch.mean(correct_counts.type(torch.FloatTensor))
    
                    valid_acc += acc.item() * inputs.size(0)
    
            avg_train_loss = train_loss / train_data_size
            avg_train_acc = train_acc / train_data_size
    
            avg_valid_loss = valid_loss / test_data_size
            avg_valid_acc = valid_acc / test_data_size
    
    
            record.append([avg_train_loss, avg_valid_loss, avg_train_acc, avg_valid_acc])
    
            if avg_valid_acc > best_acc:  # keep the model with the best validation accuracy
                best_acc = avg_valid_acc
                stop_steps = 0
                best_epoch = epoch + 1
                best_param = copy.deepcopy(model.state_dict())
            else:
                stop_steps += 1
                if stop_steps >=  early_stop:
                    break
            
            epoch_end = time.time()
    
            print("Epoch: {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}%, \n\t\tValidation: Loss: {:.4f}, Accuracy: {:.4f}%, Time: {:.4f}s".format(
                    epoch + 1, avg_train_loss, avg_train_acc * 100, avg_valid_loss, avg_valid_acc * 100,
                    epoch_end - epoch_start))
            print("Best Accuracy for validation : {:.4f} at epoch {:03d}".format(best_acc, best_epoch))    
    
    model.load_state_dict(best_param)
    
    return model, record
#%%
if __name__=='__main__':
    
    early_stop=10
    epochs = 100
    model = ResNeXt50_Model(3, N_classes)
    
    loss_func = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(),lr=0.0001)
    model, record = train_and_test(model, loss_func, optimizer, epochs, early_stop)

    torch.save(model.state_dict(), './Best_ResNeXt50.pth')

    record = np.array(record)
    plt.plot(record[:, 0:2])
    plt.legend(['Train Loss', 'Valid Loss'])
    plt.xlabel('Epoch Number')
    plt.ylabel('Loss')
    plt.ylim(0, 1.5)
    plt.savefig('Loss_J3_1.png')
    plt.show()

    plt.plot(record[:, 2:4])
    plt.legend(['Train Accuracy', 'Valid Accuracy'])
    plt.xlabel('Epoch Number')
    plt.ylabel('Accuracy')
    plt.ylim(0, 1)
    plt.savefig('Accuracy_J3_1.png')
    plt.show()
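One detail worth checking in the script above is the pairing of `nn.CrossEntropyLoss` with a model whose last layer is `nn.Softmax`. The sketch below, on made-up two-class data, shows that cross-entropy in PyTorch already applies log-softmax to its input, and that feeding it probabilities instead of raw logits keeps the loss inside a narrow range. The tensors and variable names are assumptions for illustration only.

```python
import math

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 2)  # raw scores for a 2-class problem (made-up data)
labels = torch.tensor([0, 1, 1, 0])

# cross_entropy is log_softmax followed by the negative log-likelihood loss.
ce = F.cross_entropy(logits, labels)
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)
same = torch.allclose(ce, manual)

# If probabilities are passed in, softmax is effectively applied twice.
probs = torch.softmax(logits, dim=1)
ce_on_probs = F.cross_entropy(probs, labels).item()

# For two classes, each input then lies in [0, 1], so the loss is bounded.
lower = math.log(1 + math.exp(-1))  # reached when the true class has probability 1
upper = math.log(1 + math.exp(1))   # reached when the true class has probability 0
print(same, ce_on_probs, lower, upper)
```

Under these assumptions the loss can never drop below roughly 0.313 for two classes, which is one reason to return logits during training and apply softmax only when probabilities are needed.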

4. Results

Model summary output

       Bottleneck-49          [-1, 512, 29, 29]               0
           Conv2d-50          [-1, 512, 29, 29]         262,144
      BatchNorm2d-51          [-1, 512, 29, 29]           1,024
             ReLU-52          [-1, 512, 29, 29]               0
           Conv2d-53          [-1, 512, 29, 29]       2,359,296
      BatchNorm2d-54          [-1, 512, 29, 29]           1,024
             ReLU-55          [-1, 512, 29, 29]               0
           Conv2d-56          [-1, 512, 29, 29]         262,144
      BatchNorm2d-57          [-1, 512, 29, 29]           1,024
             ReLU-58          [-1, 512, 29, 29]               0
       Bottleneck-59          [-1, 512, 29, 29]               0
           Conv2d-60         [-1, 1024, 15, 15]         524,288
      BatchNorm2d-61         [-1, 1024, 15, 15]           2,048
           Conv2d-62          [-1, 512, 29, 29]         262,144
      BatchNorm2d-63          [-1, 512, 29, 29]           1,024
             ReLU-64          [-1, 512, 29, 29]               0
           Conv2d-65          [-1, 512, 15, 15]       2,359,296
      BatchNorm2d-66          [-1, 512, 15, 15]           1,024
             ReLU-67          [-1, 512, 15, 15]               0
           Conv2d-68         [-1, 1024, 15, 15]         524,288
      BatchNorm2d-69         [-1, 1024, 15, 15]           2,048
             ReLU-70         [-1, 1024, 15, 15]               0
       Bottleneck-71         [-1, 1024, 15, 15]               0
           Conv2d-72         [-1, 1024, 15, 15]       1,048,576
      BatchNorm2d-73         [-1, 1024, 15, 15]           2,048
             ReLU-74         [-1, 1024, 15, 15]               0
           Conv2d-75         [-1, 1024, 15, 15]       9,437,184
      BatchNorm2d-76         [-1, 1024, 15, 15]           2,048
             ReLU-77         [-1, 1024, 15, 15]               0
           Conv2d-78         [-1, 1024, 15, 15]       1,048,576
      BatchNorm2d-79         [-1, 1024, 15, 15]           2,048
             ReLU-80         [-1, 1024, 15, 15]               0
       Bottleneck-81         [-1, 1024, 15, 15]               0
           Conv2d-82         [-1, 1024, 15, 15]       1,048,576
      BatchNorm2d-83         [-1, 1024, 15, 15]           2,048
             ReLU-84         [-1, 1024, 15, 15]               0
           Conv2d-85         [-1, 1024, 15, 15]       9,437,184
      BatchNorm2d-86         [-1, 1024, 15, 15]           2,048
             ReLU-87         [-1, 1024, 15, 15]               0
           Conv2d-88         [-1, 1024, 15, 15]       1,048,576
      BatchNorm2d-89         [-1, 1024, 15, 15]           2,048
             ReLU-90         [-1, 1024, 15, 15]               0
       Bottleneck-91         [-1, 1024, 15, 15]               0
           Conv2d-92         [-1, 1024, 15, 15]       1,048,576
      BatchNorm2d-93         [-1, 1024, 15, 15]           2,048
             ReLU-94         [-1, 1024, 15, 15]               0
           Conv2d-95         [-1, 1024, 15, 15]       9,437,184
      BatchNorm2d-96         [-1, 1024, 15, 15]           2,048
             ReLU-97         [-1, 1024, 15, 15]               0
           Conv2d-98         [-1, 1024, 15, 15]       1,048,576
      BatchNorm2d-99         [-1, 1024, 15, 15]           2,048
            ReLU-100         [-1, 1024, 15, 15]               0
      Bottleneck-101         [-1, 1024, 15, 15]               0
          Conv2d-102         [-1, 1024, 15, 15]       1,048,576
     BatchNorm2d-103         [-1, 1024, 15, 15]           2,048
            ReLU-104         [-1, 1024, 15, 15]               0
          Conv2d-105         [-1, 1024, 15, 15]       9,437,184
     BatchNorm2d-106         [-1, 1024, 15, 15]           2,048
            ReLU-107         [-1, 1024, 15, 15]               0
          Conv2d-108         [-1, 1024, 15, 15]       1,048,576
     BatchNorm2d-109         [-1, 1024, 15, 15]           2,048
            ReLU-110         [-1, 1024, 15, 15]               0
      Bottleneck-111         [-1, 1024, 15, 15]               0
          Conv2d-112           [-1, 2048, 8, 8]       2,097,152
     BatchNorm2d-113           [-1, 2048, 8, 8]           4,096
          Conv2d-114         [-1, 1024, 15, 15]       1,048,576
     BatchNorm2d-115         [-1, 1024, 15, 15]           2,048
            ReLU-116         [-1, 1024, 15, 15]               0
          Conv2d-117           [-1, 1024, 8, 8]       9,437,184
     BatchNorm2d-118           [-1, 1024, 8, 8]           2,048
            ReLU-119           [-1, 1024, 8, 8]               0
          Conv2d-120           [-1, 2048, 8, 8]       2,097,152
     BatchNorm2d-121           [-1, 2048, 8, 8]           4,096
            ReLU-122           [-1, 2048, 8, 8]               0
      Bottleneck-123           [-1, 2048, 8, 8]               0
          Conv2d-124           [-1, 2048, 8, 8]       4,194,304
     BatchNorm2d-125           [-1, 2048, 8, 8]           4,096
            ReLU-126           [-1, 2048, 8, 8]               0
          Conv2d-127           [-1, 2048, 8, 8]      37,748,736
     BatchNorm2d-128           [-1, 2048, 8, 8]           4,096
            ReLU-129           [-1, 2048, 8, 8]               0
          Conv2d-130           [-1, 2048, 8, 8]       4,194,304
     BatchNorm2d-131           [-1, 2048, 8, 8]           4,096
            ReLU-132           [-1, 2048, 8, 8]               0
      Bottleneck-133           [-1, 2048, 8, 8]               0
       AvgPool2d-134           [-1, 2048, 1, 1]               0
          Linear-135                    [-1, 2]           4,098
         Softmax-136                    [-1, 2]               0
================================================================
Total params: 118,185,090
Trainable params: 118,185,090
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 363.40
Params size (MB): 450.84
Estimated Total Size (MB): 814.81
----------------------------------------------------------------

The best model was reached at epoch 25, and training stopped at epoch 35.
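The gap of ten epochs between the best epoch and the stopping epoch follows from the patience rule in train_and_test. The sketch below replays that rule on a made-up accuracy curve; the function name `early_stopping_epochs` and the numbers in `accs` are hypothetical and are not the recorded training results.

```python
def early_stopping_epochs(valid_accs, early_stop):
    """Return (best_epoch, last_epoch_run), both 1-based, for a patience rule."""
    best_acc, best_epoch, stop_steps = 0.0, 0, 0
    last_epoch = 0
    for epoch, acc in enumerate(valid_accs, start=1):
        last_epoch = epoch
        if acc > best_acc:
            # New best: remember it and reset the patience counter.
            best_acc, best_epoch, stop_steps = acc, epoch, 0
        else:
            stop_steps += 1
            if stop_steps >= early_stop:
                break
    return best_epoch, last_epoch


# Made-up curve: improves every epoch until epoch 25, then stays below the peak.
accs = [0.5 + 0.01 * e for e in range(1, 26)] + [0.70] * 75
result = early_stopping_epochs(accs, early_stop=10)
print(result)
```

With early_stop=10, a peak at epoch 25 followed by ten epochs without improvement ends the run at epoch 35.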
