Week P8: Implementing the YOLOv5 C3 Module

千筱夜

Tags: YOLO, deep learning, PyTorch

Article link: https://blog.csdn.net/geo436872/article/details/134207962

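🍨 This post is a learning-record blog for the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp)
🍖 Original author: K同学啊 | tutoring and custom projects available
🚀 Source: K同学的学习圈子 (K同学's learning circle)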

Environment

Python version:  3.8.17 (default, Jul  5 2023, 20:44:21) [MSC v.1916 64 bit (AMD64)]
Pytorch version:  2.0.1+cu117
Torchvision version:  0.15.2+cu117
CUDA is available: True
Using device: cuda

The weather dataset used here was provided by K同学; please contact K同学 if you need it.

I. Preparation

1. Set up the GPU and import the required packages

from datetime import datetime
import torch
import torchvision
from torch.utils.data import DataLoader
if __name__ == '__main__':
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(device)

Output of printing device:

cuda

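2. Data preprocessing

Set the path to the downloaded dataset, preprocess the images, split the dataset by ratio, print the resulting subsets, and list the label mapping.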

    weather_image_dir = "./data-3/weather_photos/"
    class_list = ['cloudy', 'rain', 'shine', 'sunrise']
    train_transforms = torchvision.transforms.Compose([
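        torchvision.transforms.Resize([224, 224]),  # resize every input image to the same size
        torchvision.transforms.ToTensor(),          # convert a PIL Image or numpy.ndarray to a tensor scaled to [0, 1]
        # Standardize each channel with (x - mean) / std, which usually helps the model converge.
        # mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225] are the widely used ImageNet statistics.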
        torchvision.transforms.Normalize(  mean = [0.485, 0.456, 0.406],  std = [0.229, 0.224, 0.225])
    ])

    total_data = torchvision.datasets.ImageFolder(weather_image_dir, transform = train_transforms)
    print(total_data)
    print(total_data.class_to_idx)
    train_size = int(0.8 * len(total_data))
    test_size  = len(total_data) - train_size
    train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
    print(train_dataset)
    print(test_dataset)
    # Create the data loaders
    batch_size = 4

    train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size = batch_size, shuffle = True, num_workers = 4)
    test_dataloader  = torch.utils.data.DataLoader(test_dataset , batch_size = batch_size, num_workers = 4)

The output is as follows:

Dataset ImageFolder
    Number of datapoints: 1125
    Root location: ./data-3/weather_photos/
    StandardTransform
Transform: Compose(
               Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=warn)
               ToTensor()
               Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
           )
{'cloudy': 0, 'rain': 1, 'shine': 2, 'sunrise': 3}
<torch.utils.data.dataset.Subset object at 0x000001DF62B2AAF0>
<torch.utils.data.dataset.Subset object at 0x000001DF62B2AD60>
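
The two `Subset` printouts above do not show how many images each split contains. A minimal sketch of the arithmetic, using the 1125 images, the 80/20 ratio and the batch size of 4 from the code above (the helper name `split_sizes` is introduced here for illustration only):

```python
import math

def split_sizes(total, train_ratio=0.8):
    """Mirror the split arithmetic used above: int() truncates, the rest goes to test."""
    train_size = int(train_ratio * total)
    return train_size, total - train_size

train_size, test_size = split_sizes(1125)
batch_size = 4
# DataLoader keeps the last partial batch by default, so the batch count is rounded up.
train_batches = math.ceil(train_size / batch_size)
test_batches = math.ceil(test_size / batch_size)
print(train_size, test_size, train_batches, test_batches)  # 900 225 225 57
```

The test split therefore ends with one partial batch holding a single image.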

II. Building and visualizing the model

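For easier understanding, a diagram from K同学 is referenced here.
[Image not reproduced here.]
The YOLOv5 network that the C3 module comes from can be summarized in five points: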

Backbone: YOLOv5 uses CSPDarknet53 as its backbone. CSPDarknet53 is based on the Darknet53 backbone and improves on it by adding CSP (Cross Stage Partial) connections, which reduce computation and memory usage.

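Feature pyramid network (FPN): YOLOv5 adds a feature pyramid network that extracts features at several scales from different levels. Through a top-down path combined with a bottom-up path, it fuses low-level, high-resolution detail with high-level semantic information and produces multi-scale feature maps for object detection.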

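FPN levels: the three levels taken from the top of CSPDarknet53 are C3, C4 and C5 (here "C3" names a pyramid level, not the C3 module implemented below). Their feature maps differ in resolution and semantic content, and the FPN fuses them into a multi-scale feature pyramid.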

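Downsampling and upsampling: because the feature maps at each level have different resolutions, the FPN needs both operations. Downsampling between C3 and C5 is usually implemented with stride-2 convolutions. Upsampling brings a low-resolution feature map up to a higher resolution by interpolation (bilinear, or nearest-neighbor as in YOLOv5's default configuration) or by transposed convolution.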

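Output layers: YOLOv5 attaches an output layer to each FPN level to produce predicted boxes at a different scale. Each output layer uses anchor boxes and convolutions to generate the box positions and class information.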
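
To make the multi-scale outputs concrete, the following sketch computes the grid sizes and prediction counts of a three-scale, anchor-based head. The 640×640 input, the strides 8/16/32, three anchors per grid cell and 80 classes are assumptions taken from the commonly used YOLOv5 configuration, not from this post, and `head_shapes` is a hypothetical helper.

```python
def head_shapes(img_size=640, strides=(8, 16, 32), num_anchors=3, num_classes=80):
    """Return (grid, channels, predictions) per scale for an anchor-based head."""
    # Each anchor predicts 4 box values + 1 objectness score + one score per class.
    channels = num_anchors * (5 + num_classes)
    shapes = []
    for s in strides:
        grid = img_size // s
        shapes.append((grid, channels, num_anchors * grid * grid))
    return shapes

shapes = head_shapes()
for grid, channels, preds in shapes:
    print(f"grid {grid}x{grid}, {channels} channels, {preds} predictions")
print("total predictions:", sum(p for _, _, p in shapes))  # 25200
```

Note that the classifier built in this post uses only the `Conv` and `C3` building blocks, so none of these detection outputs appear in it.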

    import torch.nn as nn


    #
    def autopad(k, p=None):  # kernel, padding
        # Pad to 'same'
        if p is None:
            p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
        return p


    # A YOLOv5 Conv module: Conv2d + BN + SiLU. The Conv2d padding is derived from the kernel size by autopad.
    class Conv(nn.Module):
        # Standard convolution
        def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
            super().__init__()
            self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
            self.bn = nn.BatchNorm2d(c2)
            self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

        def forward(self, x):
            return self.act(self.bn(self.conv(x)))


    class Bottleneck(nn.Module):
        # Standard bottleneck
        def __init__(self, c1, c2, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, shortcut, groups, expansion
            super().__init__()
            c_ = int(c2 * e)  # hidden channels
            self.cv1 = Conv(c1, c_, 1, 1)
            self.cv2 = Conv(c_, c2, 3, 1, g=g)
            self.add = shortcut and c1 == c2

        def forward(self, x):
            return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


    class C3(nn.Module):
        # CSP Bottleneck with 3 convolutions
        def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
            super().__init__()
            c_ = int(c2 * e)  # hidden channels
            self.cv1 = Conv(c1, c_, 1, 1)
            self.cv2 = Conv(c1, c_, 1, 1)
            self.cv3 = Conv(2 * c_, c2, 1)  # act=FReLU(c2)
            self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))

        def forward(self, x):
            return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))


    class model_K(nn.Module):
        def __init__(self):
            super(model_K, self).__init__()

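            # Convolution module
            self.Conv = Conv(3, 32, 3, 2)

            # C3 module 1: n=3 bottlenecks; the 4th positional argument is `shortcut`, so 2 acts as True
            self.C3_1 = C3(32, 64, 3, 2)

            # Fully connected layers used for classification
            self.classifier = nn.Sequential(
                nn.Linear(in_features=802816, out_features=100),  # 802816 = 64 * 112 * 112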
                nn.ReLU(),
                nn.Linear(in_features=100, out_features=4)
            )

        def forward(self, x):
            x = self.Conv(x)
            x = self.C3_1(x)
            x = torch.flatten(x, start_dim=1)
            x = self.classifier(x)

            return x
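
The value `in_features=802816` in the classifier follows from the feature-map size after the two modules in the model code above. A small sketch of the arithmetic, using the standard convolution output-size formula (the helper `conv_out` is introduced for illustration only, and this `autopad` handles integer kernel sizes only):

```python
def autopad(k, p=None):
    # Same rule as above for integer kernels: half the kernel size.
    return k // 2 if p is None else p

def conv_out(size, k, s, p=None):
    """Spatial output size of a convolution: floor((size + 2p - k) / s) + 1."""
    p = autopad(k, p)
    return (size + 2 * p - k) // s + 1

h = conv_out(224, k=3, s=2)   # Conv(3, 32, 3, 2): 224 -> 112
h = conv_out(h, k=1, s=1)     # the 1x1 convs inside C3 keep the size
h = conv_out(h, k=3, s=1)     # the 3x3 conv in each Bottleneck also keeps it
flat = 64 * h * h             # C3 outputs 64 channels
print(h, flat)  # 112 802816
```

If the input resolution or the channel counts change, `in_features` has to be recomputed in the same way.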

To get a clear picture of the model, we print a summary of its architecture.

    model = model_K().to(device)
    from torchinfo import summary
    summary(model, (1, 3, 224, 224))

The output is as follows:

===============================================================================================
Layer (type:depth-idx)                        Output Shape              Param #
===============================================================================================
model_K                                       [1, 4]                    --
├─Conv: 1-1                                   [1, 32, 112, 112]         --
│    └─Conv2d: 2-1                            [1, 32, 112, 112]         864
│    └─BatchNorm2d: 2-2                       [1, 32, 112, 112]         64
│    └─SiLU: 2-3                              [1, 32, 112, 112]         --
├─C3: 1-2                                     [1, 64, 112, 112]         --
│    └─Conv: 2-4                              [1, 32, 112, 112]         --
│    │    └─Conv2d: 3-1                       [1, 32, 112, 112]         1,024
│    │    └─BatchNorm2d: 3-2                  [1, 32, 112, 112]         64
│    │    └─SiLU: 3-3                         [1, 32, 112, 112]         --
│    └─Sequential: 2-5                        [1, 32, 112, 112]         --
│    │    └─Bottleneck: 3-4                   [1, 32, 112, 112]         10,368
│    │    └─Bottleneck: 3-5                   [1, 32, 112, 112]         10,368
│    │    └─Bottleneck: 3-6                   [1, 32, 112, 112]         10,368
│    └─Conv: 2-6                              [1, 32, 112, 112]         --
│    │    └─Conv2d: 3-7                       [1, 32, 112, 112]         1,024
│    │    └─BatchNorm2d: 3-8                  [1, 32, 112, 112]         64
│    │    └─SiLU: 3-9                         [1, 32, 112, 112]         --
│    └─Conv: 2-7                              [1, 64, 112, 112]         --
│    │    └─Conv2d: 3-10                      [1, 64, 112, 112]         4,096
│    │    └─BatchNorm2d: 3-11                 [1, 64, 112, 112]         128
│    │    └─SiLU: 3-12                        [1, 64, 112, 112]         --
├─Sequential: 1-3                             [1, 4]                    --
│    └─Linear: 2-8                            [1, 100]                  80,281,700
│    └─ReLU: 2-9                              [1, 100]                  --
│    └─Linear: 2-10                           [1, 4]                    404
===============================================================================================
Total params: 80,320,536
Trainable params: 80,320,536
Non-trainable params: 0
Total mult-adds (M): 553.54
===============================================================================================
Input size (MB): 0.60
Forward/backward pass size (MB): 70.65
Params size (MB): 321.28
Estimated Total Size (MB): 392.53
===============================================================================================
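
The parameter counts in the summary can be reproduced by hand, which is a useful check that the layers are wired as intended. The sketch below uses hypothetical helpers (`conv_bn`, `linear`); the convolutions have no bias term because `Conv` sets `bias=False`.

```python
def conv_bn(c_in, c_out, k):
    # Conv2d without bias (c_in * c_out * k * k) plus BatchNorm weight and bias (2 * c_out).
    return c_in * c_out * k * k + 2 * c_out

def linear(n_in, n_out):
    # Weight matrix plus bias vector.
    return n_in * n_out + n_out

stem = conv_bn(3, 32, 3)                               # 864 + 64
bottleneck = conv_bn(32, 32, 1) + conv_bn(32, 32, 3)   # 10,368 each
c3 = 2 * conv_bn(32, 32, 1) + 3 * bottleneck + conv_bn(64, 64, 1)
classifier = linear(802816, 100) + linear(100, 4)

total = stem + c3 + classifier
print(bottleneck, total)  # 10368 80320536
```

The total matches the 80,320,536 reported above, and almost all of it (80,281,700 parameters) sits in the first `Linear` layer.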

III. Defining the functions

1. Training function

    def train_func(dataloader, model, loss_func, optimizer):
        size = len(dataloader.dataset)  # number of samples in the training set
        num_batches = len(dataloader)  # number of training batches, i.e. ceil(size / batch_size)

        train_loss, train_acc = 0, 0  # running totals for loss and correct predictions

        for images, labels in dataloader:  # fetch a batch of images and labels
            images = images.to(device)
            labels = labels.to(device)
            # Forward pass and loss
            pred = model(images)  # network output
            loss = loss_func(pred, labels)  # loss between the network output and the ground-truth labels

            # Backpropagation
            optimizer.zero_grad()  # reset the gradients
            loss.backward()  # backpropagate
            optimizer.step()  # update the parameters

            # Accumulate accuracy and loss
            train_acc += (pred.argmax(1) == labels).type(torch.float).sum().item()
            train_loss += loss.item()

        train_acc /= size
        train_loss /= num_batches

        return train_acc, train_loss
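
Note that accuracy is divided by the number of samples while loss is divided by the number of batches. This fits a loss that is already averaged within each batch, as `torch.nn.CrossEntropyLoss` is by default. A toy sketch in plain Python (all numbers below are made up for illustration and are not from the experiment):

```python
def argmax(row):
    # Index of the largest score in a list.
    return max(range(len(row)), key=row.__getitem__)

# Each entry: (class scores per sample, labels, mean loss of that batch).
batches = [
    ([[2.0, 0.1, 0.3, 0.0], [0.2, 0.1, 1.5, 0.3]], [0, 1], 0.9),
    ([[0.1, 0.2, 0.3, 2.2]], [3], 0.3),
]

correct, loss_sum, size = 0, 0.0, 0
for scores, labels, batch_loss in batches:
    correct += sum(argmax(s) == y for s, y in zip(scores, labels))
    loss_sum += batch_loss
    size += len(labels)

acc = correct / size              # per-sample average
loss = loss_sum / len(batches)    # per-batch average
print(correct, size, acc, loss)
```

One consequence of this choice is that a smaller final batch counts as much toward the reported loss as a full batch.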

2. Test function:

    def test_func(dataloader, model, loss_func):
        size = len(dataloader.dataset)  # number of samples in the test set
        num_batches = len(dataloader)  # number of test batches, i.e. ceil(size / batch_size)
        test_loss, test_acc = 0, 0

        # No training happens here, so disable gradient tracking
        with torch.no_grad():
            for images, labels in dataloader:
                images, labels = images.to(device), labels.to(device)

                # Run the model and compute the loss
                labels_pred = model(images)
                loss = loss_func(labels_pred, labels)

                # Convert to Python numbers and accumulate
                test_loss += loss.item()
                test_acc += (labels_pred.argmax(1) == labels).type(torch.float).sum().item()

        test_acc /= size
        test_loss /= num_batches

        # The training loop unpacks two values, so the results must be returned
        return test_acc, test_loss

3. Adaptive learning rate:

    def adjust_learning_rate(optimizer, epoch, initial_lr = 1e-4, atte_rate = 1):
        # optimizer: the optimizer
        # epoch: number of epochs completed so far (0 for the first epoch)
        # initial_lr: initial learning rate
        # atte_rate: decay rate of the learning rate
        # Clamp the decay rate to [0.8, 1]: values above 1 mean no decay, values below 0.8 are raised to 0.8
        atte_rate = min(atte_rate, 1)
        atte_rate = max(atte_rate, 0.8)
        # Decay strategy: a monotonically decreasing learning rate that falls more and more slowly
        # Decay once every 2 epochs, i.e. lr = initial_lr * (atte_rate ^ n) with n = epoch // 2
        lr = initial_lr * (atte_rate ** (epoch // 2))
        for param_group in optimizer.param_groups:
            param_group['lr'] = lr

        return lr
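
With the values used later in this post (initial rate 1e-4, decay rate 0.95), the schedule can be tabulated without an optimizer. The sketch below repeats only the arithmetic of `adjust_learning_rate`; the helper name `lr_at` is hypothetical.

```python
def lr_at(epoch, initial_lr=1e-4, atte_rate=0.95):
    # Clamp the decay rate to [0.8, 1], then decay once every 2 epochs.
    atte_rate = max(min(atte_rate, 1), 0.8)
    return initial_lr * (atte_rate ** (epoch // 2))

schedule = [lr_at(e) for e in range(10)]
for e, lr in enumerate(schedule, start=1):
    print(f"epoch {e:2d}: {lr:.4e}")
```

Each rate is held for two consecutive epochs, which is the pattern visible in the training log further down.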

4. Saving the model:

    def model_save(model, model_filename):
        torch.save(model.state_dict(), model_filename)
        print(f"Saved PyTorch Model State to {model_filename}")
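
Saving only the `state_dict` means the model class has to be re-created before the weights can be loaded. A minimal round-trip sketch, using a tiny `nn.Linear` as a stand-in model and a temporary file (both are assumptions for illustration, not part of the original workflow):

```python
import os
import tempfile

import torch
import torch.nn as nn

src = nn.Linear(4, 2)
dst = nn.Linear(4, 2)  # same architecture, different random weights

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.pth")
    torch.save(src.state_dict(), path)
    # map_location lets weights saved on a GPU be loaded on a CPU-only machine.
    state = torch.load(path, map_location="cpu")
    dst.load_state_dict(state)

same = all(torch.equal(a, b) for a, b in zip(src.state_dict().values(),
                                             dst.state_dict().values()))
print(same)  # True
```

The same pattern applies to `model_K`: construct the model first, then call `load_state_dict` on it.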

5. Visualization function

    import matplotlib.pyplot as plt
    # Hide warnings
    #import warnings

    def check_func(train_loss, train_acc, test_loss, test_acc):

        epochs_range = range(len(train_loss))
        # Set the figure size
        plt.figure(figsize=(12, 3))

        plt.subplot(1, 2, 1)
        plt.plot(epochs_range, train_acc, label='Training Accuracy')
        plt.plot(epochs_range, test_acc, label='Test Accuracy')
        plt.legend(loc='lower right')
        plt.title('Training and Test Accuracy')

        plt.subplot(1, 2, 2)
        plt.plot(epochs_range, train_loss, label='Training Loss')
        plt.plot(epochs_range, test_loss, label='Test Loss')
        plt.legend(loc='upper right')
        plt.title('Training and Test Loss')

        plt.show()

IV. Training

1. Initialization

Save the initial parameters

    model_filename = "model_yolov5_c3_init.pth"
    model_save(model, model_filename)

Initialize training

    model_init_flag = False
    if not model_init_flag:
        # Load the saved initial parameters and map them onto the device
        model_filename = "model_yolov5_c3_init.pth"
        weights_dict = torch.load(model_filename, map_location=device)
        model.load_state_dict(weights_dict)  # , strict = False)
        print(f"Loaded PyTorch Model State from {model_filename}")

    # Reset the accuracy and loss records
    train_loss = []
    train_acc = []
    test_loss = []
    test_acc = []
    best_train_acc = 0
    best_test_acc = 0
    total_epochs = 0
    best_epoch = 0

    # Set the initialization flag
    model_init_flag = True
    # Template for the per-epoch log line
    info_template = ('--Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%，Test_loss:{:.3f}')

    print("Initialization completed, ready to start training...")

2. Learning rate, loss function and optimizer

    # Define the learning rate, loss function and optimizer
    learning_rate = 1e-4
    loss_func = torch.nn.CrossEntropyLoss()
    # optimizer   = torch.optim.SGD(model.parameters(), lr = learning_rate)
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    # Start the train/test loop: 10 epochs in total
    enable_adjust_lr = True  # enable or disable as needed
    gamma = 0.95
    epochs = 10
    model_filename = "model_coffee_bean_best.pth"

    model_init_flag = False
    print("=========================================================")

3. Training loop

    for epoch in range(epochs):
        if enable_adjust_lr:
            # Update the learning rate dynamically: learning_rate is the initial value, the returned lr is the adjusted one
            lr = adjust_learning_rate(optimizer, total_epochs, learning_rate, gamma)
        else:
            # Static learning rate; assigned here only so it can be printed after each epoch
            lr = learning_rate

        # Record the start time
        start_time = datetime.now()
        # Train the model
        model.train()
        epoch_train_acc, epoch_train_loss = train_func(train_dataloader, model, loss_func, optimizer)
        # Evaluate the model
        model.eval()
        epoch_test_acc, epoch_test_loss = test_func(test_dataloader, model, loss_func)

        # Record the metrics
        train_acc.append(epoch_train_acc)
        train_loss.append(epoch_train_loss)
        test_acc.append(epoch_test_acc)
        test_loss.append(epoch_test_loss)

        # Save the best parameters. "Best" here means train_acc and test_acc both exceed their previous bests; this condition can be changed
        if (epoch_train_acc > best_train_acc) and (epoch_test_acc > best_test_acc):
            best_train_acc = epoch_train_acc
            best_test_acc = epoch_test_acc
            model_save(model, model_filename)

        end_time = datetime.now()  # record the end time

        # Print the metrics
        total_epochs += 1
        print(info_template.format(total_epochs, epoch_train_acc * 100, epoch_train_loss, epoch_test_acc * 100,
                                   epoch_test_loss))
        print(f"--Current learing rate is %.4e:" % lr)
        print(f"--Time of epochs {total_epochs} is : {(end_time - start_time)}")
    print("========================== Done =========================")
    print("best train-acc: {:.2f}%, best_test_acc: {:.2f}%".format(best_train_acc * 100, best_test_acc * 100))

The output is as follows:

=========================================================
Saved PyTorch Model State to model_coffee_bean_best.pth
--Epoch: 1, Train_acc:71.6%, Train_loss:1.368, Test_acc:83.6%，Test_loss:0.692
--Current learing rate is 1.0000e-04:
--Time of epochs 1 is : 0:00:41.108158
Saved PyTorch Model State to model_coffee_bean_best.pth
--Epoch: 2, Train_acc:85.8%, Train_loss:0.452, Test_acc:84.9%，Test_loss:0.340
--Current learing rate is 1.0000e-04:
--Time of epochs 2 is : 0:00:39.954869
--Epoch: 3, Train_acc:91.0%, Train_loss:0.287, Test_acc:82.2%，Test_loss:0.701
--Current learing rate is 9.5000e-05:
--Time of epochs 3 is : 0:00:39.575022
Saved PyTorch Model State to model_coffee_bean_best.pth
--Epoch: 4, Train_acc:93.8%, Train_loss:0.201, Test_acc:88.0%，Test_loss:0.473
--Current learing rate is 9.5000e-05:
--Time of epochs 4 is : 0:00:40.214949
--Epoch: 5, Train_acc:95.7%, Train_loss:0.145, Test_acc:84.0%，Test_loss:0.716
--Current learing rate is 9.0250e-05:
--Time of epochs 5 is : 0:00:39.753750
Saved PyTorch Model State to model_coffee_bean_best.pth
--Epoch: 6, Train_acc:94.7%, Train_loss:0.204, Test_acc:88.9%，Test_loss:0.456
--Current learing rate is 9.0250e-05:
--Time of epochs 6 is : 0:00:40.065153
--Epoch: 7, Train_acc:97.4%, Train_loss:0.060, Test_acc:85.8%，Test_loss:0.541
--Current learing rate is 8.5737e-05:
--Time of epochs 7 is : 0:00:39.523396
Saved PyTorch Model State to model_coffee_bean_best.pth
--Epoch: 8, Train_acc:98.1%, Train_loss:0.051, Test_acc:89.3%，Test_loss:0.668
--Current learing rate is 8.5737e-05:
--Time of epochs 8 is : 0:00:39.791922
Saved PyTorch Model State to model_coffee_bean_best.pth
--Epoch: 9, Train_acc:99.3%, Train_loss:0.020, Test_acc:93.8%，Test_loss:0.515
--Current learing rate is 8.1451e-05:
--Time of epochs 9 is : 0:00:39.805669
--Epoch:10, Train_acc:99.6%, Train_loss:0.011, Test_acc:93.8%，Test_loss:0.436
--Current learing rate is 8.1451e-05:
--Time of epochs 10 is : 0:00:39.746361
========================== Done =========================
best train-acc: 99.33%, best_test_acc: 93.78%
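
The positions of the `Saved PyTorch Model State` lines in the log follow from the saving rule, which requires train and test accuracy to both exceed their previous bests. The sketch below replays that rule on the accuracies printed above (rounded to one decimal, as in the log); `saved_epochs` is a hypothetical helper.

```python
def saved_epochs(history):
    """Return the 1-based epochs at which the checkpoint would be written."""
    best_train, best_test, saved = 0.0, 0.0, []
    for epoch, (train_acc, test_acc) in enumerate(history, start=1):
        # Both metrics must strictly improve on the last saved values.
        if train_acc > best_train and test_acc > best_test:
            best_train, best_test = train_acc, test_acc
            saved.append(epoch)
    return saved

# (train_acc %, test_acc %) per epoch, copied from the log above.
history = [(71.6, 83.6), (85.8, 84.9), (91.0, 82.2), (93.8, 88.0), (95.7, 84.0),
           (94.7, 88.9), (97.4, 85.8), (98.1, 89.3), (99.3, 93.8), (99.6, 93.8)]
print(saved_epochs(history))  # [1, 2, 4, 6, 8, 9]
```

Epoch 10 has the highest training accuracy but is not saved, because its test accuracy only ties the best. This is why the final report shows a best train accuracy of 99.33% rather than 99.6%.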

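4. Visualization

[Image not reproduced here.] The accuracy and loss curves can be drawn after training by calling check_func(train_loss, train_acc, test_loss, test_acc).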
