[PyTorch] Reproducing AlexNet

Disclaimer

This post is fairly long, so make good use of the table of contents!
  • The code runs as-is. Most of it is adapted from other people's work; I added some comments, and the test part is my own.
  • Many comments in the code are leftovers from my edits, so the reading experience is not great.
  • The code quality is poor; it is for learning purposes only.
  • There is no LRN, which you might have expected (VGG later showed it contributes little to training).
  • The ImageNet dataset is not used; the Dogs vs. Cats dataset is used instead.
  • No PCA-based color augmentation.
  • Overlapping max pooling (kernel 3, stride 2) is kept as in the paper, although it fell out of use later.
  • No dual-GPU parallelism; the model here simulates a single branch of the two-GPU network (half the channels).
  • Only top-1 accuracy is reported, no top-5 (there are only two classes, cat and dog).
  • No model ensembling.
  • Transfer performance is not evaluated.
  • No pretraining / transfer learning either (the Dogs vs. Cats dataset is large enough to train from scratch).

AlexNet Network Architecture

(figure: AlexNet network architecture)
Note:

  • The effective input size for AlexNet is 227 × 227 × 3, rather than the 224 × 224 × 3 written in the original paper.

Feature map size calculation (feel free to skip)

(figures: layer-by-layer feature map size derivations)
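
For reference (a quick recap of my own, since the original derivation figures are not reproduced here), all of those sizes follow from the standard convolution/pooling output-size formula

W_out = ⌊(W_in + 2P − K) / S⌋ + 1

where K is the kernel size, S the stride, and P the padding. For Conv1 in the code below (input 224, K=11, S=4, P=2): ⌊(224 + 2·2 − 11) / 4⌋ + 1 = ⌊54.25⌋ + 1 = 55, which matches the output[48, 55, 55] annotation in the code.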


Model Code

The code for building the network is as follows:

import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes=1000, init_weights=False):
        super(AlexNet, self).__init__()

        # Feature extractor (convolutional layers)
        self.features = nn.Sequential(
            # Conv1
            nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2),  # args: in_channels, out_channels; input[3, 224, 224] -> output[48, 55, 55] (fractional sizes are floored)
            nn.ReLU(inplace=True),  # inplace=True modifies the tensor coming from the preceding Conv2d in place, saving memory by not allocating an extra copy
            nn.MaxPool2d(kernel_size=3, stride=2),  # output[48, 27, 27]; channel counts throughout are half of the original paper (one GPU branch)

            # Conv2
            nn.Conv2d(48, 128, kernel_size=5, stride=1, padding=2),  # output[128, 27, 27]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),   # output[128, 13, 13]

            # Conv3
            nn.Conv2d(128, 192, kernel_size=3, stride=1, padding=1),  # output[192, 13, 13]
            nn.ReLU(inplace=True),

            # Conv4
            nn.Conv2d(192, 192, kernel_size=3, stride=1, padding=1),  # output[192, 13, 13]
            nn.ReLU(inplace=True),

            # Conv5
            nn.Conv2d(192, 128, kernel_size=3, stride=1, padding=1),  # output[128, 13, 13]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2)  # output[128, 6, 6]
        )

        # 全连接层
        self.classifier = nn.Sequential(
            # FC_1
            nn.Dropout(p=0.5),
            nn.Linear(128 * 6 * 6, 2048),  # flattened input: 128*6*6 = 4608 features; output: 2048
            nn.ReLU(inplace=True),

            # FC_2
            nn.Dropout(p=0.5),
            nn.Linear(in_features=2048, out_features=2048),
            nn.ReLU(inplace=True),

            # FC_3
            nn.Linear(in_features=2048, out_features=num_classes)
        )

        if init_weights:
            self._initialize_weights()  # optionally apply the initialization strategy below

    def _initialize_weights(self):
        # The original post references this method but omits its body; this is a standard reconstruction:
        # Kaiming initialization for the conv layers, small Gaussian for the linear layers.
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        x = self.features(x)  # run the batch of images through the convolutional layers
        x = torch.flatten(x, start_dim=1)  # flatten to [batch, 128*6*6] before the FC layers
        x = self.classifier(x)  # run the flattened features through the FC layers
        return x

if __name__ == "__main__":
    model = AlexNet()
    print(model)

Note that many AlexNet implementations found online are not the original network: they modify the filters, so the feature map sizes change as well.

Our code stays faithful to the original authors (it reproduces a single branch of the two-GPU network, so each layer has half the channels).

You can run this code to print and inspect the model structure.
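
If you just want to confirm that the tensor shapes line up, a minimal sanity check (my addition; it assumes the class above is saved as model.py, as the later scripts do) is to push a dummy batch through the network:

import torch
from model import AlexNet  # the class defined above

net = AlexNet(num_classes=2, init_weights=True)
dummy = torch.randn(1, 3, 224, 224)  # a fake batch of one 224×224 RGB image
print(net(dummy).shape)              # expected: torch.Size([1, 2])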

Training Set

Compared with ImageNet, the Dogs vs. Cats dataset is also a good fit for AlexNet.

The Dogs vs. Cats dataset can be downloaded from Kaggle.

Data Splitting

The code is as follows:

import os
from shutil import copy
import random


def mkfile(file):
    if not os.path.exists(file):
        os.makedirs(file)


if __name__ == "__main__":
    file = "train"

    flower_class = [cls for cls in os.listdir(file) if ".txt" not in cls]

    # Train
    mkfile("dogsvscats/train")
    for cls in flower_class:
        mkfile("dogsvscats/train/" + cls)

    # Val
    mkfile("dogsvscats/val")
    for cls in flower_class:
        mkfile("dogsvscats/val/" + cls)

    split_rate = 0.1

    for cls in flower_class:
        cls_path = file + '/' + cls + '/'
        images = os.listdir(cls_path)
        num = len(images)
        eval_idx = random.sample(images, k=int(num * split_rate))

        for idx, img in enumerate(images):
            if img in eval_idx:
                img_path = cls_path + img
                new_path = "dogsvscats/val/" + cls
                copy(img_path, new_path)
            else:
                img_path = cls_path + img
                new_path = "dogsvscats/train/" + cls
                copy(img_path, new_path)

            print("\r[{}] processing [{}/{}]".format(cls, idx + 1, num), end="")
        print()
    print("processing done!")

It is worth noting that we only split the train part of the Dogs vs. Cats dataset, at a 9:1 ratio, into "train" and "val".

We use the dataset's original "val" split as our model's "test" set.
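
After running the split script, the directory layout should look roughly like this (a sketch of the expected structure, not copied from the original post):

dogsvscats/
├── train/          # ~90% of the images
│   ├── cat/
│   └── dog/
└── val/            # ~10% of the images
    ├── cat/
    └── dog/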

Training

Our training code is as follows:

import torch
import torch.nn as nn
from torchvision import transforms, datasets, utils
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np
import torch.optim as optim
from model import AlexNet
import os
import json
import time

os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'

# Automatically select the training device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print("训练方式为:", device)  # the string literal ("training device") is kept as-is so it matches the console logs below

# Define the transforms for the training and validation sets
preprocess = {  # a plain dict of "split name": transform pipeline
    "train": transforms.Compose([
                # Random crop + resize to 224×224 (RandomResizedCrop samples a random scale/aspect-ratio
                # region and resizes it, so it is not a fixed sliding-window crop)
                transforms.RandomResizedCrop((224, 224)),
                transforms.RandomHorizontalFlip(),  # random horizontal flip doubles the effective number of patches
                transforms.ToTensor(),  # PIL Image -> Tensor, scales [0, 255] to [0, 1]
                # Normalize: first argument is the per-channel mean, second the per-channel std
                # (3 channels, so 3 values each)
                transforms.Normalize((.5, .5, .5), (.5, .5, .5))
    ]),

    "val": transforms.Compose([
                transforms.Resize((224, 224)),  # must be (224, 224); a single int would only resize the shorter side
                transforms.ToTensor(),  # PIL Image -> Tensor, scales [0, 255] to [0, 1]
                transforms.Normalize((.5, .5, .5), (.5, .5, .5))  # standardize: subtract mean, divide by std
    ])
}

# os.getcwd() returns the current working directory (cwd)
data_root = os.getcwd()
img_path = data_root + "/dogsvscats/"

# """
# torchvision is PyTorch's vision toolkit and provides many image-processing utilities.
# datasets.ImageFolder (which yields PIL Images by default) loads images from folders and creates class labels
# automatically; transforms handles cropping, rotation, normalization, standardization, etc.
# DataLoader shuffles a dataset, splits it into batches, and can load them in parallel.
#
# ImageFolder behaviour:
#         1. Each folder name is a class name.
#         2. Labels 0, 1, 2, ... are assigned to the folders automatically (see the class_to_idx and imgs attributes).
#         3. Each item returned is a (data, label) pair.
#
# ========================================================================================================================
# # Create an ImageFolder object: first argument is the path, second the transform
# dataset = torchvision.datasets.ImageFolder(root=img_path + "train", transform=data_transform["train"])
#
# print(train_dataset)  # a wrapper object that prints some statistics about the dataset

# print(train_dataset.class_to_idx)  # class names and their corresponding labels
# print(train_dataset.imgs)  # paths of all images and their labels
# """

# Training set: path and preprocessing
train_dataset = datasets.ImageFolder(root=img_path + "train",
                                     transform=preprocess["train"])
# Validation set: path and preprocessing
validate_dataset = datasets.ImageFolder(root=img_path + "val",
                                        transform=preprocess["val"])

# Number of images in the training and validation sets
train_num = len(train_dataset)
validate_num = len(validate_dataset)


# Mapping from class name to label index, e.g. {'cat': 0, 'dog': 1}
flower_dict = train_dataset.class_to_idx
# Invert the dict so that it maps label index to class name
cls_dict = dict((val, key) for key, val in flower_dict.items())  # {0: 'cat', 1: 'dog'}

# Write cls_dict to a JSON file
"""
    1. json.dumps() and json.loads() work with JSON strings:
          (1) json.dumps() encodes a Python object (e.g. a dict) into a JSON string
          (2) json.loads() decodes a JSON string back into a Python object (e.g. a dict)
    2. json.dump() and json.load() read and write JSON files directly
"""
json_str = json.dumps(cls_dict, indent=4)
"""
{
    "0": "cat",
    "1": "dog",
}
"""

# Write json_str to disk as class_indices.json
with open('class_indices.json', 'w') as json_file:
    json_file.write(json_str)

# Number of images fed to the network per step (the batch size)
batch_size = 32

# Training DataLoader — note that train_loader is an iterable
train_loader = DataLoader(dataset=train_dataset,
                          batch_size=batch_size,
                          shuffle=True)
# print(next(iter(train_loader)))  # prints the first batch: batch_size image tensors followed by their labels
"""
[the 32 image tensors of the batch (already normalized), followed by their 32 labels]
"""

# Validation DataLoader
validate_loader = DataLoader(dataset=validate_dataset,
                             batch_size=batch_size,
                             shuffle=True)

# Inspect some (preprocessed) validation images
test_data_iter = iter(validate_loader)
test_img, test_label = next(test_data_iter)  # test_img holds batch_size preprocessed images, test_label their labels

# print(test_img)
# print(test_label)

# Take image 0 of the batch to inspect and visualize
# print(test_img[0].size(), type(test_img[0]))  # torch.Size([3, 224, 224]) <class 'torch.Tensor'>
# print(test_label[0], test_label[0].item(), type(test_label[0]))  # e.g. tensor(1) 1 <class 'torch.Tensor'> — .item() converts a Tensor to a plain Python number

# Display an image
def imshow(img):
    img = img / 2 + 0.5  # undo the normalization
    npimg = img.numpy()  # the image is a torch.Tensor; convert it to numpy so matplotlib can display it
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # convert [3, 224, 224] to [224, 224, 3]
    plt.show()

# print("; ".join("%5s" % cls_dict[test_label[j].item()] for j in range(32)))
# imshow(utils.make_grid(test_img))


# Instantiate the AlexNet model
model = AlexNet(num_classes=2, init_weights=True)

# Move the model to the GPU if it is available
model.to(device)

# Loss function
loss_function = nn.CrossEntropyLoss()

# Optimizer
optimizer = optim.SGD(params=model.parameters(),
                      lr=0.0008,
                      momentum=0.9,
                      weight_decay=0.0005)

# Path for saving the trained weights
save_path = "./[dogsvscats]AlexNet.pth"

# Best accuracy so far
best_acc = 0.0

# Train and validate alternately, one epoch at a time
for epoch in range(20):
    # Train
    model.train()  # training mode: the Dropout(p=0.5) layers defined in the network are active
    running_loss = 0.0
    t1 = time.perf_counter()
    for step, data in enumerate(train_loader, start=0):
        imgs, labels = data
        optimizer.zero_grad()
        outputs = model(imgs.to(device))

        loss = loss_function(outputs, labels.to(device))
        loss.backward()
        optimizer.step()

        # accumulate statistics for printing
        running_loss += loss.item()

        rate = (step + 1) / len(train_loader)
        a = "*" * int(rate * 50)
        b = "." * int((1-rate) * 50)
        print("\rtrain loss: {:^3.0f}%[{}->{}]{:.3f}".format(int(rate*100), a, b, loss), end="")
    print()
    print("耗费时间为:", time.perf_counter()-t1)

    # Validate
    model.eval()  # evaluation mode: Dropout is disabled and all neurons are used for inference
    acc = 0.0
    with torch.no_grad():
        for val_data in validate_loader:
            val_imgs, val_labels = val_data
            outputs = model(val_imgs.to(device))
            # print(outputs.size())  # torch.Size([32, 2])
            predict_y = torch.softmax(outputs, dim=1).argmax(dim=1)  # softmax for class probabilities, argmax for the predicted class index
            # equivalent to: predict_y = torch.max(outputs, dim=1)[1]
            # print(predict_y)
            acc += (predict_y == val_labels.to(device)).sum().item()
        val_accuracy = acc / validate_num

        # keep the checkpoint with the best validation accuracy across epochs
        if val_accuracy > best_acc:
            best_acc = val_accuracy
            torch.save(model.state_dict(), save_path)
        print("[epoch %d] train_loss: %.3f    test_accuracy: %.3f" % (epoch + 1, running_loss / step, val_accuracy))
        print("-" * 100)

print("Training Finished!")

Our training results are as follows:

训练方式为: cuda:0
train loss: 100%[**************************************************->]0.695
耗费时间为: 77.8902389
[epoch 1] train_loss: 0.693    test_accuracy: 0.541
train loss: 100%[**************************************************->]0.694
耗费时间为: 73.58257280000001
[epoch 2] train_loss: 0.686    test_accuracy: 0.631
train loss: 100%[**************************************************->]0.572
耗费时间为: 73.7966395
[epoch 3] train_loss: 0.668    test_accuracy: 0.628
train loss: 100%[**************************************************->]0.666
耗费时间为: 74.0734688
[epoch 4] train_loss: 0.649    test_accuracy: 0.631
train loss: 100%[**************************************************->]0.559
耗费时间为: 69.92790500000001
[epoch 5] train_loss: 0.629    test_accuracy: 0.637
train loss: 100%[**************************************************->]0.690
耗费时间为: 70.44158830000003
[epoch 6] train_loss: 0.614    test_accuracy: 0.710
train loss: 100%[**************************************************->]0.441
耗费时间为: 72.88411449999995
[epoch 7] train_loss: 0.600    test_accuracy: 0.706
train loss: 100%[**************************************************->]0.619
耗费时间为: 76.24907659999997
[epoch 8] train_loss: 0.583    test_accuracy: 0.577
train loss: 100%[**************************************************->]0.606
耗费时间为: 70.76603890000001
[epoch 9] train_loss: 0.567    test_accuracy: 0.724
train loss: 100%[**************************************************->]0.537
耗费时间为: 72.8174573
[epoch 10] train_loss: 0.553    test_accuracy: 0.769
train loss: 100%[**************************************************->]0.616
耗费时间为: 72.39202279999995
[epoch 11] train_loss: 0.530    test_accuracy: 0.691
train loss: 100%[**************************************************->]0.587
耗费时间为: 72.95349450000003
[epoch 12] train_loss: 0.525    test_accuracy: 0.783
train loss: 100%[**************************************************->]0.574
耗费时间为: 72.2609660999999
[epoch 13] train_loss: 0.515    test_accuracy: 0.792
train loss: 100%[**************************************************->]0.508
耗费时间为: 72.51263999999992
[epoch 14] train_loss: 0.491    test_accuracy: 0.760
train loss: 100%[**************************************************->]0.603
耗费时间为: 72.61705889999985
[epoch 15] train_loss: 0.480    test_accuracy: 0.719
train loss: 100%[**************************************************->]0.445
耗费时间为: 73.71669989999987
[epoch 16] train_loss: 0.470    test_accuracy: 0.783
train loss: 100%[**************************************************->]0.424
耗费时间为: 72.68489569999997
[epoch 17] train_loss: 0.462    test_accuracy: 0.739
train loss: 100%[**************************************************->]0.384
耗费时间为: 72.6828700000001
[epoch 18] train_loss: 0.450    test_accuracy: 0.768
train loss: 100%[**************************************************->]0.521
耗费时间为: 72.88758529999996
[epoch 19] train_loss: 0.441    test_accuracy: 0.810
train loss: 100%[**************************************************->]0.650
耗费时间为: 69.89989370000012
[epoch 20] train_loss: 0.428    test_accuracy: 0.842
Training Finished!

Process finished with exit code 0

We can see that train_loss keeps decreasing as training goes on; with more epochs, the accuracy should keep improving steadily.

Note: blindly increasing the number of epochs, however, may lead to overfitting (a simple guard is sketched below).
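
One simple guard (my addition, not part of the original training script) is early stopping driven by the validation accuracy the loop already computes:

# Hypothetical early-stopping add-on; these lines would replace the
# "save the best checkpoint" block inside the epoch loop of the training script above.
patience = 5               # stop after 5 epochs without improvement
epochs_no_improve = 0

# ... inside the epoch loop, after val_accuracy has been computed:
if val_accuracy > best_acc:
    best_acc = val_accuracy
    epochs_no_improve = 0
    torch.save(model.state_dict(), save_path)
else:
    epochs_no_improve += 1
    if epochs_no_improve >= patience:
        print("Early stopping at epoch", epoch + 1)
        break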


Although we do not use the same learning rate as AlexNet, we keep the same momentum and weight_decay settings, and we train with the standard cross-entropy loss.

For the Dogs vs. Cats dataset, though, 0.842 is only barely satisfactory. Then again, AlexNet is a 2012 network, so we should not be too harsh.

Inference

We feed a single image into the network, and it outputs the predicted class and its probability.

The code is as follows:

import torch
from model import AlexNet
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
import json
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'

preprocess = transforms.Compose([transforms.Resize((224, 224)),
                                 transforms.ToTensor(),
                                 transforms.Normalize((.5, .5, .5), (.5, .5, .5))])

# Load the test image
img = Image.open("./test_cat.jpg")

# Display the image
plt.imshow(img)

# Preprocess the image
img = preprocess(img)  # now a tensor of shape [C, H, W]
# print(img.size())  # torch.Size([3, 224, 224])

# Add a batch dimension
img = torch.unsqueeze(img, dim=0)  # dim=0 inserts a new dimension at the front, giving [N, C, H, W]
print(img.size())  # torch.Size([1, 3, 224, 224])

# Load the class label mapping
try:
    json_file = open("./class_indices.json", "r")
    class_indict = json.load(json_file)
except Exception as e:
    print(e)  # print the exception
    exit(-1)  # abort

# Instantiate the model
model = AlexNet(num_classes=2, init_weights=False)

# Load the trained weights
model_weight_path = "./[dogsvscats]AlexNet.pth"
model.load_state_dict(torch.load(model_weight_path))

# Switch to evaluation mode (disables Dropout)
model.eval()

# Run inference
with torch.no_grad():
    output = model(img)

    output = torch.squeeze(output)

    predict = torch.softmax(output, dim=0)  # dim=0 because the batch dimension was squeezed away
    print(predict)  # e.g. tensor([0.9700, 0.0300])
    predict_cls = torch.argmax(predict).numpy()

print(class_indict[str(predict_cls)], predict[predict_cls].item())
plt.show()

We take a random cat image from the val split of the original Dogs vs. Cats dataset and run inference; the result is as follows:
(figure: prediction result for a cat image from the val split)
We can see that the confidence is fairly high.

(figure: prediction result for another cat image)

This one was just found at random online.

We also try a few images of cats in less typical poses.
(figures: predictions on several atypical cat images)
One cat was classified as a dog, though with low confidence.
This suggests that our network does not handle objects in unusual poses very well.

Anime Style

(figure: anime-style cat image)
Classified as a dog.

(figure: anime-style image)
Still a dog.

(figure: anime-style image)
Predicted correctly, though we cannot rule out... you know.

(figure: anime-style image)
Well...


Try it yourself if you have time.

Test Set

We rename the val folder of the original Dogs vs. Cats dataset to test_data.
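
The rename itself is a one-liner (shown for completeness; adjust the path to wherever your copy of the original val folder lives):

import os
os.rename("val", "test_data")  # assumes the original val folder sits in the current working directory
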
The test code is as follows:

import torch
import torch.nn as nn
from torchvision import transforms, datasets, utils
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np
import torch.optim as optim
from model import AlexNet
import os
import json
import time

os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'

# Automatically select the device for testing
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print("训练方式为:", device)  # the string literal still says "training device"; kept as-is so it matches the logs below

# Preprocessing for the test set
preprocess = {  # a plain dict of "split name": transform pipeline
    "test": transforms.Compose([
                transforms.Resize((224, 224)),  # must be (224, 224); a single int would only resize the shorter side
                transforms.ToTensor(),  # PIL Image -> Tensor, scales [0, 255] to [0, 1]
                transforms.Normalize((.5, .5, .5), (.5, .5, .5))  # standardize: subtract mean, divide by std
    ])
}

# os.getcwd() returns the current working directory (cwd)
data_root = os.getcwd()
img_path = data_root + "/test_data/"

# Wrap the test data with ImageFolder
test_dataset = datasets.ImageFolder(root=img_path,
                                     transform=preprocess["test"])

# Number of test images
test_num = len(test_dataset)  # 5000
print("测试集数据个数为:", test_num)  # "测试集数据个数为" = "number of test images"; kept as-is to match the logs

# Load the class label mapping
try:
    json_file = open("./class_indices.json", "r")
    class_indict = json.load(json_file)
except Exception as e:
    print(e)  # print the exception
    exit(-1)  # abort

# Number of images fed to the network per batch
batch_size = 32

# Test DataLoader — note that test_loader is an iterable
test_loader = DataLoader(dataset=test_dataset,
                          batch_size=batch_size,
                          shuffle=True)

# Instantiate the model
model = AlexNet(num_classes=2, init_weights=True)  # init_weights is irrelevant here; the trained weights are loaded below

# Load the trained weights
model_weight_path = "./[dogsvscats]AlexNet.pth"
model.load_state_dict(torch.load(model_weight_path))

# Switch to evaluation mode (disables Dropout)
model.eval()

# Move the model to the GPU if it is available
model.to(device)

# Display an image
def imshow(img):
    img = img / 2 + 0.5  # undo the normalization
    npimg = img.numpy()  # the image is a torch.Tensor; convert it to numpy so matplotlib can display it
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # convert [3, 224, 224] to [224, 224, 3]
    plt.show()

# Run inference over the test set
acc = 0
with torch.no_grad():
    t1 = time.time()
    for test_data in test_loader:
        # print a separator line for each batch
        print("=" * 100)

        test_imgs, test_labels = test_data
        outputs = model(test_imgs.to(device))

        # print(outputs.size())  # torch.Size([32, 2])
        # print("outputs:", outputs)

        # class probabilities for the batch: each row is an image, each column one of the 2 classes
        batch_percent = torch.softmax(outputs, dim=1)  # softmax over the class dimension
        # print("batch_percent", batch_percent)  # all entries positive, each row sums to 1
        # print("batch_percent.size()", batch_percent.size())  # torch.Size([32, 2])
        # print("batch_percent[0][0].item()", batch_percent[0][0].item())  # 0.2794545888900757
        # print("type(batch_percent[0][0].item())", type(batch_percent[0][0].item()))  # <class 'float'>
        # sum = torch.sum(batch_percent, dim=1)

        """        # 计算所有batch中的最大值
        max_percent = torch.max(batch_percent, dim=0)
        print("max_percent: ", max_percent)
        print("max_percent.size()", max_percent.size())"""

        # predicted label for each image in the batch
        predict_cls = torch.argmax(batch_percent, dim=1)  # argmax gives the predicted class index
        # print("len(predict_cls):", len(predict_cls))  # 32
        # print("predict_cls.size():", predict_cls.size())  # torch.Size([32])
        # print(predict_cls[0].item())

        # running count of correct predictions
        acc += (predict_cls == test_labels.to(device)).sum().item()

        # print("="*50)
        # 根据预测class的position反推json中的class_name
        # Tensor转换为numpy格式必须使用CPU,不能使用GPU
        class_position = predict_cls.cpu().numpy()  # 模型预测类别的标签位置
        # print(class_position)



        # print("batch_percent[0][1]: ", batch_percent[0][1])  # tensor(0.0003, device='cuda:0')
        # print("batch_percent[0][1]: ", batch_percent[0][1].item())  # 0.0002717699098866433(<class 'float'>)
        for i in range(batch_size):  # i in [0, 31]; batch_percent has size [32, 2]
            if len(batch_percent) < batch_size:
                break  # skip per-image printing for the final, smaller batch
            class_score = batch_percent[i]  # the 2 class probabilities for image i
            # print("class_score:", class_score)
            # print("class_score.size():", class_score.size())  # torch.Size([2])

            # move to CPU / numpy before indexing with plain integers
            class_score = class_score.cpu().numpy()
            # print("class_score:", class_score)
            # print("type(class_score):", type(class_score))  # <class 'numpy.ndarray'>

            # probability of the predicted class (despite the name, this is a softmax probability, not a logit)
            logits = class_score[class_position[i]]
            # print("logits:", logits)  # 0.558576

            # class name from the json mapping, plus its softmax probability
            class_name = class_indict[str(class_position[i])]
            class_score = batch_percent[i][class_position[i]]
            print("{} -> {:.5f}%".format(class_name, logits*100))
            # print(class_name, class_score)


    # overall accuracy
    test_accuracy = acc / test_num
    t2 = time.time()
    print("-"*100)
    print("该测试集共有 {}张 图片,其中预测正确的个数为 {}张:".format(test_num, acc))  # "the test set has {} images, of which {} were predicted correctly"
    print("测试集的全部准确率为:{:.4f}%".format(test_accuracy*100))  # "overall test set accuracy"

    print("※※※※※※※※Training Finished!※※※※※※※※")
    print("耗费时间为:{:3f}s".format(t2-t1))  # "耗费时间为" = "time elapsed"

    # TODO: find the 32 highest-confidence images and visualize them (with probability and class)
    # print(class_indict[str(predict_cls)], batch_percent[predict_cls].item())

The results are as follows:

训练方式为: cuda:0
测试集数据个数为: 5000
====================================================================================================
cat -> 53.13779%
cat -> 90.49326%
cat -> 85.61757%
cat -> 88.89946%
...
...
cat -> 99.23261%
dog -> 89.79341%
dog -> 83.16571%
====================================================================================================
----------------------------------------------------------------------------------------------------
该测试集共有 5000张 图片,其中预测正确的个数为 4191张:
测试集的全部准确率为:83.8200%
※※※※※※※※Training Finished!※※※※※※※※
耗费时间为:23.785549s

Process finished with exit code 0

Overfitting

We can train for more epochs and check against the test set to see whether the model starts to overfit. A small plotting helper for this is sketched below.
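
To see this at a glance, you can record the per-epoch numbers during training and plot them afterwards (a sketch of my own; the two lists would be filled inside the training loop, e.g. train_losses.append(running_loss / step) and val_accs.append(val_accuracy)):

import matplotlib.pyplot as plt

def plot_curves(train_losses, val_accs):
    # one value per epoch in each list
    epochs = range(1, len(train_losses) + 1)
    plt.plot(epochs, train_losses, label="train_loss")
    plt.plot(epochs, val_accs, label="val_accuracy")
    plt.xlabel("epoch")
    plt.legend()
    plt.show()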


[epoch=50] Training Results

Here we set the number of epochs to 50; the results are as follows:

训练方式为: cuda:0
train loss: 100%[**************************************************->]0.689
耗费时间为: 73.5003909
[epoch 1] train_loss: 0.692    test_accuracy: 0.596
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.692
耗费时间为: 68.46996309999999
[epoch 2] train_loss: 0.683    test_accuracy: 0.629
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.562
耗费时间为: 70.4520397
[epoch 3] train_loss: 0.665    test_accuracy: 0.597
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.716
耗费时间为: 73.65075540000001
[epoch 4] train_loss: 0.653    test_accuracy: 0.638
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.525
耗费时间为: 72.74129720000002
[epoch 5] train_loss: 0.635    test_accuracy: 0.610
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.701
耗费时间为: 73.9800553
[epoch 6] train_loss: 0.621    test_accuracy: 0.690
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.578
耗费时间为: 71.0275623
[epoch 7] train_loss: 0.603    test_accuracy: 0.682
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.642
耗费时间为: 69.31189290000009
[epoch 8] train_loss: 0.593    test_accuracy: 0.649
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.680
耗费时间为: 72.82196640000006
[epoch 9] train_loss: 0.575    test_accuracy: 0.696
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.449
耗费时间为: 75.14971049999997
[epoch 10] train_loss: 0.560    test_accuracy: 0.746
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.535
耗费时间为: 74.8120811
[epoch 11] train_loss: 0.546    test_accuracy: 0.754
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.418
耗费时间为: 73.70157189999998
[epoch 12] train_loss: 0.535    test_accuracy: 0.760
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.628
耗费时间为: 74.96500990000004
[epoch 13] train_loss: 0.514    test_accuracy: 0.661
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.401
耗费时间为: 73.34336619999999
[epoch 14] train_loss: 0.507    test_accuracy: 0.786
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.403
耗费时间为: 73.72266659999991
[epoch 15] train_loss: 0.493    test_accuracy: 0.787
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.377
耗费时间为: 72.20094769999992
[epoch 16] train_loss: 0.483    test_accuracy: 0.805
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.301
耗费时间为: 72.83861309999998
[epoch 17] train_loss: 0.472    test_accuracy: 0.781
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.351
耗费时间为: 73.76927139999998
[epoch 18] train_loss: 0.455    test_accuracy: 0.808
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.364
耗费时间为: 73.27734559999999
[epoch 19] train_loss: 0.450    test_accuracy: 0.785
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.438
耗费时间为: 70.3837105
[epoch 20] train_loss: 0.438    test_accuracy: 0.788
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.356
耗费时间为: 69.32952679999994
[epoch 21] train_loss: 0.427    test_accuracy: 0.794
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.560
耗费时间为: 69.50876799999992
[epoch 22] train_loss: 0.415    test_accuracy: 0.790
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.293
耗费时间为: 69.41906799999992
[epoch 23] train_loss: 0.408    test_accuracy: 0.841
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.352
耗费时间为: 69.51199009999982
[epoch 24] train_loss: 0.400    test_accuracy: 0.828
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.273
耗费时间为: 70.3065673000001
[epoch 25] train_loss: 0.391    test_accuracy: 0.798
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.412
耗费时间为: 68.7086746
[epoch 26] train_loss: 0.381    test_accuracy: 0.837
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.520
耗费时间为: 67.78685859999996
[epoch 27] train_loss: 0.384    test_accuracy: 0.868
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.477
耗费时间为: 66.98027090000005
[epoch 28] train_loss: 0.367    test_accuracy: 0.825
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.361
耗费时间为: 66.97244290000026
[epoch 29] train_loss: 0.359    test_accuracy: 0.870
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.388
耗费时间为: 66.93381369999997
[epoch 30] train_loss: 0.355    test_accuracy: 0.868
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.527
耗费时间为: 66.81049679999978
[epoch 31] train_loss: 0.346    test_accuracy: 0.858
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.368
耗费时间为: 67.0129224000002
[epoch 32] train_loss: 0.342    test_accuracy: 0.881
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.232
耗费时间为: 66.78838779999978
[epoch 33] train_loss: 0.340    test_accuracy: 0.849
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.194
耗费时间为: 66.80974939999987
[epoch 34] train_loss: 0.326    test_accuracy: 0.847
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.266
耗费时间为: 66.97147660000019
[epoch 35] train_loss: 0.326    test_accuracy: 0.877
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.367
耗费时间为: 66.94594989999996
[epoch 36] train_loss: 0.318    test_accuracy: 0.885
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.401
耗费时间为: 66.94098100000019
[epoch 37] train_loss: 0.316    test_accuracy: 0.855
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.380
耗费时间为: 66.94510420000006
[epoch 38] train_loss: 0.308    test_accuracy: 0.882
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.389
耗费时间为: 66.88110970000025
[epoch 39] train_loss: 0.313    test_accuracy: 0.872
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.337
耗费时间为: 67.09120940000003
[epoch 40] train_loss: 0.302    test_accuracy: 0.888
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.232
耗费时间为: 66.88292329999967
[epoch 41] train_loss: 0.305    test_accuracy: 0.892
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.233
耗费时间为: 66.96250150000014
[epoch 42] train_loss: 0.299    test_accuracy: 0.880
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.204
耗费时间为: 68.31975290000037
[epoch 43] train_loss: 0.292    test_accuracy: 0.870
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.403
耗费时间为: 67.1377738000001
[epoch 44] train_loss: 0.286    test_accuracy: 0.879
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.625
耗费时间为: 67.14625949999981
[epoch 45] train_loss: 0.287    test_accuracy: 0.886
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.328
耗费时间为: 74.95228959999986
[epoch 46] train_loss: 0.281    test_accuracy: 0.877
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.314
耗费时间为: 70.40748429999985
[epoch 47] train_loss: 0.284    test_accuracy: 0.904
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.264
耗费时间为: 70.86596940000027
[epoch 48] train_loss: 0.280    test_accuracy: 0.886
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.186
耗费时间为: 70.4489579000001
[epoch 49] train_loss: 0.273    test_accuracy: 0.893
----------------------------------------------------------------------------------------------------
train loss: 100%[**************************************************->]0.055
耗费时间为: 70.58004610000035
[epoch 50] train_loss: 0.272    test_accuracy: 0.898
----------------------------------------------------------------------------------------------------
Training Finished!

Process finished with exit code 0

Our best validation accuracy is 0.904.

In fact, the train_loss shows that the epoch count is still on the small side: the loss is still decreasing steadily.

[epoch=50] Test Set Results

训练方式为: cuda:0
测试集数据个数为: 5000
====================================================================================================
cat -> 89.36078%
dog -> 99.01215%
dog -> 97.76264%
cat -> 91.98765%
dog -> 99.97818%
dog -> 59.36930%
dog -> 95.63810%
dog -> 58.91334%
cat -> 99.98029%
dog -> 96.82398%
cat -> 99.80793%
cat -> 99.70891%
dog -> 59.67871%
...
...
cat -> 99.50975%
dog -> 99.99977%
cat -> 71.69167%
dog -> 99.81134%
cat -> 81.49423%
cat -> 62.52128%
dog -> 99.98790%
====================================================================================================
----------------------------------------------------------------------------------------------------
该测试集共有 5000张 图片,其中预测正确的个数为 4474张:
测试集的全部准确率为:89.4800%
※※※※※※※※Training Finished!※※※※※※※※
耗费时间为:23.674946s

Process finished with exit code 0

Conclusion

Compared with [epoch=20]:

  • The training loss is lower: [train_loss] 0.272 vs. 0.428
  • The validation accuracy is higher: [test_accuracy] 0.904 vs. 0.842
  • The test set accuracy is higher: 89.4800% vs. 83.8200%

So we can conclude:

  • epoch=50 does not cause the model to overfit; we can keep increasing the epoch count or tune the initial lr (see the scheduler sketch below).
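
For example, one option (not in the original code) is to mimic the paper's heuristic of dropping the learning rate when progress stalls, using a scheduler:

# Hypothetical addition to the training script: decay the learning rate by 10x
# every 15 epochs, a rough stand-in for AlexNet's "divide the lr by 10 when the
# validation error stops improving" rule.
from torch.optim.lr_scheduler import StepLR

scheduler = StepLR(optimizer, step_size=15, gamma=0.1)

# then call scheduler.step() once at the end of each training epoch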

References

  1. https://blog.csdn.net/weixin_44023658/article/details/105798326