- 🍨 This post is a learning-log entry for the 🔗365天深度学习训练营
- 🍖 Original author: K同学啊 | tutoring and custom projects available
- 🚀 Source: K同学's study circle

🍺 Requirements:
1. Build the VGG-16 network from scratch
2. Call the official VGG-16 implementation
3. Inspect the model's parameter count and related metrics

🍻 Stretch goals (optional):
1. Reach 100% validation accuracy
2. Draw the VGG-16 architecture diagram in PPT (a skill you will need when publishing papers)

🔎 Exploration (fairly difficult):
1. Make the model lighter without hurting accuracy
- The current VGG-16 Total params is 134,276,932

🏡 My environment:
- Language: Python 3.8
- Editor: Jupyter Lab
- Deep learning framework: PyTorch
- torchvision==0.13.1+cu113
- torch==1.12.1+cu113
I. Preparation
1. Set up the GPU
Use the GPU if the machine has one; otherwise fall back to the CPU.
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision
from torchvision import transforms, datasets
import os,PIL,pathlib,warnings
warnings.filterwarnings("ignore")  # suppress warnings
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
Output (this machine has no GPU, so the CPU will have to do for now):
device(type='cpu')
2. Import the data
Fetch the data and inspect its class names.
data_dir = './咖啡豆识别/'
# convert the string path into a pathlib.Path object
data_dir = pathlib.Path(data_dir)
# glob('*') collects every entry directly under data_dir into a list
data_paths = list(data_dir.glob('*'))
# take each subfolder's name as a class name
# (path.name works on every OS, unlike splitting the path string on '\\')
classeNames = [path.name for path in data_paths]
# print the class names
classeNames
Output: the dataset contains the following classes:
['Dark', 'Green', 'Light', 'Medium']
# For more on transforms.Compose, see: https://blog.csdn.net/qq_38251616/article/details/124878863
train_transforms = transforms.Compose([
    transforms.Resize([224, 224]),        # resize every input image to a uniform size
    # transforms.RandomHorizontalFlip(),  # random horizontal flip (disabled here)
    transforms.ToTensor(),                # convert a PIL Image or numpy.ndarray to a tensor scaled to [0, 1]
    transforms.Normalize(                 # normalize toward a standard (Gaussian) distribution to help the model converge
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])        # statistics estimated by random sampling from the dataset
])
total_data = datasets.ImageFolder(data_dir, transform=train_transforms)
total_data
Output:
Dataset ImageFolder
Number of datapoints: 1200
Root location: 咖啡豆识别
StandardTransform
Transform: Compose(
ToTensor()
Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=None)
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
total_data.class_to_idx
Output:
{'Dark': 0, 'Green': 1, 'Light': 2, 'Medium': 3}
3. Split the dataset
train_size = int(0.8 * len(total_data))
test_size = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
train_dataset, test_dataset
print('Training set size:', len(train_dataset))
print('Test set size:', len(test_dataset))
Output:
(<torch.utils.data.dataset.Subset at 0x266e32d0790>,
<torch.utils.data.dataset.Subset at 0x266e32d07f0>)
Training set size: 960
Test set size: 240
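Note that random_split draws a fresh random partition on every run. If you want the same train/test split each time, you can pass a seeded generator; below is a minimal sketch on a toy dataset (the `generator` argument and the seed 42 are additions of mine, not part of the original post):

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy dataset standing in for total_data; the seeded generator makes the
# 8/2 split identical on every run.
data = TensorDataset(torch.arange(10).float())
g = torch.Generator().manual_seed(42)
train_part, test_part = random_split(data, [8, 2], generator=g)
```

Re-running the two lines above with the same seed reproduces exactly the same index partition, which helps when comparing models trained in different sessions.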
4. Visualize the data
import matplotlib.pyplot as plt
from PIL import Image
image_folder = './咖啡豆识别/Dark/'
image_files = [f for f in os.listdir(image_folder) if f.endswith((".jpg", ".png", ".jpeg"))]
fig, axes = plt.subplots(3, 8, figsize=(16, 6))
for ax, img_file in zip(axes.flat, image_files):
    image_path = os.path.join(image_folder, img_file)
    img = Image.open(image_path)
    ax.imshow(img)
    ax.axis('off')
plt.tight_layout()
plt.show()
5. Load the data
batch_size = 32
train_dl = torch.utils.data.DataLoader(train_dataset,
                                       batch_size=batch_size,
                                       shuffle=True,
                                       num_workers=1)
test_dl = torch.utils.data.DataLoader(test_dataset,
                                      batch_size=batch_size,
                                      shuffle=True,
                                      num_workers=1)
for X, y in test_dl:
    print('Shape of X [N C H W]: ', X.shape)
    print('Shape of y: ', y.shape, y.dtype)
    break
Output:
Shape of X [N C H W]: torch.Size([32, 3, 224, 224])
Shape of y: torch.Size([32]) torch.int64
II. Building VGG-16 by Hand
VGG-16 structure:
- 13 convolutional layers, named blockX_convX
- 3 fully connected layers, named fcX and predictions
- 5 pooling layers, named blockX_pool

The name VGG-16 comes from its 16 weight layers (13 convolutional + 3 fully connected).
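A quick arithmetic check of where the classifier's input size comes from: each of the five pooling layers halves the spatial resolution, so a 224×224 input ends up as a 7×7 feature map with 512 channels before the fully connected layers.

```python
# Each MaxPool2d(kernel_size=2, stride=2) halves the spatial size.
size = 224
for _ in range(5):            # five pooling layers in VGG-16
    size //= 2
in_features = 512 * size * size
print(size, in_features)      # 7 25088
```

This 25088 is exactly the in_features of the first Linear layer in the model below.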
1. Build the model
import torch.nn.functional as F

class vgg16(nn.Module):
    def __init__(self):
        super(vgg16, self).__init__()
        # Convolutional block 1
        self.block1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
        )
        # Convolutional block 2
        self.block2 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
        )
        # Convolutional block 3
        self.block3 = nn.Sequential(
            nn.Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
        )
        # Convolutional block 4
        self.block4 = nn.Sequential(
            nn.Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
        )
        # Convolutional block 5
        self.block5 = nn.Sequential(
            nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
        )
        # Fully connected layers for classification
        self.classifier = nn.Sequential(
            nn.Linear(in_features=512*7*7, out_features=4096),
            nn.ReLU(),
            nn.Linear(in_features=4096, out_features=4096),
            nn.ReLU(),
            nn.Linear(in_features=4096, out_features=4)
        )

    def forward(self, x):
        x = self.block1(x)
        x = self.block2(x)
        x = self.block3(x)
        x = self.block4(x)
        x = self.block5(x)
        x = torch.flatten(x, start_dim=1)  # flatten to [N, 512*7*7] before the classifier
        x = self.classifier(x)
        return x
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))
model = vgg16().to(device)
model
Output:
Using cpu device
vgg16(
(block1): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU()
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU()
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(block2): Sequential(
(0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU()
(2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU()
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(block3): Sequential(
(0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU()
(2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU()
(4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(5): ReLU()
(6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(block4): Sequential(
(0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU()
(2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU()
(4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(5): ReLU()
(6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(block5): Sequential(
(0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU()
(2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU()
(4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(5): ReLU()
(6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU()
(2): Linear(in_features=4096, out_features=4096, bias=True)
(3): ReLU()
(4): Linear(in_features=4096, out_features=4, bias=True)
)
)
2. Inspect the model's details
summary.summary(model, (3, 224, 224)) prints a model summary using the torchsummary package. summary.summary() takes two arguments: the model to summarize and the input size. model is the PyTorch model instance to inspect, and input_size=(3, 224, 224) gives the per-image input dimensions: channels (3), height (224), and width (224). The summary reports the model's layer-by-layer structure, each layer's output shape and parameter count, the total number of parameters, and an estimate of the memory footprint.
# report the parameter count and other statistics
import torchsummary as summary
summary.summary(model, (3, 224, 224))
Output:
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 64, 224, 224] 1,792
ReLU-2 [-1, 64, 224, 224] 0
Conv2d-3 [-1, 64, 224, 224] 36,928
ReLU-4 [-1, 64, 224, 224] 0
MaxPool2d-5 [-1, 64, 112, 112] 0
Conv2d-6 [-1, 128, 112, 112] 73,856
ReLU-7 [-1, 128, 112, 112] 0
Conv2d-8 [-1, 128, 112, 112] 147,584
ReLU-9 [-1, 128, 112, 112] 0
MaxPool2d-10 [-1, 128, 56, 56] 0
Conv2d-11 [-1, 256, 56, 56] 295,168
ReLU-12 [-1, 256, 56, 56] 0
Conv2d-13 [-1, 256, 56, 56] 590,080
ReLU-14 [-1, 256, 56, 56] 0
Conv2d-15 [-1, 256, 56, 56] 590,080
ReLU-16 [-1, 256, 56, 56] 0
MaxPool2d-17 [-1, 256, 28, 28] 0
Conv2d-18 [-1, 512, 28, 28] 1,180,160
ReLU-19 [-1, 512, 28, 28] 0
Conv2d-20 [-1, 512, 28, 28] 2,359,808
ReLU-21 [-1, 512, 28, 28] 0
Conv2d-22 [-1, 512, 28, 28] 2,359,808
ReLU-23 [-1, 512, 28, 28] 0
MaxPool2d-24 [-1, 512, 14, 14] 0
Conv2d-25 [-1, 512, 14, 14] 2,359,808
ReLU-26 [-1, 512, 14, 14] 0
Conv2d-27 [-1, 512, 14, 14] 2,359,808
ReLU-28 [-1, 512, 14, 14] 0
Conv2d-29 [-1, 512, 14, 14] 2,359,808
ReLU-30 [-1, 512, 14, 14] 0
MaxPool2d-31 [-1, 512, 7, 7] 0
Linear-32 [-1, 4096] 102,764,544
ReLU-33 [-1, 4096] 0
Linear-34 [-1, 4096] 16,781,312
ReLU-35 [-1, 4096] 0
Linear-36 [-1, 4] 16,388
================================================================
Total params: 134,276,932
Trainable params: 134,276,932
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 218.52
Params size (MB): 512.23
Estimated Total Size (MB): 731.32
----------------------------------------------------------------
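Adding up the per-layer numbers in the summary shows where the 134M parameters live: the three Linear layers hold roughly 89% of the total, which is why the exploration task (making the model lighter) usually starts at the classifier. A quick check of the arithmetic:

```python
# Per-layer parameter counts copied from the summary table above.
conv_params = (1792 + 36928 + 73856 + 147584 + 295168 + 590080 + 590080
               + 1180160 + 2359808 + 2359808 + 2359808 + 2359808 + 2359808)
fc_params = 102764544 + 16781312 + 16388   # the three Linear layers
total = conv_params + fc_params
print(total)  # 134276932, matching "Total params" above
```

One common way to shrink the head (my suggestion, not the course's official answer) is to replace flatten + Linear(25088, 4096) with global average pooling followed by a single Linear(512, num_classes), which costs only 512*4 + 4 = 2052 parameters for the 4 coffee-bean classes.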
III. Training the Model
1. Write the training function
# Training loop
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)   # number of training samples
    num_batches = len(dataloader)    # number of batches, ceil(size / batch_size)
    train_loss, train_acc = 0, 0     # running loss and accuracy

    for X, y in dataloader:          # fetch a batch of images and labels
        X, y = X.to(device), y.to(device)

        # compute the prediction error
        pred = model(X)              # network output
        loss = loss_fn(pred, y)      # gap between the network output and the true labels

        # backpropagation
        optimizer.zero_grad()        # reset the grad attributes to zero
        loss.backward()              # backpropagate
        optimizer.step()             # update the weights

        # accumulate accuracy and loss
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()

    train_acc /= size
    train_loss /= num_batches
    return train_acc, train_loss
2. Write the test function
The test function mirrors the training function, but since no gradient descent is performed to update the weights, no optimizer needs to be passed in.
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)   # number of test samples
    num_batches = len(dataloader)    # number of batches, ceil(size / batch_size)
    test_loss, test_acc = 0, 0

    # no training happens here, so disable gradient tracking to save memory
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)

            # compute the loss
            target_pred = model(imgs)
            loss = loss_fn(target_pred, target)

            test_loss += loss.item()
            test_acc += (target_pred.argmax(1) == target).type(torch.float).sum().item()

    test_acc /= size
    test_loss /= num_batches
    return test_acc, test_loss
3. Train the model
learn_rate = 1e-4
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learn_rate)

import copy
epochs = 40
train_acc = []
train_loss = []
test_acc = []
test_loss = []
best_acc = 0    # best test accuracy so far, used to select the best model

for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    # scheduler.step()  # update the learning rate (only if a dynamic LR scheduler has been defined)
    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    # keep a copy of the best model in best_model
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        best_model = copy.deepcopy(model)

    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)

    # read the current learning rate
    lr = optimizer.state_dict()['param_groups'][0]['lr']
    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}, Lr:{:.6E}')
    print(template.format(epoch+1, epoch_train_acc*100, epoch_train_loss, epoch_test_acc*100, epoch_test_loss, lr))

# save the best model's weights to a file
PATH = './best_model.pth'   # output filename
torch.save(best_model.state_dict(), PATH)
print('Done')

# reload later with:
# model.load_state_dict(torch.load(PATH, map_location=device))
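The loop above mentions a dynamic learning-rate step. For that call to work, a scheduler has to be created from the optimizer first. Below is a minimal sketch; StepLR and its hyperparameters (halve the rate every 10 epochs) are my assumptions for illustration, since the original post does not say which scheduler was used:

```python
import torch

# Dummy parameter standing in for model.parameters().
params = [torch.zeros(1, requires_grad=True)]
optimizer = torch.optim.Adam(params, lr=1e-4)
# Halve the learning rate every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(10):
    optimizer.step()     # in the real loop this is where train(...) runs
    scheduler.step()
lr_after = optimizer.param_groups[0]['lr']
print(lr_after)  # 5e-05
```

Note the recommended order: optimizer.step() first, scheduler.step() once per epoch afterwards.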
Output (VGG-16 has a large parameter count, so the results below were produced on a rented remote GPU):
Using cuda device
Epoch: 1, Train_acc: 23.6%, Train_loss: 1.371, Test_acc: 27.5%, Test_loss: 1.222, Lr: 1.000000E-04
Epoch: 2, Train_acc: 51.1%, Train_loss: 1.005, Test_acc: 50.8%, Test_loss: 0.945, Lr: 1.000000E-04
Epoch: 3, Train_acc: 61.0%, Train_loss: 0.790, Test_acc: 60.8%, Test_loss: 0.726, Lr: 1.000000E-04
Epoch: 4, Train_acc: 68.6%, Train_loss: 0.664, Test_acc: 69.6%, Test_loss: 0.573, Lr: 1.000000E-04
Epoch: 5, Train_acc: 75.7%, Train_loss: 0.506, Test_acc: 77.5%, Test_loss: 0.475, Lr: 1.000000E-04
Epoch: 6, Train_acc: 74.5%, Train_loss: 0.522, Test_acc: 81.7%, Test_loss: 0.414, Lr: 1.000000E-04
Epoch: 7, Train_acc: 84.9%, Train_loss: 0.370, Test_acc: 87.5%, Test_loss: 0.334, Lr: 1.000000E-04
Epoch: 8, Train_acc: 96.1%, Train_loss: 0.125, Test_acc: 92.5%, Test_loss: 0.255, Lr: 1.000000E-04
Epoch: 9, Train_acc: 96.2%, Train_loss: 0.110, Test_acc: 95.8%, Test_loss: 0.089, Lr: 1.000000E-04
Epoch: 10, Train_acc: 96.7%, Train_loss: 0.080, Test_acc: 92.9%, Test_loss: 0.188, Lr: 1.000000E-04
Epoch: 11, Train_acc: 96.2%, Train_loss: 0.109, Test_acc: 92.1%, Test_loss: 0.252, Lr: 1.000000E-04
Epoch: 12, Train_acc: 94.8%, Train_loss: 0.153, Test_acc: 98.8%, Test_loss: 0.047, Lr: 1.000000E-04
Epoch: 13, Train_acc: 95.6%, Train_loss: 0.121, Test_acc: 95.8%, Test_loss: 0.115, Lr: 1.000000E-04
Epoch: 14, Train_acc: 97.5%, Train_loss: 0.063, Test_acc: 98.3%, Test_loss: 0.059, Lr: 1.000000E-04
Epoch: 15, Train_acc: 98.8%, Train_loss: 0.035, Test_acc: 97.1%, Test_loss: 0.080, Lr: 1.000000E-04
Epoch: 16, Train_acc: 97.7%, Train_loss: 0.062, Test_acc: 98.8%, Test_loss: 0.031, Lr: 1.000000E-04
Epoch: 17, Train_acc: 99.4%, Train_loss: 0.019, Test_acc: 99.6%, Test_loss: 0.027, Lr: 1.000000E-04
Epoch: 18, Train_acc: 99.1%, Train_loss: 0.033, Test_acc: 98.8%, Test_loss: 0.041, Lr: 1.000000E-04
Epoch: 19, Train_acc: 94.8%, Train_loss: 0.152, Test_acc: 92.9%, Test_loss: 0.152, Lr: 1.000000E-04
Epoch: 20, Train_acc: 98.1%, Train_loss: 0.062, Test_acc: 96.7%, Test_loss: 0.086, Lr: 1.000000E-04
Epoch: 21, Train_acc: 98.8%, Train_loss: 0.034, Test_acc: 98.8%, Test_loss: 0.031, Lr: 1.000000E-04
Epoch: 22, Train_acc: 98.5%, Train_loss: 0.033, Test_acc: 98.8%, Test_loss: 0.028, Lr: 1.000000E-04
Epoch: 23, Train_acc: 99.4%, Train_loss: 0.034, Test_acc: 98.3%, Test_loss: 0.043, Lr: 1.000000E-04
Epoch: 24, Train_acc: 99.2%, Train_loss: 0.030, Test_acc: 97.9%, Test_loss: 0.063, Lr: 1.000000E-04
Epoch: 25, Train_acc: 99.3%, Train_loss: 0.017, Test_acc: 99.2%, Test_loss: 0.029, Lr: 1.000000E-04
Epoch: 26, Train_acc: 99.6%, Train_loss: 0.016, Test_acc: 96.2%, Test_loss: 0.123, Lr: 1.000000E-04
Epoch: 27, Train_acc: 99.2%, Train_loss: 0.018, Test_acc: 98.8%, Test_loss: 0.025, Lr: 1.000000E-04
Epoch: 28, Train_acc: 99.6%, Train_loss: 0.007, Test_acc: 99.2%, Test_loss: 0.017, Lr: 1.000000E-04
Epoch: 29, Train_acc: 97.3%, Train_loss: 0.083, Test_acc: 95.8%, Test_loss: 0.148, Lr: 1.000000E-04
Epoch: 30, Train_acc: 96.6%, Train_loss: 0.112, Test_acc: 97.1%, Test_loss: 0.149, Lr: 1.000000E-04
Epoch: 31, Train_acc: 99.3%, Train_loss: 0.024, Test_acc: 98.8%, Test_loss: 0.028, Lr: 1.000000E-04
Epoch: 32, Train_acc: 99.3%, Train_loss: 0.020, Test_acc: 97.1%, Test_loss: 0.069, Lr: 1.000000E-04
Epoch: 33, Train_acc: 91.2%, Train_loss: 0.306, Test_acc: 97.1%, Test_loss: 0.111, Lr: 1.000000E-04
Epoch: 34, Train_acc: 98.2%, Train_loss: 0.060, Test_acc: 97.5%, Test_loss: 0.069, Lr: 1.000000E-04
Epoch: 35, Train_acc: 98.2%, Train_loss: 0.056, Test_acc: 98.8%, Test_loss: 0.025, Lr: 1.000000E-04
Epoch: 36, Train_acc: 99.2%, Train_loss: 0.026, Test_acc: 97.9%, Test_loss: 0.072, Lr: 1.000000E-04
Epoch: 37, Train_acc: 99.2%, Train_loss: 0.020, Test_acc: 98.8%, Test_loss: 0.042, Lr: 1.000000E-04
Epoch: 38, Train_acc: 99.6%, Train_loss: 0.014, Test_acc: 96.2%, Test_loss: 0.137, Lr: 1.000000E-04
Epoch: 39, Train_acc: 99.8%, Train_loss: 0.006, Test_acc: 98.8%, Test_loss: 0.030, Lr: 1.000000E-04
Epoch: 40, Train_acc: 99.8%, Train_loss: 0.012, Test_acc: 99.6%, Test_loss: 0.027, Lr: 1.000000E-04
Done
IV. Visualizing the Results
1. Loss and accuracy curves
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")              # suppress warnings
plt.rcParams['font.sans-serif'] = ['SimHei']   # render Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False     # render the minus sign correctly
plt.rcParams['figure.dpi'] = 100               # figure resolution
epochs_range = range(epochs)
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
2. Predict specified images
from PIL import Image

classes = list(total_data.class_to_idx)

def predict_one_image(image_path, model, transform, classes):
    test_img = Image.open(image_path).convert('RGB')
    plt.imshow(test_img)
    test_img = transform(test_img)
    img = test_img.to(device).unsqueeze(0)
    model.eval()
    output = model(img)
    _, pred = torch.max(output, 1)
    pred_class = classes[pred]
    image_name = str(image_path).split('/')[-1]
    print(f'The prediction for {image_name} is {pred_class}')
Predict a single image:
predict_one_image('./咖啡豆识别/Green/green (2).png', best_model, train_transforms, classes)
Output:
The prediction for green (2).png is Green
Predict multiple images:
data_paths = './咖啡豆识别/Medium/'
image = [f for f in os.listdir(data_paths)]
for i in range(10):
    image_path = os.path.join(data_paths, image[i])
    predict_one_image(image_path, best_model, train_transforms, classes)
Output:
The prediction for medium (1).png is Medium
The prediction for medium (10).png is Medium
The prediction for medium (100).png is Medium
The prediction for medium (101).png is Medium
The prediction for medium (102).png is Medium
The prediction for medium (103).png is Medium
The prediction for medium (104).png is Medium
The prediction for medium (105).png is Medium
The prediction for medium (106).png is Medium
The prediction for medium (107).png is Medium
3. Evaluate the model
best_model.eval()
epoch_test_acc, epoch_test_loss = test(test_dl, best_model, loss_fn)
epoch_test_acc, epoch_test_loss
Output:
(0.9958333333333333, 0.027590271769440733)
# check that this matches the best accuracy we recorded
epoch_test_acc
Output:
0.9958333333333333
Summary
If the optimizer is switched to SGD, the run looks like this:
Epoch: 1, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 2, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 3, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 4, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 5, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 6, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 7, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 8, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 9, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 10, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 11, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 12, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 13, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 14, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 15, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 16, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 17, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 18, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 19, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
Epoch: 20, Train_acc: 25.5%, Train_loss: 1.386, Test_acc: 22.9%, Test_loss: 1.386, Lr: 1.000000E-04
...
The training and test losses never move: the gradients vanish during backpropagation.
Why does VGG-16 run into vanishing gradients with the SGD optimizer?
VGG-16 uses the ReLU activation function, which maps every negative input to zero. Units stuck in the negative regime pass no gradient at all ("dead" ReLUs), so gradients struggle to flow from the deep layers back to the shallow ones, and with so many stacked ReLU layers in VGG-16, the gradients reaching the early layers can effectively vanish.
To reduce the chance of vanishing gradients, try the following:
- Use a larger learning rate: a bigger step can speed up convergence and lessen the impact of tiny gradients.
- Change the optimization strategy: use an adaptive learning-rate method such as Adam or RMSprop.
- Use gradient clipping: rescaling the gradient keeps the update magnitudes within a sensible range.
- Change the activation function: alternatives such as LeakyReLU or ELU keep a nonzero gradient for negative inputs.
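To make the gradient-clipping option concrete, here is a minimal sketch of where the clipping call sits in a training step (the tiny Linear model and max_norm=1.0 are stand-ins of mine, not values from the original post):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 4)                  # stand-in for the VGG-16 model
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

X = torch.randn(8, 10)
y = torch.randint(0, 4, (8,))

optimizer.zero_grad()
loss = loss_fn(model(X), y)
loss.backward()
# Rescale the overall gradient norm to at most 1.0 before the update.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```

clip_grad_norm_ operates in place on the .grad tensors, so it must run after loss.backward() and before optimizer.step().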
Why does the Adam optimizer help against vanishing gradients?
Adam is an adaptive learning-rate method: it effectively tunes the step size per parameter to drive the loss down, adjusting each update according to that parameter's gradient history, which counteracts both vanishing and exploding gradients.
Concretely, Adam keeps exponential moving averages of the gradient and the squared gradient and divides each update by the square root of the latter, which automatically rescales the gradients. This keeps the effective step size reasonable even when the raw gradients are tiny, which is why it avoids the stalled training seen above and, in most cases, converges more readily than plain SGD.
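The moment estimates can be made concrete with a few lines of plain Python. This is an illustrative single-parameter sketch of the update rule from the Adam paper, not the torch.optim.Adam internals:

```python
def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (step number t >= 1)."""
    m = beta1 * m + (1 - beta1) * grad        # first moment: EMA of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment: EMA of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    param -= lr * m_hat / (v_hat ** 0.5 + eps)
    return param, m, v

# Whether the gradient is tiny or huge, the first step has magnitude close
# to lr, which is why training does not stall the way it did with raw SGD.
p_small, _, _ = adam_step(0.0, 1e-6, 0.0, 0.0, 1)
p_large, _, _ = adam_step(0.0, 1e+3, 0.0, 0.0, 1)
```

Dividing m_hat by the square root of v_hat normalizes the gradient's scale away, so only its sign and recent direction determine the update.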