- 🍨 This post is a study-log entry for the 🔗365天深度学习训练营 (365-Day Deep Learning Camp)
- 🍦 Reference: 365-Day Deep Learning Camp, Week 6: Hollywood Star Recognition (available to camp members)
- 🍖 Original author: K同学啊 | tutoring and custom projects available
- 🚀 Source: K同学's study circle
Environment:
Python version: 3.8.17 (default, Jul 5 2023, 20:44:21) [MSC v.1916 64 bit (AMD64)]
Pytorch version: 2.0.1+cu117
Torchvision version: 0.15.2+cu117
CUDA is available: True
Using device: cuda
The dataset for this exercise was provided by K同学; please contact K同学 if you need it.
I. Preliminaries
1. Set up the GPU
Import the libraries we will use, then define a CUDA device and print it.
import torch
import torchvision
from PIL import Image
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import pathlib
if __name__ == '__main__':
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(device)
The output:
cuda
2. Load and preprocess the data
Download the dataset provided by K同学 and set its path so it can be referenced below.
● Step 1: use pathlib.Path() to convert the folder path string into a pathlib.Path object.
● Step 2: use glob() to collect every entry under data_dir into the list data_paths.
● Step 3: apply split() to each path in data_paths to extract the class name of each folder, stored in classNames.
● Step 4: print the classNames list to show the class of each folder.
data_dir = './data-6/'
data_dir = pathlib.Path(data_dir)
data_paths = list(data_dir.glob('*'))
classNames = [str(path).split("\\")[1] for path in data_paths]  # "\\" assumes Windows-style paths; path.name is the cross-platform equivalent
print(classNames)
Output:
['Angelina Jolie', 'Brad Pitt', 'Denzel Washington', 'Hugh Jackman', 'Jennifer Lawrence', 'Johnny Depp', 'Kate Winslet', 'Leonardo DiCaprio', 'Megan Fox', 'Natalie Portman', 'Nicole Kidman', 'Robert Downey Jr', 'Sandra Bullock', 'Scarlett Johansson', 'Tom Cruise', 'Tom Hanks', 'Will Smith']
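The `split("\\")` above only works with Windows path separators. A small self-contained sketch of the cross-platform alternative, using a throwaway directory tree that mimics the dataset layout (the two folder names are just sample classes from the list above):

```python
import pathlib
import tempfile

# Build a temporary directory tree shaped like the dataset: one folder per class
root = pathlib.Path(tempfile.mkdtemp())
for name in ["Angelina Jolie", "Brad Pitt"]:
    (root / name).mkdir()

# path.name returns the last path component on every OS,
# so no hard-coded "\\" separator is needed
classNames = [path.name for path in sorted(root.glob('*'))]
print(classNames)  # → ['Angelina Jolie', 'Brad Pitt']
```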
Preprocessing: resize every image to a uniform size, convert it to a tensor (scaled to [0, 1]), and standardize it with fixed channel means and standard deviations. Then load the image dataset with datasets.ImageFolder, which assigns class indices in sorted order, and wrap the data in DataLoaders with a chosen batch_size and shuffle=True to randomize the sample order.
train_transforms = transforms.Compose([
    transforms.Resize([224, 224]),   # resize every input image to 224x224
    transforms.ToTensor(),           # convert to a tensor scaled to [0, 1]
    transforms.Normalize(            # standardize with the ImageNet statistics
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])
])
test_transform = transforms.Compose([
    transforms.Resize([224, 224]),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])
])
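ToTensor() scales pixel values to [0, 1], and Normalize() then applies (x - mean) / std per channel using the ImageNet statistics above. A quick sketch of what happens to one fully saturated red-channel pixel:

```python
mean, std = 0.485, 0.229   # red-channel ImageNet statistics from the transform above

pixel = 255 / 255          # ToTensor(): a fully saturated 8-bit pixel becomes 1.0
normalized = (pixel - mean) / std
print(round(normalized, 4))  # → 2.2489
```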
total_data = datasets.ImageFolder(data_dir, transform=train_transforms)
print(total_data)
print(total_data.class_to_idx)

# Split the data 80% / 20% into training and test sets
train_size = int(0.8 * len(total_data))
test_size = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])

# Create the data loaders (dataloader)
batch_size = 32
train_dl = DataLoader(train_dataset,
                      batch_size=batch_size,
                      shuffle=True,
                      num_workers=0)
test_dl = DataLoader(test_dataset,
                     batch_size=batch_size,
                     shuffle=False,   # no need to shuffle the test set
                     num_workers=0)
The output:
Dataset ImageFolder
    Number of datapoints: 1800
    Root location: ./data-6/
    StandardTransform
Transform: Compose(
    Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=warn)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
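With the 1800 images reported above, the 80/20 split computed by `train_size = int(0.8 * len(total_data))` works out as follows:

```python
total = 1800                    # datapoints reported by ImageFolder above
train_size = int(0.8 * total)   # floor of 80%
test_size = total - train_size  # the remainder
print(train_size, test_size)    # → 1440 360
```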
II. Building the Model
1. Build the CNN
import torch.nn as nn
import torch.nn.functional as F

class Network_bn(nn.Module):
    def __init__(self):
        super(Network_bn, self).__init__()
        # nn.Conv2d() arguments:
        #   in_channels  - number of input channels
        #   out_channels - number of output channels
        #   kernel_size  - convolution kernel size
        #   stride       - stride (default 1)
        #   padding      - zero-padding (default 0)
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=12, kernel_size=5, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(12)
        self.conv2 = nn.Conv2d(in_channels=12, out_channels=12, kernel_size=5, stride=1, padding=0)
        self.bn2 = nn.BatchNorm2d(12)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv4 = nn.Conv2d(in_channels=12, out_channels=24, kernel_size=5, stride=1, padding=0)
        self.bn4 = nn.BatchNorm2d(24)
        self.conv5 = nn.Conv2d(in_channels=24, out_channels=24, kernel_size=5, stride=1, padding=0)
        self.bn5 = nn.BatchNorm2d(24)
        self.fc1 = nn.Linear(24 * 50 * 50, len(classNames))

    def forward(self, x):
        x = F.relu(self.bn1(self.conv1(x)))
        x = F.relu(self.bn2(self.conv2(x)))
        x = self.pool(x)
        x = F.relu(self.bn4(self.conv4(x)))
        x = F.relu(self.bn5(self.conv5(x)))
        x = self.pool(x)
        x = x.view(-1, 24 * 50 * 50)
        x = self.fc1(x)
        return x
Print the device and the network structure:
print("Using {} device".format(device))
model = Network_bn().to(device)
print(model)
Output:
Using cuda device
Network_bn(
(conv1): Conv2d(3, 12, kernel_size=(5, 5), stride=(1, 1))
(bn1): BatchNorm2d(12, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(12, 12, kernel_size=(5, 5), stride=(1, 1))
(bn2): BatchNorm2d(12, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv4): Conv2d(12, 24, kernel_size=(5, 5), stride=(1, 1))
(bn4): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv5): Conv2d(24, 24, kernel_size=(5, 5), stride=(1, 1))
(bn5): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(fc1): Linear(in_features=60000, out_features=17, bias=True)
)
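The `in_features=60000` of fc1 follows from the spatial sizes: each 5×5 convolution without padding shrinks a side by 4, and each 2×2 max-pool halves it. A small sanity check of that arithmetic (`conv_out` is a helper defined here for illustration, not part of the model):

```python
def conv_out(size, kernel=5, stride=1, padding=0):
    # standard Conv2d output-size formula
    return (size + 2 * padding - kernel) // stride + 1

side = 224
side = conv_out(side)   # conv1: 220
side = conv_out(side)   # conv2: 216
side //= 2              # pool:  108
side = conv_out(side)   # conv4: 104
side = conv_out(side)   # conv5: 100
side //= 2              # pool:   50

print(24 * side * side)  # → 60000, matching fc1's in_features
```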
2. Dynamic learning rate
Decaying the learning rate helps the network converge quickly in the right direction: large enough steps early on to make fast progress, yet small enough later that the weights neither oscillate around the minimum nor crawl toward it.
def adjust_learning_rate(optimizer, epoch, start_lr):
    # decay the learning rate by a factor of 0.92 every 2 epochs
    lr = start_lr * (0.92 ** (epoch // 2))
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

learn_rate = 1e-4  # initial learning rate
optimizer = torch.optim.SGD(model.parameters(), lr=learn_rate)
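The schedule multiplies the initial rate by 0.92 every two epochs, which matches the Lr column in the training log. A standalone sketch of the first six values (epoch is 0-based, as in the training loop):

```python
start_lr = 1e-4

# learning rate for the first six epochs
for epoch in range(6):
    lr = start_lr * (0.92 ** (epoch // 2))
    print('Epoch:{:2d}, Lr:{:.2E}'.format(epoch + 1, lr))
# Epochs 1-2 use 1.00E-04, epochs 3-4 use 9.20E-05, epochs 5-6 use 8.46E-05
```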
3. Training and test functions
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)   # size of the training set
    num_batches = len(dataloader)    # number of batches (ceil(size / batch_size))
    train_loss, train_acc = 0, 0     # accumulated loss and correct-prediction count
    for X, y in dataloader:          # fetch images and their labels
        X, y = X.to(device), y.to(device)

        # compute the prediction error
        pred = model(X)              # network output
        loss = loss_fn(pred, y)      # loss between the output and the ground-truth labels

        # backpropagation
        optimizer.zero_grad()        # reset gradients
        loss.backward()              # backpropagate
        optimizer.step()             # update the weights

        # accumulate acc and loss
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()
    train_acc /= size
    train_loss /= num_batches
    return train_acc, train_loss

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)   # size of the test set
    num_batches = len(dataloader)    # number of batches (ceil(size / batch_size))
    test_loss, test_acc = 0, 0
    # disable gradient tracking during evaluation to save memory
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)

            # compute the loss
            target_pred = model(imgs)
            loss = loss_fn(target_pred, target)

            test_loss += loss.item()
            test_acc += (target_pred.argmax(1) == target).type(torch.float).sum().item()
    test_acc /= size
    test_loss /= num_batches
    return test_acc, test_loss
4. Training
loss_fn = nn.CrossEntropyLoss()  # define the loss function
epochs = 50
train_loss = []
train_acc = []
test_loss = []
test_acc = []

for epoch in range(epochs):
    # update the learning rate (needed when using the custom schedule)
    adjust_learning_rate(optimizer, epoch, learn_rate)

    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)

    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)

    # read back the current learning rate
    lr = optimizer.state_dict()['param_groups'][0]['lr']
    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}, Lr:{:.2E}')
    print(template.format(epoch + 1, epoch_train_acc * 100, epoch_train_loss,
                          epoch_test_acc * 100, epoch_test_loss, lr))
print('Done')
Running for 50 epochs produces:
Epoch: 1, Train_acc:9.3%, Train_loss:2.889, Test_acc:15.2%, Test_loss:2.719, Lr:1.00E-04
Epoch: 2, Train_acc:15.9%, Train_loss:2.682, Test_acc:22.1%, Test_loss:2.547, Lr:1.00E-04
Epoch: 3, Train_acc:22.6%, Train_loss:2.511, Test_acc:27.7%, Test_loss:2.386, Lr:9.20E-05
Epoch: 4, Train_acc:27.9%, Train_loss:2.382, Test_acc:33.5%, Test_loss:2.255, Lr:9.20E-05
Epoch: 5, Train_acc:33.4%, Train_loss:2.247, Test_acc:37.6%, Test_loss:2.147, Lr:8.46E-05
Epoch: 6, Train_acc:37.2%, Train_loss:2.148, Test_acc:43.9%, Test_loss:2.054, Lr:8.46E-05
Epoch: 7, Train_acc:42.5%, Train_loss:2.057, Test_acc:48.7%, Test_loss:1.959, Lr:7.79E-05
Epoch: 8, Train_acc:46.8%, Train_loss:1.958, Test_acc:52.5%, Test_loss:1.862, Lr:7.79E-05
Epoch: 9, Train_acc:51.6%, Train_loss:1.881, Test_acc:53.7%, Test_loss:1.799, Lr:7.16E-05
Epoch:10, Train_acc:53.7%, Train_loss:1.817, Test_acc:57.1%, Test_loss:1.745, Lr:7.16E-05
Epoch:11, Train_acc:57.9%, Train_loss:1.744, Test_acc:60.2%, Test_loss:1.682, Lr:6.59E-05
Epoch:12, Train_acc:59.4%, Train_loss:1.695, Test_acc:64.0%, Test_loss:1.635, Lr:6.59E-05
Epoch:13, Train_acc:63.1%, Train_loss:1.630, Test_acc:65.3%, Test_loss:1.585, Lr:6.06E-05
Epoch:14, Train_acc:64.3%, Train_loss:1.593, Test_acc:68.5%, Test_loss:1.512, Lr:6.06E-05
Epoch:15, Train_acc:66.1%, Train_loss:1.538, Test_acc:71.1%, Test_loss:1.474, Lr:5.58E-05
Epoch:16, Train_acc:68.3%, Train_loss:1.498, Test_acc:72.9%, Test_loss:1.435, Lr:5.58E-05
Epoch:17, Train_acc:70.9%, Train_loss:1.464, Test_acc:74.6%, Test_loss:1.401, Lr:5.13E-05
Epoch:18, Train_acc:72.0%, Train_loss:1.421, Test_acc:75.2%, Test_loss:1.363, Lr:5.13E-05
Epoch:19, Train_acc:74.6%, Train_loss:1.383, Test_acc:76.2%, Test_loss:1.324, Lr:4.72E-05
Epoch:20, Train_acc:74.6%, Train_loss:1.345, Test_acc:77.0%, Test_loss:1.317, Lr:4.72E-05
Epoch:21, Train_acc:75.3%, Train_loss:1.327, Test_acc:78.2%, Test_loss:1.282, Lr:4.34E-05
Epoch:22, Train_acc:77.7%, Train_loss:1.299, Test_acc:79.5%, Test_loss:1.258, Lr:4.34E-05
Epoch:23, Train_acc:78.8%, Train_loss:1.272, Test_acc:80.4%, Test_loss:1.233, Lr:4.00E-05
Epoch:24, Train_acc:79.7%, Train_loss:1.240, Test_acc:80.6%, Test_loss:1.227, Lr:4.00E-05
Epoch:25, Train_acc:79.8%, Train_loss:1.224, Test_acc:81.9%, Test_loss:1.180, Lr:3.68E-05
Epoch:26, Train_acc:80.1%, Train_loss:1.201, Test_acc:81.8%, Test_loss:1.162, Lr:3.68E-05
Epoch:27, Train_acc:81.3%, Train_loss:1.175, Test_acc:83.4%, Test_loss:1.143, Lr:3.38E-05
Epoch:28, Train_acc:82.3%, Train_loss:1.168, Test_acc:84.4%, Test_loss:1.112, Lr:3.38E-05
Epoch:29, Train_acc:83.5%, Train_loss:1.149, Test_acc:83.9%, Test_loss:1.110, Lr:3.11E-05
Epoch:30, Train_acc:83.3%, Train_loss:1.130, Test_acc:84.9%, Test_loss:1.102, Lr:3.11E-05
Epoch:31, Train_acc:84.2%, Train_loss:1.113, Test_acc:84.9%, Test_loss:1.080, Lr:2.86E-05
Epoch:32, Train_acc:84.2%, Train_loss:1.096, Test_acc:85.7%, Test_loss:1.047, Lr:2.86E-05
Epoch:33, Train_acc:85.1%, Train_loss:1.088, Test_acc:85.9%, Test_loss:1.052, Lr:2.63E-05
Epoch:34, Train_acc:85.2%, Train_loss:1.073, Test_acc:85.7%, Test_loss:1.044, Lr:2.63E-05
Epoch:35, Train_acc:85.7%, Train_loss:1.069, Test_acc:86.6%, Test_loss:1.047, Lr:2.42E-05
Epoch:36, Train_acc:87.2%, Train_loss:1.046, Test_acc:86.8%, Test_loss:1.021, Lr:2.42E-05
Epoch:37, Train_acc:86.2%, Train_loss:1.046, Test_acc:87.4%, Test_loss:1.030, Lr:2.23E-05
Epoch:38, Train_acc:87.4%, Train_loss:1.029, Test_acc:87.8%, Test_loss:0.992, Lr:2.23E-05
Epoch:39, Train_acc:86.6%, Train_loss:1.024, Test_acc:88.4%, Test_loss:0.997, Lr:2.05E-05
Epoch:40, Train_acc:87.8%, Train_loss:1.008, Test_acc:88.1%, Test_loss:0.982, Lr:2.05E-05
Epoch:41, Train_acc:88.7%, Train_loss:1.003, Test_acc:88.3%, Test_loss:0.981, Lr:1.89E-05
Epoch:42, Train_acc:87.8%, Train_loss:0.998, Test_acc:88.4%, Test_loss:0.976, Lr:1.89E-05
Epoch:43, Train_acc:88.5%, Train_loss:0.989, Test_acc:89.0%, Test_loss:0.963, Lr:1.74E-05
Epoch:44, Train_acc:89.3%, Train_loss:0.976, Test_acc:89.1%, Test_loss:0.959, Lr:1.74E-05
Epoch:45, Train_acc:88.6%, Train_loss:0.973, Test_acc:89.0%, Test_loss:0.963, Lr:1.60E-05
Epoch:46, Train_acc:88.4%, Train_loss:0.975, Test_acc:89.7%, Test_loss:0.938, Lr:1.60E-05
Epoch:47, Train_acc:88.4%, Train_loss:0.963, Test_acc:89.8%, Test_loss:0.925, Lr:1.47E-05
Epoch:48, Train_acc:89.3%, Train_loss:0.952, Test_acc:89.6%, Test_loss:0.931, Lr:1.47E-05
Epoch:49, Train_acc:89.1%, Train_loss:0.947, Test_acc:89.8%, Test_loss:0.918, Lr:1.35E-05
Epoch:50, Train_acc:89.6%, Train_loss:0.947, Test_acc:90.4%, Test_loss:0.916, Lr:1.35E-05
Done
5. Visualization
import matplotlib.pyplot as plt
import warnings

warnings.filterwarnings("ignore")             # suppress warnings
plt.rcParams['font.sans-serif'] = ['SimHei']  # render Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False    # render minus signs correctly
plt.rcParams['figure.dpi'] = 100              # figure resolution

epochs_range = range(epochs)
plt.figure(figsize=(12, 3))

plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
The resulting accuracy and loss curves are shown below.
Postscript:
I'm really sorry: this week was busy and I only started the assignment on Friday afternoon. When coding it up, I couldn't work out how to use categorical_crossentropy (the multi-class log loss) as the topic required, so I hurriedly reused last week's model. Over the weekend I will study and master the multi-class log loss, try to build the VGG-16 architecture myself, and optimize it.
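For reference, the "multi-class log loss" mentioned here (categorical_crossentropy in Keras) is the same quantity computed by the nn.CrossEntropyLoss already used above: a log-softmax followed by the negative log-likelihood of the true class. A minimal pure-Python sketch of the math for a single sample:

```python
import math

def cross_entropy(logits, target):
    """Multi-class log loss for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Uniform logits over 2 classes give a loss of ln(2) ≈ 0.693
print(round(cross_entropy([0.0, 0.0], 0), 3))  # → 0.693
```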