Deep Learning in Practice (1):
Preface
Using PyTorch to perform a classification task on the CIFAR-10 dataset.
I. The CIFAR-10 Dataset
The dataset contains 60,000 32×32 color images divided into 10 classes, with 6,000 images per class. 50,000 images are used for training, organized into 5 training batches of 10,000 images each; the remaining 10,000 form a single test batch. The test batch contains exactly 1,000 randomly selected images from each of the 10 classes, and the leftover images are shuffled into the training batches. Note that the classes are not necessarily balanced within a single training batch, but across all five training batches each class has exactly 5,000 images.
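For reference, the 10 classes (in their standard label order 0-9) and the split described above can be summarized in code:

```python
# the 10 CIFAR-10 classes, in label order 0-9
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']

train_images = 5 * 10000   # 5 training batches of 10,000 images each
test_images = 10 * 1000    # 1,000 test images from each of the 10 classes
print(len(classes), train_images, test_images)
```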
II. Usage Steps
1. Import the required libraries
Example code:
import torch
import torchvision
import numpy as np
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
import torch.nn.functional as F
from tqdm import tqdm
from torch.utils.data import DataLoader
2. Download the dataset
Example code:
# training set
train_dataset = torchvision.datasets.CIFAR10(root='./dataset', train=True,
                                             transform=torchvision.transforms.ToTensor(), download=True)
# test set
test_dataset = torchvision.datasets.CIFAR10(root='./dataset', train=False,
                                            transform=torchvision.transforms.ToTensor(), download=True)
# wrap the datasets in DataLoaders; shuffle the training data each epoch
train_dataloader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=64)
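Each DataLoader yields (images, labels) batches. A minimal sketch using random tensors with CIFAR-10's shapes (so it runs without downloading the data) shows the batch shapes you can expect:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# stand-in for CIFAR-10: 256 random 3x32x32 images with labels in 0..9
fake_images = torch.rand(256, 3, 32, 32)
fake_labels = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=64)

images, labels = next(iter(loader))
print(images.shape)  # torch.Size([64, 3, 32, 32])
print(labels.shape)  # torch.Size([64])
```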
3. Define the network model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, 5)       # 3x32x32 -> 64x28x28
        self.pool = nn.MaxPool2d(3, 3)         # 64x28x28 -> 64x9x9
        self.fc1 = nn.Linear(64 * 9 * 9, 100)
        self.fc2 = nn.Linear(100, 10)          # 10 output classes

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = x.view(-1, 64 * 9 * 9)             # flatten to (N, 5184)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
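Before training, it helps to sanity-check the layer sizes: conv1 maps 3×32×32 to 64×28×28 (kernel 5, no padding, since 32 − 5 + 1 = 28), and the 3×3 max pool with stride 3 reduces that to 64×9×9 (28 // 3 = 9), which is where 64 * 9 * 9 comes from. A quick forward pass on a random batch confirms the shapes (the model is repeated here so the sketch runs on its own):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, 5)       # (32 - 5) + 1 = 28
        self.pool = nn.MaxPool2d(3, 3)         # 28 // 3 = 9
        self.fc1 = nn.Linear(64 * 9 * 9, 100)
        self.fc2 = nn.Linear(100, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = x.view(-1, 64 * 9 * 9)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

net = Net()
x = torch.rand(4, 3, 32, 32)                   # a batch of 4 fake images
after_pool = net.pool(F.relu(net.conv1(x)))
print(after_pool.shape)                        # torch.Size([4, 64, 9, 9])
print(net(x).shape)                            # torch.Size([4, 10])
```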
4. Training
# select the device, then initialize the network
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = Net().to(device)
epochs = 100
# choose the loss function and optimizer
loss_fn = nn.CrossEntropyLoss().to(device)
optimizer = optim.SGD(net.parameters(), lr=1e-3, momentum=0.9)
train_losses, test_losses, train_acc, test_acc = [], [], [], []
Start training:
for epoch in range(epochs):
    net.train()
    train_loss = 0.0
    train_correct = 0
    for i, (inputs, labels) in enumerate(tqdm(train_dataloader, desc="Epoch {}".format(epoch), unit="batch")):
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = net(inputs)
        predicted = torch.argmax(outputs, dim=1)
        train_correct += (predicted == labels).sum().item()
        loss = loss_fn(outputs, labels)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
    print('Epoch %d, Loss: %.3f' % (epoch + 1, train_loss / (i + 1)))
    train_losses.append(train_loss / len(train_dataloader))  # average batch loss
    train_acc.append(train_correct / len(train_dataset))

    # evaluate the model
    net.eval()
    test_loss = 0.0
    test_correct = 0
    with torch.no_grad():
        for images, labels in test_dataloader:
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            predicted = torch.argmax(outputs, dim=1)
            test_correct += (predicted == labels).sum().item()
            loss = loss_fn(outputs, labels)
            test_loss += loss.item()
    test_losses.append(test_loss / len(test_dataloader))  # average batch loss
    test_acc.append(test_correct / len(test_dataset))
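The loop above never saves the trained weights. A common addition is to checkpoint the model whenever test accuracy improves; a minimal sketch of the pattern, using a tiny stand-in module and a hypothetical file name best_model.pth:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)            # stand-in for the Net trained above
best_acc = 0.0
epoch_acc = 0.75                   # would come from the evaluation loop

# inside the epoch loop: keep only the best-performing weights
if epoch_acc > best_acc:
    best_acc = epoch_acc
    torch.save(model.state_dict(), 'best_model.pth')

# later: restore the best weights into a fresh model
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load('best_model.pth'))
print(torch.equal(restored.weight, model.weight))  # True
```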
Plot how the loss and accuracy on the training and test sets evolve over the course of training:
# uses the four lists train_losses, test_losses, train_acc, test_acc
# create the figure and subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))
# first subplot: loss curves
ax1.plot(train_losses, label='Training Loss', color='blue')
ax1.plot(test_losses, label='Testing Loss', color='red')
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Loss')
ax1.legend()
ax1.set_title('Training and Testing Loss')
# second subplot: accuracy curves
ax2.plot(train_acc, label='Training Accuracy', color='blue')
ax2.plot(test_acc, label='Testing Accuracy', color='red')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Accuracy')
ax2.legend()
ax2.set_title('Training and Testing Accuracy')
plt.tight_layout()  # adjust the subplot layout to avoid overlap
plt.show()
5. Results
III. Summary
This post aimed to introduce the CIFAR-10 dataset and the basic training pipeline of deep learning, so only a simple neural network was used. If readers are interested, follow-up posts will improve on it and introduce new material.