[Deep Learning in Practice] P1: MNIST Handwritten Digit Recognition

Banana0840

Article link: https://blog.csdn.net/Banana0840/article/details/137217114

🍨 This post is a study log from the 🔗365-Day Deep Learning Training Camp
🍖 Original author: K同学啊

CNN

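CNN is short for Convolutional Neural Network. A CNN handles image and sequence data well because it learns features directly from the input and extracts the most useful information. CNNs are modeled on the biological mechanism of visual perception and can be used for both supervised and unsupervised learning. Because the convolution kernels in the hidden layers share parameters and the connections between layers are sparse, a CNN can learn from grid-like topology features, such as pixels and audio, with relatively little computation. It also gives stable results and needs no additional feature engineering on the data.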
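To make the effect of parameter sharing concrete, the sketch below compares the parameter count of a 3×3 convolution with that of a fully connected layer producing an output of the same size. The sizes (1 input channel, 32 output channels, a 28×28 input) match the first convolution in the code of this post. The helper names `conv2d_params` and `dense_params` are hypothetical and exist only for this illustration.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Parameters of a square-kernel 2D convolution: weights plus one bias per output channel."""
    return out_channels * (in_channels * kernel_size * kernel_size + 1)


def dense_params(in_features, out_features):
    """Parameters of a fully connected layer: one weight per input/output pair plus biases."""
    return in_features * out_features + out_features


# 3x3 convolution, 1 -> 32 channels; the same kernels are reused at every position.
conv = conv2d_params(in_channels=1, out_channels=32, kernel_size=3)

# A dense layer mapping the flattened 28x28 image to the same 32 x 26 x 26 output.
out_side = 28 - 3 + 1  # no padding, stride 1
dense = dense_params(in_features=1 * 28 * 28, out_features=32 * out_side * out_side)

print(conv)   # 320
print(dense)  # 16981120
```

The convolution needs 320 parameters where the dense layer would need roughly 17 million, which is the saving that parameter sharing and sparse connectivity provide.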

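Put simply, a CNN imitates the way people recognize objects. To tell a cat from a dog, for example, we look at features such as body size, ears, and tail, and then decide from the combination of those features whether the animal is a cat or a dog.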

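In a CNN, convolutional layers extract features, activation functions apply a nonlinearity to the extracted features, and pooling layers downsample the result, which keeps the dominant features and reduces computation.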
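The three operations can be traced by hand on a tiny input. The following is a minimal pure-Python sketch, not the PyTorch implementation: the 6×6 image, the vertical-edge kernel, and the function names `conv2d_valid`, `relu`, and `max_pool2x2` are all assumptions chosen for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


# A 6x6 image with a bright vertical stripe in the middle two columns.
image = [[0, 0, 1, 1, 0, 0] for _ in range(6)]
# A kernel that responds to dark-to-bright transitions from left to right.
kernel = [[-1, 0, 1] for _ in range(3)]

features = conv2d_valid(image, kernel)  # 4x4, each row is [3, 3, -3, -3]
activated = relu(features)              # 4x4, each row is [3, 3, 0, 0]
pooled = max_pool2x2(activated)         # 2x2

print(pooled)  # [[3, 0], [3, 0]]
```

The convolution responds positively on the left edge of the stripe and negatively on the right edge, ReLU discards the negative response, and pooling halves each spatial dimension while keeping the strongest activation.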

CNN structure

[Image: CNN structure]

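Workflow

Load the dataset
Split the dataset into batches
Visualize the data
Build the model
Training function
Test function
Train the model
Output the results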

Code

import torch
import torch.nn as nn
import matplotlib.pyplot as plt
import torchvision
import numpy as np
import torch.nn.functional as F
from torchinfo import summary
import warnings

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


# 1 Load the dataset
train_ds = torchvision.datasets.MNIST('data', train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_ds = torchvision.datasets.MNIST('data', train=False, transform=torchvision.transforms.ToTensor(), download=True)

batch_size = 32

# 2 Split the dataset into batches
train_dl = torch.utils.data.DataLoader(train_ds, batch_size=batch_size, shuffle=True)
test_dl = torch.utils.data.DataLoader(test_ds, batch_size=batch_size)

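imgs, labels = next(iter(train_dl))

# 3 Visualize the data
plt.figure(figsize=(20, 5))
# Use a separate loop variable so the batch tensor `imgs` is not overwritten
for i, img in enumerate(imgs[:20]):
    npimg = np.squeeze(img.numpy())
    plt.subplot(2, 10, i + 1)
    plt.imshow(npimg, cmap=plt.cm.binary)
    plt.axis('off')
# plt.show()
print(img.shape)  # shape of a single image: torch.Size([1, 28, 28])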

num_class = 10

# 4 Build the model
class Model(nn.Module):

    # Define the layers
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3)
        self.pool2 = nn.MaxPool2d(2)
        # 1600 = 64 channels * 5 * 5, the flattened output of pool2
        self.fc1 = nn.Linear(in_features=1600, out_features=64)
        self.fc2 = nn.Linear(in_features=64, out_features=num_class)

    # Forward pass
    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool2(F.relu(self.conv2(x)))
        x = torch.flatten(x, start_dim=1)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)  # raw logits; CrossEntropyLoss applies softmax internally
        return x


model = Model().to(device)
summary(model, input_size=(32, 1, 28, 28))


# 5 Training function
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)  # number of training samples
    num_batches = len(dataloader)   # number of batches per epoch

    train_loss, train_acc = 0, 0

    for X, y in dataloader:
        X, y = X.to(device), y.to(device)

        # Compute the prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Accumulate accuracy and loss
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()

    train_acc /= size
    train_loss /= num_batches

    return train_acc, train_loss

# 6 Test function
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)  # number of test samples
    num_batches = len(dataloader)   # number of batches
    test_loss, test_acc = 0, 0

    # No training happens here, so disable gradient tracking to save memory and compute
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)

            # Compute the loss
            target_pred = model(imgs)
            loss = loss_fn(target_pred, target)

            test_loss += loss.item()
            test_acc += (target_pred.argmax(1) == target).type(torch.float).sum().item()

    test_acc /= size
    test_loss /= num_batches

    return test_acc, test_loss


# 7 Train the model
loss_fn = nn.CrossEntropyLoss()  # loss function
learn_rate = 1e-2  # learning rate
opt = torch.optim.SGD(model.parameters(), lr=learn_rate)

epochs = 5
train_loss, train_acc = [], []
test_loss, test_acc = [], []

for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, opt)

    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)

    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}')
    print(template.format(epoch + 1, epoch_train_acc * 100, epoch_train_loss, epoch_test_acc * 100, epoch_test_loss))

print('Done')

# 8 Output the results
warnings.filterwarnings("ignore")  # suppress warnings
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
plt.rcParams['figure.dpi'] = 100  # figure resolution

epochs_range = range(epochs)

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)

plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Test Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='upper right')
plt.title('Training and Test Loss')
plt.show()

Model

[Image: model]

Output

(d2l) PS C:\Users\dlt\Desktop\py-learn> & C:/miniconda3/envs/d2l/python.exe c:/Users/dlt/Desktop/py-learn/0_365_train_camp/1_pytorch/p1.py
torch.Size([1, 28, 28])
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Model                                    [32, 10]                  --
├─Conv2d: 1-1                            [32, 32, 26, 26]          320
├─MaxPool2d: 1-2                         [32, 32, 13, 13]          --
├─Conv2d: 1-3                            [32, 64, 11, 11]          18,496
├─MaxPool2d: 1-4                         [32, 64, 5, 5]            --
├─Linear: 1-5                            [32, 64]                  102,464
├─Linear: 1-6                            [32, 10]                  650
==========================================================================================
Total params: 121,930
Trainable params: 121,930
Non-trainable params: 0
Total mult-adds (M): 81.84
==========================================================================================
Input size (MB): 0.10
Forward/backward pass size (MB): 7.54
Params size (MB): 0.49
Estimated Total Size (MB): 8.13
==========================================================================================
Epoch: 1, Train_acc:75.2%, Train_loss:0.799, Test_acc:93.5%, Test_loss:0.218
Epoch: 2, Train_acc:94.5%, Train_loss:0.183, Test_acc:96.3%, Test_loss:0.126
Epoch: 3, Train_acc:96.4%, Train_loss:0.118, Test_acc:97.5%, Test_loss:0.083
Epoch: 4, Train_acc:97.2%, Train_loss:0.092, Test_acc:97.6%, Test_loss:0.073
Epoch: 5, Train_acc:97.7%, Train_loss:0.077, Test_acc:98.1%, Test_loss:0.061
Done
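The output shapes and parameter counts in the summary above can be checked by hand, which also explains why `fc1` takes `in_features=1600`. The sketch below applies the standard output-size formula for convolution and pooling without padding; the helper names `conv_out` and `pool_out` are hypothetical and used only for this check.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output side length of max pooling whose stride equals the kernel size."""
    return (size - kernel) // kernel + 1


side = 28                  # MNIST images are 28x28
side = conv_out(side, 3)   # conv1 -> 26
side = pool_out(side, 2)   # pool1 -> 13
side = conv_out(side, 3)   # conv2 -> 11
side = pool_out(side, 2)   # pool2 -> 5
flat = 64 * side * side    # 64 channels * 5 * 5 = 1600 inputs to fc1

# Parameters per layer: weights plus biases.
params = {
    "conv1": 32 * (1 * 3 * 3 + 1),
    "conv2": 64 * (32 * 3 * 3 + 1),
    "fc1": flat * 64 + 64,
    "fc2": 64 * 10 + 10,
}
total = sum(params.values())

print(flat)    # 1600
print(params)  # {'conv1': 320, 'conv2': 18496, 'fc1': 102464, 'fc2': 650}
print(total)   # 121930
```

These values agree with the `Param #` column and the `Total params: 121,930` line reported by `summary`.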

Model evaluation

[Image: training and test accuracy and loss curves]

Reference

https://blog.csdn.net/qq1515312832/article/details/136242604?spm=1001.2014.3001.5501
