[PyTorch] Handwritten Digit Recognition

GeorgeKingsman

Article link: https://blog.csdn.net/KINGSMAN226927/article/details/136104419

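This post introduces the basic concepts of neural networks and focuses on using the PyTorch framework for them, covering dynamic computation graphs, training with backpropagation, and a hands-on handwritten digit recognition example.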

Neural Networks

Today I studied one of the most basic neural networks. First, let's get to know what a neural network is:

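A neural network is a mathematical model that imitates the structure and function of biological neural networks and is used in machine learning and artificial intelligence. It consists of a large number of artificial neurons that pass information to one another through connections, forming a network. Each neuron receives inputs, computes a weighted sum of them, and passes the result on to the neurons in the next layer or to the output. A neural network can learn complex relationships between inputs and outputs, so that it produces an appropriate output for a given input.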

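A neural network typically has three kinds of layers:

Input layer: receives the input data; each input node corresponds to one feature of the input.
Hidden layer: sits between the input layer and the output layer, processes the input data, and extracts features. The depth of a network is usually determined by the number of hidden layers.
Output layer: produces the network's output. The number of output nodes usually depends on the type of problem: for classification it equals the number of classes, while a regression problem usually has a single output node.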

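Neural networks are usually trained with the backpropagation algorithm, which computes the gradient of the error with respect to every connection weight; a gradient descent optimizer then uses those gradients to adjust the weights so as to minimize the error between the model's predictions and the true labels. After enough training iterations, the network can learn the complex relationship between inputs and outputs and make reasonably accurate predictions or classifications on new inputs.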
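
As a rough illustration of the weight-update idea described above, the sketch below fits a single weight and bias to toy data with plain gradient descent. The data, the learning rate, and the function name `fit_line` are made up for this example, and the gradients are derived by hand; a real network would rely on backpropagation to obtain them automatically.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (hypothetical values).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The recovered values should land close to the slope 2 and intercept 1 used to generate the data.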

PyTorch

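PyTorch is an open-source deep learning framework originally developed by Facebook's AI research team. It provides a rich set of tools and libraries for building, training, and deploying deep neural network models.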

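The main features of PyTorch include:

Dynamic computation graph: PyTorch builds the computation graph on the fly as the code executes, which makes writing and debugging code more intuitive and flexible.
Easy to learn and use: PyTorch uses Python as its main programming interface, so developers who already know Python find it relatively easy to pick up.
Flexibility: PyTorch offers rich high-level APIs and also exposes low-level tensor operations, so users can control the details of a model when they need to.
Dynamic debugging: because the graph is built at runtime, intermediate tensors can be inspected while the program runs using ordinary Python tools such as print statements or a debugger.
Strong community support: PyTorch has a large community with plenty of documentation, tutorials, and example code, which helps users get started and solve problems quickly.
Wide range of applications: PyTorch is widely used in image processing, natural language processing, speech recognition, and other fields, by both researchers and engineers, to build and train many kinds of deep learning models.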
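
To make the dynamic computation graph point more concrete, here is a minimal sketch in which the number of operations depends on the input values, yet gradients still flow through whatever path was actually executed. The function name `scaled_until_large` and the numbers are invented for illustration.

```python
import torch

def scaled_until_large(x, limit=10.0):
    """Double x until its norm reaches limit; the loop length depends on the data."""
    steps = 0
    while x.norm() < limit:
        x = x * 2
        steps += 1
    return x.sum(), steps

x = torch.tensor([1.0, 2.0], requires_grad=True)
y, steps = scaled_until_large(x)
y.backward()  # gradients follow the graph that was built during this particular run
print(steps, x.grad)
```

Because the graph is recorded as ordinary Python runs, a plain `while` loop is enough here; no special graph-level control-flow construct is needed.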

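Training neural networks with PyTorch is very common today. Installing PyTorch can be somewhat involved; readers can search this site for related articles, many of which give detailed steps.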

Hands-on: Handwritten Digit Recognition

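Here i is the index of a node in the previous layer, j is the index of a node in the current layer, and k and k+1 denote the layer numbers. Layers are built one after another in this way, and the image information propagates through to the last layer (the output layer).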
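
One common way to write a fully connected layer that matches this index description is shown below. It is a sketch under assumptions: a is taken to be the weights, b the biases, and f the activation function; the exact notation intended by the original notes may differ.

```latex
x_j^{(k+1)} = f\left( \sum_i a_{ij}^{(k)} \, x_i^{(k)} + b_j^{(k)} \right)
```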

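Normalizing the output layer with softmax gives raw probability values; training then adjusts the network parameters a and b to reduce the gap between the predicted probabilities and the true labels.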
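
The softmax normalization mentioned above can be sketched in plain Python as follows. The input scores are arbitrary example values, not outputs of the network in this post.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])
```

Larger scores receive larger probabilities, and the outputs always add up to 1, which is what lets them be read as class probabilities.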

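At its core, a neural network is just a mathematical function, and training is the process of adjusting the parameters of that function.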

Install the required Python packages

pip install numpy torch torchvision matplotlib

Wait for the installation to finish.
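
If it helps, a quick way to confirm that the packages can be found is a small check like the one below. The helper name `missing_packages` is hypothetical; it only uses the standard library, so it runs even when the installation failed.

```python
import importlib.util

def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]

required = ["numpy", "torch", "torchvision", "matplotlib"]
print("missing:", missing_packages(required))
```

An empty list means all four packages are visible to the current Python interpreter.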

Code

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Four fully connected layers
        self.fc1 = torch.nn.Linear(28*28, 64)
        self.fc2 = torch.nn.Linear(64, 64)
        self.fc3 = torch.nn.Linear(64, 64)
        self.fc4 = torch.nn.Linear(64, 10)

    def forward(self, x):
        # Forward pass: ReLU after each hidden layer, log-softmax on the output
        x = torch.nn.functional.relu(self.fc1(x))
        x = torch.nn.functional.relu(self.fc2(x))
        x = torch.nn.functional.relu(self.fc3(x))
        x = torch.nn.functional.log_softmax(self.fc4(x), dim=1)
        return x

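The Net class is the body of the neural network. It contains four fully connected layers: the input is a 28*28-pixel image, each of the three middle layers has 64 nodes, and the output has 10 nodes, one per digit class.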
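
To get a feel for the size of this network, the following sketch counts its trainable parameters from the layer sizes alone. The helper `count_parameters` is written for this note and is not part of PyTorch.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per output
    return total

print(count_parameters([28 * 28, 64, 64, 64, 10]))  # prints 59210
```

Most of the parameters sit in the first layer, since it connects all 784 pixels to 64 nodes.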

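The forward function defines the forward pass. x is the image input; each hidden layer applies a fully connected linear computation followed by the ReLU activation function, and the output layer is normalized with softmax. Using log_softmax (the logarithm of softmax) improves numerical stability.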
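
The stability remark can be illustrated without PyTorch. The sketch below computes log-softmax with the log-sum-exp trick, which is one standard way to avoid overflow; it is not claimed to be how PyTorch implements log_softmax internally, and the scores are deliberately extreme example values.

```python
import math

def log_softmax(scores):
    """Compute log(softmax(scores)) using the log-sum-exp trick."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]

def nll(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]

scores = [1000.0, 999.0, 0.0]  # math.exp(1000.0) on its own would overflow
log_probs = log_softmax(scores)
print(nll(log_probs, 0))
```

Computing softmax first and taking the logarithm afterwards would fail on these scores, while working in log space keeps every intermediate value in range.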

def get_data_loader(is_train):
    # Load MNIST (downloading it if needed) and wrap it in a DataLoader
    to_tensor = transforms.Compose([transforms.ToTensor()])  # convert images to tensors (multi-dimensional arrays)
    data_set = MNIST("", is_train, transform=to_tensor, download=True)
    return DataLoader(data_set, batch_size=15, shuffle=True)

The get_data_loader function loads the data, downloading the MNIST dataset if necessary.
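
To see what a DataLoader with batch_size=15 hands back without downloading anything, the sketch below uses 100 random tensors as a stand-in for MNIST. The fake data and the variable names are assumptions made for this example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 100 fake 1x28x28 "images" with random labels stand in for MNIST here.
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
loader = DataLoader(TensorDataset(images, labels), batch_size=15, shuffle=True)

batch_shapes = [tuple(x.shape) for x, _ in loader]
print(len(loader), batch_shapes[0], batch_shapes[-1])
```

Each batch has shape (15, 1, 28, 28) except the last one, which holds the 10 leftover samples; this is why the code in this post reshapes with view(-1, 28*28) rather than a fixed batch size.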

def evaluate(test_data, net):
    # Measure the recognition accuracy of the network on the test set
    n_correct = 0
    n_total = 0
    with torch.no_grad():
        for (x, y) in test_data:
            outputs = net(x.view(-1, 28*28))
            for i, output in enumerate(outputs):
                if torch.argmax(output) == y[i]:
                    n_correct += 1
                n_total += 1
    return n_correct / n_total

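The evaluate function measures the network's recognition accuracy: it takes data from the test set batch by batch, computes the network's predictions, and returns the fraction that are correct.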
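
The per-sample loop in evaluate can also be written in vectorized form. The sketch below shows the idea on a tiny hand-made batch; `batch_correct` is a hypothetical helper, not part of the original code.

```python
import torch

def batch_correct(outputs, targets):
    """Count correct predictions in one batch without a Python loop."""
    predictions = outputs.argmax(dim=1)  # index of the highest score per row
    return (predictions == targets).sum().item()

outputs = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
targets = torch.tensor([1, 0, 0])
print(batch_correct(outputs, targets))  # prints 2
```

Summing this count over all batches and dividing by the number of samples gives the same accuracy as the loop version, usually with less Python overhead.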

def main():
    train_data = get_data_loader(is_train=True)
    test_data = get_data_loader(is_train=False)
    net = Net()  # initialize the neural network

    # Print the initial accuracy; it should be around 0.1
    print("initial accuracy:", evaluate(test_data, net))

    # Train the neural network
    optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
    for epoch in range(2):
        for (x, y) in train_data:
            net.zero_grad()  # clear gradients left over from the previous step
            output = net(x.view(-1, 28*28))  # forward pass
            loss = torch.nn.functional.nll_loss(output, y)  # negative log-likelihood loss, matching log_softmax
            loss.backward()  # backpropagate the error
            optimizer.step()  # update the network parameters
        print("epoch", epoch, "accuracy:", evaluate(test_data, net))  # accuracy after each epoch

    # After training, show the first image of four shuffled test batches with its prediction
    for (n, (x, _)) in enumerate(test_data):
        if n > 3:
            break
        predict = torch.argmax(net(x[0].view(-1, 28*28)))
        plt.figure(n)
        plt.imshow(x[0].view(28, 28))
        plt.title("Prediction: " + str(int(predict)))
    plt.show()

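In the main function, we first load the training and test sets and initialize the neural network. We start by printing the initial prediction accuracy, which should be around 0.1 because there are ten digits and an untrained network is essentially guessing.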

Each epoch is one full pass over the training set, and the accuracy is printed after every epoch.
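
The same zero-gradients, forward, loss, backward, step pattern can be tried on synthetic data in a few seconds. The sketch below is a scaled-down stand-in under assumptions: a single linear layer, random inputs and labels, and a larger learning rate than the one used for MNIST above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Synthetic stand-in for a batch: 64 samples, 20 features, 3 classes.
x = torch.randn(64, 20)
y = torch.randint(0, 3, (64,))

model = torch.nn.Linear(20, 3)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

losses = []
for step in range(200):
    optimizer.zero_grad()                       # clear gradients from the previous step
    log_probs = F.log_softmax(model(x), dim=1)  # forward pass
    loss = F.nll_loss(log_probs, y)             # negative log-likelihood loss
    loss.backward()                             # backpropagate
    optimizer.step()                            # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])
```

The final loss should be lower than the initial one, which is the same effect the per-epoch accuracy printout shows for the real network.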

Finally, a few randomly chosen test images are displayed together with their predictions.

The complete code is as follows:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import MNIST
import matplotlib.pyplot as plt

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Four fully connected layers
        self.fc1 = torch.nn.Linear(28*28, 64)
        self.fc2 = torch.nn.Linear(64, 64)
        self.fc3 = torch.nn.Linear(64, 64)
        self.fc4 = torch.nn.Linear(64, 10)

    def forward(self, x):
        # Forward pass: ReLU after each hidden layer, log-softmax on the output
        x = torch.nn.functional.relu(self.fc1(x))
        x = torch.nn.functional.relu(self.fc2(x))
        x = torch.nn.functional.relu(self.fc3(x))
        x = torch.nn.functional.log_softmax(self.fc4(x), dim=1)
        return x

def get_data_loader(is_train):
    # Load MNIST (downloading it if needed) and wrap it in a DataLoader
    to_tensor = transforms.Compose([transforms.ToTensor()])  # convert images to tensors (multi-dimensional arrays)
    data_set = MNIST("", is_train, transform=to_tensor, download=True)
    return DataLoader(data_set, batch_size=15, shuffle=True)

def evaluate(test_data, net):
    # Measure the recognition accuracy of the network on the test set
    n_correct = 0
    n_total = 0
    with torch.no_grad():
        for (x, y) in test_data:
            outputs = net(x.view(-1, 28*28))
            for i, output in enumerate(outputs):
                if torch.argmax(output) == y[i]:
                    n_correct += 1
                n_total += 1
    return n_correct / n_total

def main():
    train_data = get_data_loader(is_train=True)
    test_data = get_data_loader(is_train=False)
    net = Net()  # initialize the neural network

    # Print the initial accuracy; it should be around 0.1
    print("initial accuracy:", evaluate(test_data, net))

    # Train the neural network
    optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
    for epoch in range(2):
        for (x, y) in train_data:
            net.zero_grad()  # clear gradients left over from the previous step
            output = net(x.view(-1, 28*28))  # forward pass
            loss = torch.nn.functional.nll_loss(output, y)  # negative log-likelihood loss, matching log_softmax
            loss.backward()  # backpropagate the error
            optimizer.step()  # update the network parameters
        print("epoch", epoch, "accuracy:", evaluate(test_data, net))  # accuracy after each epoch

    # After training, show the first image of four shuffled test batches with its prediction
    for (n, (x, _)) in enumerate(test_data):
        if n > 3:
            break
        predict = torch.argmax(net(x[0].view(-1, 28*28)))
        plt.figure(n)
        plt.imshow(x[0].view(28, 28))
        plt.title("Prediction: " + str(int(predict)))
    plt.show()

if __name__ == "__main__":
    main()

Output (the dataset is downloaded first, then the network is trained and its accuracy evaluated):

D:\Pycharm\python3.10\python.exe D:\Pycharm\pythoncode\NeuralNetwork\RecognizeNum.py 
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to MNIST\raw\train-images-idx3-ubyte.gz
100.0%
Extracting MNIST\raw\train-images-idx3-ubyte.gz to MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to MNIST\raw\train-labels-idx1-ubyte.gz
100.0%
Extracting MNIST\raw\train-labels-idx1-ubyte.gz to MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to MNIST\raw\t10k-images-idx3-ubyte.gz
100.0%
Extracting MNIST\raw\t10k-images-idx3-ubyte.gz to MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to MNIST\raw\t10k-labels-idx1-ubyte.gz
100.0%
Extracting MNIST\raw\t10k-labels-idx1-ubyte.gz to MNIST\raw

initial accuracy: 0.1135
epoch 0 accuracy: 0.9581
epoch 1 accuracy: 0.97

Process finished with exit code 0

And that is a simple trained neural network.

This post is my personal study notes, based on material by the Bilibili uploader 孔工码字.
