Digit Recognition on the MNIST Dataset: Quickly Building a Two-Layer Neural Network with torch

曙光81

Published 2024-07-22 15:13:32

Tags: neural networks, artificial intelligence, deep learning

Article link: https://blog.csdn.net/weixin_62511720/article/details/140610175

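First, install the torch package for Python. Using a mirror speeds up the download; in the PyCharm terminal, enter:

pip install torch -i https://pypi.tuna.tsinghua.edu.cn/simple

The script below also imports numpy and PIL, so install numpy and Pillow the same way if they are missing.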
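Before running the script, it can help to confirm that the packages it imports are visible to the interpreter PyCharm is using. The following is a small sketch that relies only on the standard library; the helper name `check_packages` is made up for illustration, and it only checks that each package can be located, not that it works correctly.

```python
import importlib.util


def check_packages(names):
    """Return {package_name: True/False} depending on whether it can be found."""
    # find_spec returns None for a top-level package that is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Top-level import names used by the script in this post (Pillow imports as "PIL").
    for name, found in check_packages(["torch", "numpy", "PIL"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any package reported as missing can be installed with the same pip command and mirror shown above.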

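This article uses a very small dataset of 20 training images. To train on more data, put more images in the training folder and raise the hard-coded label count in the label-processing function to match.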
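With only 20 images and the `batch_size=32` used in `main()` further down, every epoch consists of a single batch, so ten epochs amount to just ten weight updates. The arithmetic, as a quick sketch (the helper name `batches_per_epoch` is illustrative and not part of the script):

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches a DataLoader yields per epoch when drop_last is False (the default)."""
    return math.ceil(num_samples / batch_size)


print(batches_per_epoch(20, 32))      # 1
print(batches_per_epoch(60000, 32))   # 1875
```

The second line shows what the same batch size would give on the full 60,000-image MNIST training set.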

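The code below is a simple neural network, implemented with the PyTorch framework, that classifies images from the MNIST dataset. Each function is described in turn: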

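labeldata2(MNIST_labels_path): Reads the MNIST label file. It opens the file in binary mode, skips the 8-byte header, and reads the labels one byte at a time, converting each to an integer and appending it to a list. As written it reads only the first 8 labels; to read the whole file, replace range(1, 9) with range(1, len(file_labels) - 8 + 1), since each label occupies one byte after the 8-byte header.
imagedata(image_folder): Loads every image in a folder, converts it to grayscale, and resizes it to 28x28 pixels. The pixel values are then normalized to the [0, 1] range, and the function returns a NumPy array in which each row is one flattened image.
labeldata(MNIST_labels_path): Same as labeldata2, except that it reads a fixed 20 labels, so it does not cover the full MNIST dataset.
NeuralNetwork(nn.Module): A class derived from nn.Module that defines a simple feed-forward network with two fully connected layers. The __init__ method creates the layers, and the forward method defines how data passes through the network.
train(model, device, train_loader, optimizer, epoch): Trains the model for one epoch. It iterates over the training data loader and, for each batch, runs the forward pass, computes the loss, runs backpropagation, and updates the model's weights.
test(model, device, test_loader): Evaluates the model on the data in the given loader. It computes the average loss and the accuracy and prints both.
evaluate_accuracy(model, X_test_tensor, y_test_tensor, device): Computes and returns the model's accuracy on the test set, as a percentage.
main(): The entry point. It selects the device (GPU or CPU), loads and prepares the training data, creates the model and optimizer, runs the training loop, and evaluates the model's accuracy on the test set once training has finished.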
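The two label functions skip the first 8 bytes of the label file because, in the IDX format MNIST is distributed in, a label file starts with a 4-byte magic number (2049) and a 4-byte big-endian item count, followed by one unsigned byte per label. As a hedged sketch of reading the count from the header instead of hard-coding it (the function name `read_idx_labels` and its `limit` parameter are illustrative and not part of the original script):

```python
import struct


def read_idx_labels(raw, limit=None):
    """Parse the bytes of an IDX1 label file and return the labels as a list of ints."""
    # Header: two big-endian unsigned 32-bit integers (magic number, item count).
    magic, count = struct.unpack(">II", raw[:8])
    if magic != 2049:
        raise ValueError(f"unexpected magic number {magic}, expected 2049")
    if limit is not None:
        count = min(count, limit)
    # Each label is a single unsigned byte; iterating over bytes yields ints directly.
    return list(raw[8:8 + count])


# Tiny synthetic file: a header declaring 5 labels, then the labels themselves.
fake_file = struct.pack(">II", 2049, 5) + bytes([5, 0, 4, 1, 9])
print(read_idx_labels(fake_file))           # [5, 0, 4, 1, 9]
print(read_idx_labels(fake_file, limit=3))  # [5, 0, 4]
```

With a real file, the bytes would come from `open(path, 'rb').read()`, and `limit=20` or `limit=8` would reproduce what labeldata and labeldata2 return.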

Overall, the code implements a basic training and testing pipeline for a neural network.
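To get a feel for the size of the network, the number of trainable parameters in the two fully connected layers can be worked out by hand: a fully connected layer with bias holds one weight per input-output pair plus one bias per output. A small sketch using the sizes set in `main()` (784, 128, 10); the helper name `linear_params` is illustrative:

```python
def linear_params(in_features, out_features):
    """Parameter count of one fully connected layer with bias: weights + biases."""
    return in_features * out_features + out_features


fc1 = linear_params(784, 128)   # 100480
fc2 = linear_params(128, 10)    # 1290
print(fc1 + fc2)                # 101770
```

Roughly a hundred thousand parameters are being fitted to 20 images here, so high accuracy on the training set says little about how the model generalizes.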

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import numpy as np
from PIL import Image
import os

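def labeldata2(MNIST_labels_path):
    # Read the whole label file as raw bytes
    with open(MNIST_labels_path, 'rb') as f:
        file_labels = f.read()
    train_label = []
    # The first 8 bytes are the header; each following byte is one label.
    # Only the first 8 labels are read here.
    for i in range(1, 9):
        label = int.from_bytes(file_labels[i + 7:i + 8], 'big')
        train_label.append(label)
    return train_label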

# Load every image in a folder as a flattened, normalized vector
def imagedata(image_folder):
    # sorted() makes the order deterministic; the file names must sort
    # into the same order as the labels
    image_files = sorted(os.listdir(image_folder))
    image_list = []
    for image_file in image_files:
        file_path = os.path.join(image_folder, image_file)
        with Image.open(file_path) as img:
            img_gray = img.convert('L')  # convert to grayscale
            img_resized = img_gray.resize((28, 28))
            image_arry = np.array(img_resized, dtype=np.float32)  # convert to float
            image_arry = image_arry / 255  # normalize to [0, 1]
            image_list.append(image_arry)
    mnist_images = np.stack(image_list)
    # Flatten each 28x28 image into a 784-element row
    train_image = mnist_images.reshape(mnist_images.shape[0], -1)
    return train_image


def labeldata(MNIST_labels_path):
    # Read the whole label file as raw bytes
    with open(MNIST_labels_path, 'rb') as f:
        file_labels = f.read()
    train_label = []
    # Skip the 8-byte header and read the first 20 labels (one byte each)
    for i in range(1, 21):
        label = int.from_bytes(file_labels[i + 7:i + 8], 'big')
        train_label.append(label)
    return train_label


class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x


def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.cross_entropy(output, target.squeeze(1))
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print(
                f'Train Epoch: {epoch} [{batch_idx * len(data)}/{len(train_loader.dataset)} ({100. * batch_idx / len(train_loader):.0f}%)]\tLoss: {loss.item():.6f}')


def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.cross_entropy(output, target.squeeze(1), reduction='sum').item()
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)
    print(
        f'\nTest set: Average loss: {test_loss:.4f}, Accuracy: {correct}/{len(test_loader.dataset)} ({100. * correct / len(test_loader.dataset):.0f}%)\n')

def evaluate_accuracy(model, X_test_tensor, y_test_tensor, device):
    model.eval()  # switch the model to evaluation mode
    correct = 0
    total = y_test_tensor.size(0)  # total number of test samples

    with torch.no_grad():
        # Move the test data and labels to the device (GPU or CPU)
        X_test_tensor = X_test_tensor.to(device)
        y_test_tensor = y_test_tensor.to(device)

        # Run the predictions
        outputs = model(X_test_tensor)
        # The index of the largest output is the predicted class
        _, predicted = torch.max(outputs, 1)
        # Count the correctly predicted samples
        correct += (predicted == y_test_tensor).sum().item()

    # Compute the accuracy as a percentage
    accuracy = 100 * correct / total
    return accuracy


def main():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Load the training set
    image_folder = "D:\\MNIST_data\\train1"
    MNIST_labels_path = 'D:\\MNIST_data\\train-labels-idx1-ubyte\\train-labels.idx1-ubyte'
    X_train = imagedata(image_folder)
    y_train = labeldata(MNIST_labels_path)
    y_train = np.array(y_train)

    # Convert the NumPy arrays to PyTorch tensors
    X_train_tensor = torch.tensor(X_train.astype(np.float32))
    y_train_tensor = torch.tensor(y_train.astype(np.int64)).view(-1, 1)

    # Create the data loader
    train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
    train_loader = DataLoader(dataset=train_dataset, batch_size=32, shuffle=True)

    # Initialize the network
    input_size = 784  # 28x28
    hidden_size = 128
    output_size = 10
    model = NeuralNetwork(input_size, hidden_size, output_size).to(device)

    # Define the optimizer (the cross-entropy loss is computed inside train())
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

    # Train the network
    num_epochs = 10
    for epoch in range(1, num_epochs + 1):
        train(model, device, train_loader, optimizer, epoch)

    # Evaluate accuracy on the training set
    test(model, device, train_loader)

    test_image_folder = "D:\\MNIST_data\\test"  # path to the test images
    # Label file for the test images. As written this points at the training
    # label file, so the labels are only correct if the test folder holds the
    # first 8 training images; otherwise point it at the matching label file.
    test_MNIST_labels_path = 'D:\\MNIST_data\\train-labels-idx1-ubyte\\train-labels.idx1-ubyte'
    X_test = imagedata(test_image_folder)
    y_test = labeldata2(test_MNIST_labels_path)  # reads 8 labels, so the test folder must contain 8 images
    y_test = np.array(y_test)

    # Convert the test-set NumPy arrays to PyTorch tensors
    X_test_tensor = torch.tensor(X_test.astype(np.float32))
    y_test_tensor = torch.tensor(y_test.astype(np.int64))

    # Evaluate accuracy on the test set
    test_accuracy = evaluate_accuracy(model, X_test_tensor, y_test_tensor, device)
    print(f"Test Accuracy: {test_accuracy:.2f}%")

if __name__ == "__main__":
    main()
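
One thing to watch: imagedata pairs images with labels purely by position, so the order in which the files are listed must match the order of the labels. `os.listdir` makes no ordering guarantee, and plain alphabetical sorting puts `10.png` before `2.png`. Assuming the images are named by their index (for example `1.png`, `2.png`, and so on; the post does not say how they are named), a numeric-aware sort key is one way to get a stable order. The helper name `natural_key` is illustrative:

```python
import re


def natural_key(filename):
    """Split a name into text and number chunks so that '2.png' sorts before '10.png'."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", filename)]


names = ["10.png", "2.png", "1.png", "20.png"]
print(sorted(names))                   # ['1.png', '10.png', '2.png', '20.png']
print(sorted(names, key=natural_key))  # ['1.png', '2.png', '10.png', '20.png']
```

Inside imagedata, this would mean listing the files with `sorted(os.listdir(image_folder), key=natural_key)`.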