[PyTorch] 16. Using ImageFolder to Load a Custom MNIST Dataset and Train a Handwritten Digit Recognition Network (Dataset Download Included)

Dataset Download

The MINST_PNG_Training project on GitHub ships a compressed archive of the MNIST dataset in PNG format inside its datasets directory.
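
If you prefer to unpack the archive from Python rather than by hand, here is a minimal sketch; the archive file name and target directory are assumptions, so adjust them to whatever the repository actually ships:

import shutil

# Hypothetical archive name; replace it with the file actually downloaded from the repository
shutil.unpack_archive('../datasets/mnist_png.zip', '../datasets')
# Afterwards the images are expected under ../datasets/mnist_png/training and .../testing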

The Neural Network Model Used for Training

[Figure: architecture of the network used for training; the full model code is given below]

Training with a Custom Dataset

In the earlier post [PyTorch] 13. Building a Complete CIFAR10 Model we already covered the basic framework for building and training a neural network, but there the training data came from the official CIFAR10 dataset in torchvision:

train_dataset = torchvision.datasets.CIFAR10('../datasets', train=True, download=True,
                                             transform=torchvision.transforms.ToTensor())
test_dataset = torchvision.datasets.CIFAR10('../datasets', train=False, download=True,
                                            transform=torchvision.transforms.ToTensor())


In this post we will instead train on the same data stored as ordinary image files on disk.

Printing the official dataset object shows the format we are aiming for:

# Dataset CIFAR10
#     Number of datapoints: 50000
#     Root location: ../datasets
#     Split: Train
#     StandardTransform
# Transform: ToTensor()
print(train_dataset)

From the output we can see what the downloaded dataset looks like, so the main problem is how to load our custom dataset and turn it into the same kind of object; the remaining steps are then identical to the earlier post.
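
The tool we will use for this, torchvision's ImageFolder, derives each image's label from the name of the sub-directory it sits in, so the extracted PNG dataset is assumed to be laid out with one folder per digit (training/0 ... training/9 and testing/0 ... testing/9). A quick sketch to verify that layout, using the same root path as in the code below:

import os

train_root = "../datasets/mnist_png/training"
# Each class folder name (0-9) becomes a label; print the folders and how many images each holds
for cls in sorted(os.listdir(train_root)):
    cls_dir = os.path.join(train_root, cls)
    if os.path.isdir(cls_dir):
        print(cls, len(os.listdir(cls_dir)))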

Converting the Dataset Format

Our first goal is to turn the dataset directories into train_dataset and test_dataset.
We do this with torchvision's ImageFolder:

from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
from torchvision.datasets import ImageFolder
from model import *

# Path to the training set
train_root = "../datasets/mnist_png/training"
# Path to the test set
test_root = '../datasets/mnist_png/testing'

# Define the data transforms
data_transform = transforms.Compose([transforms.Resize((28, 28)),
                                     transforms.Grayscale(),
                                     transforms.ToTensor()])


# Load the datasets with ImageFolder
train_dataset = ImageFolder(train_root, transform=data_transform)
test_dataset = ImageFolder(test_root, transform=data_transform)

First we preprocess the data: transforms.Compose builds the data_transform object, which performs three steps (a quick check of the result follows this list):

  • Resize the images to 28x28 pixels so they match the network's input size
  • Convert the images to grayscale: handwritten digit recognition only needs a single intensity channel, while the PNG files may otherwise be loaded with three or four (RGB/RGBA) channels
  • Convert the images to tensors
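
To confirm what the transform produces, run it on a single image; a minimal sketch (the sample file name is hypothetical, any digit PNG from the extracted dataset will do):

from PIL import Image
from torchvision import transforms

# Same transform as defined above
data_transform = transforms.Compose([transforms.Resize((28, 28)),
                                     transforms.Grayscale(),
                                     transforms.ToTensor()])

# Hypothetical sample path; pick any file from the training folders
img = Image.open("../datasets/mnist_png/training/0/1.png")
x = data_transform(img)
print(x.shape, x.dtype)  # expected: torch.Size([1, 28, 28]) torch.float32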

Then, by giving ImageFolder the image directory and the transform, we get a dataset with the same interface as the one downloaded from torchvision:

# Dataset ImageFolder
#     Number of datapoints: 60000
#     Root location: ../datasets/mnist_png/training
#     StandardTransform
# Transform: Compose(
#                Resize(size=(28, 28), interpolation=bilinear, max_size=None, antialias=True)
#                Grayscale(num_output_channels=1)
#                ToTensor()
#            )
print(train_dataset)
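
Because the labels are taken from the folder names, it is worth checking the class-to-index mapping and one sample once; a quick sanity check on the train_dataset built above:

print(train_dataset.classes)       # folder names, sorted: ['0', '1', ..., '9']
print(train_dataset.class_to_idx)  # e.g. {'0': 0, '1': 1, ..., '9': 9}
img, label = train_dataset[0]
print(img.shape, label)            # torch.Size([1, 28, 28]) and the index of its class folder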

Everything else is essentially the same as in the earlier post [PyTorch] 13. Building a Complete CIFAR10 Model.

Complete Code

Network Model

import torch
from torch import nn


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 x 28 x 28 -> 32 x 28 x 28
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        # 32 x 28 x 28 -> 32 x 14 x 14
        self.pool1 = nn.MaxPool2d(2, stride=2)
        # 32 x 14 x 14 -> 64 x 14 x 14
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        # 64 x 14 x 14 -> 64 x 7 x 7
        self.pool2 = nn.MaxPool2d(2, stride=2)
        self.flatten = nn.Flatten()
        # 64 * 7 * 7 = 3136 flattened features
        self.fc1 = nn.Linear(3136, 128)
        # 10 output classes, one per digit
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.pool2(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return x


if __name__ == "__main__":
    # Quick shape check: a dummy 1 x 1 x 28 x 28 input should produce a (1, 10) output
    model = Net()
    x = torch.ones((1, 1, 28, 28))
    output = model(x)
    print(output.shape)

Training Script

import torch
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
from torchvision.datasets import ImageFolder
from model import *

# Path to the training set
train_root = "../datasets/mnist_png/training"
# Path to the test set
test_root = '../datasets/mnist_png/testing'

# Define the data transforms
data_transform = transforms.Compose([transforms.Resize((28, 28)),
                                     transforms.Grayscale(),
                                     transforms.ToTensor()])


# Load the datasets with ImageFolder
train_dataset = ImageFolder(train_root, transform=data_transform)
test_dataset = ImageFolder(test_root, transform=data_transform)

# Dataset ImageFolder
#     Number of datapoints: 60000
#     Root location: ../datasets/mnist_png/training
#     StandardTransform
# Transform: Compose(
#                Resize(size=(28, 28), interpolation=bilinear, max_size=None, antialias=True)
#                Grayscale(num_output_channels=1)
#                ToTensor()
#            )
# print(train_dataset)

# print(train_dataset[0])


# Wrap the datasets in DataLoaders that yield shuffled batches of 64
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=True)


# Use Apple's MPS backend if available, otherwise fall back to the CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = Net().to(device)
loss_fn = nn.CrossEntropyLoss().to(device)
learning_rate = 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

epoch = 10

writer = SummaryWriter('../logs')
total_step = 0

for i in range(epoch):
    model.train()
    pre_step = 0
    pre_loss = 0
    for data in train_loader:
        images, labels = data
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = loss_fn(outputs, labels)
        loss.backward()
        optimizer.step()
        pre_loss = pre_loss + loss.item()
        pre_step += 1
        total_step += 1
        if pre_step % 100 == 0:
            # pre_loss / pre_step is the running average loss over this epoch so far
            print(f"Epoch: {i + 1}, avg_loss = {pre_loss / pre_step}")
            writer.add_scalar('train_loss', pre_loss / pre_step, total_step)

    # Evaluate on the test set after each epoch
    model.eval()
    pre_accuracy = 0
    with torch.no_grad():
        for data in test_loader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            pre_accuracy += outputs.argmax(1).eq(labels).sum().item()
    print(f"Test_accuracy: {pre_accuracy/len(test_dataset)}")
    writer.add_scalar('test_accuracy', pre_accuracy / len(test_dataset), i)
    # Save the full model after each epoch
    torch.save(model, f'../models/model{i}.pth')

writer.close()
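
Since the script saves the whole model object with torch.save, loading a checkpoint back for inference on a single image could look like the following sketch; the checkpoint path and sample image path are assumptions:

import torch
from PIL import Image
from torchvision import transforms
from model import Net  # the class definition must be importable when un-pickling a full model

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
# On recent PyTorch versions you may need torch.load(..., weights_only=False) for a pickled model
model = torch.load('../models/model9.pth', map_location=device)
model.eval()

data_transform = transforms.Compose([transforms.Resize((28, 28)),
                                     transforms.Grayscale(),
                                     transforms.ToTensor()])

# Hypothetical sample image; any PNG from the testing folders will do
img = Image.open('../datasets/mnist_png/testing/7/0.png')
x = data_transform(img).unsqueeze(0).to(device)  # add the batch dimension
with torch.no_grad():
    pred = model(x).argmax(1).item()
print(pred)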

References

【CNN】搭建AlexNet网络——并处理自定义的数据集(猫狗分类)
How to download MNIST images as PNGs
