Handwritten Digit Recognition with PyTorch (MNIST Dataset)

1. Loading the Dataset

A small tip for quick hands-on learning: run these experiments on Google's cloud-hosted Jupyter environment (Colab), where training is blazingly fast.
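Since everything below calls .cuda(), it is worth confirming first that a GPU runtime is actually attached (a minimal check; on Colab choose a GPU under Runtime > Change runtime type):

import torch
print(torch.cuda.is_available())           # should print True on a GPU runtime
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # name of the GPU assigned to this session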

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
from torchvision import datasets, transforms

# normalize with the MNIST mean and std (0.1307, 0.3081)
transformation = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
# 'data/' is the download directory; transformation is applied to every sample
train_dataset = datasets.MNIST('data/', train=True, transform=transformation, download=True)
test_dataset = datasets.MNIST('data/', train=False, transform=transformation, download=True)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32, shuffle=True)

2. Displaying an Image

# wrap the loader in an iterator and read one batch
simple_data = next(iter(train_loader))
import matplotlib.pyplot as plt
def plot_img(image):
    image = image.numpy()[0]
    mean = 0.1307
    std = 0.3081
    image = image * std + mean  # undo the normalization: x_orig = x * std + mean
    plt.imshow(image, cmap='gray')


plot_img(simple_data[0][3])

3. Building the Network Model

import torch.nn.functional as F
class Mnist_Net(nn.Module):
  def __init__(self):
    super(Mnist_Net,self).__init__()
    self.conv1 = nn.Conv2d(1,10,kernel_size=5)
    self.conv2 = nn.Conv2d(10,20,kernel_size=5)
    self.conv2_drop = nn.Dropout2d()
    self.fc1 = nn.Linear(320, 50)  # 320 comes from the conv output: 4*4*20 (4*4 spatial size, 20 channels)
    self.fc2 = nn.Linear(50, 10)
  def forward(self, x):
    x = F.relu(F.max_pool2d(self.conv1(x), 2))
    x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
    
    x = x.view(-1, 320)
    x = F.relu(self.fc1(x))
    #x = F.dropout(x,p=0.1, training=self.training)
    x = self.fc2(x)
    return F.log_softmax(x,dim=1)
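To see where the 320 passed to fc1 comes from, you can trace the shapes with a dummy batch (a quick sketch; a 5x5 kernel with no padding shrinks 28 to 24 and 12 to 8, and each max-pool halves the size):

# trace the conv/pool shapes to confirm 320 = 20 * 4 * 4
net = Mnist_Net()
x = torch.randn(1, 1, 28, 28)      # one fake MNIST image
x = F.max_pool2d(net.conv1(x), 2)  # conv: 28 -> 24, pool: 24 -> 12 => (1, 10, 12, 12)
x = F.max_pool2d(net.conv2(x), 2)  # conv: 12 -> 8,  pool: 8 -> 4   => (1, 20, 4, 4)
print(x.shape)                     # torch.Size([1, 20, 4, 4]), i.e. 320 features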

log_softmax is mathematically equivalent to log(softmax(x)).

Every probability produced by softmax lies in (0, 1), so very small probabilities can underflow to zero. Since the distribution is going to be fed into a cross-entropy loss anyway, and cross-entropy wraps each probability in -log to form the likelihood, we can apply the log while computing the distribution itself, mapping values from (0, 1) to (-∞, 0). In short, log_softmax moves the log that cross-entropy would apply forward into the prediction step, skipping the intermediate storage of raw probabilities; this prevents underflow in the intermediate values and makes the computation numerically stable.
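A small sketch of that underflow in action: with extreme logits, softmax rounds the small probabilities to exactly 0, so taking the log afterwards gives -inf, while log_softmax stays finite:

logits = torch.tensor([[1000., 0., -1000.]])
print(torch.log(F.softmax(logits, dim=1)))  # tensor([[0., -inf, -inf]]) -- probabilities underflowed
print(F.log_softmax(logits, dim=1))         # tensor([[0., -1000., -2000.]]) -- numerically stable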

nll_loss (negative log likelihood loss): the negative log-likelihood cost function.
CrossEntropyLoss: the cross-entropy loss function. Cross-entropy measures the distance between two probability distributions; the smaller it is, the closer the two distributions are.
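In PyTorch the pieces fit together: CrossEntropyLoss is exactly log_softmax followed by nll_loss, which is why this model returns log_softmax in forward and is trained with F.nll_loss below. A quick check of the equivalence:

logits = torch.randn(4, 10)                           # a fake batch of raw scores
target = torch.randint(0, 10, (4,))                   # fake labels
a = F.cross_entropy(logits, target)                   # softmax + log + NLL in one op
b = F.nll_loss(F.log_softmax(logits, dim=1), target)  # the two-step version used in this post
print(torch.allclose(a, b))                           # True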

model = Mnist_Net()
model = model.cuda()  # move the model to the GPU to accelerate training
optimizer = optim.SGD(model.parameters(), lr=0.01)  # optimizer

model

Mnist_Net(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))
  (conv2_drop): Dropout2d(p=0.5, inplace=False)
  (fc1): Linear(in_features=320, out_features=50, bias=True)
  (fc2): Linear(in_features=50, out_features=10, bias=True)
)

4. The Training/Validation Function

def fit(epoch, model, data_loader, phase='training', volatile=False):
  if phase == 'training':  # are we training or validating?
    model.train()
  if phase == 'validation':
    model.eval()
    volatile = True
  running_loss = 0.0
  running_correct = 0
  for batch_idx, (data, target) in enumerate(data_loader):  # fetch a batch
    data, target = data.cuda(), target.cuda()  # move the batch to the GPU
    # volatile disabled grad tracking in old PyTorch; newer versions use torch.no_grad()
    data, target = Variable(data, volatile=volatile), Variable(target)
    if phase == 'training':
      optimizer.zero_grad()  # reset the gradients
    output = model(data)  # forward pass
    loss = F.nll_loss(output, target)  # mean loss for backprop
    running_loss += F.nll_loss(output, target, reduction='sum').item()  # accumulate the summed loss
    preds = output.data.max(dim=1, keepdim=True)[1]  # log-probabilities -> predicted digit
    running_correct += preds.eq(target.data.view_as(preds)).cpu().sum().item()
    if phase == 'training':
      loss.backward()
      optimizer.step()
  loss = running_loss / len(data_loader.dataset)
  accuracy = 100. * running_correct / len(data_loader.dataset)
  print(f'{phase} loss is {loss:{5}.{2}} and {phase} accuracy is {running_correct}/{len(data_loader.dataset)}{accuracy:{10}.{4}}')
  return loss, accuracy

5. Training, Validation, and Visualization

train_losses , train_accuracy = [],[]
val_losses , val_accuracy = [],[]
for epoch in range(1,40):
    epoch_loss, epoch_accuracy = fit(epoch,model,train_loader,phase='training')
    val_epoch_loss , val_epoch_accuracy = fit(epoch,model,test_loader,phase='validation')
    train_losses.append(epoch_loss)
    train_accuracy.append(epoch_accuracy)
    val_losses.append(val_epoch_loss)
    val_accuracy.append(val_epoch_accuracy)

Results: accuracy reaches roughly 99% on both the training and validation sets.

training loss is 0.042 and training accuracy is 59206/60000 98.68
validation loss is 0.027 and validation accuracy is 9907/10000 99.07
training loss is 0.041 and training accuracy is 59229/60000 98.71
validation loss is 0.029 and validation accuracy is 9910/10000 99.1
training loss is 0.039 and training accuracy is 59271/60000 98.79
validation loss is 0.029 and validation accuracy is 9908/10000 99.08
training loss is 0.04 and training accuracy is 59261/60000 98.77
validation loss is 0.026 and validation accuracy is 9911/10000 99.11
training loss is 0.041 and training accuracy is 59244/60000 98.74
validation loss is 0.026 and validation accuracy is 9913/10000 99.13
training loss is 0.038 and training accuracy is 59287/60000 98.81
validation loss is 0.029 and validation accuracy is 9906/10000 99.06

Visualizing the loss:

plt.plot(range(1,len(train_losses)+1),train_losses,'bo',label='training loss')
plt.plot(range(1,len(val_losses)+1),val_losses,'r',label ='validation loss')
plt.legend()

Visualizing the accuracy:

plt.plot(range(1,len(train_accuracy)+1),train_accuracy,'bo',label = 'train accuracy')
plt.plot(range(1,len(val_accuracy)+1),val_accuracy,'r',label = 'val accuracy')
plt.legend()

 

6. Training with a Pretrained Model from torchvision

Load the model (we use resnet18; note that part of the model structure must be changed to fit the shape of your data).

from torchvision import models
transfer_model = models.resnet18(pretrained=True)
# change the first conv layer to take 1 channel, because MNIST images are (1, 28, 28)
transfer_model.conv1 = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)

# change the output layer to 10 classes
dim_in = transfer_model.fc.in_features
transfer_model.fc = nn.Linear(dim_in, 10)
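A quick sanity check (a small sketch, run before moving the model to the GPU) that the modified network now accepts 1-channel 28x28 input and produces 10 logits; resnet18's adaptive average pool lets it handle the smaller input:

# confirm the surgery worked: 1-channel input in, 10 logits out
dummy = torch.randn(2, 1, 28, 28)
print(transfer_model(dummy).shape)  # torch.Size([2, 10])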

Training the model:

# loss function
criteon = nn.CrossEntropyLoss()
optimizer = optim.SGD(transfer_model.parameters(), lr=0.01)
transfer_model = transfer_model.cuda()
train_losses, train_accuracy = [], []
val_losses, val_accuracy = [], []
for epoch in range(10):
  transfer_model.train()
  running_loss = 0.0
  running_correct = 0
  for batch_idx, (x, target) in enumerate(train_loader):
    x, target = x.cuda(), target.cuda()
    x, target = Variable(x), Variable(target)
    # raw predictions (logits)
    logits = transfer_model(x)

    loss = criteon(logits, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    running_loss += loss.item()
    preds = logits.data.max(dim=1, keepdim=True)[1]
    running_correct += preds.eq(target.data.view_as(preds)).cpu().sum().item()
  train_loss = running_loss/len(train_loader.dataset)
  train_acc = 100.*running_correct/len(train_loader.dataset)
  train_losses.append(train_loss)
  train_accuracy.append(train_acc)
  print('epoch:{},train loss is{},train_acc is {}'.format(epoch,train_loss,train_acc))

  test_loss = 0.0
  test_acc_num = 0
  # evaluate the model
  transfer_model.eval()
  for data, target in test_loader:
    data, target = data.cuda(), target.cuda()
    data, target = Variable(data), Variable(target)
    logits = transfer_model(data)
    test_loss += criteon(logits, target).item()
    _, pred = torch.max(logits, 1)
    test_acc_num += pred.eq(target).float().sum().item()
  test_los = test_loss/len(test_loader.dataset)
  test_acc = test_acc_num/len(test_loader.dataset)
  val_losses.append(test_los)
  val_accuracy.append(test_acc)
  print("epoch:{} total loss:{},acc:{}".format(epoch,test_los,test_acc))

Training output:

epoch:0,train loss is0.006211824892895917,train_acc is 93
/pytorch/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
epoch:0 total loss:0.0015338615737855435,acc:0.9845
epoch:1,train loss is0.001937273425484697,train_acc is 98
epoch:1 total loss:0.0012044131346046925,acc:0.9875
epoch:2,train loss is0.0012458768549064795,train_acc is 98
epoch:2 total loss:0.0010627412386238575,acc:0.9888
epoch:3,train loss is0.0009563570257897178,train_acc is 99
epoch:3 total loss:0.0010410105541348456,acc:0.9895
epoch:4,train loss is0.0006897176718960205,train_acc is 99
epoch:4 total loss:0.000987174815684557,acc:0.9897
epoch:5,train loss is0.0005422685200969378,train_acc is 99
epoch:5 total loss:0.0009278423339128494,acc:0.9905
epoch:6,train loss is0.0004471833350757758,train_acc is 99
epoch:6 total loss:0.0008359558276832104,acc:0.9921
epoch:7,train loss is0.00036988445594906806,train_acc is 99
epoch:7 total loss:0.0007887807078659535,acc:0.9921
epoch:8,train loss is0.00031936961747705935,train_acc is 99
epoch:8 total loss:0.0009252857074141502,acc:0.9909
epoch:9,train loss is0.00030207459069788454,train_acc is 99
epoch:9 total loss:0.0008598978526890278,acc:0.9921

As the log above shows, the pretrained model reaches very high accuracy very quickly.

Visualization
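The curves can be plotted exactly as in section 5, for example:

plt.plot(range(1,len(train_losses)+1),train_losses,'bo',label='train loss')
plt.plot(range(1,len(val_losses)+1),val_losses,'r',label='val loss')
plt.legend()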
