PyTorch Deep Learning: RNN

Here we use an LSTM, a variant of the RNN, to classify the MNIST dataset.

  First, a brief overview of the code, as I understand it:
    ① To begin with: the LSTM (long short-term memory) used here is an upgraded form of the RNN. Skipping the detailed derivation, the short version is that its gating prevents the vanishing and exploding gradients of a plain RNN, which is what gives it both long- and short-term memory. The following walks through how we use it to classify MNIST.
    ② MNIST is the familiar dataset of grayscale images, so the channel count is 1 and the height and width are both 28.
    ③ We split the dataset into batches with batch_size = 100, so every tensor described below carries an extra leading dimension of 100, the batch size.
    ④ How do we turn an image into a sequence? We decompose the 28*28 image row by row into sequence_length steps of input_size each: the rows are the sequence elements x1, x2, …, x28 in the network diagram, where each x_i is one row of the image, a vector of length 28.
    ⑤ We then drop the channel dimension (it equals 1, and unlike a CNN the RNN makes no use of it).
    ⑥ A single image thus goes from (channel, height, width) to (sequence_length, input_size), i.e. (1, 28, 28) -> (28, 28).
    ⑦ Each hidden state h_i in the RNN cells has dimension hidden_size; written together (per sample) they form (num_layers, sequence_length, hidden_size), i.e. (2, 28, 128), because there are num_layers stacked layers and each layer produces one h_i per time step. (Here sequence_length and input_size are both 28, which makes them easy to mix up.)
    ⑧ The final out keeps only the top layer's row of h_i, so the num_layers dimension drops away, leaving (sequence_length, hidden_size), i.e. (28, 128).
    ⑨ We then take only the last hidden state h_28 as the output, by indexing with -1, so the shape reduces to (hidden_size,), i.e. (128,).
    ⑩ Finally, a linear layer (128, 10) maps this to the 10-way classification (see the shape-trace sketch right after this list).
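
To make the shape bookkeeping above concrete, here is a minimal sketch of mine that pushes a dummy batch (random numbers standing in for MNIST pixels) through an nn.LSTM with the same hyperparameters and prints the shape at each step:

import torch
import torch.nn as nn

# same hyperparameters as the script below
batch_size, sequence_length, input_size = 100, 28, 28
hidden_size, num_layers, num_classes = 128, 2, 10

lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
fc = nn.Linear(hidden_size, num_classes)

x = torch.randn(batch_size, 1, 28, 28)                  # (batch, channel, height, width)
x = x.reshape(batch_size, sequence_length, input_size)  # drop the channel: (100, 28, 28)

out, (hn, cn) = lstm(x)  # omitted initial states default to zeros
print(out.shape)         # torch.Size([100, 28, 128]) = (batch, sequence_length, hidden_size)
print(hn.shape)          # torch.Size([2, 100, 128])  = (num_layers, batch, hidden_size)

last = out[:, -1, :]     # keep only the final h_28: (100, 128)
logits = fc(last)        # (100, 10) class scores
print(logits.shape)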

The code:

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# sequence length: 28 (one image row per time step)
sequence_length = 28
# input dimension: 28 (pixels per row)
input_size = 28
# hidden state dimension: 128
hidden_size = 128
# number of stacked LSTM layers: 2
num_layers = 2
# number of output classes: 10
num_classes = 10
# size of each batch
batch_size = 100
# number of training epochs
num_epochs = 2
# learning rate
learning_rate = 0.01

train_dataset = torchvision.datasets.MNIST(root='data/',
                                           train=True,
                                           transform=transforms.ToTensor(),
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='data/',
                                          train=False,
                                          transform=transforms.ToTensor())

# Data loader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
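
# Optional sanity check (illustrative only, safe to delete): one batch from
# train_loader should come out as torch.Size([100, 1, 28, 28]),
# i.e. (batch, channel, height, width), matching the walkthrough above.
sample_images, sample_labels = next(iter(train_loader))
print(sample_images.shape)  # torch.Size([100, 1, 28, 28])
print(sample_labels.shape)  # torch.Size([100])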

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # h0 has shape (2, 100, 128), i.e. (num_layers, batch_size, hidden_size)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        # c0 has the same shape as h0
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        # after the LSTM the output is (100, 28, 128), i.e. (batch, sequence_length, hidden_size)
        out, _ = self.lstm(x, (h0, c0))
        # taking the last time step gives (100, 128); the fc linear layer then maps it to (100, 10), i.e. (batch, num_classes)
        out = self.fc(out[:, -1, :])
        return out

model = RNN(input_size, hidden_size, num_layers, num_classes).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # images arrive as (100, 1, 28, 28), i.e. (batch, channel, height, width);
        # the reshape turns them into (100, 28, 28), i.e. (batch, sequence_length, input_size)
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if (i + 1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}' .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, sequence_length, input_size).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))

# Save the model checkpoint
torch.save(model.state_dict(), 'model.ckpt')
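
To reuse the trained weights later, a minimal loading sketch (assuming the same RNN class definition and the model.ckpt file saved above are available):

# rebuild the same architecture, then load the saved weights for inference
loaded = RNN(input_size, hidden_size, num_layers, num_classes).to(device)
loaded.load_state_dict(torch.load('model.ckpt', map_location=device))
loaded.eval()  # switch to evaluation mode before predicting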

After only two epochs the model already reaches 97% accuracy:

Epoch [1/2], Step [100/600], Loss: 0.4901
Epoch [1/2], Step [200/600], Loss: 0.3043
Epoch [1/2], Step [300/600], Loss: 0.0759
Epoch [1/2], Step [400/600], Loss: 0.1782
Epoch [1/2], Step [500/600], Loss: 0.0625
Epoch [1/2], Step [600/600], Loss: 0.1130
Epoch [2/2], Step [100/600], Loss: 0.0707
Epoch [2/2], Step [200/600], Loss: 0.1482
Epoch [2/2], Step [300/600], Loss: 0.1023
Epoch [2/2], Step [400/600], Loss: 0.0196
Epoch [2/2], Step [500/600], Loss: 0.1509
Epoch [2/2], Step [600/600], Loss: 0.0954
Test Accuracy of the model on the 10000 test images: 97.01 %