I followed the code in this article:
https://www.jianshu.com/p/8e447be76478
Many thanks to the author for sharing, but the code in the article has problems. Two changes are needed.
1. Code alignment (indentation)
# Start training
for epoch in range(num_epoches):
    running_loss = 0.0
    running_acc = 0.0
    for i, data in enumerate(train_loader, 1):
        img, label = data
        img = img.squeeze(1)
        if torch.cuda.is_available():
            img = img.cuda()
            label = label.cuda()
        else:
            # Variable is a no-op wrapper in PyTorch >= 0.4
            img = Variable(img)
            label = Variable(label)
        # Forward pass (aligned with the if/else, not nested inside it)
        out = model(img)
        loss = criterion(out, label)
        running_loss += loss.item() * label.size(0)
        _, pred = torch.max(out, 1)
        num_correct = (pred == label).sum()
        running_acc += num_correct.item()
        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if i % 300 == 0:
            print('[{}/{}] Loss: {:.6f}, Acc: {:.6f}'.format(
                epoch + 1, num_epoches, running_loss / (batch_size * i),
                running_acc / (batch_size * i)))
    print('Finish {} epoch, Loss: {:.6f}, Acc: {:.6f}'.format(
        epoch + 1, running_loss / (len(train_dataset)), running_acc / (len(
            train_dataset))))
Without this alignment fix, the model cannot be trained on the GPU.
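As a side note on the device handling, here is a minimal sketch of a device-agnostic variant that avoids the if/else branch altogether. It is not the article's code: the `nn.Linear` model, the random batch, and the learning rate are hypothetical stand-ins chosen only so the snippet runs on its own.

```python
import torch
import torch.nn as nn

# Pick the device once; this falls back to the CPU when CUDA is unavailable.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hypothetical stand-in model and batch, only to make the sketch runnable.
model = nn.Linear(28, 10).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

img = torch.randn(4, 28)
label = torch.randint(0, 10, (4,))

# Move the batch to the same device as the model.
img = img.to(device)
label = label.to(device)

# The forward and backward passes sit outside any device check,
# so the same lines run on both CPU and GPU.
out = model(img)
loss = criterion(out, label)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because nothing is nested under a device check, a misaligned line can no longer cause the training step to be skipped on one device.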
2. Switching between train and eval modes
Running the code as-is produces the following error:
RuntimeError: cudnn RNN backward can only be called in training mode
It can be fixed by following the method in this article:
https://blog.csdn.net/dongwanli666/article/details/103072635
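The error message says that backward was called while the model was not in training mode, which typically happens when `model.eval()` is called for evaluation and the model is never switched back. A common remedy, which may or may not be exactly what the linked article does, is to call `model.train()` at the start of every epoch and `model.eval()` only around the evaluation step. The sketch below assumes a hypothetical `TinyRNN` model and random data in place of the article's model and MNIST loader.

```python
import torch
import torch.nn as nn


class TinyRNN(nn.Module):
    """Hypothetical stand-in for the article's RNN model."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=28, hidden_size=16, batch_first=True)
        self.fc = nn.Linear(16, 10)

    def forward(self, x):
        out, _ = self.lstm(x)
        # Classify from the output at the last time step.
        return self.fc(out[:, -1, :])


model = TinyRNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Random batch shaped like 28x28 images read row by row (assumption).
img = torch.randn(4, 28, 28)
label = torch.randint(0, 10, (4,))

modes = []  # records model.training at each phase, for illustration
for epoch in range(2):
    # Switch back to training mode at the start of every epoch.
    model.train()
    modes.append(model.training)
    out = model(img)
    loss = criterion(out, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Switch to evaluation mode only for the evaluation step.
    model.eval()
    modes.append(model.training)
    with torch.no_grad():
        eval_out = model(img)
```

Without the `model.train()` call inside the loop, the second epoch would run its backward pass in evaluation mode, which is the situation the cuDNN error complains about on the GPU.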