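Related video:
PyTorch Dynamic Neural Networks (莫烦 Python tutorial)
Notes: An earlier note, "PyTorch Notes, Getting Started: Writing a Simple Neural Network 3: CNN (Using the MNIST Dataset as an Example)", covered how to write a simple CNN. This note goes one step further and uses a GPU to speed up training.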
1. Moving the network to the GPU
# Move the neural network to the GPU
cnn.cuda()
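`cnn.cuda()` fails on a machine that has no usable CUDA device. A common alternative is to choose a `torch.device` once and call `.to(device)` everywhere. The sketch below is only an illustration of that pattern, not the code used in this note; `TinyNet` is a hypothetical stand-in for the `cnn` model.

```python
import torch
import torch.nn as nn

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


# Hypothetical stand-in for the CNN class defined later in this note.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, x):
        return self.fc(x.reshape(x.size(0), -1))


# For an nn.Module, .to() moves the parameters in place and returns the module.
net = TinyNet().to(device)

# Check where the parameters now live.
param_device = next(net.parameters()).device
print(param_device.type)
```

With this pattern the same script runs unchanged on machines with and without a GPU.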
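2. Moving the test data to the GPU
# Move the test data to the GPU
# (for tensors, .cuda() returns a copy, so the result must be assigned back)
test_x = test_x.cuda()
test_y = test_y.cuda()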
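The reassignment matters: unlike a module, a tensor is not moved in place. The small sketch below (the tensor names are made up for illustration) shows that the original tensor stays where it was and only the returned tensor lives on the target device.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# A CPU tensor shaped like a small batch of MNIST images.
x = torch.zeros(4, 1, 28, 28)

# .to(device) returns a tensor on the target device; x itself is not modified.
moved = x.to(device)

print(x.device.type)      # cpu
print(moved.device.type)  # same as device.type
```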
3. (During training) Keeping the training data and predictions on the GPU
# Train the network
for epoch in range(EPOCH):
    for step, (batch_x, batch_y) in enumerate(train_loader):
        # Move each training batch to the GPU
        batch_x = batch_x.cuda()
        batch_y = batch_y.cuda()
        output = cnn(batch_x)
        loss = loss_func(output, batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Print progress every 50 steps
        if step % 50 == 0:
            # Evaluation needs no gradients
            with torch.no_grad():
                test_output = cnn(test_x)
            # test_output is already on the GPU, so the predicted labels are too;
            # no extra .cuda() call is needed
            predict_y = torch.max(test_output, 1)[1]
            accuracy = (predict_y == test_y).sum().item() / test_y.size(0)
            print('Epoch', epoch, '|', 'Step', step, '|', 'Loss', loss.item(), '|', 'Test Accuracy', accuracy)
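The accuracy check inside the loop only needs a forward pass, so it can be wrapped in a helper that disables gradient tracking and switches the model to evaluation mode. The `evaluate` function below is a hypothetical helper written for this sketch, and the toy linear model exists only to make the example self-contained.

```python
import torch
import torch.nn as nn


def evaluate(model, x, y):
    """Return the fraction of rows in x whose arg-max prediction equals y."""
    was_training = model.training
    model.eval()                      # disable dropout / batch-norm updates, if any
    with torch.no_grad():             # no autograd graph is built here
        predicted = model(x).argmax(dim=1)
    model.train(was_training)         # restore the previous mode
    return (predicted == y).sum().item() / y.size(0)


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Toy model whose weights are the identity, so input row i is classified as i.
toy = nn.Linear(3, 3).to(device)
with torch.no_grad():
    toy.weight.copy_(torch.eye(3))
    toy.bias.zero_()

x = torch.eye(3, device=device)
y = torch.tensor([0, 1, 2], device=device)
accuracy = evaluate(toy, x, y)
print(accuracy)  # 1.0
```

The model, inputs, and labels all have to be on the same device, otherwise the forward pass or the comparison raises an error.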
4. (During prediction) Moving the data back to the CPU
# Predict
test_output = cnn(test_x[:100])
# A CUDA tensor must be moved back to the CPU before it can be converted to numpy
# Otherwise this error is raised: TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
predict_y = torch.max(test_output, 1)[1].cpu().numpy().squeeze()
real_y = test_y[:100].cpu().numpy()
print(predict_y)
print(real_y)
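The predicted labels are integer indices and carry no gradient, so `.cpu().numpy()` is enough for them. A tensor that is still attached to the autograd graph, such as class probabilities computed from the raw outputs, also needs `.detach()` first. The sketch below uses made-up logits to show both cases.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Made-up network outputs for two samples and three classes.
logits = torch.tensor([[0.1, 2.0, -1.0],
                       [3.0, 0.2, 0.5]], device=device, requires_grad=True)

# Index tensors do not require grad: move to the CPU, then convert.
labels = logits.argmax(dim=1).cpu().numpy()

# Float results still track gradients: detach, move to the CPU, then convert.
probs = torch.softmax(logits, dim=1).detach().cpu().numpy()

print(labels)       # [1 0]
print(probs.shape)  # (2, 3)
```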
5. Comparison
Training on the CPU:
Epoch 0 | Step 0 | Loss 2.3023581504821777 | Test Accuracy 0.2795
Epoch 0 | Step 50 | Loss 0.36932313442230225 | Test Accuracy 0.839
Epoch 0 | Step 100 | Loss 0.17208492755889893 | Test Accuracy 0.9025
Epoch 0 | Step 150 | Loss 0.2834635376930237 | Test Accuracy 0.9025
Epoch 0 | Step 200 | Loss 0.10628349334001541 | Test Accuracy 0.9365
Epoch 0 | Step 250 | Loss 0.07513977587223053 | Test Accuracy 0.949
Epoch 0 | Step 300 | Loss 0.15143314003944397 | Test Accuracy 0.952
Epoch 0 | Step 350 | Loss 0.19321243464946747 | Test Accuracy 0.958
Epoch 0 | Step 400 | Loss 0.08455082774162292 | Test Accuracy 0.963
Epoch 0 | Step 450 | Loss 0.08475902676582336 | Test Accuracy 0.9635
Epoch 0 | Step 500 | Loss 0.14322614669799805 | Test Accuracy 0.966
Epoch 0 | Step 550 | Loss 0.22640569508075714 | Test Accuracy 0.966
Epoch 1 | Step 0 | Loss 0.04606473818421364 | Test Accuracy 0.969
Epoch 1 | Step 50 | Loss 0.35338521003723145 | Test Accuracy 0.9715
Epoch 1 | Step 100 | Loss 0.039717815816402435 | Test Accuracy 0.972
Epoch 1 | Step 150 | Loss 0.10654418915510178 | Test Accuracy 0.9695
Epoch 1 | Step 200 | Loss 0.032110925763845444 | Test Accuracy 0.9745
Epoch 1 | Step 250 | Loss 0.012637133710086346 | Test Accuracy 0.971
Epoch 1 | Step 300 | Loss 0.0625436082482338 | Test Accuracy 0.9735
Epoch 1 | Step 350 | Loss 0.032693102955818176 | Test Accuracy 0.975
Epoch 1 | Step 400 | Loss 0.05973822623491287 | Test Accuracy 0.976
Epoch 1 | Step 450 | Loss 0.22700577974319458 | Test Accuracy 0.9805
Epoch 1 | Step 500 | Loss 0.03670699521899223 | Test Accuracy 0.9725
Epoch 1 | Step 550 | Loss 0.14919476211071014 | Test Accuracy 0.9785
Time cost: 164.68248105049133 s
Training on the GPU:
Epoch 0 | Step 0 | Loss 2.295382499694824 | Test Accuracy 0.1795
Epoch 0 | Step 50 | Loss 0.4366167187690735 | Test Accuracy 0.851
Epoch 0 | Step 100 | Loss 0.1392095685005188 | Test Accuracy 0.915
Epoch 0 | Step 150 | Loss 0.374984472990036 | Test Accuracy 0.925
Epoch 0 | Step 200 | Loss 0.11992576718330383 | Test Accuracy 0.9435
Epoch 0 | Step 250 | Loss 0.09971962124109268 | Test Accuracy 0.955
Epoch 0 | Step 300 | Loss 0.15602746605873108 | Test Accuracy 0.9635
Epoch 0 | Step 350 | Loss 0.10646170377731323 | Test Accuracy 0.963
Epoch 0 | Step 400 | Loss 0.10151582956314087 | Test Accuracy 0.9675
Epoch 0 | Step 450 | Loss 0.050429973751306534 | Test Accuracy 0.97
Epoch 0 | Step 500 | Loss 0.07986892014741898 | Test Accuracy 0.966
Epoch 0 | Step 550 | Loss 0.11002516746520996 | Test Accuracy 0.9665
Epoch 1 | Step 0 | Loss 0.07174035906791687 | Test Accuracy 0.9745
Epoch 1 | Step 50 | Loss 0.1582135409116745 | Test Accuracy 0.9685
Epoch 1 | Step 100 | Loss 0.09163351356983185 | Test Accuracy 0.9805
Epoch 1 | Step 150 | Loss 0.13820190727710724 | Test Accuracy 0.9775
Epoch 1 | Step 200 | Loss 0.0733216404914856 | Test Accuracy 0.978
Epoch 1 | Step 250 | Loss 0.01615101844072342 | Test Accuracy 0.9785
Epoch 1 | Step 300 | Loss 0.0749548077583313 | Test Accuracy 0.978
Epoch 1 | Step 350 | Loss 0.05822641775012016 | Test Accuracy 0.977
Epoch 1 | Step 400 | Loss 0.033135443925857544 | Test Accuracy 0.98
Epoch 1 | Step 450 | Loss 0.07146552950143814 | Test Accuracy 0.9835
Epoch 1 | Step 500 | Loss 0.13729988038539886 | Test Accuracy 0.9795
Epoch 1 | Step 550 | Loss 0.07742690294981003 | Test Accuracy 0.98
Time cost: 7.764967918395996 s
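Training on the CPU took 164.7 s and ended with a test accuracy of 0.9785 (measured on the first 2000 test samples).
Training on the GPU took 7.8 s and ended with a test accuracy of 0.98, so the GPU run was roughly 21 times faster at essentially the same accuracy.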
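One caveat when timing GPU code with `time.time()`: CUDA kernels are launched asynchronously, so the clock can be read before the GPU has finished. In the training loop of this note the periodic `.item()` calls already force the GPU to catch up, so the figures above should be close. For timing smaller pieces of code, a helper along the following lines can be used; `timed` is a hypothetical name introduced for this sketch.

```python
import time
import torch


def timed(fn, device):
    """Run fn() and return (result, elapsed seconds), syncing the GPU if needed."""
    if device.type == 'cuda':
        torch.cuda.synchronize()   # finish any work queued before the measurement
    start = time.perf_counter()
    result = fn()
    if device.type == 'cuda':
        torch.cuda.synchronize()   # wait until fn's kernels have actually finished
    return result, time.perf_counter() - start


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
a = torch.randn(256, 256, device=device)
product, seconds = timed(lambda: a @ a, device)
print(tuple(product.shape), seconds)
```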
6. Complete code
Training on the CPU:
import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.utils.data as Data
from torchvision import datasets, transforms
import matplotlib.pyplot as plt
from torchsummary import summary
import time
# Define the neural network
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.output_layer = nn.Linear(32*7*7, 10)

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = x.reshape(x.size(0), -1)
        output = self.output_layer(x)
        return output
# Hyperparameters
EPOCH = 2
BATCH_SIZE = 100
LR = 0.001
DOWNLOAD = False  # set to False if the MNIST dataset has already been downloaded
# Download the MNIST data
train_data = datasets.MNIST(
    root='./data',                    # where the data is stored
    train=True,                       # True for the training set, False for the test set
    transform=transforms.ToTensor(),  # scales pixel values from 0-255 to 0-1
    download=DOWNLOAD
)
# Old attribute names (deprecated in torchvision)
print(train_data.train_data.size())
print(train_data.train_labels.size())
# New attribute names
print(train_data.data.size())
print(train_data.targets.size())
# Show a few images from the dataset
for i in range(2):
    print(train_data.targets[i].item())
    plt.imshow(train_data.data[i].numpy(), cmap='gray')
    plt.show()
# DataLoader
train_loader = Data.DataLoader(
    dataset=train_data,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=2
)
# Once train_data has been downloaded, test_data is available as well
test_data = datasets.MNIST(
    root='./data',
    train=False
)
print(test_data.data.size())
print(test_data.targets.size())
# Create the network
cnn = CNN()
print(cnn)
# Inspect the network structure
model = CNN()
if torch.cuda.is_available():
    model.cuda()
summary(model, input_size=(1,28,28))
# Optimizer
optimizer = torch.optim.Adam(cnn.parameters(), lr=LR)
# Loss function
loss_func = nn.CrossEntropyLoss()
# To save time, use only the first 2000 test samples
# (Variable(..., volatile=True) is no longer needed: since PyTorch 0.4 plain tensors
# are used directly, and torch.no_grad() replaces volatile)
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000]/255  # scale 0-255 to 0-1
test_y = test_data.targets[:2000]
# # Use the whole test set
# test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)/255  # scale 0-255 to 0-1
# test_y = test_data.targets
# Start timing
start = time.time()
# Train the network
for epoch in range(EPOCH):
    for step, (batch_x, batch_y) in enumerate(train_loader):
        output = cnn(batch_x)
        loss = loss_func(output, batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Print progress every 50 steps
        if step % 50 == 0:
            # Evaluation needs no gradients
            with torch.no_grad():
                test_output = cnn(test_x)
            predict_y = torch.max(test_output, 1)[1]
            accuracy = (predict_y == test_y).sum().item() / test_y.size(0)
            print('Epoch', epoch, '|', 'Step', step, '|', 'Loss', loss.item(), '|', 'Test Accuracy', accuracy)
# Stop timing
end = time.time()
# Training time
print('Time cost:', end - start, 's')
# Predict
test_output = cnn(test_x[:100])
predict_y = torch.max(test_output, 1)[1].numpy().squeeze()
real_y = test_y[:100].numpy()
print(predict_y)
print(real_y)
# Show the predicted and actual labels
for i in range(10):
    print('Predict', predict_y[i])
    print('Real', real_y[i])
    plt.imshow(test_data.data[i].numpy(), cmap='gray')
    plt.show()
Training on the GPU:
import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.utils.data as Data
from torchvision import datasets, transforms
import matplotlib.pyplot as plt
from torchsummary import summary
import time
# Define the neural network
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.output_layer = nn.Linear(32*7*7, 10)

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = x.reshape(x.size(0), -1)
        output = self.output_layer(x)
        return output
# Hyperparameters
EPOCH = 2
BATCH_SIZE = 100
LR = 0.001
DOWNLOAD = False  # set to False if the MNIST dataset has already been downloaded
# Download the MNIST data
train_data = datasets.MNIST(
    root='./data',                    # where the data is stored
    train=True,                       # True for the training set, False for the test set
    transform=transforms.ToTensor(),  # scales pixel values from 0-255 to 0-1
    download=DOWNLOAD
)
# Old attribute names (deprecated in torchvision)
print(train_data.train_data.size())
print(train_data.train_labels.size())
# New attribute names
print(train_data.data.size())
print(train_data.targets.size())
# Show a few images from the dataset
for i in range(2):
    print(train_data.targets[i].item())
    plt.imshow(train_data.data[i].numpy(), cmap='gray')
    plt.show()
# DataLoader
train_loader = Data.DataLoader(
    dataset=train_data,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=2
)
# Once train_data has been downloaded, test_data is available as well
test_data = datasets.MNIST(
    root='./data',
    train=False
)
print(test_data.data.size())
print(test_data.targets.size())
# Create the network
cnn = CNN()
# Move the neural network to the GPU
cnn.cuda()
print(cnn)
# Inspect the network structure
model = CNN()
if torch.cuda.is_available():
    model.cuda()
summary(model, input_size=(1,28,28))
# Optimizer
optimizer = torch.optim.Adam(cnn.parameters(), lr=LR)
# Loss function
loss_func = nn.CrossEntropyLoss()
# To save time, use only the first 2000 test samples
# (Variable(..., volatile=True) is no longer needed: since PyTorch 0.4 plain tensors
# are used directly, and torch.no_grad() replaces volatile)
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000]/255  # scale 0-255 to 0-1
test_y = test_data.targets[:2000]
# # Use the whole test set
# test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)/255  # scale 0-255 to 0-1
# test_y = test_data.targets
# Move the test data to the GPU
test_x = test_x.cuda()
test_y = test_y.cuda()
# Start timing
start = time.time()
# Train the network
for epoch in range(EPOCH):
    for step, (batch_x, batch_y) in enumerate(train_loader):
        # Move each training batch to the GPU
        batch_x = batch_x.cuda()
        batch_y = batch_y.cuda()
        output = cnn(batch_x)
        loss = loss_func(output, batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Print progress every 50 steps
        if step % 50 == 0:
            # Evaluation needs no gradients
            with torch.no_grad():
                test_output = cnn(test_x)
            # test_output is already on the GPU, so the predicted labels are too
            predict_y = torch.max(test_output, 1)[1]
            accuracy = (predict_y == test_y).sum().item() / test_y.size(0)
            print('Epoch', epoch, '|', 'Step', step, '|', 'Loss', loss.item(), '|', 'Test Accuracy', accuracy)
# Wait for queued GPU work to finish so the measured time is accurate
torch.cuda.synchronize()
# Stop timing
end = time.time()
# Training time
print('Time cost:', end - start, 's')
# Predict
test_output = cnn(test_x[:100])
# A CUDA tensor must be moved back to the CPU before it can be converted to numpy
# Otherwise this error is raised: TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
predict_y = torch.max(test_output, 1)[1].cpu().numpy().squeeze()
real_y = test_y[:100].cpu().numpy()
print(predict_y)
print(real_y)
# Show the predicted and actual labels
for i in range(10):
    print('Predict', predict_y[i])
    print('Real', real_y[i])
    plt.imshow(test_data.data[i].numpy(), cmap='gray')
    plt.show()
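The two listings differ only in where the model and tensors live, so they can in principle be merged into one script that selects the device at run time. The following is a condensed sketch of that idea, not the code that produced the timings above. To stay self-contained it trains on random tensors shaped like MNIST images (an assumption made purely for illustration) rather than on the real dataset.

```python
import torch
import torch.nn as nn
import torch.utils.data as Data

# One switch decides where everything runs.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.output_layer = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = self.layer2(self.layer1(x))
        return self.output_layer(x.reshape(x.size(0), -1))


# Assumption: random data stands in for MNIST so the sketch needs no download.
fake_images = torch.rand(200, 1, 28, 28)
fake_labels = torch.randint(0, 10, (200,))
loader = Data.DataLoader(
    Data.TensorDataset(fake_images, fake_labels),
    batch_size=50,
    shuffle=True,
)

cnn = CNN().to(device)
optimizer = torch.optim.Adam(cnn.parameters(), lr=0.001)
loss_func = nn.CrossEntropyLoss()

losses = []
for batch_x, batch_y in loader:
    # Each batch is moved to the same device as the model.
    batch_x, batch_y = batch_x.to(device), batch_y.to(device)
    loss = loss_func(cnn(batch_x), batch_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(len(losses), 'steps on', device.type)
```

Swapping the random tensors for the MNIST `train_loader` from the listings above would give a single script for both the CPU and the GPU case.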