This article uses the MNIST dataset to walk through training a small CNN model.
1. Importing the Torch libraries
import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision  # datasets and image transforms
import matplotlib.pyplot as plt
import os
2. Downloading the data
DOWNLOAD_MNIST = False
if not os.path.exists('./mnist/') or not os.listdir('./mnist/'):
    # No mnist directory, or the mnist directory is empty: download the dataset
    DOWNLOAD_MNIST = True
# MNIST digits dataset
train_data = torchvision.datasets.MNIST(
    root='./mnist/',  # folder where MNIST is stored
    train=True,  # build the dataset from training.pt; otherwise build it from test.pt
    transform=torchvision.transforms.ToTensor(),  # function/transform that takes a PIL image and returns a converted version:
    # a torch.FloatTensor of shape (C x H x W), normalized to the range [0.0, 1.0]
    download=DOWNLOAD_MNIST,  # whether to download
)
Here torchvision.transforms.ToTensor scales the image pixel values from [0, 255] down to [0, 1].
For details, see the official documentation.
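The scaling step described above can be illustrated without PyTorch. The following is only a sketch of the arithmetic: scale_pixels is a hypothetical helper written for this example, and the real ToTensor additionally rearranges the data into (C x H x W) order and returns a tensor rather than a nested list.

```python
def scale_pixels(image):
    """Scale a 2-D list of 8-bit pixel values (0-255) to floats in [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


# A tiny 2x3 "image" standing in for a 28x28 MNIST digit (made-up values).
raw = [[0, 128, 255],
       [64, 32, 16]]
scaled = scale_pixels(raw)
print(scaled[0])
```

Black (0) stays at 0.0 and white (255) becomes 1.0, so every value fed to the network lies in the same small range.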
print(train_data.data.size())  # (60000, 28, 28)
print(train_data.targets.size())  # (60000)
plt.imshow(train_data.data[0].numpy(), cmap='gray')
plt.show()
The image is displayed as follows:
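3. Defining the training and test sets
# BATCH_SIZE is one of the hyperparameters listed in section 5 and must be defined before this line runs
train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
test_data = torchvision.datasets.MNIST(root='./mnist/', train=False)
# Add a channel dimension, convert to float, keep the first 2000 samples, and scale to [0, 1]
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000] / 255.
test_y = test_data.targets[:2000]
Here test_x is converted to a floating-point type and scaled to [0, 1] by hand, because test_data is created without the ToTensor transform. The unsqueeze call adds the channel dimension, giving a shape of (2000, 1, 28, 28).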
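As a rough picture of what train_loader does with shuffle=True, the sketch below splits shuffled sample indices into mini-batches in plain Python. The iter_batches function and its seed parameter are assumptions made for this illustration, not part of the DataLoader API; the real DataLoader draws a new order every epoch and returns tensors rather than index lists.

```python
import random


def iter_batches(num_samples, batch_size, shuffle=True, seed=0):
    """Yield lists of sample indices, one list per mini-batch."""
    indices = list(range(num_samples))
    if shuffle:
        # Reorder the indices; each index still appears exactly once.
        random.Random(seed).shuffle(indices)
    for start in range(0, num_samples, batch_size):
        yield indices[start:start + batch_size]


# 60000 training images with the batch size of 50 used in this article.
batches = list(iter_batches(num_samples=60000, batch_size=50))
print(len(batches))
```

With these numbers, one epoch corresponds to 60000 / 50 = 1200 training steps.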
4. Defining the network structure
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(  # input shape (1, 28, 28)
            nn.Conv2d(
                in_channels=1,  # number of input channels (1 for grayscale images)
                out_channels=16,  # n_filters
                kernel_size=5,  # filter size
                stride=1,  # filter movement/step
                padding=2,  # to keep the same width and height after Conv2d, use padding=(kernel_size-1)/2 when stride=1
            ),  # output shape (16, 28, 28)
            nn.ReLU(),  # activation
            nn.MaxPool2d(kernel_size=2),  # choose max value in 2x2 area, output shape (16, 14, 14)
        )
        self.conv2 = nn.Sequential(  # input shape (16, 14, 14)
            nn.Conv2d(16, 32, 5, 1, 2),  # output shape (32, 14, 14)
            nn.ReLU(),  # activation
            nn.MaxPool2d(2),  # output shape (32, 7, 7)
        )
        self.out = nn.Linear(32 * 7 * 7, 10)  # fully connected layer, output 10 classes

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)  # flatten the output of conv2 to (batch_size, 32 * 7 * 7)
        output = self.out(x)
        return output, x  # return x for visualization

cnn = CNN()
print(cnn)  # net architecture
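The network defines two methods, __init__ and forward.
The __init__ method declares the layers the network will use, while the forward method describes how data flows through the network.
nn.Conv2d is the convolution layer; its parameters are:
nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
The network structure defined by the code above is as follows:
It resembles the early LeNet-5 (1998) architecture, but with a simplified final fully connected stage.
For how the convolution layers are computed, see Andrew Ng's deep learning course:
References: Andrew Ng deeplearning, CNN: an introduction to convolutional neural networks
Explanation of the concepts of feature map, convolution kernel, number of kernels, filter, and channel in CNNs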
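The output shapes noted in the network's comments can be checked with the usual output-size formula for convolution and pooling layers, floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1. The sketch below applies it to one spatial dimension; conv2d_out is a hypothetical helper name, and the pooling stride is assumed to equal the kernel size, which is the MaxPool2d default.

```python
def conv2d_out(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output height/width of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


trace = [28]                                                            # input image is 28x28
trace.append(conv2d_out(trace[-1], kernel_size=5, stride=1, padding=2))  # conv1 convolution
trace.append(conv2d_out(trace[-1], kernel_size=2, stride=2))             # conv1 max pooling
trace.append(conv2d_out(trace[-1], kernel_size=5, stride=1, padding=2))  # conv2 convolution
trace.append(conv2d_out(trace[-1], kernel_size=2, stride=2))             # conv2 max pooling

flat_features = 32 * trace[-1] * trace[-1]  # 32 channels after conv2
print(trace, flat_features)
```

The final spatial size of 7 and the 32 output channels explain the 32 * 7 * 7 input size of the fully connected layer.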
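5. Defining the loss function and optimizer
# Hyperparameters
EPOCH = 1  # number of passes over the training data; to save time, we train for just 1 epoch
BATCH_SIZE = 50
LR = 0.001  # learning rate
optimizer = torch.optim.Adam(cnn.parameters(), lr=LR)  # optimize all cnn parameters
loss_func = nn.CrossEntropyLoss()  # targets are class indices, not one-hot vectors
The cross-entropy loss (CrossEntropyLoss) is used here. It applies softmax internally, so the network outputs raw scores and does not need a softmax layer of its own.
More detail on CrossEntropy will be added later.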
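To make the "softmax inside the loss" point concrete, here is a minimal plain-Python sketch of the per-sample quantity, -log(softmax(scores)[target]). The cross_entropy function is written for this illustration only; nn.CrossEntropyLoss operates on batched tensors and by default averages this value over the batch.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]


# Ten equal scores: the model has no preference among the 10 digits.
uniform = cross_entropy([0.0] * 10, target=3)
# A large score on the correct class gives a loss close to zero.
confident = cross_entropy([0.0] * 9 + [10.0], target=9)
print(uniform, confident)
```

An untrained 10-class model that guesses uniformly has a loss of ln(10), roughly 2.30, which is a useful reference value when watching the first training steps.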
6. Training
for epoch in range(EPOCH):
    for step, (b_x, b_y) in enumerate(train_loader):  # yields batches; x is already scaled to [0, 1] by ToTensor
        output = cnn(b_x)[0]  # forward pass; [0] selects the class scores
        loss = loss_func(output, b_y)  # cross entropy loss
        optimizer.zero_grad()  # clear gradients for this training step
        loss.backward()  # backpropagation, compute gradients
        optimizer.step()  # apply gradients
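Training the neural network involves the following steps:
1. Load the training data and labels;
2. Run the CNN forward pass and compute the loss;
3. Backpropagate and update the weights.
In the last step, the gradients left over from the previous iteration are first cleared with .zero_grad(), then .backward() is called on the loss to compute new gradients, and finally .step() updates the parameters held by the optimizer.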
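The reason for clearing gradients first is that PyTorch adds each newly computed gradient to whatever is already stored. The toy below imitates that behavior in plain Python for a single scalar minimizing (w - 3)^2. ToyParameter, zero_grad, backward, and step are hypothetical stand-ins written for this sketch, not PyTorch functions.

```python
class ToyParameter:
    """A scalar whose gradient accumulates across backward calls."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def zero_grad(param):
    param.grad = 0.0


def backward(param, target):
    # d/dw (w - target)^2 = 2 * (w - target), added to the stored gradient.
    param.grad += 2.0 * (param.value - target)


def step(param, lr):
    param.value -= lr * param.grad  # plain gradient descent update


w = ToyParameter(0.0)
for _ in range(100):
    zero_grad(w)             # 1. clear the old gradient
    backward(w, target=3.0)  # 2. compute the new gradient
    step(w, lr=0.1)          # 3. update the parameter

# Without zero_grad, two backward calls leave twice the gradient behind.
stale = ToyParameter(0.0)
backward(stale, target=3.0)
backward(stale, target=3.0)
print(w.value, stale.grad)
```

The loop converges to the minimum at 3, while the second parameter shows how a skipped zero_grad would leave a doubled gradient for the next update.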
7. Testing
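The original gives no code for this step. One common way to evaluate the model, offered here only as an assumption about what the author intended, is to take the highest-scoring class for each test sample and compare it with test_y. The plain-Python sketch below shows that accuracy computation on made-up scores; in PyTorch the predictions would typically come from something like torch.max(test_output, 1)[1], where test_output = cnn(test_x)[0].

```python
def accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # index of the largest score
        correct += int(predicted == label)
    return correct / len(labels)


# Made-up scores for 4 samples and 3 classes, purely for illustration.
scores = [[0.1, 2.0, -1.0],
          [1.5, 0.2, 0.3],
          [0.0, 0.1, 0.9],
          [0.4, 0.3, 0.2]]
labels = [1, 0, 2, 1]
result = accuracy(scores, labels)
print(result)
```

Three of the four made-up samples are classified correctly, so the function returns 0.75.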
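8. Saving the model
torch.save(cnn, 'net.pkl')  # save the entire network
To use the model later, just load it back:
torch.load('net.pkl')  # recent PyTorch versions may require weights_only=False to load a whole pickled model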