Notes from working through video tutorials.
Original tutorial: https://morvanzhou.github.io/tutorials/machine-learning/torch/4-01-CNN/
Problems I ran into:
1.
EOFError: Compressed file ended before the end-of-stream marker was reached
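The download was interrupted, so the archive is incomplete. Find the file in the download directory, delete it, and download it again.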
2.
UserWarning: test_data has been renamed data
  warnings.warn("test_data has been renamed data")
UserWarning: test_labels has been renamed targets
  warnings.warn("test_labels has been renamed targets")
Fix: use the new attribute names.
test_data.test_data → test_data.data
test_data.test_labels → test_data.targets
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:10000]/255.
test_y = test_data.targets[:10000]
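For problem 1 above, it can help to check which downloaded archive is actually broken before deleting anything. The following is a minimal sketch using only the standard library. The helper name `find_corrupt_gzip_files` is made up for illustration, and it assumes the MNIST archives are stored as `.gz` files somewhere under the `./mnist` root used in this post.

```python
import gzip
import os
import zlib


def find_corrupt_gzip_files(directory):
    """Return paths of .gz files under `directory` that cannot be fully decompressed."""
    corrupt = []
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            if not name.endswith(".gz"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with gzip.open(path, "rb") as f:
                    # Read in chunks until the end-of-stream marker is reached.
                    while f.read(1024 * 1024):
                        pass
            except (EOFError, OSError, zlib.error):
                # A truncated file raises EOFError; other damage raises OSError or zlib.error.
                corrupt.append(path)
    return sorted(corrupt)


if __name__ == "__main__":
    # "./mnist" matches the root used in this post; adjust it if yours differs.
    for path in find_corrupt_gzip_files("./mnist"):
        print("corrupt, delete and re-download:", path)
```

After deleting the reported files, setting DOWNLOAD_MNIST = True for one run should fetch them again.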
import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision
import matplotlib.pyplot as plt
# hyper parameters
LR = 0.001 # learning rate
BATCH_SIZE = 50
EPOCH = 1
DOWNLOAD_MNIST = False  # set to False if the dataset has already been downloaded, otherwise True
train_data = torchvision.datasets.MNIST(
    root='./mnist',
    train=True,  # True means load the training split
    transform=torchvision.transforms.ToTensor(),
    download=DOWNLOAD_MNIST
)
# print(train_data.data.size())
# print(train_data.targets.size())
# plt.imshow(train_data.data[0].numpy(), cmap='gray')
# plt.title('%i' % train_data.targets[0])
# plt.show()
train_loader = Data.DataLoader(
    dataset=train_data,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=0
)
test_data = torchvision.datasets.MNIST(
    root='./mnist',
    train=False
)
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000] / 255.
test_y = test_data.targets[:2000]
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(                      # input shape (1, 28, 28)
                in_channels=1,
                out_channels=16,
                kernel_size=5,
                stride=1,                   # how many pixels the kernel moves per step
                padding=2,                  # to keep the size when stride = 1: padding = (kernel_size-1)/2 = (5-1)/2
            ),                              # → (16, 28, 28)
            nn.ReLU(),                      # → (16, 28, 28)
            nn.MaxPool2d(kernel_size=2),    # → (16, 14, 14)
        )
        self.conv2 = nn.Sequential(         # input shape (16, 14, 14)
            nn.Conv2d(16, 32, 5, 1, 2),     # → (32, 14, 14)
            nn.ReLU(),                      # → (32, 14, 14)
            nn.MaxPool2d(2)                 # → (32, 7, 7)
        )
        self.out = nn.Linear(32 * 7 * 7, 10)  # 10 is the number of classes

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)                   # (batch, 32, 7, 7)
        x = x.view(x.size(0), -1)           # flatten to (batch, 32 * 7 * 7)
        output = self.out(x)
        return output

cnn = CNN()
# print(cnn) # net architecture
optimizer = torch.optim.Adam(cnn.parameters(), lr=LR) # optimize all cnn parameters
loss_func = nn.CrossEntropyLoss() # targets are class indices, not one-hot; cross-entropy is the usual loss for classification
for epoch in range(EPOCH):
    for step, (x, y) in enumerate(train_loader):
        output = cnn(x)                 # forward pass on one mini-batch
        loss = loss_func(output, y)     # cross-entropy loss for this batch
        optimizer.zero_grad()           # clear gradients from the previous step
        loss.backward()                 # back propagation, compute gradients
        optimizer.step()                # apply gradients

        if step % 50 == 0:
            test_output = cnn(test_x)
            pred_y = torch.max(test_output, 1)[1].numpy()  # index of the largest score = predicted class
            accuracy = (pred_y == test_y.numpy()).sum() / test_y.size(0)
            print('Epoch: ', epoch, '| train loss: %.4f' % loss.item(), '| test accuracy: %.2f' % accuracy)

# print 10 predictions from test data
test_output = cnn(test_x[:10])
pred_y = torch.max(test_output, 1)[1].numpy()
print(pred_y, 'prediction number')
print(test_y[:10].numpy(), 'real number')
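The shape comments in the CNN class, and the 32 * 7 * 7 input size of the final linear layer, can be checked by hand with the usual output-size formula for convolution and pooling layers, out = floor((in + 2 * padding - kernel_size) / stride) + 1. Below is a small sketch in plain Python. The function names are my own and are not part of PyTorch, and dilation is assumed to be 1.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (dilation assumed to be 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def cnn_feature_shape(size=28):
    """Follow a size x size image through conv1 and conv2 of the CNN in this post."""
    # conv1: Conv2d(1, 16, kernel_size=5, stride=1, padding=2) keeps the spatial size
    size = conv_output_size(size, kernel_size=5, stride=1, padding=2)
    # MaxPool2d(2): the stride defaults to the kernel size, so the size is halved
    size = conv_output_size(size, kernel_size=2, stride=2)
    # conv2: Conv2d(16, 32, 5, 1, 2) keeps the spatial size again
    size = conv_output_size(size, kernel_size=5, stride=1, padding=2)
    # second MaxPool2d(2) halves it once more
    size = conv_output_size(size, kernel_size=2, stride=2)
    return (32, size, size)


channels, height, width = cnn_feature_shape()
print((channels, height, width))    # (32, 7, 7)
print(channels * height * width)    # 1568, the in_features of nn.Linear
```

With stride 1, choosing padding = (kernel_size - 1) / 2 makes the formula return the input size unchanged, which is why the convolutions do not shrink the image and only the pooling layers do.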
Notes on some of the functions used:
1. torchvision.transforms.ToTensor()
It does more than just convert an image into a tensor. Its source code:
class ToTensor(object):
    """Convert a ``PIL Image`` or ``numpy.ndarray`` to tensor.

    Converts a PIL Image or numpy.ndarray (H x W x C) in the range
    [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
    if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1)
    or if the numpy.ndarray has dtype = np.uint8

    In the other cases, tensors are returned without scaling.
    """

    def __call__(self, pic):
        """
        Args:
            pic (PIL Image or numpy.ndarray): Image to be converted to tensor.

        Returns:
            Tensor: Converted image.
        """
        return F.to_tensor(pic)

    def __repr__(self):
        return self.__class__.__name__ + '()'
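From the docstring:
If the PIL image is in one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1), or the numpy.ndarray has dtype = np.uint8, ToTensor converts a PIL image or numpy.ndarray of shape (H x W x C) with values in [0, 255] into a torch.FloatTensor of shape (C x H x W) with values in [0.0, 1.0].
In all other cases, the tensor is returned without scaling.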
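2. torch.unsqueeze
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000] / 255.
This inserts a channel dimension, so the data goes from shape (2000, 28, 28) to (2000, 1, 28, 28), and the division by 255 brings the values into [0, 1]. The manual scaling is needed because test_data.data is the raw uint8 tensor, so ToTensor is never applied to it.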
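The same steps can be tried on a small random tensor without loading MNIST. This is only a sketch: `raw` is a stand-in I made up for test_data.data, with 5 fake images instead of the real test set.

```python
import torch

# Stand-in for test_data.data: 5 fake grayscale images, uint8 values in [0, 255].
raw = torch.randint(0, 256, (5, 28, 28), dtype=torch.uint8)

x = torch.unsqueeze(raw, dim=1)        # add a channel dimension at position 1
x = x.type(torch.FloatTensor) / 255.   # uint8 -> float32, then scale to [0, 1]

print(raw.shape)   # torch.Size([5, 28, 28])
print(x.shape)     # torch.Size([5, 1, 28, 28])
print(x.dtype)     # torch.float32
```

The extra dimension is what nn.Conv2d expects: its input has the layout (batch, channels, height, width).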
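3. x.view(x.size(0), -1)
x.size(0) is the number of samples in the current batch, so during training x = x.view(x.size(0), -1) behaves like x = x.view(BATCH_SIZE, -1). Using x.size(0) is safer, though, because the same forward pass also receives test_x, where the first dimension is 2000 rather than BATCH_SIZE.
-1 means the number of columns is not given explicitly: it is inferred from the total number of elements in the tensor and the first dimension, here 32 * 7 * 7 = 1568.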
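A quick sketch of why x.size(0) is preferable to a hard-coded batch size. The tensors here are zeros with the shape that conv2 produces, and the 20-sample batch is a hypothetical example of an input whose first dimension is not BATCH_SIZE.

```python
import torch

BATCH_SIZE = 50
full = torch.zeros(BATCH_SIZE, 32, 7, 7)   # a full training batch after conv2
partial = torch.zeros(20, 32, 7, 7)        # hypothetical batch with only 20 samples

flat_full = full.view(full.size(0), -1)
flat_partial = partial.view(partial.size(0), -1)
print(flat_full.shape)      # torch.Size([50, 1568])
print(flat_partial.shape)   # torch.Size([20, 1568])

# Hard-coding BATCH_SIZE fails here: 20 * 1568 elements cannot be split into 50 rows.
try:
    partial.view(BATCH_SIZE, -1)
    hard_coded_ok = True
except RuntimeError as err:
    hard_coded_ok = False
    print("view(BATCH_SIZE, -1) failed:", err)
```

In both working cases the -1 resolves to 1568, which matches the in_features of the final nn.Linear layer.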