Data Mining: Lab 3
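1. A New Activation Function: ReLU
In recent years, convolutional neural networks have tended to use the ReLU function as the activation function rather than sigmoid or tanh. ReLU is defined as: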
$$f(x) = \max(0, x)$$
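As a minimal sketch, the definition can be written directly in plain Python. The helper names `relu` and `relu_grad` are illustrative and not part of the original text; PyTorch's `F.relu`, used later in this report, applies the same rule element-wise to tensors.

```python
def relu(x: float) -> float:
    """Return max(0, x), the ReLU activation of a single value."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Return the derivative of ReLU: 1 for x > 0, otherwise 0.

    The derivative at exactly x = 0 is undefined; returning 0 there is a
    common convention and an assumption of this sketch.
    """
    return 1.0 if x > 0 else 0.0


if __name__ == "__main__":
    for value in (-2.0, 0.0, 3.5):
        print(value, relu(value), relu_grad(value))
```

Negative inputs are clipped to zero, while positive inputs pass through unchanged with a slope of 1.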
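Using ReLU as the activation function has the following main advantages:
1. Speed:
The sigmoid function requires computing an exponential and a reciprocal, whereas ReLU is just max(0, x), which is much cheaper to compute.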
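2. Mitigating the vanishing gradient problem:
The gradient is computed as $\nabla = \sigma' \delta x$, where $\sigma'$ is the derivative of the sigmoid function. During backpropagation, the gradient is multiplied by a factor of $\sigma'$ every time it passes through a layer of sigmoid neurons. As the plot of $\sigma'$ shows, its maximum value is 1/4. Each multiplication by $\sigma'$ therefore makes the gradient smaller, which is a serious problem when training deep networks. The derivative of ReLU, by contrast, is 1 for positive inputs (and 0 for negative inputs), so it does not shrink the gradient along active paths. Of course, the activation function is only one of the factors that cause gradients to shrink, but in this respect ReLU still performs better than sigmoid. Using ReLU as the activation function makes it possible to train deeper networks.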
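3. Sparsity:
Studies of the brain suggest that only about 5% of neurons are active at any given time, whereas an artificial neural network with sigmoid activations has an activation rate of roughly 50%. Some papers claim that an activation rate of 15%-30% is close to ideal for artificial neural networks. Because ReLU outputs exactly zero for inputs below 0, it can achieve a lower activation rate.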
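The second and third points can be checked numerically. The sketch below is illustrative only: the function names are made up for this example, and the standard-normal input distribution used for the sparsity check is an assumption, not something the text specifies.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic sigmoid (only called with moderate x here, so exp cannot overflow)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def max_gradient_factor(num_layers: int) -> float:
    """Largest possible product of sigmoid derivatives across num_layers layers.

    The derivative peaks at x = 0 with value 1/4, so the product is at most
    (1/4) ** num_layers.
    """
    return sigmoid_grad(0.0) ** num_layers


def zero_fraction_after_relu(num_samples: int, seed: int = 0) -> float:
    """Fraction of standard-normal inputs (an assumption) that ReLU maps to exactly 0."""
    rng = random.Random(seed)
    zeros = sum(
        1 for _ in range(num_samples) if max(0.0, rng.gauss(0.0, 1.0)) == 0.0
    )
    return zeros / num_samples


if __name__ == "__main__":
    print(sigmoid_grad(0.0))               # peak of the sigmoid derivative
    print(max_gradient_factor(10))         # bound after 10 sigmoid layers
    print(zero_fraction_after_relu(10_000))
```

After ten sigmoid layers the derivative factors alone can scale the gradient by at most (1/4)^10, which is below one in a million. For inputs symmetric around zero, roughly half of the ReLU outputs are exactly zero, whereas a sigmoid output is always strictly positive.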
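2. Training an Image Classifier
1. Load and normalize the CIFAR10 training and test datasets using torchvision
This step uses PyTorch and the torchvision library to load and preprocess CIFAR-10, a dataset commonly used for image classification tasks. CIFAR-10 contains 60,000 32x32 color images in 10 classes, with 6,000 images per class. Of these, 50,000 images are used for training and 10,000 for testing.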
import torch
import torchvision
import torchvision.transforms as transforms
# transforms.ToTensor() converts each image to a tensor; transforms.Normalize() then
# normalizes every channel with mean (0.5, 0.5, 0.5) and standard deviation (0.5, 0.5, 0.5).
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
# Use a batch size of 4.
batch_size = 4
# Load the CIFAR-10 training set with torchvision.datasets.CIFAR10: root is the download
# directory, train=True selects the training split, and transform is the preprocessing above.
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
# Create the data loader for the training set with torch.utils.data.DataLoader.
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)
# Load the CIFAR-10 test set; train=False selects the test split, with the same preprocessing.
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
# Create the data loader for the test set, analogous to the training loader.
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)
# The ten class names of the dataset.
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
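If you are running on Windows and get a BrokenPipeError, try setting the num_workers argument of torch.utils.data.DataLoader() to 0.
Let us display some of the training images to get a feel for the data. The code below defines a function imshow() for displaying an image, loads one batch of training images from trainloader, and shows them in a grid.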
import matplotlib.pyplot as plt
import numpy as np
# functions to show an image
def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()
# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)
# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join(f'{classes[labels[j]]:5s}' for j in range(batch_size)))
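To see why imshow() computes `img / 2 + 0.5`, here is a minimal sketch of the per-channel arithmetic on single pixel values. The helper names `normalize` and `unnormalize` are made up for this illustration; the formula `(value - mean) / std` is the rule that transforms.Normalize applies, and ToTensor() scales pixel values to the range [0, 1] beforehand.

```python
def normalize(value: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Apply the per-channel rule used by transforms.Normalize: (value - mean) / std."""
    return (value - mean) / std


def unnormalize(value: float) -> float:
    """Invert normalize() for mean = std = 0.5, as imshow() does with img / 2 + 0.5."""
    return value / 2 + 0.5


if __name__ == "__main__":
    for pixel in (0.0, 0.25, 0.5, 1.0):
        z = normalize(pixel)
        print(pixel, z, unnormalize(z))
```

With mean and standard deviation both 0.5, pixel values in [0, 1] are mapped to [-1, 1], and the unnormalize step maps them back so that matplotlib can display the image.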
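2. Define a convolutional neural network
Define a neural network model named Net for the image classification task. The model is built from convolutional layers (Conv2d), a pooling layer (MaxPool2d), and fully connected layers (Linear); it contains two convolutional layers and three fully connected layers.
self.conv1 = nn.Conv2d(3, 6, 5): the first convolutional layer, with 3 input channels, 6 output channels, and a 5x5 kernel.
self.pool = nn.MaxPool2d(2, 2): the max pooling layer, with a 2x2 window and a stride of 2, used to reduce the spatial size of the feature maps.
self.conv2 = nn.Conv2d(6, 16, 5): the second convolutional layer, with 6 input channels, 16 output channels, and a 5x5 kernel.
self.fc1 = nn.Linear(16 * 5 * 5, 120): the first fully connected layer, with 16x5x5 input features and 120 output features.
self.fc2 = nn.Linear(120, 84): the second fully connected layer, with 120 input features and 84 output features.
self.fc3 = nn.Linear(84, 10): the last fully connected layer, with 84 input features and 10 output features, one for each of the 10 CIFAR-10 classes.
def forward(self, x): the model's forward pass, which defines how the input x flows through the convolutional, pooling, and fully connected layers.
net = Net(): creates an instance of the network.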
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = Net()
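The input size 16 * 5 * 5 of fc1 follows from the layer settings above. The sketch below traces the spatial size of a 32x32 CIFAR-10 image through the network using the standard output-size formula for convolution and pooling without padding or dilation; the helper names are illustrative, not part of the original code.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_net_shapes(input_size: int = 32) -> list:
    """Return (layer, channels, height/width) after each conv and pool step of Net."""
    shapes = []
    size = conv_output_size(input_size, kernel=5)      # conv1: 5x5 kernel
    shapes.append(("conv1", 6, size))
    size = conv_output_size(size, kernel=2, stride=2)  # pool: 2x2 window, stride 2
    shapes.append(("pool", 6, size))
    size = conv_output_size(size, kernel=5)            # conv2: 5x5 kernel
    shapes.append(("conv2", 16, size))
    size = conv_output_size(size, kernel=2, stride=2)  # pool again
    shapes.append(("pool", 16, size))
    return shapes


if __name__ == "__main__":
    for name, channels, size in trace_net_shapes():
        print(f"{name:5s} -> {channels} x {size} x {size}")
    _, channels, size = trace_net_shapes()[-1]
    print("flattened features:", channels * size * size)
```

The spatial size goes 32 -> 28 -> 14 -> 10 -> 5, so the flattened feature vector has 16 * 5 * 5 = 400 entries, which matches the first argument of fc1.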
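3. Define a loss function and optimizer
nn.CrossEntropyLoss(): the cross-entropy loss function, commonly used for multi-class classification.
optim.SGD(net.parameters(), lr=0.001, momentum=0.9): the stochastic gradient descent (SGD) optimizer, which updates the network's weights. The learning rate (lr) and momentum are configured as hyperparameters.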
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
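To make the two components more concrete, here is a rough plain-Python sketch of what they compute for a single sample and a single scalar parameter. It is a simplified illustration, not PyTorch's implementation: the function names are made up, the loss is shown for one sample (nn.CrossEntropyLoss averages over the batch by default), and the momentum step assumes no dampening, weight decay, or Nesterov correction.

```python
import math


def cross_entropy(logits: list, label: int) -> float:
    """Cross-entropy for one sample: -log(softmax(logits)[label])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    return log_sum_exp - logits[label]


def sgd_momentum_step(param: float, grad: float, velocity: float,
                      lr: float = 0.001, momentum: float = 0.9) -> tuple:
    """One SGD-with-momentum update for a scalar parameter.

    velocity <- momentum * velocity + grad
    param    <- param - lr * velocity
    """
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


if __name__ == "__main__":
    # Ten equal logits model an untrained network with no class preference.
    print(cross_entropy([0.0] * 10, label=3))
    print(sgd_momentum_step(param=1.0, grad=2.0, velocity=0.0))
```

When all ten logits are equal, the loss is ln(10), about 2.30, which is roughly the value to expect at the very start of training on a 10-class problem.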
4. Train the network on the training data
Loop over the data iterator, feed the inputs to the network, and optimize.
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0
print('Finished Training')
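The logging condition `i % 2000 == 1999` determines how many loss lines appear per epoch. The following sketch works through the arithmetic for the 50,000 training images and batch size of 4 used here; the helper names are hypothetical and exist only for this illustration.

```python
def batches_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of mini-batches per epoch when the last partial batch is kept."""
    return -(-num_samples // batch_size)  # ceiling division


def report_steps(num_batches: int, every: int = 2000) -> list:
    """1-based batch counts at which the training loop prints the running loss."""
    return [i + 1 for i in range(num_batches) if i % every == every - 1]


if __name__ == "__main__":
    n = batches_per_epoch(50_000, 4)
    print(n)
    print(report_steps(n))
```

Each epoch has 12,500 mini-batches, so the loop prints six averaged losses per epoch (after batches 2000 through 12000). The loss of the last 500 batches of an epoch is accumulated but never printed, because running_loss is reset at the start of the next epoch.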
Quickly save the trained model:
PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)
5. Test the network on the test data
Fetch one batch of images and their labels from the data loader testloader and display the images in a grid.
dataiter = iter(testloader)
images, labels = next(dataiter)
# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join(f'{classes[labels[j]]:5s}' for j in range(4)))
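Create a fresh Net() instance and load the saved weights from PATH, then use the model to predict the images loaded above. The results are stored in outputs, which holds one score per class for each image. The class with the highest score is taken as the prediction, and its index is stored in predicted.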
net = Net()
net.load_state_dict(torch.load(PATH))
outputs = net(images)
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join(f'{classes[predicted[j]]:5s}'
                              for j in range(4)))
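The network's outputs are raw scores rather than probabilities. As a minimal plain-Python sketch (the helpers `softmax` and `argmax` are written here for illustration only), a softmax turns the scores into probabilities, and picking the largest score, as `torch.max(outputs, 1)` does along the class dimension, selects the same class either way.

```python
import math


def softmax(scores: list) -> list:
    """Convert raw class scores into probabilities that sum to 1."""
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values: list) -> int:
    """Index of the largest value (the first one if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])


if __name__ == "__main__":
    scores = [1.0, 3.0, 0.5]  # made-up scores for three classes
    probs = softmax(scores)
    print(probs)
    print(argmax(scores), argmax(probs))
```

Because softmax is monotonic, it is not needed for choosing the predicted class; it only matters when the actual probabilities are of interest.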
Now look at how the network performs on the whole test set.
correct = 0
total = 0
# since we're not training, we don't need to calculate the gradients for our outputs
with torch.no_grad():
    for data in testloader:
        images, labels = data
        # calculate outputs by running images through the network
        outputs = net(images)
        # the class with the highest energy is what we choose as prediction
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f'Accuracy of the network on the 10000 test images: {100 * correct // total} %')
Take a closer look at the accuracy for each class.
# prepare to count predictions for each class
correct_pred = {classname: 0 for classname in classes}
total_pred = {classname: 0 for classname in classes}
# again no gradients needed
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predictions = torch.max(outputs, 1)
        # collect the correct predictions for each class
        for label, prediction in zip(labels, predictions):
            if label == prediction:
                correct_pred[classes[label]] += 1
            total_pred[classes[label]] += 1
# print accuracy for each class
for classname, correct_count in correct_pred.items():
    accuracy = 100 * float(correct_count) / total_pred[classname]
    print(f'Accuracy for class: {classname:5s} is {accuracy:.1f} %')
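The bookkeeping in the two evaluation loops can be illustrated on plain Python lists. This is a small sketch with made-up labels and predictions; the function names are hypothetical, and classes without any samples are skipped here to avoid a division by zero.

```python
def overall_accuracy(labels: list, predictions: list) -> float:
    """Percentage of positions where the prediction equals the label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return 100 * correct / len(labels)


def per_class_accuracy(labels: list, predictions: list, class_names: tuple) -> dict:
    """Percentage of correctly predicted samples for each class that occurs."""
    correct = {name: 0 for name in class_names}
    total = {name: 0 for name in class_names}
    for y, p in zip(labels, predictions):
        if y == p:
            correct[class_names[y]] += 1
        total[class_names[y]] += 1
    return {name: 100 * correct[name] / total[name]
            for name in class_names if total[name] > 0}


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 1]        # made-up ground truth
    predictions = [0, 1, 1, 1, 0]   # made-up model output
    print(overall_accuracy(labels, predictions))
    print(per_class_accuracy(labels, predictions, ("cat", "dog")))
```

Note that the overall accuracy in the test loop is printed with `100 * correct // total`, which is integer floor division, so the fractional part is discarded, while the per-class loop uses true division and prints one decimal place.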
6. Complete code
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
batch_size = 4
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=0)
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=0)
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()
# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)
# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join(f'{classes[labels[j]]:5s}' for j in range(batch_size)))
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0
print('Finished Training')
PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)
dataiter = iter(testloader)
images, labels = next(dataiter)
# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join(f'{classes[labels[j]]:5s}' for j in range(4)))
net = Net()
net.load_state_dict(torch.load(PATH))
outputs = net(images)
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join(f'{classes[predicted[j]]:5s}'
                              for j in range(4)))
correct = 0
total = 0
# since we're not training, we don't need to calculate the gradients for our outputs
with torch.no_grad():
    for data in testloader:
        images, labels = data
        # calculate outputs by running images through the network
        outputs = net(images)
        # the class with the highest energy is what we choose as prediction
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f'Accuracy of the network on the 10000 test images: {100 * correct // total} %')
# prepare to count predictions for each class
correct_pred = {classname: 0 for classname in classes}
total_pred = {classname: 0 for classname in classes}
# again no gradients needed
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predictions = torch.max(outputs, 1)
        # collect the correct predictions for each class
        for label, prediction in zip(labels, predictions):
            if label == prediction:
                correct_pred[classes[label]] += 1
            total_pred[classes[label]] += 1
# print accuracy for each class
for classname, correct_count in correct_pred.items():
    accuracy = 100 * float(correct_count) / total_pred[classname]
    print(f'Accuracy for class: {classname:5s} is {accuracy:.1f} %')
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)
del dataiter
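The `device` defined above is printed but not used by the rest of the script. A minimal sketch of how it could be used, following the pattern from the PyTorch tutorial referenced below: move the model with `.to(device)` and move every batch to the same device before the forward pass. `TinyNet` and the random batch are placeholders invented for this sketch so that it runs without downloading CIFAR-10; in the real script the same `.to(device)` calls would be applied to `net` and to `inputs` and `labels` inside the training loop.

```python
import torch
import torch.nn as nn

# Use the first GPU if one is available, otherwise fall back to the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')


class TinyNet(nn.Module):
    """Placeholder model (not the Net from this report): one linear layer."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(3 * 32 * 32, 10)

    def forward(self, x):
        return self.fc(torch.flatten(x, 1))


# Move the model's parameters to the chosen device.
net = TinyNet().to(device)

# A random stand-in for one CIFAR-10 batch of 4 images and 4 labels.
inputs = torch.randn(4, 3, 32, 32).to(device)
labels = torch.randint(0, 10, (4,)).to(device)

# Forward and backward pass run on the device that holds the model and the data.
outputs = net(inputs)
loss = nn.CrossEntropyLoss()(outputs, labels)
loss.backward()
print(device, tuple(outputs.shape), loss.item())
```

The model and every input tensor must live on the same device; otherwise PyTorch raises a runtime error during the forward pass.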
The content above is based on: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html