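Notes on Building Neural Networks with PyTorch (Part 2)
3. Neural Networks and Deep Learning
3.1 Origins of the Fashion-MNIST Dataset
- A computer program generally consists of two main parts: code and data
- In deep learning, the software is the network itself, in particular the weights produced by training
- The job of a neural network programmer is to supervise and guide the learning process through training (this can be seen as an indirect way of writing software)
3.1.1 The Fashion-MNIST dataset
MNIST is a very well-known dataset of handwritten digits (M: Modified; NIST: National Institute of Standards and Technology)
MNIST contains 70,000 images: 60,000 for training and 10,000 for testing, across ten classes (the digits 0-9)
Fashion-MNIST comes from Zalando: 10 classes corresponding to 10 kinds of clothing; 70,000 grayscale images of 28x28 pixels
Fashion-MNIST is intended to replace MNIST as a benchmark for testing machine learning algorithms
Differences and similarities between Fashion-MNIST and MNIST: (1) difference: the MNIST images are handwritten digits, while the Fashion-MNIST images are real-world images of clothing; (2) similarity: the two datasets have the same size, image dimensions, data format, and train/test split
Why MNIST is so popular: 1. its size lets deep learning researchers check and reproduce their algorithms quickly; 2. it is available in all the major deep learning frameworks
The torchvision package in PyTorch can load the Fashion-MNIST dataset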
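3.2 Importing and Loading the Dataset with torchvision
3.2.1 Workflow of a deep learning project:
1. Prepare the dataset
2. Build the network model
3. Train the network model
4. Analyze the results
3.2.2 Data preparation follows the ETL process:
Extract, transform, load
The packages that ship with PyTorch make the ETL process simple
3.2.3 Preparing the data:
1. Extract: get the Fashion-MNIST image data from the source
2. Transform: convert the data into tensors
3. Load: wrap the data in an object so that it is easier to access
The biggest difference between loading Fashion-MNIST and loading MNIST is the URL the data is downloaded from
torch.utils.data.Dataset: an abstract class that represents a dataset
torch.utils.data.DataLoader: wraps a dataset and provides access to the underlying data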
import torch
import torchvision
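import torchvision.transforms as transforms  # helpers for transforming the data
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',  # where the dataset is stored locally
    train = True,                  # use the training split
    download = True,               # download the data if it is not present locally
    transform = transforms.Compose([
        transforms.ToTensor()
    ])                             # convert the images to tensors
)
train_loader = torch.utils.data.DataLoader(train_set)
# The training set is wrapped in a data loader, so the underlying data can be accessed in the format we want;
# the data loader gives us access to the data and provides querying capabilities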
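To make the roles of Dataset and DataLoader more concrete, here is a minimal sketch of a custom dataset. ToyImageDataset and its random contents are hypothetical stand-ins invented for illustration (they are not Fashion-MNIST data); only the Dataset/DataLoader interface itself comes from PyTorch.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyImageDataset(Dataset):
    """Hypothetical in-memory dataset of random 1x28x28 'images' (illustration only)."""

    def __init__(self, num_samples=20, num_classes=10):
        generator = torch.Generator().manual_seed(0)
        self.images = torch.rand(num_samples, 1, 28, 28, generator=generator)
        self.labels = torch.randint(0, num_classes, (num_samples,), generator=generator)

    def __len__(self):
        # Number of samples in the dataset
        return len(self.labels)

    def __getitem__(self, index):
        # One (image, label) pair, the same layout FashionMNIST samples use
        return self.images[index], self.labels[index]


toy_set = ToyImageDataset()
toy_loader = DataLoader(toy_set, batch_size=5)  # groups samples into batches of 5
images, labels = next(iter(toy_loader))
print(len(toy_set), images.shape, labels.shape)
```

A dataset only needs __len__ and __getitem__; batching, shuffling and parallel loading are left to the DataLoader.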
Warning raised (a UserWarning, not an error):
D:\Anaconda3_install\envs\pytorch_1.9\lib\site-packages\torchvision\datasets\mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:180.)
return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
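Fix: modify the mnist.py file
Reference (link title): "Pytorch | Error: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensor"
That post resolved the warning for me
What I did:
Clicking the warning jumps straight into mnist.py, at the line containing copy=False; deleting copy=False there makes the warning go away.
Adjusting batch_size
The linked post explains what batch_size means.
Reference (link title): "The meaning of the batch_size parameter in neural networks and how to set it"
It covers some tips for choosing the value and the trade-offs involved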
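3.3 Accessing the Dataset
import torch
import torchvision
import torchvision.transforms as transforms  # helpers for transforming the data
import numpy as np
import matplotlib.pyplot as plt
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',  # where the dataset is stored locally
    train = True,                  # use the training split
    download = True,               # download the data if it is not present locally
    transform = transforms.Compose([
        transforms.ToTensor()
    ])                             # convert the images to tensors
)
num_workers = 4  # intended number of worker processes (defined here but not passed to the DataLoader below)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=10)
# The training set is wrapped in a data loader, so the underlying data can be accessed in the format we want;
# the data loader gives us access to the data and provides querying capabilities
torch.set_printoptions(linewidth=120)  # set the line width of printed output
print(len(train_set))
print(train_set.targets)  # train_labels is the older, deprecated name for the same tensor
print(train_set.targets.bincount())  # bincount: how often each value occurs in the tensor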
out:
60000
tensor([9, 0, 0, ..., 3, 0, 5])
tensor([6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000])
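# Inspect a single sample
sample = next(iter(train_set))
print(len(sample))
print(type(sample))
image, label = sample  # each sample is an (image, label) pair
print(image.shape)
out:
2
<class 'tuple'>
torch.Size([1, 28, 28])
# Show the image and its label
plt.imshow(image.squeeze(), cmap='gray')  # [1, 28, 28] -> [28, 28]
plt.show()
print('label:', label)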
# Inspect a batch of samples
batch = next(iter(train_loader))
print(len(batch))
print(type(batch))
images, labels = batch
print(images.shape)
print(labels.shape)
out:
2
<class 'list'>
torch.Size([10, 1, 28, 28])
torch.Size([10])
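# Plot a batch of images
grid = torchvision.utils.make_grid(images, nrow=10)
print(grid.shape)
plt.figure(figsize=(15, 15))
plt.imshow(np.transpose(grid, (1, 2, 0)))  # reorder the axes from (C, H, W) to (H, W, C), as matplotlib expects
print('labels:', labels)
# Changing batch_size changes how many images are shown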
out:
torch.Size([3, 32, 302])
labels: tensor([9, 0, 0, 3, 0, 2, 7, 2, 5, 5])
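Here:
grid = torchvision.utils.make_grid(images, nrow=10)
# make_grid tiles several images into a single image. images is the batch of images (it must be defined beforehand); nrow is the number of images per row (so the number of rows is the image count divided by nrow, rounded up); padding is the width of the gap between neighboring sub-images.
For example:
# Plot a batch of images
grid = torchvision.utils.make_grid(images, nrow=3)
print(grid.shape)
plt.figure(figsize=(15, 15))
plt.imshow(np.transpose(grid, (1, 2, 0)))  # reorder the axes from (C, H, W) to (H, W, C), as matplotlib expects
plt.show()
print('labels:', labels)
# Changing batch_size changes how many images are shown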
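The grid shape printed earlier, [3, 32, 302], can be reproduced by hand. The sketch below is a plain-Python estimate, assuming make_grid's default padding of 2 and assuming that it lays the images out row by row and expands single-channel images to 3 channels; grid_shape is a hypothetical helper written for these notes, not a torchvision function.

```python
import math


def grid_shape(num_images, height, width, nrow=8, padding=2, channels=3):
    """Hypothetical helper: estimated (C, H, W) shape of a make_grid result."""
    cols = min(nrow, num_images)          # images per row
    rows = math.ceil(num_images / cols)   # number of rows needed
    grid_height = rows * (height + padding) + padding
    grid_width = cols * (width + padding) + padding
    return (channels, grid_height, grid_width)


print(grid_shape(10, 28, 28, nrow=10))  # (3, 32, 302)
print(grid_shape(10, 28, 28, nrow=3))   # (3, 122, 92)
```

Under these assumptions, ten 28x28 images in one row give 10 * 30 + 2 = 302 pixels of width, which matches the output recorded above.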
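What numpy.transpose does: it permutes (reorders) the axes of an array
For example, the axes (0, 1, 2) correspond to (x, y, z); np.transpose(grid, (1, 2, 0)) reorders them to (y, z, x)
Reference (link title): "Python numpy.transpose explained"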
http://www.360doc.com/content/19/0602/00/7669533_839717717.shtml
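A small check of the axis permutation, using a made-up 3x2x4 array in place of the image grid:

```python
import numpy as np

chw = np.arange(3 * 2 * 4).reshape(3, 2, 4)  # (channels, height, width)
hwc = np.transpose(chw, (1, 2, 0))           # (height, width, channels)

print(chw.shape, hwc.shape)
# The element at channel 2, row 1, column 3 moves to row 1, column 3, channel 2
print(chw[2, 1, 3] == hwc[1, 3, 2])
```

The values are unchanged; only the order in which the axes are indexed changes.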
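3.3.1 Imbalanced datasets
For the class imbalance problem, see the paper: A systematic study of the class imbalance problem in convolutional neural networks
3.4 Building the Network
3.4.1 Classes vs. objects
A class is a blueprint or description of an actual object
An object is the thing itself
Objects are created as instances of a class
All instances of a given class have two core components: methods and attributes
Methods represent code and attributes represent data; both are defined by the class
Attributes describe the characteristics of an object; methods describe its behavior, i.e. what the object can do
A project can contain many objects, i.e. several instances of a given class can exist at the same time (many objects can be created from one class)
A class is used to encapsulate methods and attributes
3.4.2 Classes and instances (objects) <supplementary Python basics>
A class is an abstract template that describes a collection of objects sharing the same attributes and methods; class names should be self-explanatory
An object is concrete: something that actually exists
Defining a class: class ClassName():
Parts of a class: the class name; attributes (a set of data); methods (the operations that are allowed)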
# Defining the class
class Lizard:
    def __init__(self, name):  # runs automatically when an object is created; not called explicitly; returns nothing
        self.name = name
    def set_name(self, name):
        self.name = name
# Using the class
lizard = Lizard('deep')
print(lizard.name)
lizard.set_name('lizard')
print(lizard.name)
out:
deep
lizard
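3.4.3 Combining object-oriented programming with PyTorch
The main building blocks of a neural network are layers (PyTorch's neural network library contains classes that help construct them)
Every layer in a neural network has two main components: a transformation and weights (the transformation represents code; the weights represent data)
The forward method (forward pass): the tensor flows forward through the transformation of each layer until it reaches the output layer
A forward method must be provided when building a neural network; the forward method is the actual transformation
Steps for creating a neural network with PyTorch:
1. Extend the nn.Module base class
2. Define the layers as class attributes
3. Implement the forward method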
# Building the CNN
import torch.nn as nn
class Network(nn.Module):  # putting nn.Module in the parentheses makes Network inherit all the functionality of the Module base class
    def __init__(self):
        super(Network, self).__init__()  # initialize the inherited state by calling the parent class's __init__
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)  # the tensor must be flattened when going from the conv layers to the linear layers
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # implement the forward pass
        return t
network = Network()  # create the network object
print(network)
out:
Network(
(conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
(conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
(fc1): Linear(in_features=192, out_features=120, bias=True)
(fc2): Linear(in_features=120, out_features=60, bias=True)
(out): Linear(in_features=60, out_features=10, bias=True)
)
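3.5 Building the CNN and Using the Network Parameters
The Network class above defines two convolutional layers and three linear layers. Each layer encapsulates two main things: the definition of a forward function and a weight tensor. The weight tensor of each layer holds the weight values that are updated as the network learns during training, which is why the layers are defined as attributes of the network class. The Module class lets PyTorch keep track of the weight tensor of every layer; because Network extends Module, it inherits this ability automatically.
Parameter vs. argument:
A parameter is used in a function definition and can be seen as a placeholder (formal parameter)
An argument is the actual value passed to the function when it is called (actual parameter)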
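A two-line illustration of the distinction; the function scale and its values are made up for this example:

```python
def scale(value, factor=2):
    # value and factor are parameters: placeholders in the function definition
    return value * factor


result = scale(5, factor=3)  # 5 and 3 are arguments: the actual values passed in the call
print(result)  # 15
```

In nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5), the names are the parameters and 1, 6 and 5 are the arguments.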
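Two kinds of parameters:
1. Hyperparameters: values chosen manually and arbitrarily; when building the network, kernel_size, out_channels and out_features all have to be chosen by hand
2. Data dependent hyperparameters: values that depend on the data
These sit at the start and at the end of the network: the in_channels of the first convolutional layer and the out_features of the output layer
The in_channels of the first convolutional layer depend on the number of color channels in the training images (1 for grayscale images, 3 for color images)
The out_features of the output layer depend on the number of classes in the training set (Fashion-MNIST has 10 classes, so out_features=10 for the output layer)
In general, the input of a layer is the output of the previous layer (i.e. every in_channels of a convolutional layer and every in_features of a linear layer depend on the layer before it)
When a tensor passes from a convolutional layer to a linear layer, it must be flattened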
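3.6 Weights of the CNN
Learnable parameters: parameters learned during training; they start from arbitrarily chosen values and are updated iteratively as the network learns
Saying that the network is learning means that it is learning suitable values for these parameters, where suitable values are those that minimize the loss function
The learnable parameters are the weights of the network, and they live inside each layer
When we extend a class we get all of its functionality; on top of that we can add extra functionality or override existing functionality, for example: def __repr__(self):
In Python, the special object-oriented methods usually have two leading and two trailing underscores (__init__, __repr__)
# Here the Network class does not extend the Module base class: class Network() lacks nn.Module, and super(Network, self).__init__() is commented out
import torch.nn as nn
class Network():
    def __init__(self):
        #super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # implement the forward pass
        return t
network = Network()  # create the network object
print(network)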
out:
# Below is Python's default string representation of an object
<__main__.Network object at 0x22587c614a8>
As shown below, even without extending Module, overriding __repr__ gives a custom printed representation
import torch.nn as nn
class Network():
    def __init__(self):
        #super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # implement the forward pass
        return t
    # overrides Python's default string representation
    def __repr__(self):
        return "lizard"
network = Network()  # create the network object
print(network)
out:
lizard
The code used in the video to print the network is:
import torch.nn as nn
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # implement the forward pass
        return t
network = Network()  # create the network object
print(network)
out:
Network(
(conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
(conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
(fc1): Linear(in_features=192, out_features=120, bias=True)
(fc2): Linear(in_features=120, out_features=60, bias=True)
(out): Linear(in_features=60, out_features=10, bias=True)
)
A specific layer can be accessed with dot notation
print(network.conv1)
out:
Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
# Print the weights of conv1
print(network.conv1.weight)
out:
Parameter containing:
tensor([[[[-0.1438, -0.1988, 0.1899, -0.1422, 0.1970],
[ 0.1218, 0.1801, 0.0804, 0.1110, -0.1473],
[-0.1049, -0.1533, 0.0420, 0.1099, -0.1373],
[ 0.1582, -0.0019, -0.0629, 0.0914, -0.0435],
[-0.1514, -0.0354, -0.1848, -0.0231, 0.1370]]],
[[[ 0.0317, -0.1364, 0.1620, 0.1353, -0.1444],
[ 0.0680, 0.1570, 0.0125, 0.0637, -0.0675],
[-0.1313, -0.1136, 0.1897, 0.1206, -0.0622],
[-0.1080, -0.0497, -0.0702, -0.0526, -0.1793],
[ 0.0029, 0.1846, -0.0085, 0.0482, -0.0998]]],
[[[-0.0316, 0.0776, -0.0835, 0.1112, 0.0020],
[-0.0056, -0.1553, -0.1064, 0.1666, 0.1231],
[ 0.1483, 0.1326, 0.0449, 0.0727, -0.0959],
[ 0.1752, -0.1934, 0.0086, 0.1932, -0.0894],
[ 0.0845, 0.0121, -0.1207, 0.0316, -0.1766]]],
[[[ 0.0294, 0.1874, -0.1835, 0.0130, -0.0245],
[-0.0159, -0.1468, -0.0155, -0.0169, -0.0171],
[-0.1077, -0.1065, -0.1337, -0.1069, -0.1904],
[-0.1552, -0.1737, -0.0083, 0.1185, 0.0473],
[ 0.0124, 0.0715, -0.1177, -0.0071, -0.0533]]],
[[[ 0.0202, 0.0005, -0.1567, -0.0514, -0.1844],
[ 0.1773, 0.0434, -0.0500, -0.0931, -0.0610],
[-0.0461, 0.0202, -0.1609, 0.1488, -0.1418],
[ 0.1540, 0.0594, 0.0386, -0.0253, 0.1520],
[ 0.1568, 0.0054, 0.0918, 0.0434, -0.0474]]],
[[[-0.0508, 0.1441, -0.0893, -0.1571, 0.1605],
[-0.0918, -0.0100, 0.0122, -0.1781, -0.0800],
[-0.1800, -0.0535, 0.0338, -0.1285, 0.0770],
[ 0.0650, 0.1575, 0.1226, 0.1950, -0.0195],
[-0.1236, -0.0997, 0.0097, -0.0187, -0.1009]]]], requires_grad=True)
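# Print the shape of the conv1 weights
print(network.conv1.weight.shape)
out:
# The first value, 6, is the number of filters; the second, 1, is the number of input channels; the third and fourth are the filter height and width
torch.Size([6, 1, 5, 5])
Any single filter can be pulled out by indexing into the first axis of the weight tensor
print(network.conv1.weight[0].shape)
out:
# The depth is 1; the height and width are 5
torch.Size([1, 5, 5])
# A fully connected layer takes a flattened input, so its weight tensor is a rank-2 tensor with a height axis and a width axis
print(network.fc1.weight.shape)  # height => out_features; width => in_features
out:
# self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
# (fc1): Linear(in_features=192, out_features=120, bias=True)
# The output below shows the pattern: the height equals the number of output features and the width equals the number of input features
torch.Size([120, 192])
This follows from how matrix multiplication works: the weight matrix defines a linear function (a linear map) that takes an input with in_features entries to an output with out_features entries
To keep track of all the weight tensors in the network, PyTorch has a class called Parameter, which extends the Tensor class, so the weight tensor of each layer is an instance of this Parameter class
# Tensor multiplication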
in_features = torch.tensor([1,2,3,4],dtype=torch.float32)
weight_matrix = torch.tensor([
[1,2,3,4],
[2,3,4,5],
[3,4,5,6]
], dtype=torch.float32)
print(weight_matrix.matmul(in_features)) # matmul: matrix multiply
out:
tensor([30., 40., 50.])
Accessing all the parameters
# Method 1:
for param in network.parameters():
    print(param.shape)
out:
torch.Size([6, 1, 5, 5])
torch.Size([6])
torch.Size([12, 6, 5, 5])
torch.Size([12])
torch.Size([120, 192])
torch.Size([120])
torch.Size([60, 120])
torch.Size([60])
torch.Size([10, 60])
torch.Size([10])
# Method 2:
for name, param in network.named_parameters():
    print(name, '\t\t', param.shape)
out:
conv1.weight torch.Size([6, 1, 5, 5])
conv1.bias torch.Size([6])
conv2.weight torch.Size([12, 6, 5, 5])
conv2.bias torch.Size([12])
fc1.weight torch.Size([120, 192])
fc1.bias torch.Size([120])
fc2.weight torch.Size([60, 120])
fc2.bias torch.Size([60])
out.weight torch.Size([10, 60])
out.bias torch.Size([10])
3.7 Callable Modules in PyTorch
3.7.1 How Linear works
# 1. Tensor multiplication
in_features = torch.tensor([1,2,3,4], dtype=torch.float32)
weight_matrix = torch.tensor([
    [1,2,3,4],
    [2,3,4,5],
    [3,4,5,6]
], dtype = torch.float32)
print(weight_matrix.matmul(in_features))
# The weight matrix above can be seen as a linear map (a function); a PyTorch linear layer works the same way
out:
tensor([30., 40., 50.])
# 2. Linear layer
fc = nn.Linear(in_features=4, out_features=3)
# From the numbers 4 and 3 passed to the constructor, the PyTorch linear layer creates a 3x4 weight matrix
# Pass the in_features tensor through the layer
print(fc(in_features))
# The result differs from the one above because the layer's weight matrix is initialized with random values
out:
tensor([-0.4276, -1.8520, 2.3740], grad_fn=<AddBackward0>)
# Wrap the weight matrix in a Parameter so that the output matches the result in 1
fc = nn.Linear(in_features=4, out_features=3)
fc.weight = nn.Parameter(weight_matrix)
print(fc(in_features))
# The result is now close to the one in 1 but not exact, because of the bias
out:
tensor([30.4195, 40.2070, 50.1337], grad_fn=<AddBackward0>)
# Pass bias=False to get the exact output
fc = nn.Linear(in_features=4, out_features=3, bias=False)
fc.weight = nn.Parameter(weight_matrix)
print(fc(in_features))
out:
tensor([30., 40., 50.], grad_fn=<SqueezeBackward3>)
Mathematical form of the linear transformation:
y = Ax + b
A: weight matrix tensor
x: input tensor
b: bias tensor
y: output tensor
3.7.2 The special call method
This part of the lesson explains the internals of how layers are called; I did not go through it
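For orientation only, here is a plain-Python sketch of the idea behind calling a layer like a function: writing layer(x) triggers the object's __call__ method, which in turn runs forward. TinyModule and Doubler are hypothetical classes written for these notes; the real nn.Module does more work around the call (for example, running registered hooks), so this is a simplification rather than PyTorch's implementation.

```python
class TinyModule:
    """Hypothetical stand-in for nn.Module, reduced to the call mechanism only."""

    def __call__(self, *args, **kwargs):
        # layer(x) ends up here; the real nn.Module also runs hooks around this step
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError


class Doubler(TinyModule):
    def forward(self, x):
        return 2 * x


layer = Doubler()
print(layer(21))  # 42
```

This is why the notes call fc(in_features) and network(images) rather than invoking forward directly.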
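3.8 Implementing the CNN forward Method
The forward method uses all the layers defined in the constructor
The forward method is the mapping from the input tensor to the predicted output tensor
3.8.1 Input Layer
The input layer is determined by the input data
The input layer can be seen as the identity transformation f(x)=x
The input layer is usually implicit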
import torch.nn as nn
import torch.nn.functional as F
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # (1) input layer
        t = t
        # (2) hidden conv layer 1
        t = self.conv1(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (3) hidden conv layer 2
        t = self.conv2(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # relu and max pooling have no weights; activation and pooling are operations rather than layers: layers have weights, operations do not
        # (4) hidden linear layer 1
        t = t.reshape(-1, 12*4*4)
        t = self.fc1(t)
        t = F.relu(t)
        # (5) hidden linear layer 2
        t = self.fc2(t)
        t = F.relu(t)
        # (6) output layer
        t = self.out(t)
        # t = F.softmax(t, dim=1)  # softmax is not applied here; the cross-entropy loss used in training applies it implicitly
        # Hidden layers usually use relu as the nonlinear activation function
        # In the output layer, softmax is used when a class has to be predicted
        return t
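3.9 Predicting a Single Image
3.9.1 Forward propagation
Forward propagation is the process of transforming the input tensor into the output tensor (i.e. a neural network is a function that maps an input tensor to an output tensor)
It is simply a special name for passing an input tensor to the network and receiving the output from the network
3.9.2 Back propagation
Back propagation usually happens after forward propagation
torch.set_grad_enabled(False) turns off PyTorch's gradient tracking, which stops PyTorch from building a computation graph as the tensor passes through the network (it is turned off here because we are not training yet, only looking at a randomly initialized network)
The computation graph records every computation applied to the tensor as it flows through the network; during training this graph is used to compute derivatives, i.e. the gradient of the loss function. Turning tracking off is not required, but it reduces memory use.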
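The effect of switching gradient tracking off can be seen on a tiny example. This sketch uses torch.no_grad(), the local (context-manager) counterpart of torch.set_grad_enabled(False); the tensors weight and x are made up for illustration.

```python
import torch

weight = torch.ones(3, requires_grad=True)  # stands in for a learnable parameter
x = torch.tensor([1.0, 2.0, 3.0])

tracked = (weight * x).sum()        # computed with tracking on: a graph is built
with torch.no_grad():
    untracked = (weight * x).sum()  # computed with tracking off: no graph

print(tracked.requires_grad, tracked.grad_fn is not None)
print(untracked.requires_grad, untracked.grad_fn)
```

Both results hold the same value; only the tracked one keeps the graph needed for a later backward() call.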
# Predicting a single image
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
# Set the print format
torch.set_printoptions(linewidth=120)
# 1. Prepare the data
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST'
    ,train = True
    ,download = True
    ,transform = transforms.Compose([
        transforms.ToTensor()
    ])
)
# 2. Create the network
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # (1) input layer
        t = t
        # (2) hidden conv1
        t = self.conv1(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (3) hidden conv2
        t = self.conv2(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (4) hidden linear1
        t = t.reshape(-1, 12*4*4)
        t = self.fc1(t)
        t = F.relu(t)
        # (5) hidden linear2
        t = self.fc2(t)
        t = F.relu(t)
        # (6) output
        t = self.out(t)
        return t
# Create and call the network instance
torch.set_grad_enabled(False)  # turn off PyTorch's gradient tracking
network = Network()
sample = next(iter(train_set))
image, label = sample
print(image.shape)
# Show the image and its label
#plt.imshow(image.squeeze(), cmap='gray')  # [1, 28, 28] -> [28, 28]
#print('label:', label)
# The image has shape [1, 28, 28], but the network expects a tensor of shape [batch_size, channels, height, width],
# so unsqueeze is used to add a batch dimension
print(image.unsqueeze(0).shape)
# Predict on the single image
pred = network(image.unsqueeze(0))
print(pred.shape)
print(pred.argmax(dim=1))
print(label)
out:
torch.Size([1, 28, 28])
torch.Size([1, 1, 28, 28])
torch.Size([1, 10]) # one image, ten prediction scores (one per class)
tensor([2])
9
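# To express the prediction as probabilities, apply softmax
print(F.softmax(pred, dim=1))
print(F.softmax(pred, dim=1).sum())
out:
# The prediction is not accurate: the weights are untrained, so this is just the result of random initialization
tensor([[0.1052, 0.0973, 0.0985, 0.1051, 0.1061, 0.0883, 0.0925, 0.0925, 0.1145, 0.1000]])
# The predicted probabilities over all classes sum to 1
tensor(1.0000)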
tensor(1.0000)
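What the softmax call does can be spelled out in plain Python. The sketch below is a hand-written version for illustration (it is not PyTorch's implementation), and the three scores are made-up values rather than outputs of the network above.

```python
import math


def softmax(scores):
    """Plain-Python softmax over a list of raw scores."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]  # illustrative raw network outputs
probs = softmax(scores)
print([round(p, 4) for p in probs])
print(sum(probs))
print(scores.index(max(scores)) == probs.index(max(probs)))
```

Softmax rescales the scores into positive values that sum to 1 without changing their order, so argmax gives the same class before and after.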
3.10 Predicting a Batch of Images
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
torch.set_printoptions(linewidth=120)
print(torch.__version__)
print(torchvision.__version__)
out:
1.9.0
0.10.0
# Prepare the data
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST'
    ,train = True
    ,download = True
    ,transform = transforms.Compose([
        transforms.ToTensor()
    ]))
# Create the network
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # (1) Input Layer
        t = t
        # (2) Conv1
        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (3) Conv2
        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (4) FC1
        t = t.reshape(-1, 12*4*4)
        t = F.relu(self.fc1(t))
        # (5) FC2
        t = F.relu(self.fc2(t))
        # (6) Output
        t = self.out(t)
        return t
# Create the network instance
torch.set_grad_enabled(False)
network = Network()
# Take one batch from the data loader
data_loader = torch.utils.data.DataLoader(train_set, batch_size=10)
# next(iter(data_loader)) makes the data loader return one batch of ten images
batch = next(iter(data_loader))
images, labels = batch
print(images.shape)
print(labels.shape)
out:
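# [batch size of 10 images, 1 color channel, height, width]
torch.Size([10, 1, 28, 28])
# 10 images, one label per image
torch.Size([10])
# Pass the image tensor to the network to get the predictions
pred = network(images)
print(pred.shape)
print(pred)
out:
# The output has shape 10x10: 10 images, each with 10 raw prediction scores (one per class; these are not yet probabilities)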
torch.Size([10, 10])
tensor([[-0.0193, -0.1122, 0.1257, 0.1487, -0.1571, -0.0396, -0.0396, 0.1130, 0.0356, -0.0452],
[-0.0110, -0.1102, 0.1150, 0.1468, -0.1428, -0.0616, -0.0477, 0.1210, 0.0502, -0.0388],
[-0.0192, -0.1119, 0.1058, 0.1287, -0.1349, -0.0584, -0.0487, 0.1007, 0.0328, -0.0372],
[-0.0199, -0.1143, 0.1094, 0.1347, -0.1394, -0.0532, -0.0469, 0.1065, 0.0386, -0.0388],
[-0.0153, -0.1097, 0.1158, 0.1431, -0.1525, -0.0596, -0.0413, 0.1135, 0.0452, -0.0478],
[-0.0162, -0.1106, 0.1121, 0.1506, -0.1474, -0.0451, -0.0463, 0.1247, 0.0500, -0.0350],
[-0.0336, -0.1143, 0.1142, 0.1397, -0.1467, -0.0355, -0.0486, 0.0978, 0.0278, -0.0389],
[-0.0202, -0.1184, 0.1161, 0.1590, -0.1570, -0.0470, -0.0401, 0.1269, 0.0574, -0.0331],
[-0.0173, -0.1080, 0.1156, 0.1301, -0.1344, -0.0578, -0.0503, 0.0940, 0.0322, -0.0312],
[-0.0188, -0.1214, 0.1172, 0.1419, -0.1376, -0.0582, -0.0485, 0.1039, 0.0307, -0.0406]])
# argmax returns the index of the highest score
print(pred.argmax(dim=1))
print(labels)
out:
# Every image is predicted as class 3, the index with the highest score
tensor([3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
# The true labels of the images
tensor([9, 0, 0, 3, 0, 2, 7, 2, 5, 5])
# Compare the predictions with the labels
print(pred.argmax(dim=1).eq(labels))
# Count the correct predictions
print(pred.argmax(dim=1).eq(labels).sum())
out:
tensor([False, False, False, True, False, False, False, False, False, False])
tensor(1)
print(F.softmax(pred, dim=1))
print(F.softmax(pred,dim=1).sum())
out:
tensor([[0.0975, 0.0889, 0.1127, 0.1154, 0.0850, 0.0956, 0.0956, 0.1113, 0.1030, 0.0950],
[0.0982, 0.0890, 0.1114, 0.1150, 0.0861, 0.0934, 0.0947, 0.1121, 0.1044, 0.0955],
[0.0981, 0.0894, 0.1112, 0.1138, 0.0874, 0.0944, 0.0953, 0.1106, 0.1034, 0.0964],
[0.0979, 0.0890, 0.1114, 0.1142, 0.0868, 0.0946, 0.0952, 0.1110, 0.1037, 0.0960],
[0.0981, 0.0893, 0.1119, 0.1150, 0.0855, 0.0939, 0.0956, 0.1116, 0.1042, 0.0950],
[0.0976, 0.0888, 0.1109, 0.1153, 0.0856, 0.0948, 0.0947, 0.1123, 0.1043, 0.0958],
[0.0967, 0.0892, 0.1121, 0.1149, 0.0863, 0.0965, 0.0952, 0.1102, 0.1028, 0.0961],
[0.0971, 0.0880, 0.1112, 0.1161, 0.0847, 0.0945, 0.0952, 0.1125, 0.1049, 0.0958],
[0.0982, 0.0897, 0.1121, 0.1138, 0.0873, 0.0943, 0.0950, 0.1097, 0.1032, 0.0968],
[0.0980, 0.0885, 0.1123, 0.1151, 0.0870, 0.0942, 0.0952, 0.1108, 0.1030, 0.0959]])
tensor(10.0000)
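3.11 How the Input Tensor Changes as It Passes Through the CNN
3.11.1 CNN output feature map size (square input)
Let the input feature map be n x n
Let the filter be f x f
Let the padding be p and the stride be s
Then the output feature map size is O = floor((n - f + 2p) / s) + 1
3.11.2 CNN output feature map size (non-square input)
Let the input feature map be nh x nw
Let the filter be fh x fw
Let the padding be p and the stride be s
Then the output height is Oh = floor((nh - fh + 2p) / s) + 1
and the output width is Ow = floor((nw - fw + 2p) / s) + 1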
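Applying the formula to the network used in these notes shows where in_features=12*4*4 comes from. conv_output_size is a hypothetical helper written for this check; max pooling with kernel_size=2 and stride=2 follows the same size formula as a convolution.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output size along one dimension: floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * p) // s + 1


size = 28                                 # Fashion-MNIST images are 28x28
size = conv_output_size(size, f=5)        # conv1, kernel 5       -> 24
size = conv_output_size(size, f=2, s=2)   # max pool 2, stride 2  -> 12
size = conv_output_size(size, f=5)        # conv2, kernel 5       -> 8
size = conv_output_size(size, f=2, s=2)   # max pool 2, stride 2  -> 4

print(size, 12 * size * size)  # 4 192
```

With 12 output channels from conv2, the flattened tensor has 12 * 4 * 4 = 192 features, the in_features of fc1.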
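3.12 Steps for Training a Neural Network
3.12.1 The seven steps of training a neural network
1. Get a batch from the training set
2. Pass the batch to the network
3. Compute the loss (the difference between the predicted and the true values) [done by the loss function]
4. Compute the gradient of the loss function with respect to the weights [done by back propagation]
5. Update the weights using the gradients from the previous step, reducing the loss [done by the optimization algorithm]
6. Repeat steps 1-5 until one epoch is complete
7. Repeat steps 1-6 for the chosen number of epochs, until the accuracy is satisfactory
3.12.2 Training on a single batch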
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
torch.set_printoptions(linewidth=120)
torch.set_grad_enabled(True) # not required here; gradient tracking is on by default
print(torch.__version__)
print(torchvision.__version__)
out:
1.9.0
0.10.0
# Function that counts the correct predictions
def get_num_correct(preds, labels):
    return preds.argmax(dim=1).eq(labels).sum().item()
# 1. Get the training data
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',
    train = True,
    download = True,
    transform = transforms.Compose([
        transforms.ToTensor()
    ])
)
# 2. Create the network
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # Input Layer
        t = t
        # Conv1
        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # Conv2
        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # FC1
        t = t.reshape(-1, 12*4*4)
        t = F.relu(self.fc1(t))
        # FC2
        t = F.relu(self.fc2(t))
        # Output
        t = self.out(t)
        return t
# Create the network instance
network = Network()
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
batch = next(iter(train_loader))
images, labels = batch
# Compute the loss
preds = network(images)
loss = F.cross_entropy(preds, labels)  # cross-entropy loss function
print(loss.item())  # get the value of the loss
out:
2.296623468399048
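print(network.conv1.weight.grad)  # gradient of conv1 before backward() has run: still None
# Compute the gradient of the loss
loss.backward()  # back propagation
out:
None
# Create the optimizer that will update the weights; learning rate = 0.01
optimizer = optim.Adam(network.parameters(), lr=0.01)
loss.item()  # show the current loss value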
out:
2.296623468399048
print(get_num_correct(preds, labels))  # correct predictions before the update, using get_num_correct defined earlier
# Update the weights
optimizer.step()
out:
11
preds = network(images)
loss = F.cross_entropy(preds,labels)
print(loss.item())
print(get_num_correct(preds, labels))
out:
# The loss has decreased and the number of correct predictions has increased
2.272930383682251
12
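3.12.3 Summary of the single-batch training steps
1. Get a batch from the training set (lr is the learning rate: how far to step in the direction that reduces the loss)
network = Network()
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr=0.01)
batch = next(iter(train_loader))
images, labels = batch
2. Pass the batch to the network
preds = network(images)
3. Compute the loss
loss = F.cross_entropy(preds, labels)
4. Compute the gradient of the loss
loss.backward()
5. Use the computed gradients to update the weights and reduce the loss
optimizer.step()
print('loss1:', loss.item())  # loss computed before the update
preds = network(images)
loss = F.cross_entropy(preds, labels)
print('loss2:', loss.item())  # loss after the update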
3.13 Training the CNN for Single and Multiple Epochs
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
torch.set_printoptions(linewidth=120)  # tells PyTorch how to format printed output
torch.set_grad_enabled(True)  # not required; PyTorch's gradient tracking is on by default
print(torch.__version__)
print(torchvision.__version__)
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',
    train = True,
    download = True,
    transform = transforms.Compose([
        transforms.ToTensor()
    ])
)
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # Input Layer
        t = t
        # Conv1
        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # Conv2
        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # FC1
        t = t.reshape(-1, 12 * 4 * 4)
        t = F.relu(self.fc1(t))
        # FC2
        t = F.relu(self.fc2(t))
        # Output
        t = self.out(t)
        return t
# Function that counts the correct predictions
def get_num_correct(preds, labels):
    return preds.argmax(dim=1).eq(labels).sum().item()
# Create the network instance
network = Network()
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr=0.01)
flag_sum = 0  # total number of training iterations
# Several epochs
for epoch in range(5):
    total_loss = 0
    total_correct = 0
    flag_epoch = 0  # number of training iterations in one epoch
    # One epoch
    for batch in train_loader:  # get a batch: 100 images out of the whole dataset
        images, labels = batch
        preds = network(images)
        loss = F.cross_entropy(preds, labels)
        # The gradients are zeroed because loss.backward() adds the newly computed gradients to the current values; without zeroing, the gradients would accumulate
        optimizer.zero_grad()  # tell the optimizer to reset the gradients of the weights
        loss.backward()  # compute the gradients
        # The gradient gives the direction (toward a lower loss); the learning rate says how far to go in that direction
        optimizer.step()  # update all the parameters
        flag_sum += 1
        flag_epoch += 1
        total_loss += loss.item()
        total_correct += get_num_correct(preds, labels)
    print("epoch:", epoch, "loss:", total_loss, "total_correct:", total_correct)
print("flag_sum: ", flag_sum, "flag_epoch", flag_epoch)
accuracy = total_correct/len(train_set)
print("accuracy:", accuracy)
out:
1.9.0
0.10.0
epoch: 0 loss: 333.44036097824574 total_correct: 47315
epoch: 1 loss: 229.44155816733837 total_correct: 51582
epoch: 2 loss: 209.4198594391346 total_correct: 52302
epoch: 3 loss: 199.58387261629105 total_correct: 52657
epoch: 4 loss: 194.86104479432106 total_correct: 52838
flag_sum: 3000 flag_epoch 600
accuracy: 0.8806333333333334
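The reason for calling optimizer.zero_grad() before loss.backward() can be checked on a single made-up parameter w. This sketch zeroes the gradient by hand with w.grad.zero_(), which is an assumption about what the optimizer call amounts to for each parameter (some PyTorch versions reset the gradient to None instead); either way the old gradient no longer adds to the new one.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(w * w).backward()
first = w.grad.item()          # d(w^2)/dw = 2w = 4

(w * w).backward()             # second backward without zeroing in between
accumulated = w.grad.item()    # 4 + 4 = 8: the gradients were added together

w.grad.zero_()                 # reset the stored gradient by hand
(w * w).backward()
after_zero = w.grad.item()     # back to 4

print(first, accumulated, after_zero)
```

Without the reset, every batch would be updated with the sum of all earlier gradients rather than its own.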
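Iterations per epoch (flag_epoch) = number of samples / batch size, here 60000 / 100 = 600 (changing the batch size changes the number of weight updates, i.e. the number of steps taken toward the minimum of the loss function)
accuracy = total_correct/len(train_set)
Gradient: tells us which direction leads to the minimum loss fastest
Using the gradient and the learning rate: the gradient tells us which way to go (the direction in which the loss decreases), and the learning rate tells us how far to go in that direction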
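The roles of the gradient and the learning rate can be seen on a one-variable example. This is a plain gradient descent sketch on the made-up loss (w - 3)^2, not the Adam optimizer used in the training code above.

```python
def loss(w):
    return (w - 3.0) ** 2      # toy loss with its minimum at w = 3


def grad(w):
    return 2.0 * (w - 3.0)     # derivative of the toy loss


w = 0.0     # arbitrary starting value
lr = 0.1    # learning rate: how far to move along the negative gradient
history = [loss(w)]
for step in range(20):
    w = w - lr * grad(w)       # step against the gradient
    history.append(loss(w))

print(round(w, 4), round(history[-1], 6))
```

Each step moves w toward 3 and lowers the loss; a larger lr takes bigger steps, and a value that is too large would overshoot the minimum.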
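3.14 Confusion Matrix for the Neural Network
Building a confusion matrix requires two things: a tensor of predictions and a tensor of the corresponding true values (labels)
Reference (link title): "Confusion matrix for a CNN | PyTorch series (23)"
That post covers essentially the same material as this lesson in great detail and is highly recommended
# The analysis below builds on the network trained in 3.13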
len(train_set)
60000
len(train_set.targets)
60000
Predicting on the whole training set
def get_all_preds(model, loader):
    all_preds = torch.tensor([])
    for batch in loader:
        images, labels = batch
        preds = model(images)
        all_preds = torch.cat((all_preds, preds), dim=0)
    return all_preds  # all the predictions, concatenated along the batch axis
prediction_loader = torch.utils.data.DataLoader(train_set, batch_size=10000)
train_preds = get_all_preds(network, prediction_loader)
print(train_preds.shape)
out:
torch.Size([60000, 10])
print(train_preds.requires_grad)  # check whether the prediction tensor tracks gradients
train_preds.grad
# Even with gradient tracking turned on, there are no gradient values until back propagation has been run
train_preds.grad_fn  # train_preds was produced by a function, so it has this attribute
out:
True
<CatBackward at 0x1d964546ba8>
# Turn gradient tracking off locally to reduce memory use; torch.set_grad_enabled(False) turns it off globally
with torch.no_grad():
    prediction_loader = torch.utils.data.DataLoader(train_set, batch_size=1000)
    train_preds = get_all_preds(network, prediction_loader)
len(train_preds)
60000
print(train_preds.requires_grad)
False
train_preds.grad
train_preds.grad_fn
preds_correct = get_num_correct(train_preds, train_set.targets)
print("total_correct:",preds_correct)
print("accuracy:",preds_correct/len(train_set))
total_correct: 51988
accuracy: 0.8664666666666667
Plotting the confusion matrix (method 1):
print(train_set.targets)
print(train_set.targets.shape)
out:
tensor([9, 0, 0, ..., 3, 0, 5])
torch.Size([60000])
print(train_preds.argmax(dim=1))
print(train_preds.argmax(dim=1).shape)
out:
tensor([9, 0, 0, ..., 3, 0, 5])
torch.Size([60000])
stack = torch.stack((train_set.targets, train_preds.argmax(dim=1)),dim=1)
print(stack)
out:
tensor([[9, 9],
[0, 0],
[0, 0],
...,
[3, 3],
[0, 0],
[5, 5]])
# tolist gives access to a [target, pred] pair
print(stack[0].tolist())
out:
[9, 9]
# Create an empty confusion matrix
cmt = torch.zeros(10, 10, dtype=torch.int32)
# Go through all the pairs and count how often each combination occurs
for p in stack:
    tl, pl = p.tolist()
    cmt[tl, pl] = cmt[tl, pl] + 1
print(cmt)
out:
tensor([[5661, 5, 77, 73, 8, 2, 117, 1, 56, 0],
[ 64, 5774, 5, 128, 5, 1, 20, 0, 3, 0],
[ 111, 1, 4692, 82, 768, 1, 299, 0, 46, 0],
[ 546, 20, 20, 5216, 138, 0, 56, 0, 4, 0],
[ 21, 6, 364, 297, 4830, 0, 419, 5, 58, 0],
[ 27, 6, 8, 1, 0, 5665, 2, 213, 8, 70],
[1871, 9, 612, 127, 498, 0, 2792, 0, 91, 0],
[ 0, 0, 0, 0, 0, 49, 0, 5846, 3, 102],
[ 40, 1, 23, 20, 13, 15, 25, 15, 5846, 2],
[ 1, 0, 1, 0, 0, 20, 0, 307, 5, 5666]], dtype=torch.int32)
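The same counting scheme on a tiny made-up example (three classes, seven samples) shows how accuracy and per-class recall can be read off the matrix. The targets and preds lists are illustrative values, not results from the network.

```python
targets = [0, 0, 1, 1, 2, 2, 2]  # true labels (illustrative)
preds   = [0, 1, 1, 1, 2, 0, 2]  # predicted labels (illustrative)
num_classes = 3

# Rows are true labels, columns are predicted labels
cm = [[0] * num_classes for _ in range(num_classes)]
for t, p in zip(targets, preds):
    cm[t][p] += 1

correct = sum(cm[i][i] for i in range(num_classes))   # the diagonal holds the correct predictions
accuracy = correct / len(targets)
recall = [cm[i][i] / sum(cm[i]) for i in range(num_classes)]  # per-class hit rate

print(cm)
print(correct, accuracy, recall)
```

In the 10x10 matrix above, the diagonal likewise sums to the total_correct value reported earlier (51988), and large off-diagonal entries show which classes are confused with each other.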
import matplotlib.pyplot as plt
from resources.plotcm import plot_confusion_matrix
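# Note that plotcm refers to the file plotcm.py, located in the resources folder of the current directory. It contains a function called plot_confusion_matrix(), which is called below. The function could also be defined directly in the current .py file, but defining too many functions in the main script makes the code long and harder to read.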
names = (
'T-shirt/top',
'Trouser',
'Pullover',
'Dress',
'Coat',
'Sandal',
'Shirt',
'Sneaker',
'Bag',
'Ankle boot')
plt.figure(figsize=(10,10))
plot_confusion_matrix(cmt, names)
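out (the counts below differ from cmt above, so this output appears to come from a different training run):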
Confusion matrix, without normalization
tensor([[5190, 10, 19, 216, 53, 15, 454, 0, 43, 0],
[ 17, 5784, 6, 149, 13, 4, 18, 0, 9, 0],
[ 71, 2, 3313, 45, 1962, 4, 563, 0, 40, 0],
[ 211, 33, 10, 5230, 402, 3, 106, 0, 4, 1],
[ 4, 8, 104, 107, 5558, 2, 194, 0, 23, 0],
[ 3, 1, 0, 1, 1, 5493, 0, 435, 4, 62],
[1228, 6, 278, 117, 1305, 4, 2957, 0, 104, 1],
[ 0, 0, 0, 0, 0, 47, 0, 5885, 1, 67],
[ 33, 4, 11, 28, 47, 32, 59, 13, 5771, 2],
[ 0, 0, 0, 0, 1, 52, 0, 645, 7, 5295]], dtype=torch.int32)
Definition of the plot_confusion_matrix function:
Version 1: clearer and more concise
plotcm.py:
import itertools
import numpy as np
import matplotlib.pyplot as plt
def plot_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    if normalize:
        cm = np.asarray(cm).astype('float')  # also accepts a torch tensor
        cm = cm / cm.sum(axis=1)[:, np.newaxis]  # normalize each row
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt), horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
输出样例:
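Version 2:
plotcm.py contains:
import numpy as np
import matplotlib.pyplot as plt

# Function that plots the confusion matrix.
# title has no default here, so call it as plot_confusion_matrix(cm, names, "some title").
def plot_confusion_matrix(cm, labels_name, title):
    cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]  # normalize each row
    plt.imshow(cm, interpolation='nearest')  # display the matrix as an image in the current figure
    plt.title(title)  # figure title
    plt.colorbar()
    num_local = np.array(range(len(labels_name)))
    plt.xticks(num_local, labels_name, rotation=90)  # put the class labels on the x axis
    plt.yticks(num_local, labels_name)  # put the class labels on the y axis
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
Sample output: (figure omitted)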
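Both versions normalize with the expression cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]. The small sketch below, using hypothetical counts, shows what that expression does step by step. Note that astype is a NumPy method, so a torch tensor would need to be converted first (for example with its .numpy() method) before being normalized this way.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows: true label, columns: predicted label)
cm = np.array([[8, 2, 0],
               [1, 6, 3],
               [0, 5, 5]])

row_totals = cm.sum(axis=1)                  # shape (3,): number of samples per true class
row_totals_col = row_totals[:, np.newaxis]   # shape (3, 1): broadcasts across the columns

# Each entry becomes the fraction of its true class that received that prediction
cm_normalized = cm.astype('float') / row_totals_col
print(cm_normalized)
```

After normalization every row sums to 1, so classes with different sample counts can be compared on the same color scale.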
Plotting the confusion matrix (method 2):
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix  # requires the scikit-learn package to be installed
from resources.plotcm import plot_confusion_matrix  # plotcm.py is in the resources folder under the current directory
cm = confusion_matrix(train_set.targets, train_preds.argmax(dim=1))
print(cm)
names = (
'T-shirt/top',
'Trouser',
'Pullover',
'Dress',
'Coat',
'Sandal',
'Shirt',
'Sneaker',
'Bag',
'Ankle boot')
plt.figure(figsize=(10,10))
plot_confusion_matrix(cm, names)
out:
Confusion matrix, without normalization
tensor([[5190, 10, 19, 216, 53, 15, 454, 0, 43, 0],
[ 17, 5784, 6, 149, 13, 4, 18, 0, 9, 0],
[ 71, 2, 3313, 45, 1962, 4, 563, 0, 40, 0],
[ 211, 33, 10, 5230, 402, 3, 106, 0, 4, 1],
[ 4, 8, 104, 107, 5558, 2, 194, 0, 23, 0],
[ 3, 1, 0, 1, 1, 5493, 0, 435, 4, 62],
[1228, 6, 278, 117, 1305, 4, 2957, 0, 104, 1],
[ 0, 0, 0, 0, 0, 47, 0, 5885, 1, 67],
[ 33, 4, 11, 28, 47, 32, 59, 13, 5771, 2],
[ 0, 0, 0, 0, 1, 52, 0, 645, 7, 5295]], dtype=torch.int32)
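3.15 The difference between concatenating and stacking
Concatenating (cat) joins a sequence of tensors along an existing axis.
Stacking (stack) joins a sequence of tensors along a new axis (that is, a new axis is created in every tensor, and the tensors are joined along it).
For the details and animated illustrations, see video p28.
# Add a new axis to a tensor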
import torch
t = torch.tensor([1,1,1])
print(t.unsqueeze(dim=0))
print(t.unsqueeze(dim=0).shape)
print(t.unsqueeze(dim=1))
print(t.unsqueeze(dim=1).shape)
out:
tensor([[1, 1, 1]])
torch.Size([1, 3])
tensor([[1],
[1],
[1]])
torch.Size([3, 1])
# Concatenating and stacking in PyTorch
t1 = torch.tensor([1,1,1])
t2 = torch.tensor([2,2,2])
t3 = torch.tensor([3,3,3])
# Concatenating
t_cat = torch.cat((t1,t2,t3), dim=0)
print(t_cat)
# Stacking
t_stack = torch.stack((t1, t2, t3), dim=0)
print(t_stack)
# Stacking is equivalent to adding a new axis to each tensor and then concatenating
t_stack1 = torch.cat((t1.unsqueeze(0),t2.unsqueeze(0),t3.unsqueeze(0)), dim =0)
print(t1.unsqueeze(0))
print(t_stack1)
out:
tensor([1, 1, 1, 2, 2, 2, 3, 3, 3])
tensor([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
tensor([[1, 1, 1]])
tensor([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
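A typical place where the cat/stack distinction matters is building image batches. The sketch below is an illustration under assumed shapes (single-channel 28x28 images, as in Fashion-MNIST; the tensors themselves are made up): separate images are stacked to create the batch axis, two batches are concatenated along the existing batch axis, and a single image is given a batch axis with unsqueeze before being concatenated onto a batch.

```python
import torch

# Three hypothetical grayscale images, each with shape (channels, height, width)
img1 = torch.zeros(1, 28, 28)
img2 = torch.ones(1, 28, 28)
img3 = torch.full((1, 28, 28), 2.0)

# Separate images -> one batch: stack, because the batch axis does not exist yet
batch = torch.stack((img1, img2, img3), dim=0)
print(batch.shape)      # torch.Size([3, 1, 28, 28])

# Two batches -> one larger batch: cat, because the batch axis already exists
bigger = torch.cat((batch, batch), dim=0)
print(bigger.shape)     # torch.Size([6, 1, 28, 28])

# One image + an existing batch: add the batch axis to the image, then cat
extended = torch.cat((batch, img1.unsqueeze(0)), dim=0)
print(extended.shape)   # torch.Size([4, 1, 28, 28])
```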
I have not run the TensorFlow code myself
# Concatenating and stacking in TensorFlow
import tensorflow as tf
# Concatenating
t_cat = tf.concat((t1, t2, t3), axis=0)
print(t_cat)
# Stacking
t_stack = tf.stack((t1, t2, t3), axis=0)
print(t_stack)
out:
Tensor("concat:0", shape=(9,), dtype=int64)
Tensor("stack:0", shape=(3, 3), dtype=int64)
# Concatenating and stacking in NumPy
import numpy as np
t1 = np.array([1,1,1])
t2 = np.array([2,2,2])
t3 = np.array([3,3,3])
# Concatenating
t_cat = np.concatenate((t1,t2,t3), axis=0)
print(t_cat)
# Stacking
t_stack = np.stack((t1,t2,t3), axis =0)
print(t_stack)
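# Stacking is equivalent to expanding each array with a new axis and then concatenating
t_cat_to_stack = np.concatenate(
    (
        np.expand_dims(t1, 0),
        np.expand_dims(t2, 0),
        np.expand_dims(t3, 0),
    ),
    axis=0,
)
print(t_cat_to_stack)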
out:
[1 1 1 2 2 2 3 3 3]
[[1 1 1]
[2 2 2]
[3 3 3]]
[[1 1 1]
[2 2 2]
[3 3 3]]
Summary: complete code for one full training run and plotting
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import itertools
from sklearn.metrics import confusion_matrix  # function that builds a confusion matrix
import matplotlib.pyplot as plt
# from resources.plotcm import plot_confusion_matrix
import numpy as np
import torchvision
import torchvision.transforms as transforms
torch.set_printoptions(linewidth=120)  # tells PyTorch how to format printed output
torch.set_grad_enabled(True)  # not required: PyTorch's gradient tracking is on by default
print(torch.__version__)
print(torchvision.__version__)
train_set = torchvision.datasets.FashionMNIST(
    root='./data/FashionMNIST',
    train=True,
    download=True,
    transform=transforms.Compose([
        transforms.ToTensor()
    ])
)
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        # Input Layer
        t = t
        # Conv1
        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # Conv2
        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # FC1
        t = t.reshape(-1, 12 * 4 * 4)
        t = F.relu(self.fc1(t))
        # FC2
        t = F.relu(self.fc2(t))
        # Output
        t = self.out(t)
        return t
# Helper that counts the number of correct predictions
def get_num_correct(preds, labels):
    return preds.argmax(dim=1).eq(labels).sum().item()

# Create the network instance
network = Network()
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr=0.01)
flag_sum = 0  # total number of training steps so far
# Number of epochs; change this as needed
for epoch in range(1):
    total_loss = 0
    total_correct = 0
    flag_epoch = 0  # number of training steps in this epoch
    # One epoch
    for batch in train_loader:  # get one batch from the data; each batch holds 100 images
        images, labels = batch
        preds = network(images)
        loss = F.cross_entropy(preds, labels)
        # The gradients are zeroed here because loss.backward() computes new gradients and adds them
        # to the values already stored; without zeroing, the gradients would accumulate across batches
        optimizer.zero_grad()  # zero the gradients stored on the weights
        loss.backward()  # compute the gradients
        # The update uses the gradients and the learning rate: the gradient gives the direction
        # (toward the minimum of the loss), the learning rate how far to step in that direction
        optimizer.step()  # update all the weights (parameters)
        flag_sum += 1
        flag_epoch += 1
        total_loss += loss.item()
        total_correct += get_num_correct(preds, labels)
    print("epoch:", epoch, "loss:", total_loss, "total_correct:", total_correct)
    print("flag_sum: ", flag_sum, "flag_epoch", flag_epoch)
    accuracy = total_correct / len(train_set)
    print("accuracy:", accuracy)
# Analysis of the network trained as in section 3.13
len(train_set)
len(train_set.targets)
# Collect the predictions for every sample in the loader
def get_all_preds(model, loader):
    all_preds = torch.tensor([])
    for batch in loader:
        images, labels = batch
        preds = model(images)
        all_preds = torch.cat((all_preds, preds), dim=0)
    return all_preds
# Plot a row-normalized confusion matrix (expects a NumPy array)
def plot_confusion_matrix(cm, labels_name, title):
    cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]  # normalize each row
    plt.imshow(cm, interpolation='nearest')  # display the matrix as an image in the current figure
    plt.title(title)  # figure title
    plt.colorbar()
    num_local = np.array(range(len(labels_name)))
    plt.xticks(num_local, labels_name, rotation=90)  # put the class labels on the x axis
    plt.yticks(num_local, labels_name)  # put the class labels on the y axis
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

# Plot the confusion matrix without normalization
def plot_confusion_matrix_1(cm, labels_name, title):
    # cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]  # normalization (disabled)
    plt.imshow(cm, interpolation='nearest')  # display the matrix as an image in the current figure
    plt.title(title)  # figure title
    plt.colorbar()
    num_local = np.array(range(len(labels_name)))
    plt.xticks(num_local, labels_name, rotation=90)  # put the class labels on the x axis
    plt.yticks(num_local, labels_name)  # put the class labels on the y axis
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

# Plot the confusion matrix with the value of each cell written on top
def plot_confusion_matrix_2(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt), horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
prediction_loader = torch.utils.data.DataLoader(train_set, batch_size=10000)
train_preds = get_all_preds(network, prediction_loader)
print(train_preds.shape)
print(train_preds.requires_grad)  # check whether the prediction tensor is tracking gradients
print(train_preds.grad)
# Even with gradient tracking turned on, there is no gradient value because no backward pass has been run
print(train_preds.grad_fn)  # train_preds was produced by an operation, so it has a grad_fn
# Turn gradient tracking off locally to reduce memory use;
# torch.set_grad_enabled(False) can be used to turn it off globally
with torch.no_grad():
    prediction_loader = torch.utils.data.DataLoader(train_set, batch_size=1000)
    train_preds = get_all_preds(network, prediction_loader)
len(train_preds)
print(train_preds.requires_grad)
print(train_preds.grad)
print(train_preds.grad_fn)
preds_correct = get_num_correct(train_preds, train_set.targets)
print("total_correct:",preds_correct)
print("accuracy:",preds_correct/len(train_set))
print(train_set.targets)
print(train_set.targets.shape)
print(train_preds.argmax(dim=1))
print(train_preds.argmax(dim=1).shape)
stack = torch.stack((train_set.targets, train_preds.argmax(dim=1)),dim=1)
print(stack)
# tolist() gives access to a [target, pred] pair
print(stack[0].tolist())
# Create an empty confusion matrix
cmt = torch.zeros(10, 10, dtype=torch.int32)
# Go through all the pairs and count how often each combination occurs
for p in stack:
    tl, pl = p.tolist()
    cmt[tl, pl] = cmt[tl, pl] + 1
print(cmt)
cm = confusion_matrix(train_set.targets, train_preds.argmax(dim=1))
names = (
'T-shirt/top',
'Trouser',
'Pullover',
'Dress',
'Coat',
'Sandal',
'Shirt',
'Sneaker',
'Bag',
'Ankle boot')
plt.figure(figsize=(10, 10))
plot_confusion_matrix(cm, names, "pred")
plt.show()
plt.figure(figsize=(10, 10))
plot_confusion_matrix_1(cmt, names, "haha")
plt.show()
plt.figure(figsize=(10, 10))
plot_confusion_matrix_2(cmt, names)
plt.show()
out:
1.9.0
0.10.0
epoch: 0 loss: 344.86296156048775 total_correct: 46965
flag_sum: 600 flag_epoch 600
accuracy: 0.78275
torch.Size([60000, 10])
True
None
<CatBackward object at 0x000002DF8A411EE0>
False
None
None
total_correct: 50476
accuracy: 0.8412666666666667
tensor([9, 0, 0, ..., 3, 0, 5])
torch.Size([60000])
tensor([9, 0, 3, ..., 3, 0, 5])
torch.Size([60000])
tensor([[9, 9],
[0, 0],
[0, 3],
...,
[3, 3],
[0, 0],
[5, 5]])
[9, 9]
tensor([[5190, 10, 19, 216, 53, 15, 454, 0, 43, 0],
[ 17, 5784, 6, 149, 13, 4, 18, 0, 9, 0],
[ 71, 2, 3313, 45, 1962, 4, 563, 0, 40, 0],
[ 211, 33, 10, 5230, 402, 3, 106, 0, 4, 1],
[ 4, 8, 104, 107, 5558, 2, 194, 0, 23, 0],
[ 3, 1, 0, 1, 1, 5493, 0, 435, 4, 62],
[1228, 6, 278, 117, 1305, 4, 2957, 0, 104, 1],
[ 0, 0, 0, 0, 0, 47, 0, 5885, 1, 67],
[ 33, 4, 11, 28, 47, 32, 59, 13, 5771, 2],
[ 0, 0, 0, 0, 1, 52, 0, 645, 7, 5295]], dtype=torch.int32)
Confusion matrix, without normalization
tensor([[5190, 10, 19, 216, 53, 15, 454, 0, 43, 0],
[ 17, 5784, 6, 149, 13, 4, 18, 0, 9, 0],
[ 71, 2, 3313, 45, 1962, 4, 563, 0, 40, 0],
[ 211, 33, 10, 5230, 402, 3, 106, 0, 4, 1],
[ 4, 8, 104, 107, 5558, 2, 194, 0, 23, 0],
[ 3, 1, 0, 1, 1, 5493, 0, 435, 4, 62],
[1228, 6, 278, 117, 1305, 4, 2957, 0, 104, 1],
[ 0, 0, 0, 0, 0, 47, 0, 5885, 1, 67],
[ 33, 4, 11, 28, 47, 32, 59, 13, 5771, 2],
[ 0, 0, 0, 0, 1, 52, 0, 645, 7, 5295]], dtype=torch.int32)
Process finished with exit code 0
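The True / None / CatBackward lines followed by False / None / None in the output above come from running the same prediction code with and without gradient tracking. The sketch below reproduces that behavior in isolation. The small nn.Linear model and random input are hypothetical stand-ins for the trained network and the image batches, and the decorator form of torch.no_grad is shown as one alternative to the with block.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)   # hypothetical stand-in for the trained network
x = torch.rand(3, 4)      # hypothetical stand-in for a batch of inputs

# With tracking on, the output remembers the operation that produced it
tracked = model(x)
print(tracked.requires_grad, tracked.grad_fn is not None)

# Inside torch.no_grad() no graph is built, which saves memory during inference
with torch.no_grad():
    untracked = model(x)
print(untracked.requires_grad, untracked.grad_fn)

# torch.no_grad can also be used as a decorator on a prediction helper
@torch.no_grad()
def predict(net, inputs):
    return net(inputs)

decorated = predict(model, x)
print(decorated.requires_grad)
```

The predicted values are the same in all three cases; only the bookkeeping needed for backpropagation differs.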