A Practical Guide to Deep Learning

dcjszhr

Last modified: 2024-05-20 10:38:54

Category: Deep Learning. Tags: deep learning, convolutional neural networks

First published: 2024-02-25 13:04:53

Article link: https://blog.csdn.net/dcjszhr/article/details/136256913

Introduction to PyTorch

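PyTorch is an open-source machine learning library widely used in computer vision, natural language processing, and other areas of artificial intelligence. It offers GPU acceleration and a flexible platform for deep learning research. This post gives a basic introduction to PyTorch and its core concepts.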

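Core features of PyTorch

Dynamic computational graph: PyTorch builds its computational graph dynamically (define-by-run). The graph is constructed on the fly as operations execute, and the automatic differentiation system (autograd) records it for the backward pass. This makes models easier to debug and complex architectures easier to build.
Easy-to-use API: PyTorch has a concise, intuitive API that lets researchers and developers get started quickly. It comes with a large number of pretrained models and modules, which makes it convenient to build and test new model architectures.
GPU acceleration: PyTorch supports CUDA and can use NVIDIA GPUs to speed up computation, which substantially improves the efficiency of training and inference.
Extensibility and flexibility: PyTorch is not only a deep learning framework; it can also be used for scientific computing. Users can define custom operators and write low-level optimizations in languages such as C++.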

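Basic components of PyTorch

Tensor: the tensor is PyTorch's basic data structure and can be viewed as a multi-dimensional array or matrix. PyTorch tensors are similar to NumPy ndarrays, but they can also run on a GPU to accelerate computation.
Autograd: PyTorch's automatic differentiation engine computes gradients automatically, which is exactly what backpropagation needs. This is essential for training complex deep learning models.
Module: in PyTorch, models are usually built by subclassing torch.nn.Module. This class helps organize the network layers and provides methods for saving and loading the model state (state_dict() and load_state_dict()).
Optimizer: PyTorch provides many optimization algorithms, such as SGD and Adam, all packaged in torch.optim. The optimizer updates the model parameters to reduce the loss.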
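
To make the autograd component concrete, here is a minimal sketch (the function y = x^2 + 2x and the input value 3.0 are chosen only for illustration). The graph is recorded while the forward pass runs, and backward() fills in the gradient.

```python
import torch

# A scalar input that autograd should track.
x = torch.tensor(3.0, requires_grad=True)

# Forward pass: the graph for y = x^2 + 2x is recorded as the operations run.
y = x ** 2 + 2 * x

# Backward pass: computes dy/dx = 2x + 2 and stores it in x.grad.
y.backward()

print(y.item())       # 15.0
print(x.grad.item())  # 8.0
```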

# frequently used import
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import random
import matplotlib.pyplot as plt

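# fix the random seeds so that the experiments are reproducible
random.seed(0)
np.random.seed(0)
torch.manual_seed(0)
# random.seed(0): sets the seed of Python's built-in random number generator,
# which is part of the random module. The random module generates random numbers from various distributions,
# shuffles sequences, and picks random items from sequences. With the seed fixed (0 in this example),
# any operation that uses the random module produces the same result every time the code runs.

# np.random.seed(0): sets the seed of NumPy's random number generator.
# NumPy is a numerical computing library for Python with its own functions for generating arrays of random numbers.
# As with the random module, setting the seed with np.random.seed makes NumPy's
# random number generation deterministic: the same sequence of numbers is produced every time.

# torch.manual_seed(0): sets the seed of PyTorch's random number generator.
# PyTorch also needs random numbers, for example to initialize network weights,
# split datasets, and drive other stochastic operations. With a manual seed set, PyTorch operations
# that involve randomness become repeatable (some GPU operations can still be nondeterministic).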

torch.__version__
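
A quick way to see what seeding does, as a small sketch: resetting the seed replays the same random sequence, while drawing again without reseeding continues it.

```python
import torch

torch.manual_seed(0)
first = torch.rand(3)

torch.manual_seed(0)    # reset the generator to the same state
second = torch.rand(3)

third = torch.rand(3)   # drawn without reseeding, so the sequence simply continues

print(torch.equal(first, second))  # True
print(torch.equal(first, third))
```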

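Tensors

A tensor is a specialized data structure that is very similar to arrays and matrices. In PyTorch, tensors encode a model's inputs and outputs as well as its parameters.

Initializing tensors

Tensors can be created directly from data. The data type is inferred automatically.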

data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
print(x_data)

#tensor([[1, 2],
#        [3, 4]])

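A tensor is a multi-dimensional array whose elements all share a single data type. How a tensor is initialized can have a significant effect on model training and performance. Some common ways to create tensors are listed below, with the TensorFlow equivalents for comparison:

Zeros tensor: a tensor whose elements are all 0.
- PyTorch: torch.zeros(size)
- TensorFlow: tf.zeros(shape)
Ones tensor: a tensor whose elements are all 1.
- PyTorch: torch.ones(size)
- TensorFlow: tf.ones(shape)
Random tensor: elements drawn from a random distribution.
- Uniform distribution: PyTorch uses torch.rand(size); TensorFlow uses tf.random.uniform(shape).
- Normal distribution: PyTorch uses torch.randn(size); TensorFlow uses tf.random.normal(shape).
Range tensor: a tensor holding evenly spaced values within a given range.
- PyTorch: torch.arange(start, end, step)
- TensorFlow: tf.range(start, limit, delta)
Tensor from given values: a tensor built from values supplied by the user.
- PyTorch: torch.tensor([1, 2, 3])
- TensorFlow: tf.constant([1, 2, 3])
Copying a tensor, or creating one shaped like an existing tensor:
- PyTorch: x.clone() copies a tensor together with its data; torch.zeros_like(x) and torch.ones_like(x) create a new tensor with the same shape and dtype as x.
- TensorFlow: tf.identity(x) returns a tensor with the same shape and contents as x; tf.zeros_like(x) and tf.ones_like(x) create a new tensor with the same shape and dtype.
Random integer tensor of a given shape:
- PyTorch: torch.randint(low, high, size)
- TensorFlow: tf.random.uniform(shape, minval=low, maxval=high, dtype=tf.int32)
Identity matrix: a square matrix with 1 on the diagonal and 0 everywhere else.
- PyTorch: torch.eye(n)
- TensorFlow: tf.eye(num_rows)
Custom initialization: initialize a tensor with a specific initialization function or strategy, such as Xavier/Glorot initialization or He initialization.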
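
The PyTorch side of the list above can be sketched in a few lines. The shapes and value ranges are arbitrary choices for illustration; the Xavier and He initializers come from torch.nn.init and fill an existing tensor in place.

```python
import torch
import torch.nn as nn

zeros = torch.zeros(2, 3)
ones = torch.ones(2, 3)
uniform = torch.rand(2, 3)             # uniform on [0, 1)
normal = torch.randn(2, 3)             # standard normal
steps = torch.arange(0, 10, 2)         # tensor([0, 2, 4, 6, 8])
ints = torch.randint(0, 10, (2, 3))    # random integers in [0, 10)
eye = torch.eye(3)                     # 3 x 3 identity matrix

like = torch.zeros_like(ints)          # same shape and dtype as ints, filled with 0
copy = ints.clone()                    # independent copy of the data

# Xavier/Glorot and He (Kaiming) initialization overwrite an existing tensor in place.
w = torch.empty(5, 2)
nn.init.xavier_uniform_(w)
nn.init.kaiming_normal_(w, nonlinearity="relu")
```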

Build a vector with 3 elements

x=torch.Tensor( [5.3 , 2.1 , -3.1 ] )
print(x)

#tensor([ 5.3000,  2.1000, -3.1000])

Build a random 10 x 2 matrix

A=torch.rand(10,2)
print(A)

# tensor([[0.6816, 0.9152],
#         [0.3971, 0.8742],
#         [0.4194, 0.5529],
#         [0.9527, 0.0362],
#         [0.1852, 0.3734],
#         [0.3051, 0.9320],
#         [0.1759, 0.2698],
#         [0.1507, 0.0317],
#         [0.2081, 0.9298],
#         [0.7231, 0.7423]])

Build a 10 x 2 matrix filled with zeros

A=torch.zeros(10,2)
print(A)

# tensor([[0., 0.],
#         [0., 0.],
#         [0., 0.],
#         [0., 0.],
#         [0., 0.],
#         [0., 0.],
#         [0., 0.],
#         [0., 0.],
#         [0., 0.],
#         [0., 0.]])

Check the dimensions of a tensor

A=torch.rand(3,2,4,8)
print(A)

# tensor([[[[0.5263, 0.2437, 0.5846, 0.0332, 0.1387, 0.2422, 0.8155, 0.7932],
#           [0.2783, 0.4820, 0.8198, 0.9971, 0.6984, 0.5675, 0.8352, 0.2056],
#           [0.5932, 0.1123, 0.1535, 0.2417, 0.7262, 0.7011, 0.2038, 0.6511],
#           [0.7745, 0.4369, 0.5191, 0.6159, 0.8102, 0.9801, 0.1147, 0.3168]],

#          [[0.6965, 0.9143, 0.9351, 0.9412, 0.5995, 0.0652, 0.5460, 0.1872],
#           [0.0340, 0.9442, 0.8802, 0.0012, 0.5936, 0.4158, 0.4177, 0.2711],
#           [0.6923, 0.2038, 0.6833, 0.7529, 0.8579, 0.6870, 0.0051, 0.1757],
#           [0.7497, 0.6047, 0.1100, 0.2121, 0.9704, 0.8369, 0.2820, 0.3742]]],


#         [[[0.0237, 0.4910, 0.1235, 0.1143, 0.4725, 0.5751, 0.2952, 0.7967],
#           [0.1957, 0.9537, 0.8426, 0.0784, 0.3756, 0.5226, 0.5730, 0.6186],
#           [0.6962, 0.5300, 0.2560, 0.7366, 0.0204, 0.2036, 0.3748, 0.2564],
#           [0.3251, 0.0902, 0.3936, 0.6069, 0.1743, 0.4743, 0.8579, 0.4486]],

#          [[0.5139, 0.4569, 0.6012, 0.8179, 0.9736, 0.8175, 0.9747, 0.4638],
#           [0.0508, 0.2630, 0.8405, 0.4968, 0.2515, 0.1168, 0.0321, 0.0780],
#           [0.3986, 0.7742, 0.7703, 0.0178, 0.8119, 0.1087, 0.3943, 0.2973],
#           [0.4037, 0.4018, 0.0513, 0.0683, 0.4218, 0.5065, 0.2729, 0.6883]]],


#         [[[0.0500, 0.4663, 0.9397, 0.2961, 0.9515, 0.6811, 0.0488, 0.8163],
#           [0.4423, 0.2768, 0.8998, 0.0960, 0.5537, 0.3953, 0.8571, 0.6396],
#           [0.7403, 0.6766, 0.3798, 0.3948, 0.0880, 0.7709, 0.8970, 0.8421],
# ...
#          [[0.5210, 0.8223, 0.1220, 0.1567, 0.2097, 0.8500, 0.3203, 0.9217],
#           [0.6808, 0.5633, 0.4963, 0.4012, 0.5627, 0.3858, 0.4965, 0.5638],
#           [0.1089, 0.2379, 0.9037, 0.0942, 0.4641, 0.9946, 0.6806, 0.5142],
#           [0.0667, 0.7477, 0.1439, 0.3581, 0.3322, 0.4260, 0.5055, 0.9124]]]])


print(A.dim())

#4

print(A.size())

#torch.Size([3, 2, 4, 8])

print(A.size(2))

#4

print(A.shape)

#torch.Size([3, 2, 4, 8])

Tensor data types

print(torch.FloatTensor(2,3).type()) #float type
print(torch.DoubleTensor(2,3).type()) #double type
print(torch.HalfTensor (2,3).type()) #half type

#torch.FloatTensor
#torch.DoubleTensor
#torch.HalfTensor


tensor = torch.randn(2, 2)
print(tensor.type())

long_tensor = tensor.long()
print(long_tensor.type())

half_tensor = tensor.half()
print(half_tensor.type())

int_tensor = tensor.int()
print(int_tensor.type())

#torch.FloatTensor
#torch.LongTensor
#torch.HalfTensor
#torch.IntTensor
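
The same conversions can also be written with dtype objects, which is the style most current PyTorch code uses. A small sketch with made-up values:

```python
import torch

t = torch.tensor([1.7, -2.3])
print(t.dtype)                   # torch.float32

as_int = t.to(torch.int64)       # equivalent to t.long(); truncates toward zero
print(as_int.tolist())           # [1, -2]

as_double = t.to(torch.float64)  # equivalent to t.double()
half_zeros = torch.zeros(2, 3, dtype=torch.float16)  # dtype chosen at creation time
print(as_double.dtype, half_zeros.dtype)
```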

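Converting a tensor to a CUDA tensor

Now let's set up the GPU environment. Colab provides free GPUs. To enable one:

- Runtime -> Change runtime type -> select "GPU" under Hardware accelerator

- Click "Connect" in the upper-right corner

Once connected to a GPU, you can check its status with the `nvidia-smi` command.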

!nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   59C    P8              11W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Now we can convert CPU tensors to CUDA tensors to speed up computation!

x1 = torch.tensor(1.0) # tensor on CPU
x2 = torch.tensor(1.0, device="cuda:0") # tensor on GPU
x3 = torch.tensor(1.0, requires_grad=True).cuda() # tensor on GPU and require grad
print("x1:", x1)
print("x2:", x2)
print("x3:", x3)

#x1: tensor(1.)
#x2: tensor(1., device='cuda:0')
#x3: tensor(1., device='cuda:0', grad_fn=<ToCopyBackward0>)
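
The lines above assume a GPU is present. A common pattern, sketched here, is to pick the device once and move tensors with .to(device), so the same code also runs on a CPU-only machine.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([1.0, 2.0, 3.0])   # created on the CPU
x_dev = x.to(device)                # moved to the selected device
y = (x_dev * 2).sum()               # computed on that device

print(x_dev.device)
print(y.cpu().item())               # 12.0
```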

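Most deep learning frameworks, including PyTorch, train with 32-bit floating-point arithmetic (FP32) by default.
For many deep learning models, however, FP32 is not required to reach full accuracy. In 2017, NVIDIA researchers developed a mixed-precision training method that combines single precision (FP32) with half precision (e.g. FP16) when training a network. With the same hyperparameters it reached the same accuracy as FP32 training, and on NVIDIA GPUs it brought additional performance benefits:

* Shorter training time;
* Lower memory requirements, which allows larger batch sizes, larger models, or larger inputs.

In 2018 NVIDIA developed Apex, a lightweight PyTorch extension with automatic mixed precision (AMP). It automatically casts certain GPU operations from FP32 to mixed precision, improving performance while preserving accuracy.
For example, in PyTorch 1.6 (which ships a native AMP implementation, torch.cuda.amp) the following operations are cast to half precision: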
* addbmm
* addmm
* bmm
* conv1d
* conv2d
* conv3d
* conv_transpose1d
* conv_transpose2d
* conv_transpose3d
* linear
* matmul, ...

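FP16 vs. FP32 on the NVIDIA V100

AMP with FP16 is the best-performing option for DL training on the V100. In Table 1 we can see that, for a variety of models, AMP on the V100 is 1.5x to 5.5x faster than FP32 on the V100 while converging to the same final accuracy.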
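
As a rough sketch of what automatic mixed precision looks like in code, the snippet below uses torch.autocast, the native context manager in recent PyTorch versions (not Apex). The layer sizes are arbitrary, and the bfloat16 fallback on CPU is an assumption made only so the example runs without a GPU. A full GPU training loop would usually also use a gradient scaler, which is omitted here.

```python
import torch
import torch.nn as nn

use_cuda = torch.cuda.is_available()
device_type = "cuda" if use_cuda else "cpu"
# FP16 is the usual choice on NVIDIA GPUs; CPU autocast works with bfloat16.
low_precision = torch.float16 if use_cuda else torch.bfloat16

layer = nn.Linear(8, 4).to(device_type)
x = torch.randn(2, 8, device=device_type)

with torch.autocast(device_type=device_type, dtype=low_precision):
    out = layer(x)           # linear is one of the ops that autocast runs in low precision

print(layer.weight.dtype)    # the parameters themselves stay in float32
print(out.dtype)             # float16 on a GPU, bfloat16 on the CPU
```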

Reshaping tensors

# view() returns a new tensor that shares its data with the original tensor but has a different shape.
x=torch.arange(10)
print(x)

#tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

print( x.view(2,5) )

#tensor([[0, 1, 2, 3, 4],
#        [5, 6, 7, 8, 9]])


print( x.view(5,2) )

# tensor([[0, 1],
#         [2, 3],
#         [4, 5],
#         [6, 7],
#         [8, 9]]) 

# Note that the original tensor x is not modified

print(x)

#tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

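Reshape

torch.reshape(input, shape) → Tensor
Returns a tensor with the same data and number of elements as input, but with the specified shape.
A single dimension may be -1, in which case it is inferred from the remaining dimensions and the number of elements in input.

Parameters

input (Tensor) - the tensor to be reshaped
shape (tuple of ints) - the new shape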

a = torch.arange(4.)
print(torch.reshape(a, (2, 2)))


b = torch.tensor([[0, 1], [2, 3]])
print(torch.reshape(b, (-1, 4)))

# tensor([[0., 1.],
#         [2., 3.]])
# tensor([[0, 1, 2, 3]])

a = torch.randn(1, 2, 3, 4)
print(a.size())

b = a.transpose(1, 2)  # Swaps 2nd and 3rd dimension
print(b.size())

c = a.view(1, 3, 2, 4)  # Does not change tensor layout in memory
print(c.size())


# torch.Size([1, 2, 3, 4])
# torch.Size([1, 3, 2, 4])
# torch.Size([1, 3, 2, 4])

print(torch.equal(b, c))

#False

print(a)
print(b)
print(c)

# tensor([[[[ 0.5851, -1.1560, -0.1434, -0.1947],
#           [ 1.4903, -0.7005,  0.1806,  1.3615],
#           [-0.7205, -2.2148, -0.6837,  0.5164]],

#          [[ 0.5588,  0.7918, -0.1847, -0.7318],
#           [-1.1057,  0.1437,  0.5836,  1.3482],
#           [-0.8137,  0.8200, -0.6332,  1.2948]]]])
# tensor([[[[ 0.5851, -1.1560, -0.1434, -0.1947],
#           [ 0.5588,  0.7918, -0.1847, -0.7318]],

#          [[ 1.4903, -0.7005,  0.1806,  1.3615],
#           [-1.1057,  0.1437,  0.5836,  1.3482]],

#          [[-0.7205, -2.2148, -0.6837,  0.5164],
#           [-0.8137,  0.8200, -0.6332,  1.2948]]]])
# tensor([[[[ 0.5851, -1.1560, -0.1434, -0.1947],
#           [ 1.4903, -0.7005,  0.1806,  1.3615]],

#          [[-0.7205, -2.2148, -0.6837,  0.5164],
#           [ 0.5588,  0.7918, -0.1847, -0.7318]],

#          [[-1.1057,  0.1437,  0.5836,  1.3482],
#           [-0.8137,  0.8200, -0.6332,  1.2948]]]])
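
Why transpose and view give different results for the same target shape comes down to strides. The sketch below uses torch.arange instead of random values so the layout is easy to read: transpose permutes the strides without moving data, while view reinterprets the same memory in row-major order.

```python
import torch

a = torch.arange(24).view(1, 2, 3, 4)
b = a.transpose(1, 2)      # same storage, permuted strides
c = a.view(1, 3, 2, 4)     # same storage, read in the original row-major order

print(a.stride())          # (24, 12, 4, 1)
print(b.stride())          # (24, 4, 12, 1)
print(b.is_contiguous())   # False
print(c.is_contiguous())   # True

# view() needs a compatible memory layout, so flattening b with view fails;
# reshape() falls back to copying the data when necessary.
try:
    b.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True
flat = b.reshape(-1)
print(view_failed, flat[:8].tolist())  # True [0, 1, 2, 3, 12, 13, 14, 15]
```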

A simple example: building a two-layer network

1. Import the required packages

import torch
import torch.nn as nn
import torch.nn.functional as F

2. Define the network

In PyTorch, a network is defined as a class

class two_layer_net(nn.Module):

    def __init__(self, input_size, hidden_size, output_size):
        super(two_layer_net , self).__init__()

        self.layer1 = nn.Linear( input_size, hidden_size , bias=True)
        self.layer2 = nn.Linear( hidden_size, output_size , bias=True)

    def forward(self, x):

        x = self.layer1(x)
        x = F.relu(x)
        x = self.layer2(x)
        p = F.softmax(x, dim=-1)  # softmax over the last (class) dimension; also correct for batched input

        return p

3. Create an instance

Create an instance that takes an input of size 2, transforms it into something of size 5, and then into something of size 3

$\begin{bmatrix} \times \\ \times \end{bmatrix} \longrightarrow \begin{bmatrix} \times \\ \times \\ \times \\ \times \\ \times \end{bmatrix} \longrightarrow \begin{bmatrix} \times \\ \times \\ \times \end{bmatrix}$

net= two_layer_net(2,5,3)
print(net)

# two_layer_net(
#   (layer1): Linear(in_features=2, out_features=5, bias=True)
#   (layer2): Linear(in_features=5, out_features=3, bias=True)
# )

4. Feed an input to the network

Now we build an input vector and feed it to the network:

x=torch.Tensor([1,1])
print(x)

#tensor([1., 1.])

p=net.forward(x)
print(p)

#tensor([0.2430, 0.2654, 0.4916], grad_fn=<SoftmaxBackward0>)

p=net(x)
print(p)

#tensor([0.2430, 0.2654, 0.4916], grad_fn=<SoftmaxBackward0>)
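
The example feeds a single 1-D vector, so there is only one dimension to normalize over. With a batch of inputs the choice of dim matters, as this small sketch with made-up logits shows: normalizing over the last dimension gives one probability distribution per sample, while dim=0 normalizes across the batch.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1.0, 2.0, 3.0],
                       [1.0, 1.0, 1.0]])   # batch of 2 samples, 3 classes

per_sample = F.softmax(logits, dim=-1)   # each row (sample) sums to 1
per_column = F.softmax(logits, dim=0)    # each column sums to 1 across the batch

print(per_sample.sum(dim=-1))
print(per_column.sum(dim=0))
```

Calling the module as net(x) rather than net.forward(x) is generally preferred, because the call also runs any hooks registered on the module.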

5. Inspect the network

We can access the first module as follows:

print(net.layer1)

#Linear(in_features=2, out_features=5, bias=True)

print(net.layer1.weight)

# Parameter containing:
# tensor([[-0.3412,  0.1270],
#         [-0.3673,  0.1629],
#         [ 0.1389, -0.5250],
#         [ 0.1177,  0.3012],
#         [ 0.2799, -0.0890]], requires_grad=True)

print(net.layer1.bias)

#Parameter containing:
#tensor([-0.5797, -0.1090,  0.2456, -0.2580,  0.2684], requires_grad=True)

list_of_param = list( net.parameters() )
print(list_of_param)

# [Parameter containing:
# tensor([[-0.3412,  0.1270],
#         [-0.3673,  0.1629],
#         [ 0.1389, -0.5250],
#         [ 0.1177,  0.3012],
#         [ 0.2799, -0.0890]], requires_grad=True), Parameter containing:
# tensor([-0.5797, -0.1090,  0.2456, -0.2580,  0.2684], requires_grad=True), Parameter containing:
# tensor([[ 0.2978, -0.2335,  0.0044,  0.1849,  0.0351],
#         [ 0.0374,  0.0558, -0.3516,  0.0351,  0.3097],
#         [ 0.4030,  0.2629,  0.0599,  0.2089, -0.2176]], requires_grad=True), Parameter containing:
# tensor([-0.3706, -0.3846,  0.4461], requires_grad=True)]
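
The parameter list can also be walked by name, which makes it easy to count the parameters. In this sketch an nn.Sequential with the same layer sizes (2, 5, 3) stands in for the two-layer network so that the snippet runs on its own.

```python
import torch.nn as nn

# Stand-in with the same layer sizes as the two-layer network above.
net = nn.Sequential(nn.Linear(2, 5), nn.ReLU(), nn.Linear(5, 3))

for name, p in net.named_parameters():
    print(name, tuple(p.shape))

total = sum(p.numel() for p in net.parameters())
print(total)  # 33 = (2*5 + 5) + (5*3 + 3)
```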

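How to implement your own network

Preliminaries

torch.nn.Module is the base class for all neural network modules.

Your models should also subclass this class.

When defining a subclass of nn.Module, override __init__ to create the predefined modules and functions you need and to specify how they are initialized. Then override forward to define how the output is computed from the input using those modules and functions.

Modules can also contain other modules, which lets you nest them in a tree structure. Submodules are assigned as regular attributes.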

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)
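# self.conv1 = nn.Conv2d(1, 20, 5): defines the first convolutional layer, conv1.
# It expects single-channel input (e.g. a grayscale image) and outputs 20 feature maps (i.e. it has 20 filters),
# each with a 5x5 kernel.
# self.conv2 = nn.Conv2d(20, 20, 5): defines the second convolutional layer, conv2.
# It takes the 20 feature maps produced by conv1 as input and outputs another 20 feature maps, again with 5x5 kernels.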

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))
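# x = F.relu(self.conv1(x)): the data x first passes through the first convolutional layer, conv1, and then through the ReLU activation.
# ReLU sets all negative values to 0, which introduces non-linearity and increases the model's capacity to learn.
# return F.relu(self.conv2(x)): the ReLU output then passes through the second convolutional layer, conv2,
# and through ReLU once more. The return value is the data after two convolutions and two non-linear activations.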

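Your module should be composed of other PyTorch modules (such as nn.Conv2d) and predefined functions (such as F.relu and other activation functions).
PyTorch's autograd tracks your computation, automatically saves the activations needed for backpropagation, and computes the gradients when you call loss.backward().
Sometimes, however, you need a function that is not defined in torch.nn.functional, or one that is not differentiable. In that case you need another way to define the computation: torch.autograd.Function.

Unlike torch.nn.Module, torch.autograd.Function has no __init__ function. Instead you override its forward and backward functions. Because autograd does not derive the backward pass for you in this case, you have to handle the saved activations yourself. Fortunately, ctx helps here: save intermediate results in ctx during the forward pass and they are easy to retrieve during the backward pass. Note that the number of inputs to forward must equal the number of outputs of backward, and vice versa: the number of outputs of forward equals the number of gradient arguments passed to backward.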

from torch.autograd import Function
# Subclass Function
class LinearFunction(Function):

    # Note that both forward and backward are @staticmethods
    @staticmethod
    # bias is an optional argument
    def forward(ctx, input, weight, bias=None):
        ctx.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)
        return output

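# The forward method
# forward is a static method (it uses the @staticmethod decorator).
# It receives the input tensor input, the weights weight, and an optional bias.
# ctx is a context object used to pass information from the forward pass to the backward pass.
# ctx.save_for_backward saves input, weight, and bias, which may be needed in the backward pass.
# The output tensor is computed by multiplying input with the transposed weight matrix.
# If a bias is provided, it is added to the output.
# The function returns the computed output tensor.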

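    # This function has only one output, so it receives only one gradient
    @staticmethod
    def backward(ctx, grad_output):
        # This is a very convenient pattern: at the top of backward,
        # unpack saved_tensors and initialize all gradients w.r.t. the inputs to
        # None. Because extra trailing Nones are ignored, the return statement
        # stays simple even when the function has optional inputs.
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_weight = grad_bias = None

        # These needs_input_grad checks are optional and only
        # improve efficiency. If you want simpler code, you can
        # skip them. Returning a gradient for an input that does not
        # need one is not an error.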
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.mm(weight)
        if ctx.needs_input_grad[1]:
            grad_weight = grad_output.t().mm(input)
        if bias is not None and ctx.needs_input_grad[2]:
            grad_bias = grad_output.sum(0)
        return grad_input, grad_weight, grad_bias
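# backward is also a static method. It receives grad_output, the gradient with respect to the output tensor.
# ctx.saved_tensors restores the tensors saved in the forward pass: input, weight, and bias.
# The gradient tensors grad_input, grad_weight, and grad_bias are initialized to None,
# which covers the cases where a gradient does not need to be computed.
# Each input is checked for whether it needs a gradient (ctx.needs_input_grad). This step is optional but improves efficiency,
# because gradients are computed only for the inputs that need them.
# If input needs a gradient, grad_input is computed as the matrix product of grad_output and weight.
# If weight needs a gradient, grad_weight is computed as the matrix product of the transposed grad_output and input.
# If bias is not None and needs a gradient, grad_bias is computed by summing grad_output over the batch dimension.
# The gradients grad_input, grad_weight, and grad_bias are returned.
# Their order matches the order of the input arguments of forward.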
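
A custom Function is called through its apply method, and torch.autograd.gradcheck can compare the hand-written backward against numerical gradients. The sketch below repeats a compact version of the function (without the needs_input_grad checks) so that it runs on its own; the tensor sizes are arbitrary.

```python
import torch
from torch.autograd import Function, gradcheck


class LinearFunction(Function):
    """Compact copy of the function above, repeated so this sketch is self-contained."""

    @staticmethod
    def forward(ctx, input, weight, bias=None):
        ctx.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)
        return output

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_output.mm(weight)
        grad_weight = grad_output.t().mm(input)
        grad_bias = grad_output.sum(0) if bias is not None else None
        return grad_input, grad_weight, grad_bias


# gradcheck compares analytical and numerical gradients; it expects float64 inputs.
x = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
w = torch.randn(2, 3, dtype=torch.double, requires_grad=True)
b = torch.randn(2, dtype=torch.double, requires_grad=True)

# Custom Functions are used via .apply, not by instantiating the class.
out = LinearFunction.apply(x, w, b)
print(out.shape)  # torch.Size([4, 2])

ok = gradcheck(LinearFunction.apply, (x, w, b), eps=1e-6, atol=1e-4)
print(ok)  # True
```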

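In most cases, however, PyTorch's predefined functions are all we need. So torch.autograd.Function is only used when you need to manipulate gradients directly, or when autograd cannot easily be used, for example because of complex communication in distributed training.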

Basic example: training a network on MNIST

import torch
import torchvision
from torch.utils.data import DataLoader

Define the training parameters

n_epochs = 2
batch_size_train = 64
batch_size_test = 1000
learning_rate = 0.01
momentum = 0.5
log_interval = 10
random_seed = 1
torch.manual_seed(random_seed)
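# n_epochs: the total number of training epochs. In one epoch every training sample (the whole training set)
# is used once. It is set to 2 here, so the training set is traversed twice.

# batch_size_train: the training batch size, i.e. the number of samples used for each forward pass, backward pass, and weight update.
# It is set to 64 here, so each training step uses a batch of 64 randomly selected samples.

# batch_size_test: the test batch size, i.e. the number of samples processed at once when evaluating the model.
# It is set to 1000 here, so evaluation processes 1000 samples at a time.

# learning_rate: the learning rate, a hyperparameter that controls the size of each weight update. Larger values take bigger steps,
# which can speed up convergence but can also overshoot the minimum; smaller values converge more slowly but help settle on a more precise minimum.
# It is set to 0.01 here.

# momentum: the momentum, which accelerates SGD (stochastic gradient descent) along consistent directions and damps oscillations.
# It is set to 0.5 here, meaning that 50% of the previous update vector is carried over into the next update.

# log_interval: the logging interval. A log message is printed once every log_interval batches.
# It is set to 10 here, so the current training status is printed every 10 batches.

# random_seed: the random seed, used to initialize the random number generator so that experiments are reproducible.
# With the same seed, the weight initialization and the batching of the dataset are the same in every run.

# torch.manual_seed(random_seed): sets the seed of the random number generator so that
# all operations that depend on random numbers (data shuffling, weight initialization, etc.) are repeatable. This matters a great deal for reproducibility.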
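
To see how the learning rate and momentum interact, here is a tiny worked example on a single parameter with loss p^2. The values lr=0.1 and the starting point 1.0 are picked only to make the arithmetic easy; the update rule in the comments is the convention torch.optim.SGD uses.

```python
import torch
import torch.optim as optim

p = torch.tensor(1.0, requires_grad=True)
opt = optim.SGD([p], lr=0.1, momentum=0.5)

for step in range(2):
    opt.zero_grad()
    loss = p ** 2          # the gradient is 2 * p
    loss.backward()
    opt.step()
    print(step, round(p.item(), 4))

# By hand (buf = momentum * buf + grad; p = p - lr * buf):
# step 0: grad = 2.0, buf = 2.0,                    p = 1.0 - 0.1 * 2.0 = 0.8
# step 1: grad = 1.6, buf = 0.5 * 2.0 + 1.6 = 2.6,  p = 0.8 - 0.1 * 2.6 = 0.54
```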

Load the data

train_loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./data/', train=True, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize(
                                       (0.1307,), (0.3081,))
                               ])),
    batch_size=batch_size_train, shuffle=True)

test_loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./data/', train=False, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize(
                                       (0.1307,), (0.3081,))
                               ])),
    batch_size=batch_size_test, shuffle=True)
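# torch.utils.data.DataLoader
# DataLoader wraps a dataset in an iterable and supports automatic batching, sampling,
# shuffling, and multi-process data loading. Here it creates the loaders for the training and test data.
# torchvision.datasets.MNIST
# torchvision.datasets.MNIST is the class that loads the MNIST dataset. MNIST contains 70,000
# grayscale images of handwritten digits, each 28x28 pixels, and is widely used for introductory study and testing in image processing and machine learning.
# The argument './data/' is the local directory where the downloaded data is stored.
# train=True and train=False select the training set and the test set, respectively.
# download=True downloads the data from the internet if it is not already available locally.
# transform
# transform=torchvision.transforms.Compose([...]) defines a composition of transforms.
# Here it first converts the image to a PyTorch tensor and then normalizes it.
# torchvision.transforms.ToTensor() converts a PIL image or NumPy ndarray to a FloatTensor
# and scales the pixel values to the range [0., 1.].
# torchvision.transforms.Normalize((0.1307,), (0.3081,)) normalizes the tensor image:
# 0.1307 is the global mean pixel value of the MNIST dataset and 0.3081 is the standard deviation. Normalization helps stabilize training
# and speeds up convergence.
# batch_size and shuffle
# batch_size=batch_size_train and batch_size=batch_size_test set the size of each batch.
# For training data a smaller batch size is usually chosen to improve the stability and efficiency of training; for test data,
# no backpropagation is needed, so a larger batch size can be used to finish evaluation faster.
# shuffle=True reshuffles the data at the start of every epoch. This helps the model learn features that generalize better,
# because it cannot rely on the data arriving in a particular order.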
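
The batching behaviour can be tried without downloading MNIST. In this sketch, ten random tensors stand in for the images (an assumption for illustration only), and the normalization is written out as the formula that Normalize applies, (x - mean) / std.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 fake grayscale 28x28 "images" with labels 0..9 (stand-ins for MNIST samples).
images = torch.rand(10, 1, 28, 28)
labels = torch.arange(10)

# The same formula that transforms.Normalize applies per channel: (x - mean) / std.
normalized = (images - 0.1307) / 0.3081

loader = DataLoader(TensorDataset(normalized, labels), batch_size=4, shuffle=True)

batch_shapes = []
for data, target in loader:
    batch_shapes.append(tuple(data.shape))
    print(data.shape, target.tolist())

print(len(loader))  # 3 batches: 4 + 4 + 2 samples
```

The last batch is smaller because 10 is not a multiple of 4; passing drop_last=True to DataLoader would discard it instead.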

examples = enumerate(test_loader)
batch_idx, (example_data, example_targets) = next(examples)
print(example_targets)
print(example_data.shape)
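# enumerate(test_loader): creates an enumerator that iterates over all batches in test_loader.
# test_loader is a DataLoader object that loads the MNIST test set in batches of the configured size
# (batch_size_test in the code above).
# enumerate attaches an index (starting from 0) to each batch.

# batch_idx, (example_data, example_targets) = next(examples):
# uses next() to fetch the first element from the enumerator, i.e. the first batch.
# The element has two parts: batch_idx is the index of the batch (0 here, because it is the first batch), and
# (example_data, example_targets) is a tuple in which example_data is the tensor of image data and
# example_targets is the tensor of the corresponding labels.

# example_targets: a tensor holding the target labels of all images in the current batch.
# Printing it shows the digit label of each image in the batch.
# example_data.shape: the shape of the example_data tensor. For MNIST,
# example_data has shape (batch_size, channels, height, width).
# With batch_size_test set to 1000 (see above), channels equal to 1 (MNIST images are grayscale), and
# height and width both 28 (the MNIST image size), the expected shape is (1000, 1, 28, 28).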


#tensor([3, 9, 4, 9, 9, 0, 8, 3, 1, 2, 3, 9, 1, 3, 6, 6, 4, 4, 9, 7, 3, 7, 6, 3,
#         4, 8, 4, 6, 8, 6, 1, 1, 1, 1, 0, 0, 1, 8, 4, 0, 1, 2, 7, 9, 3, 2, 3, 8,
#         3, 2, 0, 4, 6, 6, 5, 5, 3, 0, 3, 7, 2, 4, 1, 6, 7, 5, 4, 1, 0, 8, 5, 9,
#         0, 9, 6, 1, 8, 0, 9, 3, 5, 7, 8, 5, 6, 4, 2, 2, 2, 1, 8, 4, 4, 2, 1, 5,

# ...
#         9, 4, 8, 6, 8, 1, 7, 2, 6, 9, 8, 2, 7, 2, 2, 4, 8, 9, 3, 6, 2, 2, 7, 8,
#         5, 7, 2, 0, 0, 2, 3, 1, 1, 5, 8, 5, 3, 9, 7, 6])
# torch.Size([1000, 1, 28, 28])

Inspect the samples

import matplotlib.pyplot as plt
fig = plt.figure()
for i in range(6):
  plt.subplot(2,3,i+1)
  plt.tight_layout()
  plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
  plt.title("Ground Truth: {}".format(example_targets[i]))
  plt.xticks([])
  plt.yticks([])
plt.show()

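# Import matplotlib.pyplot
# import matplotlib.pyplot as plt imports the pyplot module of matplotlib,
# a tool for plotting and data visualization.
# Create a new figure
# fig = plt.figure() creates a new figure in which the images are drawn.
# Loop over six images
# for i in range(6): loops six times; each iteration processes and displays one image.
# Set up the subplot
# plt.subplot(2,3,i+1) creates a subplot in a grid of 2 rows and 3 columns. i+1 is the position of the current subplot (numbering starts at 1),
# so the images are arranged in 2 rows and 3 columns.
# Adjust the subplot layout automatically
# plt.tight_layout() adjusts the subplot parameters so that there is enough space between subplots
# and between the subplots and the figure edges, which avoids overlap.
# Show the image
# plt.imshow(example_data[i][0], cmap='gray', interpolation='none') shows the i-th image.
# example_data[i][0] selects channel 0 of the i-th image (a grayscale image has only one channel).
# cmap='gray' selects the grayscale colormap so the image is shown in grayscale.
# interpolation='none' disables interpolation, which shows each pixel more sharply.
# Add a title
# plt.title("Ground Truth: {}".format(example_targets[i])) sets the title of each subplot
# to the true label of the image.
# Remove the axis ticks
# plt.xticks([]) and plt.yticks([]) remove the x-axis and y-axis ticks, which makes the images cleaner,
# since the axis values carry no information here.
# Show the figure
# plt.show() displays the final figure, which contains six subplots, each showing a handwritten digit and its true label.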

Define the network with torch.nn.Module

Build the network

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)
    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)  # log-probabilities over the class dimension

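1. Class definition: class Net(nn.Module) defines a class named Net that inherits from nn.Module. In PyTorch, nn.Module is the base class of all neural network modules and provides basic functionality such as parameter management.

2. Initialization method: def __init__(self): is the class constructor, which creates and initializes the layers of the network.

self.conv1 = nn.Conv2d(1, 10, kernel_size=5): defines the first convolutional layer, with 1 input channel (e.g. a grayscale image), 10 output channels, and a 5x5 kernel.
self.conv2 = nn.Conv2d(10, 20, kernel_size=5): defines the second convolutional layer, with 10 input channels (the output of the previous layer), 20 output channels, and a 5x5 kernel.
self.conv2_drop = nn.Dropout2d(): defines a 2D dropout layer, which reduces overfitting by randomly zeroing entire feature maps (channels) during training.
self.fc1 = nn.Linear(320, 50): defines the first fully connected layer, with 320 input features and 50 output features. The 320 comes from the output of the second convolution block: 20 channels of 4x4 feature maps (28 -> 24 after conv1, 12 after pooling, 8 after conv2, 4 after pooling), and 20 x 4 x 4 = 320.
self.fc2 = nn.Linear(50, 10): defines the second fully connected layer, with 50 input features and 10 output features, one per class in the classification task.

3. Forward method: def forward(self, x): defines the path the data takes through the network.

x = F.relu(F.max_pool2d(self.conv1(x), 2)): applies the first convolutional layer, then max pooling (with a 2x2 window), then the ReLU activation.
x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2)): applies the second convolutional layer, then dropout, then max pooling and ReLU.
x = x.view(-1, 320): flattens the feature maps of each sample into a 320-element vector so that they can be fed to the fully connected layers.
x = F.relu(self.fc1(x)): passes through the first fully connected layer, followed by ReLU.
x = F.dropout(x, training=self.training): applies dropout; the self.training flag makes sure dropout is active in training mode and disabled in evaluation mode.
x = self.fc2(x): passes through the second fully connected layer.
The final log_softmax call applies the log-softmax function over the class dimension, as is usual for the output layer in multi-class classification, and returns the log-probability of each class.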
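
The input size of 320 for the first fully connected layer can be checked by tracing shapes through the two convolution blocks. This sketch rebuilds only the two convolutional layers and uses a random batch shaped like MNIST; dropout and ReLU are left out because they do not change shapes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 10, kernel_size=5)
conv2 = nn.Conv2d(10, 20, kernel_size=5)

x = torch.randn(64, 1, 28, 28)       # a batch shaped like MNIST images
x = F.max_pool2d(conv1(x), 2)        # 28 -> 24 (5x5 conv) -> 12 (2x2 pool)
print(x.shape)                       # torch.Size([64, 10, 12, 12])
x = F.max_pool2d(conv2(x), 2)        # 12 -> 8 (5x5 conv) -> 4 (2x2 pool)
print(x.shape)                       # torch.Size([64, 20, 4, 4])
flat = x.view(-1, 320)               # 20 * 4 * 4 = 320 features per sample
print(flat.shape)                    # torch.Size([64, 320])
```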

Move to the GPU and define the optimizer

network = Net().cuda()
optimizer = optim.SGD(network.parameters(), lr=learning_rate,
                      momentum=momentum)

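network = Net().cuda() creates the network and moves its parameters to the GPU.

optimizer = optim.SGD(network.parameters(), lr=learning_rate, momentum=momentum): this line defines the optimizer, which adjusts the network parameters to minimize the loss function. The optimizer used here is stochastic gradient descent (SGD).

network.parameters() returns an iterator over all trainable parameters of the network. lr=learning_rate sets the learning rate, the hyperparameter that controls the step size of each parameter update. momentum=momentum sets the momentum, which helps SGD move faster along consistent directions and damps oscillations.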

train_losses = []
train_counter = []
test_losses = []
test_counter = [i*len(train_loader.dataset) for i in range(n_epochs + 1)]

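train_losses: an empty list that will store the loss values recorded during training. The loss is an important indicator of model performance, and this list keeps the loss logged at each logging step.
train_counter: also an empty list; it records how many training examples had been seen when each loss value was logged, which shows how the loss evolves over the course of training.
test_losses: another empty list, for the loss values measured on the test set. Besides the training loss, we also need to monitor how the model performs on data it has not seen, which tells us how well it generalizes.
test_counter: built with a list comprehension; it holds the total number of training examples seen at each evaluation point. i*len(train_loader.dataset) is the number of training examples processed after i epochs (i = 0 is the evaluation before any training).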

Training loop

def train(epoch):
  network.train()
  for batch_idx, (data, target) in enumerate(train_loader):
    data = data.cuda()
    target = target.cuda()
    optimizer.zero_grad()
    output = network(data)
    loss = F.nll_loss(output, target)
    loss.backward()
    optimizer.step()
    if batch_idx % log_interval == 0:
      print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
        epoch, batch_idx * len(data), len(train_loader.dataset),
        100. * batch_idx / len(train_loader), loss.item()))
      train_losses.append(loss.item())
      train_counter.append(
        (batch_idx*batch_size_train) + ((epoch-1)*len(train_loader.dataset)))

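network.train(): puts the network in training mode. This matters for layers such as Dropout and BatchNorm, which behave differently during training and testing.
for batch_idx, (data, target) in enumerate(train_loader): the start of the training loop, which iterates over the batches provided by train_loader. Each batch consists of a set of inputs (data) and the corresponding target labels (target).
data = data.cuda(), target = target.cuda(): moves the data and the target labels to the GPU so that the computation is accelerated with CUDA.
optimizer.zero_grad(): clears the accumulated gradients before the next update. This is needed because gradients are accumulated by default, which is convenient for architectures such as RNNs.
output = network(data): passes the input data through the network to obtain its output.
loss = F.nll_loss(output, target): computes the loss. The negative log-likelihood loss used here is suited to multi-class classification and expects log-probabilities as input.
loss.backward(): backpropagates the error and computes the gradients.
optimizer.step(): updates the network parameters with the optimizer, which was defined earlier as stochastic gradient descent (SGD).
if batch_idx % log_interval == 0: the logging condition, which prints the training status periodically. If batch_idx is a multiple of log_interval, the print statement runs and shows the current epoch, the number of samples processed, the total number of samples, and the loss.
train_losses.append(loss.item()): records the loss of the current batch.
train_counter.append(...): records the total number of training examples processed so far, which is used later to plot the training curve.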
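
The reason for clearing gradients at every step can be seen on a single parameter, as in this sketch: calling backward twice without resetting adds the new gradient to the old one.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(3 * w).backward()
g1 = w.grad.item()
print(g1)            # 3.0

(3 * w).backward()   # no reset in between: the new gradient is added to the old one
g2 = w.grad.item()
print(g2)            # 6.0

# Reset manually. optimizer.zero_grad() does this for every parameter it manages
# (recent PyTorch versions set the gradients to None instead of zero by default).
w.grad.zero_()
g3 = w.grad.item()
print(g3)            # 0.0
```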

Test function and results

def test():
  network.eval()
  test_loss = 0
  correct = 0
  with torch.no_grad():
    for data, target in test_loader:
      data = data.cuda()
      target = target.cuda()
      output = network(data)
      test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum over the batch; averaged after the loop
      pred = output.argmax(dim=1, keepdim=True)  # index of the highest log-probability
      correct += pred.eq(target.view_as(pred)).sum().item()
  test_loss /= len(test_loader.dataset)
  test_losses.append(test_loss)
  print('\nTest set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
    test_loss, correct, len(test_loader.dataset),
    100. * correct / len(test_loader.dataset)))

test()
for epoch in range(1, n_epochs + 1):
    train(epoch)
    test()
import matplotlib.pyplot as plt
fig = plt.figure()
plt.plot(train_counter, train_losses, color='blue')
plt.scatter(test_counter, test_losses, color='red')
plt.legend(['Train Loss', 'Test Loss'], loc='upper right')
plt.xlabel('number of training examples seen')
plt.ylabel('negative log likelihood loss')
plt.show()
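
The bookkeeping inside test() can be checked on a tiny hand-made batch. The logits and targets below are made up for illustration: summing the per-sample losses and dividing by the number of samples gives the same value as the default mean reduction, and accuracy is the number of argmax predictions that match the targets.

```python
import torch
import torch.nn.functional as F

# Log-probabilities for 3 samples and 4 classes (made-up numbers).
output = torch.log_softmax(torch.tensor([[2.0, 0.1, 0.1, 0.1],
                                         [0.1, 0.1, 3.0, 0.1],
                                         [0.1, 1.5, 0.1, 0.1]]), dim=1)
target = torch.tensor([0, 2, 3])

summed = F.nll_loss(output, target, reduction='sum').item()
mean = F.nll_loss(output, target).item()    # default reduction is 'mean'

pred = output.argmax(dim=1)
correct = (pred == target).sum().item()

print(pred.tolist(), correct)               # [0, 2, 1] 2
print(abs(summed / len(target) - mean) < 1e-6)
```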
