PyTorch Learning Series: PyTorch nn and PyTorch optim

sereasuesue

Article link: https://blog.csdn.net/sereasuesue/article/details/108991372

PyTorch: nn

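When building neural networks, we often think of arranging the computation into layers, some of which have learnable parameters that will be optimized during learning.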

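In TensorFlow, packages such as Keras, TensorFlow-Slim, and TFLearn provide higher-level abstractions over raw computational graphs that are useful for building neural networks.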

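In PyTorch, the nn package serves the same purpose. The nn package defines a set of Modules, which are roughly equivalent to neural network layers. A Module receives input Tensors and computes output Tensors, but it may also hold internal state, such as Tensors containing learnable parameters. The nn package also defines a set of useful loss functions that are commonly used when training neural networks.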
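As a minimal sketch of the points above (the layer sizes and variable names are illustrative assumptions, not taken from the original example), the following inspects the learnable state held by a single Linear module and checks that MSELoss with reduction='sum' matches a hand-written sum of squared differences:

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

# A Linear module keeps its learnable state (weight and bias) as Parameters.
layer = torch.nn.Linear(4, 3)
param_shapes = {name: tuple(p.shape) for name, p in layer.named_parameters()}
print(param_shapes)  # {'weight': (3, 4), 'bias': (3,)}

# A Module maps input Tensors to output Tensors.
inputs = torch.randn(2, 4)
outputs = layer(inputs)
print(outputs.shape)  # torch.Size([2, 3])

# Loss functions also live in nn. With reduction='sum', MSELoss is the
# sum of squared differences between prediction and target.
targets = torch.randn(2, 3)
loss = torch.nn.MSELoss(reduction='sum')(outputs, targets)
manual_loss = ((outputs - targets) ** 2).sum()
print(torch.allclose(loss, manual_loss))  # True
```

Note that the weight of Linear(4, 3) has shape (3, 4): PyTorch stores it as (out_features, in_features).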

In this example we use the nn package to implement our two-layer network:

# -*- coding: utf-8 -*-
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

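# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.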
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
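    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.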
    y_pred = model(x)

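    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the loss.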
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

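    # Backward pass: compute the gradient of the loss with respect to all the
    # learnable parameters of the model. Internally, the parameters of each Module
    # are stored in Tensors with requires_grad=True, so this call will compute
    # gradients for all learnable parameters in the model.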
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
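The comment on nn.Sequential says that it applies its contained modules in order. A small sketch, using smaller layer sizes than the example above purely for illustration, checks this by comparing the output of the container with the result of calling each layer by hand:

```python
import torch

torch.manual_seed(0)

# Same structure as the two-layer network above, with smaller (assumed) sizes.
model = torch.nn.Sequential(
    torch.nn.Linear(8, 4),
    torch.nn.ReLU(),
    torch.nn.Linear(4, 2),
)

x = torch.randn(3, 8)

# Calling the container runs every contained module in order.
y_seq = model(x)

# Doing the same thing by hand: iterate over the modules and chain them.
h = x
for layer in model:
    h = layer(h)

print(torch.allclose(y_seq, h))  # True
```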

PyTorch: optim

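Up to this point we have updated the weights of our models by manually mutating the Tensors holding learnable parameters (using torch.no_grad() or .data to avoid tracking history in autograd). This is not a huge burden for simple optimization algorithms like stochastic gradient descent, but in practice we often train neural networks using more sophisticated optimizers such as AdaGrad, RMSProp, and Adam.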

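The optim package in PyTorch abstracts the idea of an optimization algorithm and provides implementations of commonly used optimization algorithms.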
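To make the abstraction concrete, here is a hedged sketch (model sizes, data, and learning rate are illustrative assumptions) that takes one gradient-descent step in two ways: the manual update under torch.no_grad(), and torch.optim.SGD with default settings (no momentum, no weight decay). Starting from identical weights, both should arrive at the same parameters:

```python
import copy
import torch

torch.manual_seed(0)

model_manual = torch.nn.Linear(5, 2)
model_optim = copy.deepcopy(model_manual)  # identical starting weights

x = torch.randn(8, 5)
y = torch.randn(8, 2)
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-2

# Manual update: mutate each parameter in place, outside of autograd tracking.
loss_fn(model_manual(x), y).backward()
with torch.no_grad():
    for param in model_manual.parameters():
        param -= learning_rate * param.grad

# Same step through the optim package.
optimizer = torch.optim.SGD(model_optim.parameters(), lr=learning_rate)
optimizer.zero_grad()
loss_fn(model_optim(x), y).backward()
optimizer.step()

same = all(
    torch.allclose(p_manual, p_optim)
    for p_manual, p_optim in zip(model_manual.parameters(),
                                 model_optim.parameters())
)
print(same)  # True
```

With the optimizer object in place, switching to a different algorithm only requires changing the constructor call; the zero_grad / backward / step sequence stays the same.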

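In this example we will use the nn package to define our model as before, but we will optimize the model using the Adam algorithm provided by the optim package: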

# -*- coding: utf-8 -*-
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

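# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.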
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

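# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we use Adam; the optim package contains many other
# optimization algorithms. The first argument to the Adam constructor tells the
# optimizer which Tensors it should update.
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)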
for t in range(500):
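    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.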
    y_pred = model(x)

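    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the loss.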
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
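        print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (the learnable weights of the
    # model). This is needed because, by default, gradients are accumulated in
    # buffers (i.e. not overwritten) whenever .backward() is called. See the
    # documentation of torch.autograd.backward for more details.
    optimizer.zero_grad()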

    # Backward pass: compute the gradient of the loss with respect to the model
    # parameters.
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its parameters.
    optimizer.step()
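The comment before optimizer.zero_grad() notes that gradients accumulate across calls to .backward(). A minimal sketch of that behaviour, using a single hypothetical tensor w instead of a full model:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# The gradient of sum(3 * w) with respect to w is 3 for every element.
(3 * w).sum().backward()
first = w.grad.clone()
print(first)  # tensor([3., 3.])

# A second backward pass without zeroing adds to the existing gradient.
(3 * w).sum().backward()
accumulated = w.grad.clone()
print(accumulated)  # tensor([6., 6.])

# Zeroing through an optimizer resets the buffer, so the next backward pass
# yields the gradient of that pass alone.
optimizer = torch.optim.SGD([w], lr=0.1)
optimizer.zero_grad()
(3 * w).sum().backward()
after_reset = w.grad.clone()
print(after_reset)  # tensor([3., 3.])
```

Skipping zero_grad() in the training loop would therefore make every update use the sum of all gradients computed so far rather than the gradient of the current step.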

PyTorch: Custom nn Modules

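Sometimes you will want to specify models that are more complex than a sequence of existing Modules. In these cases you can define your own Modules by subclassing nn.Module and defining a forward method, which receives input Tensors and produces output Tensors using other modules or other autograd operations on Tensors.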

In this example we implement our two-layer network as a custom Module subclass:

# -*- coding: utf-8 -*-
import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred


# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
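Two details of the custom module are worth checking in isolation. The sketch below uses a smaller stand-in class (TinyNet and its sizes are hypothetical, chosen only for illustration) to show that sub-modules assigned as attributes in the constructor are registered automatically, which is why model.parameters() can be handed straight to the optimizer, and that clamp(min=0) computes the same values as ReLU:

```python
import torch


class TinyNet(torch.nn.Module):
    """Hypothetical stand-in with the same structure as TwoLayerNet."""

    def __init__(self, d_in, h, d_out):
        super().__init__()
        # Assigning Modules as attributes registers their parameters.
        self.linear1 = torch.nn.Linear(d_in, h)
        self.linear2 = torch.nn.Linear(h, d_out)

    def forward(self, x):
        return self.linear2(self.linear1(x).clamp(min=0))


torch.manual_seed(0)
model = TinyNet(6, 4, 2)

# Parameters of both Linear layers are visible through the parent module.
names = [name for name, _ in model.named_parameters()]
print(names)
# ['linear1.weight', 'linear1.bias', 'linear2.weight', 'linear2.bias']

# clamp(min=0) zeroes out negative entries, exactly like ReLU.
hidden = torch.randn(3, 4)
relu_matches = torch.equal(hidden.clamp(min=0), torch.relu(hidden))
print(relu_matches)  # True

# The module is called like a function; forward runs under the hood.
out = model(torch.randn(5, 6))
print(out.shape)  # torch.Size([5, 2])
```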