Tip: I think most people can skip straight to the end of this article. What do you think?
Tensors
Python already has frameworks like NumPy, but NumPy cannot use the GPU. PyTorch's two main features fill exactly that gap: N-dimensional Tensors and automatic differentiation.
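As a minimal illustration of the GPU point (a sketch written in the same legacy torch.FloatTensor / torch.cuda.FloatTensor dtype style as the code below; the shapes are arbitrary), switching devices is a one-line change:

import torch

dtype = torch.FloatTensor        # CPU tensor type
# dtype = torch.cuda.FloatTensor # uncomment: the same code now runs on the GPU

a = torch.randn(3, 4).type(dtype)
b = torch.randn(4, 5).type(dtype)
print(a.mm(b).size())            # torch.Size([3, 5]), computed on the chosen device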
Here is a two-layer network implemented with raw Tensors only:
# -*- coding: utf-8 -*-
import torch
dtype = torch.FloatTensor
# dtype = torch.cuda.FloatTensor # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random input and output data
x = torch.randn(N, D_in).type(dtype)
y = torch.randn(N, D_out).type(dtype)
# Randomly initialize weights
w1 = torch.randn(D_in, H).type(dtype)
w2 = torch.randn(H, D_out).type(dtype)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum()
    print(t, loss)

    # Manually implement the backward pass:
    # backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
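For reference, the hand-written backward pass above is nothing more than the chain rule applied to loss = Σ(y_pred − y)², and it matches the grad_* variables line by line (⊙ is the elementwise product, 1[h > 0] the ReLU mask):

∂loss/∂y_pred = 2 (y_pred − y)
∂loss/∂w2 = h_reluᵀ · ∂loss/∂y_pred
∂loss/∂h_relu = ∂loss/∂y_pred · w2ᵀ
∂loss/∂h = ∂loss/∂h_relu ⊙ 1[h > 0]
∂loss/∂w1 = xᵀ · ∂loss/∂h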
Autograd
How do we make the differentiation automatic? This is where stage two comes in. We wrap every variable in the network in a Variable object. A Variable represents a node in a computational graph; for a Variable x, x.data is the underlying Tensor and x.grad is another Variable holding the gradient of some scalar value with respect to x.
Note that PyTorch Variables have almost exactly the same API as PyTorch Tensors; the only difference is that operations on Variables build a computational graph, which is what makes automatic differentiation possible. So, as Newton might put it: want automatic derivatives? Just wrap everything in a Variable.
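Before the full network, here is a minimal sketch of the Variable mechanics, written against the same legacy autograd API as the code below (the values are illustrative):

import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)
y = (3 * x).sum()   # y is also a Variable; the graph from x to y is recorded
y.backward()        # autograd computes dy/dx and stores it in x.grad

print(x.data)       # the underlying Tensor
print(x.grad)       # gradient of y w.r.t. x: every entry is 3

With that in hand, here is the same two-layer network rewritten with Variables: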
# -*- coding: utf-8 -*-
import torch
from torch.autograd import Variable
dtype = torch.FloatTensor
# dtype = torch.cuda.FloatTensor # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold input and outputs, and wrap them in Variables.
# Setting requires_grad=False indicates that we do not need to compute gradients
# with respect to these Variables during the backward pass.
# Gradients with respect to the inputs are of no use here, hence requires_grad=False
x = Variable(torch.randn(N, D_in).type(dtype), requires_grad=False)
y = Variable(torch.randn(N, D_out).type(dtype), requires_grad=False)
# Create random Tensors for weights, and wrap them in Variables.
# Setting requires_grad=True indicates that we want to compute gradients with
# respect to these Variables during the backward pass.
w1 = Variable(torch.randn(D_in, H).type(dtype), requires_grad=True)
w2 = Variable(torch.randn(H, D_out).type(dtype), requires_grad=True)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y using operations on Variables; these
    # are exactly the same operations we used to compute the forward pass using
    # Tensors, but we do not need to keep references to intermediate values since
    # we are not implementing the backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute and print loss using operations on Variables.
    # loss is a Variable of shape (1,); loss.data[0] is the scalar loss value.
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.data[0])

    # Use autograd to compute the backward pass. After this call, w1.grad and
    # w2.grad are Variables holding the gradient of the loss w.r.t. w1 and w2.
    loss.backward()

    # Update weights using gradient descent; operate on .data so the update
    # itself stays out of the computational graph.
    w1.data -= learning_rate * w1.grad.data
    w2.data -= learning_rate * w2.grad.data

    # Manually zero the gradients after updating weights; .grad accumulates.
    w1.grad.data.zero_()
    w2.grad.data.zero_()
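One caveat if you are on a newer PyTorch: as of version 0.4, Variable and Tensor were merged, so plain Tensors can track gradients themselves and the wrapping step disappears. A minimal sketch of the same idea without the wrapper:

import torch

w = torch.randn(3, 3, requires_grad=True)  # a plain Tensor with gradient tracking on
loss = (w * w).sum()
loss.backward()
print(w.grad)                              # d(loss)/dw = 2 * w; no Variable needed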