Getting started with PyTorch: installation + basic concepts + a first demo

 

Installing on Windows

First go to the official site https://pytorch.org/get-started/locally/ and pick the options that match your setup; the selector prints a pip command, and running that command downloads the right build.
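For reference, the command the selector printed for the setup discussed below (Windows, pip, CUDA 9.2, the torch 1.5.1 era) looked roughly like the following; treat the exact version tags as an illustration rather than something to copy blindly:

pip install torch==1.5.1+cu92 torchvision==0.6.1+cu92 -f https://download.pytorch.org/whl/torch_stable.html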

If the download is too slow, you can copy the URL from pip's output into a download tool, fetch the .whl file manually, and then install the package directly with pip install **.whl.

If you want CUDA acceleration, you also need to install the matching CUDA toolkit and a sufficiently new GPU driver. In my case, CUDA 9.2 plus a 4xx-or-newer driver, combined with torch 1.5.1, was what it took to get CUDA acceleration working.

import torch
print(torch.cuda.is_available()) # if this prints True, both torch and CUDA are set up correctly.
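A few more sanity checks can help pin down a version mismatch; these are standard PyTorch attributes, but the printed values will of course depend on your installation:

import torch
print(torch.__version__)                   # installed torch version
print(torch.version.cuda)                  # CUDA version torch was built against (None for CPU-only builds)
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # name of the first visible GPU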

Core concepts of torch

  • tensor 

The tensor concept is close to numpy's array, and PyTorch provides many tensor operations:

  • Construction + initialization + attribute (dtype, shape) conversion + slicing + arithmetic + CUDA operations
import torch
import numpy as np


# common constructors
x = torch.empty(5, 3, 2, dtype=torch.float)   # uninitialized
x1 = torch.zeros(2, 2, 2)
x2 = torch.rand(2, 2, 2)                      # uniform on [0, 1)
x3 = torch.randn(5, 3, 2)                     # standard normal
# create directly from data
x4 = torch.tensor([[5.5, 3], [5.5, 32]])
# create new tensors based on an existing one
x5 = x.new_tensor(range(10), dtype=torch.int16)  # same device as x; dtype overridden here
x5_new = x.type(torch.int8)   # cast to another dtype
x6 = torch.zeros_like(x)      # same shape (and dtype) as x
# a numpy array, converted to a tensor further below
y = np.zeros((4, 4), dtype=np.double)
# get the size
print(x.size())
# reshape
print(x.view(5, -1))
# arithmetic
A, B = x3[:, :, 0], x[:, :, 0] + 1  # slicing
C = A + B      # addition
C = A.add(B)
torch.add(A, B, out=C)  # write the result into an existing tensor
A.add_(B)               # in-place addition
# methods with a trailing underscore are in-place: x.copy_(y), x.t_() change x.
print(A * B)            # Hadamard product (element-wise multiplication)
B.transpose_(0, 1)      # in-place transpose
C = A.mm(B)             # ordinary matrix multiplication: A's column count must equal B's row count
# numpy bridge
x7 = torch.from_numpy(y)
print(x7)
x7.add_(1)   # both x7 and y change: the two share the same memory
print(x7)
print(y)
y2 = x7.numpy()  # the reverse direction shares memory too

# CUDA上面的tensor操作
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!

 

  • auto_grad

Automatic differentiation (gradient computation) is the core component of PyTorch; it is what makes PyTorch so convenient and easy to use (I am not sure whether other deep-learning frameworks are designed the same way). Differentiation follows the chain rule, so gradients are easy to track through the computation.

'''
torch.Tensor is the central class of the package.
 If you set its attribute .requires_grad as True, it starts to track all operations on it.
 When you finish your computation you can call .backward() and have all the gradients computed automatically.
 The gradient for this tensor will be accumulated into .grad attribute.
'''
import torch
#a = torch.tensor(range(1, 5), dtype=torch.double,requires_grad = True).view(2, -1)
a1 = torch.randn(2,2,requires_grad=True)
a = a1.view(1,-1)
a.retain_grad()
print(a,a.requires_grad)
b = a**2
b.retain_grad()  # without retain_grad(), .grad on this non-leaf tensor would not be kept
print(b, b.requires_grad,b.grad_fn)
c= b.sum()
print(c,c.grad_fn)
c.backward()
print(b.grad)
print(a.grad)

Often we do not want the autograd machinery to slow down computation, for example when running inference with a model after training is finished.

x = torch.randn(3, requires_grad=True)
print((x ** 2).requires_grad)       # True: the squaring op is tracked
with torch.no_grad():
    print((x ** 2).requires_grad)   # False: tracking is disabled inside the block

Custom autograd functions

Implement the two methods forward and backward and you have a custom autograd function. Functions like this are the building blocks of models.

import torch


class MyReLU(torch.autograd.Function):
    """
    We can implement our own custom autograd Functions by subclassing
    torch.autograd.Function and implementing the forward and backward passes
    which operate on Tensors.
    """

    @staticmethod
    def forward(ctx, input):
        """
        In the forward pass we receive a Tensor containing the input and return
        a Tensor containing the output. ctx is a context object that can be used
        to stash information for backward computation. You can cache arbitrary
        objects for use in the backward pass using the ctx.save_for_backward method.
        """
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        """
        In the backward pass we receive a Tensor containing the gradient of the loss
        with respect to the output, and we need to compute the gradient of the loss
        with respect to the input.
        """
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input
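A custom Function is used through its static .apply method rather than by instantiating the class. A minimal usage sketch (the variable names here are just illustrative):

relu = MyReLU.apply                      # .apply hooks the Function into autograd
x = torch.randn(4, requires_grad=True)
y = relu(x).sum()
y.backward()                             # calls MyReLU.backward under the hood
print(x.grad)                            # 1 where x > 0, 0 where x < 0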

Models

With a little neural-network background you will know that an ANN model is just a collection of functions and parameters, and training a model is the process of using an optimizer to adjust those parameters based on the results. Like Keras, PyTorch can also build models in a sequential way.

import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the
    # loss.
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

Custom models

Subclass torch.nn.Module and implement the constructor and the forward method. Note that everything used inside forward should be existing modules or other autograd-aware functions. Such a model can run inference, has its own parameter dictionary, and supports automatic differentiation, which means it can be trained.

import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred
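The custom module trains exactly like the Sequential model above. A minimal sketch, assuming the random x, y and the dimensions N, D_in, H, D_out from the earlier snippet are still in scope:

model = TwoLayerNet(D_in, H, D_out)
loss_fn = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
    y_pred = model(x)             # runs forward()
    loss = loss_fn(y_pred, y)
    optimizer.zero_grad()         # clear previously accumulated gradients
    loss.backward()               # autograd fills .grad for linear1/linear2 parameters
    optimizer.step()              # gradient-descent update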

Optimizers

The module defined above can update its parameters by hand, inside the training loop, like this:

# Zero the gradients before running the backward pass.
model.zero_grad()

# Backward pass: compute gradient of the loss with respect to all the learnable
# parameters of the model. Internally, the parameters of each Module are stored
# in Tensors with requires_grad=True, so this call will compute gradients for
# all learnable parameters in the model.
loss.backward()

# Update the weights using gradient descent. Each parameter is a Tensor, so
# we can access its gradients like we did before.
with torch.no_grad():
    for param in model.parameters():
        param -= learning_rate * param.grad

You can also use one of the more sophisticated optimizers from torch.optim:

learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  # create the optimizer

for t in range(500):  # iterate 500 times over the whole dataset
    y_pred = model(x)

    # compute the loss
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers( i.e, not overwritten) whenever .backward()
    # is called. Checkout docs of torch.autograd.backward for more details.
    optimizer.zero_grad() # zero the gradients before the backward pass so they do not accumulate (accumulation is the default, reportedly convenient for RNN training)

    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    loss.backward()  # backward pass

    # Calling the step function on an Optimizer makes an update to its
    # parameters
    optimizer.step()   # a single call to the optimizer's step function updates the parameters

https://blog.csdn.net/u011995719/article/details/88988420 — ten optimizers in PyTorch
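Swapping optimizers usually means changing only the construction line; a rough sketch (the hyperparameters here are illustrative, not tuned):

optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)   # SGD with momentum
# or
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)             # RMSprop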

Loss functions

The most critical part of the rules of the game is the loss function. torch provides many loss functions, and you can also define your own; the key requirement is that it supports automatic differentiation.
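As an illustration, a custom loss can simply be a Python function built from differentiable tensor operations; autograd then provides the backward pass for free (my_weighted_mse is a made-up name for this sketch, not a torch API):

import torch

def my_weighted_mse(y_pred, y_true, weight):
    # element-wise weighted squared error, averaged; every op here is differentiable
    return (weight * (y_pred - y_true) ** 2).mean()

y_pred = torch.randn(8, 3, requires_grad=True)
y_true = torch.randn(8, 3)
weight = torch.ones(8, 3)
loss = my_weighted_mse(y_pred, y_true, weight)
loss.backward()             # gradients flow back into y_pred
print(y_pred.grad.shape)    # torch.Size([8, 3])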

https://blog.csdn.net/shanglianlm/article/details/85019768 — nineteen loss functions in PyTorch

 

 
