Getting started with PyTorch: installation + basic concepts + a first demo

 

Installing on Windows

First go to the official site https://pytorch.org/get-started/locally/ and pick the options that match your setup; the selector prints a pip command, and running that command downloads the right build.
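For reference, the command the selector printed for the setup discussed below (Windows, pip, CUDA 9.2, the torch 1.5.1 era) looked roughly like the following; treat the exact version tags as an illustration rather than something to copy blindly:

pip install torch==1.5.1+cu92 torchvision==0.6.1+cu92 -f https://download.pytorch.org/whl/torch_stable.html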

If the download is too slow, you can copy the URL from pip's output into a download tool, fetch the .whl file manually, and then install the package directly with pip install **.whl.

If you want CUDA acceleration, you also need to install the matching CUDA toolkit and a sufficiently new GPU driver. In my case, CUDA 9.2 plus a 4xx-or-newer driver, combined with torch 1.5.1, was what it took to get CUDA acceleration working.

import torch
print(torch.cuda.is_available()) # if this prints True, both torch and CUDA are set up correctly.
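A few more sanity checks can help pin down a version mismatch; these are standard PyTorch attributes, but the printed values will of course depend on your installation:

import torch
print(torch.__version__)                   # installed torch version
print(torch.version.cuda)                  # CUDA version torch was built against (None for CPU-only builds)
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # name of the first visible GPU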

Core concepts of torch

  • tensor 

The tensor concept is close to numpy's array, and PyTorch provides many tensor operations:

  • Construction + initialization + attribute (dtype, shape) conversion + slicing + arithmetic + CUDA operations
import torch
import numpy as np


# common constructors
x = torch.empty(5, 3, 2, dtype=torch.float)   # uninitialized
x1 = torch.zeros(2, 2, 2)
x2 = torch.rand(2, 2, 2)                      # uniform on [0, 1)
x3 = torch.randn(5, 3, 2)                     # standard normal
# create directly from data
x4 = torch.tensor([[5.5, 3], [5.5, 32]])
# create new tensors based on an existing one
x5 = x.new_tensor(range(10), dtype=torch.int16)  # same device as x; dtype overridden here
x5_new = x.type(torch.int8)   # cast to another dtype
x6 = torch.zeros_like(x)      # same shape (and dtype) as x
# a numpy array, converted to a tensor further below
y = np.zeros((4, 4), dtype=np.double)
# get the size
print(x.size())
# reshape
print(x.view(5, -1))
# arithmetic
A, B = x3[:, :, 0], x[:, :, 0] + 1  # slicing
C = A + B      # addition
C = A.add(B)
torch.add(A, B, out=C)  # write the result into an existing tensor
A.add_(B)               # in-place addition
# methods with a trailing underscore are in-place: x.copy_(y), x.t_() change x.
print(A * B)            # Hadamard product (element-wise multiplication)
B.transpose_(0, 1)      # in-place transpose
C = A.mm(B)             # ordinary matrix multiplication: A's column count must equal B's row count
# numpy bridge
x7 = torch.from_numpy(y)
print(x7)
x7.add_(1)   # both x7 and y change: the two share the same memory
print(x7)
print(y)
y2 = x7.numpy()  # the reverse direction shares memory too

# CUDA上面的tensor操作
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!

 

  • auto_grad

Automatic differentiation (gradient computation) is the core component of PyTorch; it is what makes PyTorch so convenient and easy to use (I am not sure whether other deep-learning frameworks are designed the same way). Differentiation follows the chain rule, so gradients are easy to track through the computation.

'''
torch.Tensor is the central class of the package.
 If you set its attribute .requires_grad as True, it starts to track all operations on it.
 When you finish your computation you can call .backward() and have all the gradients computed automatically.
 The gradient for this tensor will be accumulated into .grad attribute.
'''
import torch
#a = torch.tensor(range(1, 5), dtype=torch.double,requires_grad = True).view(2, -1)
a1 = torch.randn(2,2,requires_grad=True)
a = a1.view(1,-1)
a.retain_grad()
print(a,a.requires_grad)
b = a**2
b.retain_grad()  # without retain_grad(), .grad on this non-leaf tensor would not be kept
print(b, b.requires_grad,b.grad_fn)
c= b.sum()
print(c,c.grad_fn)
c.backward()
print(b.grad)
print(a.grad)

Often we do not want the autograd machinery to slow down computation, for example when running inference with a model after training is finished.

x = torch.randn(3, requires_grad=True)
print((x ** 2).requires_grad)       # True: the squaring op is tracked
with torch.no_grad():
    print((x ** 2).requires_grad)   # False: tracking is disabled inside the block

Custom autograd functions

Implement the two methods forward and backward and you have a custom autograd function. Functions like this are the building blocks of models.

import torch


class MyReLU(torch.autograd.Function):
    """
    We can implement our own custom autograd Functions by subclassing
    torch.autograd.Function and implementing the forward and backward passes
    which operate on Tensors.
    """

    @staticmethod
    def forward(ctx, input):
        """
        In the forward pass we receive a Tensor containing the input and return
        a Tensor containing the output. ctx is a context object that can be used
        to stash information for backward computation. You can cache arbitrary
        objects for use in the backward pass using the ctx.save_for_backward method.
        """
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        """
        In the backward pass we receive a Tensor containing the gradient of the loss
        with respect to the output, and we need to compute the gradient of the loss
        with respect to the input.
        """
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input
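A custom Function is used through its static .apply method rather than by instantiating the class. A minimal usage sketch (the variable names here are just illustrative):

relu = MyReLU.apply                      # .apply hooks the Function into autograd
x = torch.randn(4, requires_grad=True)
y = relu(x).sum()
y.backward()                             # calls MyReLU.backward under the hood
print(x.grad)                            # 1 where x > 0, 0 where x < 0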

Models

With a little neural-network background you will know that an ANN model is just a collection of functions and parameters, and training a model is the process of using an optimizer to adjust those parameters based on the results. Like Keras, PyTorch can also build models in a sequential way.

import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the
    # loss.
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

Custom models

Subclass torch.nn.Module and implement the constructor and the forward method. Note that everything used inside forward should be existing modules or other autograd-aware functions. Such a model can run inference, has its own parameter dictionary, and supports automatic differentiation, which means it can be trained.

import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred
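The custom module trains exactly like the Sequential model above. A minimal sketch, assuming the random x, y and the dimensions N, D_in, H, D_out from the earlier snippet are still in scope:

model = TwoLayerNet(D_in, H, D_out)
loss_fn = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
    y_pred = model(x)             # runs forward()
    loss = loss_fn(y_pred, y)
    optimizer.zero_grad()         # clear previously accumulated gradients
    loss.backward()               # autograd fills .grad for linear1/linear2 parameters
    optimizer.step()              # gradient-descent update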

Optimizers

The module defined above can update its parameters by hand, inside the training loop, like this:

# Zero the gradients before running the backward pass.
model.zero_grad()

# Backward pass: compute gradient of the loss with respect to all the learnable
# parameters of the model. Internally, the parameters of each Module are stored
# in Tensors with requires_grad=True, so this call will compute gradients for
# all learnable parameters in the model.
loss.backward()

# Update the weights using gradient descent. Each parameter is a Tensor, so
# we can access its gradients like we did before.
with torch.no_grad():
    for param in model.parameters():
        param -= learning_rate * param.grad

You can also use one of the more sophisticated optimizers from torch.optim:

learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  # create the optimizer

for t in range(500):  # iterate 500 times over the whole dataset
    y_pred = model(x)

    # compute the loss
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers( i.e, not overwritten) whenever .backward()
    # is called. Checkout docs of torch.autograd.backward for more details.
    optimizer.zero_grad() # zero the gradients before the backward pass so they do not accumulate (accumulation is the default, reportedly convenient for RNN training)

    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    loss.backward()  # backward pass

    # Calling the step function on an Optimizer makes an update to its
    # parameters
    optimizer.step()   # a single call to the optimizer's step function updates the parameters

https://blog.csdn.net/u011995719/article/details/88988420 — ten optimizers in PyTorch
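Swapping optimizers usually means changing only the construction line; a rough sketch (the hyperparameters here are illustrative, not tuned):

optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)   # SGD with momentum
# or
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)             # RMSprop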

Loss functions

The most critical part of the rules of the game is the loss function. torch provides many loss functions, and you can also define your own; the key requirement is that it supports automatic differentiation.
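As an illustration, a custom loss can simply be a Python function built from differentiable tensor operations; autograd then provides the backward pass for free (my_weighted_mse is a made-up name for this sketch, not a torch API):

import torch

def my_weighted_mse(y_pred, y_true, weight):
    # element-wise weighted squared error, averaged; every op here is differentiable
    return (weight * (y_pred - y_true) ** 2).mean()

y_pred = torch.randn(8, 3, requires_grad=True)
y_true = torch.randn(8, 3)
weight = torch.ones(8, 3)
loss = my_weighted_mse(y_pred, y_true, weight)
loss.backward()             # gradients flow back into y_pred
print(y_pred.grad.shape)    # torch.Size([8, 3])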

https://blog.csdn.net/shanglianlm/article/details/85019768 — nineteen loss functions in PyTorch

 

 
