PyTorch Basics

hello_JeremyWang

Category: Pytorch实战 (PyTorch in Practice). Tags: pytorch, python, deep learning

Article link: https://blog.csdn.net/hello_JeremyWang/article/details/120678164

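1. Tensors

Note: that's tensors (张量, zhāngliàng), not Zhang Liang malatang (张亮麻辣烫), haha. (One more bad joke for the pile.)

The tensor (Tensor) is probably the most important data structure in PyTorch, and every computation is built on it. So what is a tensor? My own understanding is that it generalizes vectors and matrices. Take an ordinary image: it is represented by three RGB channels, so one image can be stored as a three-dimensional array of shape (width, height, channel). A collection of images then needs a 4D tensor of shape (sample_size, width, height, channel).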
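A minimal sketch of the shapes described above. The sizes (28×28 RGB images, a batch of 8) are made up for illustration. The shapes follow this article's (width, height, channel) layout; the final `permute` converts to the (N, C, H, W) layout that PyTorch's convolution layers expect.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
width, height, channel = 28, 28, 3

image = torch.zeros(width, height, channel)   # one image: a 3-D tensor
batch = torch.stack([image] * 8)              # eight images: a 4-D tensor

print(image.ndim, tuple(image.shape))
print(batch.ndim, tuple(batch.shape))

# Layers such as nn.Conv2d expect (N, C, H, W), so a batch stored as
# (N, W, H, C) is usually permuted before being fed to them.
batch_nchw = batch.permute(0, 3, 2, 1)
print(tuple(batch_nchw.shape))
```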

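1.1 Creating tensors

Common tensor-creation functions are listed in the table below:

Function	Description
Tensor(*sizes)	Basic constructor
tensor(data)	Similar to np.array
ones(*sizes)	All ones
zeros(*sizes)	All zeros
eye(*sizes)	Ones on the diagonal, zeros elsewhere
arange(s,e,step)	From s to e (exclusive), with step size step
linspace(s,e,steps)	steps evenly spaced values from s to e (inclusive)
rand/randn(*sizes)	Uniform on [0, 1) / standard normal
normal(mean,std)/uniform(from,to)	Normal distribution / uniform distribution
randperm(m)	Random permutation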
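As a quick, non-authoritative tour of the table, the sketch below calls several of these functions. The seed and sizes are arbitrary choices. One point the table does not spell out: `uniform(from, to)` is available as the in-place tensor method `uniform_`, not as a top-level `torch.uniform` function.

```python
import torch

torch.manual_seed(0)  # fixed seed, only so the random calls are repeatable

identity = torch.eye(3)                 # 3x3, ones on the diagonal
steps = torch.arange(0, 10, 2)          # 0, 2, 4, 6, 8 (the end is excluded)
points = torch.linspace(0, 1, steps=5)  # 5 evenly spaced values, end included
uniform = torch.rand(2, 3)              # uniform on [0, 1)
gaussian = torch.randn(2, 3)            # standard normal
perm = torch.randperm(5)                # a random permutation of 0..4

# torch.normal takes mean and std tensors; uniform_ fills a tensor in place.
noise = torch.normal(mean=torch.zeros(4), std=torch.ones(4))
filled = torch.empty(4).uniform_(-1, 1)

print(identity, steps, points, perm, sep="\n")
```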

Let's look at a few of these functions in more detail:

torch.zeros

torch.zeros is similar to np.zeros in NumPy:

import torch
x = torch.zeros(4,3,dtype=torch.long)
print(x)

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])

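Commonly used tensor types:

32-bit floating point: torch.FloatTensor. This is the default type of torch.Tensor().
64-bit integer: torch.LongTensor.
32-bit integer: torch.IntTensor.
16-bit integer: torch.ShortTensor.
64-bit floating point: torch.DoubleTensor.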
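The type names above can be checked directly. The following sketch assumes the default dtype has not been changed with `torch.set_default_dtype`; the variable names are illustrative.

```python
import torch

x = torch.Tensor(2, 3)            # legacy constructor: uninitialized float32
print(x.dtype)                    # torch.float32

ints = torch.tensor([1, 2, 3])    # Python ints are inferred as int64
print(ints.dtype, ints.type())    # torch.int64 torch.LongTensor

# Conversions return a new tensor; the original keeps its dtype.
as_int32 = ints.to(torch.int32)
as_int16 = ints.short()
as_float64 = ints.double()
print(as_int32.dtype, as_int16.dtype, as_float64.dtype)
```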

Creating a tensor from an existing tensor:

x = torch.ones(4, 3, dtype=torch.double)
print(x)
x = torch.randn_like(x, dtype=torch.float)
# the dtype is overridden
print(x)
# the result has the same size

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[ 0.2626, -0.6196,  1.0963],
        [ 1.1366, -0.6543,  0.6040],
        [-0.6623,  0.1115,  0.2433],
        [ 1.1626, -2.3529, -0.9417]])

1.2 Tensor addition

# Method 1
y = torch.rand(4, 3)
print(x + y)

# Method 2
print(torch.add(x, y))

# Method 3: provide an output tensor as an argument
result = torch.empty(4, 3)  # must match the (4, 3) shape of x + y
torch.add(x, y, out=result)
print(result)

# Method 4: in-place
y.add_(x)
print(y)

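1.3 Tensor indexing

Note that a result obtained by basic indexing (slicing) shares memory with the original data: modifying one also modifies the other.

# take the second column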
print(x[:, 1])
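The shared-memory behaviour is easy to verify. The sketch below uses a fresh all-zeros `x` rather than the tensor from the earlier cells, and also shows two ways to get an independent copy: indexing with a list (advanced indexing) and `clone()`.

```python
import torch

x = torch.zeros(4, 3)
col = x[:, 1]          # basic slicing returns a view of x
col += 1               # in-place update through the view
print(x)               # the second column of x is now all ones

# Indexing with a list (advanced indexing) returns a copy instead.
picked = x[[0, 1]]
picked += 10
print(x)               # unchanged by the update to `picked`

# clone() gives an explicit independent copy of a slice.
safe = x[:, 1].clone()
safe.zero_()
print(x[:, 1])         # still all ones
```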

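1.4 Tensor broadcasting

When an element-wise operation is applied to two tensors of different shapes, broadcasting may be triggered: elements are (conceptually) copied so that the two tensors have the same shape, and the operation is then applied element-wise.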

x = torch.arange(1, 3).view(1, 2)
print(x)
y = torch.arange(1, 4).view(3, 1)
print(y)
print(x + y)

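Here x and y are matrices with 1 row and 2 columns, and 3 rows and 1 column, respectively. To compute x + y, the 2 elements in the first row of x are broadcast (copied) to the second and third rows, while the 3 elements in the first column of y are broadcast (copied) to the second column. The two resulting 3×2 matrices can then be added element-wise.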
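To make the copying explicit, the sketch below uses `expand` to build the two 3×2 matrices by hand and checks that their sum matches the broadcast result. The names `x_big` and `y_big` are illustrative. The last part shows a pair of shapes that cannot be broadcast.

```python
import torch

x = torch.arange(1, 3).view(1, 2)   # shape (1, 2)
y = torch.arange(1, 4).view(3, 1)   # shape (3, 1)

# Spell out what broadcasting does implicitly: expand both to (3, 2).
x_big = x.expand(3, 2)
y_big = y.expand(3, 2)
print(x_big)
print(y_big)

same = torch.equal(x + y, x_big + y_big)
print(same)

# Shapes that cannot be aligned raise an error instead of broadcasting.
try:
    torch.ones(3, 2) + torch.ones(3)
    failed = False
except RuntimeError as err:
    failed = True
    print("not broadcastable:", err)
```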

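2. Automatic differentiation in PyTorch

Automatic differentiation is one of the more important features PyTorch offers for building neural networks.

The following passage comes from the Zhihu article "Leaf nodes and the requires_grad parameter of tensors" (叶子节点和tensor的requires_grad参数): "However the computation and the computation graph are defined, keep in mind that the core goal is to compute the gradients of certain tensors. A PyTorch computation graph contains only two kinds of elements: data (tensors) and operations. Operations are differentiable operations such as addition, subtraction, multiplication, division, roots, powers, exponentials, logarithms and trigonometric functions. Tensors fall into two classes: leaf nodes and non-leaf nodes. When backward() is used to backpropagate and compute tensor gradients, it does not compute the gradient of every tensor, only of tensors that satisfy all of these conditions:

the tensor is a leaf node,
requires_grad=True,
all tensors that depend on this tensor have requires_grad=True."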
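A small sketch of these conditions in practice. The tensors `w`, `c`, `h` and `out` are made up for illustration; `is_leaf` and `.grad` are standard tensor attributes.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)   # leaf, requires grad
c = torch.tensor(3.0)                       # leaf, but requires_grad=False
h = w * c                                   # non-leaf: the result of an operation
out = h ** 2

out.backward()

print(w.is_leaf, c.is_leaf, h.is_leaf)      # True True False
print(w.grad)   # d(out)/dw = 2 * (w * c) * c = 36
print(c.grad)   # None: c does not require grad
```

Only `w` ends up with a populated `.grad`. A non-leaf tensor such as `h` keeps its gradient only if `h.retain_grad()` is called before `backward()`.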

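The article "requires_grad in PyTorch" (Pytorch之requires_grad) points out: "During neural network training, as long as any one input requires gradients, the output must also keep gradient information, which guarantees that the gradient can flow back to that input. Conversely, if none of the inputs requires gradients, the output's requires_grad is automatically set to False. With no gradient values involved, backpropagation naturally excludes this part of the graph from the computation."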
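The propagation rule in that quote can be observed directly. This is a minimal sketch with made-up tensors `a`, `b` and `c`:

```python
import torch

a = torch.ones(2, requires_grad=True)
b = torch.ones(2)
c = torch.ones(2)

mixed = a + b     # one input requires grad, so the output does too
plain = b + c     # no input requires grad, so the output does not

print(mixed.requires_grad, mixed.grad_fn is not None)   # True True
print(plain.requires_grad, plain.grad_fn)               # False None
```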

For a more concrete example, the same article offers the following.

Take a network model like the one below: which nodes will have their gradients recorded automatically?

import torch.nn.functional as F
import torch.nn as nn

class Conv_Classifier(nn.Module):
    def __init__(self):
        super(Conv_Classifier, self).__init__()
        self.conv1 = nn.Conv2d(1, 5, 5)
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = nn.Conv2d(5, 16, 5)
        self.pool2 = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(256, 20)
        self.fc2 = nn.Linear(20, 10)
    
    def forward(self, x):
        x = F.relu(self.pool1((self.conv1(x))))
        x = F.relu(self.pool2((self.conv2(x))))
        x = F.dropout2d(x, training=self.training)
        x = x.view(-1, 256)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return x

Mnist_Classifier = Conv_Classifier()

Author: ForeverRuri
Link: https://www.imooc.com/article/282785
Source: imooc (慕课网)
This article was first published on imooc; please credit the source when reposting. Thank you.
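One way to answer the question for any model is to inspect its parameters after a backward pass. The sketch below uses a small stand-in `nn.Sequential` model (hypothetical, not the Conv_Classifier above) and a random input, so that it stays short.

```python
import torch
import torch.nn as nn

# A small stand-in model, chosen only to keep the example short.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

x = torch.randn(5, 4)          # input data: requires_grad defaults to False
loss = model(x).sum()
loss.backward()

for name, param in model.named_parameters():
    # Parameters are leaf tensors with requires_grad=True, so .grad is populated.
    print(name, param.is_leaf, param.requires_grad, tuple(param.grad.shape))

print(x.grad)                  # None: the input did not require grad
```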

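As the figure shows, the operations associated with every leaf node are recorded so that gradients can be propagated back later.
[Figure: computation graph of the model; image not available]
In addition, the article "The difference between requires_grad_(), detach() and torch.no_grad() in PyTorch" (Pytorch中requires_grad_(), detach(), torch.no_grad()的区别) is very helpful for understanding these autograd settings.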
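As a rough illustration of how the three differ (a sketch with made-up tensors, not taken from that article): `requires_grad_()` switches tracking on for a tensor in place, `detach()` returns a tensor that shares data but is cut off from the graph, and `torch.no_grad()` disables recording for everything inside the block.

```python
import torch

x = torch.ones(3)
x.requires_grad_()                # in place: turn on gradient tracking for a leaf
y = x * 2

d = y.detach()                    # same data, cut off from the graph
print(d.requires_grad)            # False
print(d.data_ptr() == y.data_ptr())   # True: detach() shares storage

with torch.no_grad():             # operations inside are not recorded
    z = x * 2
print(z.requires_grad)            # False

y.sum().backward()                # the original graph is still usable
print(x.grad)                     # tensor([2., 2., 2.])
```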

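3. Code demonstration

Datawhale provides some code that helps in understanding this section further.
Code demonstration: to be used alongside this chapter's study material
Part 1: Tensor operation examples
The following demonstrates some basic Tensor operations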

import torch

help(torch.tensor)  # in IPython/Jupyter, ?torch.tensor shows the same docstring

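# create tensors, using dtype to specify the type; the value should match the dtype
# (c below passes a float to an integer dtype, which triggers the DeprecationWarning shown in the output)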
a = torch.tensor(1.0, dtype=torch.float)
b = torch.tensor(1, dtype=torch.long)
c = torch.tensor(1.0, dtype=torch.int8)
print(a, b, c)

tensor(1.) tensor(1) tensor(1, dtype=torch.int8)


/tmp/ipykernel_11770/1264937814.py:4: DeprecationWarning: an integer is required (got type float).  Implicit conversion to integers using __int__ is deprecated, and may be removed in a future version of Python.
  c = torch.tensor(1.0, dtype=torch.int8)

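# create tensors of a given size with type-specific constructors
# (the contents are uninitialized memory, not random samples)
d = torch.FloatTensor(2,3)
e = torch.IntTensor(2)
f = torch.IntTensor([1,2,3,4])  # existing Python data structures can be converted directly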
print(d, '\n', e, '\n', f)

tensor([[ 7.2398e-07,  4.5710e-41, -2.0912e+23],
        [ 3.0812e-41,  6.7262e-43,  0.0000e+00]]) 
 tensor([64,  0], dtype=torch.int32) 
 tensor([1, 2, 3, 4], dtype=torch.int32)

# converting between tensors and numpy arrays
import numpy as np

g = np.array([[1,2,3],[4,5,6]])
h = torch.tensor(g)
print(h)
i = torch.from_numpy(g)
print(i)
j = h.numpy()
print(j)

tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([[1, 2, 3],
        [4, 5, 6]])
[[1 2 3]
 [4 5 6]]
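The printed values look identical, but the three conversions differ in whether they copy. A short sketch (the variable names `copied`, `shared` and `back` are illustrative): `torch.tensor` copies the data, while `torch.from_numpy` and `Tensor.numpy()` share memory with their source.

```python
import numpy as np
import torch

g = np.array([[1, 2, 3], [4, 5, 6]])
copied = torch.tensor(g)       # copies the data
shared = torch.from_numpy(g)   # shares memory with g

g[0, 0] = 100
print(copied[0, 0].item())     # 1
print(shared[0, 0].item())     # 100

back = shared.numpy()          # also shares memory with the tensor
shared[1, 2] = -1
print(back[1, 2], g[1, 2])     # -1 -1
```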

# common functions for constructing tensors
k = torch.rand(2, 3) 
l = torch.ones(2, 3)
m = torch.zeros(2, 3)
n = torch.arange(0, 10, 2)
print(k, '\n', l, '\n', m, '\n', n)

tensor([[0.2652, 0.0650, 0.5593],
        [0.7864, 0.0015, 0.4458]]) 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 
 tensor([[0., 0., 0.],
        [0., 0., 0.]]) 
 tensor([0, 2, 4, 6, 8])

# check a tensor's shape (two ways)
print(k.shape)
print(k.size())

torch.Size([2, 3])
torch.Size([2, 3])

# tensor arithmetic
o = torch.add(k,l)
print(o)

tensor([[1.2652, 1.0650, 1.5593],
        [1.7864, 1.0015, 1.4458]])

# tensor indexing works much like numpy indexing
print(o[:,1])
print(o[0,:])

tensor([1.0650, 1.0015])
tensor([1.2652, 1.0650, 1.5593])

# a handy tool for changing a tensor's shape: view
print(o.view((3,2)))
print(o.view(-1,2))

tensor([[1.2652, 1.0650],
        [1.5593, 1.7864],
        [1.0015, 1.4458]])
tensor([[1.2652, 1.0650],
        [1.5593, 1.7864],
        [1.0015, 1.4458]])
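Two properties of `view` are worth knowing: the result shares storage with the original tensor, and it only works when the memory layout allows it. The sketch below uses a fresh tensor `o` with known values instead of the random one above, and shows `reshape` as the fallback that copies when a view is impossible.

```python
import torch

o = torch.arange(6.0).view(2, 3)
v = o.view(3, 2)
v[0, 0] = 42.0                 # a view shares storage with o
print(o[0, 0].item())          # 42.0

t = o.t()                      # transpose: same storage, non-contiguous
print(t.is_contiguous())       # False
try:
    t.view(6)                  # view needs a compatible memory layout
    view_failed = False
except RuntimeError as err:
    view_failed = True
    print("view failed:", err)

flat = t.reshape(6)            # reshape copies when a view is impossible
print(flat)
```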

# tensor broadcasting (keep this behavior in mind when using it)
p = torch.arange(1, 3).view(1, 2)
print(p)
q = torch.arange(1, 4).view(3, 1)
print(q)
print(p + q)

tensor([[1, 2]])
tensor([[1],
        [2],
        [3]])
tensor([[2, 3],
        [3, 4],
        [4, 5]])

# add and remove tensor dimensions: unsqueeze / squeeze
print(o)
r = o.unsqueeze(1)
print(r)
print(r.shape)

tensor([[1.2652, 1.0650, 1.5593],
        [1.7864, 1.0015, 1.4458]])
tensor([[[1.2652, 1.0650, 1.5593]],

        [[1.7864, 1.0015, 1.4458]]])
torch.Size([2, 1, 3])

# squeeze(0) changes nothing here: dim 0 has size 2, and only size-1 dims are removed
s = r.squeeze(0)
print(s)
print(s.shape)

tensor([[[1.2652, 1.0650, 1.5593]],

        [[1.7864, 1.0015, 1.4458]]])
torch.Size([2, 1, 3])

# squeeze(1) removes dim 1, whose size is 1
t = r.squeeze(1)
print(t)
print(t.shape)

tensor([[1.2652, 1.0650, 1.5593],
        [1.7864, 1.0015, 1.4458]])
torch.Size([2, 3])

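Part 2: Automatic differentiation example
Here a simple function $y = x_1 + 2x_2$, for which $\partial y/\partial x_1 = 1$ and $\partial y/\partial x_2 = 2$, illustrates how PyTorch's automatic differentiation works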

import torch

x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(2.0, requires_grad=True)
y = x1 + 2*x2
print(y)

tensor(5., grad_fn=<AddBackward0>)

# first check whether each variable requires grad
print(x1.requires_grad)
print(x2.requires_grad)
print(y.requires_grad)

True
True
True

# check each variable's gradient. No backpropagation has happened yet, so none of the gradients exist
print(x1.grad.data)
print(x2.grad.data)
print(y.grad.data)

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

/tmp/ipykernel_11770/1707027577.py in <module>
      1 # check each variable's gradient. No backpropagation has happened yet, so none of the gradients exist
----> 2 print(x1.grad.data)
      3 print(x2.grad.data)
      4 print(y.grad.data)


AttributeError: 'NoneType' object has no attribute 'data'

x1

tensor(1., requires_grad=True)

# check the gradients after backpropagation
y = x1 + 2*x2
y.backward()
print(x1.grad.data)
print(x2.grad.data)

tensor(1.)
tensor(2.)

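# gradients accumulate: every additional backward() adds to .grad.
# One more pass after the cell above would give 2 and 4; the output below (5 and 10)
# reflects five backward passes in total, because this cell was rerun several times.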
y = x1 + 2*x2
y.backward()
print(x1.grad.data)
print(x2.grad.data)

tensor(5.)
tensor(10.)

# So the current gradients must be cleared before each computation to avoid accumulation. PyTorch's optimizer provides this; it is covered later.
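Until the optimizer is introduced, the reset can be done by hand. A minimal sketch, assuming the same function y = x1 + 2*x2. The `history` list and the choice of SGD with `lr=0.1` are illustrative only. Whether `optimizer.zero_grad()` leaves zeros or `None` in `.grad` depends on the PyTorch version, so no output is asserted for the last line.

```python
import torch

x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(2.0, requires_grad=True)

history = []
for step in range(3):
    # Reset accumulated gradients before each backward pass.
    for p in (x1, x2):
        if p.grad is not None:
            p.grad.zero_()
    y = x1 + 2 * x2
    y.backward()
    history.append((x1.grad.item(), x2.grad.item()))

print(history)   # [(1.0, 2.0), (1.0, 2.0), (1.0, 2.0)]

# The same reset through an optimizer (SGD chosen arbitrarily here).
optimizer = torch.optim.SGD([x1, x2], lr=0.1)
optimizer.zero_grad()
print(x1.grad, x2.grad)
```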

# Experiment: what happens if gradients are not enabled?
x1 = torch.tensor(1.0, requires_grad=False)
x2 = torch.tensor(2.0, requires_grad=False)
y = x1 + 2*x2
y.backward()

---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

/tmp/ipykernel_11770/4087792071.py in <module>
      3 x2 = torch.tensor(2.0, requires_grad=False)
      4 y = x1 + 2*x2
----> 5 y.backward()


/data1/ljq/anaconda3/envs/smp/lib/python3.8/site-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    253                 create_graph=create_graph,
    254                 inputs=inputs)
--> 255         torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    256 
    257     def register_hook(self, hook):


/data1/ljq/anaconda3/envs/smp/lib/python3.8/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    145         retain_graph = create_graph
    146 
--> 147     Variable._execution_engine.run_backward(
    148         tensors, grad_tensors_, retain_graph, create_graph, inputs,
    149         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag


RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
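The error arises because `y` was built only from tensors that do not require gradients, so it has no `grad_fn`. The sketch below shows one way to check for this before calling `backward()`, and that enabling tracking on an input and recomputing `y` is enough to make it work. The guard is an illustration, not a required pattern.

```python
import torch

x1 = torch.tensor(1.0)             # requires_grad defaults to False
x2 = torch.tensor(2.0)
y = x1 + 2 * x2

print(y.requires_grad, y.grad_fn)  # False None

# Guard the call instead of letting it raise.
if y.requires_grad:
    y.backward()
else:
    print("skipping backward(): no input requires grad")

# Turning tracking on for a leaf and recomputing y makes backward() work.
x1.requires_grad_()
y = x1 + 2 * x2
y.backward()
print(x1.grad, x2.grad)            # tensor(1.) None
```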
