[PyTorch Study Notes 1] Installing PyTorch and the Basics

獭祭

Last modified: 2022-08-22 11:35:28

Tags: pytorch, learning, python

First published: 2022-08-15 18:01:48

Article link: https://blog.csdn.net/edifier_wy/article/details/126328703

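Personal notes, intended only for my own study and review.
Thanks to the DataWhale open-source community for its excellent open-source PyTorch learning material: link to the original document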

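1. Introduction to PyTorch and Installation

1.1 Introduction to PyTorch

PyTorch is a deep learning library developed by Facebook's AI research group. It is a Python implementation built on Torch, a library written in Lua, and it is now widely used in both academia and industry. With the Caffe2 project merged into PyTorch, PyTorch began to challenge TensorFlow's position among deep learning frameworks. Overall, PyTorch is one of the few current frameworks that is at once concise, elegant, efficient, and fast.
PyTorch's main strengths are its simplicity, a gentle learning curve, good documentation and community support, being open source, and easy code debugging.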

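1.2 Installing PyTorch and CUDA

1.2.1 Choosing between pip and Anaconda

Most online guides recommend Anaconda for managing virtual environments and third-party packages. I prefer the lighter approach of installing packages with pip and managing virtual environments with PyCharm's built-in virtual environment support. (If you do not use PyCharm, virtualenv works as well.)

1.2.2 Python virtual environments on Windows

In PyCharm, a virtual environment can be created under Settings - Project Interpreter.
If activating the virtual environment fails with the following error: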

File E:\Codes\YOLOv5\yolov5\venv\Scripts\activate.ps1 cannot be loaded because running scripts is disabled on this system. For more information, see about_Execution_Policies at https:/go.microsoft.com/fwlink/?LinkID=135170.
   + CategoryInfo          : SecurityError: (:) [], ParentContainsErrorRecordException
   + FullyQualifiedErrorId : UnauthorizedAccess

then open PowerShell as administrator and run

set-executionpolicy remotesigned

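Activating and deactivating the virtual environment (once the environment is active, deactivate is available as a command in the current session)

./venv/Scripts/activate
deactivate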

1.2.3 Switching the pip index mirror

Windows:
Run the following command:

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

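1.2.4 Installing CUDA

Version: look up your GPU model and pick the CUDA version that matches the install instructions on the PyTorch website;
Install path: the default path is fine; if you customize it, be sure to note the location, since the cuDNN installation needs it later
Components: install only the CUDA core components and leave the Visual Studio integration unchecked; if a graphics driver is already installed, selecting the graphics driver in the CUDA installer causes a driver conflict and a black screen. (My fix at the time was to boot into safe mode and uninstall CUDA, which was extremely tedious.)
Environment variables: Settings - search for "advanced system" - open Advanced system settings - add the install path and its bin, lib, and related folders to the environment variables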

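1.2.5 Downloading and installing PyTorch

On the PyTorch website, select the build that matches your setup and copy the install command it generates. With the mirror configured, the download is fairly fast.

1.2.6 Checking that the environment is set up correctly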

import torch # imports successfully if PyTorch is installed
print(torch.cuda.is_available()) # whether CUDA is available
print(torch.cuda.device_count()) # number of available CUDA devices
print(torch.version.cuda) # CUDA version that PyTorch was built with
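Once the checks above pass, a common next step is to choose a device at runtime so the same script runs with or without a GPU. The following is only a minimal sketch; the variable names (device, x, total) are illustrative and not part of the original notes.

```python
import torch

# Pick the GPU when one is usable, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("using device:", device)

if device.type == "cuda":
    # Name of the first visible GPU (index 0).
    print("GPU name:", torch.cuda.get_device_name(0))

# Create a tensor directly on the chosen device and run one operation on it.
x = torch.ones(2, 2, device=device)
total = (x + x).sum().item()
print("sum:", total)
```

If the sum prints without error, tensor creation and arithmetic work on the selected device.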

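1.3 PyTorch learning resources

Awesome-pytorch-list: 12K stars at the time of writing; covers NLP, CV, common libraries, paper implementations, and other PyTorch projects.
PyTorch official documentation: the documentation published by the PyTorch team, very comprehensive.
Pytorch-handbook: 14.8K stars on GitHub at the time of writing; a PyTorch handbook.
PyTorch official forum: PyTorch has an active community where you can talk with the people who develop PyTorch.
PyTorch official tutorials: tutorials written by the PyTorch team; they can be run in Colab so you can learn hands-on.
Dive into Deep Learning (动手学深度学习): an introductory deep learning course taught by Mu Li, with mature book and course materials; recordings are available on Bilibili and YouTube.
Awesome-PyTorch-Chinese: a collection of high-quality Chinese-language PyTorch resources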

2. PyTorch Basics

2.1 Tensors

2.1.1 Introduction

In geometric algebra, a tensor is defined as a generalization of vectors and matrices.

Tensor dimension	Meaning
0-D tensor	Scalar
1-D tensor	Vector
2-D tensor	Matrix
3-D tensor	Time series data, stock prices, text data, a single color image (RGB)

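Tensors are the foundation of modern machine learning. At its core a tensor is a data container. In most cases it holds numbers; occasionally it holds strings, but that is rare. So you can think of it as a bucket of numbers.
Example: an image dataset can be described with four fields

(batch_size, width, height, channel) = 4D

In PyTorch, torch.Tensor is the main tool for storing and transforming data. A Tensor is similar to a NumPy multidimensional array, but it additionally supports GPU computation and automatic differentiation.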
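To make the dimension table concrete, here is a small sketch that builds one tensor per dimensionality and prints its ndim and shape. The sizes (8 images of 32x32 pixels) are arbitrary choices for illustration. Note one assumption: the 4-D tensor uses the (batch, channel, height, width) order that PyTorch's convolution layers expect, which differs from the field order written above.

```python
import torch

scalar = torch.tensor(3.14)             # 0-D: a single number
vector = torch.tensor([1.0, 2.0, 3.0])  # 1-D
matrix = torch.zeros(2, 3)              # 2-D
image = torch.zeros(3, 32, 32)          # 3-D: one RGB image as (channel, height, width)
batch = torch.zeros(8, 3, 32, 32)       # 4-D: a batch of 8 such images

named = [("scalar", scalar), ("vector", vector), ("matrix", matrix),
         ("image", image), ("batch", batch)]
for name, tensor in named:
    # ndim is the number of dimensions; shape lists the size of each one
    print(name, tensor.ndim, tuple(tensor.shape))
```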

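2.1.2 Creating tensors

Several common ways to create a tensor.

Randomly initialized matrix (torch.rand())
All-zero matrix
torch.zeros() builds an all-zero matrix and dtype sets the data type; torch.zero_() zeroes an existing tensor in place, while torch.zeros_like() returns a new all-zero tensor with the same shape as an existing one
torch.tensor() builds a tensor from data, and a new tensor can also be created from an existing tensor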
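The difference between zeroing in place and creating a new zero tensor is easy to miss, so here is a minimal sketch of it. It also shows that a tensor created from an existing one with new_ones() inherits that tensor's dtype. The variable names are illustrative only.

```python
import torch

b = torch.ones(4, 3)

d = torch.zeros_like(b)        # new tensor with b's shape; b is untouched
b_sum_before = float(b.sum())  # b still holds twelve ones here

c = b.zero_()                  # in-place: b itself becomes all zeros
shares_memory = c.data_ptr() == b.data_ptr()  # c and b use the same storage

base = torch.tensor([5.5, 3], dtype=torch.float64)
ones = base.new_ones(2, 2)     # inherits dtype (and device) from base

print(b_sum_before, float(b.sum()), shares_memory, ones.dtype)
```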

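2.1.3 Tensor operations

Addition: there are three ways to add tensors
Indexing: works as in NumPy, and the indexed result shares memory with the original data, so modifying one modifies the other. Use clone() to avoid this
Reshaping: the usual methods are Tensor.view() and torch.reshape(). The former also shares memory with the original tensor; use clone() to avoid this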
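The memory-sharing behaviour described above can be checked directly. The sketch below contrasts an indexed view with a clone(), and then shows one case where view() refuses to work while reshape() succeeds: a transposed tensor, whose memory layout is not contiguous. The names are illustrative.

```python
import torch

x = torch.arange(6.0).view(2, 3)   # [[0, 1, 2], [3, 4, 5]]

row_view = x[0]           # indexing returns a view that shares memory with x
row_copy = x[0].clone()   # clone() allocates new memory

row_copy += 100           # does not touch x
row_view += 1             # also changes the first row of x
print(x)

# view() needs a compatible memory layout; reshape() copies when it has to.
xt = x.t()                # the transpose is not contiguous in memory
try:
    xt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

flat = xt.reshape(6)      # works on the non-contiguous tensor
print(view_failed, flat)
```

Because reshape() may return either a view or a copy depending on the layout, clone() remains the explicit way to guarantee an independent tensor.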

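2.1.4 Broadcasting

When an element-wise operation is applied to two Tensors of different shapes, broadcasting may be triggered: elements are first replicated as needed so that the two Tensors have the same shape, and the operation is then applied element-wise.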
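One way to see the "replicate first, then operate" description is torch.broadcast_tensors, which returns both operands expanded to their common shape. This is a sketch for illustration; in practice the expansion is done without actually copying data. The last part shows that shapes which cannot be aligned raise an error instead of broadcasting.

```python
import torch

a = torch.arange(1, 3).view(1, 2)   # shape (1, 2)
b = torch.arange(1, 4).view(3, 1)   # shape (3, 1)

# Both operands expanded to the common shape (3, 2).
a_big, b_big = torch.broadcast_tensors(a, b)
print(a_big)
print(b_big)

# Adding the expanded tensors gives the same result as the broadcast add.
same_result = torch.equal(a + b, a_big + b_big)
print(same_result)

# Sizes 2 and 4 in the first dimension differ and neither is 1: no broadcast.
try:
    torch.ones(2, 3) + torch.ones(4, 3)
    mismatch_failed = False
except RuntimeError:
    mismatch_failed = True
print(mismatch_failed)
```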

2.1.5 References

Tensor看这一篇就够了 ("This one article on Tensor is all you need")
The content is very detailed and clear

import torch
# matrix in torch
a = torch.zeros(4, 3, dtype=torch.long)
b = torch.rand(4, 3)
c = b.zero_()  # in-place: b itself is zeroed, and c refers to the same data as b
d = torch.zeros_like(b)
print("a = ", a)
print("b = ", b)
print("c = ", c)
print("d = ", d)
print("---------------------------------------------")

# tensor in torch
# generate a tensor
x = torch.tensor([5.5, 3])
print("x used to be:", x)
# generate a tensor based on an old one
x = x.new_ones(4, 3, dtype=torch.double)
print("x now is:",x)
x = torch.randn_like(x, dtype=torch.float)
print("random new x is:",x)
print("size of x is:", x.size())
print("shape of x is:", x.shape)
print("---------------------------------------------")

# operations of tensor
# addition
y = torch.rand(4, 3)
print("first way of addition: x + y =",x + y)
print("second way of addition: torch.add(x,y) =",torch.add(x, y))
print("third way of addition: y.add(x) =",y.add(x))
print("---------------------------------------------")

# index operation (like numpy)
print("the second column of x is: x[:,1] =", x[:,1])
z = x[0,:]
print("x[0,:] used to be",x[0,:])
z += 1
print("x[0,:] now is",x[0,:],"after z += 1")
print("---------------------------------------------")

# dimension transformations
xx = torch.randn(4,4)
yy = xx.view(16)
zz = xx.view(-1,8)
# -1 means the dimension is determined by other dimension
print("raw dimension is:",xx.size())
print("xx.view(16) dimension is:",yy.size())
print("xx.view(-1,8) dimension is:",zz.size())
print("raw data: yy =",yy)
xx += 1
print("after xx += 1, yy =",yy)
print("---------------------------------------------")

# broadcast mechanism
xxx = torch.arange(1,3).view(1,2)
print("xxx =",xxx)
yyy = torch.arange(1,4).view(3,1)
print("yyy =",yyy)
print("xxx + yyy =",xxx+yyy)

a =  tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])
b =  tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
c =  tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
d =  tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
---------------------------------------------
x used to be: tensor([5.5000, 3.0000])
x now is: tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
random new x is: tensor([[-0.7224,  0.0764,  1.5647],
        [-0.7701,  0.2851, -0.8329],
        [ 1.4413, -1.7074, -0.6832],
        [ 1.2671, -0.5872,  0.5994]])
size of x is: torch.Size([4, 3])
shape of x is: torch.Size([4, 3])
---------------------------------------------
first way of addition: x + y = tensor([[-0.5030,  0.3373,  1.9948],
        [ 0.0952,  0.7261, -0.6664],
        [ 1.8564, -0.9470,  0.0648],
        [ 1.6775,  0.3741,  1.0837]])
second way of addition: torch.add(x,y) = tensor([[-0.5030,  0.3373,  1.9948],
        [ 0.0952,  0.7261, -0.6664],
        [ 1.8564, -0.9470,  0.0648],
        [ 1.6775,  0.3741,  1.0837]])
third way of addition: y.add(x) = tensor([[-0.5030,  0.3373,  1.9948],
        [ 0.0952,  0.7261, -0.6664],
        [ 1.8564, -0.9470,  0.0648],
        [ 1.6775,  0.3741,  1.0837]])
---------------------------------------------
the second column of x is: x[:,1] = tensor([ 0.0764,  0.2851, -1.7074, -0.5872])
x[0,:] used to be tensor([-0.7224,  0.0764,  1.5647])
x[0,:] now is tensor([0.2776, 1.0764, 2.5647]) after z += 1
---------------------------------------------
raw dimension is: torch.Size([4, 4])
xx.view(16) dimension is: torch.Size([16])
xx.view(-1,8) dimension is: torch.Size([2, 8])
raw data: yy = tensor([ 1.3347,  2.0753, -0.3788, -0.9742,  1.8819, -0.4829,  0.3795, -0.9319,
        -0.4008, -0.0103, -0.6632, -0.5539,  0.3991, -0.0936,  1.0349, -0.2901])
after xx += 1, yy = tensor([2.3347, 3.0753, 0.6212, 0.0258, 2.8819, 0.5171, 1.3795, 0.0681, 0.5992,
        0.9897, 0.3368, 0.4461, 1.3991, 0.9064, 2.0349, 0.7099])
---------------------------------------------
xxx = tensor([[1, 2]])
yyy = tensor([[1],
        [2],
        [3]])
xxx + yyy = tensor([[2, 3],
        [3, 4],
        [4, 5]])

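2.2 Automatic differentiation

2.2.1 Introduction to Autograd

torch.Tensor is the core class of the autograd package. If its .requires_grad attribute is set to True, autograd tracks every operation on that tensor. Once the computation is finished, calling .backward() computes all gradients automatically. The gradients for this tensor are accumulated into its .grad attribute.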
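Accumulation into .grad means that a second backward pass adds to the stored gradient rather than replacing it. A minimal sketch of this behaviour, with illustrative variable names:

```python
import torch

x = torch.ones(2, requires_grad=True)

(x * 2).sum().backward()
first = x.grad.clone()      # gradient of sum(2x) is 2 for every element

(x * 2).sum().backward()    # a second backward pass adds to x.grad
second = x.grad.clone()

x.grad.zero_()              # reset before the next independent computation
print(first, second, x.grad)
```

This is why training loops typically clear gradients before each backward pass.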

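To stop a tensor from tracking history, call .detach(), which separates it from the computation history and prevents future computations on it from being tracked. To avoid tracking history (and using memory) for a whole block of code, wrap the block in with torch.no_grad():. This is especially useful when evaluating a model, because the model may have trainable parameters with requires_grad = True whose gradients are not needed during evaluation.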
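The two mechanisms can be compared side by side. In this sketch, w stands in for a trainable parameter (an illustrative name, not from the original notes):

```python
import torch

w = torch.ones(3, requires_grad=True)

y = w * 2
y_detached = y.detach()   # same values, cut off from the computation history

with torch.no_grad():
    z = w * 2             # nothing is recorded inside this block

print(y.requires_grad, y_detached.requires_grad, z.requires_grad)
```

Only y is still connected to w; both y_detached and z report requires_grad as False.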

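One more class is essential to the autograd implementation: Function. Tensor and Function are interconnected and build up an acyclic graph that encodes the complete history of the computation. Every tensor has a .grad_fn attribute that references the Function that created it (unless the tensor was created by the user, in which case its grad_fn is None). In the example below the tensor is created by the user, so grad_fn is None.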

import torch
x = torch.randn(3,3,requires_grad=True)
print(x.grad_fn)

None

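To compute derivatives, call .backward() on a Tensor. If the Tensor is a scalar (that is, it holds a single element), backward() needs no arguments; if it has more elements, you must pass a gradient argument, a tensor whose shape matches.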
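The scalar versus non-scalar rule can be seen with a three-element example. This is a minimal sketch: calling backward() on a vector without a gradient argument raises a RuntimeError, while passing a tensor of ones weights every output element equally.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                      # vector output, not a scalar

try:
    y.backward()                # no gradient argument for a non-scalar
    needs_gradient = False
except RuntimeError:
    needs_gradient = True

y = x ** 2                      # rebuild the graph for a clean backward pass
y.backward(torch.ones_like(y))  # gradient has the same shape as y
print(needs_gradient, x.grad)   # x.grad holds 2 * x
```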

Create a tensor and set requires_grad=True to track its computation history:

x = torch.ones(2, 2, requires_grad=True)
print(x)

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

Apply one operation to this tensor:

y = x**2
print(y)

tensor([[1., 1.],
        [1., 1.]], grad_fn=<PowBackward0>)

y is the result of an operation, so it has a grad_fn attribute.

print(y.grad_fn)

<PowBackward0 object at 0x000001CB45988C70>

Apply more operations to y:

z = y * y * 3
out = z.mean()

print(z, out)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<MulBackward0>) tensor(3., grad_fn=<MeanBackward0>)
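Since out is a scalar, backward() can be called on it without arguments. The sketch below repeats the computation above and checks the gradient by hand: out is the mean of 3x^4 over four elements, so each partial derivative is 3x^3, which equals 3 at x = 1.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x ** 2
z = y * y * 3
out = z.mean()

out.backward()            # scalar output: no gradient argument needed
print(x.grad)             # every entry is 3 * x**3 = 3 at x = 1
```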

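.requires_grad_(...) changes the requires_grad flag of an existing tensor in place. A newly created tensor has the flag set to False unless specified otherwise.

a = torch.randn(2, 2) # requires_grad defaults to False when not specified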
a = ((a * 3) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)

False
True
<SumBackward0 object at 0x000001CB4A19FB50>

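2.2.2 Computing gradients

Section 2.2.1 covered the Tensor attribute .requires_grad that drives differentiation: when it is True, operations on the tensor are tracked and .backward() can compute the gradients automatically.
This part shows how to use .backward() to take derivatives during backpropagation.
variable.backward(gradient=None, retain_graph=None, create_graph=False) takes the following main arguments:

gradient (called grad_variables in older PyTorch versions): has the same shape as variable. For y.backward(gradient), gradient plays the role of $\frac {dz}{dy}$ in the chain rule $\frac {dz}{dx}=\frac {dz}{dy}\frac {dy}{dx}$. In torch.autograd.backward, the matching argument may be a tensor or a sequence of tensors;
retain_graph: backpropagation caches some intermediate results, and these caches are freed once the backward pass finishes. Setting this argument keeps them so that backward can be run several times on the same graph.
create_graph: builds a computation graph for the backward pass itself, so higher-order derivatives can be obtained via backward of backward.
Example: compute the derivative of the following function
$y=x^2e^x$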

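import torch as t

def f(x):
    '''Compute y = x^2 * e^x.'''
    y = x**2 * t.exp(x)
    return y

def gradf(x):
    '''Derivative of f, worked out by hand.'''
    dx = 2*x*t.exp(x) + x**2*t.exp(x)
    return dx

x = t.randn(3,4, requires_grad = True)
y = f(x)

y.backward(t.ones(y.size())) # gradient has the same shape as y
print(x.grad)

# the autograd result matches the result of the hand-derived formula
print(gradf(x))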

tensor([[-0.4146, -0.4610,  2.9016,  3.2831],
        [ 3.8102,  1.8614, -0.4536, -0.0244],
        [-0.4321,  0.5110, -0.4549,  8.6048]])
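The retain_graph and create_graph arguments described earlier can be tried on the same function. The following is a sketch under stated assumptions: it uses torch.autograd.grad (a functional counterpart of backward that returns the gradients) for the second derivative, evaluates at the arbitrary points x = 1 and x = 2, and compares against the hand-derived second derivative $(x^2+4x+2)e^x$.

```python
import torch

# Second derivative via create_graph=True.
x = torch.tensor(1.0, requires_grad=True)
y = x**2 * torch.exp(x)

# First derivative, keeping a graph of the derivative itself.
(dy,) = torch.autograd.grad(y, x, create_graph=True)
# Differentiate the first derivative to get the second derivative.
(d2y,) = torch.autograd.grad(dy, x)

expected_d2y = ((x**2 + 4*x + 2) * torch.exp(x)).detach()  # derived by hand
print(torch.allclose(d2y, expected_d2y))

# retain_graph=True keeps the cached buffers so backward can run twice.
x2 = torch.tensor(2.0, requires_grad=True)
y2 = x2**2 * torch.exp(x2)
y2.backward(retain_graph=True)
y2.backward()             # second pass over the same graph
accumulated = x2.grad     # both passes accumulate: 2 * (2*x2 + x2**2) * e^x2
print(accumulated)
```

Without retain_graph=True on the first call, the second backward pass would fail because the cached intermediate results have already been freed.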

2.2.3 References

Autograd看这一篇就够了 ("This one article on Autograd is all you need")
