Relationship between Python and PyTorch: PyTorch Study Notes (1) - CSDN Blog

PyTorch Study Notes (1)

I. Sharpen your tools first: version compatibility between PyTorch, torchvision, Python and CUDA

| pytorch | torchvision | python | cuda |
|---|---|---|---|
| <1.0.1 | 0.2.2 | ==2.7, >=3.5, <=3.7 | 9.0, 10.0 |
| 1.1.0 | 0.3.0 | ==2.7, >=3.5, <=3.7 | 9.0, 10.0 |
| 1.2.0 | 0.4.0 | ==2.7, >=3.5, <=3.7 | 9.2, 10.0 |
| 1.3.0 | 0.4.1 | ==2.7, >=3.5, <=3.7 | 9.2, 10.0 |
| 1.3.1 | 0.4.2 | ==2.7, >=3.5, <=3.7 | 9.2, 10.0 |
| 1.4.0 | 0.5.0 | ==2.7, >=3.5, <=3.8 | 9.2, 10.0 |
| 1.5.0 | 0.6.0 | >=3.6 | 9.2, 10.1, 10.2 |
| 1.5.1 | 0.6.1 | >=3.6 | 9.2, 10.1, 10.2 |

The versions should match one another; mismatched versions tend to cause problems when the code runs.
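The pairing above can also be kept as a small lookup, which may help when pinning versions in a setup script. This is only a sketch in plain Python: it encodes the exact-version rows listed in this post and nothing else, and the names `COMPAT` and `expected_torchvision` are made up for illustration.

```python
# Rows copied from the table in this post: torch version -> (torchvision, CUDA versions).
COMPAT = {
    "1.1.0": ("0.3.0", ("9.0", "10.0")),
    "1.2.0": ("0.4.0", ("9.2", "10.0")),
    "1.3.0": ("0.4.1", ("9.2", "10.0")),
    "1.3.1": ("0.4.2", ("9.2", "10.0")),
    "1.4.0": ("0.5.0", ("9.2", "10.0")),
    "1.5.0": ("0.6.0", ("9.2", "10.1", "10.2")),
    "1.5.1": ("0.6.1", ("9.2", "10.1", "10.2")),
}


def expected_torchvision(torch_version):
    """Return the torchvision version paired with torch_version, or None if not listed."""
    # Drop a local build suffix such as "+cu101" before looking up.
    base = torch_version.split("+")[0]
    entry = COMPAT.get(base)
    return entry[0] if entry else None


print(expected_torchvision("1.5.0"))        # 0.6.0
print(expected_torchvision("1.5.1+cu101"))  # 0.6.1
```

With PyTorch installed, `torch.__version__` and `torch.version.cuda` give the strings to compare against the table.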

II. PyTorch basics

1. Creating tensors

import torch

a1 = torch.tensor(3)

a2 = torch.tensor([1, 2, 3])

a3 = torch.randn(2, 3)

b3 = torch.rand(2, 3)

a4 = torch.rand(1, 2, 3)

print('Value of a1:', a1)

print('Shape of a1:', a1.shape)

print('------------')

print('Value of a2:', a2)

print('Shape of a2:', a2.shape)

print('------------')

print('Value of a3:', a3)

print('Shape of a3:', a3.shape)

print('------------')

print('Value of b3:', b3)

print('Shape of b3:', b3.shape)

print('------------')

print('Value of a4:', a4)

print('Shape of a4:', a4.shape)

print('\n Tensors above were defined step by step \n *******************')

# Output

Value of a1: tensor(3)

Shape of a1: torch.Size([])

------------

Value of a2: tensor([1, 2, 3])

Shape of a2: torch.Size([3])

------------

Value of a3: tensor([[ 0.8593, 0.8400, -0.7855],

        [-0.6212, -0.2771, -0.9999]])

Shape of a3: torch.Size([2, 3])

------------

Value of b3: tensor([[0.0023, 0.1359, 0.0431],

        [0.9841, 0.4317, 0.2710]])

Shape of b3: torch.Size([2, 3])

------------

Value of a4: tensor([[[0.3898, 0.1011, 0.8075],

         [0.4289, 0.2972, 0.8072]]])

Shape of a4: torch.Size([1, 2, 3])

Tensors above were defined step by step

*******************

print(torch.tensor([1, 2.2, -1]))

print('Float tensor built from given data:', torch.FloatTensor([1, 2.2, -1]))

print(torch.tensor([[1, 2.2],[3, -1]])) # nested lists build a multi-dimensional tensor, much like the sizes passed to rand

print('\n Tensors above were defined directly from values \n *******************')

# Output

tensor([ 1.0000, 2.2000, -1.0000])

Float tensor built from given data: tensor([ 1.0000, 2.2000, -1.0000])

tensor([[ 1.0000, 2.2000],

        [ 3.0000, -1.0000]])

Tensors above were defined directly from values

*******************

print(torch.empty(2, 4)) # uninitialized tensor with 2 rows and 4 columns

print('Uninitialized 1x3 float tensor:', torch.FloatTensor(1, 3)) # sizes, not data: the contents are whatever was in memory

print('\n Tensors above are uninitialized (arbitrary values) \n *******************')

# Output

tensor([[1.9758e-43, 0.0000e+00, 0.0000e+00, 0.0000e+00],

        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00]])

Uninitialized 1x3 float tensor: tensor([[0.0000e+00, 0.0000e+00, 5.3564e-18]])

Tensors above are uninitialized (arbitrary values)

*******************

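print('Type of a1 before:', a1.type())

torch.set_default_tensor_type(torch.DoubleTensor) # only affects floating-point tensors created afterwards; the existing integer tensor a1 keeps its type

print('Type of a1 after:', a1.type())

print('\n Changing the default tensor type above \n *******************')

# Output

Type of a1 before: torch.LongTensor

Type of a1 after: torch.LongTensor

Changing the default tensor type above

*******************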

a5 = torch.rand(3)

b5 = torch.randperm(3) # random permutation of the integers 0, 1, 2

print('Value of a5:', a5)

print('Value of b5:', b5)

print('a5 indexed by b5:', a5[b5])

print('\n Random integer permutation above \n *******************')

# Output

Value of a5: tensor([0.5683, 0.6638, 0.6250])

Value of b5: tensor([1, 0, 2])

a5 indexed by b5: tensor([0.6638, 0.5683, 0.6250])

Random integer permutation above

*******************
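Indexing with a permutation is a common way to shuffle data. As a rough plain-Python analogue (the standard `random` module stands in for `torch.randperm` here; it is not how PyTorch implements it):

```python
import random

values = [0.5683, 0.6638, 0.6250]

# A random permutation of the indices 0..n-1, the plain-Python analogue of torch.randperm(n).
perm = random.sample(range(len(values)), len(values))

# Indexing by the permutation reorders the values without dropping or duplicating any.
shuffled = [values[i] for i in perm]

print(perm)
print(shuffled)
```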

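Further reading: other tensor-creation functions

torch.ones(size) / torch.zeros(size) / torch.eye(n): return a tensor of all ones / all zeros / an n-by-n identity matrix

torch.full(size, fill_value): returns a tensor of the given size filled with fill_value

torch.rand(size): returns a tensor drawn from the uniform distribution on [0, 1)

torch.randn(size): returns a tensor drawn from the normal distribution with mean 0 and variance 1

torch.*_like(input): returns a tensor with the same shape as input (same number of dimensions, rows and columns), where * can be rand, randn and so on

torch.linspace(start, end, steps=100): returns a 1-D tensor of steps points evenly spaced from start to end; steps is the number of points, not the step size

torch.logspace(start, end, steps=100, base=10.0): returns a 1-D tensor of steps points spaced evenly on a log scale from base to the power start up to base to the power end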
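Since `steps` is easy to misread as a step size, here is a minimal plain-Python sketch of what the two functions compute. The helper names `linspace` and `logspace` are illustrative re-implementations, not the PyTorch code.

```python
def linspace(start, end, steps):
    """Return `steps` evenly spaced points from start to end, both included."""
    if steps == 1:
        return [float(start)]
    gap = (end - start) / (steps - 1)
    return [start + i * gap for i in range(steps)]


def logspace(start, end, steps, base=10.0):
    """Return base**p for p evenly spaced between start and end."""
    return [base ** p for p in linspace(start, end, steps)]


print(linspace(0, 10, 5))  # [0.0, 2.5, 5.0, 7.5, 10.0]
print(logspace(0, 2, 3))   # [1.0, 10.0, 100.0]
```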

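2. Changing dimensions

An outline first; the code below follows the same order.

tensor.squeeze() / tensor.unsqueeze(0): remove / add a dimension

tensor.expand() / tensor.repeat(): enlarge a tensor

tensor.transpose() / tensor.permute(): reorder the dimensions of a tensor

torch.cat() / torch.stack(): join tensors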

import torch

x = torch.rand(4, 1, 28, 1, 28, 1)

y1 = x.unsqueeze(0) # insert a size-1 dimension at the given index

print('Shape of y1:', y1.shape)

y2 = x.squeeze() # remove every dimension of size 1

print('Shape of y2:', y2.shape)

y3 = x.squeeze(1) # remove only the dimension at index 1, and only if its size is 1

print('Shape of y3:', y3.shape)

# Output

Shape of y1: torch.Size([1, 4, 1, 28, 1, 28, 1])

Shape of y2: torch.Size([4, 28, 28])

Shape of y3: torch.Size([4, 28, 1, 28, 1])
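The shape rules can be written down without any tensors. The following plain-Python sketch uses two hypothetical helpers, `unsqueeze_shape` and `squeeze_shape`, and assumes non-negative dimension indices for simplicity.

```python
def unsqueeze_shape(shape, dim):
    """Shape after inserting a size-1 dimension at position dim."""
    shape = list(shape)
    return tuple(shape[:dim] + [1] + shape[dim:])


def squeeze_shape(shape, dim=None):
    """Shape after removing size-1 dimensions (all of them, or only dim)."""
    if dim is None:
        return tuple(s for s in shape if s != 1)
    # A dimension whose size is not 1 is left untouched.
    return tuple(s for i, s in enumerate(shape) if not (i == dim and s == 1))


x_shape = (4, 1, 28, 1, 28, 1)
print(unsqueeze_shape(x_shape, 0))  # (1, 4, 1, 28, 1, 28, 1)
print(squeeze_shape(x_shape))       # (4, 28, 28)
print(squeeze_shape(x_shape, 1))    # (4, 28, 1, 28, 1)
print(squeeze_shape(x_shape, 0))    # (4, 1, 28, 1, 28, 1), unchanged because dim 0 has size 4
```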

a = torch.tensor([[[1, 2, 3]]])

print(a)

print('Shape of a:', a.shape)

b1 = a.expand(1, 2, 3) # expand only stretches dimensions of size 1: here dim 1 (size 1) grows to 2, while every other dimension must keep its original size

print(b1)

print('Shape of b1:', b1.shape)

b2 = a.expand(1, -1, 3) # -1 keeps that dimension at its original size

print(b2)

print('Shape of b2:', b2.shape)

c = torch.tensor([[[1, 2, 3]]])

print(c)

d1 = c.repeat(2, 4, 2) # repeat tiles the whole tensor: 2 copies along dim 0, 4 along dim 1, 2 along dim 2, so (1, 1, 3) becomes (2, 4, 6); sizes need not be 1

print(d1)

print('Shape of d1:', d1.shape)

d2 = c.repeat(2, 4, 2, 1) # more factors than dimensions adds a leading dimension: c is treated as (1, 1, 1, 3), giving 2 blocks of 4 sub-blocks, each with the row repeated twice and the columns not repeated

print(d2)

print('Shape of d2:', d2.shape)

# Output

tensor([[[1, 2, 3]]])

Shape of a: torch.Size([1, 1, 3])

tensor([[[1, 2, 3],

         [1, 2, 3]]])

Shape of b1: torch.Size([1, 2, 3])

tensor([[[1, 2, 3]]])

Shape of b2: torch.Size([1, 1, 3])

tensor([[[1, 2, 3]]])

tensor([[[1, 2, 3, 1, 2, 3],

         [1, 2, 3, 1, 2, 3],

         [1, 2, 3, 1, 2, 3],

         [1, 2, 3, 1, 2, 3]],

        [[1, 2, 3, 1, 2, 3],

         [1, 2, 3, 1, 2, 3],

         [1, 2, 3, 1, 2, 3],

         [1, 2, 3, 1, 2, 3]]])

Shape of d1: torch.Size([2, 4, 6])

tensor([[[[1, 2, 3],

          [1, 2, 3]],

         [[1, 2, 3],

          [1, 2, 3]],

         [[1, 2, 3],

          [1, 2, 3]],

         [[1, 2, 3],

          [1, 2, 3]]],

        [[[1, 2, 3],

          [1, 2, 3]],

         [[1, 2, 3],

          [1, 2, 3]],

         [[1, 2, 3],

          [1, 2, 3]],

         [[1, 2, 3],

          [1, 2, 3]]]])

Shape of d2: torch.Size([2, 4, 2, 3])
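To make the difference between the two rules concrete, here is a plain-Python sketch of the resulting shapes. `expand_shape` and `repeat_shape` are hypothetical helpers, and `expand_shape` is simplified to the case where the number of dimensions stays the same.

```python
def expand_shape(shape, sizes):
    """Shape after expand: only size-1 dimensions may grow; -1 keeps the size."""
    if len(sizes) != len(shape):
        raise ValueError("this sketch only handles the same number of dimensions")
    out = []
    for old, new in zip(shape, sizes):
        if new == -1 or new == old:
            out.append(old)
        elif old == 1:
            out.append(new)
        else:
            raise ValueError(f"cannot expand size {old} to {new}")
    return tuple(out)


def repeat_shape(shape, repeats):
    """Shape after repeat: pad shape with leading 1s, then multiply per dimension."""
    if len(repeats) < len(shape):
        raise ValueError("repeat needs at least one factor per dimension")
    padded = (1,) * (len(repeats) - len(shape)) + tuple(shape)
    return tuple(s * r for s, r in zip(padded, repeats))


print(expand_shape((1, 1, 3), (1, 2, 3)))     # (1, 2, 3)
print(expand_shape((1, 1, 3), (1, -1, 3)))    # (1, 1, 3)
print(repeat_shape((1, 1, 3), (2, 4, 2)))     # (2, 4, 6)
print(repeat_shape((1, 1, 3), (2, 4, 2, 1)))  # (2, 4, 2, 3)
```

One practical difference worth keeping in mind: expand returns a view that shares the original data, whereas repeat copies it.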

e = torch.rand(2, 2, 3, 4)

# print(e)

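f1 = e.transpose(1, 3) # swap the two given dimensions; exactly two can be swapped at a time

# print(f1)

print('Shape of f1:', f1.shape)

f2 = e.permute(0, 2, 3, 1) # reorder all dimensions as listed; every dimension of the original tensor must appear exactly once

# print(f2)

print('Shape of f2:', f2.shape)

# Output

Shape of f1: torch.Size([2, 4, 3, 2])

Shape of f2: torch.Size([2, 3, 4, 2])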
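A transpose is simply a permute that swaps two positions. The plain-Python sketch below (helper names are illustrative only) reproduces the two shapes printed above.

```python
def permute_shape(shape, order):
    """Shape after permute: output dimension i takes the size of input dimension order[i]."""
    if sorted(order) != list(range(len(shape))):
        raise ValueError("order must list every dimension exactly once")
    return tuple(shape[i] for i in order)


def transpose_shape(shape, dim0, dim1):
    """Shape after transpose: a permute that swaps just two dimensions."""
    order = list(range(len(shape)))
    order[dim0], order[dim1] = order[dim1], order[dim0]
    return permute_shape(shape, order)


e_shape = (2, 2, 3, 4)
print(transpose_shape(e_shape, 1, 3))        # (2, 4, 3, 2)
print(permute_shape(e_shape, (0, 2, 3, 1)))  # (2, 3, 4, 2)
```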

g1 = torch.randn(3, 4)

g2 = torch.rand(3, 4)

print(g1)

print(g2)

h1 = torch.cat((g1, g2), 0) # join along an existing dimension (dim 0, the rows): two (3, 4) tensors give (6, 4)

h2 = torch.stack((g1, g2), 0) # join along a new dimension inserted at dim; 0 is the usual choice, but any value from 0 to g1.dim() is valid

print('Shape of h1:', h1.shape)

print('Shape of h2:', h2.shape)

# Output

tensor([[ 0.5554, 0.0449, 0.1231, -0.5494],

        [-0.1639, -0.2909, 2.2580, 1.5841],

        [ 0.1315, -1.4964, 0.0706, -0.9549]])

tensor([[0.9899, 0.5225, 0.7383, 0.9421],

        [0.5493, 0.0317, 0.3085, 0.9770],

        [0.5221, 0.0223, 0.2915, 0.7914]])

Shape of h1: torch.Size([6, 4])

Shape of h2: torch.Size([2, 3, 4])
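The contrast between the two is easiest to see in the shapes: cat grows an existing dimension, stack creates a new one. A plain-Python sketch with hypothetical helpers `cat_shape` and `stack_shape`, assuming non-negative `dim`:

```python
def cat_shape(shapes, dim=0):
    """Shape after cat: sizes add up along dim, all other dimensions must match."""
    first = tuple(shapes[0])
    for s in shapes[1:]:
        same_rank = len(s) == len(first)
        if not same_rank or any(a != b for i, (a, b) in enumerate(zip(first, s)) if i != dim):
            raise ValueError("shapes must match except along dim")
    total = sum(s[dim] for s in shapes)
    return tuple(total if i == dim else n for i, n in enumerate(first))


def stack_shape(shapes, dim=0):
    """Shape after stack: identical shapes, plus a new dimension of size len(shapes) at dim."""
    first = tuple(shapes[0])
    if any(tuple(s) != first for s in shapes):
        raise ValueError("all shapes must be identical")
    return first[:dim] + (len(shapes),) + first[dim:]


g = [(3, 4), (3, 4)]
print(cat_shape(g, 0))    # (6, 4)
print(cat_shape(g, 1))    # (3, 8)
print(stack_shape(g, 0))  # (2, 3, 4)
print(stack_shape(g, 2))  # (3, 4, 2)
```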

3. Indexing, slicing and math operations

Indexing and slicing:

import torch

a = torch.rand(2, 3, 4, 4)

print(a.shape)

# Indexing

print('Indexing the first two dimensions of a:', a[0, 0].shape)

print('Indexing a single element of a:', a[0, 0, 2, 3])

# Slicing

print('Slicing the first dimension of a:', a[:1].shape)

print('Slicing every dimension of a:', a[:-1, :1, :, :].shape)

# Using ... (Ellipsis)

print(a[...].shape)

print(a[0, ...].shape)

print(a[:, 2, ...].shape)

print(a[..., :2].shape)

# Selecting with a mask

x = torch.rand(3, 4)

print(x)

mask = x.ge(0.5) # compare with 0.5: True where x >= 0.5, False otherwise

print(mask)

print(torch.masked_select(x, mask)) # pick out the elements where mask is True, returned as a 1-D tensor

# Selecting with torch.take (indices refer to the flattened tensor)

y = torch.tensor([[4, 3, 5], [6, 7, 8]])

y1 = torch.take(y, torch.tensor([0, 2, 5]))

print('Values of y:', y)

print('Values of y1:', y1)

# Output

torch.Size([2, 3, 4, 4])

# Indexing results

Indexing the first two dimensions of a: torch.Size([4, 4])

Indexing a single element of a: tensor(0.8660)

# Slicing results

Slicing the first dimension of a: torch.Size([1, 3, 4, 4])

Slicing every dimension of a: torch.Size([1, 1, 4, 4])

# Results of using ...

torch.Size([2, 3, 4, 4])

torch.Size([3, 4, 4])

torch.Size([2, 4, 4])

torch.Size([2, 3, 4, 2])

# Mask selection results

tensor([[0.5534, 0.1831, 0.9449, 0.6261],

        [0.4419, 0.2026, 0.4816, 0.0258],

        [0.7853, 0.9431, 0.7531, 0.2443]])

tensor([[ True, False, True, True],

        [False, False, False, False],

        [ True, True, True, False]])

tensor([0.5534, 0.9449, 0.6261, 0.7853, 0.9431, 0.7531])

# torch.take results

Values of y: tensor([[4, 3, 5],

        [6, 7, 8]])

Values of y1: tensor([4, 5, 8])
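Both torch.take and torch.masked_select work on the data as if it were flattened row by row, which is why their results are 1-D. A plain-Python sketch on nested lists (the helpers below are illustrative stand-ins, not the PyTorch functions):

```python
def flatten(rows):
    """Flatten a 2-D nested list in row-major order."""
    return [v for row in rows for v in row]


def take(rows, indices):
    """Pick elements by their position in the flattened data."""
    flat = flatten(rows)
    return [flat[i] for i in indices]


def masked_select(rows, mask):
    """Keep the elements whose mask entry is True; the result is always 1-D."""
    return [v for v, keep in zip(flatten(rows), flatten(mask)) if keep]


y = [[4, 3, 5], [6, 7, 8]]
print(take(y, [0, 2, 5]))  # [4, 5, 8]

mask = [[v >= 5 for v in row] for row in y]
print(masked_select(y, mask))  # [5, 6, 7, 8]
```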

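Addition, subtraction, multiplication:

torch.add(): addition

torch.sub(): subtraction

torch.mul() / torch.mm() / torch.bmm() / torch.matmul(): multiplication

Math operations: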

import torch

# Addition, subtraction, multiplication

a = torch.rand(3, 4)

b = torch.rand(4)

c1 = a + b

c2 = torch.add(a, b)

print('Result with the + operator:', c1)

print('Result with add:', c2)

d1 = a - b

d2 = torch.sub(a, b)

print('Result with the - operator:', d1)

print('Result with sub:', d2)

c = torch.randn(1, 2, 3)

d = torch.randn(1, 3, 4)

e = torch.rand(1, 2)

f = torch.rand(2, 3)

e1 = a * b

e2 = torch.mul(a, b) # element-wise product; when a and b have different shapes, the smaller one is broadcast to match before multiplying

e3 = torch.mm(e, f) # 2-D matrices only; the shapes must satisfy the matrix-multiplication rule

e4 = torch.bmm(c, d) # both inputs must be 3-D: (b, n, m) times (b, m, p) gives (b, n, p)

e5 = torch.matmul(c, d) # broadcasts the leading (batch) dimensions when they differ; the last two dimensions must satisfy the matrix-multiplication rule

print(e1)

print(e2)

print(e3)

print(e4)

print(e5)

# Output

Result with the + operator: tensor([[0.9060, 1.1983, 1.1655, 1.2972],

        [1.6351, 0.3494, 0.8485, 1.0029],

        [1.8000, 0.4619, 0.9559, 0.7184]])

Result with add: tensor([[0.9060, 1.1983, 1.1655, 1.2972],

        [1.6351, 0.3494, 0.8485, 1.0029],

        [1.8000, 0.4619, 0.9559, 0.7184]])

Result with the - operator: tensor([[-0.8189, 0.7739, 0.7891, 0.2740],

        [-0.0898, -0.0749, 0.4722, -0.0202],

        [ 0.0752, 0.0375, 0.5796, -0.3047]])

Result with sub: tensor([[-0.8189, 0.7739, 0.7891, 0.2740],

        [-0.0898, -0.0749, 0.4722, -0.0202],

        [ 0.0752, 0.0375, 0.5796, -0.3047]])

tensor([[0.0376, 0.2092, 0.1839, 0.4019],

[0.6663, 0.0291, 0.1243, 0.2514],

[0.8086, 0.0530, 0.1445, 0.1058]])

tensor([[0.0376, 0.2092, 0.1839, 0.4019],

[0.6663, 0.0291, 0.1243, 0.2514],

[0.8086, 0.0530, 0.1445, 0.1058]])

tensor([[0.1087, 0.0323, 0.2181]])

tensor([[[ 1.9481, 3.7797, -2.5594, 0.2444],

[ 0.3162, 0.1580, -0.0066, 0.0721]]])

tensor([[[ 1.9481, 3.7797, -2.5594, 0.2444],

[ 0.3162, 0.1580, -0.0066, 0.0721]]])
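The broadcasting mentioned for mul and matmul follows a simple shape rule: align the shapes from the right, and each pair of sizes must be equal or contain a 1. The plain-Python sketch below works on shapes only; `broadcast_shape` and `matmul_shape` are hypothetical helpers, and `matmul_shape` assumes both inputs have at least two dimensions.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Broadcast two shapes: align from the right; sizes must be equal or one of them 1."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def matmul_shape(a, b):
    """Result shape of a (batched) matrix product for inputs with at least 2 dimensions."""
    if a[-1] != b[-2]:
        raise ValueError("inner dimensions must match")
    batch = broadcast_shape(a[:-2], b[:-2])
    return batch + (a[-2], b[-1])


print(broadcast_shape((3, 4), (4,)))          # (3, 4), as in a + b and a * b
print(matmul_shape((1, 2), (2, 3)))           # (1, 3), as in torch.mm(e, f)
print(matmul_shape((1, 2, 3), (1, 3, 4)))     # (1, 2, 4), as in bmm / matmul of c and d
print(matmul_shape((5, 1, 2, 3), (7, 3, 4)))  # (5, 7, 2, 4), batch dimensions broadcast
```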

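Further reading:

torch.exp(): e raised to the power of each element

torch.log(): natural logarithm

torch.mean(): mean

torch.sum(): sum

torch.max() / torch.min(): maximum / minimum

torch.prod(): product of all elements of input

torch.argmin(input) / torch.argmax(input): index of the minimum / maximum value

torch.where(condition, x, y): takes elements from x where condition holds and from y elsewhere

torch.gather(input, dim, index): gathers values along the axis given by dim

tensor.floor(): round down

tensor.pow(exponent): raise to a power, e.g. pow(2) squares

tensor.sqrt(): square root

tensor.ceil(): round up

tensor.round(): round to the nearest integer (ties go to the nearest even integer)

tensor.trunc(): integer part

tensor.frac(): fractional part

tensor.clamp(min, max): values below min become min, values above max become max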
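Of the functions listed, gather is probably the least obvious, so a small worked sketch may help. The function `gather_2d` below is a plain-Python illustration of the indexing rule for 2-D data only; it is not the PyTorch implementation.

```python
def gather_2d(data, dim, index):
    """2-D version of the gather rule.

    dim=0: out[i][j] = data[index[i][j]][j]
    dim=1: out[i][j] = data[i][index[i][j]]
    """
    out = []
    for i, row in enumerate(index):
        if dim == 0:
            out.append([data[k][j] for j, k in enumerate(row)])
        elif dim == 1:
            out.append([data[i][k] for k in row])
        else:
            raise ValueError("dim must be 0 or 1 for 2-D data")
    return out


data = [[1, 2], [3, 4]]
print(gather_2d(data, 1, [[0, 0], [1, 0]]))  # [[1, 1], [4, 3]]
print(gather_2d(data, 0, [[0, 1], [1, 0]]))  # [[1, 4], [3, 2]]
```

The output always has the shape of index; the index values only choose which position along dim each output element is read from.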

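4. autograd: automatic differentiation

First, a tensor is created in PyTorch with torch.tensor(data, dtype=None, device=None, requires_grad=False), where the values shown are the defaults. Put simply, automatic differentiation means that gradients can be computed for a tensor automatically: set requires_grad to True when defining the tensor, and PyTorch records the operations applied to it so that the derivatives can be obtained by calling backward(). A short example shows the flow.

The expression we differentiate is $z_i = 3(x_i + 2)^2$, applied to each element $x_i$ of $x$.

Case 1: the output is a scalar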

import torch

x = torch.ones(1, 3, requires_grad=True) # an all-ones tensor keeps the hand calculation simple

a = x + 2

z = 3 * a.pow(2)

print('Value of x', x)

print('Value of a', a)

print('Value of z', z)

out = torch.mean(z) # out is a scalar; the mean divides by the number of elements in x (3)

out.backward()

print(x.grad)

# Output

Value of x tensor([[1., 1., 1.]], requires_grad=True)

Value of a tensor([[3., 3., 3.]], grad_fn=<AddBackward0>)

Value of z tensor([[27., 27., 27.]], grad_fn=<MulBackward0>)

tensor([[6., 6., 6.]])

In the code above, out is defined as:

$$out = \frac{3\left[ \left( x_1 + 2 \right)^2 + \left( x_2 + 2 \right)^2 + \left( x_3 + 2 \right)^2 \right]}{3}$$

Each $x_i$ appears in only one term, so differentiating and evaluating at $x_i = 1$ gives:

$$\frac{\partial out}{\partial x_i} = \frac{3 \cdot 2\left( x_i + 2 \right)}{3} = 2\left( x_i + 2 \right) = 6, \quad i = 1, 2, 3$$
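The hand calculation can be double-checked numerically without autograd. The sketch below uses central finite differences in plain Python; `out_fn` and `numerical_grad` are hypothetical helper names, and the step size `eps` is an arbitrary small value chosen for illustration.

```python
def out_fn(x):
    """out = mean(3 * (x_i + 2) ** 2) over all elements of x."""
    return sum(3 * (xi + 2) ** 2 for xi in x) / len(x)


def numerical_grad(f, x, eps=1e-6):
    """Central-difference estimate of d f / d x_i for each i."""
    grad = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


x = [1.0, 1.0, 1.0]
analytic = [2 * (xi + 2) for xi in x]  # 3 * 2 * (x_i + 2) / 3
numeric = numerical_grad(out_fn, x)

print(analytic)                        # [6.0, 6.0, 6.0]
print([round(g, 4) for g in numeric])  # should agree with the analytic values
```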

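Case 2: the output is a vector

import torch

import copy

x = torch.ones(1, 3, requires_grad=True) # an all-ones tensor keeps the hand calculation simple

a = x + 2

z = 3 * a.pow(2)

print('Value of x', x)

print('Value of a', a)

print('Value of z', z)

gradients1 = torch.tensor([[0.1, 1, 0.01]], dtype=torch.float) # this argument must have the same shape as z, the tensor backward is called on

z.backward(gradients1, retain_graph=True) # retain_graph=True keeps the graph so backward can be called again; each gradients vector yields one row of the final result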

A_temp = copy.deepcopy(x.grad)