Team-Learning PyTorch, Task 2: Data Structure Basics

Amihua Lau

Article link: https://blog.csdn.net/weixin_43913783/article/details/120751915

import torch

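Data Structures

Tensors

A tensor is an array of numerical values that may have several dimensions (axes). A tensor with one axis corresponds to a vector in mathematics, and a tensor with two axes corresponds to a matrix. Tensors with more than two axes have no special mathematical name.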
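As a quick illustration of the axis terminology, the sketch below builds a tensor with one, two, and three axes and prints the number of axes and the shape of each. The variable names and values are chosen only for this illustration.

```python
import torch

# Tensors with one, two, and three axes (values chosen only for illustration).
vector = torch.arange(4)                    # one axis, shape (4,)
matrix = torch.arange(6).reshape(2, 3)      # two axes, shape (2, 3)
cube = torch.arange(24).reshape(2, 3, 4)    # three axes, shape (2, 3, 4)

for name, t in [("vector", vector), ("matrix", matrix), ("cube", cube)]:
    # ndim is the number of axes; shape lists the length of each axis
    print(name, t.ndim, tuple(t.shape))
```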

Creating a row vector

x = torch.tensor([0, 1])
x.shape

torch.Size([2])

x = torch.arange(5)  # arange(a, b) returns a 1-D tensor running from a to b-1; with a single argument it starts at 0
x

tensor([0, 1, 2, 3, 4])

Initialization

torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])  # initialize from a nested list

tensor([[2, 1, 4, 3],
        [1, 2, 3, 4],
        [4, 3, 2, 1]])

x = torch.rand(4, 3)  # random matrix of size (4, 3), sampled uniformly from [0, 1)
x

tensor([[0.1836, 0.0780, 0.8451],
        [0.7885, 0.4624, 0.7173],
        [0.6457, 0.8101, 0.0711],
        [0.9758, 0.0086, 0.8012]])

x = torch.zeros(4, 3, dtype=torch.long)  # all-zeros matrix
x

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])

x = torch.ones(4, 3, dtype=torch.long)  # all-ones matrix
x

tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]])

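Creating a tensor from an existing tensor:

The new_* methods create a new tensor that by default has the same torch.dtype and torch.device as the original, unless these are overridden explicitly.

x = x.new_ones(4, 3, dtype=torch.double)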
x

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)

Overriding the data type (the shape is unchanged; randn_like fills the new tensor with standard normal samples)

x = torch.randn_like(x, dtype=torch.float)
x

tensor([[ 1.2294,  1.6048,  0.0640],
        [-1.8506,  0.1932,  0.7351],
        [ 0.1602,  0.0789,  1.0126],
        [ 0.6248, -0.3423,  0.4942]])

Size

x.shape  # the shape (size) of the tensor

torch.Size([4, 3])

x.size()  # same as x.shape

torch.Size([4, 3])

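x = torch.ones(4, 3, dtype=torch.long)  # reset x to the all-ones matrix used in the reshape examples
x

tensor([[1, 1, 1],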
        [1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]])

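# Change the shape to (3, 4). For this all-ones tensor the result looks like a transpose,
# but reshape keeps the row-major element order and is not a transpose in general.
x.reshape(3, 4)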

tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]])
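A caveat worth checking: for an all-ones tensor, reshape(3, 4) and a transpose print the same thing, but in general they differ. The sketch below uses distinct values (a hypothetical tensor m, not part of the original notes) so the difference is visible.

```python
import torch

m = torch.arange(12).reshape(4, 3)   # distinct values make the difference visible

reshaped = m.reshape(3, 4)   # keeps row-major element order, only regroups it
transposed = m.T             # swaps the two axes

print(reshaped)
print(transposed)
print(torch.equal(reshaped, transposed))  # False
```

The first row of the reshaped tensor is 0, 1, 2, 3, while the first row of the transpose is 0, 3, 6, 9.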

Automatic shape inference

x.reshape(-1, 2)  # ask for 2 columns; -1 lets the number of rows be inferred

tensor([[1, 1],
        [1, 1],
        [1, 1],
        [1, 1],
        [1, 1],
        [1, 1]])

x.numel()  # total number of elements, i.e. the product of the entries of the shape

x.size()

torch.Size([4, 3])

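Function	Purpose
Tensor(*sizes)	Basic constructor
tensor(data)	Build from data, similar to np.array
ones(*sizes)	All ones
zeros(*sizes)	All zeros
eye(*sizes)	Ones on the diagonal, zeros elsewhere
arange(s,e,step)	From s up to (but not including) e, with step size step
linspace(s,e,steps)	From s to e inclusive, evenly spaced into steps points
rand/randn(*sizes)	Uniform on [0, 1) / standard normal
normal(mean,std)/uniform(from,to)	Normal / uniform distribution (uniform is available as the in-place Tensor method uniform_)
randperm(m)	Random permutation of 0 to m-1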
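A few of the constructors in the table are not demonstrated elsewhere in these notes. The following is a minimal sketch of eye, linspace, arange with a step, and randperm; the argument values are arbitrary choices for illustration.

```python
import torch

eye = torch.eye(3)                      # 3x3 identity: ones on the diagonal
steps = torch.linspace(0, 1, steps=5)   # 5 evenly spaced points, both ends included
evens = torch.arange(0, 10, 2)          # 0, 2, 4, 6, 8 (the end value 10 is excluded)
perm = torch.randperm(5)                # a random ordering of 0..4

print(eye)
print(steps)                   # tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
print(evens)                   # tensor([0, 2, 4, 6, 8])
print(sorted(perm.tolist()))   # [0, 1, 2, 3, 4]
```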

Operations

Arithmetic

Element-wise computation

x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y  # ** is the exponentiation operator

(tensor([ 3.,  4.,  6., 10.]),
 tensor([-1.,  0.,  2.,  6.]),
 tensor([ 2.,  4.,  8., 16.]),
 tensor([0.5000, 1.0000, 2.0000, 4.0000]),
 tensor([ 1.,  4., 16., 64.]))

torch.exp(x)

tensor([2.7183e+00, 7.3891e+00, 5.4598e+01, 2.9810e+03])

Sum of all elements

x.sum()

tensor(15.)

Logical operations

X == Y  # X and Y are the 3×4 tensors defined in the torch.cat example further down

tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False]])

Linear algebra

Concatenation

To concatenate, provide a list (or tuple) of tensors and the axis along which to join them.

X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
torch.cat((X, Y), dim=0), torch.cat((X, Y), dim=1)

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [ 2.,  1.,  4.,  3.],
         [ 1.,  2.,  3.,  4.],
         [ 4.,  3.,  2.,  1.]]),
 tensor([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
         [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
         [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]]))

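When calling torch.cat((X, Y), dim), the tensors must have the same size along every axis except the one being concatenated.

dim selects the axis along which to concatenate: 0 is the first axis (rows, so the tensors are stacked vertically) and 1 is the second axis (columns, so they are placed side by side).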
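The shape requirement for torch.cat can be seen with two tensors that agree on only one axis. This is a small sketch with hypothetical tensors a and b: concatenation along dim=0 works because the column counts match, while dim=1 fails because the row counts differ.

```python
import torch

a = torch.zeros(3, 4)
b = torch.ones(2, 4)    # same number of columns as a, different number of rows

rows = torch.cat((a, b), dim=0)   # stack vertically: 3 + 2 rows, 4 columns
print(rows.shape)                 # torch.Size([5, 4])

try:
    torch.cat((a, b), dim=1)      # row counts differ (3 vs 2), so this fails
except RuntimeError as err:
    print("cannot concatenate along dim=1:", type(err).__name__)
```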

Matrices

A = torch.arange(20).reshape(5, 4)
A

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]])

A.T

tensor([[ 0,  4,  8, 12, 16],
        [ 1,  5,  9, 13, 17],
        [ 2,  6, 10, 14, 18],
        [ 3,  7, 11, 15, 19]])

Tensors

X = torch.arange(24).reshape(2, 3, 4)
X

tensor([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],

        [[12, 13, 14, 15],
         [16, 17, 18, 19],
         [20, 21, 22, 23]]])

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
B = A.clone()  # allocate new memory and assign a copy of A to B
A == B

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])

A, A + B

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 tensor([[ 0.,  2.,  4.,  6.],
         [ 8., 10., 12., 14.],
         [16., 18., 20., 22.],
         [24., 26., 28., 30.],
         [32., 34., 36., 38.]]))

Hadamard product: element-wise multiplication

A * B

tensor([[  0.,   1.,   4.,   9.],
        [ 16.,  25.,  36.,  49.],
        [ 64.,  81., 100., 121.],
        [144., 169., 196., 225.],
        [256., 289., 324., 361.]])

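Adding a scalar to a tensor or multiplying a tensor by a scalar does not change the tensor's shape; the operation is applied to every element.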

a = 2
X = torch.arange(24).reshape(2, 3, 4)
a + X, (a * X).shape

(tensor([[[ 2,  3,  4,  5],
          [ 6,  7,  8,  9],
          [10, 11, 12, 13]],
 
         [[14, 15, 16, 17],
          [18, 19, 20, 21],
          [22, 23, 24, 25]]]),
 torch.Size([2, 3, 4]))

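Broadcasting

When an element-wise operation is applied to two tensors of different shapes, broadcasting may be triggered: elements are first replicated as needed so that both tensors have the same shape, and the operation is then applied element-wise.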

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))
a, b

(tensor([[0],
         [1],
         [2]]),
 tensor([[0, 1]]))

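Because a and b are 3×1 and 1×2 matrices, their shapes do not match for addition. Both are broadcast to a larger 3×2 matrix: a replicates its column, b replicates its row, and the results are then added element-wise.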

a + b

tensor([[0, 1],
        [1, 2],
        [2, 3]])
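To make the replication step concrete, the sketch below expands both operands to the common shape (3, 2) by hand with Tensor.expand and checks that the sum matches the broadcast result. The names a_big and b_big are introduced only for this illustration.

```python
import torch

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))

# Explicitly replicate each operand to the common shape (3, 2)
a_big = a.expand(3, 2)   # the single column of a is repeated twice
b_big = b.expand(3, 2)   # the single row of b is repeated three times

print(a_big)
print(b_big)
print(torch.equal(a + b, a_big + b_big))  # True
```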

Reduction

Sum of all elements

A

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.],
        [16., 17., 18., 19.]])

A.shape

torch.Size([5, 4])

Sum along an axis

A_sum_axis0 = A.sum(axis=0)
A_sum_axis0, A_sum_axis0.shape

(tensor([40., 45., 50., 55.]), torch.Size([4]))

A_sum_axis1 = A.sum(axis=1)
A_sum_axis1, A_sum_axis1.shape

(tensor([ 6., 22., 38., 54., 70.]), torch.Size([5]))

Mean of a tensor of arbitrary shape

A.mean(), A.sum() / A.numel()

(tensor(9.5000), tensor(9.5000))

Non-reducing sum (the number of dimensions is kept; the summed axis shrinks to length 1)

sum_A = A.sum(axis=1, keepdims=True)
sum_A.shape

torch.Size([5, 1])

Division with broadcasting

A / A.sum(axis=1, keepdims=True)

tensor([[0.0000, 0.1667, 0.3333, 0.5000],
        [0.1818, 0.2273, 0.2727, 0.3182],
        [0.2105, 0.2368, 0.2632, 0.2895],
        [0.2222, 0.2407, 0.2593, 0.2778],
        [0.2286, 0.2429, 0.2571, 0.2714]])
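The division above normalizes each row of A so that it sums to 1, and it relies on the kept dimension. The following sketch checks both points; it rebuilds A so that it runs on its own, and uses the dim/keepdim spellings of the same arguments.

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
row_sums = A.sum(dim=1, keepdim=True)   # shape (5, 1), so it broadcasts across columns
normalized = A / row_sums

print(normalized.sum(dim=1))  # every entry is (approximately) 1

# Without keepdim the sum has shape (5,), which cannot broadcast against (5, 4)
try:
    A / A.sum(dim=1)
except RuntimeError as err:
    print("shape mismatch:", type(err).__name__)
```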

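To compute the cumulative sum of the elements of A along an axis, say axis=0 (accumulating down the rows), call cumsum. This function does not reduce the input tensor along any axis. As the example shows, each row is the running total of all rows up to and including it.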

A.cumsum(axis=0)

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  6.,  8., 10.],
        [12., 15., 18., 21.],
        [24., 28., 32., 36.],
        [40., 45., 50., 55.]])

A.sum(axis=0,keepdims = True)

tensor([[40., 45., 50., 55.]])
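The relationship between cumsum and sum can be checked directly: the last row of the running total along axis 0 is the column sum. A minimal sketch, rebuilding A so that it is self-contained:

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
running = A.cumsum(dim=0)   # same shape as A; row i holds the sum of rows 0..i

print(running.shape)                            # torch.Size([5, 4])
print(torch.equal(running[-1], A.sum(dim=0)))   # True: the last row is the column total
```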

A

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.],
        [16., 17., 18., 19.]])

Dot product

x = torch.arange(4 , dtype = torch.float32)
y = torch.ones(4, dtype = torch.float32)
x, y, torch.dot(x, y)

(tensor([0., 1., 2., 3.]), tensor([1., 1., 1., 1.]), tensor(6.))

Matrix-vector product

A.shape, x.shape, torch.mv(A, x)

(torch.Size([5, 4]), torch.Size([4]), tensor([ 14.,  38.,  62.,  86., 110.]))

Matrix-matrix multiplication

B = torch.ones(4, 3)
torch.mm(A, B)

tensor([[ 6.,  6.,  6.],
        [22., 22., 22.],
        [38., 38., 38.],
        [54., 54., 54.],
        [70., 70., 70.]])
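The three products above are related: torch.dot, torch.mv, and torch.mm are shape-specific, and the @ operator (torch.matmul) dispatches to the matching one. The sketch below, with the same A, x, and B rebuilt locally, checks this and shows that each entry of a matrix-vector product is a dot product with one row.

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
x = torch.arange(4, dtype=torch.float32)
B = torch.ones(4, 3)

# torch.mv and torch.mm are the shape-specific forms; @ (torch.matmul) picks
# the right one from the operand shapes.
print(torch.allclose(A @ x, torch.mv(A, x)))   # True
print(torch.allclose(A @ B, torch.mm(A, B)))   # True
print(torch.allclose(x @ x, torch.dot(x, x)))  # True

# Each entry of the matrix-vector product is a dot product with one row of A
print(torch.dot(A[1], x))  # tensor(38.)
```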

$1$-norm

u = torch.tensor([3.0, -4.0])
torch.abs(u).sum()

tensor(7.)

$2$-norm

u = torch.tensor([3.0, -4.0])
torch.norm(u)

tensor(5.)

Frobenius norm (matrix norm)

$||X||_F = \sqrt{\sum^{m}_{i = 1} \sum^{n}_{j=1}x_{ij}^2}$

torch.ones((4, 9)),torch.norm(torch.ones((4, 9)))

(tensor([[1., 1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1., 1.]]),
 tensor(6.))
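As a cross-check, the three norms can be computed straight from their definitions and compared with torch.linalg.norm, the newer interface to the same computations. This is a sketch for verification only, not a replacement for the calls above.

```python
import torch

u = torch.tensor([3.0, -4.0])
l1 = torch.abs(u).sum()            # |3| + |-4| = 7
l2 = torch.sqrt((u ** 2).sum())    # sqrt(9 + 16) = 5

X = torch.ones((4, 9))
fro = torch.sqrt((X ** 2).sum())   # square root of 36 ones = 6

print(l1, l2, fro)
# torch.linalg.norm: ord=1 gives the vector 1-norm; the default is the 2-norm
# for vectors and the Frobenius norm for matrices
print(torch.linalg.norm(u, ord=1), torch.linalg.norm(u), torch.linalg.norm(X))
```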

Automatic differentiation

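torch.Tensor is the core class of the package. If its attribute .requires_grad is set to True, autograd tracks every operation on that tensor. Once the computation is finished, calling .backward() computes all the gradients automatically, and the gradient with respect to this tensor is accumulated into its .grad attribute.

Note: when calling y.backward(), no argument is needed if y is a scalar; otherwise, you must pass a tensor with the same shape as y.

In practice, with a single loss it is usually enough to reduce it to a scalar with .sum() before calling backward(); with several losses, they can be combined as a weighted sum.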
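The two ways of handling a non-scalar output can be compared side by side. In this sketch (the input values are arbitrary), passing a tensor of ones with the same shape as y gives the same gradient as summing y to a scalar first.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                      # y is a vector, not a scalar

# Option 1: pass a gradient tensor with the same shape as y
y.backward(torch.ones_like(y))
grad_from_vector = x.grad.clone()
print(grad_from_vector)         # tensor([2., 4., 6.])

# Option 2: reduce to a scalar first; the result is the same
x.grad.zero_()
(x ** 2).sum().backward()
grad_from_sum = x.grad.clone()
print(grad_from_sum)            # tensor([2., 4., 6.])
```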

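Remember to zero the gradients, because they accumulate on every backward pass

To stop a tensor from tracking history, call .detach(), which separates it from the computation history and prevents future computation from being tracked. To prevent history tracking (and the memory it uses) for a whole block of code, wrap it in with torch.no_grad():. This is especially useful when evaluating a model, because the model may have trainable parameters with requires_grad=True whose gradients are not needed during evaluation.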
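A short sketch of both mechanisms, using a small example tensor: .detach() returns a tensor cut off from the graph, and operations inside torch.no_grad() are not recorded at all.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

y = x * 2
y_detached = y.detach()       # shares values with y but is cut off from the graph
print(y.requires_grad, y_detached.requires_grad)   # True False

with torch.no_grad():
    z = x * 2                 # nothing inside this block is recorded
print(z.requires_grad)        # False
```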

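# Backpropagate once more; note that gradients accumulate in x.grad.
# (x is the tensor with requires_grad=True from the worked example below, after its first backward pass.)
out2 = x.sum()
out2.backward()
print(x.grad)

out3 = x.sum()
x.grad.zero_()  # reset the accumulated gradient before the next backward pass
out3.backward()
print(x.grad)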
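The accumulation behaviour can be reproduced in isolation. The following self-contained sketch starts from a fresh tensor, so the printed gradients are all ones, then all twos, then all ones again after the reset.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

x.sum().backward()
first = x.grad.clone()         # all ones

x.sum().backward()             # no reset: the new gradient is added to the old one
accumulated = x.grad.clone()   # all twos

x.grad.zero_()                 # reset before the next backward pass
x.sum().backward()
after_reset = x.grad.clone()   # all ones again

print(first, accumulated, after_reset, sep="\n")
```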

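To compute derivatives, call .backward() on a Tensor. If the Tensor is a scalar (that is, it holds a single element), backward() needs no arguments; if it has more elements, you must supply a gradient argument, a tensor of matching shape.

Worked example

import torch

Create a tensor with requires_grad=True so that its computation history is tracked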

x = torch.ones(2, 2, requires_grad=True)
print(x)

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)





y = x**2
y,y.grad_fn

(tensor([[1., 1.],
         [1., 1.]], grad_fn=<PowBackward0>),
 <PowBackward0 at 0x179b8a0b8b0>)

z = y * y * 3
out = z.mean()

print(z, out)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<MulBackward0>) tensor(3., grad_fn=<MeanBackward0>)

out.backward()

x.grad

tensor([[3., 3.],
        [3., 3.]])
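The value 3 can be checked by hand from the definitions in the example: $y = x^2$ and $z = 3y^2 = 3x^4$, with out the mean over the four entries of $z$.

```latex
\mathrm{out} = \frac{1}{4}\sum_{i=1}^{4} z_i = \frac{1}{4}\sum_{i=1}^{4} 3x_i^4,
\qquad
\frac{\partial\,\mathrm{out}}{\partial x_i} = \frac{1}{4}\cdot 12\,x_i^3 = 3x_i^3,
\qquad
\left.\frac{\partial\,\mathrm{out}}{\partial x_i}\right|_{x_i = 1} = 3.
```

Every entry of x equals 1, so every entry of x.grad equals 3.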

Vector-Jacobian product

x = torch.randn(3, requires_grad=True)
print(x)

tensor([-1.4351,  1.0933,  0.6845], requires_grad=True)

y = x * 2

i = 0
while y.data.norm() < 1000:
    y = y * 2
    i = i + 1

print(y)
print(i)

tensor([-1469.5287,  1119.5259,   700.8845], grad_fn=<MulBackward0>)
9

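In this case y is no longer a scalar. backward() does not compute the full Jacobian matrix directly, but if all we need is the vector-Jacobian product, we simply pass the vector to backward as an argument: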

v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)

print(x.grad)

tensor([1.0240e+02, 1.0240e+03, 1.0240e-01])
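The printed gradient has a simple explanation: after the loop, y is x multiplied by a power of two, so the Jacobian is that scale times the identity and the result is the scale times v. The sketch below repeats the computation with fixed input values (an assumption made here so the run is reproducible, in place of torch.randn) and checks this.

```python
import torch

x = torch.tensor([-1.0, 1.0, 0.5], requires_grad=True)   # fixed values instead of randn
y = x * 2
doublings = 0
while y.detach().norm() < 1000:
    y = y * 2
    doublings += 1

scale = 2.0 ** (doublings + 1)       # y = scale * x, so the Jacobian is scale * I
v = torch.tensor([0.1, 1.0, 0.0001])
y.backward(v)

print(scale)                               # 1024.0
print(torch.allclose(x.grad, scale * v))   # True
```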

Alternatively, wrap a block of code in with torch.no_grad(): to stop autograd from tracking the history of tensors that have .requires_grad=True.

print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)

True
True
False

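If we want to modify the values of a tensor without autograd recording the change (so that backpropagation is unaffected), we can operate on tensor.data.

x = torch.ones(1, requires_grad=True)

print(x.data)  # still a tensor
print(x.data.requires_grad)  # but it is detached from the computation graph

y = 2 * x
x.data *= 100  # changes only the value; it is not recorded in the graph, so gradient propagation is unaffected

y.backward()
print(x)  # modifying .data also changes the value of the tensor itself
print(x.grad)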
print(x.grad)

tensor([1.])
False
tensor([100.], requires_grad=True)
tensor([2.])
