Dive into Deep Learning
PyTorch study notes
2.2 Data Manipulation
2.2.1 Creating Tensors
import torch
x = torch.arange(0,10,1)
print(x)
y = torch.linspace(0,10,9)
print(y)
y1 = torch.linspace(0,10,11)
print(y1)
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
tensor([ 0.0000, 1.2500, 2.5000, 3.7500, 5.0000, 6.2500, 7.5000, 8.7500,
10.0000])
tensor([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Random sampling
x = torch.rand(10) # samples from the uniform distribution on [0, 1)
y = torch.randn(100) # samples from the standard normal distribution
print(x,'\n',y)
tensor([0.0684, 0.7630, 0.9122, 0.1588, 0.2664, 0.3837, 0.4715, 0.0807, 0.4532,
0.2872])
tensor([ 2.1036, -1.5781, -1.2902, -0.0721, -0.3115, -0.2346, 2.1363, 0.4831,
0.6730, -1.5753, 0.3163, 0.1210, -0.1839, -0.7775, 0.5507, 0.1059,
-0.0429, 0.0927, 0.4781, -0.3479, -1.6435, -0.0850, -0.9785, 1.6217,
-0.3711, -0.5670, -0.6948, -1.7653, 0.0760, 0.5582, 0.5921, -0.9598,
0.1816, 0.7103, 0.3276, -0.0963, 0.0336, 1.3279, 0.4696, -1.6410,
-0.7763, -0.1203, 0.6502, 0.1245, 0.2461, -0.5191, -0.6006, 0.5724,
0.0778, 1.4447, 0.5317, 0.7403, 0.7342, -0.6195, 0.0574, -0.8145,
0.7288, 1.0454, -0.0500, -1.3021, -0.5386, -0.4432, -0.8483, -1.6664,
-1.0277, -1.9278, 0.2480, 0.0880, 0.5837, 0.5370, 1.3765, 0.0698,
0.7090, 0.6360, 0.4958, 1.4599, -0.3818, -1.6178, 0.1753, -1.4700,
1.0116, -0.2483, 0.7212, 0.8937, -0.2017, 1.0467, -0.7851, -1.1092,
1.2478, 0.4951, -2.1717, 1.9579, 1.7288, 0.8295, -1.1136, -0.0555,
0.5662, -0.0403, 0.2550, -0.3003])
y.mean()
tensor(0.0108)
y = torch.randn(1000000)
y.mean()
tensor(0.0008)
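randn samples from the standard normal distribution (mean 0, variance 1), not from a uniform distribution, which is why the sample mean approaches 0 as the number of samples grows.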
y = torch.rand(100000)
y.mean()
tensor(0.5013)
rand samples from the uniform distribution on [0, 1), so its sample mean is close to 0.5.
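The difference between the two samplers can be checked with more than the mean. The following is a minimal sketch, assuming a fixed seed and a sample size of one million (both chosen here only for illustration), that compares the sample mean and standard deviation with the theoretical values.

```python
import torch

torch.manual_seed(0)  # fixed seed (an arbitrary choice) so the run is repeatable

n = 1_000_000
u = torch.rand(n)   # uniform on [0, 1)
g = torch.randn(n)  # standard normal

# Uniform on [0, 1): mean 1/2, standard deviation sqrt(1/12), about 0.2887
print(u.mean().item(), u.std().item(), u.min().item(), u.max().item())

# Standard normal: mean 0, standard deviation 1
print(g.mean().item(), g.std().item())
```

The uniform samples never leave [0, 1), while the normal samples are unbounded, which is visible in the min and max of each tensor.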
2.2.2 Basic Operations
Arithmetic operations
x = torch.rand(5,3)
y = torch.rand(5,3)
print(x+y)
print(torch.add(x,y))
tensor([[1.0719, 1.3702, 0.6004],
[0.9794, 0.9305, 0.4777],
[1.4245, 1.1329, 0.9888],
[1.7019, 0.8654, 1.0943],
[1.8360, 1.0775, 0.8313]])
tensor([[1.0719, 1.3702, 0.6004],
[0.9794, 0.9305, 0.4777],
[1.4245, 1.1329, 0.9888],
[1.7019, 0.8654, 1.0943],
[1.8360, 1.0775, 0.8313]])
result = torch.empty(5,3)
ot = torch.add(x,y,out = result)
print(result)
print(ot)
tensor([[1.0719, 1.3702, 0.6004],
[0.9794, 0.9305, 0.4777],
[1.4245, 1.1329, 0.9888],
[1.7019, 0.8654, 1.0943],
[1.8360, 1.0775, 0.8313]])
tensor([[1.0719, 1.3702, 0.6004],
[0.9794, 0.9305, 0.4777],
[1.4245, 1.1329, 0.9888],
[1.7019, 0.8654, 1.0943],
[1.8360, 1.0775, 0.8313]])
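When the result is written through out = result, result should be an existing tensor with a matching shape and dtype. The call also returns that same tensor.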
y.add_(x)
print(y)
tensor([[1.0719, 1.3702, 0.6004],
[0.9794, 0.9305, 0.4777],
[1.4245, 1.1329, 0.9888],
[1.7019, 0.8654, 1.0943],
[1.8360, 1.0775, 0.8313]])
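Note that the in-place method add_ (with the trailing underscore) modifies y itself.
Indexing
The result of indexing shares memory with the original data, so modifying one also modifies the other.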
y = x[0,:]
y += 1
print(y)
print(x[0,:])
tensor([1.8157, 1.9540, 1.3887])
tensor([1.8157, 1.9540, 1.3887])
z = x[0,0]
z+=1
print(z)
print(x[0,0])
tensor(2.8157)
tensor(2.8157)
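The memory sharing shown above applies to basic indexing (integers and slices). As a minimal sketch, the snippet below contrasts it with indexing by a list, which returns a copy; data_ptr() is used to compare the address of the first element. The variable names row_view and row_copy are made up for this illustration.

```python
import torch

x = torch.zeros(5, 3)

row_view = x[0, :]     # basic indexing: a view that shares memory with x
row_copy = x[[0], :]   # indexing with a list: an independent copy

row_view += 1          # writes through to x
row_copy += 10         # does not touch x

print(x[0, :])                               # the first row is now all ones
print(row_view.data_ptr() == x.data_ptr())   # True
print(row_copy.data_ptr() == x.data_ptr())   # False
```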
Changing shape
x.shape
torch.Size([5, 3])
y = x.view(15)
print(y)
tensor([2.8157, 1.9540, 1.3887, 0.5083, 0.0611, 0.0030, 0.5555, 0.4139, 0.6278,
0.9763, 0.0696, 0.6482, 0.9607, 0.2227, 0.6250])
z = x.view(-1,15) # the dimension given as -1 is inferred from the other dimensions
print(z)
tensor([[2.8157, 1.9540, 1.3887, 0.5083, 0.0611, 0.0030, 0.5555, 0.4139, 0.6278,
0.9763, 0.0696, 0.6482, 0.9607, 0.2227, 0.6250]])
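view also shares memory with the original tensor; it only changes the angle from which the same data is viewed.
reshape is not recommended when an independent copy is needed, because it does not guarantee whether it returns a view or a copy. Instead, create a copy with clone first and then call view, as follows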
x_cp = x.clone().view(15)
print(x)
print(x_cp)
tensor([[2.8157, 1.9540, 1.3887],
[0.5083, 0.0611, 0.0030],
[0.5555, 0.4139, 0.6278],
[0.9763, 0.0696, 0.6482],
[0.9607, 0.2227, 0.6250]])
tensor([2.8157, 1.9540, 1.3887, 0.5083, 0.0611, 0.0030, 0.5555, 0.4139, 0.6278,
0.9763, 0.0696, 0.6482, 0.9607, 0.2227, 0.6250])
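Another benefit of clone is that it is recorded in the computation graph, so gradients that flow back to the copy also reach the source Tensor. (I did not fully understand this sentence yet; to revisit later.)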
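The remark about clone and the computation graph can be made concrete with a small experiment. This is a minimal sketch, assuming a 2x2 tensor of ones and an arbitrary loss (three times the sum of the copy): the copy has its own memory, yet calling backward on a result computed from the copy fills in the gradient of the source tensor.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
x_cp = x.clone().view(4)     # separate memory, but still attached to the graph of x

loss = (x_cp * 3).sum()      # arbitrary scalar computed from the copy only
loss.backward()

print(x_cp.grad_fn)                       # not None: the copy is a graph node
print(x_cp.data_ptr() == x.data_ptr())    # False: clone allocated new memory
print(x.grad)                             # every entry is 3: the gradient reached x
```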
Another commonly used function is item, which converts a scalar Tensor into a Python number. The tensor must contain exactly one element.
print(x[0,0].item())
2.8156752586364746
Linear algebra
Some basic linear algebra functions are shown below
x = torch.rand(5,5)
print(x.trace())
print(x.diag())
print(x.triu())
print(x.tril())
tensor(2.6414)
tensor([0.1093, 0.5959, 0.5285, 0.5512, 0.8565])
tensor([[0.1093, 0.9400, 0.1872, 0.7878, 0.6490],
[0.0000, 0.5959, 0.9601, 0.3741, 0.3903],
[0.0000, 0.0000, 0.5285, 0.8106, 0.3142],
[0.0000, 0.0000, 0.0000, 0.5512, 0.9473],
[0.0000, 0.0000, 0.0000, 0.0000, 0.8565]])
tensor([[0.1093, 0.0000, 0.0000, 0.0000, 0.0000],
[0.3538, 0.5959, 0.0000, 0.0000, 0.0000],
[0.6069, 0.8945, 0.5285, 0.0000, 0.0000],
[0.2717, 0.8283, 0.9797, 0.5512, 0.0000],
[0.8650, 0.8743, 0.7103, 0.1917, 0.8565]])
y = torch.rand(5,5)
print(x.mm(y)) # matrix multiplication
tensor([[1.1735, 0.8019, 1.8170, 1.0493, 1.4732],
[1.2337, 1.2944, 2.0360, 1.3018, 1.6521],
[1.4766, 1.1538, 2.2645, 1.2423, 1.9270],
[1.6059, 1.5529, 2.7127, 1.7260, 2.1005],
[1.8105, 1.4080, 2.6665, 1.5510, 1.9396]])
# x.bmm(y)  # fails for these 2-D tensors: bmm is a batched matrix multiply and needs 3-D inputs
# matrix operations
print(x.t()) # transpose
tensor([[0.1093, 0.3538, 0.6069, 0.2717, 0.8650],
[0.9400, 0.5959, 0.8945, 0.8283, 0.8743],
[0.1872, 0.9601, 0.5285, 0.9797, 0.7103],
[0.7878, 0.3741, 0.8106, 0.5512, 0.1917],
[0.6490, 0.3903, 0.3142, 0.9473, 0.8565]])
print(x[0,:].dot(y[:,0]).item()) # dot product
1.1735045909881592
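Since bmm needs 3-D inputs, a short sketch of how it can be called may help. The shapes below (a batch of 4 products of 2x3 and 3x5 matrices) are arbitrary choices for illustration; the first part simply adds a batch dimension of size 1 to two 2-D matrices and checks the result against mm.

```python
import torch

torch.manual_seed(0)
x = torch.rand(5, 5)
y = torch.rand(5, 5)

# Add a leading batch dimension of size 1 so that bmm accepts the inputs.
batched = torch.bmm(x.unsqueeze(0), y.unsqueeze(0))   # shape (1, 5, 5)
print(batched.shape)
print(torch.allclose(batched[0], x.mm(y)))            # same values as the 2-D product

# A real batch: 4 independent (2 x 3) @ (3 x 5) products.
a = torch.rand(4, 2, 3)
b = torch.rand(4, 3, 5)
c = torch.bmm(a, b)                                   # shape (4, 2, 5)
print(c.shape)
print(torch.allclose(c[1], a[1].mm(b[1])))            # each slice is an ordinary mm
```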
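2.2.3 Broadcasting
When two tensors have different shapes, torch replicates their elements as needed so that both have the same shape, and then applies the operation elementwise.
For example: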
x = torch.arange(1,3).view(1,2)
y = torch.arange(1,4).view(3,1)
print(x)
print(y)
print(x+y)
tensor([[1, 2]])
tensor([[1],
[2],
[3]])
tensor([[2, 3],
[3, 4],
[4, 5]])
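To see what the replication looks like, the broadcast can be written out by hand. This is a minimal sketch using expand, which produces the replicated shape without copying data; the names x_b and y_b are introduced here only for illustration.

```python
import torch

x = torch.arange(1, 3).view(1, 2)   # shape (1, 2)
y = torch.arange(1, 4).view(3, 1)   # shape (3, 1)

# Explicitly expand both operands to the common shape (3, 2).
x_b = x.expand(3, 2)   # the single row is repeated 3 times
y_b = y.expand(3, 2)   # the single column is repeated 2 times

print(x_b)
print(y_b)
print(torch.equal(x + y, x_b + y_b))   # True: broadcasting gives the same result
```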
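2.2.4 Memory Overhead of Operations
As mentioned above, view does not allocate new memory, whereas y = y + x does. To write the result into the original memory, assign through an index, as follows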
x = torch.tensor([1,2])
y = torch.tensor([3,4])
id_before = id(y)
y[:] = x+y
print(id(y) == id_before) # True
True
The out argument or the in-place operator += achieves the same effect
x = torch.tensor([1,2])
y = torch.tensor([3,4])
id_yb = id(y)
id_xb = id(x)
x = torch.add(x,y,out = y)
print(id(y) == id_yb)
print(id(x) == id_xb)
print(id(y) == id(x)) # the returned tensor is the same object as y
True
False
True
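id() compares Python object identity rather than the underlying buffer, so a complementary check is data_ptr(), which returns the address of the tensor's first element. The sketch below, under that assumption, repeats the experiments of this section with data_ptr(); the flags moved and stayed are names made up for this illustration.

```python
import torch

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])

ptr = y.data_ptr()            # address of the first element of y's storage
y = y + x                     # builds a new tensor and rebinds the name y
moved = y.data_ptr() != ptr
print(moved)                  # True: the result lives in newly allocated memory

ptr = y.data_ptr()
y += x                        # in-place add
torch.add(x, y, out=y)        # writes the sum into y's existing storage
y[:] = x + y                  # x + y is a temporary; its values are copied into y
stayed = y.data_ptr() == ptr
print(stayed)                 # True: all three forms reuse y's memory
print(y.tolist())             # [7, 12]
```

Note that y[:] = x + y still allocates a temporary for x + y before copying it into y, whereas += and out=y avoid that temporary.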
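2.2.5 Converting Between Tensor and NumPy
numpy() and from_numpy() convert between the two. Both are fast because the resulting Tensor and NumPy array share memory.
Another common option is torch.tensor(), which copies the data. It costs more time and memory, and the result no longer shares memory with the source.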
a = torch.ones(5)
b = a.numpy()
print(a,b)
print(type(b))
tensor([1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1.]
<class 'numpy.ndarray'>
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
print(a,b)
print(type(a),type(b))
[1. 1. 1. 1. 1.] tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
<class 'numpy.ndarray'> <class 'torch.Tensor'>
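The shared-memory behaviour can be observed directly by modifying one side in place. This is a minimal sketch; the names shared and copied are chosen here only to label the two conversion paths.

```python
import numpy as np
import torch

a = np.ones(3)
shared = torch.from_numpy(a)   # shares memory with a
copied = torch.tensor(a)       # independent copy of the data

a += 1                         # in-place update of the NumPy array

print(a)        # [2. 2. 2.]
print(shared)   # follows the change: values are 2
print(copied)   # unaffected: values are still 1

# The same holds in the other direction for Tensor.numpy().
t = torch.ones(3)
n = t.numpy()
t.add_(1)                      # in-place update of the tensor
print(n)        # [2. 2. 2.]
```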
2.2.6 Tensor on GPU
to() moves a Tensor between the CPU and the GPU
# The installed build is not the GPU version, so this is left for later
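Even without a GPU build, to() can be tried in a device-agnostic way. The following is a minimal sketch that falls back to the CPU when CUDA is unavailable, so it should run on either kind of installation; on a CPU-only build the tensors simply stay on the CPU.

```python
import torch

# Pick the GPU if this build and machine support it, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(2, 3)
x_dev = x.to(device)        # move (or keep) the tensor on the chosen device
y = x_dev + 1               # the computation runs on that device
y_cpu = y.to("cpu")         # move the result back to the CPU

print(x_dev.device, y_cpu.device)
print(y.to("cpu", torch.double).dtype)   # to() can also change the dtype
```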
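2.3 Automatic Differentiation
The autograd package provided by PyTorch automatically builds a computation graph from the inputs and the forward pass, and then performs backpropagation. This section shows how to use autograd to compute gradients automatically.
2.3.1 Concepts
The explanation in the book is rather long-winded, so let's go straight to the examples
2.3.2 Tensor
Create a Tensor and set requires_grad=True
x = torch.ones(2,2,requires_grad = True) # gradient tracking is enabled only for x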
print(x)
print(x.grad_fn)
tensor([[1., 1.],
[1., 1.]], requires_grad=True)
None
y = x + 2
print(y)
print(y.grad_fn)
tensor([[3., 3.],
[3., 3.]], grad_fn=<AddBackward0>)
<AddBackward0 object at 0x0000016CBBE6A668>
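As shown, x was created directly and therefore has no grad_fn, while y was created by an addition and so has a grad_fn of <AddBackward0>.
Tensors created directly, like x, are called leaf nodes; their grad_fn is None.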
print(x.is_leaf,y.is_leaf)
True False
z = y * y * 3
out = z.mean()
print(z)
print(out)
tensor([[27., 27.],
[27., 27.]], grad_fn=<MulBackward0>)
tensor(27., grad_fn=<MeanBackward0>)
a = torch.randn(2,2)
a = ((a * 3) / (a -1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.grad_fn)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)
False
None
True
<SumBackward0 object at 0x0000016CBBEC5160>
2.3.3 Gradients
out.backward() # equivalent to out.backward(torch.tensor(1.0))
print(x.grad)
tensor([[4.5000, 4.5000],
[4.5000, 4.5000]])
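Here $\mathrm{out} = \frac{1}{4}\sum_{i=1}^{4} 3(x_i + 2)^2$, so $\frac{\partial\,\mathrm{out}}{\partial x_i} = \frac{3}{2}(x_i + 2) = 4.5$ at $x_i = 1$.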
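The value 4.5 can be checked against the hand-derived derivative. This is a minimal sketch that rebuilds the same computation from scratch (z = 3 * (x + 2)^2 followed by a mean) and compares x.grad with (3/2)(x + 2); the name analytic is introduced here for the hand-derived result.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
out = (3 * (x + 2) ** 2).mean()     # same computation as y * y * 3 followed by mean()
out.backward()

# Hand-derived gradient: d(out)/dx_i = (1/4) * 6 * (x_i + 2) = 1.5 * (x_i + 2)
analytic = 1.5 * (x.detach() + 2)

print(x.grad)                               # every entry is 4.5
print(torch.allclose(x.grad, analytic))     # True
```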
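out2 = x.sum()
out2.backward() # gradients accumulate: this adds 1 to whatever x.grad already holds
print(x.grad)
out3 = x.sum()
x.grad.data.zero_() # reset the accumulated gradient before the next backward pass
out3.backward()
print(x.grad)
out4 = x*2
out4.backward(torch.ones(2,2)) # adds 2 to the gradient of 1 left by out3
print(x.grad)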
tensor([[2., 2.],
[2., 2.]])
tensor([[1., 1.],
[1., 1.]])
tensor([[3., 3.],
[3., 3.]])
print(out3)
print(out4)
tensor(4., grad_fn=<SumBackward0>)
tensor([[2., 2.],
[2., 2.]], grad_fn=<MulBackward0>)
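Differentiating a tensor with respect to a tensor is not allowed here. Only a scalar can be differentiated with respect to a tensor, and the result has the same shape as the independent variable. A non-scalar tensor therefore has to be turned into a scalar by taking a weighted sum of all its elements.
Got it!
y.backward(w) means
l = torch.sum(y * w); l.backward()
That is, first take the weighted sum of y with weights w, then differentiate with respect to the independent variable.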
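The equivalence between y.backward(w) and the weighted sum can be verified numerically. This is a minimal sketch with arbitrary example values for x and w; two separate copies of x are used so that the gradients from the two routes do not accumulate into each other.

```python
import torch

w = torch.tensor([[1.0, 0.1], [0.01, 0.001]])   # arbitrary weights

# Route 1: pass the weights directly to backward.
x1 = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
y1 = 2 * x1
y1.backward(w)

# Route 2: form the weighted sum explicitly, then call backward on the scalar.
x2 = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
y2 = 2 * x2
l = torch.sum(y2 * w)
l.backward()

print(x1.grad)                             # equals 2 * w
print(torch.allclose(x1.grad, x2.grad))    # True: both routes agree
```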
Next, an example of interrupting gradient tracking
x = torch.tensor(1.0,requires_grad =True)
y1 =x**2
with torch.no_grad():
    y2 =x**3
y3 = y2*x
z = y1 + y3
print(x.requires_grad)
print(y1,y1.requires_grad)
print(y2,y2.requires_grad)
True
tensor(1., grad_fn=<PowBackward0>) True
tensor(1.) False
As shown, y1 and y3 have a grad_fn, while y2 does not.
Differentiating z with respect to x gives
z.backward()
print(x.grad)
tensor(3.)
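The result is 3: y2 has become a plain constant, so dz/dx = 2x + y2 = 2 + 1 = 3 at x = 1. With full tracking, z = x^2 + x^4 would give 2x + 4x^3 = 6.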
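The two gradients can be compared side by side. This is a minimal sketch that uses detach() in place of the torch.no_grad() block (a swapped-in but equivalent way, for this purpose, of cutting y2 out of the graph); the names z_full and z_cut are introduced here for illustration.

```python
import torch

x = torch.tensor(1.0, requires_grad=True)

# Full tracking: z = x^2 + x^3 * x = x^2 + x^4, so dz/dx = 2x + 4x^3 = 6 at x = 1.
z_full = x**2 + (x**3) * x
z_full.backward()
print(x.grad)        # tensor(6.)

x.grad.zero_()       # clear the accumulated gradient before the second pass

# x^3 cut out of the graph: it acts as the constant 1, so dz/dx = 2x + 1 = 3.
z_cut = x**2 + (x**3).detach() * x
z_cut.backward()
print(x.grad)        # tensor(3.)
```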
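Modifying x through x.data changes the value of x without recording the operation in the computation graph, so gradient propagation is not affected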
x = torch.ones(1,requires_grad = True)
print(x.data)
print(x.data.requires_grad) #False
print(x.requires_grad) #True
y = 2 * x
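x.data *=100 # changes the value only; not recorded in the graph
#x *=100 # would raise an error: in-place operation on a leaf tensor that requires grad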
y.backward()
print(x)
print(x.grad)
print(y)
tensor([1.])
False
True
tensor([100.], requires_grad=True)
tensor([2.])
tensor([2.], grad_fn=<MulBackward0>)
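The commented-out line x *= 100 in the cell above is worth trying once. As a minimal sketch, the snippet below shows that autograd rejects the tracked in-place update of a leaf tensor, while the same update through x.data goes through and leaves the gradient of y = 2 * x at 2. The flag rejected is a name made up for this illustration.

```python
import torch

x = torch.ones(1, requires_grad=True)
y = 2 * x

try:
    x *= 100                 # tracked in-place update of a leaf that requires grad
    rejected = False
except RuntimeError:
    rejected = True          # autograd refuses the operation; x is unchanged
print(rejected)              # True

x.data *= 100                # same update through .data: not tracked by autograd
y.backward()
print(x.item(), x.grad.item())   # 100.0 2.0
```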