Getting Started with Deep Learning Development (3): PyTorch Deep Learning Basics

邶风

Last modified: 2024-03-27 08:57:21

Tags: deep learning, pytorch

First published: 2024-03-24 19:07:36

Article link: https://blog.csdn.net/2202_75671186/article/details/136671019

1. Tensor Objects and Their Operations

1.1

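A Tensor is an array with an arbitrary number of dimensions, and all of its elements must share the same data type. The data types that torch provides resemble those of most programming languages: floating-point, signed integer, and unsigned integer types. Each of them can live on either the CPU or the GPU. When creating a Tensor, the dtype argument sets its data type and the device argument sets the device (CPU or GPU) it is stored on.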

import numpy as np
import torch

# torch.Tensor vs. torch.tensor
# torch.Tensor(1) allocates one uninitialized float32 element; torch.tensor(1) wraps the integer 1
print('torch.Tensor default dtype: {}'.format(torch.Tensor(1).dtype))
print('torch.tensor default dtype: {}'.format(torch.tensor(1).dtype))
# Build from a list
a = torch.tensor([[1,2],[3,4]],dtype=torch.float64)
# Or build from an ndarray
b = torch.tensor(np.array([[1,2],[3,4]]),dtype=torch.uint8)
print(a)
print(b)

'''
torch.Tensor default dtype: torch.float32
torch.tensor default dtype: torch.int64
tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)
tensor([[1, 2],
        [3, 4]], dtype=torch.uint8)
'''
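
To make the default types above more concrete, here is a minimal sketch of how torch.tensor infers the dtype from the Python values it receives and how .to converts an existing Tensor. The variable names are only illustrative, and the float32 result assumes the default floating-point type has not been changed.

```python
import torch

# torch.tensor infers the dtype from the Python values it receives
ints = torch.tensor([1, 2, 3])        # integers -> torch.int64
floats = torch.tensor([1.0, 2.0])     # floats   -> torch.float32 (the default float type)
bools = torch.tensor([True, False])   # booleans -> torch.bool

# An explicit dtype overrides the inference; .to converts an existing Tensor
halves = torch.tensor([1, 2, 3], dtype=torch.float16)
as_int32 = floats.to(torch.int32)

print(ints.dtype, floats.dtype, bools.dtype, halves.dtype, as_int32.dtype)
```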

1.2

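After a variable has been placed on the GPU through device, its memory usage can be checked from a terminal with the nvidia-smi command. torch also supports copying variables between the CPU and the GPU.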

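# Select the device (this example requires a CUDA-capable GPU)
cuda0 = torch.device('cuda:0')
c = torch.ones((2,2),device=cuda0)
print(c)

# Copy variables between devices, converting the dtype at the same time
c = c.to('cpu',torch.double)
print(c.device)
b = b.to(cuda0,torch.float)
print(b.device)
'''
tensor([[1., 1.],
        [1., 1.]], device='cuda:0')
cpu
cuda:0
'''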
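
The snippet above fails on a machine without a GPU. A common pattern, sketched here under the assumption that falling back to the CPU is acceptable, is to pick the device at run time with torch.cuda.is_available(). The names device, t, and t_cpu are illustrative.

```python
import torch

# Fall back to the CPU when no CUDA device is visible
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

t = torch.ones((2, 2), device=device)
t_cpu = t.to('cpu', torch.double)   # copy to the CPU and convert to float64
print(t.device, t_cpu.device, t_cpu.dtype)
```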

1.3

Arithmetic operators applied to Tensors act element-wise on the corresponding entries of the two operands. torch.mm performs matrix multiplication.

a = torch.tensor([[1,2],[3,4]])
b = torch.tensor([[1,2],[3,4]])
c = a * b
print('Element-wise product:',c)
c = torch.mm(a,b)
print('Matrix product:',c)
'''
Element-wise product: tensor([[ 1,  4],
        [ 9, 16]])
Matrix product: tensor([[ 7, 10],
        [15, 22]])
'''
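
As a small complement, the sketch below shows that the operators * and @ correspond to torch.mul and torch.matmul, and that for 2-D inputs @ gives the same result as torch.mm. Variable names are illustrative.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[1, 2], [3, 4]])

elementwise = a * b   # same as torch.mul(a, b)
matmul_op = a @ b     # same as torch.matmul(a, b); equals torch.mm for 2-D inputs

print(torch.equal(elementwise, torch.mul(a, b)))
print(torch.equal(matmul_op, torch.mm(a, b)))
```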

1.4

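torch.clamp acts as a piecewise function and can be used to clip elements that are too small or too large; torch.round rounds each element to the nearest integer (the result keeps its floating-point dtype); torch.tanh computes the hyperbolic tangent, which maps values into the interval (-1, 1).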

a = torch.tensor([[1,2],[3,4]])
torch.clamp(a,min=2,max=3)
'''
tensor([[2,2],[3,3]])
'''
a = torch.tensor([-1.1,0.5,0.501,0.99])
torch.round(a)  # ties such as 0.5 round to the nearest even integer
'''
tensor([-1.,  0.,  1.,  1.])
'''
a = torch.Tensor([-3,-2,-1,-0.5,0,0.5,1,2,3])
torch.tanh(a)
'''
tensor([-0.9951,-0.9640,-0.7616,-0.4621,0.0000,0.4621,0.7616,0.9640,0.9951])
'''
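
To illustrate the ranges described above, this sketch clips a range of values with torch.clamp and checks the bounds of torch.tanh. Clamping only from below at zero behaves like a ReLU. The input x is an arbitrary example, and in float32 tanh can round to exactly 1 or -1 for large inputs even though the mathematical value stays strictly inside (-1, 1).

```python
import torch

x = torch.linspace(-10, 10, 101)

clipped = torch.clamp(x, min=-1, max=1)   # values outside [-1, 1] are clipped to the bounds
relu_like = torch.clamp(x, min=0)         # clamping only from below zeroes out negatives
squashed = torch.tanh(x)                  # bounded by -1 and 1

print(clipped.min().item(), clipped.max().item())
print(squashed.min().item(), squashed.max().item())
```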

1.5

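torch.arange works much like Python's built-in range, with the third argument giving the step size. The third argument of torch.linspace sets the number of points returned. torch.ones returns a matrix of all ones, and torch.zeros returns a matrix of all zeros.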

print(torch.arange(5))
print(torch.arange(1,5,2))
print(torch.linspace(0,5,10))
'''
tensor([0,1,2,3,4])
tensor([1,3])
tensor([0.0000,0.5556,1.1111,1.6667,2.2222,2.7778,3.3333,3.8889,4.4444,5.0000])
'''
print(torch.ones(3,3))
print(torch.zeros(3,3))
'''
tensor([[1.,1.,1.],
        [1.,1.,1.],
        [1.,1.,1.]])
tensor([[0.,0.,0.],
        [0.,0.,0.],
        [0.,0.,0.]])   
'''
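
One difference that is easy to miss: torch.arange excludes the stop value, while torch.linspace includes both endpoints. The sketch below shows this, together with torch.zeros_like, which copies the shape and dtype of an existing Tensor. Variable names are illustrative.

```python
import torch

r = torch.arange(0, 5, 1)        # the stop value 5 is excluded
pts = torch.linspace(0, 5, 6)    # both endpoints are included, 6 points in total

z = torch.zeros_like(pts)        # zeros with the same shape and dtype as pts
o = torch.ones(2, 3, dtype=torch.int64)
print(r, pts, z, o, sep='\n')
```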

1.6

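torch.rand returns a matrix whose elements are sampled from the uniform distribution on [0, 1); torch.randn returns a matrix sampled from the standard normal distribution; torch.randint returns a matrix of random integers sampled uniformly from a given interval that includes the lower bound and excludes the upper bound.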

torch.rand(3,3)
'''
tensor([[0.0388,0.6819,0.3144],
        [0.7826,0.0966,0.4319],
        [0.6758,0.2630,0.9727]])
'''
torch.randn(3,3)
'''
tensor([[-0.6956,0.6792,0.8957],
        [0.7826,0.0966,0.4319],
        [0.6958,0.2630,0.9727]])
'''
torch.randint(0,9,(3,3))
'''
tensor([[5,2,7],
        [8,4,8],
        [2,1,4]])
'''
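
Because these functions are random, the outputs listed above change on every run. A minimal sketch of making the draws repeatable with torch.manual_seed, and of checking the sampling intervals, is shown below. The seed value 0 and the sample size are arbitrary choices.

```python
import torch

torch.manual_seed(0)                 # fix the seed so the draws can be repeated
first = torch.rand(3, 3)
torch.manual_seed(0)
second = torch.rand(3, 3)
print(torch.equal(first, second))    # the same seed reproduces the same samples

ints = torch.randint(0, 9, (1000,))  # integers drawn from 0..8; 9 itself is never produced
print(ints.min().item(), ints.max().item())
```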

2. Indexing and Slicing Tensors

2.1

Tensors support basic indexing and slicing, and they also support the advanced indexing of ndarray (integer-array indexing and boolean indexing).

a = torch.arange(9).view(3,3)
# Basic indexing
a[2,2]
'''
tensor(8)
'''
# Slicing
a[1:,:-1]
'''
tensor([[3,4],[6,7]])
'''
# Slicing with a step
a[::2]
'''
tensor([[0,1,2],[6,7,8]])
'''
# Integer-array indexing
rows = [0,1]
cols = [2,2]
a[rows,cols]
'''
tensor([2,5])
'''
# Boolean indexing (recent PyTorch versions return a torch.bool mask)
index = a>4
print(index)
print(a[index])
'''
tensor([[False, False, False],
        [False, False,  True],
        [ True,  True,  True]])
tensor([5, 6, 7, 8])
'''
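
Indexing also works on the left-hand side of an assignment. The sketch below, with illustrative variable names, uses a boolean mask and an integer-array index to overwrite selected elements of a copy while leaving the original untouched.

```python
import torch

a = torch.arange(9).view(3, 3)
mask = a > 4                  # boolean Tensor with the same shape as a

selected = a[mask]            # 1-D Tensor holding the selected elements
b = a.clone()
b[mask] = 0                   # write through the mask: zero out everything above 4
b[[0, 1], [2, 2]] = -1        # integer-array indexing also works for assignment
print(selected)
print(b)
```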

2.2

torch.nonzero returns a matrix holding the indices of the non-zero elements.

# torch.nonzero returns a matrix holding the indices of the non-zero elements
a = torch.arange(9).view(3,3)
index = torch.nonzero(a >= 8)
print(index)
'''
tensor([[2, 2]])
'''
a = torch.randint(0,2,(3,3))
print(a)
index = torch.nonzero(a)
print(index)
'''
tensor([[0,0,1],
        [0,0,1],
        [1,1,0]])
tensor([[0,2],
        [1,2],
        [2,0],
        [2,1]])
'''
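
The index matrix has one row per non-zero element. When the indices are needed for indexing rather than for display, the as_tuple=True form is often more convenient, as in this small sketch (the matrix reuses the sample output above).

```python
import torch

a = torch.tensor([[0, 0, 1],
                  [0, 0, 1],
                  [1, 1, 0]])

idx = torch.nonzero(a)                         # shape (4, 2): one (row, col) pair per non-zero element
rows, cols = torch.nonzero(a, as_tuple=True)   # one 1-D index Tensor per axis

print(idx.shape)
print(a[rows, cols])                           # the tuple form can index the Tensor directly
```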

2.3

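torch.where(condition, x, y) tests condition element by element. Where the condition holds, the result takes the element at the same position in x; otherwise it takes the element from y.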

# torch.where(condition,x,y) selects from x where condition holds, otherwise from y
x = torch.randn(3,2)
y = torch.ones(3,2)
print(x)
print(torch.where(x > 0,x,y))
'''
tensor([[0.0914,-0.8913],
        [-0.0046,0.0617],
        [1.0744,-1.2068]])
tensor([[0.0914,1.0000],
        [1.0000,0.0617],
        [1.0744,1.0000]])
'''
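
With fixed inputs the behaviour is easier to verify. The sketch below uses torch.where to keep positive entries and replace the rest with zero, which matches clamping from below at zero. The input values are arbitrary.

```python
import torch

x = torch.tensor([[0.5, -1.0], [-2.0, 3.0]])

# Keep positive entries and replace the rest with zero
relu_by_where = torch.where(x > 0, x, torch.zeros_like(x))
print(relu_by_where)
print(torch.equal(relu_by_where, torch.clamp(x, min=0)))
```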

3. Reshaping, Concatenating, and Splitting Tensors

3.1

Tensor.nelement, Tensor.ndimension, and Tensor.size return the number of elements, the number of axes, and the shape of a Tensor, respectively. The Tensor.shape attribute also gives the shape.

a = torch.rand(1,2,3,4,5)
print('Number of elements:',a.nelement())
print('Number of axes:',a.ndimension())
print('Shape:',a.size(),a.shape)
'''
Number of elements: 120
Number of axes: 5
Shape: torch.Size([1, 2, 3, 4, 5]) torch.Size([1, 2, 3, 4, 5])
'''

3.2

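Both Tensor.view and Tensor.reshape change the shape of a Tensor. Tensor.view requires the Tensor's underlying storage to be contiguous and raises an error otherwise, whereas Tensor.reshape has no such requirement. On the other hand, Tensor.view always returns a view that shares memory with the original, so modifying the returned value also modifies the original; whether Tensor.reshape returns a view or a copy is not guaranteed. Both take the desired output shape as arguments, and the total number of elements must stay the same. One dimension may be given as -1, and PyTorch infers its value automatically.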

# Both Tensor.view and Tensor.reshape change the shape of a Tensor
b = a.view(2*3,4*5)
print(b.shape)
c = a.reshape(-1)
print(c.shape)
d = a.reshape(2*3,-1)
print(d.shape)
'''
torch.Size([6,20])
torch.Size([120])
torch.Size([6,20])
'''
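
The difference between the two methods shows up on a non-contiguous Tensor. The following sketch is one way to observe it: a transposed Tensor is used as the non-contiguous input, and the flag view_failed is a hypothetical helper variable introduced only for this illustration.

```python
import torch

a = torch.arange(6).view(2, 3)
v = a.view(3, 2)
v[0, 0] = 100                 # v shares memory with a, so a changes as well
print(a[0, 0])

t = a.t()                     # transposing yields a non-contiguous Tensor
print(t.is_contiguous())

view_failed = False
try:
    t.view(6)                 # view cannot express this layout and raises RuntimeError
except RuntimeError:
    view_failed = True
print('view failed:', view_failed)

r = t.reshape(6)              # reshape falls back to copying the data here
r[0] = -1
print(a[0, 0])                # unchanged by the write to r, because r is a copy
```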

3.3

torch.squeeze and torch.unsqueeze remove and add axes. torch.squeeze removes axes of size 1, while torch.unsqueeze inserts an axis of size 1 at the specified position.

# torch.squeeze and torch.unsqueeze remove and add axes
b = torch.squeeze(a)
b.shape
'''
torch.Size([2,3,4,5])
'''
torch.unsqueeze(b,0).shape
'''
torch.Size([1,2,3,4,5])
'''
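
torch.squeeze also accepts a dim argument that restricts it to a single axis. The sketch below, using an arbitrary example shape, compares the variants.

```python
import torch

a = torch.rand(1, 2, 1, 3)

s_all = torch.squeeze(a)           # all size-1 axes removed
s_dim0 = torch.squeeze(a, dim=0)   # only axis 0 removed
s_dim1 = torch.squeeze(a, dim=1)   # axis 1 has size 2, so nothing changes
u_last = torch.unsqueeze(a, -1)    # negative positions count from the end

print(s_all.shape, s_dim0.shape, s_dim1.shape, u_last.shape)
```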

3.4

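torch.t and torch.transpose transpose matrices. torch.t accepts only Tensors with at most two dimensions and, for a 2-D Tensor, is shorthand for torch.transpose(input, 0, 1); torch.transpose swaps any two given axes and also works on higher-dimensional Tensors.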

# torch.t and torch.transpose transpose a 2-D matrix
a = torch.tensor([[2]])
b = torch.tensor([[2,3]])
print(torch.transpose(a,1,0))
print(torch.t(a))
print(torch.transpose(b,1,0))
print(torch.t(b))
'''
tensor([[2]])
tensor([[2]])
tensor([[2],[3]])
tensor([[2],[3]])
'''

3.5

For higher-dimensional Tensors, the permute method reorders the axes.

# For higher-dimensional Tensors, the permute method reorders the axes
a = torch.rand((1,224,224,3))
print(a.shape)
b = a.permute(0,3,1,2)
print(b.shape)
'''
torch.Size([1,224,224,3])
torch.Size([1,3,224,224])
'''
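
permute only changes how the data is indexed; it does not move anything in memory, so the result is usually non-contiguous. The sketch below, with a small illustrative shape, shows that a call to contiguous() is needed before view can be used on the permuted Tensor.

```python
import torch

nhwc = torch.rand(1, 4, 5, 3)             # batch, height, width, channels
nchw = nhwc.permute(0, 3, 1, 2)           # batch, channels, height, width

print(nchw.shape, nchw.is_contiguous())   # permute only changes strides, no data is moved
flat = nchw.contiguous().view(1, -1)      # make a contiguous copy before calling view
print(flat.shape)
print(torch.equal(nchw[0, 2], nhwc[0, :, :, 2]))
```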

3.6

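torch.cat and torch.stack join Tensors. torch.cat concatenates along an existing axis dim: the sizes along that axis may differ, but the sizes along all other axes must match. torch.stack joins along a new axis and requires all Tensors to have exactly the same shape.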

# torch.cat and torch.stack join Tensors
a = torch.randn(2,3)
b = torch.randn(3,3)

# The default axis is dim = 0
c = torch.cat((a,b))
d = torch.cat((b,b,b),dim=1)

print(c.shape)
print(d.shape)
'''
torch.Size([5,3])
torch.Size([3,9])
'''
c = torch.stack((b,b),dim=1)
d = torch.stack((b,b),dim=0)
print(c.shape)
print(d.shape)
'''
torch.Size([3,2,3])
torch.Size([2,3,3])
'''
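
One way to relate the two functions: stacking along a new axis gives the same result as inserting that axis with unsqueeze first and then concatenating along it. A minimal sketch with illustrative names:

```python
import torch

b = torch.randn(3, 3)

stacked = torch.stack((b, b), dim=1)                           # shape (3, 2, 3)
via_cat = torch.cat((b.unsqueeze(1), b.unsqueeze(1)), dim=1)   # same result built with cat
print(stacked.shape, torch.equal(stacked, via_cat))
```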

3.7

torch.split and torch.chunk split a Tensor. torch.split takes the size of each piece, given either as a list or as a single integer, while torch.chunk takes the number of pieces.

# torch.split and torch.chunk split a Tensor
a = torch.randn(10,3)
for x in torch.split(a,[1,2,3,4],dim=0):
    print(x.shape)
'''
torch.Size([1,3])
torch.Size([2,3])
torch.Size([3,3])
torch.Size([4,3])
'''
for x in torch.split(a,4,dim=0):
    print(x.shape)
'''
torch.Size([4,3])
torch.Size([4,3])
torch.Size([2,3])
'''
for x in torch.chunk(a,4,dim=0):
    print(x.shape)
'''
torch.Size([3,3])
torch.Size([3,3])
torch.Size([3,3])
torch.Size([1,3])
'''
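
Splitting and concatenating are inverse operations along the same axis. The sketch below checks that the pieces returned by torch.split can be joined back into the original Tensor with torch.cat, and lists the chunk sizes that torch.chunk produces for 10 rows and 4 chunks.

```python
import torch

a = torch.randn(10, 3)

pieces = torch.split(a, [1, 2, 3, 4], dim=0)
restored = torch.cat(pieces, dim=0)           # joining the pieces recovers a
print(len(pieces), torch.equal(restored, a))

chunks = torch.chunk(a, 4, dim=0)
chunk_rows = [c.shape[0] for c in chunks]     # the last chunk is smaller when 10 is not divisible
print(chunk_rows)
```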

4. Reduction Operations in PyTorch

A reduction operation aggregates the elements of a Tensor into fewer values, and it accepts a dim argument that selects the axis along which to reduce.

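# By default the global maximum is returned
a = torch.tensor([[1,2],[3,4]])
print('Global maximum:',torch.max(a))

# With dim specified, the maximum values and their indices are returned
torch.max(a,dim=0)
'''
Global maximum: tensor(4)
torch.return_types.max(
values=tensor([3, 4]),
indices=tensor([1, 1]))
'''
print('Cumulative sum down each column:')
print(torch.cumsum(a,dim=0))
print('Cumulative product along each row:')
print(torch.cumprod(a,dim=1))
'''
Cumulative sum down each column:
tensor([[1, 2],
        [4, 6]])
Cumulative product along each row:
tensor([[ 1,  2],
        [ 3, 12]])
'''

# Mean, median, and standard deviation of a matrix
a = torch.Tensor([[1,2],[3,4]])
a.mean(),a.median(),a.std()
'''
(tensor(2.5000), tensor(2.), tensor(1.2910))
'''

# torch.unique finds which distinct elements occur in a matrix (sorted by default)
a = torch.randint(0,3,(3,3))
print(a)
print(torch.unique(a))
'''
tensor([[0,0,0],[2,0,2],[0,0,1]])
tensor([0, 1, 2])
'''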
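
Two details of the dim argument are worth a short sketch: keepdim=True keeps the reduced axis with size 1 so the result still broadcasts against the input, and the result of torch.max with dim unpacks into values and indices. The return_counts option of torch.unique is shown as well. All variable names are illustrative.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])

col_sum = a.sum(dim=0)                    # reduce over rows -> shape (2,)
row_mean = a.mean(dim=1, keepdim=True)    # keepdim keeps the reduced axis -> shape (2, 1)
values, indices = torch.max(a, dim=1)     # the result unpacks into (values, indices)

centered = a - row_mean                   # the kept axis lets row_mean broadcast against a
vals, counts = torch.unique(torch.tensor([0, 2, 0, 1, 0]), return_counts=True)
print(col_sum, row_mean, values, indices, centered, vals, counts, sep='\n')
```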

5. Automatic Differentiation in PyTorch

5.1

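When a Tensor's requires_grad attribute is set to True, PyTorch's torch.autograd automatically tracks the operations performed on it. To compute derivatives, call the backward method on the final result Tensor; the gradient of every leaf Tensor that requires gradients is then stored in its grad attribute.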

x = torch.arange(9).view(3,3)
x.requires_grad
'''
False
'''
x = torch.rand(3,3,requires_grad=True)
print(x)
'''
tensor([[0.0018, 0.3481, 0.6948],
        [0.4811, 0.8106, 0.5855],
        [0.4229, 0.7706, 0.4321]], requires_grad=True)
'''
w = torch.ones(3,3,requires_grad=True)
y = torch.sum(torch.mm(w,x))
y
'''
tensor(13.6424, grad_fn=<SumBackward0>)
'''
y.backward()
print(y.grad)  # y is not a leaf Tensor, so its grad is not retained
print(x.grad)
print(w.grad)  # each row holds the row sums of x (values are approximate)
'''
None
tensor([[3., 3., 3.],
        [3., 3., 3.],
        [3., 3., 3.]])
tensor([[1.0447, 1.8772, 1.6256],
        [1.0447, 1.8772, 1.6256],
        [1.0447, 1.8772, 1.6256]])
'''
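
The gradients above can be checked by hand. For y equal to the sum of all entries of w @ x, the derivative with respect to x[k, j] is the sum of column k of w, and the derivative with respect to w[i, k] is the sum of row k of x. The sketch below compares these closed-form expressions with what autograd returns; the names expected_x_grad and expected_w_grad are introduced only for this check.

```python
import torch

x = torch.rand(3, 3, requires_grad=True)
w = torch.ones(3, 3, requires_grad=True)

y = torch.sum(torch.mm(w, x))
y.backward()

# dy/dx[k, j] = sum_i w[i, k]   (column sums of w, constant along each row of the gradient)
expected_x_grad = w.detach().sum(dim=0).unsqueeze(1).expand(3, 3)
# dy/dw[i, k] = sum_j x[k, j]   (row sums of x, repeated in every row of the gradient)
expected_w_grad = x.detach().sum(dim=1).unsqueeze(0).expand(3, 3)

print(torch.allclose(x.grad, expected_x_grad))
print(torch.allclose(w.grad, expected_w_grad))
```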

5.2

Tensor.detach returns a Tensor that is detached from the computation graph, so no gradients are computed for it.

x = torch.rand(3,3,requires_grad=True)
w = torch.ones(3,3,requires_grad=True)
print(x)
print(w)
yy = torch.mm(w,x)

detached_yy = yy.detach()
y = torch.mean(yy)
y.backward()

print(yy.grad)
print(detached_yy)
print(w.grad)
print(x.grad)
'''
tensor([[0.3030,0.6487,0.6878],
        [0.4371,0.9960,0.6529],
        [0.4750,0.4995,0.7988]],requires_grad=True)
tensor([[1.,1.,1.],
        [1.,1.,1.],
        [1.,1.,1.]],requires_grad=True)
None
tensor([[1.2151,2.1442,2.1395],
        [1.2151,2.1442,2.1395],
        [1.2151,2.1442,2.1395]])
tensor([[0.1822,0.2318,0.1970],
        [0.1822,0.2318,0.1970],
        [0.1822,0.2318,0.1970]])
tensor([[0.3333,0.3333,0.3333],
        [0.3333,0.3333,0.3333],
        [0.3333,0.3333,0.3333]])
'''
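
Two properties of detach are easy to overlook: the detached Tensor no longer requires gradients, yet it still shares storage with the original. The sketch below checks both with illustrative variable names.

```python
import torch

x = torch.rand(3, 3, requires_grad=True)
yy = torch.mm(torch.ones(3, 3), x)
detached = yy.detach()

print(yy.requires_grad, detached.requires_grad)   # the detached Tensor is no longer tracked
print(detached.data_ptr() == yy.data_ptr())       # it still shares storage with yy

z = (detached * 2).sum()   # computed from the detached Tensor only
print(z.requires_grad)     # no graph leads back to x, so z cannot be backpropagated
```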

5.3

Code inside a with torch.no_grad(): block is not tracked for differentiation.

y = torch.sum(torch.mm(w,x))
print(y.requires_grad)

with torch.no_grad():
    y = torch.sum(torch.mm(w,x))
    print(y.requires_grad)
'''
True
False
'''
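
A typical use of torch.no_grad() is a parameter update, which should not itself become part of the computation graph. The following is a minimal sketch of gradient descent on a toy quadratic loss, not a training recipe: the parameter w, the learning rate lr, the loss, and the step count are all assumptions made for illustration.

```python
import torch

w = torch.tensor([0.0], requires_grad=True)   # hypothetical parameter to optimise
lr = 0.1                                      # assumed learning rate

for _ in range(100):
    loss = ((w - 3.0) ** 2).sum()   # toy loss with its minimum at w = 3
    loss.backward()
    with torch.no_grad():           # the update itself must not be tracked
        w -= lr * w.grad
    w.grad.zero_()                  # gradients accumulate, so reset them each step

print(w.item())
```

Gradients accumulate across calls to backward, which is why the sketch resets w.grad after every step.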

6. Summary

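PyTorch's Tensor closely resembles NumPy's ndarray, but it has two capabilities that ndarray lacks and that matter a great deal for deep learning. First, a Tensor can be computed on a GPU; depending on the chip, matrix computations on a GPU can run roughly ten times faster than on a CPU. Second, a Tensor automatically joins a computation graph as a node when it takes part in a computation, and the graph can compute the derivative of each of its nodes automatically, so there is no need to work out derivatives by hand when using Tensors.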