Basics: Deep Learning with PyTorch

Latest recommended article published on 2025-04-25 21:53:23

On3K3y

Category: Self-Study Notes. Tags: deep learning, pytorch, artificial intelligence

Article link: https://blog.csdn.net/qq_61793088/article/details/131819077

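🤗 Preface

These notes were taken by Onekey while studying the Itheima (黑马程序员) course from Itcast (传智教育) and are kept for my own review. Many thanks to Itheima for the course; if sharing these notes is a problem, please contact me and I will remove them. Thank you for the teaching videos! 👍👍👍

Using the PyTorch deep learning framework

Key concepts of neural networks

Neural networks applied to images

Neural networks applied to natural language processing tasks

Chapter 1: Using PyTorch

0 Introduction

Creating tensors in PyTorch
Numerical computation on tensors
Tensor type conversion
Concatenating tensors
Indexing
Reshaping tensors
Tensor math functions
The autograd module
Case study: building linear regression by hand
Building linear regression with PyTorch components
Saving and loading models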

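1 Creating Tensors

💯 Summary

Ways to create a tensor

torch.tensor creates a tensor from the given data

torch.Tensor creates a tensor of a given shape; it can also create a tensor from given data

torch.IntTensor, torch.FloatTensor, and torch.DoubleTensor create tensors of a specific type
Creating linear and random tensors

torch.arange and torch.linspace create linear tensors

torch.random.initial_seed and torch.random.manual_seed read and set the random seed

torch.randn creates a random tensor
Creating all-zeros, all-ones, and constant tensors

torch.ones and torch.ones_like create all-ones tensors

torch.zeros and torch.zeros_like create all-zeros tensors

torch.full and torch.full_like create tensors filled with a given value
Converting the element type of a tensor

tensor.type(torch.DoubleTensor)

tensor.double()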

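(1) Basic creation

PyTorch is a Python deep learning framework that wraps data in tensors (Tensor) for computation.
A PyTorch tensor is a multi-dimensional array whose elements all share one data type.
In PyTorch, tensors are implemented as a class, and the operations on them are provided as methods of that class.

torch.tensor creates a tensor from the given data
torch.Tensor creates a tensor of a given shape; it can also create a tensor from given data
torch.IntTensor, torch.FloatTensor, and torch.DoubleTensor create tensors of a specific type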

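import torch
import numpy as np

# 1. Create tensors from existing data
def test01():

    # 1.1 Create a scalar tensor
    data = torch.tensor(10)
    print(data)

    # 1.2 Create from a NumPy array
    data = np.random.randn(2, 3)
    data = torch.tensor(data)
    # The dtype follows the NumPy array (float64 here);
    # float32 is the default only for Python floats
    print(data)

    # 1.3 Create from a Python list
    data = [[10., 20., 30.], [40., 50., 60.]]
    data = torch.tensor(data)
    print(data)

# 2. Create tensors of a given shape
def test02():

    # 2.1 Create a tensor with 2 rows and 3 columns (values are uninitialized)
    data = torch.Tensor(2, 3)
    print(data)

    # 2.2 Create a tensor holding the given values
    # Note: pass a list
    data = torch.Tensor([2, 3])
    print(data)

    data = torch.Tensor([10])
    print(data)

# 3. Create tensors of a given type
def test03():

    # The tensors above use either the default type or the type of their elements
    # Create an int32 tensor (values are uninitialized)
    data = torch.IntTensor(2, 3)
    print(data)

    # data = torch.ShortTensor(2, 3) creates an int16 tensor
    # data = torch.LongTensor(2, 3) creates an int64 tensor
    # data = torch.FloatTensor(2, 3) creates a float32 tensor

    # Note: if the data passed in does not match the requested type, it is converted
    data = torch.IntTensor([2.5, 3.5])
    print(data)


if __name__ == '__main__':
    test01()
    test02()
    test03()

Output

tensor(10)
tensor([[ 0.4466, -0.2139, -0.2531],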
        [-1.0478,  1.2809, -0.7754]], dtype=torch.float64)
tensor([[10., 20., 30.],
        [40., 50., 60.]])
tensor([[-2.3635e+38,  9.3186e-43,  0.0000e+00],
        [ 0.0000e+00,  0.0000e+00,  0.0000e+00]])
tensor([2., 3.])
tensor([10.])
tensor([[6357102, 7274595, 6553710],
        [3342433, 7077980, 6422633]], dtype=torch.int32)
tensor([2, 3], dtype=torch.int32)
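
The dtype differences visible in this output can be checked directly. The following is a minimal sketch, assuming PyTorch's default floating-point type has not been changed from float32; the variable names are illustrative and not part of the course code.

```python
import numpy as np
import torch

from_ints = torch.tensor([1, 2, 3])       # Python ints are inferred as int64
from_floats = torch.tensor([1.0, 2.0])    # Python floats use the default float dtype
from_numpy = torch.tensor(np.zeros(2))    # a float64 ndarray keeps float64
legacy = torch.Tensor([1, 2, 3])          # torch.Tensor always uses the default float dtype

print(from_ints.dtype, from_floats.dtype, from_numpy.dtype, legacy.dtype)
```

In practice this is why torch.tensor is usually the safer choice when the element type of the input matters: it preserves the type of the data, while torch.Tensor converts everything to the default float type.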

(2) Creating linear and random tensors

torch.arange and torch.linspace create linear tensors
torch.random.initial_seed and torch.random.manual_seed read and set the random seed
torch.randn creates a random tensor

import torch

# 1. Create linear tensors
def test01():
    # 1.1 Create a tensor with a given step
    # First argument: start value (inclusive)
    # Second argument: end value (exclusive)
    # Third argument: step
    data = torch.arange(0, 10, 2)
    print(data)

    # 1.2 Create a given number of evenly spaced elements in an interval
    # First argument: start value
    # Second argument: end value (inclusive)
    # Third argument: number of elements
    data = torch.linspace(0, 11, 10)
    print(data)

# 2. Create random tensors
def test02():
    # Fix the random seed manually
    torch.random.manual_seed(0)

    # 2.1 Create a random tensor
    data = torch.randn(2, 3)
    print(data)

    # 2.2 Check which seed is in use
    print('Random seed:', torch.random.initial_seed())

if __name__ == '__main__':
    test01()
    test02()

Output

tensor([0, 2, 4, 6, 8])
tensor([ 0.0000,  1.2222,  2.4444,  3.6667,  4.8889,  6.1111,  7.3333,  8.5556,
         9.7778, 11.0000])
tensor([[ 1.5410, -0.2934, -2.1788],
        [ 0.5684, -1.0845, -1.3986]])
Random seed: 0
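
To see what fixing the seed buys, the sketch below reseeds the generator and draws again; it also contrasts the end-point handling of torch.arange and torch.linspace. The variable names are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
first = torch.randn(2, 3)

torch.manual_seed(0)            # reset the generator to the same state
second = torch.randn(2, 3)

print(torch.equal(first, second))   # the two draws are identical

# arange excludes the end value, linspace includes it
steps = torch.arange(0, 10, 2)
points = torch.linspace(0, 10, 5)
print(steps[-1].item(), points[-1].item())
```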

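(3) Creating all-zeros, all-ones, and constant tensors

torch.ones and torch.ones_like create all-ones tensors
torch.zeros and torch.zeros_like create all-zeros tensors
torch.full and torch.full_like create tensors filled with a given value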

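import torch

# 1. Create all-zeros tensors
def test01():

    # 1.1 Create an all-zeros tensor of a given shape
    data = torch.zeros(2, 3)
    print(data)

    # 1.2 Create an all-zeros tensor with the same shape as another tensor
    data = torch.zeros_like(data)
    print(data)

# 2. Create all-ones tensors
def test02():

    # 2.1 Create an all-ones tensor of a given shape
    data = torch.ones(2, 3)
    print(data)

    # 2.2 Create an all-ones tensor with the same shape as another tensor
    data = torch.ones_like(data)
    print(data)

# 3. Create tensors filled with a given value
def test03():

    # 3.1 Create a 2x3 tensor whose values are all 100
    data = torch.full([2, 3], 100)
    print(data)

    # 3.2 Create a tensor with the same shape as data whose values are all 200
    data = torch.full_like(data, 200)
    print(data)

if __name__ == '__main__':
    test01()
    test02()
    test03()

Output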

tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[100, 100, 100],
        [100, 100, 100]])
tensor([[200, 200, 200],
        [200, 200, 200]])

(4) Converting the element type of a tensor

tensor.type(torch.DoubleTensor)
tensor.double()

import torch

# 1. Convert with the type method
def test01():

    data = torch.full([2, 3], 10)
    print(data.dtype)

    # Note: type returns a new, converted tensor
    data = data.type(torch.DoubleTensor)
    print(data.dtype)

# 2. Convert with the type-specific methods
def test02():

    data = torch.full([2, 3], 10)
    print(data.dtype)

    # Convert to float64
    data = data.double()
    print(data.dtype)

    # Each call below returns a new tensor; the results are discarded here
    data.short()   # convert elements to int16
    data.int()     # convert elements to int32
    data.long()    # convert elements to int64
    data.float()   # convert elements to float32
    data.double()  # convert elements to float64

if __name__ == '__main__':
    test01()
    test02()

Output

torch.int64
torch.float64
torch.int64
torch.float64
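
Besides type and the type-specific methods, the general to method also accepts a dtype. This is a small sketch of that alternative, not something covered in the course notes; the variable names are illustrative.

```python
import torch

data = torch.full([2, 3], 10)
as_double = data.to(torch.float64)     # same result as data.double()
as_like = data.to(as_double.dtype)     # reuse the dtype of another tensor

print(data.dtype, as_double.dtype, as_like.dtype)
```

Like the other conversion methods, to returns a new tensor and leaves data unchanged.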

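2 Numerical Computation on Tensors

💯 Summary

Basic tensor arithmetic

Addition: add, add_

Subtraction: sub, sub_

Multiplication: mul, mul_

Division: div, div_

Negation: neg, neg_

Hadamard product

* or mul multiplies the elements at matching positions

Matrix multiplication

The @ operator multiplies two matrices

torch.mm multiplies two matrices and requires 2-D inputs

torch.bmm multiplies batches of matrices and requires 3-D inputs

torch.matmul does not require both inputs to have the same number of dimensions

Specifying the compute device

Use the cuda method

Create the tensor directly on the GPU with device="cuda:0"

Use the to method to specify the device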

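(1) Basic tensor arithmetic

Addition: add, add_
Subtraction: sub, sub_
Multiplication: mul, mul_
Division: div, div_
Negation: neg, neg_

The versions with a trailing underscore modify the original tensor in place.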

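import torch

# 1. Computation that does not modify the original data
def test01():

    # First argument: lower bound (inclusive)
    # Second argument: upper bound (exclusive)
    # Third argument: shape
    data = torch.randint(0, 10, [2, 3])
    print(data)

    # The computation returns a new tensor
    data = data.add(10)
    print(data)

    # data.sub()  # subtraction
    # data.mul()  # multiplication
    # data.div()  # division
    # data.neg()  # negation

# 2. Computation that modifies the original data (in place)
def test02():

    data = torch.randint(0, 10, [2, 3])
    print(data)

    # The underscore versions modify the data directly; no new variable is needed
    data.add_(10)
    print(data)

    # data.sub_()  # subtraction
    # data.mul_()  # multiplication
    # data.div_()  # division
    # data.neg_()  # negation

if __name__ == '__main__':
    test01()
    test02()

Output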

tensor([[0, 5, 5],
        [9, 6, 8]])
tensor([[10, 15, 15],
        [19, 16, 18]])
tensor([[5, 9, 7],
        [6, 9, 7]])
tensor([[15, 19, 17],
        [16, 19, 17]])
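
The difference between the two families is easiest to see on a fixed tensor. The sketch below uses data_ptr, which returns the address of the tensor's underlying memory, to confirm that the in-place version reuses the same storage; the variable names are illustrative.

```python
import torch

x = torch.tensor([1, 2, 3])
address = x.data_ptr()        # address of the underlying memory

y = x.add(10)                 # out-of-place: x is untouched
print(x, y)

x.add_(10)                    # in-place: x itself changes
print(x, x.data_ptr() == address)
```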

(2) Hadamard product

The Hadamard product multiplies the elements at matching positions of two matrices, using * or mul.

import torch

# 1. Use the mul method
def test01():

    data1 = torch.tensor([[1, 2], [3, 4]])
    data2 = torch.tensor([[5, 6], [7, 8]])

    data = data1.mul(data2)
    print(data)

# 2. Use the * operator
def test02():

    data1 = torch.tensor([[1, 2], [3, 4]])
    data2 = torch.tensor([[5, 6], [7, 8]])

    data = data1 * data2
    print(data)

if __name__ == '__main__':
    test01()
    test02()

Output

tensor([[ 5, 12],
        [21, 32]])
tensor([[ 5, 12],
        [21, 32]])

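(3) Matrix multiplication

Matrix multiplication (called the dot product in the course) requires the first matrix to have shape (n, m) and the second to have shape (m, p); the result has shape (n, p).

The @ operator multiplies two matrices
torch.mm multiplies two matrices and requires 2-D inputs
torch.bmm multiplies batches of matrices and requires 3-D inputs

torch.matmul does not require both inputs to have the same number of dimensions

When both inputs are 2-D, it is equivalent to torch.mm
When both inputs are 3-D, it is equivalent to torch.bmm
When the inputs have different shapes, their last two dimensions must satisfy the matrix multiplication rule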

import torch

# 1. Use the @ operator
def test01():

    # Shape: 3 rows, 2 columns
    data1 = torch.tensor([[1, 2],
                          [3, 4],
                          [5, 6]])
    # Shape: 2 rows, 2 columns
    data2 = torch.tensor([[5, 6],
                          [7, 8]])
    data = data1 @ data2
    print(data)

# 2. Use mm
def test02():

    # Both inputs must be 2-D
    # Shape: 3 rows, 2 columns
    data1 = torch.tensor([[1, 2],
                          [3, 4],
                          [5, 6]])
    # Shape: 2 rows, 2 columns
    data2 = torch.tensor([[5, 6],
                          [7, 8]])
    data = torch.mm(data1, data2)
    print(data)

# 3. Use bmm
def test03():

    # First dimension: batch
    # Second dimension: number of rows
    # Third dimension: number of columns
    data1 = torch.randn(3, 4, 5)
    data2 = torch.randn(3, 5, 8)

    data = torch.bmm(data1, data2)
    print(data.shape)

# 4. Use matmul
def test04():

    # 2-D inputs
    data1 = torch.randn(4, 5)
    data2 = torch.randn(5, 8)
    print(torch.matmul(data1, data2).shape)

    # 3-D inputs
    data1 = torch.randn(3, 4, 5)
    data2 = torch.randn(3, 5, 8)
    print(torch.matmul(data1, data2).shape)

    # 3-D input times 2-D input: the 2-D matrix is applied to every batch entry
    data1 = torch.randn(3, 4, 5)
    data2 = torch.randn(5, 8)
    print(torch.matmul(data1, data2).shape)

if __name__ == '__main__':
    test01()
    test02()
    test03()
    test04()

Output

tensor([[19, 22],
        [43, 50],
        [67, 78]])
tensor([[19, 22],
        [43, 50],
        [67, 78]])
torch.Size([3, 4, 8])
torch.Size([4, 8])
torch.Size([3, 4, 8])
torch.Size([3, 4, 8])
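
Because the element-wise product and the matrix product are easy to confuse, the sketch below puts them side by side on the same 2x2 inputs, and then checks that torch.matmul with a 3-D and a 2-D input gives the same result as multiplying each batch entry separately. The variable names are illustrative assumptions.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[5., 6.], [7., 8.]])

hadamard = a * b      # element-wise: [[5, 12], [21, 32]]
product = a @ b       # matrix product: [[19, 22], [43, 50]]
print(hadamard)
print(product)

# matmul applies the 2-D matrix w to every matrix in the batch
batch = torch.randn(3, 4, 5)
w = torch.randn(5, 8)
out = torch.matmul(batch, w)
per_item = torch.stack([m @ w for m in batch])
print(out.shape, torch.allclose(out, per_item, atol=1e-5))
```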

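(4) Specifying the compute device

By default, PyTorch creates tensors in memory managed by the CPU, and the CPU is the default compute device.

Tensors can also be created on the GPU, whose strength at matrix computation speeds up model training.

There are three ways to place a tensor on the GPU:

Use the cuda method
Create the tensor directly on the GPU
Use the to method to specify the device

Check the torch version and whether CUDA is available: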

import torch
print(torch.__version__)
print(torch.cuda.is_available())

Output

2.0.1+cu117
True

import torch

# 1. Use the cuda method
def test01():

    data = torch.tensor([10, 20, 30])
    print("Device:", data.device)

    # Move the tensor to the GPU
    data = data.cuda()
    print("Device:", data.device)

    # Move the tensor from the GPU back to the CPU
    data = data.cpu()
    print("Device:", data.device)

# 2. Create the tensor directly on the GPU
def test02():

    data = torch.tensor([10, 20, 30], device="cuda:0")
    print("Device:", data.device)

    # Move the tensor from the GPU back to the CPU
    data = data.cpu()
    print("Device:", data.device)

# 3. Use the to method
def test03():

    data = torch.tensor([10, 20, 30])
    print("Device:", data.device)

    # Move the tensor to the given device with to
    data = data.to("cuda:0")
    print("Device:", data.device)

# 4. Note: tensors stored on different devices cannot be combined directly
def test04():

    data1 = torch.tensor([10, 20, 30])
    data2 = torch.tensor([10, 20, 30], device="cuda:0")

    # Computing data1 + data2 at this point would fail with:
    # RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

    # Move data1 to the GPU first.
    # Note: the cuda call itself fails if PyTorch was installed without GPU support
    # or the machine has no NVIDIA GPU
    data1 = data1.cuda()

    data = data1 + data2
    print(data)

if __name__ == '__main__':
    test01()
    test02()
    test03()
    test04()

Output

Device: cpu
Device: cuda:0
Device: cpu
Device: cuda:0
Device: cpu
Device: cpu
Device: cuda:0
tensor([20, 40, 60], device='cuda:0')
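
The listing above only runs on a machine with a CUDA-capable GPU. A common way to write code that runs on either kind of machine is to choose the device once and pass it around; the sketch below shows that pattern under the assumption that a single device is enough. The name device is an illustrative choice.

```python
import torch

# Pick the GPU when one is usable, otherwise stay on the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

data1 = torch.tensor([10, 20, 30], device=device)   # created on the device
data2 = torch.tensor([10, 20, 30]).to(device)       # created on the CPU, then moved

total = data1 + data2          # both operands live on the same device
print(total.device, total.cpu().tolist())
```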

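3 Tensor Type Conversion

💯 Summary

tensor.numpy converts a tensor to an ndarray that shares memory with it; use copy to avoid sharing
torch.from_numpy (shallow copy) converts an ndarray to a tensor that shares memory with it; use copy to avoid sharing
torch.tensor (deep copy) converts an ndarray to a tensor that does not share memory
item extracts the value from a tensor that holds exactly one element

(1) Converting a tensor to a NumPy array

tensor.numpy converts a tensor to an ndarray.

The two share memory; use copy to avoid sharing.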

import torch

# 1. Convert a tensor to a NumPy array
def test01():
    data_tensor = torch.tensor([2, 3, 4])
    # Convert the tensor to a NumPy array
    data_numpy = data_tensor.numpy()
    print(type(data_tensor))
    print(type(data_numpy))

    print(data_tensor)
    print(data_numpy)

# 2. The tensor and the NumPy array share memory
def test02():

    data_tensor = torch.tensor([2, 3, 4])
    data_numpy = data_tensor.numpy()

    # Modify an element of the tensor: does the NumPy array change?
    # data_tensor[0] = 100
    # print(data_tensor)
    # print(data_numpy)

    # Modify an element of the NumPy array: does the tensor change?
    data_numpy[0] = 100
    print(data_tensor)
    print(data_numpy)

# 3. Use copy so that memory is not shared
def test03():

    data_tensor = torch.tensor([2, 3, 4])
    # copy produces new data after the conversion, so memory is not shared
    data_numpy = data_tensor.numpy().copy()

    # Modify an element of the tensor: does the NumPy array change?
    data_tensor[0] = 100
    print(data_tensor)
    print(data_numpy)

if __name__ == '__main__':
    test01()
    test02()
    test03()

Output

<class 'torch.Tensor'>
<class 'numpy.ndarray'>
tensor([2, 3, 4])
[2 3 4]
tensor([100,   3,   4])
[100   3   4]
tensor([100,   3,   4])
[2 3 4]

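(2) Converting a NumPy array to a tensor

torch.from_numpy (shallow copy) converts an ndarray to a tensor that shares memory with it by default; use copy to avoid sharing.
torch.tensor (deep copy) converts an ndarray to a tensor that does not share memory by default.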

import torch
import numpy as np

# 1. Using from_numpy
def test01():

    data_numpy = np.array([2, 3, 4])
    # from_numpy shares memory by default; copy() is used here to avoid sharing
    data_tensor = torch.from_numpy(data_numpy.copy())

    print(type(data_numpy))
    print(type(data_tensor))

    # Because of the copy, changing the tensor leaves the array unchanged
    # data_numpy[0] = 100
    data_tensor[0] = 100
    print(data_numpy)
    print(data_tensor)

# 2. Using torch.tensor
def test02():

    data_numpy = np.array([2, 3, 4])
    data_tensor = torch.tensor(data_numpy)

    # Memory is NOT shared by default
    # data_numpy[0] = 100
    data_tensor[0] = 100
    print(data_numpy)
    print(data_tensor)

if __name__ == '__main__':
    test01()
    test02()

Output

<class 'numpy.ndarray'>
<class 'torch.Tensor'>
[2 3 4]
tensor([100,   3,   4], dtype=torch.int32)
[2 3 4]
tensor([100,   3,   4], dtype=torch.int32)
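
Since the from_numpy example above calls copy, it never actually shows the sharing. The sketch below drops the copy and compares the two conversion functions on the same array; the variable names are illustrative.

```python
import numpy as np
import torch

arr = np.array([2, 3, 4])
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy

arr[0] = 100                     # modify the NumPy array only
print(shared.tolist())           # the shared tensor sees the change
print(copied.tolist())           # the copy does not
```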

(3) Converting between scalar tensors and numbers

item extracts the value from a tensor that holds exactly one element.

import torch

def test():

    t1 = torch.tensor(30)
    t2 = torch.tensor([30])
    t3 = torch.tensor([[30]])

    print(t1.shape)
    print(t2.shape)
    print(t3.shape)

    print(t1.item())
    print(t2.item())
    print(t3.item())

    # Note: item only works on a tensor with exactly one element; with more elements it raises
    # RuntimeError: a Tensor with 2 elements cannot be converted to Scalar
    # t4 = torch.tensor([30, 40])
    # print(t4.item())

if __name__ == '__main__':
    test()

Output

torch.Size([])
torch.Size([1])
torch.Size([1, 1])
30
30
30
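
For tensors with more than one element, tolist is the usual way to get plain Python values. The sketch below shows it next to item; the exact exception type raised by item on a multi-element tensor may differ between PyTorch versions, so both RuntimeError and ValueError are caught here as an assumption.

```python
import torch

t = torch.tensor([30, 40])

values = t.tolist()        # whole tensor -> Python list
first = t[0].item()        # index down to one element, then item()

try:
    t.item()               # more than one element: not allowed
    failed = False
except (RuntimeError, ValueError):
    failed = True

print(values, first, failed)
```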

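4 Concatenating Tensors

💯 Summary

torch.cat concatenates tensors along an existing dimension
torch.stack stacks tensors along a new dimension

(1) Using `torch.cat`

torch.cat concatenates two or more tensors along the given dimension and is used very frequently.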

import torch

def test():

    # Fix the random seed
    torch.manual_seed(0)

    data1 = torch.randint(0, 10, [3, 4, 5])
    data2 = torch.randint(0, 10, [3, 4, 5])

    print(data1.shape)
    print(data2.shape)

    # 1. Concatenate along dimension 0
    new_data = torch.cat([data1, data2], dim=0)
    print(new_data.shape)

    # 2. Concatenate along dimension 1
    new_data = torch.cat([data1, data2], dim=1)
    print(new_data.shape)

    # 3. Concatenate along dimension 2
    new_data = torch.cat([data1, data2], dim=2)
    print(new_data.shape)

    # Note: dim must be a valid dimension of the inputs

if __name__ == '__main__':
    test()

Output

torch.Size([3, 4, 5])
torch.Size([3, 4, 5])
torch.Size([6, 4, 5])
torch.Size([3, 8, 5])
torch.Size([3, 4, 10])

(2) Using `torch.stack`

torch.stack stacks two or more tensors of the same shape along a new dimension at the given position.

import torch

def test():

    torch.manual_seed(0)
    data1 = torch.randint(0, 10, [2, 3])
    data2 = torch.randint(0, 10, [2, 3])

    print(data1)
    print(data2)
    print('-' * 30)

    # Stack the two tensors; as with cat, a dimension is specified
    # 1. Stack along dimension 0
    new_data = torch.stack([data1, data2], dim=0)
    print(new_data.shape)
    print(new_data)
    print('-' * 30)

    # 2. Stack along dimension 1
    new_data = torch.stack([data1, data2], dim=1)
    print(new_data.shape)
    print(new_data)
    print('-' * 30)

    # 3. Stack along dimension 2
    new_data = torch.stack([data1, data2], dim=2)
    print(new_data.shape)
    print(new_data)
    print('-' * 30)

if __name__ == '__main__':
    test()

Output

tensor([[4, 9, 3],
        [0, 3, 9]])
tensor([[7, 3, 7],
        [3, 1, 6]])
------------------------------
torch.Size([2, 2, 3])
tensor([[[4, 9, 3],
         [0, 3, 9]],

        [[7, 3, 7],
         [3, 1, 6]]])
------------------------------
torch.Size([2, 2, 3])
tensor([[[4, 9, 3],
         [7, 3, 7]],

        [[0, 3, 9],
         [3, 1, 6]]])
------------------------------
torch.Size([2, 3, 2])
tensor([[[4, 7],
         [9, 3],
         [3, 7]],

        [[0, 3],
         [3, 1],
         [9, 6]]])
------------------------------
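
One way to relate the two functions: stacking is the same as adding a new dimension to each input and then concatenating along it. The sketch below checks this on the two tensors printed above; the variable names are illustrative.

```python
import torch

a = torch.tensor([[4, 9, 3], [0, 3, 9]])
b = torch.tensor([[7, 3, 7], [3, 1, 6]])

stacked = torch.stack([a, b], dim=0)                           # new leading dimension
via_cat = torch.cat([a.unsqueeze(0), b.unsqueeze(0)], dim=0)   # same result through cat
joined = torch.cat([a, b], dim=0)                              # cat keeps the number of dimensions

print(stacked.shape, torch.equal(stacked, via_cat))
print(joined.shape)
```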

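5 Tensor Indexing

💯 Summary

Simple row and column indexing

data[1,2] and data[1][2] get the element at a given position

data[:, 0] gets the first column (index 0)

List indexing

data[[0, 2, 3], [0, 1, 2]] gets the elements at (0, 0), (2, 1), and (3, 2)

data[[[0], [2], [3]], [0, 1, 2]] gets columns 0, 1, 2 of rows 0, 2, 3

Range indexing

data[:3, 2] takes the first 3 rows, then the column at index 2

Boolean indexing

data[data[:, 1] > 6] returns the rows whose second-column element is greater than 6

Multi-dimensional indexing

data[0, :, :] selects element 0 along dimension 0, giving a 4x5 block

(1) Simple row and column indexing

(2) List indexing

(3) Range indexing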

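import torch

# 1. Simple row and column indexing
def test01():

    data = torch.randint(0, 10, [4, 5])
    print(data)
    print('-' * 30)

    # 1.1 Get a given row
    print(data[0])

    # 1.2 Get a given column
    # The index before the comma selects rows, the one after it selects columns
    # A colon means all rows or all columns

    # Get the first column (index 0)
    print(data[:, 0])

    # Get the element at a given position
    print(data[1, 2], data[1][2])

    # 3. Range indexing

    # Take the first 3 rows, then the column at index 2
    print(data[:3, 2])

    # Get the first 2 columns of the first 3 rows
    print(data[:3, :2])

# 2. List indexing
def test02():

    # Fix the random seed
    torch.manual_seed(0)

    data = torch.randint(0, 10, [4, 5])
    print(data)
    print('-' * 30)

    # If the row and column indices are both 1-D lists, the two lists must have the same length
    # Get the elements at (0, 0), (2, 1), and (3, 2)
    print(data[[0, 2, 3], [0, 1, 2]])

    # Get columns 0, 1, 2 of rows 0, 2, 3
    print(data[[[0], [2], [3]], [0, 1, 2]])

if __name__ == '__main__':
    test01()
    test02()

Output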

tensor([[1, 0, 1, 6, 3],
        [2, 9, 7, 5, 4],
        [1, 0, 9, 9, 8],
        [2, 1, 5, 7, 7]])
------------------------------
tensor([1, 0, 1, 6, 3])
tensor([1, 2, 1, 2])
tensor(7) tensor(7)
tensor([1, 7, 9])
tensor([[1, 0],
        [2, 9],
        [1, 0]])
tensor([[4, 9, 3, 0, 3],
        [9, 7, 3, 7, 3],
        [1, 6, 6, 9, 8],
        [6, 6, 8, 4, 3]])
------------------------------
tensor([4, 6, 8])
tensor([[4, 9, 3],
        [1, 6, 6],
        [6, 6, 8]])
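
The two list-indexing forms are easier to tell apart on a tensor whose values encode their own position. The sketch below uses torch.arange(20).reshape(4, 5), so the element in row r and column c is 5*r + c; this test tensor is an illustrative choice, not from the course.

```python
import torch

data = torch.arange(20).reshape(4, 5)   # element (r, c) equals 5*r + c

# Two 1-D lists are paired element by element: (0, 0), (2, 1), (3, 2)
pairs = data[[0, 2, 3], [0, 1, 2]]

# A column of row indices broadcasts against the column list: a 3x3 block
block = data[[[0], [2], [3]], [0, 1, 2]]

print(pairs)
print(block)
```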

(4) Boolean indexing

(5) Multi-dimensional indexing

import torch

# 1. Boolean indexing
def test01():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [4, 5])
    print(data)

    # Get all elements of the tensor that are greater than 3
    print(data[data > 3])

    # Return the rows whose second-column element (index 1) is greater than 6
    print(data[data[:, 1] > 6])

    # Return the columns whose second-row element (index 1) is greater than 3
    print(data[:, data[1] > 3])

# 2. Multi-dimensional indexing
def test02():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [3, 4, 5])
    print(data)
    print('-' * 30)

    # Select element 0 along dimension 0: a 4x5 block
    print(data[0, :, :])
    print('-' * 30)

    # Select element 0 along dimension 1
    print(data[:, 0, :])
    print('-' * 30)

    # Select element 0 along dimension 2
    print(data[:, :, 0])
    print('-' * 30)

if __name__ == '__main__':
    test01()
    test02()

Output

tensor([[4, 9, 3, 0, 3],
        [9, 7, 3, 7, 3],
        [1, 6, 6, 9, 8],
        [6, 6, 8, 4, 3]])
tensor([4, 9, 9, 7, 7, 6, 6, 9, 8, 6, 6, 8, 4])
tensor([[4, 9, 3, 0, 3],
        [9, 7, 3, 7, 3]])
tensor([[4, 9, 0],
        [9, 7, 7],
        [1, 6, 9],
        [6, 6, 4]])
tensor([[[4, 9, 3, 0, 3],
         [9, 7, 3, 7, 3],
         [1, 6, 6, 9, 8],
         [6, 6, 8, 4, 3]],

        [[6, 9, 1, 4, 4],
         [1, 9, 9, 9, 0],
         [1, 2, 3, 0, 5],
         [5, 2, 9, 1, 8]],

        [[8, 3, 6, 9, 1],
         [7, 3, 5, 2, 1],
         [0, 9, 3, 1, 1],
         [0, 3, 6, 6, 7]]])
------------------------------
tensor([[4, 9, 3, 0, 3],
        [9, 7, 3, 7, 3],
        [1, 6, 6, 9, 8],
        [6, 6, 8, 4, 3]])
------------------------------
tensor([[4, 9, 3, 0, 3],
        [6, 9, 1, 4, 4],
        [8, 3, 6, 9, 1]])
------------------------------
tensor([[4, 9, 1, 6],
        [6, 1, 1, 5],
        [8, 7, 0, 0]])
------------------------------
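
Boolean masks can also be combined and used on the left-hand side of an assignment. The following is a minimal sketch on a deterministic tensor; the variable names and the threshold values are illustrative assumptions.

```python
import torch

data = torch.arange(20).reshape(4, 5)

# Combine conditions with & (and) or | (or); the parentheses are required
mask = (data > 3) & (data % 2 == 0)
evens = data[mask]                   # 1-D tensor of the selected elements

clipped = data.clone()
clipped[clipped > 15] = 15           # a mask can also be used for assignment

print(evens)
print(clipped[3])
```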

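6 Reshaping Tensors

💯 Summary

reshape changes the dimensions of a tensor while keeping its data unchanged
torch.transpose swaps two dimensions at a time
torch.permute reorders several dimensions at once
view can also change a tensor's shape, but it is more limited: it only works on tensors stored in one contiguous block of memory
squeeze removes dimensions of size 1
unsqueeze inserts a dimension of size 1 at the given position

(1) Using `reshape`

shape and size() report the shape of a tensor
reshape converts a tensor to the given shape while keeping its data unchanged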

import torch

def test():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [4, 5])

    # Inspect the shape of the tensor
    print(data.shape, data.shape[0], data.shape[1])
    print(data.size(), data.size(0), data.size(1))

    # Change the shape of the tensor
    new_data = data.reshape(2, 10)
    print(new_data)

    # Note: the new shape must contain the same number of elements as the original tensor
    # new_data = data.reshape(1, 10)
    # print(new_data)

    # Use -1 for a dimension that should be inferred
    new_data = data.reshape(5, -1)
    print(new_data)

if __name__ == '__main__':
    test()

Output

torch.Size([4, 5]) 4 5
torch.Size([4, 5]) 4 5
tensor([[4, 9, 3, 0, 3, 9, 7, 3, 7, 3],
        [1, 6, 6, 9, 8, 6, 6, 8, 4, 3]])
tensor([[4, 9, 3, 0],
        [3, 9, 7, 3],
        [7, 3, 1, 6],
        [6, 9, 8, 6],
        [6, 8, 4, 3]])
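
One detail worth knowing is that reshape on a contiguous tensor returns a view rather than a copy, so the reshaped tensor and the original share data. The sketch below demonstrates this together with the -1 inference; the variable names are illustrative.

```python
import torch

data = torch.arange(6)
grid = data.reshape(2, -1)      # -1 is inferred as 6 / 2 = 3

grid[0, 0] = 100                # write through the reshaped tensor
print(grid.shape, data[0].item())
```

If an independent result is needed, calling clone on the reshaped tensor is one option.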

(2) Using `torch.transpose` and `torch.permute`

torch.transpose swaps two dimensions at a time
torch.permute reorders several dimensions at once
Both change the order of the dimensions of the data

import torch

# 1. transpose
def test01():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [3, 4, 5])

    new_data = data.reshape(4, 3, 5)
    print(new_data.shape)

    # Swap two dimensions directly
    new_data = torch.transpose(data, 1, 2)
    print(new_data.shape)

    # Drawback: only two dimensions can be swapped at a time
    # To turn the shape into (4, 5, 3):
    # after the first swap: (4, 3, 5)
    # after the second swap: (4, 5, 3)
    new_data = torch.transpose(data, 0, 1)
    new_data = torch.transpose(new_data, 1, 2)
    print(new_data.shape)

# 2. permute
def test02():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [3, 4, 5])

    # permute reorders several dimensions in one call
    new_data = torch.permute(data, [1, 2, 0])
    print(new_data.shape)

if __name__ == '__main__':
    test01()
    test02()

Output

torch.Size([4, 3, 5])
torch.Size([3, 5, 4])
torch.Size([4, 5, 3])
torch.Size([4, 5, 3])

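(3) Using `view` and `contiguous`

view can also change a tensor's shape, but it is more limited: it only works on tensors stored in one contiguous block of memory. For example, a tensor produced by torch.transpose or torch.permute generally cannot be reshaped with view directly.
is_contiguous checks whether a tensor is contiguous
contiguous returns a contiguous version of the tensor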

import torch

# 1. Using view
def test01():

    data = torch.tensor([[10, 20, 30], [40, 50, 60]])
    data = data.view(3, 2)
    print(data.shape)

    # is_contiguous reports whether the tensor occupies one contiguous block of memory
    print(data.is_contiguous())

# 2. A caveat when using view
def test02():

    # After transpose or permute, a tensor is usually no longer contiguous in memory
    # It must be made contiguous before view can change its shape

    data = torch.tensor([[10, 20, 30], [40, 50, 60]])
    print('Contiguous:', data.is_contiguous())
    data = torch.transpose(data, 0, 1)
    print('Contiguous:', data.is_contiguous())

    # Calling view directly on the non-contiguous tensor would raise a RuntimeError,
    # so call contiguous first
    data = data.contiguous().view(2, 3)
    print(data)

if __name__ == '__main__':
    test01()
    test02()

Output

torch.Size([3, 2])
True
Contiguous: True
Contiguous: False
tensor([[10, 40, 20],
        [50, 30, 60]])
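
The failure case itself is not shown in the listing, so here is a small sketch that triggers it and contrasts it with reshape, which falls back to copying when a view is not possible. The variable names are illustrative.

```python
import torch

data = torch.arange(6).reshape(2, 3)
transposed = data.t()                 # shape (3, 2), not contiguous

try:
    transposed.view(6)                # view cannot reinterpret this memory layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = transposed.reshape(6)          # reshape copies when it has to
print(transposed.is_contiguous(), view_failed, flat.tolist())
```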

(4) Using `squeeze` and `unsqueeze`

squeeze removes dimensions of size 1
unsqueeze inserts a dimension of size 1 at the given position

import torch

# 1. Using squeeze
def test01():

    data = torch.randint(0, 10, [1, 3, 1, 5])
    print(data.shape)

    # Without arguments, squeeze removes every dimension of size 1
    new_data = data.squeeze()
    print(new_data.shape)

    # Remove one specific dimension of size 1
    new_data = data.squeeze(2)
    print(new_data.shape)

# 2. Using unsqueeze
def test02():

    data = torch.randint(0, 10, [3, 5])
    print(data.shape)

    # Insert a dimension at the given position
    # -1 means after the last dimension
    new_data = data.unsqueeze(-1)
    print(new_data.shape)

if __name__ == '__main__':
    test01()
    test02()

Output

torch.Size([1, 3, 1, 5])
torch.Size([3, 5])
torch.Size([1, 3, 5])
torch.Size([3, 5])
torch.Size([3, 5, 1])
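
Two behaviours that are easy to miss: squeeze on a dimension whose size is not 1 leaves the tensor unchanged, and unsqueeze(0) is a common way to add a leading batch dimension. A minimal sketch, with illustrative variable names:

```python
import torch

data = torch.zeros(1, 3, 1, 5)

unchanged = data.squeeze(1)                  # dimension 1 has size 3, so nothing is removed
batched = torch.zeros(3, 5).unsqueeze(0)     # add a leading batch dimension

print(unchanged.shape, batched.shape)
```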

7 Tensor Math Functions

💯 Summary

mean: average
sum: sum
pow: power
sqrt: square root
exp: e raised to a power
log: logarithm

import torch

# 1. Mean
def test01():

    torch.manual_seed(0)
    # data = torch.randint(0, 10, [2, 3], dtype=torch.float64)
    data = torch.randint(0, 10, [2, 3]).double()
    # print(data.dtype)

    print(data)
    # By default the mean is taken over all elements
    print(data.mean())
    # Take the mean along a given dimension
    print(data.mean(dim=0))
    print(data.mean(dim=1))

# 2. Sum
def test02():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [2, 3]).double()

    print(data.sum())
    print(data.sum(dim=0))
    print(data.sum(dim=1))

# 3. Power (here: square)
def test03():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [2, 3]).double()

    print(data)
    print(data.pow(2))

# 4. Square root
def test04():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [2, 3]).double()

    print(data)
    print(data.sqrt())

# 5. e raised to a power
def test05():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [2, 3]).double()

    print(data)
    print(data.exp())

# 6. Logarithm
def test06():

    torch.manual_seed(0)
    data = torch.randint(0, 10, [2, 3]).double()

    print(data)
    print(data.log())    # base e
    print(data.log2())   # base 2
    print(data.log10())  # base 10

if __name__ == '__main__':
    test01()
    print('-' * 30)
    test02()
    print('-' * 30)
    test03()
    print('-' * 30)
    test04()
    print('-' * 30)
    test05()
    print('-' * 30)
    test06()

Output

tensor([[4., 9., 3.],
        [0., 3., 9.]], dtype=torch.float64)
tensor(4.6667, dtype=torch.float64)
tensor([2., 6., 6.], dtype=torch.float64)
tensor([5.3333, 4.0000], dtype=torch.float64)
------------------------------
tensor(28., dtype=torch.float64)
tensor([ 4., 12., 12.], dtype=torch.float64)
tensor([16., 12.], dtype=torch.float64)
------------------------------
tensor([[4., 9., 3.],
        [0., 3., 9.]], dtype=torch.float64)
tensor([[16., 81.,  9.],
        [ 0.,  9., 81.]], dtype=torch.float64)
------------------------------
tensor([[4., 9., 3.],
        [0., 3., 9.]], dtype=torch.float64)
tensor([[2.0000, 3.0000, 1.7321],
        [0.0000, 1.7321, 3.0000]], dtype=torch.float64)
------------------------------
tensor([[4., 9., 3.],
        [0., 3., 9.]], dtype=torch.float64)
tensor([[5.4598e+01, 8.1031e+03, 2.0086e+01],
        [1.0000e+00, 2.0086e+01, 8.1031e+03]], dtype=torch.float64)
------------------------------
tensor([[4., 9., 3.],
        [0., 3., 9.]], dtype=torch.float64)
tensor([[1.3863, 2.1972, 1.0986],
        [  -inf, 1.0986, 2.1972]], dtype=torch.float64)
tensor([[2.0000, 3.1699, 1.5850],
        [  -inf, 1.5850, 3.1699]], dtype=torch.float64)
tensor([[0.6021, 0.9542, 0.4771],
        [  -inf, 0.4771, 0.9542]], dtype=torch.float64)

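8 The Autograd Module

💯 Summary

A tensor whose gradient is needed must first be created with requires_grad=True
backward performs automatic differentiation and must be called on a scalar
grad holds the gradient value
Controlling gradient computation
1. with torch.no_grad(): First way: code in the block is excluded from gradient computation
2. ```
# Second way: decorator
@torch.no_grad()
def my_func(x):
    return x**2
```
3. torch.set_grad_enabled(False) Third way: global switch (affects other code)
Gradient accumulation and zeroing gradients

if x.grad is not None:
    x.grad.data.zero_()

Gradient descent: $w = w - \alpha \nabla f$
To convert a tensor that requires gradients to NumPy, call detach first, then numpy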

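(1) Basic gradient computation

A tensor whose gradient is needed must first be created with requires_grad=True
backward performs automatic differentiation and must be called on a scalar; mean or sum is typically used to reduce a vector to a scalar before calling backward (backpropagation)
grad holds the gradient value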

import torch

# 1. Gradient of a scalar
# y = x ** 2 + 20
def test01():

    # A tensor to be differentiated needs requires_grad=True
    x = torch.tensor(10, requires_grad=True, dtype=torch.float64)

    # Intermediate computation on x
    f = x ** 2 + 20    # df/dx = 2x

    # Automatic differentiation
    f.backward()

    # Access the gradient
    print(x.grad)

# 2. Gradient of a vector
def test02():

    x = torch.tensor([10, 20, 30, 40], requires_grad=True, dtype=torch.float64)
    # Define the computation on the variable
    y1 = x ** 2 + 20

    # Note: backward must be called on a scalar
    y2 = y1.mean()    # mean over 4 elements, so the gradient is 1/4 * 2x

    # Automatic differentiation
    y2.backward()

    # Print the gradient
    print(x.grad)

# 3. Gradients of several scalars
# y = x1**2 + x2**2 + x1*x2
def test03():

    x1 = torch.tensor(10, requires_grad=True, dtype=torch.float64)
    x2 = torch.tensor(20, requires_grad=True, dtype=torch.float64)

    # Intermediate computation
    y = x1**2 + x2**2 + x1*x2

    # Automatic differentiation
    y.backward()

    # Print the gradients
    print(x1.grad)
    print(x2.grad)

# 4. Gradients of several vectors
def test04():

    x1 = torch.tensor([10, 20], requires_grad=True, dtype=torch.float64)
    x2 = torch.tensor([30, 40], requires_grad=True, dtype=torch.float64)

    # Define the intermediate computation
    y = x1**2 + x2**2 + x1*x2

    # Reduce the output to a scalar
    y = y.sum()

    # Automatic differentiation
    y.backward()

    # Print the gradients
    print(x1.grad)
    print(x2.grad)

if __name__ == '__main__':
    test01()
    print('-' * 30)
    test02()
    print('-' * 30)
    test03()
    print('-' * 30)
    test04()

Output

tensor(20., dtype=torch.float64)
------------------------------
tensor([ 5., 10., 15., 20.], dtype=torch.float64)
------------------------------
tensor(40., dtype=torch.float64)
tensor(50., dtype=torch.float64)
------------------------------
tensor([50., 80.], dtype=torch.float64)
tensor([ 70., 100.], dtype=torch.float64)
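
Reducing with sum or mean is the usual approach, but backward can also be called on a non-scalar output if a gradient argument of the same shape is supplied. The sketch below, which is an addition to the course material, passes a tensor of ones, which gives the same gradients as calling sum first.

```python
import torch

x = torch.tensor([10., 20.], requires_grad=True)
y = x ** 2 + 20                       # y is a vector, not a scalar

# Passing a vector of ones gives the same gradients as y.sum().backward()
y.backward(gradient=torch.ones_like(y))
print(x.grad)                         # dy_i/dx_i = 2 * x_i
```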

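(2) Controlling gradient computation

Three ways to control gradient computation
1. with torch.no_grad(): First way: code in the block is excluded from gradient computation
2. ```
@torch.no_grad()
def my_func(x):
    return x**2
```
  Second way: decorator
3. torch.set_grad_enabled(False) Third way: global switch (affects other code)

Gradient accumulation and zeroing gradients

if x.grad is not None:
    x.grad.data.zero_()

Gradient descent: $w = w - \alpha \nabla f$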

import torch

# 1. Controlling gradient computation
def test01():

    x = torch.tensor(10, requires_grad=True, dtype=torch.float64)
    print(x.requires_grad)

    # 1. First way: exclude a block from gradient computation
    with torch.no_grad():
        y = x**2
    print(y.requires_grad)

    # 2. Second way: decorator
    @torch.no_grad()
    def my_func(x):
        return x**2

    y = my_func(x)
    print(y.requires_grad)

    # 3. Third way: global switch (affects all code that runs afterwards)
    torch.set_grad_enabled(False)
    y = x**2
    print(y.requires_grad)

if __name__ == '__main__':
    test01()

Output

True
False
False
False
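
Because the global switch stays off until it is turned back on, torch.set_grad_enabled can also be used as a context manager, which restores the previous state when the block ends. A minimal sketch; the first line only makes sure the example starts from the default state.

```python
import torch

torch.set_grad_enabled(True)          # make sure we start from the default state

x = torch.tensor(10.0, requires_grad=True)

with torch.set_grad_enabled(False):   # off only inside this block
    y = x ** 2

z = x ** 2                            # tracking is back on here
print(y.requires_grad, z.requires_grad, torch.is_grad_enabled())
```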

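import torch

# 2. Gradient accumulation and zeroing gradients
def test02():

    x = torch.tensor([10, 20, 30, 40], requires_grad=True, dtype=torch.float64)

    # Each repeated backward pass adds the new gradient to the value already in x.grad
    # To avoid accumulating past gradients, zero them before each backward call
    for _ in range(3):

        # Computation on the input x
        f1 = x**2 + 20
        # Reduce the vector to a scalar
        f2 = f1.mean()

        # Zero the gradient
        if x.grad is not None:
            x.grad.data.zero_()

        # Automatic differentiation
        f2.backward()
        print(x.grad)

# 3. Case study: minimizing a function with gradient descent
def test03():

    # y = x**2
    # For which value of x is y smallest?

    # Initialize
    x = torch.tensor(10, requires_grad=True, dtype=torch.float64)

    for _ in range(1000):

        # Forward computation
        y = x**2

        # Zero the gradient
        if x.grad is not None:
            x.grad.data.zero_()

        # Backpropagation (automatic differentiation)
        y.backward()

        # Update the parameter
        x.data = x.data - 0.01 * x.grad

        # Print the value of x
        print('%.10f' % x.data)

if __name__ == '__main__':
    test02()
    print('-' * 30)
    test03()

Output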

tensor([ 5., 10., 15., 20.], dtype=torch.float64)
tensor([ 5., 10., 15., 20.], dtype=torch.float64)
tensor([ 5., 10., 15., 20.], dtype=torch.float64)
------------------------------
9.8000000000
9.6040000000
......
0.0000000175
0.0000000172
0.0000000168
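
The same update can be written without touching the data attribute, by doing the parameter update inside torch.no_grad() and zeroing the gradient afterwards. The following is a sketch of that variant under the same settings (start at 10, step size 0.01, 1000 iterations); the name lr is an illustrative assumption.

```python
import torch

x = torch.tensor(10.0, requires_grad=True, dtype=torch.float64)
lr = 0.01                         # step size, alpha in the update rule

for _ in range(1000):
    y = x ** 2                    # forward computation
    y.backward()                  # dy/dx = 2x is stored in x.grad

    with torch.no_grad():         # the update itself must not be tracked
        x -= lr * x.grad          # w = w - alpha * grad
    x.grad.zero_()                # clear the gradient for the next iteration

print('%.10f' % x.item())
```

Each step multiplies x by 1 - 2 * 0.01 = 0.98, which is why the printed values shrink geometrically toward the minimum at 0.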

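(3) Caveats when computing gradients

Calling numpy on a tensor created with requires_grad=True raises the following error:

RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

In that case, call detach first to separate the tensor, then call numpy.

Note: detach returns a new tensor that is a leaf node and shares data with the original tensor, but the detached tensor does not require gradients.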

import torch

# 1. Demonstrate the error
def test01():

    x = torch.tensor([10, 20], requires_grad=True, dtype=torch.float64)

    # RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.
    # print(x.numpy())
    # The correct way:
    print(x.detach().numpy())

# 2. Shared data
def test02():

    # x1 is a leaf node
    x1 = torch.tensor([10, 20], requires_grad=True, dtype=torch.float64)
    # Use detach to separate out a new tensor
    x2 = x1.detach()

    print(id(x1.data), id(x2.data))

    # Modify the new, detached tensor
    x2[0] = 100

    print(x1)
    print(x2)

    # The output shows that x2 does not have requires_grad=True
    # Computations on x1 are tracked and affect the gradient of x1
    # Computations on x2 are not tracked and do not affect the gradient of x1

    print(x1.requires_grad)
    print(x2.requires_grad)

if __name__ == '__main__':
    test01()
    test02()

Output

[10. 20.]
2429421557136 2429421557136
tensor([100.,  20.], dtype=torch.float64, requires_grad=True)
tensor([100.,  20.], dtype=torch.float64)
True
False
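
When the shared memory is not wanted, detach can be followed by clone to get an independent copy. The sketch below compares the two using data_ptr, which returns the address of the underlying memory; the variable names are illustrative and this variant is an addition to the course notes.

```python
import torch

x1 = torch.tensor([10., 20.], requires_grad=True)

shared = x1.detach()                 # no gradient tracking, same memory
independent = x1.detach().clone()    # no gradient tracking, separate memory

shared[0] = 100.                     # also changes x1
independent[1] = -1.                 # leaves x1 alone

print(x1)
print(shared.data_ptr() == x1.data_ptr(), independent.data_ptr() == x1.data_ptr())
```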