PyTorch from Getting Started to Giving Up, Part 1 (Basic PyTorch Usage)

自学AI的鲨鱼儿

Last modified: 2024-03-15 23:47:41

Category: NLP_Code. Tags: natural language processing, tensorflow, pytorch

First published: 2021-05-10 18:54:16

Article link: https://blog.csdn.net/qq_16555103/article/details/116605644

import numpy as np
import pandas as pd
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

import torch
import torchvision  # PyTorch's computer-vision package
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable  # legacy wrapper that made a tensor track gradients; current PyTorch sets requires_grad when the tensor is created

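Some basic facts about PyTorch

PyTorch tensors default to torch.float32 for floating-point data and torch.int64 for integer data. The default dtype is not shown when a tensor is printed. Tensors of different dtypes can usually be combined directly; when an operation rejects mixed dtypes, convert one operand explicitly.
NumPy defaults to 64 bits for both floating-point and integer data (float64 and, on most platforms, int64), and arrays of different dtypes can be combined directly.
When defining a tensor, specify dtype and requires_grad explicitly where possible; requires_grad defaults to False.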
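
The defaults described above can be checked directly. This is a minimal sketch, not part of the original post; the variable names are made up for illustration, and the matmul case is included on the assumption that matrix multiplication expects both operands to share a dtype.

```python
import numpy as np
import torch

# Default dtypes: float32 / int64 for torch, float64 for NumPy floats
t_float = torch.tensor([1.0, 2.0])
t_int = torch.tensor([1, 2])
a_float = np.array([1.0, 2.0])
print(t_float.dtype, t_int.dtype, a_float.dtype)

# Element-wise arithmetic promotes mixed dtypes: int64 + float32 gives float32
mixed = t_int + t_float
print(mixed.dtype)

# Some operations expect matching dtypes (matrix multiplication is one),
# so convert explicitly before calling them
m_int = torch.tensor([[1, 2], [3, 4]])
m_float = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
product = m_int.to(torch.float32) @ m_float
print(product.dtype)
```

Converting with .to(dtype) returns a new tensor and leaves m_int unchanged.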

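1. Converting between tensors and NumPy arrays

1.1 Creating a tensor is very similar to creating a NumPy array

Difference between tensor.dtype and tensor.type()

tensor.dtype is the data type of the elements stored in the tensor; tensor.type() returns the type of the tensor itself as a string, for example torch.FloatTensor.

# torch.tensor() is the API for creating a tensor; there are also NumPy-style creation functions such as torch.rand(n)
# torch.tensor(obj, dtype=..., requires_grad=...)  # specifying dtype and requires_grad is recommended

# Create NumPy arrays
arr1 = np.array([1.,1,1,0])
print(arr1,arr1.dtype)      # NumPy floats default to float64
arr2 = np.array([1,2,3,5])
print(arr2,arr2.dtype)    # NumPy integers default to int64 on most platforms
arr3 = np.array([1.,1,1,0],dtype='int32')
print(arr3,arr3.dtype)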

out:
[1. 1. 1. 0.] float64
[1 2 3 5] int64
[1 1 1 0] int32

# Create tensors
tensor1 = torch.tensor([1.,1,0],requires_grad=False) # requires_grad defaults to False
print(tensor1,tensor1.dtype,tensor1.type())   # torch floats default to single-precision float32
tensor2 = torch.tensor([1,2,3,4])
print(tensor2,tensor2.dtype,tensor2.type())   # torch integers default to int64
tensor3 = torch.tensor([1,2,3,4],dtype=torch.int32)
print(tensor3,tensor3.dtype,tensor3.type())

out:
tensor([1., 1., 0.]) torch.float32 torch.FloatTensor
tensor([1, 2, 3, 4]) torch.int64 torch.LongTensor
tensor([1, 2, 3, 4], dtype=torch.int32) torch.int32 torch.IntTensor

Difference between torch.tensor() and torch.Tensor(); torch.Tensor() is rarely used

# Difference between torch.tensor and torch.Tensor
# torch.tensor() is a plain Python function with the signature:
# torch.tensor(data, dtype=None, device=None, requires_grad=False)
# It infers the dtype from data and returns, for example, a torch.LongTensor, torch.FloatTensor, or torch.DoubleTensor.

# torch.Tensor() is a Python class; more precisely, it is an alias of the default tensor type torch.FloatTensor.
# torch.Tensor([1,2]) calls the Tensor constructor and produces a single-precision floating-point tensor.
    
tensor4 = torch.tensor([1,2])
print(tensor4,tensor4.type())
tensor5 = torch.Tensor([1,2])
print(tensor5,tensor5.type())

out:
tensor([1, 2]) torch.LongTensor
tensor([1., 2.]) torch.FloatTensor

1.2 Converting between tensors and NumPy arrays

# Tensors and NumPy arrays convert to each other. Most NumPy-style functions have a direct tensor equivalent;
# for those that do not, convert the tensor to NumPy, apply the function, then convert the result back to a tensor.
tensor6 = torch.tensor([1,3,4,6.],dtype=torch.int16)
print(tensor6,tensor6.type())
arr4 = tensor6.numpy()
print(arr4,arr4.dtype,type(arr4))

tensor7 = torch.from_numpy(arr4)
print(tensor7,tensor7.type(),tensor7.dtype)

out:
tensor([1, 3, 4, 6], dtype=torch.int16) torch.ShortTensor
[1 3 4 6] int16 <class 'numpy.ndarray'>
tensor([1, 3, 4, 6], dtype=torch.int16) torch.ShortTensor torch.int16
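
One detail the conversion above does not show is memory sharing. The sketch below is an addition with illustrative names only: on CPU, torch.from_numpy() shares memory with the source array, whereas torch.tensor() copies it, and a tensor that tracks gradients has to be detached before .numpy() can be called.

```python
import numpy as np
import torch

arr = np.array([1, 3, 4, 6], dtype=np.int16)
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy of the data

arr[0] = 100                     # modify the NumPy array in place
print(shared)                    # reflects the change
print(copied)                    # still holds the old value

# A tensor with requires_grad=True must be detached before converting
w = torch.tensor([1.0, 2.0], requires_grad=True)
w_np = w.detach().numpy()
print(type(w_np))
```

Use torch.tensor() (or .clone()) when later edits to the array should not leak into the tensor.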

1.3 Like NumPy arrays, tensors support element-wise operations

# Tensors support element-wise arithmetic such as + - * /, just like NumPy arrays; avoid mixing NumPy arrays and tensors in one expression and convert one side first
arr5 = np.array([1,2,3,4,5.5],dtype='float64')
arr6 = np.array([1,2,3,4,5],dtype='int16')
print(arr5 + arr6)   # NumPy arithmetic works across different dtypes

tensor8 = torch.tensor([1,2,3,4,5,5.6],dtype=torch.float64)
tensor9 = torch.tensor([1,2,3,4,5,6],dtype=torch.int64)
print(tensor8 + tensor9)  # tensor arithmetic also works across different dtypes

out:
[ 2.   4.   6.   8.  10.5]
tensor([ 2.0000,  4.0000,  6.0000,  8.0000, 10.0000, 11.6000],
       dtype=torch.float64)

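1.4 Many NumPy functions are also available for tensors

# Many NumPy functions have a PyTorch counterpart, for example the L2 norm
# Older PyTorch releases lacked some advanced matrix routines; current releases cover many of them in torch.linalg
# (eigenvalues and eigenvectors included). NumPy is still useful for whatever is not covered.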
arr7 = np.array([3,4,],dtype='int16')
tensor10 = torch.tensor([3,4],dtype=torch.float64)
print(np.linalg.norm(arr7))
print(torch.norm(tensor10))

out:
5.0
tensor(5., dtype=torch.float64)
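
For eigenvalues specifically, here is a minimal sketch that assumes a PyTorch release shipping the torch.linalg module (1.8 or later). The matrix is a made-up diagonal example, and NumPy is used only to cross-check the result.

```python
import numpy as np
import torch

# Symmetric matrix with known eigenvalues 2 and 3
sym = torch.tensor([[2.0, 0.0], [0.0, 3.0]], dtype=torch.float64)

# torch.linalg.eigh handles symmetric/Hermitian matrices and
# returns the eigenvalues in ascending order
eigenvalues, eigenvectors = torch.linalg.eigh(sym)
print(eigenvalues)

# Cross-check with NumPy on the same data
np_eigenvalues = np.linalg.eigvalsh(sym.numpy())
print(np_eigenvalues)
```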

1.5 NumPy names the dimension argument axis; tensors name it dim

# In NumPy the dimension argument is axis, while for tensors it is dim
arr8 = np.array([[3,4],[1,5]],dtype='int16')
tensor11 = torch.tensor([[3,4],[2,5]],dtype=torch.float64)
print(np.mean(arr8,axis=0))
print(torch.mean(tensor11,dim=0))

out:
[2.  4.5]
tensor([2.5000, 4.5000], dtype=torch.float64)

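2. Tensor methods and attributes

1. tensor.type(): the tensor's type
2. tensor.dtype: the data type of the tensor's elements
3. tensor.shape: the tensor's size
4. tensor.item(): the value of a single-element tensor as a Python object
5. tensor.data: the underlying values without gradient tracking (legacy; tensor.detach() is the preferred way to get them)
6. tensor.grad: the gradient accumulated for the tensor
7. tensor.float(): converts the tensor to floating point (float32)
8. tensor.int(): converts the tensor to an integer type (int32; use tensor.long() for int64)
9. int(tensor): converts a single-element tensor to a Python int
10. tensor.size(): same as tensor.shape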

tensor1 = torch.tensor(1.,dtype=torch.int32,requires_grad=False)
print(tensor1)
print(tensor1.type())
print(tensor1.dtype)
print(tensor1.shape)
print(tensor1.item())   
print(tensor1.float().type())  
print(int(tensor1))
print(tensor1.size())

out:
tensor(1, dtype=torch.int32)
torch.IntTensor
torch.int32
torch.Size([])
1
torch.FloatTensor
1
torch.Size([])

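3. Changing a tensor's shape

3.1 tensor.view() is similar to reshape() in NumPy; note how it differs from a transpose

# Changing shape: tensor.view() works like reshape() in NumPy. It only regroups the elements in their existing order, which is not the same as a transpose.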
N,H,W,C = 10000,28,28,3
tensor12 = torch.randn((N,H,W,C))
print(tensor12.shape)
print(tensor12.view(10000,3,784).shape)
print(tensor12.view(-1,3,392).shape)

out:
torch.Size([10000, 28, 28, 3])
torch.Size([10000, 3, 784])
torch.Size([20000, 3, 392])
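
The difference between view() and a transpose is easiest to see on a small tensor. The sketch below is an added illustration with made-up names: both results have shape (3, 2), but the elements end up in different positions.

```python
import torch

m = torch.arange(6).view(2, 3)
# m is [[0, 1, 2],
#       [3, 4, 5]]

reshaped = m.view(3, 2)   # regroups elements in memory order
transposed = m.t()        # swaps rows and columns

print(reshaped)
print(transposed)
print(torch.equal(reshaped, transposed))
```

For image data this means view() cannot turn (N, H, W, C) into (N, C, H, W); permute() is the operation that reorders axes.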

Using contiguous() with tensor.view()

# If x has gone through transpose or permute before view, call .contiguous() first to get a contiguous copy.
# view() requires the data to be laid out contiguously in memory; contiguous() returns a tensor with such a layout.
import torch
tensor1 = torch.arange(10).view((2,5))
# transpose the tensor
print(tensor1)
tensor1 = tensor1.permute((1,0))
print(tensor1)
tensor1 = tensor1.contiguous().view((1,-1))
print(tensor1)

out:
tensor([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])
tensor([[0, 5],
        [1, 6],
        [2, 7],
        [3, 8],
        [4, 9]])
tensor([[0, 5, 1, 6, 2, 7, 3, 8, 4, 9]])

3.2 Adding and removing tensor dimensions with unsqueeze and squeeze

import torch

tensor1 = torch.arange(20).view((2,2,5))
print(tensor1)
# add dimensions
tensor1 = tensor1.unsqueeze(dim=1)
tensor1 = tensor1.unsqueeze(dim=-1)
print(tensor1)
# remove the dimension at index 1; only dimensions of size 1 can be removed
print(tensor1.squeeze(dim=1))
# remove every dimension of size 1
print(tensor1.squeeze())
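
Since the snippet above prints whole tensors, the effect is easier to follow through the shapes alone. A small sketch with the same starting tensor (the variable names are chosen here for illustration):

```python
import torch

t = torch.arange(20).view(2, 2, 5)
t = t.unsqueeze(dim=1)     # insert a size-1 dimension at index 1
t = t.unsqueeze(dim=-1)    # append a size-1 dimension at the end
shape_after_unsqueeze = tuple(t.shape)

shape_squeeze_dim1 = tuple(t.squeeze(dim=1).shape)   # drops only index 1
shape_squeeze_all = tuple(t.squeeze().shape)         # drops every size-1 dim
shape_squeeze_dim2 = tuple(t.squeeze(dim=2).shape)   # index 2 has size 2: unchanged

print(shape_after_unsqueeze, shape_squeeze_dim1, shape_squeeze_all, shape_squeeze_dim2)
```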

3.3 Variable (legacy)

# Variable is legacy. A tensor can now be created with the requires_grad argument to say whether autograd should track it.
from torch.autograd import Variable
tensor14 = torch.tensor(2.0,requires_grad=True)
print(tensor14)
tensor15 = torch.tensor(1.0,requires_grad=True)
a = tensor14 + tensor15
print(a)   # grad_fn shows which backward function produced the result

out:
tensor(2., requires_grad=True)
tensor(3., grad_fn=<AddBackward0>)

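4. Moving tensors between devices: cuda or cpu

# Here model stands for an existing nn.Module and tensor for an existing torch.Tensor
# Option 1: use .to()
dev = 'cuda:0' if torch.cuda.is_available() else 'cpu'
device = torch.device(dev)
# moving a model (nn.Module.to() moves the parameters in place)
model.to(device)
# moving a tensor (Tensor.to() returns a new tensor, so reassign it)
tensor = tensor.to(device)

# Option 2: use .cuda() and .cpu()
# moving a model
model.cuda()
model.cpu()
# moving a tensor
tensor = tensor.cuda()
tensor = tensor.cpu()

Option 3: move a tensor off CUDA before exporting its data

# cuda_gpu is a flag that is True when predict lives on the GPU
if cuda_gpu:
    predict = predict.cpu().numpy()
else:
    predict = predict.numpy()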
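
The fragments above can be combined into one device-agnostic flow. This is a sketch under assumptions: the nn.Linear model and the random batch are placeholders invented for the example, and .detach().cpu().numpy() is used because .cpu() is a no-op for a tensor already on the CPU, which removes the need for the if/else.

```python
import torch
import torch.nn as nn

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(3, 2).to(device)     # placeholder model
batch = torch.rand(4, 3).to(device)    # placeholder input on the same device

predict = model(batch)

# detach() drops the autograd graph, cpu() moves the data if needed,
# numpy() then exports it
predict_np = predict.detach().cpu().numpy()
print(predict_np.shape)
```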

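5. Computing gradients in torch

5.1 Computing a gradient with torch

# Let f(x) = (x-2)**2 and compute the gradient f'(1)
def fp(x):
    """
        Hand-written derivative of f
    """
    return 2*(x-2)

# f itself, differentiated by torch below
def f(x):
    return (x-2)**2

x = torch.tensor([1.],requires_grad=True)
# gradient from the hand-written derivative
print("Gradient from the derivative function:",fp(x))

# gradient from torch
y  = f(x)         # forward pass at the initial value
y.backward()  # backward pass
print('Gradient computed by torch:',x.grad)   # the gradient is stored on x

out:
Gradient from the derivative function: tensor([-2.], grad_fn=<MulBackward0>)
Gradient computed by torch: tensor([-2.])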

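5.2 Gradient descent parameter updates with torch

# Function: f(x) = (x-2) ** 2
def fp(x):
    return 2*(x-2)
# the function to minimize
def f(x):
    return (x-2)**2
# initial value of x as a tensor
x = torch.tensor([5.],requires_grad=True)
lr = 0.25

print("{}\t\t{}\t\t{}\t\t{}".format('iter','x','grad_fp','grad_torch'))
for i in range(20):
    y = f(x)  # forward pass at the current x
    y.backward()  # backward pass computes the gradient for this step
    x.data = x.data - lr * x.grad    #  update x
    
    # grad_fp is evaluated at the updated x, while grad_torch is the gradient at x before the update
    print('{}\t\t{:.3f}\t\t{:.3f}\t\t{:.3f}'.format(i,x.item(),fp(x).item(),x.grad.item()))   #  a tensor cannot be formatted with {:.3f} directly; extract the value with .item()
    
    # reset the gradient
    x.grad.detach_()
    x.grad.zero_()  # gradients accumulate, so zero them after each update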
out:
iter		x		grad_fp		grad_torch
0		3.500		3.000		6.000
1		2.750		1.500		3.000
2		2.375		0.750		1.500
3		2.188		0.375		0.750
4		2.094		0.188		0.375
5		2.047		0.094		0.188
6		2.023		0.047		0.094
7		2.012		0.023		0.047
8		2.006		0.012		0.023
9		2.003		0.006		0.012
10		2.001		0.003		0.006
11		2.001		0.001		0.003
12		2.000		0.001		0.001
13		2.000		0.000		0.001
14		2.000		0.000		0.000
15		2.000		0.000		0.000
16		2.000		0.000		0.000
17		2.000		0.000		0.000
18		2.000		0.000		0.000
19		2.000		0.000		0.000
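
The same loop can also be written with torch.no_grad() instead of assigning to x.data, which is the pattern usually recommended in current PyTorch. A minimal sketch with the same function, starting point and learning rate:

```python
import torch

def f(x):
    return (x - 2) ** 2

x = torch.tensor([5.0], requires_grad=True)
lr = 0.25

for step in range(20):
    y = f(x)
    y.backward()                # gradient of f at the current x
    with torch.no_grad():       # the update itself must not be tracked
        x -= lr * x.grad
    x.grad.zero_()              # clear the accumulated gradient

print(x.item())
```

Each step halves the distance to the minimum at x = 2, which matches the halving pattern in the table above.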

6. nn.Module, the base class for models in torch

6.1 Defining a model with nn.Module

model.parameters() returns the parameters of every layer assigned to self in __init__(), so a layer that forward() never uses is still handed to the optimizer. When defining layers on self it is therefore good practice to (1) define them in the order forward() runs them and (2) keep a one-to-one correspondence between the layers on self and the layers used in forward().

# Subclass the base class nn.Module
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model,self).__init__()
        # define the layers here, top to bottom
        self.conv1 = nn.Conv2d(3,6,(3,3),stride=1,padding=1)
        self.conv2 = nn.Conv2d(6,1,(3,3),stride=1,padding=1)
    
    # override the forward pass
    def forward(self,x):
        x = self.conv1(x)
        x = self.conv2(x)
        return x
x = torch.rand(size=(1,3,3,3))
model = Model()
print(x)
print(model(x))
out:
tensor([[[[0.8633, 0.6278, 0.6290],
          [0.4292, 0.7160, 0.0500],
          [0.1715, 0.9834, 0.3542]],

         [[0.7432, 0.5427, 0.2108],
          [0.7793, 0.1968, 0.5308],
          [0.4827, 0.7692, 0.0608]],

         [[0.6688, 0.5944, 0.9697],
          [0.4360, 0.5204, 0.6005],
          [0.7696, 0.8666, 0.7157]]]])
tensor([[[[ 0.0897,  0.0145, -0.0479],
          [ 0.1160, -0.0082,  0.0785],
          [-0.0058, -0.1679, -0.0641]]]], grad_fn=<MkldnnConvolutionBackward>)
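
The remark at the start of this section about layers that forward() never uses can be verified with a toy module. The class and layer names below are hypothetical, chosen only for this sketch.

```python
import torch
import torch.nn as nn

class ModelWithUnusedLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.used = nn.Linear(3, 2)
        self.unused = nn.Linear(3, 2)   # registered but never called in forward()

    def forward(self, x):
        return self.used(x)

net = ModelWithUnusedLayer()

# Both layers show up in the parameter list an optimizer would receive
names = [name for name, _ in net.named_parameters()]
print(names)

# After a backward pass only the used layer has gradients
net(torch.rand(4, 3)).sum().backward()
print(net.used.weight.grad is not None, net.unused.weight.grad is None)
```

The unused parameters receive no gradient, so an optimizer step leaves them untouched unless weight decay is applied to them.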

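7. Neural-network layers in torch: nn

1. Difference between torch.nn (nn) and torch.nn.functional (F)
    Operations under nn are classes: instantiate an object first, then call it
    Operations under F are functions: call them directly, no instantiation needed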
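
A minimal side-by-side sketch of the two styles, using ReLU as the example (the input values are made up):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([-3.0, -1.0, 0.0, 2.0])

relu_layer = nn.ReLU()        # class style: build the object once ...
out_module = relu_layer(x)    # ... then call it like a function

out_functional = F.relu(x)    # functional style: call directly

print(out_module, out_functional)
```

The class style is convenient inside nn.Sequential or as an attribute of a model, while the functional style is handy inside forward() for operations without parameters.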

7.1 Fully connected layer nn.Linear(), a linear model

# Fully connected layer: the last dimension of the input must equal in_features; a 2D (batch, in_features) tensor is the usual case
in_feature = 3
out_feature = 4
linear_model = nn.Linear(in_features=in_feature,out_features=out_feature,bias=True)  # define the fully connected layer and its parameters

# Create the input tensor
input_ = torch.tensor([[2.,3.,4.],[3.,5.,6.]])       # note the float literals such as 2.; without them the tensor would be created with an integer dtype
print(input_,input_.type())
output_ = linear_model(input_)

# Results
print(input_.shape,input_.size(),input_.nelement()) 
print(output_.shape,output_.size(),output_.nelement())
print('='*30)
print('model weights:',linear_model.weight)      #  tip: the weight has shape (out_features, in_features): dim 0 is the output size (4), dim 1 is the input size (3)
print('model bias:',linear_model.bias)
# Print the parameters with a for loop
print('*'*30)
for i in linear_model.parameters():
    print(i)

out:
tensor([[2., 3., 4.],
        [3., 5., 6.]]) torch.FloatTensor
torch.Size([2, 3]) torch.Size([2, 3]) 6
torch.Size([2, 4]) torch.Size([2, 4]) 8
==============================
model weights: Parameter containing:
tensor([[ 0.5396, -0.0409, -0.4080],
        [-0.4170,  0.0430,  0.2176],
        [ 0.0727,  0.2042, -0.3300],
        [ 0.0542, -0.3428, -0.4786]], requires_grad=True)
model bias: Parameter containing:
tensor([-0.0985,  0.4785,  0.2246,  0.0755], requires_grad=True)
******************************
Parameter containing:
tensor([[ 0.5396, -0.0409, -0.4080],
        [-0.4170,  0.0430,  0.2176],
        [ 0.0727,  0.2042, -0.3300],
        [ 0.0542, -0.3428, -0.4786]], requires_grad=True)
Parameter containing:
tensor([-0.0985,  0.4785,  0.2246,  0.0755], requires_grad=True)

7.2 Activation functions

Activation functions can be created as nn objects or called directly from F.

# ReLU
relu_fun1= nn.ReLU(inplace=True)  # create a ReLU object; inplace defaults to False. With inplace=True the input is overwritten to save memory; the default is usually fine
input_1 = torch.tensor([-3.,-1.,1.,0.,5.,10.])
output_1 = relu_fun1(input_1)
print(input_1)   # already overwritten because inplace=True, so it matches output_1
print(output_1)  # ReLU leaves positive values unchanged (no saturation on the positive side)

out:
tensor([ 0.,  0.,  1.,  0.,  5., 10.])
tensor([ 0.,  0.,  1.,  0.,  5., 10.])

# tanh
tanh_fun= nn.Tanh()  # create a Tanh object; Tanh has no inplace argument
input_2 = torch.tensor([-1.0,1.,0.,10.])
output_2 = tanh_fun(input_2)
print(input_2)
print(output_2)  

out:
tensor([-1.,  1.,  0., 10.])
tensor([-0.7616,  0.7616,  0.0000,  1.0000])

# sigmoid
sigmoid_fun= nn.Sigmoid()  # create a Sigmoid object
input_3 = torch.tensor([-1.,1.,0.,10.])
output_3 = sigmoid_fun(input_3)
print(input_3)
print(output_3) 

out:
tensor([-1.,  1.,  0., 10.])
tensor([0.2689, 0.7311, 0.5000, 1.0000])

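7.3 Defining a convolution layer with nn.Conv2d

Convolution layers take input of shape (N, C, H, W), where N is the batch size and C is the number of input channels.
Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
in_channels: number of input channels
out_channels: number of output channels
kernel_size: size of the kernel; an int or a tuple, where the tuple gives (height, width)
stride=1: step of the kernel; an int or a tuple, where the tuple gives the step along (height, width)
padding=0: number of padding layers added around the image; 0 means no padding. For example, a 3×3 image with padding=1 becomes 5×5 (one extra ring around the border)
dilation=1: spacing between kernel elements; the default 1 means an ordinary convolution, and dilation=2 leaves a gap of one pixel between neighbouring kernel elements
groups=1: number of groups the input channels are split into; with 1 group every input channel is connected to every output channel
bias=True: whether to add a bias; the default is usually fine
padding_mode='zeros': what the padding is filled with; the default is usually fine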
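
How these parameters determine the output size can be sketched with the standard formula out = floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1. The helper conv_out_size below is a hypothetical name introduced for this example, and the layers use a made-up 28×28 single-channel input.

```python
import torch
import torch.nn as nn

def conv_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

x = torch.rand(1, 1, 28, 28)   # (N, C, H, W)

plain = nn.Conv2d(1, 1, kernel_size=3)               # no padding
padded = nn.Conv2d(1, 1, kernel_size=3, padding=1)   # keeps the size
dilated = nn.Conv2d(1, 1, kernel_size=3, dilation=2) # wider receptive field

print(plain(x).shape, conv_out_size(28, 3))
print(padded(x).shape, conv_out_size(28, 3, padding=1))
print(dilated(x).shape, conv_out_size(28, 3, dilation=2))
```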

# Create an image (a flattened 28x28 grayscale picture)
image = np.array([0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0.3803922 , 0.37647063, 0.3019608 ,0.46274513, 0.2392157 , 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0.3529412 , 0.5411765 , 0.9215687 ,0.9215687 , 0.9215687 , 0.9215687 , 0.9215687 , 0.9215687 ,0.9843138 , 0.9843138 , 0.9725491 , 0.9960785 , 0.9607844 ,0.9215687 , 0.74509805, 0.08235294, 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.54901963,0.9843138 , 0.9960785 , 0.9960785 , 0.9960785 , 0.9960785 ,0.9960785 , 0.9960785 , 0.9960785 , 0.9960785 , 0.9960785 ,0.9960785 , 0.9960785 , 0.9960785 , 0.9960785 , 0.9960785 ,0.7411765 , 0.09019608, 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0.8862746 , 0.9960785 , 0.81568635,0.7803922 , 0.7803922 , 0.7803922 , 0.7803922 , 0.54509807,0.2392157 , 0.2392157 , 0.2392157 , 0.2392157 , 0.2392157 ,0.5019608 , 0.8705883 , 0.9960785 , 0.9960785 , 0.7411765 ,0.08235294, 0., 0., 0., 0.,0., 0., 0., 0., 0.,0.14901961, 0.32156864, 0.0509804 , 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.13333334,0.8352942 , 0.9960785 , 0.9960785 , 0.45098042, 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0.32941177, 0.9960785 ,0.9960785 , 0.9176471 , 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 
0.,0., 0., 0., 0., 0.,0., 0.32941177, 0.9960785 , 0.9960785 , 0.9176471 ,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0.4156863 , 0.6156863 ,0.9960785 , 0.9960785 , 0.95294124, 0.20000002, 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0.09803922, 0.45882356, 0.8941177 , 0.8941177 ,0.8941177 , 0.9921569 , 0.9960785 , 0.9960785 , 0.9960785 ,0.9960785 , 0.94117653, 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0.26666668, 0.4666667 , 0.86274517,0.9960785 , 0.9960785 , 0.9960785 , 0.9960785 , 0.9960785 ,0.9960785 , 0.9960785 , 0.9960785 , 0.9960785 , 0.5568628 ,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0.14509805, 0.73333335,0.9921569 , 0.9960785 , 0.9960785 , 0.9960785 , 0.8745099 ,0.8078432 , 0.8078432 , 0.29411766, 0.26666668, 0.8431373 ,0.9960785 , 0.9960785 , 0.45882356, 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0.4431373 , 0.8588236 , 0.9960785 , 0.9490197 , 0.89019614,0.45098042, 0.34901962, 0.12156864, 0., 0.,0., 0., 0.7843138 , 0.9960785 , 0.9450981 ,0.16078432, 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0.6627451 , 0.9960785 ,0.6901961 , 0.24313727, 0., 0., 0.,0., 0., 0., 0., 0.18823531,0.9058824 , 0.9960785 , 0.9176471 , 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0.07058824, 0.48627454, 0., 0.,0., 0., 0., 0., 0.,0., 0., 0.32941177, 0.9960785 , 0.9960785 ,0.6509804 , 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0.54509807, 0.9960785 , 0.9333334 , 0.22352943, 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0.8235295 , 0.9803922 , 0.9960785 ,0.65882355, 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0.9490197 , 0.9960785 , 0.93725497, 0.22352943, 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0.34901962, 0.9843138 , 0.9450981 ,0.3372549 , 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 
0., 0.,0., 0., 0., 0., 0.01960784,0.8078432 , 0.96470594, 0.6156863 , 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0.01568628, 0.45882356, 0.27058825,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0., 0.,0., 0., 0., 0.], dtype=np.float32)
input_tensor = torch.tensor(image,dtype=torch.float32).view(1,1,28,28)   # reshape the image to (batch, input channels, H, W)
print(input_tensor.shape)

# 2D convolution
conv_2d = nn.Conv2d(1,1,3)
print("Initial kernel weights:",conv_2d.weight)
# Hand-written averaging kernel, normalized so the weights sum to 1 (a rough stand-in for a Gaussian kernel)
kernel_1 = torch.tensor([[1.,2.,1.],[2,4,1],[5,6,7]]) / 29
conv_2d.weight.data[:] = kernel_1  # overwrite the layer's kernel
print("Manually assigned kernel weights:",conv_2d.weight)

# Apply the convolution
output_tensor = conv_2d(input_tensor)
print(output_tensor.shape)


print('Original image:')
plt.imshow(input_tensor.clone().detach().view(28,28).numpy())  # .clone() copies the data, so changes to the copy do not touch the source; .detach() cuts the result off from the autograd graph
plt.show()

# When training a network we may want to keep some parameters fixed
# and adjust only the others,
# or train a branch without letting its gradients affect the main network.
# detach() cuts off backpropagation along such branches.

print('Image after convolution:')
plt.imshow(output_tensor.view(26,26).detach().numpy())
plt.show()

out:
Initial kernel weights: Parameter containing:
tensor([[[[-0.1365, -0.0650, -0.1026],
          [ 0.3023,  0.0111, -0.2645],
          [ 0.2274, -0.1255,  0.1191]]]], requires_grad=True)
Manually assigned kernel weights: Parameter containing:
tensor([[[[0.0345, 0.0690, 0.0345],
          [0.0690, 0.1379, 0.0345],
          [0.1724, 0.2069, 0.2414]]]], requires_grad=True)
torch.Size([1, 1, 26, 26])

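7.4 tensor.clone().detach()

.clone() makes a copy of the data, but the copy stays in the autograd graph, so gradients still flow back to the source and accumulate there
.detach() returns a tensor that is cut off from the autograd graph (it shares storage with the source)

tensor = tensor.clone().detach()  # copy the tensor and cut the copy off from the network's gradients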
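
The behaviour of the two calls can be checked on a tiny example. The sketch below uses made-up variable names and shows that gradients flow through clone(), that detach() shares storage with the source, and that clone().detach() gives an independent snapshot.

```python
import torch

src = torch.tensor([1.0, 2.0], requires_grad=True)

# clone(): new memory, still connected to the graph
cloned = src.clone()
(cloned * 3).sum().backward()
print(src.grad)                  # gradient reached the source

# detach(): same memory, no gradient tracking
detached = src.detach()
print(detached.requires_grad, detached.data_ptr() == src.data_ptr())

# clone().detach(): independent copy without gradient tracking
snapshot = src.clone().detach()
snapshot[0] = 100.0              # does not affect src
print(src, snapshot)
```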

7.5 Pipelines in PyTorch: nn.Sequential()

# Using a "pipeline"
# Internally nn.Sequential() simply loops over its layers and runs them in order
d_in = 3
d_hidden = 4
d_out = 1

seq_model = nn.Sequential(
                                nn.Linear(d_in,d_hidden),  # (3,4)
                                nn.Tanh(),
                                nn.Linear(d_hidden,d_out), # (4,1)
                                nn.Sigmoid()
    )
input_tensor2 = torch.tensor([[1.,2.,1.],[4,5,7]])  # 2 * 3
output_tensor2 = seq_model(input_tensor2)

7.6 Loss functions

Loss functions can be created as nn objects or computed directly with the functions under F.

# MSE loss
loss_fn = torch.nn.MSELoss(reduction='none')   # keep the per-element losses
a=torch.tensor(np.array([[1.,2],[3,4]]))
b=torch.tensor(np.array([[3.,3],[4,8]]))
print(loss_fn(a,b))
loss_fn = torch.nn.MSELoss(reduction='mean')   # average over all elements (the default)
print(loss_fn(a,b))

out:
tensor([[ 4.,  1.],
        [ 1., 16.]], dtype=torch.float64)
tensor(5.5000, dtype=torch.float64)

"""
How nn.CrossEntropyLoss(), nn.Softmax(), torch.log(), nn.LogSoftmax() and nn.NLLLoss() relate
nn.CrossEntropyLoss() is equivalent to nn.LogSoftmax() followed by nn.NLLLoss()
CrossEntropyLoss: input shape (batch, classes), target shape (batch,)
"""
# Using nn.Softmax(), torch.log(), nn.LogSoftmax() and nn.NLLLoss()
import torch
import torch.nn as nn

# Generate logits and target data
y_target = torch.tensor([0,2,1,0]) # 1D tensor; here batch = 4, classes = 3
print(y_target)
y_logits = torch.randn(4,3)   # samples from a normal distribution, 2D tensor
print(y_logits)

# nn.Softmax() followed by torch.log()
soft = nn.Softmax(dim=1)
y_logits_softmax = soft(y_logits)
print(y_logits_softmax)
# take the log to get log-probabilities (NLLLoss applies the minus sign later)
y_logits_logsoftmax = torch.log(y_logits_softmax)
print("Log-probabilities computed by hand:",y_logits_logsoftmax)

# nn.LogSoftmax() gives the same log-probabilities in one step
logsoft = nn.LogSoftmax(dim=1)
y_logits_logsoftmax = logsoft(y_logits)
print('Log-probabilities from torch:',y_logits_logsoftmax)

# nn.NLLLoss() turns them into the cross-entropy loss
nllloss  = nn.NLLLoss(reduction='mean')  # reduction='mean' is the default
cross_entropy = nllloss(y_logits_logsoftmax,y_target)  # first argument: log-probabilities; second argument: target
print('cross entropy computed step by step:',cross_entropy)

# nn.CrossEntropyLoss() computes the loss directly
# nn.CrossEntropyLoss() arguments:
# weight: (tensor, optional) a 1D tensor giving a weight for each class; with C classes its length should be C.
# Useful when the training set is imbalanced.
# reduction='mean' is the default
cross_loss_func = nn.CrossEntropyLoss(reduction='mean')
cross_entropy = cross_loss_func(y_logits,y_target)  # first argument: logits [batch, classes]; second argument: target [batch]
print('cross entropy computed directly by torch:',cross_entropy)

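8. Common Model methods and attributes

8.1 Ways to get model parameters: .state_dict(), .named_parameters(), .parameters(), .named_children()

print(model.parameters())   # iterator over the parameters; it yields only the tensors, without layer-name keys
print(model.named_parameters())  # iterator of (name, parameter) pairs, similar to .state_dict().items()
print(model.named_children())   # iterator of (name, layer) pairs; list(model.named_children())[:3] gives the first three layers
print(model.state_dict()) # OrderedDict of the model's parameters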

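8.2 Getting a layer and its parameters from a Model, and changing them

1. Getting a layer inside the model:
    model.conv2   ------ the layer named conv2
    list(model.named_children())[:3]  ---- the first three layers (named_children() returns a generator, so wrap it in list() before slicing)

2. Getting the parameters of a layer
    model.conv2.weight.data   ------ the weight values of the layer named conv2
    model.conv2.weight.data = tensor1  ------ manually replaces the conv2 weights with tensor1
    # the following also work on a single layer
    print(model.conv2.parameters())
    print(model.conv2.named_parameters())
    print(model.conv2.named_children())
    print(model.conv2.state_dict())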
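
Because most of these methods return generators, printing them directly only shows a generator object. The sketch below iterates over them instead; it redefines a two-layer model like the one in section 6.1 so that it runs on its own, and the fill value 0.5 is arbitrary.

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, (3, 3), stride=1, padding=1)
        self.conv2 = nn.Conv2d(6, 1, (3, 3), stride=1, padding=1)

    def forward(self, x):
        return self.conv2(self.conv1(x))

model = Model()

# (name, parameter) pairs, collected into a dict of shapes
param_shapes = {name: tuple(p.shape) for name, p in model.named_parameters()}
print(param_shapes)

# named_children() is a generator: convert to a list before slicing
first_layers = list(model.named_children())[:1]
print(first_layers)

# Overwrite the weights of one layer without tracking the change in autograd
with torch.no_grad():
    model.conv2.weight.fill_(0.5)
print(model.state_dict()['conv2.weight'].unique())
```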

8.3、model.modules() nn.ModuleList()

For details, see the article on using pretrained ResNet models (the subsection on setting per-layer learning rates).

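9. Optimizers in torch

Notes excerpted from the torch.optim page of the PyTorch Chinese documentation.
class torch.optim.SGD(params, lr=, momentum=0, dampening=0, weight_decay=0, nesterov=False)
Implements stochastic gradient descent (optionally with momentum).
Nesterov momentum is based on the formulas in On the importance of initialization and momentum in deep learning http://www.cs.toronto.edu/~hinton/absps/momentum.pdf.
params (iterable): an iterable of parameters to optimize, or dicts defining parameter groups
lr (float): learning rate
momentum (float, optional): momentum factor (default: 0)
weight_decay (float, optional): weight decay (L2 penalty) (default: 0)
dampening (float, optional): dampening for momentum (default: 0)
nesterov (bool, optional): enables Nesterov momentum (default: False)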
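
A minimal training loop shows how these arguments are used in practice. Everything concrete here is an assumption made for the example: the one-feature linear model, the toy data y = 2x + 1, and the values lr=0.1 and momentum=0.9.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.MSELoss()

# Toy regression data following y = 2x + 1
x = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
y = 2 * x + 1

initial_loss = loss_fn(model(x), y).item()
for _ in range(200):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(x), y)    # forward pass
    loss.backward()                # compute gradients
    optimizer.step()               # apply the SGD update
final_loss = loss_fn(model(x), y).item()

print(initial_loss, final_loss)
```

The order zero_grad(), backward(), step() plays the same role as the manual update and gradient reset in section 5.2.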

See the article on using the pretrained resnet50 model (section: setting per-layer learning rates in the optimizer).

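10. Data loading: the Dataset class and DataLoader

Using the Dataset class and DataLoader
Dataset is a special class in torch used to build iterators over the training and test sets.
To define a new Dataset subclass, override the parent's __len__ and __getitem__ methods (the standard pattern):
    __init__():

    __len__():

    __getitem__():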

class torch.utils.data.DataLoader(
    dataset,
    batch_size=1,
    shuffle=False,
    sampler=None,
    batch_sampler=None,
    num_workers=0,
    collate_fn=<function default_collate>,
    pin_memory=False,
    drop_last=False,
    timeout=0,
    worker_init_fn=None)

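DataLoader provides a single-process or multi-process iterator over a dataset.
Meaning of the key parameters:
- shuffle: when True, the data is reshuffled at every epoch
- collate_fn: how individual samples are merged into a batch; a custom function can implement exactly the behavior needed
- drop_last: what to do with the samples left over when the dataset length is not divisible by batch_size. True drops them; otherwise they are kept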

10.1 Using Dataset and DataLoader

from torch.utils.data import Dataset,DataLoader
import numpy as np
import torch
class Data_Set(Dataset):
    """
        Builds a dataset
    """
    def __init__(self,X,Label=None):
        """
            X: 2D numpy int64
            Label: 1D numpy int64
        """
        self.X = X
        self.Label = Label
        
    def __len__(self):
        return len(self.X)
    
    def __getitem__(self,idx):
        if self.Label is not None:
            X = torch.tensor(self.X[idx],dtype=torch.int64) # torch's default integer dtype
            Label = torch.tensor(self.Label[idx],dtype=torch.int64)
            return X,Label
        # at predict time there is no label
        else:
            X = torch.tensor(self.X[idx],dtype=torch.int64)
            return X

use_cuda = torch.cuda.is_available()   # combine with a command-line flag such as args.no_cuda if the script defines one
# set the random seed
torch.cuda.manual_seed(200) if use_cuda else torch.manual_seed(200)
device = torch.device('cuda:0' if use_cuda else 'cpu')
# extra DataLoader arguments used when the GPU is available
kwargs = {'num_workers':1,'pin_memory':True} if use_cuda else {}

x_train = np.random.randint(0,100,size=(99,10))   # integer features, matching the int64 dtype used in the dataset
y_train = np.random.randint(0,2,size=99)
dataset = Data_Set(x_train,y_train)
dataloader = DataLoader(dataset=dataset,batch_size=4,shuffle=True,**kwargs)   # num_workers comes from kwargs; passing it twice raises a TypeError
for i_batch,batch_data in enumerate(dataloader):
    print(i_batch,batch_data)   # batch_data holds both x and y
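10.2 The collate_fn parameter of DataLoader

"""
In short, collate_fn maps the samples of one batch to tensors. Without special requirements there is no need to write one,
because the default default_collate is used. The parameter accepts the name of a custom function.
"""
# Custom collate_fn
# template:
def collate_fn(batch):
    # sort
    # pad
    # len
    # return text, label, lens (lens holds the true, unpadded sequence lengths)
    pass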
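
One possible way to fill in the template is sketched below. The toy samples, the padding value 0 and the choice of torch.nn.utils.rnn.pad_sequence are assumptions made for this example and do not come from the original post.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

# Toy dataset: each sample is (list of token ids, label)
samples = [([1, 2, 3], 0), ([4, 5], 1), ([6], 0), ([7, 8, 9, 10], 1)]

def collate_fn(batch):
    # sort: longest sequence first
    batch = sorted(batch, key=lambda item: len(item[0]), reverse=True)
    seqs = [torch.tensor(tokens, dtype=torch.int64) for tokens, _ in batch]
    # len: true lengths before padding
    lens = torch.tensor([len(s) for s in seqs], dtype=torch.int64)
    # pad: right-pad every sequence with 0 to the longest length in the batch
    text = pad_sequence(seqs, batch_first=True, padding_value=0)
    label = torch.tensor([lab for _, lab in batch], dtype=torch.int64)
    return text, label, lens

loader = DataLoader(samples, batch_size=4, shuffle=False, collate_fn=collate_fn)
text, label, lens = next(iter(loader))
print(text)
print(label, lens)
```

Sorting by length is only needed when the batch is later passed to utilities that expect sorted sequences; padding alone is enough to stack variable-length samples.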