Deep Learning Notes: PyTorch

lock cylinder

Last modified: 2023-10-30 17:32:44

Tags: deep learning notes, pytorch

First published: 2023-10-30 17:10:44

Article link: https://blog.csdn.net/qq_34144750/article/details/134121863

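I. Creating tensors

A PyTorch tensor is similar to a NumPy ndarray, but its elements must be numeric. A device can be specified both when constructing a tensor directly and when calling a creation function.

1. torch.empty(size, dtype=None, device=None, requires_grad=False): returns an uninitialized tensor of shape size.

# The values are whatever happens to be in the allocated memory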
torch.empty(2)
tensor([6.9389e-18, 4.5836e-41])

torch.empty(2,3)
tensor([[1.3458e+22, 1.0186e-11, 4.1302e-08],
        [2.6951e-09, 2.1349e+20, 2.5810e-06]])

2. torch.rand(size, dtype=None, device=None, requires_grad=False): fills the tensor with samples from the uniform distribution on [0, 1).

torch.rand(2)
tensor([0.3329, 0.5387])

torch.rand(2,3,4)
tensor([[[0.7969, 0.8706, 0.3243, 0.9121],
         [0.8215, 0.5874, 0.4392, 0.0104],
         [0.3109, 0.5347, 0.0642, 0.0348]],

        [[0.4504, 0.8025, 0.6011, 0.2948],
         [0.7261, 0.3526, 0.5889, 0.8369],
         [0.7874, 0.4325, 0.5080, 0.1977]]])

3. torch.randn(size, dtype=None, device=None, requires_grad=False): fills the tensor with samples from the standard normal distribution.

torch.randn(2)
tensor([1.2888, 0.5438])

torch.randn(2,3,4)
tensor([[[-1.4809,  0.5076,  0.6966, -1.2156],
         [-0.4119,  0.3242,  0.4690, -0.2551],
         [ 0.2470,  0.3578, -1.5854, -1.6585]],

        [[-1.2318, -2.5127,  2.3537, -0.0930],
         [ 0.9865,  0.7953,  0.1861, -1.0153],
         [ 0.5773,  0.5468,  0.2521, -0.4733]]])

4. torch.normal(mean, std, size): fills the tensor with samples from a normal distribution with the given mean and standard deviation.

torch.normal(0,1,(2,3))
tensor([[-1.2514,  0.6536,  0.9443],
        [ 1.7148,  0.9770,  0.6666]])

5. tensor.uniform_(from=0, to=1): resamples the elements of the tensor, in place, from the uniform distribution between from and to.

a
tensor([[-1.7225, -0.5679,  0.1066],
        [-1.6192, -0.7203, -0.1121]])

a.uniform_(1,2)
tensor([[1.6945, 1.8928, 1.5203],
        [1.4612, 1.6704, 1.0924]])

a
tensor([[1.6945, 1.8928, 1.5203],
        [1.4612, 1.6704, 1.0924]])

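Remark:

1. In PyTorch, a method whose name ends with an underscore (_) is an in-place method: the original tensor is modified.

2. uniform_ only works on floating-point tensors, not on integer ones.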
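The underscore convention can be checked directly. The sketch below is illustrative only (the tensor names a, b, c are not from the original notes); it compares an out-of-place call with its in-place counterpart.

```python
import torch

a = torch.zeros(2, 3)

# Out-of-place: returns a new tensor and leaves `a` untouched
b = a.add(1)
a_unchanged = torch.equal(a, torch.zeros(2, 3))

# In-place (trailing underscore): modifies `a` itself and returns it
c = a.add_(1)
shares_memory = c.data_ptr() == a.data_ptr()

print(a_unchanged, shares_memory)
print(a)
```

The same pattern applies to uniform_, zero_, requires_grad_ and the other underscore methods used in these notes.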

6. torch.zeros(size, dtype=None, device=None, requires_grad=False): returns a tensor of shape size filled with zeros.

torch.zeros(2)
tensor([0., 0.])

torch.zeros(2,3)
tensor([[0., 0., 0.],
        [0., 0., 0.]])

7. torch.ones(size, dtype=None, device=None, requires_grad=False): returns a tensor of shape size filled with ones.

torch.ones(2)
tensor([1., 1.])

torch.ones(2,3)
tensor([[1., 1., 1.],
        [1., 1., 1.]])

8. torch.eye(n, m=None, dtype=None, device=None, requires_grad=False): returns an n x m matrix with ones on the main diagonal and zeros elsewhere.

torch.eye(2,3)
tensor([[1., 0., 0.],
        [0., 1., 0.]])

torch.eye(2)  # m defaults to n when omitted
tensor([[1., 0.],
        [0., 1.]])

9. torch.arange(start=0, end, step=1, dtype=None, device=None, requires_grad=False): like numpy.arange, returns a 1-D tensor of evenly spaced values in [start, end) with step size step.

torch.arange(3.2)
tensor([0., 1., 2., 3.])

torch.arange(1,3.2,0.3)
tensor([1.0000, 1.3000, 1.6000, 1.9000, 2.2000, 2.5000, 2.8000, 3.1000])

10. torch.linspace(start, end, steps, dtype=None, device=None, requires_grad=False): like numpy.linspace, returns a 1-D tensor of steps evenly spaced values over [start, end].

torch.linspace(1,2,3)
tensor([1.0000, 1.5000, 2.0000])

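11. torch.tensor(data, *, dtype=None, device=None, requires_grad=False)

data	a list, tuple, or ndarray; a DataFrame is not accepted
dtype	the desired data type
device	defaults to the CPU; with device='cuda' the tensor is placed on the GPU
requires_grad	whether autograd tracks operations on the tensor; defaults to False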

torch.tensor(2)
tensor(2)

torch.tensor([1,2])
tensor([1, 2])

torch.tensor((1,2))
tensor([1, 2])

torch.tensor(np.array(list("abc")))
# TypeError: an array of strings cannot be converted to a tensor

torch.tensor(np.arange(9))
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8])

12. torch.randperm(n, dtype=torch.int64, device=None, requires_grad=False, pin_memory=False): returns a random permutation of the integers from 0 to n-1.

torch.randperm(5)
tensor([4, 0, 2, 3, 1])

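13. Converting a DataFrame to a tensor

Because torch.tensor does not take a DataFrame as input, convert the DataFrame to an ndarray first and then turn that into a tensor with torch.from_numpy.

import pandas as pd
import torch

# Create a DataFrame
df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [4, 5, 6]
})

# Convert the DataFrame to a NumPy array
array = df.values

# Convert the NumPy array to a PyTorch tensor
tensor = torch.from_numpy(array)

# Print the result
print(tensor)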
tensor([[1, 4],
        [2, 5],
        [3, 6]])
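One detail worth knowing about this conversion: torch.from_numpy shares memory with the source array, while torch.tensor makes a copy. The following is a small sketch using a plain NumPy array in place of df.values; the names shared and copied are illustrative.

```python
import numpy as np
import torch

array = np.array([[1, 4], [2, 5], [3, 6]])

shared = torch.from_numpy(array)  # shares memory with `array`
copied = torch.tensor(array)      # independent copy of the data

# Modify the NumPy array after the conversion
array[0, 0] = 100

print(shared[0, 0].item())  # reflects the change
print(copied[0, 0].item())  # still the original value
```

If later edits to the array must not leak into the tensor, torch.tensor (or calling .clone() on the result) is the safer choice.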

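II. Tensor attributes and methods

tensor.size(dim=None)	Returns the shape of the tensor by default; when dim is given, returns the size of that dimension as an int.
tensor.shape	Returns the shape of the tensor.
tensor.reshape(size) tensor.view(size)	Return a tensor with the new shape; the shape of the original tensor is unchanged. view always shares memory with the original, and reshape does so when it can.
tensor.clone()	Creates a copy of the tensor.
tensor.item()	Converts the tensor to a Python scalar; only valid for single-element tensors.
tensor.type(dtype=None)	Without dtype, returns the type of the tensor; with dtype, returns the tensor cast to dtype.
tensor.numpy()	Converts the tensor to an ndarray. This fails if the tensor requires grad; call tensor.detach() (or read tensor.data) first and then numpy().
torch.from_numpy(ndarray)	Converts an ndarray to a tensor.
tensor.to(dtype=None, device=None)	Returns the tensor with a different dtype or on a different device (CPU/CUDA).
torch.device(*args)	An object describing the device on which a tensor is stored.
x.grad	Used after y.backward(); holds the partial derivative of y with respect to x.
x.data / x.detach()	Returns the data of x as a tensor with requires_grad=False.
x.zero_()	Replaces the contents of the tensor with zeros, in place.
tensor.index_select(dim, index)	Selects the entries with indices index along dimension dim.
tensor.flatten()	Flattens the tensor to 1-D.
torch.manual_seed(seed)	Sets the random seed.
tensor.norm(p=2, dim=None)	Computes the norm of the tensor: p=2 is the L2 norm, p=1 the L1 norm. dim is the dimension to reduce over; by default the norm is taken over all elements.
tensor.tolist()	Converts the tensor to a (nested) Python list.
torch.exp(tensor)	Element-wise exponential. REMARK: NumPy functions are not tracked by autograd, so use the torch versions when gradients are needed.
torch.gather(input, dim, index)	input is the source tensor, dim the dimension, and index a tensor of indices; picks values from input along dim. Commonly used in custom loss functions.
torch.cat	Concatenates tensors. # Along dim 0 (stacked vertically) C = torch.cat((A, B), 0) # Along dim 1 (side by side) C = torch.cat((A, B), 1)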
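Since torch.gather is the least obvious entry in the table, here is a minimal sketch of the loss-function use case it mentions: picking, for each row, the predicted probability of the true class. The tensors probs and labels are made-up illustration data.

```python
import torch

# Predicted class probabilities for 2 samples and 3 classes (illustrative values)
probs = torch.tensor([[0.1, 0.7, 0.2],
                      [0.3, 0.3, 0.4]])
labels = torch.tensor([1, 2])  # true class index of each sample

# index must have the same number of dimensions as input, hence unsqueeze(1)
picked = probs.gather(1, labels.unsqueeze(1)).squeeze(1)
print(picked)

# Negative log-likelihood built from the gathered probabilities
nll = -picked.log().mean()
print(nll)
```

For each row i, gather along dim=1 returns probs[i, labels[i]].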

a=torch.empty(2,3)
a.size()
#torch.Size([2, 3])

a.shape
#torch.Size([2, 3])

torch.randn(1).item()
#0.7592946887016296

b
#tensor([[1, 2],
        [3, 4]])
b.numpy()
#array([[1, 2],
       [3, 4]])

a=np.arange(1,5)
a
#array([1, 2, 3, 4])
torch.from_numpy(a)
#tensor([1, 2, 3, 4])

torch.device('cpu')
device(type='cpu')
torch.device('cuda')
device(type='cuda')

cuda0=torch.device('cuda:0')
b.to(cuda0)

b.to(torch.int)
tensor([3, 3, 3, 3, 3], dtype=torch.int32)

x
tensor([2., 3., 4.], requires_grad=True)
x.data.zero_()  # .data is required here: an in-place change to a leaf tensor that requires grad raises an error otherwise
tensor([0., 0., 0.])
x
tensor([0., 0., 0.], requires_grad=True)

t1=torch.arange(9).reshape(3,3).type(torch.float)
t1
tensor([[0., 1., 2.],
        [3., 4., 5.],
        [6., 7., 8.]])


t1.norm()
tensor(14.2829)

t1.norm(dim=0)
tensor([6.7082, 8.1240, 9.6437])

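Remark: tensor.numpy() cannot convert a tensor that requires grad. Read the values with tensor.data first, or, as the error message below suggests, with tensor.detach(), and then call numpy().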

torch.arange(10.0,requires_grad=True).numpy()
#RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.


torch.arange(10.0,requires_grad=True).data.numpy()
array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=float32)
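A short sketch of the detach() route that the error message recommends. It also shows that detaching does not disturb the gradient stored on the original tensor; the variable names are illustrative.

```python
import torch

x = torch.arange(3.0, requires_grad=True)
y = (x * 2).sum()
y.backward()               # dy/dx = 2 for every element

detached = x.detach()      # same data, but requires_grad=False
arr = detached.numpy()     # now the conversion is allowed

print(arr)
print(detached.requires_grad)
print(x.grad)
```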

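III. Element-wise arithmetic

+	addition
-	subtraction
*	multiplication
/	division
**	power
a.dot(b)	inner product of the 1-D tensors a and b

Remark:

1. Arithmetic between a tensor and a scalar uses broadcasting: the scalar is expanded to the shape of the tensor.

2. Arithmetic between two tensors is applied element by element, so their shapes must be equal or broadcastable to a common shape.

3. A row tensor and a column tensor can be combined; broadcasting expands both to a matrix.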

b = torch.randn(4)
c = torch.randn(4, 1)
b,c
(tensor([ 0.5189,  0.9258, -1.2616, -0.2102]),
 tensor([[-2.6251],
         [ 0.3260],
         [ 0.4049],
         [-0.9350]]))

b+c
tensor([[-2.1061, -1.6993, -3.8867, -2.8353],
        [ 0.8450,  1.2518, -0.9356,  0.1158],
        [ 0.9238,  1.3307, -0.8567,  0.1947],
        [-0.4161, -0.0092, -2.1966, -1.1452]])

b*c  # a row tensor times a column tensor is not an inner product; broadcasting produces a matrix
tensor([[-1.3623, -2.4302,  3.3118,  0.5518],
        [ 0.1692,  0.3018, -0.4113, -0.0685],
        [ 0.2101,  0.3748, -0.5108, -0.0851],
        [-0.4852, -0.8656,  1.1796,  0.1965]])
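To make the difference between the broadcast product and the inner product concrete, here is a small sketch with fixed values instead of random ones, so the numbers can be checked by hand.

```python
import torch

b = torch.tensor([1.0, 2.0, 3.0, 4.0])          # shape (4,)
c = torch.tensor([[1.0], [2.0], [3.0], [4.0]])  # shape (4, 1)

# Broadcasting: (4,) and (4, 1) are both expanded to (4, 4)
outer = b * c            # outer[i, j] = c[i] * b[j]

# Inner product needs two 1-D tensors
inner = b.dot(c.flatten())

print(outer.shape)
print(outer[1, 2].item())
print(inner.item())
```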

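IV. Matrix operations

tensor.trace()	Returns the trace of the matrix.
tensor.diag(diagonal=0)	diagonal=0 is the main diagonal; diagonal=n is the n-th diagonal above it; diagonal=-n is the n-th diagonal below it. For a 1-D tensor this builds a diagonal matrix; for a 2-D tensor it extracts the diagonal elements.
tensor.triu(diagonal=0)	Returns the upper-triangular part of the matrix, starting from the diagonal given by diagonal. diagonal=0 is the main diagonal; diagonal=n is the n-th diagonal above it; diagonal=-n is the n-th diagonal below it.
tensor.tril(diagonal=0)	Returns the lower-triangular part of the matrix, starting from the diagonal given by diagonal. diagonal=0 is the main diagonal; diagonal=n is the n-th diagonal above it; diagonal=-n is the n-th diagonal below it.
A.mm(B)	Matrix product AB of two 2-D tensors; the order of the operands matters.
A.T	Matrix transpose.
A.inverse()	Matrix inverse. A must have a float or complex dtype; other dtypes raise an error.
A.svd()	Singular value decomposition. A must have a float or complex dtype; other dtypes raise an error. Recent PyTorch releases recommend torch.linalg.svd instead.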

b=torch.randn(2)
b
#tensor([ 1.6144, -0.5923])
b.diag()
#tensor([[ 1.6144,  0.0000],
        [ 0.0000, -0.5923]])
b.diag(1)
#tensor([[ 0.0000,  1.6144,  0.0000],
        [ 0.0000,  0.0000, -0.5923],
        [ 0.0000,  0.0000,  0.0000]])

c=torch.randn(2,3)
c
#tensor([[-0.6504,  1.2259,  0.0892],
        [-0.9847, -2.0406,  0.1707]])
c.diag()
#tensor([-0.6504, -2.0406])
c.diag(1)
#tensor([1.2259, 0.1707])


c.triu()
#tensor([[-0.6504,  1.2259,  0.0892],
        [ 0.0000, -2.0406,  0.1707]])
c.triu(1)
#tensor([[0.0000, 1.2259, 0.0892],
        [0.0000, 0.0000, 0.1707]])
c.tril()
#tensor([[-0.6504,  0.0000,  0.0000],
        [-0.9847, -2.0406,  0.0000]])
c.tril(-2)
#tensor([[0., 0., 0.],
        [0., 0., 0.]])



a=torch.tensor([[0,1],[1,0]])
a
tensor([[0, 1],
        [1, 0]])
b=torch.tensor([[1,2],[3,4]])
b
tensor([[1, 2],
        [3, 4]])
a.mm(b)
#tensor([[3, 4],
        [1, 2]])
b.mm(a)
#tensor([[2, 1],
        [4, 3]])


b.inverse()
RuntimeError: linalg.inv: Expected a floating point or complex tensor as input. Got Long
b.type(torch.float).inverse()
tensor([[-2.0000,  1.0000],
        [ 1.5000, -0.5000]])

b.type(torch.float).svd()
#torch.return_types.svd(
U=tensor([[-0.4046, -0.9145],
        [-0.9145,  0.4046]]),
S=tensor([5.4650, 0.3660]),
V=tensor([[-0.5760,  0.8174],
        [-0.8174, -0.5760]]))

u,s,v=b.type(torch.float).svd()
u,s,v  # after unpacking into three variables, each factor can be printed separately
(tensor([[-0.4046, -0.9145],
         [-0.9145,  0.4046]]),
 tensor([5.4650, 0.3660]),
 tensor([[-0.5760,  0.8174],
         [-0.8174, -0.5760]]))

V. Indexing tensors

b=torch.randn(2,3,4)
b
tensor([[[-1.0253, -0.0181,  1.1478, -1.0745],
         [-0.8782, -0.1004,  1.1939, -0.6919],
         [-0.0109, -1.2341,  0.0130,  1.4251]],

        [[ 0.4171, -0.6423, -0.4284,  2.3591],
         [ 0.7396, -1.1485, -1.5956, -0.0347],
         [ 0.2672, -1.9008,  0.4506,  0.5160]]])


b[0]
tensor([[-1.0253, -0.0181,  1.1478, -1.0745],
        [-0.8782, -0.1004,  1.1939, -0.6919],
        [-0.0109, -1.2341,  0.0130,  1.4251]])


b[1,2]
tensor([ 0.2672, -1.9008,  0.4506,  0.5160])

b[1,1,2]
tensor(-1.5956)

b[0][[1,2]]
tensor([[-0.8782, -0.1004,  1.1939, -0.6919],
        [-0.0109, -1.2341,  0.0130,  1.4251]])

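VI. Computing gradients

To compute gradients with respect to a tensor, set requires_grad=True when creating it. Only tensors with a float or complex dtype can require gradients.

Gradients are accumulated during backpropagation: every backward pass adds its gradient to the one already stored, so the gradients are normally zeroed before each backward pass.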

b=torch.tensor([[1,2],[3,4]],requires_grad=True)
b
RuntimeError: Only Tensors of floating point and complex dtype can require gradients

1. tensor.grad_fn: records how the tensor was produced (created directly, or as the result of an operation)

# Tensor created directly
b=torch.tensor([[1,2],[3,4]],dtype=torch.float,requires_grad=True)
b
tensor([[1., 2.],
        [3., 4.]], requires_grad=True)
print(b.grad_fn)
None
# Tensor created as the result of an operation
c=b+1
print(c.grad_fn)
<AddBackward0 object at 0x7fb92684c490>

2. tensor.requires_grad: a bool that tells whether gradients are computed for the tensor

b
tensor([[1., 2.],
        [3., 4.]], requires_grad=True)
b.requires_grad
True

c=torch.tensor([[1,2],[3,4]])
c.requires_grad
False

3. tensor.requires_grad_(requires_grad=True): changes, in place, whether the tensor requires gradients

c
tensor([[1, 2],
        [3, 4]])
c.to(torch.float).requires_grad_()
tensor([[1., 2.],
        [3., 4.]], requires_grad=True)

b
tensor([[1., 2.],
        [3., 4.]], requires_grad=True)
b.requires_grad_(False)
tensor([[1., 2.],
        [3., 4.]])

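4. tensor.backward(gradient=None, retain_graph=None, create_graph=False, inputs=None): backpropagates and stores the derivatives in the grad attribute of the variables.

Remark:

1. Gradients are accumulated during backpropagation: every backward pass adds its gradient to the one already stored, so the gradients are normally zeroed before each backward pass.

2. Meaning of y.backward(w): differentiating a high-dimensional tensor with respect to a tensor is complicated, so to avoid that computation PyTorch only differentiates a scalar with respect to a tensor. When y is a non-scalar tensor, pass in a tensor w of the same shape as y. PyTorch then computes L = torch.sum(y * w), which turns y into the scalar L, and differentiates L with respect to the input x.

For the single-layer neural network shown in the figure above, $y=w_{1}x_{1}+w_{2}x_{2}$, so $\frac{\partial y}{\partial w_{1}}=x_{1}$ and $\frac{\partial y}{\partial w_{2}}=x_{2}$. We now verify this.

When y, w, and x are all scalars: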

x1=torch.tensor(2,dtype=torch.float,requires_grad=True)
x2=torch.tensor(3,dtype=torch.float,requires_grad=True)
w1=torch.tensor(1/2,requires_grad=True)
w2=torch.tensor(1/3,requires_grad=True)
y=w1*x1+w2*x2
y.backward()  # equivalent to y.backward(torch.tensor(1.))
print(x1.grad)
print(x2.grad)
print(w1.grad)
print(w2.grad)
Output:
tensor(0.5000)
tensor(0.3333)
tensor(2.)
tensor(3.)

When y is a scalar and x, w are vectors:

x=torch.tensor((2,3),dtype=torch.float,requires_grad=True)
w=torch.tensor((1/2,1/3),requires_grad=True)
y=w.dot(x)
y.backward()
print(x.grad)
print(w.grad)

Output:
tensor([0.5000, 0.3333])
tensor([2., 3.])

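For the neural network shown in the figure above, we can write the following system of equations:

$y_{1}=w_{11}x_{1}+w_{21}x_{2}+w_{31}x_{3}$

$y_{2}=w_{12}x_{1}+w_{22}x_{2}+w_{32}x_{3}$

When each y is a scalar and x, w are vectors, the two backward passes accumulate in x.grad: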

x=torch.tensor((2,3,4),dtype=torch.float,requires_grad=True)
w1=torch.tensor((1/2,1/3,1/4),dtype=torch.float,requires_grad=True)
w2=torch.tensor((2/2,2/3,2/4),dtype=torch.float,requires_grad=True)
y1=x.dot(w1)
y2=x.dot(w2)
y2.backward()
print(x.grad)
y1.backward()
print(x.grad)
print(w1.grad)
print(w2.grad)

Output:
tensor([1.0000, 0.6667, 0.5000])
tensor([1.5000, 1.0000, 0.7500])
tensor([2., 3., 4.])
tensor([2., 3., 4.])

When y, x, and w are all vectors:

x=torch.tensor((2,3,4),dtype=torch.float,requires_grad=True)
w1=torch.tensor((1/2,1/3,1/4),dtype=torch.float,requires_grad=True)
w2=torch.tensor((2/2,2/3,2/4),dtype=torch.float,requires_grad=True)
y=torch.randn(2)
y[0]=x.dot(w1)
y[1]=x.dot(w2)
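y.backward()  # when y is a vector, backward needs a gradient argument; otherwise it raises an error
RuntimeError: grad can be implicitly created only for scalar outputs
y.backward(torch.ones(2))  # the derivative of y with respect to x is a 2x3 matrix; the weights (1, 1) reduce it to one gradient vector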
print(x.grad)
print(w1.grad)
print(w2.grad)

Output:
tensor([1.5000, 1.0000, 0.7500])  # the accumulated (summed) gradient
tensor([2., 3., 4.])
tensor([2., 3., 4.])
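The claim that y.backward(w) is the same as differentiating L = torch.sum(y * w) can be verified numerically. The sketch below stacks the two weight vectors from the example into a matrix W (a choice made here for brevity, not in the original) and compares both routes.

```python
import torch

x = torch.tensor([2.0, 3.0, 4.0], requires_grad=True)
W = torch.tensor([[1 / 2, 1 / 3, 1 / 4],
                  [2 / 2, 2 / 3, 2 / 4]])   # rows play the role of w1 and w2
v = torch.tensor([1.0, 1.0])                # same shape as y

# Route 1: pass v as the gradient argument of backward
y = W @ x
y.backward(v)
grad_from_backward = x.grad.clone()

# Route 2: build the scalar L = sum(y * v) explicitly
x.grad.zero_()                              # clear the accumulated gradient first
L = torch.sum((W @ x) * v)
L.backward()
grad_from_scalar = x.grad.clone()

print(grad_from_backward)
print(grad_from_scalar)
```

Both routes give W^T v, which for v = (1, 1) is simply the sum of the two weight rows.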

5. with torch.no_grad()

Inside this block, every new tensor computed from an expression automatically has requires_grad=False.

with torch.no_grad():
    expressions
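A concrete version of the placeholder above, as a minimal sketch: the same expression is evaluated once outside and once inside the block.

```python
import torch

x = torch.ones(3, requires_grad=True)

y = x * 2                  # tracked by autograd
with torch.no_grad():
    z = x * 2              # not tracked

print(y.requires_grad, y.grad_fn is not None)
print(z.requires_grad, z.grad_fn is None)
```

This is the usual way to run validation or inference without building a computation graph.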

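VII. The torch package

1. nn.Module

CLASS torch.nn.Module(*args, **kwargs): every neural network model must subclass this class.

Every model overrides __init__ and forward. The arguments of __init__ are used to instantiate the model, and the arguments of forward are the model inputs. A template: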

import torch.nn as nn
import torch.nn.functional as F

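class ModelName(nn.Module):
    def __init__(self):  # every model overrides this method
        super(ModelName, self).__init__()  # initializes the parent class nn.Module; must not be omitted
        self.conv1 = nn.Conv2d(1, 20, 5)  # user-defined layer, change as needed
        self.conv2 = nn.Conv2d(20, 20, 5)  # user-defined layer, change as needed

    def forward(self, x):  # every model overrides this method; x is the input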
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

model.state_dict(): returns a dictionary of the model's parameters.

2.nn.Sequential

CLASS torch.nn.Sequential(*args): builds a model that runs its modules in order; the result is itself a module.

*args takes one of two forms: module1, module2, ... or an OrderedDict([(str1, module1), (str2, module2), ...])

# Example
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Using Sequential with OrderedDict. This is functionally the
# same as the above code
from collections import OrderedDict
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU()),
        ]))


class Net(nn.Module):
    def __init__(self, in_dim, n_hidden_1, n_hidden_2, out_dim):
        super(Net, self).__init__()
        self.layer = nn.Sequential(
            nn.Linear(in_dim, n_hidden_1),
            nn.ReLU(True),
            nn.Linear(n_hidden_1, n_hidden_2),
            nn.ReLU(True),
            nn.Linear(n_hidden_2, out_dim)
        )

    def forward(self, x):
        x = self.layer(x)
        return x

REMARK: the most common mistake with Sequential is a mismatch between the output shape of one layer and the input shape of the next.

3.nn.Flatten

CLASS torch.nn.Flatten(start_dim=1,end_dim=-1)

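For any data torch.randn(a, b, c, d), a to d correspond to dim 0 to dim 3 and also to (N, C, H, W). With the default arguments, nn.Flatten therefore leaves the batch dimension unchanged and flattens all other dimensions.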

input = torch.randn(32, 1, 5, 5)
# With default parameters
m = nn.Flatten()
output = m(input)
output.size()
# Output
torch.Size([32, 25])


# With non-default parameters
m = nn.Flatten(0, 2)
output = m(input)
output.size()
# Output
torch.Size([160, 5])

4.utils.data.Dataset

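CLASS torch.utils.data.Dataset(*args, **kwds): base class for custom datasets

All custom datasets subclass Dataset. Every subclass must override __getitem__() and may optionally override __len__(). Once __len__() is defined, calling len() on an instance returns the size of the dataset (number of rows); once __getitem__() is defined, an instance can be indexed directly. A template: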

example1:

import torch
from torch.utils.data import Dataset


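class TensorDataset(Dataset):  # TensorDataset is the name of the dataset class
    """
    TensorDataset subclasses Dataset and overrides __init__(), __getitem__(), and __len__().
    It wraps a pair of tensors (data, target) as a dataset,
    so items can be fetched by index and the size obtained with len().
    """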
    def __init__(self, data_tensor, target_tensor):
        self.data_tensor = data_tensor
        self.target_tensor = target_tensor

    def __getitem__(self, index):
        return self.data_tensor[index], self.target_tensor[index]

    def __len__(self):
        return self.data_tensor.size(0)

# Generate data
data_tensor = torch.randn(4, 3)
target_tensor = torch.rand(4)

# Wrap the data in a Dataset (instantiate it)
tensor_dataset = TensorDataset(data_tensor, target_tensor)

# Indexing calls __getitem__ automatically
print(tensor_dataset[1])
# Output: (tensor([-1.0351, -0.1004,  0.9168]), tensor(0.4977))

# len() calls __len__ automatically
print(len(tensor_dataset))
# Output: 4

example2:

import os
from PIL import Image
from torch.utils.data import Dataset


class PatchDataset(Dataset):
    def __init__(self, data_dir, transforms=None):
        """
        :param data_dir: path to the dataset directory
        :param transforms: preprocessing applied to each image
        """

        self.data_info = self.get_img_info(data_dir)
        self.transforms = transforms

    def __getitem__(self, item):
        path_img, label = self.data_info[item]
        image = Image.open(path_img).convert('RGB')
        if self.transforms is not None:
            image = self.transforms(image)

        return image, label

    def __len__(self):
        return len(self.data_info)

    @staticmethod
    def get_img_info(data_dir):
        path_dir = os.path.join(data_dir, 'train_dataset.txt')
        data_info = []
        with open(path_dir) as file:
            lines = file.readlines()
            for line in lines:
                data_info.append(line.strip('\n').split(' '))
        return data_info

5.utils.data.TensorDataset

CLASS torch.utils.data.TensorDataset(*tensors)

TensorDataset packs tensors together, much like Python's built-in zip. It indexes every tensor along its first dimension, so all tensors must have the same size in the first dimension.

from torch.utils.data import TensorDataset

# create dataset
x = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])
y = torch.tensor([0, 1, 2, 3])
dataset = TensorDataset(x, y)

# retrieve sample
sample = dataset[0]
print(sample)

# Output: (tensor([1, 2]), tensor(0))

6.utils.data.DataLoader

CLASS torch.utils.data.DataLoader(dataset,batch_size=1,drop_last=False,shuffle=None,num_workers=0)

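torch.utils.data.DataLoader is PyTorch's data loader. It combines a dataset with a sampler and can load data with multiple worker processes. During training it splits the training data into mini-batches and yields one batch per iteration until the whole dataset has been consumed. The loader is an iterable; for a dataset of (sample, target) pairs, each element it yields has two parts: a tensor made of batch_size samples and a tensor made of batch_size targets.

Remark: shuffle the training set when loading it, which usually helps training. The validation and test sets do not need shuffling, so that the model is evaluated under identical conditions.

dataset	the wrapped dataset; each item is typically a (sample, label) tuple. TensorDataset can be used to wrap two tensors.
batch_size	number of samples the loader returns per iteration
shuffle	whether batches are drawn in random order; off by default
num_workers	number of subprocesses used for data loading
drop_last	when the dataset size is not divisible by batch_size: with drop_last=True the last batch is dropped; with drop_last=False the last batch is kept but is smaller than batch_size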

from torch.utils.data import DataLoader, TensorDataset

# create dataset
x = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])
y = torch.tensor([0, 1, 2, 3])
dataset = TensorDataset(x, y)

# create data loader
data_loader = DataLoader(dataset=dataset, batch_size=2)

# retrieve batch
for batch in data_loader:
    print(batch)

[tensor([[1, 2],
        [3, 4]]), tensor([0, 1])]
[tensor([[5, 6],
        [7, 8]]), tensor([2, 3])]
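The effect of drop_last and shuffle is easiest to see on a dataset whose size is not a multiple of the batch size. The following sketch uses five samples and a batch size of two; passing a seeded torch.Generator to the loader is an optional choice made here so the shuffle is reproducible.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

x = torch.arange(10).reshape(5, 2)   # 5 samples, 2 features each
y = torch.arange(5)                  # 5 targets
dataset = TensorDataset(x, y)

# Batch sizes with and without drop_last
kept = [len(yb) for _, yb in DataLoader(dataset, batch_size=2)]
dropped = [len(yb) for _, yb in DataLoader(dataset, batch_size=2, drop_last=True)]

# Shuffled loading with a seeded generator
g = torch.Generator().manual_seed(0)
loader = DataLoader(dataset, batch_size=2, shuffle=True, generator=g)
seen = torch.cat([yb for _, yb in loader])

print(kept)      # the last batch is smaller
print(dropped)   # the incomplete batch is gone
print(seen)      # every target appears exactly once, in shuffled order
```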

7.CLASS torch.Generator(device='cpu')

A Generator holds the state of a random number generator. It can be created and seeded as follows:

g_cpu = torch.Generator()
g_cpu.manual_seed(2147483647)

8.nn.ModuleList

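CLASS torch.nn.ModuleList(modules=None), where modules is a list. A ModuleList supports Python list operations: indexing, appending, and deleting. Purpose: similar to nn.Sequential in that it holds submodules, but it defines no forward of its own, so the modules have to be called explicitly.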

module_list=nn.ModuleList([nn.Linear(30,10),nn.MaxPool2d(2),nn.Conv2d(1,1,3)])
print(module_list)
# Output
ModuleList(
  (0): Linear(in_features=30, out_features=10, bias=True)
  (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (2): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1))
)

module_list[0]
#Linear(in_features=30, out_features=10, bias=True)

module_list.append(nn.MaxPool2d(3))
#ModuleList(
  (0): Linear(in_features=30, out_features=10, bias=True)
  (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (2): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1))
  (3): MaxPool2d(kernel_size=3, stride=3, padding=0, dilation=1, ceil_mode=False)
)
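Because a ModuleList has no forward method, it is normally used inside a model whose forward iterates over it. A minimal sketch follows; the class name MLP and the layer sizes are illustrative, not from the original notes.

```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    def __init__(self, sizes):
        super().__init__()
        # One Linear layer per consecutive pair of sizes; ModuleList registers their parameters
        self.layers = nn.ModuleList(
            [nn.Linear(n_in, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        )

    def forward(self, x):
        # ReLU after every layer except the last one
        for layer in self.layers[:-1]:
            x = torch.relu(layer(x))
        return self.layers[-1](x)


model = MLP([4, 8, 2])
out = model(torch.randn(5, 4))
n_param_tensors = len(list(model.parameters()))

print(out.shape)
print(n_param_tensors)  # weight and bias for each of the two Linear layers
```

A plain Python list of layers would run as well, but its parameters would not show up in model.parameters() and so would not be trained.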

9.nn.ModuleDict

CLASS torch.nn.ModuleDict(modules=None), where modules is a dictionary. A ModuleDict supports Python dictionary operations: access by key, adding, and deleting. Purpose: similar to nn.Sequential in that it holds submodules.

layers=nn.ModuleDict({"layer1":nn.Conv2d(1,3,5),"layer2":nn.MaxPool2d(2),"layer3":nn.Linear(100,30)})  # named layers rather than dict, to avoid shadowing the built-in
layers
# Output
ModuleDict(
  (layer1): Conv2d(1, 3, kernel_size=(5, 5), stride=(1, 1))
  (layer2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (layer3): Linear(in_features=100, out_features=30, bias=True)
)

layers['layer2']
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)

layers.values()
odict_values([Conv2d(1, 3, kernel_size=(5, 5), stride=(1, 1)), MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), Linear(in_features=100, out_features=30, bias=True)])

10.nn.init.normal_(tensor,mean=0.0,std=1.0)

Here tensor is any input tensor. init.normal_ overwrites its elements, in place, with samples from $N(mean,std^{2})$. Purpose: parameter initialization.

w = torch.arange(9).reshape(3,3).type(torch.float)
print(w)
#tensor([[0., 1., 2.],
        [3., 4., 5.],
        [6., 7., 8.]])


nn.init.normal_(w)
#tensor([[ 1.0046, -0.7605,  1.0955],
        [-0.1729, -0.4962,  0.6871],
        [-0.4778, -0.5271, -0.6193]])

11.nn.init.constant_(tensor,val)

Here tensor is the input tensor. init.constant_ sets every element of tensor to val, in place. Purpose: parameter initialization.

w = torch.empty(3, 5)
nn.init.constant_(w, 0.3)

# Output
tensor([[0.3000, 0.3000, 0.3000, 0.3000, 0.3000],
        [0.3000, 0.3000, 0.3000, 0.3000, 0.3000],
        [0.3000, 0.3000, 0.3000, 0.3000, 0.3000]])

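VIII. The torchvision package

torchvision is a Python package used together with PyTorch, mainly for building computer vision models. It bundles common datasets, models, and transforms that make such models easier to build.

1. ToTensor

Class torchvision.transforms.ToTensor(): converts a PIL image or an ndarray to a tensor and scales the pixel values to [0, 1]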

import PIL.Image
from torchvision import transforms

image2=PIL.Image.open("/Users/xuan/Desktop/2.png")
trans=transforms.ToTensor()  # the class must be instantiated before it can be used
trans(image2)
# Output
tensor([[[0.9725, 0.9725, 0.9725,  ..., 0.9725, 0.9725, 0.9725],
         [0.9725, 0.9725, 0.9725,  ..., 0.9725, 0.9725, 0.9725],
         [0.9725, 0.9725, 0.9725,  ..., 0.9725, 0.9725, 0.9725],
         ...,
         [0.9137, 0.9137, 0.9137,  ..., 0.9137, 0.9137, 0.9137],
         [0.9137, 0.9137, 0.9137,  ..., 0.9137, 0.9137, 0.9137],
         [0.9137, 0.9137, 0.9137,  ..., 0.9137, 0.9137, 0.9137]]])
------------------------------------------------------------------------

import cv2
import torchvision

arr=cv2.imread("/Users/xuan/Desktop/2.png")  # note: OpenCV loads the channels in BGR order
tt=torchvision.transforms.ToTensor()
tt(arr)
# Output
tensor([[[0.9137, 0.9137, 0.9137,  ..., 0.9137, 0.9137, 0.9137],
         [0.9137, 0.9137, 0.9137,  ..., 0.9137, 0.9137, 0.9137],
         [0.9137, 0.9137, 0.9137,  ..., 0.9137, 0.9137, 0.9137],
         ...,
         [0.9725, 0.9725, 0.9725,  ..., 0.9725, 0.9725, 0.9725],
         [0.9725, 0.9725, 0.9725,  ..., 0.9725, 0.9725, 0.9725],
         [0.9725, 0.9725, 0.9725,  ..., 0.9725, 0.9725, 0.9725]]])

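2. datasets: built-in datasets

torchvision.datasets.MNIST(root, train=True, transform=None, target_transform=None, download=False): the loaded dataset can be indexed like a list. Each element is a tuple (one sample) whose first entry is the image (a PIL image by default, a tensor once transform=ToTensor() is applied) and whose second entry is the target.

root	directory where the dataset is stored
train	train=True selects the training set; train=False selects the test set
transform	a function/transform applied to the PIL image
target_transform	a function/transform applied to the target
download	whether to download the dataset from the internet; defaults to False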

train_set=torchvision.datasets.MNIST(root="/Users/xuan/Desktop/深度学习",download=True)
test_set=torchvision.datasets.MNIST(root="/Users/xuan/Desktop/深度学习",train=False,download=True)
image,target=test_set[0]
print(image)
print(target)
# Output
<PIL.Image.Image image mode=L size=28x28 at 0x7F8632C10C40>
7

-------------------------------------------------------------------------------------
trans=torchvision.transforms.ToTensor()
test_set=torchvision.datasets.MNIST(root="/Users/xuan/Desktop/深度学习",train=False,transform=trans,download=True)
image,target=test_set[0]
print(image)
print(target)
# Output
tensor([[[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.2392, 0.9490, 0.9961, 0.9961, 0.2039, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.4745, 0.9961, 0.9961, 0.8588, 0.1569, 0.0000,
         
7

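3. Resize: changing the image size

CLASS torchvision.transforms.Resize(size), where size is the desired output size in the form (H, W).

resize=torchvision.transforms.Resize((a,b))
resize(input)  # input is a PIL Image or a tensor

Remark: a PNG image may have four channels: besides the three RGB color channels there can be a transparency (alpha) channel. Use image.convert('RGB') to get a three-channel image, or image.convert('L') to get a single-channel one.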

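IX. The PIL package

PIL stands for Python Imaging Library. It is mainly used for basic image processing such as cropping, resizing, and color manipulation.

1. PIL.Image.open(fp, mode='r', formats=None), where fp is the file path: reads the image as a PIL Image

import PIL.Image
PIL.Image.open("/Users/xuan/Desktop/2.png")  # open an image

2. image.show(): displays the PIL image as a picture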

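X. OpenCV

OpenCV is an open-source computer vision library with many functions and tools for image processing and computer vision. You can use it to read, process, and save images and to apply operations such as filtering, edge detection, and morphological transforms. In Python, OpenCV is used through the cv2 module.

1. cv2.imread(filename): reads the image and returns it as an ndarray

cv2.imread("/Users/xuan/Desktop/2.png")
# Output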
array([[[233, 246, 248],
        [233, 246, 248],
        [233, 246, 248],dtype=uint8)

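XI. Neural networks

Remark: 1. When building neural networks in PyTorch, it is a good idea to add batch normalization (BN) layers to guard against exploding gradients and failure to converge. The neural networks in sklearn are more heavily wrapped and usually run without this extra handling.

2. When building a model, the layers called in forward must be used in the form out = model(input) on an existing instance. Do not instantiate the model inside forward as below: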

def forward(self,x):
    model1=model()
    output=model1(x)
    return output

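# error: forward() missing 1 required positional argument: 'input'

1. Convolutional layer

CLASS torch.nn.Conv2d(in_channels,out_channels,kernel_size,stride=1,padding=0,device=None,dtype=None)

in_channels	number of channels in the input image, int
out_channels	number of channels produced by the convolution, int
kernel_size	size of the kernel: an int n means an n×n kernel, a tuple (m, n) means an m×n kernel
stride	number of rows and columns the kernel moves at each step, int/tuple
padding	width of the padding added around the input image, int/tuple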


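How it works: in a 2-D cross-correlation, the convolution window starts at the top-left corner of the input array and slides over it from left to right and from top to bottom. A 2-D convolutional layer cross-correlates the input with the kernel and adds a scalar bias to produce the output. The parameters of the layer are the kernel and the scalar bias.

Number of input channels = number of channels in each kernel

Number of kernels = number of output channels

Remark: 1. The number of output channels equals the number of kernels, and the number of input channels equals the number of channels in each kernel.

2. Input format: $(N,C_{in},H_{in},W_{in})$ or $(C_{in},H_{in},W_{in})$

3. Output format: $(N,C_{out},H_{out},W_{out})$ or $(C_{out},H_{out},W_{out})$

4. The padding value has to be worked out from the output-size formula, because kernels of different sizes cover different numbers of rows and columns. For each spatial dimension, $H_{out}=\lfloor (H_{in}+2\cdot padding-dilation\cdot(kernel\_size-1)-1)/stride+1\rfloor$.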

import torch.nn as nn
m = nn.Conv2d(3, 4, 3, stride=2)
input = torch.randn(2, 3, 5, 10)
output = m(input)
print(output)

# Output
tensor([[[[ 0.3600, -0.3192,  0.2018,  0.6144],
          [-0.1411,  0.2367,  0.5510, -0.2633]],

         [[-0.6624, -0.1053,  0.2785,  0.2037],
          [ 0.9166,  0.2029,  0.0881, -0.2335]],

         [[-0.1165,  0.3067, -0.0125, -0.1338],
          [-0.5957,  0.9807, -0.8395, -0.4014]],

         [[-0.8542,  0.0231, -0.7913,  0.7277],
          [ 1.6425, -0.5302,  0.7196, -0.5789]]],


        [[[ 1.2063,  0.5560, -0.3663,  0.4920],
          [-0.1443, -0.4489, -0.1167,  0.5509]],

         [[-0.3443,  0.0693, -0.7392,  0.6925],
          [ 0.6646,  0.2785, -0.5842, -0.1094]],

         [[-0.4770, -0.2577,  0.0899,  0.2201],
          [ 0.0733,  0.2156,  0.3627,  0.0953]],

         [[-0.9787,  0.5221, -0.3329,  0.1458],
          [ 0.4737, -0.5674, -0.5654, -0.2685]]]],
       grad_fn=<ConvolutionBackward0>)

output.shape
#torch.Size([2, 4, 2, 4])
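The output shape above can be predicted before running the layer. The helper conv_out below is a hypothetical function written for this note (it is not part of PyTorch); it implements the standard output-size formula for one spatial dimension and is checked against the real layer.

```python
import math

import torch
import torch.nn as nn


def conv_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output length of one spatial dimension of a convolution."""
    return math.floor((size + 2 * padding - dilation * (kernel - 1) - 1) / stride + 1)


m = nn.Conv2d(3, 4, 3, stride=2)
out = m(torch.randn(2, 3, 5, 10))

predicted = (2, 4, conv_out(5, 3, stride=2), conv_out(10, 3, stride=2))
print(tuple(out.shape), predicted)

# With a 3x3 kernel and stride 1, padding=1 keeps the spatial size unchanged
same_size = conv_out(5, 3, padding=1)
print(same_size)
```

The same formula explains remark 4: to keep the size with stride 1, a kernel of size k needs padding (k - 1) / 2, so larger kernels need more padding.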

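2. Pooling layer

A pooling layer works much like a convolutional layer: a pooling window slides over the input image and the maximum inside the window becomes the output. A pooling layer has no weights to learn; only the window size has to be specified. Pooling reduces the convolutional layer's excessive sensitivity to position. For more detail, see:

幽灵公主: "pytorch学习：池化层的作用" (PyTorch study notes: what pooling layers do)

Syntax: CLASS torch.nn.MaxPool2d(kernel_size,stride=None,padding=0,dilation=1,return_indices=False,ceil_mode=False)

kernel_size	size of the pooling window: an int n means an n×n window, a tuple (m, n) means an m×n window
stride	step of the pooling window over the input image, int/tuple; defaults to kernel_size
padding	width of the padding added around the input matrix, int/tuple
ceil_mode	ceil_mode=False: a window that extends past the edge of the input is discarded; ceil_mode=True: such a window is still used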

# Build a dataset containing one 5x6 image
import numpy as np
import torch.nn as nn
data=torch.tensor(np.arange(30),dtype=torch.float).reshape(-1,1,5,6)  # the input must be in (N, C, H, W) format

# Build the model
class model1(nn.Module):
    def __init__(self):
        super(model1,self).__init__()
        self.layer1=nn.MaxPool2d(3,ceil_mode=False)
    
    def forward(self,input):
        output=self.layer1(input)
        return output

# Instantiate the model
nn_pool=model1()
nn_pool(data)

# Output
tensor([[[[14., 17.]]]])
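For comparison, the same 5x6 input can be pooled with ceil_mode=True. This sketch calls the layers directly rather than through the model1 class, which is a simplification made here.

```python
import torch
import torch.nn as nn

data = torch.arange(30, dtype=torch.float).reshape(1, 1, 5, 6)

floor_out = nn.MaxPool2d(3, ceil_mode=False)(data)  # incomplete windows are dropped
ceil_out = nn.MaxPool2d(3, ceil_mode=True)(data)    # incomplete windows are kept

print(floor_out.shape, floor_out.flatten().tolist())
print(ceil_out.shape, ceil_out.flatten().tolist())
```

With ceil_mode=True the last two rows, which do not fill a complete 3x3 window, still contribute a second row of outputs.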

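Remark: 1. The input to the model must have a float dtype, not int.

2. Input format: ( $N,C_{in},H_{in},W_{in}$ )

3. Output format: ( $N,C_{out},H_{out},W_{out}$ )

4. With multi-channel input, a pooling layer pools each input channel separately instead of summing across channels as a convolutional layer does. The number of output channels of a pooling layer therefore equals the number of input channels.

3. ReLU activation

Remark: an activation function accepts input of any shape, and the output has the same shape as the input.

CLASS torch.nn.ReLU(inplace=False), where inplace controls whether the input itself is modified.

$ReLU(x)=(x)^{+}=\max(0,x)$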

data=torch.tensor([[2,-1],[-1,1]])
print(data)
# Output
tensor([[ 2, -1],
        [-1,  1]])


net=nn.ReLU()
print(net(data))
# Output
tensor([[2, 0],
        [0, 1]])

4. Sigmoid activation

CLASS torch.nn.Sigmoid(*args,**kwargs)

$Sigmoid(x)=\sigma (x)=\frac{1}{1+exp(-x)}$

data=torch.tensor([[2,-1],[-1,1]])
print(data)
# Output
tensor([[ 2, -1],
        [-1,  1]])


net=nn.Sigmoid()
print(net(data))
# Output
tensor([[0.8808, 0.2689],
        [0.2689, 0.7311]])

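5. Softmax activation

CLASS torch.nn.Softmax(dim=None), where dim is the dimension along which softmax is computed. For a 2-D input, dim=0 normalizes down each column (across the rows) and dim=1 normalizes across each row; the probabilities along the chosen dimension sum to 1.

$Softmax(x_{i})=\frac{exp(x_{i})}{\sum _{j}exp(x_{j})}$. The input to Softmax must be a float tensor, not an int tensor.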

data=torch.tensor([[2,-1],[-1,1]],dtype=torch.float)
print(data)
# Output
tensor([[ 2., -1.],
        [-1.,  1.]])


net=nn.Softmax(dim=1)
print(net(data))
# Output
tensor([[0.9526, 0.0474],
        [0.1192, 0.8808]])
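The meaning of dim can be confirmed by summing the result along each dimension. A small sketch on the same data, using the functional form torch.softmax instead of the nn.Softmax module:

```python
import torch

data = torch.tensor([[2.0, -1.0], [-1.0, 1.0]])

by_row = torch.softmax(data, dim=1)   # each row sums to 1
by_col = torch.softmax(data, dim=0)   # each column sums to 1

print(by_row.sum(dim=1))
print(by_col.sum(dim=0))
```

For a batch of class scores with shape (N, num_classes), dim=1 is the usual choice, since it turns each sample's scores into a probability distribution.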

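6. Linear (fully connected) layer, used heavily in MLPs (multilayer perceptrons)

CLASS torch.nn.Linear(in_features,out_features,bias=True,device=None,dtype=None)

Builds the linear model: $y=xA^{T}+b$

in_features	number of features of the input x, int
out_features	number of features of the output y, int
bias	whether to add a bias term, bool; defaults to True

m = nn.Linear(3, 2)  # instantiate the model
input = torch.randn(10, 3)
output = m(input)  # feed input to the model
print(output)

# Output: tensor([[-0.5616,  1.4303],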
        [ 0.7555,  0.4399],
        [-0.0664,  0.7192],
        [-0.1061, -0.5738],
        [ 0.0684,  1.0542],
        [-0.4535, -0.1720],
        [ 0.5164,  1.4096],
        [ 0.8567,  0.4455],
        [ 0.2171,  1.3015],
        [-0.6830,  0.6528]], grad_fn=<AddmmBackward0>)

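Remark: 1. Input format: $(*,H_{in})$, where $H_{in}$ = in_features

2. Output format: $(*,H_{out})$, where $H_{out}$ = out_features. The leading dimensions are kept, so for a 2-D input the output is still a 2-D tensor even when $H_{out}$ = 1.

3. The layer's parameters are float32 by default. In the author's tests, passing float64 input raises an error; calling model.to(torch.float64) makes the layer accept double-precision input.

7. Loss functions: each call computes the loss of a single batch

L1 loss: CLASS torch.nn.L1Loss(size_average=None,reduce=None,reduction='mean')

reduction="mean" computes the mean absolute error; reduction="sum" computes the sum of absolute errors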

import torch
from torch import nn
loss = nn.L1Loss(reduction="sum")
input = torch.tensor([[1,2],[3,4]],dtype=torch.float)
target = torch.tensor([[1,2],[3,5]],dtype=torch.float)
output = loss(input, target)  # input and target must both be float/complex tensors
print(output)

#tensor(1.)

L2 loss: CLASS torch.nn.MSELoss(size_average=None,reduce=None,reduction='mean')

reduction="mean" computes the mean squared error; reduction="sum" computes the sum of squared errors.

import torch
from torch import nn
loss = nn.MSELoss(reduction="mean")
input = torch.tensor([[1,2],[3,4]],dtype=torch.float)
target = torch.tensor([[1,2],[3,6]],dtype=torch.float)
output = loss(input, target)
print(output)

#tensor(1.)

Cross-entropy loss: CLASS torch.nn.CrossEntropyLoss(weight=None,size_average=None,ignore_index=-100,reduce=None,reduction='mean',label_smoothing=0.0)

# Example of target with class indices
loss = nn.CrossEntropyLoss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
output = loss(input, target)
output.backward()


# Example of target with class probabilities
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5).softmax(dim=1)
output = loss(input, target)
output.backward()

8. Optimizers (running the algorithm)

step1: create the optimizer. Note that model.parameters() must be called on an instance of the model, not on the class.

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

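step2: run the optimizer. The step method updates the parameters. Training usually takes several epochs, which only requires wrapping the loop below in an outer for loop.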

for input, target in dataset:
    optimizer.zero_grad()  # zero the gradients
    output = model(input)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()
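The loop above leaves model, loss_fn, and dataset undefined. The following is one way to fill them in, as a self-contained sketch on synthetic linear-regression data; the data, the layer sizes, the learning rate, and the number of epochs are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # fix the random initialization and the shuffling

# Synthetic data: y = X @ true_w + 0.1 (noise-free, for illustration only)
X = torch.randn(64, 3)
true_w = torch.tensor([[1.0], [-2.0], [0.5]])
y = X @ true_w + 0.1

model = nn.Linear(3, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)

with torch.no_grad():
    initial_loss = loss_fn(model(X), y).item()

for epoch in range(50):                # outer loop over epochs
    for xb, yb in loader:              # inner loop over mini-batches
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = loss_fn(model(xb), yb)
        loss.backward()                # compute gradients
        optimizer.step()               # update parameters

with torch.no_grad():
    final_loss = loss_fn(model(X), y).item()

print(initial_loss, final_loss)
```

The order inside the inner loop is the part that matters: zero the gradients, run the forward pass, call backward, then step.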

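REMARK: the weights and biases of the model are initialized randomly, so different runs can give different results. It is best to fix the seed with torch.manual_seed(seed) so that every run gives the same result.

9. Dropout layer

For an input $x=(x_{1},x_{2},\dots,x_{n})$, the dropout layer draws a Bernoulli (0-1) variable for each component $x_{i}$ and sets that component to 0 with probability p; during training the components that survive are scaled by $1/(1-p)$. Applying the layer to x therefore produces a new input $x'=(x'_{1},x'_{2},\dots,x'_{n})$. The purpose is to reduce overfitting.

CLASS torch.nn.Dropout(p=0.5,inplace=False), where p is the probability of setting an element to 0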

m = nn.Dropout(p=0.7)
input = torch.randn(4, 3)
output = m(input)

tensor([[-0.0000, 0.0000, -0.0000],
        [0.0000, 0.0000, 0.0000],
        [3.1193, 7.2059, 0.0000],
        [0.0000, -0.0000, -0.0000]])
# With p=0.7, most entries of input are set to 0
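Two properties of nn.Dropout are easy to miss from the output above: surviving values are rescaled by 1/(1-p) in training mode, and the layer does nothing in evaluation mode. A minimal sketch with an all-ones input, chosen here so the scaling is visible:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
m = nn.Dropout(p=0.5)
x = torch.ones(1000)

m.train()                 # training mode: elements are dropped and survivors rescaled
train_out = m(x)
train_values = set(train_out.unique().tolist())

m.eval()                  # evaluation mode: dropout is the identity
eval_out = m(x)

print(train_values)       # only 0 and 1 / (1 - p) can appear
print(torch.equal(eval_out, x))
```

This is why model.eval() should be called before validating or testing a model that contains dropout layers.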

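10. Splitting a dataset

torch.utils.data.random_split(dataset,lengths,generator=<torch._C.Generator object>): returns the subsets produced by the split, one per entry of lengths.

dataset	the dataset to split
lengths	[m, n, ...]: if m and n are positive integers, they are the lengths of the resulting subsets; if they are fractions between 0 and 1 that sum to 1, they are the proportions of the split
generator	the random number generator used for the split, usually passed to fix the seed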

import torch
from torch.utils.data import random_split
dataset = range(10)
train_dataset, test_dataset = random_split(
    dataset=dataset,
    lengths=[7, 3],
    generator=torch.Generator().manual_seed(42)
)
print(train_dataset)
print(list(test_dataset))

# Output
<torch.utils.data.dataset.Subset object at 0x7fb726502550>
[9, 3, 7]

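Remark: 1. Input data for a neural network should be standardized; otherwise the loss can grow to infinity during the gradient iterations.

2. On the validation and test sets, it is best to turn off gradient computation (for example with torch.no_grad()) to speed up the computation.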
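
Both remarks can be illustrated in a few lines. This is a sketch under assumptions: the feature tensors and the toy nn.Linear model are made up, and standardizing with training-set statistics is one common convention rather than something the notes prescribe.

```python
import torch
from torch import nn

torch.manual_seed(0)
train_x = torch.randn(70, 4) * 50 + 10     # hypothetical unscaled training features
test_x = torch.randn(30, 4) * 50 + 10      # hypothetical unscaled test features

# Remark 1: standardize, using statistics from the training split only
mean, std = train_x.mean(dim=0), train_x.std(dim=0)
train_x = (train_x - mean) / std
test_x = (test_x - mean) / std

# Remark 2: evaluate without tracking gradients
model = nn.Linear(4, 1)                     # toy model, assumed for illustration
model.eval()
with torch.no_grad():                       # no computation graph is built in this block
    pred = model(test_x)

print(pred.requires_grad)
```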

11. Stochastic gradient descent (SGD)

CLASS torch.optim.SGD(params,lr,momentum=0,dampening=0,weight_decay=0,nesterov=False,*,maximize=False,foreach=None,differentiable=False)

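params	The model parameters to optimize, usually model.parameters()
lr	Learning rate, a float
momentum	Momentum factor, a value between 0 and 1 that controls how much earlier gradients influence the current update. The larger it is, the stronger the influence of earlier gradients, which can help the optimizer move past shallow local optima and often speeds up convergence; it does not guarantee reaching the global optimum.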
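
To show what the momentum factor does, the sketch below repeats the update by hand and compares it with torch.optim.SGD. It assumes the defaults dampening=0, weight_decay=0 and nesterov=False; the toy objective and the names w_manual and velocity are illustrative only.

```python
import torch

lr, momentum = 0.1, 0.9
w = torch.tensor([1.0, -2.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=lr, momentum=momentum)

w_manual = w.detach().clone()               # hand-maintained copy of the parameter
velocity = torch.zeros_like(w_manual)       # running accumulation of past gradients

for _ in range(3):
    opt.zero_grad()
    loss = (w ** 2).sum()                   # toy objective, gradient is 2 * w
    loss.backward()
    opt.step()

    grad = 2 * w_manual
    velocity = momentum * velocity + grad   # earlier gradients decay by `momentum` each step
    w_manual = w_manual - lr * velocity

print(torch.allclose(w.detach(), w_manual))
```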

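12. Initial parameters

model.parameters( ) --- returns an iterator over the model's parameters (without their names). The initial parameter values are generated randomly.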

model1=nn.Conv2d(1,1,3)
model1.parameters()
# the returned object is a generator
<generator object Module.parameters at 0x7f92d5c87c10>

list(model1.parameters())
# after converting it to a list, the parameters can be inspected
[Parameter containing:
 tensor([[[[ 0.2116, -0.1535,  0.1537],
           [-0.0279,  0.1439, -0.0484],
           [ 0.2720, -0.2906,  0.2307]]]], requires_grad=True),
 Parameter containing:
 tensor([0.2716], requires_grad=True)]

model.named_parameters( ) --- returns an iterator over (name, parameter) pairs. The initial parameter values are generated randomly.

list(nn.Conv2d(1,1,3).named_parameters())
# Output
[('weight',
  Parameter containing:
  tensor([[[[-0.2901, -0.1788, -0.3073],
            [ 0.0752,  0.1680,  0.3188],
            [-0.1486, -0.3268,  0.0718]]]], requires_grad=True)),
 ('bias',
  Parameter containing:
  tensor([-0.2626], requires_grad=True))]

model.weight and model.bias access the layer's weight and bias parameters respectively.

model1.weight
# Output
Parameter containing:
tensor([[[[ 0.2116, -0.1535,  0.1537],
          [-0.0279,  0.1439, -0.0484],
          [ 0.2720, -0.2906,  0.2307]]]], requires_grad=True)


model1.bias
# Output
Parameter containing:
tensor([0.2716], requires_grad=True)

model.state_dict( ) --- returns the model's parameters as an ordered dictionary.

model1.state_dict()
# Output
OrderedDict([('weight',
              tensor([[[[ 0.2116, -0.1535,  0.1537],
                        [-0.0279,  0.1439, -0.0484],
                        [ 0.2720, -0.2906,  0.2307]]]])),
             ('bias', tensor([0.2716]))])

model1.state_dict()["weight"]
# Output
tensor([[[[ 0.2116, -0.1535,  0.1537],
          [-0.0279,  0.1439, -0.0484],
          [ 0.2720, -0.2906,  0.2307]]]])
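
One common use of state_dict is copying parameters between two models with the same architecture via load_state_dict. A minimal sketch, where the names src and dst are made up for illustration:

```python
import torch
from torch import nn

src = nn.Conv2d(1, 1, 3)
dst = nn.Conv2d(1, 1, 3)                  # same architecture, different random initialization

dst.load_state_dict(src.state_dict())     # copy weight and bias by name

same = all(torch.equal(a, b) for a, b in zip(src.parameters(), dst.parameters()))
names = [name for name, _ in dst.named_parameters()]
print(same, names)
```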

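13. Batch normalization

CLASS torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, device=None, dtype=None) -- normalizes (standardizes) a 4D input (N,C,H,W). num_features is the number of channels C. eps is a small constant added to the variance in the denominator for numerical stability. momentum is the weight given to each new batch's statistics when updating the running mean and variance; it does not matter for a single normalization pass.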

sample=torch.arange(9).reshape(1,1,3,3).type(torch.float)
print(sample)
# Output
tensor([[[[0., 1., 2.],
          [3., 4., 5.],
          [6., 7., 8.]]]])

net=nn.BatchNorm2d(1)  # the number of channels must be given when instantiating
net(sample)
# Output
tensor([[[[-1.5492, -1.1619, -0.7746],
          [-0.3873,  0.0000,  0.3873],
          [ 0.7746,  1.1619,  1.5492]]]], grad_fn=<NativeBatchNormBackward0>)

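CLASS torch.nn.BatchNorm1d(num_features, eps=1e-05, momentum=0.1, device=None, dtype=None) -- normalizes (standardizes) a 2D input (N,C). num_features is the number of features C. eps is a small constant added to the variance in the denominator for numerical stability. momentum is the weight given to each new batch's statistics when updating the running mean and variance; it does not matter for a single normalization pass.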

sample=torch.arange(9).reshape(3,3).type(torch.float)
print(sample)
# Output
tensor([[0., 1., 2.],
        [3., 4., 5.],
        [6., 7., 8.]])


net=nn.BatchNorm1d(3)
net(sample)
# Output
tensor([[-1.2247, -1.2247, -1.2247],
        [ 0.0000,  0.0000,  0.0000],
        [ 1.2247,  1.2247,  1.2247]], grad_fn=<NativeBatchNormBackward0>)

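Remark: In training mode BatchNorm1d requires N>1 for an (N,C) input. Each of the C features is standardized separately, using the mean and variance computed across the batch dimension N.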
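
The BatchNorm1d output above can be reproduced by hand, which makes clear which dimension the statistics are taken over. The sketch assumes a freshly created layer in training mode, where the learnable scale is 1 and the shift is 0.

```python
import torch
from torch import nn

sample = torch.arange(9).reshape(3, 3).type(torch.float)
net = nn.BatchNorm1d(3)                     # training mode by default
out = net(sample)

mean = sample.mean(dim=0)                   # per-feature mean over the batch dimension N
var = sample.var(dim=0, unbiased=False)     # biased variance is used for normalization
manual = (sample - mean) / torch.sqrt(var + net.eps)

print(torch.allclose(out, manual, atol=1e-6))
```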

14. Adam algorithm

CLASS torch.optim.Adam(params,lr=0.001,betas=(0.9,0.999),eps=1e-08,weight_decay=0,amsgrad=False,*,foreach=None,maximize=False,capturable=False,differentiable=False,fused=None)

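15. Normalization function

torch.nn.functional.normalize(input,p=2.0,dim=1,eps=1e-12,out=None), where input is a tensor of any shape, p is the exponent of the Lp norm, and dim is the dimension along which to normalize. The default dim=1 means that, for a 2D input, each row is scaled to unit norm.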
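
A small worked example may help here. The input values are chosen so the norms are easy to check by hand: the first call divides each row by its L2 norm, the second divides each column by its L1 norm.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[3.0, 4.0], [6.0, 8.0]])

rows = F.normalize(x, p=2.0, dim=1)   # row norms are 5 and 10
cols = F.normalize(x, p=1.0, dim=0)   # column L1 norms are 9 and 12

print(rows)   # each row becomes [0.6, 0.8]
print(cols)   # each column becomes [1/3, 2/3]
```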

16. Multiplication functions

torch.mul(input,other,*,out=None): multiplies input and other element by element

a=torch.tensor([[2,0],[1,-1]])
b=torch.tensor([[1,0],[0,1]])
print(torch.mul(a,b))
print(a.mul(b))
Output:
tensor([[ 2,  0],
        [ 0, -1]])

torch.matmul(input,other,*,out=None): matrix product of input and other

torch.matmul(a,b)
torch.mm(a,b)

Output:
tensor([[ 2,  0],
        [ 1, -1]])
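
In practice the operator forms are used more often than the function calls. The sketch below checks that * matches torch.mul and @ matches torch.matmul, and shows one difference between matmul and mm: matmul broadcasts over leading batch dimensions, while mm accepts only 2D matrices. The batch shapes are arbitrary.

```python
import torch

a = torch.tensor([[2, 0], [1, -1]])
b = torch.tensor([[1, 0], [0, 1]])

elementwise_same = torch.equal(a * b, torch.mul(a, b))   # * is element-wise
matmul_same = torch.equal(a @ b, torch.matmul(a, b))     # @ is matrix multiplication

# A batch of four 2x3 matrices times one 3x5 matrix
batch = torch.ones(4, 2, 3)
other = torch.ones(3, 5)
batched_shape = torch.matmul(batch, other).shape

print(elementwise_same, matmul_same, batched_shape)
```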

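17. Concatenation function

torch.cat(tensors,dim=0,*,out=None), where tensors is a sequence of tensors. cat joins them into a single tensor along the given dimension dim; the tensors must have the same shape in every other dimension.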

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6]])

# Concatenate tensors a and b along dimension 0
c = torch.cat((a, b), dim=0)
print(c)
# Output:
# tensor([[1, 2],
#         [3, 4],
#         [5, 6]])

# Concatenate a and the transpose of b along dimension 1
d = torch.cat((a, b.t()), dim=1)
print(d)
# Output:
# tensor([[1, 2, 5],
#         [3, 4, 6]])

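XII. Images

1. RGB images

An RGB image can be represented by a 3D array such as P(400,300,3). The first two numbers, 400 and 300, carry the spatial information of the image: 400 is the number of rows and 300 is the number of columns.
The 3 is the number of channels for the three primary colors (red R, green G, blue B). A single slice, such as the (400,300,1) red-channel matrix, holds the intensity of red light at each pixel.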

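2. NHWC and NCHW

N is the batch size; C is the number of feature maps, also called the number of channels; H is the image height and W is the image width. PyTorch layers such as nn.Conv2d expect NCHW by default.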

Further reading: jie.hang, "Pytorch NCHW/NHWC 理解"
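
The two layouts differ only in axis order, so converting between them is a permute. A minimal sketch, with the batch size 8 picked arbitrarily and the 400x300 image size taken from the RGB example:

```python
import torch

nhwc = torch.zeros(8, 400, 300, 3)      # N=8 images, H=400, W=300, C=3
nchw = nhwc.permute(0, 3, 1, 2)         # reorder axes to (N, C, H, W)
back = nchw.permute(0, 2, 3, 1)         # and back to (N, H, W, C)

print(nchw.shape, back.shape)
```

permute returns a view with reordered strides rather than copying the data; calling .contiguous() afterwards gives a tensor laid out in memory in the new order.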

3.imshow

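Syntax: matplotlib.pyplot.imshow(X,cmap=None,norm=None,*,aspect=None,interpolation=None,alpha=None,vmin=None,vmax=None,origin=None,extent=None,interpolation_stage=None,filternorm=True,filterrad=4.0,resample=None,url=None,data=None,**kwargs)

X	Array or PIL image. Supported array shapes (M is the number of rows, N the number of columns): (M, N): scalar data, mapped to colors through the colormap. (M, N, 3): RGB values (0-1 float or 0-255 int). (M, N, 4): RGBA values (0-1 float or 0-255 int)
cmap	Short for colormap; used to map scalar (M, N) data to colors and ignored when X is already RGB(A)
alpha	A float between 0 and 1 that sets the transparency of the image (0 is fully transparent, 1 is fully opaque)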
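
The parameters in the table can be tried out with a small sketch. The random image data and the off-screen "Agg" backend are assumptions made so the example runs without a display; in a notebook the backend line can be dropped and plt.show() used instead.

```python
import matplotlib
matplotlib.use("Agg")            # off-screen backend, assumed here so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
rgb = rng.random((40, 30, 3))    # (M, N, 3) floats in [0, 1): drawn as RGB, cmap is ignored
gray = rgb.mean(axis=2)          # (M, N) scalar data: colored through the colormap

plt.subplot(1, 2, 1)
img_rgb = plt.imshow(rgb)
plt.subplot(1, 2, 2)
img_gray = plt.imshow(gray, cmap="gray", alpha=0.8)   # grayscale colormap, slightly transparent
plt.gcf().canvas.draw()          # render the figure
```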

XIII. Reading data from files / saving data

1. torch.save(obj, file_path)

x = torch.ones(3)
torch.save(x, '/Users/xuan/Desktop/x.py')
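# creates a binary data file on the desktop (despite the .py name it is not Python source; .pt is the usual extension)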

2. torch.load(file_path)

torch.load('/Users/xuan/Desktop/x.py')
# Output
tensor([1., 1., 1.])
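
A round trip that does not depend on a machine-specific path can be sketched with a temporary directory. The file names x.pt and bundle.pt are made up for illustration; torch.save also accepts containers such as a dict of tensors.

```python
import os
import tempfile
import torch

x = torch.ones(3)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "x.pt")             # hypothetical file name
    torch.save(x, path)
    loaded = torch.load(path)

    dict_path = os.path.join(tmp, "bundle.pt")   # hypothetical file name
    torch.save({"x": x, "y": torch.zeros(2)}, dict_path)
    bundle = torch.load(dict_path)

print(torch.equal(x, loaded), sorted(bundle.keys()))
```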

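XIV. Running programs on the GPU

By default, PyTorch creates data in main memory and computes with the CPU. Steps for using the GPU:

step1: Create the data on the GPU (specify device when creating the data, copy it to the GPU with data.cuda(), or use data.to(device)).

step2: Put the neural network model and the loss function on the GPU. The optimizer and the data loader do not need to be moved.

Results of computations on the GPU stay on the GPU by default, and tensors on the GPU and the CPU cannot be combined in one operation. For a tensor, to(device) returns a copy on the target device and leaves the original where it was, so the result has to be assigned, as in x = x.to(device). For a model (nn.Module), to(device) moves the parameters in place and returns the same model.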

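Check whether a GPU is available:    torch.cuda.is_available()

Number of GPUs:    torch.cuda.device_count()

Index of the current GPU:    torch.cuda.current_device()

Copy data to the GPU:      data.cuda(device=None), where device is the index of the GPU

Device that data currently lives on: data.device

Show graphics card information:        !nvidia-smi (in a notebook cell)

A common template for choosing the device: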

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.tensor([1, 2, 3], device=device)

# or

x = torch.tensor([1, 2, 3]).to(device)
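
Extending the template to a full training step might look like the sketch below. The toy nn.Linear model and the random data are assumptions for illustration; the code falls back to the CPU when no GPU is available.

```python
import torch
from torch import nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(3, 1).to(device)          # moves the model's parameters to the device
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # created after the model is moved

x = torch.randn(4, 3).to(device)            # for tensors, keep the value returned by to()
y = torch.randn(4, 1, device=device)        # or create the tensor directly on the device

optimizer.zero_grad()
loss = loss_fn(model(x), y)                 # model, input and target are on the same device
loss.backward()
optimizer.step()

print(loss.device)
```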

On Google Colab, selecting Edit > Notebook settings > GPU gives free GPU acceleration.