This article is a running collection of questions I found confusing or easy to forget while studying deep learning. Updated from time to time.
1. How kernel_size, padding, and stride affect a convolutional layer's input and output shapes in deep learning
Let the image shape be h*w, the kernel size k, the padding p, and the stride s.
The output shape is h_out = floor((h + 2p - k) / s) + 1, w_out = floor((w + 2p - k) / s) + 1.
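As a sanity check, the formula can be wrapped in a tiny helper and compared against an actual nn.Conv2d (the helper name conv_out is mine, not a PyTorch API):

```python
import torch
from torch import nn

# Hypothetical helper implementing the formula above:
# out = floor((size + 2p - k) / s) + 1
def conv_out(size, k, p=0, s=1):
    return (size + 2 * p - k) // s + 1

# Compare with a real Conv2d: 32x32 input, k=3, p=1, s=2
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1, stride=2)
x = torch.randn(1, 3, 32, 32)
print(conv_out(32, k=3, p=1, s=2))  # 16
print(conv(x).shape)                # torch.Size([1, 8, 16, 16])
```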
2. nn.Flatten and flatten come up often in networks; here is a breakdown of both.
nn.Flatten collapses the dimensions from start_dim through end_dim into a single dimension, i.e. it multiplies those dimension sizes together.
nn.Flatten takes two arguments, start_dim and end_dim, the first and last dimensions to flatten. start_dim defaults to 1.
torch.flatten behaves essentially the same, except that its start_dim defaults to 0.
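A minimal sketch of the difference in defaults:

```python
import torch
from torch import nn

t = torch.rand(2, 3, 4)
# nn.Flatten keeps dim 0 (the batch dim) by default: start_dim=1, end_dim=-1
print(nn.Flatten()(t).shape)   # torch.Size([2, 12])
# torch.flatten defaults to start_dim=0, collapsing everything
print(torch.flatten(t).shape)  # torch.Size([24])
```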
flatten example:
import torch
t1 = torch.rand([1,3,128,128])
print(t1.size())
print(t1)
>torch.Size([1, 3, 128, 128])
tensor([[[[0.7181, 0.7940, 0.8491, ..., 0.2852, 0.1237, 0.4918],
[0.3057, 0.6594, 0.1851, ..., 0.1703, 0.2182, 0.7380],
[0.2997, 0.8416, 0.2333, ..., 0.4604, 0.8216, 0.0224],
...,
[0.8779, 0.6553, 0.9430, ..., 0.7996, 0.1001, 0.6024],
[0.0629, 0.8335, 0.2517, ..., 0.9505, 0.6166, 0.8198],
[0.7135, 0.2935, 0.7577, ..., 0.4647, 0.9408, 0.7737]],
[[0.5488, 0.7251, 0.4430, ..., 0.0023, 0.5326, 0.5516],
[0.3260, 0.9030, 0.0071, ..., 0.1855, 0.0209, 0.7478],
[0.8567, 0.4301, 0.9661, ..., 0.5774, 0.3355, 0.4649],
...,
[0.5537, 0.4356, 0.0602, ..., 0.8001, 0.9198, 0.7765],
[0.0170, 0.5507, 0.4464, ..., 0.2920, 0.6682, 0.7626],
[0.3388, 0.0856, 0.9224, ..., 0.9605, 0.0346, 0.8537]],
[[0.4628, 0.7194, 0.6533, ..., 0.0895, 0.6685, 0.3940],
[0.6205, 0.0436, 0.5674, ..., 0.6176, 0.1962, 0.4934],
[0.1044, 0.4129, 0.3382, ..., 0.0396, 0.4663, 0.9524],
...,
[0.7869, 0.6005, 0.5627, ..., 0.1468, 0.5692, 0.5209],
[0.5684, 0.5694, 0.3933, ..., 0.4613, 0.4954, 0.3585],
[0.7571, 0.5818, 0.9687, ..., 0.7675, 0.4926, 0.2255]]]])
t2 = t1.flatten()
print(t2.size())
print(t2)
>torch.Size([49152])
tensor([0.7181, 0.7940, 0.8491, ..., 0.7675, 0.4926, 0.2255])
For an nn.Flatten example, see the worked example in section 3.
3. A worked example combining sections 1 and 2
import torch
import torch.nn as nn
net = nn.Sequential(
nn.Conv2d(1,6,kernel_size = 5,padding=2),nn.Sigmoid(),
nn.AvgPool2d(kernel_size = 2,stride = 2),
nn.Conv2d(6,16,kernel_size = 5),nn.Sigmoid(),
nn.AvgPool2d(kernel_size = 2,stride=2),
nn.Flatten(),
nn.Linear(16*5*5,120),nn.Sigmoid(),
nn.Linear(120,84),nn.Sigmoid(),
nn.Linear(84,10)
)
The code above builds LeNet, which contains 2 convolutional layers, 2 average-pooling layers, and 3 linear layers.
Let us analyze the input and output of each layer.
Assume the input is:
input_data = torch.randn(1,1,28,28)
After the first convolutional layer, the formula from section 1 gives (28 + 2*2 - 5)/1 + 1 = 28.
Checking against the actual output, the result matches the formula:
net = nn.Sequential(
nn.Conv2d(1,6,kernel_size=5,padding=2),nn.Sigmoid()
)
input_data = torch.randn(1,1,28,28)
output_data = net(input_data)
print(output_data.shape)
>>torch.Size([1,6,28,28])
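The same formula covers the pooling layer that follows: AvgPool2d with k=2, s=2, p=0 gives (28 - 2)/2 + 1 = 14. A small sketch extending the net by that one layer to confirm:

```python
import torch
from torch import nn

net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
)
input_data = torch.randn(1, 1, 28, 28)
# Conv keeps 28x28 (padding=2), then the 2x2/stride-2 pool halves it
print(net(input_data).shape)  # torch.Size([1, 6, 14, 14])
```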
Proceeding layer by layer down to nn.Flatten(): since this Flatten takes no arguments, it flattens from dimension 1 through the last dimension by default, so the result is the product of those dimension sizes, 16*5*5 = 400, and the output should be torch.Size([1, 400]).
The execution result below matches the calculation.
import torch
from torch import nn
net = nn.Sequential(
nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
nn.AvgPool2d(kernel_size=2, stride=2),
nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
nn.AvgPool2d(kernel_size=2, stride=2),
nn.Flatten(),
)
input_data = torch.randn(1,1,28,28)
output_data = net(input_data)
print(output_data.shape)
>>torch.Size([1,400])
Code that prints the output shape of every layer in the full network:
X = torch.rand(size=(1, 1, 28, 28), dtype=torch.float32)
for layer in net:
    X = layer(X)
    print(layer.__class__.__name__, 'output shape: \t', X.shape)
Output:
Conv2d output shape:     torch.Size([1, 6, 28, 28])
Sigmoid output shape:    torch.Size([1, 6, 28, 28])
AvgPool2d output shape:  torch.Size([1, 6, 14, 14])
Conv2d output shape:     torch.Size([1, 16, 10, 10])
Sigmoid output shape:    torch.Size([1, 16, 10, 10])
AvgPool2d output shape:  torch.Size([1, 16, 5, 5])
Flatten output shape:    torch.Size([1, 400])
Linear output shape:     torch.Size([1, 120])
Sigmoid output shape:    torch.Size([1, 120])
Linear output shape:     torch.Size([1, 84])
Sigmoid output shape:    torch.Size([1, 84])
Linear output shape:     torch.Size([1, 10])
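Once the shapes check out, it is a short step to totaling the network's trainable parameters with parameters() and numel(); a sketch for the LeNet above:

```python
import torch
from torch import nn

net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),
    nn.Linear(120, 84), nn.Sigmoid(),
    nn.Linear(84, 10),
)
# numel() counts the entries of each parameter tensor (weights and biases)
total = sum(p.numel() for p in net.parameters() if p.requires_grad)
print(total)  # 61706
```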
May 6, 2024
4. PyTorch built-in functions for viewing a network's structure or parameters after building it (I often forget these)
Take a slice of the LeNet structure:
net = nn.Sequential(
nn.Conv2d(1,6,kernel_size=5,padding=2),nn.Sigmoid(),
nn.AvgPool2d(kernel_size=2,stride=2),
)
1. model.modules()
Recursively iterates over all modules in the model, including the model itself.
for i in net.modules():
    print(i)
>>>
Sequential(
(0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(1): Sigmoid()
(2): AvgPool2d(kernel_size=2, stride=2, padding=0)
)
Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
Sigmoid()
AvgPool2d(kernel_size=2, stride=2, padding=0)
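If the iteration order alone is not enough, named_modules() pairs each module with its dotted name (for a Sequential, the indices '0', '1', ...; the root has the empty name):

```python
import torch
from torch import nn

net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
)
for name, m in net.named_modules():
    print(repr(name), '->', m.__class__.__name__)
# '' -> Sequential
# '0' -> Conv2d
# '1' -> Sigmoid
# '2' -> AvgPool2d
```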
2. model.children()
model.children() only iterates over the model's direct child layers.
for i in net.children():
    print(i)
>>>
Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
Sigmoid()
AvgPool2d(kernel_size=2, stride=2, padding=0)
3. model.parameters()
Returns the parameters of all the lowest-level layers in the network.
for i in net.parameters():
    print(i)
>>>
Parameter containing:
tensor([[[[ 0.1587, -0.1602, 0.0730, -0.0558, 0.1954],
[-0.0500, -0.0143, -0.0612, 0.0379, 0.1019],
[ 0.0468, 0.0336, -0.1118, -0.1710, 0.0816],
[ 0.0505, 0.0480, -0.1639, 0.1576, -0.0411],
[ 0.1380, 0.1323, -0.0874, 0.0520, 0.1683]]],
...., requires_grad=True)
Parameter containing:
tensor([-0.1442, 0.0397, 0.1659, -0.1879, -0.1499, 0.1589],
requires_grad=True)
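Relatedly, named_parameters() attaches the owning layer's name to each parameter, which makes output like the above much easier to read; a sketch printing just the shapes:

```python
import torch
from torch import nn

net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
)
# Only the Conv2d holds parameters: one weight tensor and one bias vector
for name, p in net.named_parameters():
    print(name, tuple(p.shape))
# 0.weight (6, 1, 5, 5)
# 0.bias (6,)
```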
May 7, 2024