Are Your Neural Network Basics Up to Scratch? (PyTorch Edition)

阿狗哲哲

Last modified 2023-09-06 13:41:57

Category: Machine Learning and Deep Learning. Tags: neural networks, deep learning, pytorch

First published 2023-01-16 20:55:42

Article link: https://blog.csdn.net/qq_52438590/article/details/128704530

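1. Convolution

1.1 torch.nn.functional

1.1.1 Parameters

The input must be a 4D tensor: batch size, channels, height, width. Note that F.conv2d itself receives the kernel as a weight tensor (plus bias, stride, padding, dilation and groups); kernel_size and padding_mode in the list below are arguments of the nn.Conv2d layer.

kernel_size (int or tuple) – Size of the convolving kernel // the convolution kernel

stride (int or tuple, optional) – Stride of the convolution. Default: 1 // the step size, given as an int or a tuple; a tuple such as (3, 4) means a vertical stride of 3 and a horizontal stride of 4

padding (int, tuple or str, optional) – Padding added to all four sides of the input. Default: 0

padding_mode (str, optional) – 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'

dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1

groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1 // controls grouped convolution

bias (bool, optional) – If True, adds a learnable bias to the output. Default: True // whether to add a bias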

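1.1.2 The convolution process

The figure shows a convolution with a stride of 1 and a 3*3 kernel on a 5*5 input: the output is 3*3. With a stride of 2 the output becomes 2*2.

1.1.3 Code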


# -*- coding: utf-8 -*-
# @Author  : 大叔azhe
# @Time    : 2023/1/16 15:09
# @Function:
import torch
import torch.nn.functional as F

# 5x5 single-channel input
input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])

# 3x3 convolution kernel
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])

print(input.shape)
# conv2d expects the input as (batch, channels, height, width)
# and the weight as (out_channels, in_channels, kernel_h, kernel_w)
input = torch.reshape(input, (1, 1, 5, 5))
kernel = torch.reshape(kernel, (1, 1, 3, 3))
print(input.shape)

# no bias, stride 1
output = F.conv2d(input, kernel, bias=None, stride=1)
print(output)

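Output:

torch.Size([5, 5])
torch.Size([1, 1, 5, 5])
tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])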
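
To check the stride claim from section 1.1.2, the following minimal sketch reruns the same input and kernel with stride 1 and stride 2. The names `x`, `k`, `out_s1` and `out_s2` are chosen here for illustration, and the tensors are created as float32, which is an assumption made to stay on the most widely supported dtype.

```python
import torch
import torch.nn.functional as F

# Same values as the example above, stored as float32 (assumption for this sketch)
x = torch.tensor([[1, 2, 0, 3, 1],
                  [0, 1, 2, 3, 1],
                  [1, 2, 1, 0, 0],
                  [5, 2, 3, 1, 1],
                  [2, 1, 0, 1, 1]], dtype=torch.float32).reshape(1, 1, 5, 5)
k = torch.tensor([[1, 2, 1],
                  [0, 1, 0],
                  [2, 1, 0]], dtype=torch.float32).reshape(1, 1, 3, 3)

out_s1 = F.conv2d(x, k, stride=1)  # kernel moves one cell at a time
out_s2 = F.conv2d(x, k, stride=2)  # kernel moves two cells at a time

print(out_s1.shape)  # torch.Size([1, 1, 3, 3])
print(out_s2.shape)  # torch.Size([1, 1, 2, 2])
print(out_s2)        # the four corner values of the stride-1 result
```

With stride 2 the kernel only visits every second position, so the result consists of the corner entries of the stride-1 output.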

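1.1.4 Using padding

For example, with padding=1 one extra row and one extra column are added on every side of the input. The padded cells are filled with 0 by default.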
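
As a rough illustration of zero padding, the sketch below pads the 5*5 example input by hand with `F.pad` and compares that with passing `padding=1` to `F.conv2d`. The variable names and the float32 dtype are assumptions made for this example.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1, 2, 0, 3, 1],
                  [0, 1, 2, 3, 1],
                  [1, 2, 1, 0, 0],
                  [5, 2, 3, 1, 1],
                  [2, 1, 0, 1, 1]], dtype=torch.float32).reshape(1, 1, 5, 5)
k = torch.tensor([[1, 2, 1],
                  [0, 1, 0],
                  [2, 1, 0]], dtype=torch.float32).reshape(1, 1, 3, 3)

# Explicit border of zeros: (left, right, top, bottom)
padded = F.pad(x, (1, 1, 1, 1))
out_manual = F.conv2d(padded, k, stride=1)

# Let conv2d add the same border internally
out_pad = F.conv2d(x, k, stride=1, padding=1)

print(padded.shape)                      # torch.Size([1, 1, 7, 7])
print(out_pad.shape)                     # torch.Size([1, 1, 5, 5])
print(torch.equal(out_pad, out_manual))  # True
```

Because the border is filled with zeros, the output keeps the 5*5 size of the input.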

1.2 torch.nn

1.2.1 Parameters

Parameters:

in_channels (int) – Number of channels in the input image

out_channels (int) – Number of channels produced by the convolution

kernel_size (int or tuple) – Size of the convolving kernel

stride (int or tuple, optional) – Stride of the convolution. Default: 1

padding (int, tuple or str, optional) – Padding added to all four sides of the input. Default: 0

padding_mode (str, optional) – 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'

dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1

groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1

bias (bool, optional) – If True, adds a learnable bias to the output. Default: True

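Input channels:

As the figure shows, the input usually has three dimensions (channels, height, width), or four once the batch size is added.

For example, with out_channels set to 2, two convolution kernels are created. Each one is convolved with the input, and the two results are stacked along the channel dimension to form the output.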
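
The stacking behaviour can be sketched as follows, assuming a random 1-channel 5*5 input (an assumption for this example). The layer is compared with applying each of its two kernels separately and concatenating the results.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)
conv = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=3)
x = torch.randn(1, 1, 5, 5)
y = conv(x)

print(conv.weight.shape)  # torch.Size([2, 1, 3, 3]): two 3x3 kernels
print(y.shape)            # torch.Size([1, 2, 3, 3]): two output channels

# Apply each kernel on its own, then stack the results along the channel dimension
parts = [F.conv2d(x, conv.weight[i:i + 1], conv.bias[i:i + 1]) for i in range(2)]
stacked = torch.cat(parts, dim=1)
print(torch.allclose(y, stacked, atol=1e-5))
```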

1.2.2 A simple convolution


# -*- coding: utf-8 -*-
# @Author  : 大叔azhe
# @Time    : 2023/1/16 15:52
# @Function:
import torch
import torchvision
from torch import nn
from torch.utils.data import DataLoader
from torch.nn import Conv2d

# CIFAR10 test split; ToTensor() turns every image into a 3x32x32 tensor
dataset = torchvision.datasets.CIFAR10("../azhe/data", train=False, transform=torchvision.transforms.ToTensor(),
                                       download=True)

# batches of 64 images
dataloader = DataLoader(dataset, batch_size=64)


class Azhe(nn.Module):
    def __init__(self):
        super(Azhe, self).__init__()
        # 3 input channels, 6 output channels, 3x3 kernel, stride 1, no padding
        self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0, groups=1)

    def forward(self, x):
        x = self.conv1(x)
        return x


azhe = Azhe()

for data in dataloader:
    imgs, targets = data
    output = azhe(imgs)
    # [64, 6, 30, 30] for full batches; the last batch holds only 16 images
    print(output.shape)

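The Conv2d arguments used above are:

in_channels: number of channels of the input tensor, 3 here

out_channels: number of channels of the output tensor, 6 here

kernel_size: size of the convolution kernel, 3*3 here

stride: the step size, 1 here

padding: 0 here, meaning no padding. Without padding, a kernel position that reaches past the image border, like the step shown in the figure below, cannot be computed.

dilation: the spacing between kernel elements; the default of 1 means the kernel is not spread out

groups: controls grouped convolution; the default of 1 is an ordinary convolution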

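Computing the output size:

H = floor((input height - kernel height + 2 * padding height) / stride) + 1

W = floor((input width - kernel width + 2 * padding width) / stride) + 1

Channels = number of kernels

If the input is a 32*32*3 image and it is convolved with ten 5*5*3 filters at a stride of 1 and a padding of 2, what is the output size?

Calculation: (32 - 5 + 2*2) / 1 + 1 = 32, so the output size is 32*32*10. With this padding, the convolution keeps the height and width of the feature map unchanged.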
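
The calculation above can be checked with a short sketch. The helper `conv_out_size` is a hypothetical function written for this example (it is not part of PyTorch), and the random input stands in for a real image.

```python
import torch
from torch import nn


def conv_out_size(size, kernel, stride=1, padding=0):
    """Output length along one dimension: floor((size - kernel + 2*padding) / stride) + 1."""
    return (size - kernel + 2 * padding) // stride + 1


# Ten 5x5 filters over a 3-channel input, stride 1, padding 2
conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=5, stride=1, padding=2)
y = conv(torch.randn(1, 3, 32, 32))

print(conv_out_size(32, 5, stride=1, padding=2))  # 32
print(conv_out_size(5, 3, stride=2))              # 2, the stride-2 case from section 1.1.2
print(y.shape)                                    # torch.Size([1, 10, 32, 32])
```

PyTorch orders the dimensions as batch, channels, height, width, so the 32*32*10 result appears as [1, 10, 32, 32].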

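2. Max pooling

Max pooling is a form of downsampling. As the name suggests, its main purpose is to shrink the image, much like downscaling 1080p video to 720p.

2.1 Parameters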

Parameters:

kernel_size (Union[int, Tuple[int, int]]) – the size of the window to take a max over

stride (Union[int, Tuple[int, int]]) – the stride of the window. Default value is kernel_size

padding (Union[int, Tuple[int, int]]) – implicit zero padding to be added on both sides

dilation (Union[int, Tuple[int, int]]) – a parameter that controls the stride of elements in the window

return_indices (bool) – if True, will return the max indices along with the outputs. Useful for torch.nn.MaxUnpool2d later

ceil_mode (bool) – when True, will use ceil instead of floor to compute the output shape

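2.1.1 dilation

dilation works the same way as in a convolutional layer: as the figure shows, the elements of the window are spread apart (in convolution this is known as dilated, or atrous, convolution). It is rarely used.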

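2.1.2 ceil_mode

There are two modes, floor and ceil. As the names suggest, floor rounds the output size down and ceil rounds it up. The default is floor (ceil_mode=False).

As shown above, the pooling window first covers the top-left 3*3 block and takes its maximum, 2. It then moves three cells to the right (the stride defaults to kernel_size), where only 6 cells are left. With ceil_mode=True these 6 cells are kept and their maximum, 3, is output; with ceil_mode=False they are discarded.

The window then moves down three cells and the same rule is applied.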
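
The two modes can be compared with a small sketch. It assumes the figure uses the same 5*5 matrix as the convolution example earlier in this article, which is consistent with the maxima 2 and 3 quoted above but is not confirmed by the text.

```python
import torch
from torch import nn

# Assumed input: the 5x5 matrix from the convolution example
x = torch.tensor([[1, 2, 0, 3, 1],
                  [0, 1, 2, 3, 1],
                  [1, 2, 1, 0, 0],
                  [5, 2, 3, 1, 1],
                  [2, 1, 0, 1, 1]], dtype=torch.float32).reshape(1, 1, 5, 5)

pool_floor = nn.MaxPool2d(kernel_size=3, ceil_mode=False)  # drop incomplete windows
pool_ceil = nn.MaxPool2d(kernel_size=3, ceil_mode=True)    # keep incomplete windows

out_floor = pool_floor(x)
out_ceil = pool_ceil(x)

print(out_floor)  # a single value: 2
print(out_ceil)   # a 2x2 result: 2, 3 in the first row and 5, 1 in the second
```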

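The example below shows how the size of a batch of images loaded by a DataLoader changes after one max pooling pass.

MaxPool does not change the number of channels; it only reduces the height and width of the image.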
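
A minimal sketch of that shape change, using a random tensor as a stand-in for one CIFAR10 batch. The kernel size of 3 is an assumption, since the original figure is not available.

```python
import torch
from torch import nn

batch = torch.randn(64, 3, 32, 32)  # stand-in for one batch of 64 RGB 32x32 images
pool = nn.MaxPool2d(kernel_size=3)  # stride defaults to kernel_size, ceil_mode=False

pooled = pool(batch)
print(batch.shape)   # torch.Size([64, 3, 32, 32])
print(pooled.shape)  # torch.Size([64, 3, 10, 10]): same channels, smaller height and width
```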

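3. Non-linear Activations

3.1 ReLU activation function

ReLU is about the simplest activation function and also the most widely used: negative inputs are set to 0 and positive inputs are left unchanged.

3.1.1 The inplace parameter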

Parameters:

inplace (bool) – can optionally do the operation in-place. Default: False

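Its effect is shown in the figure below: with inplace=True the original input variable is overwritten, while with inplace=False the input is left untouched and a new tensor is returned.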
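
The difference can be sketched with two copies of a small tensor; the values and variable names are chosen here for illustration.

```python
import torch
from torch import nn

a = torch.tensor([[1.0, -0.5], [-1.0, 3.0]])
b = a.clone()

out_a = nn.ReLU(inplace=False)(a)  # a keeps its negative entries
out_b = nn.ReLU(inplace=True)(b)   # b itself is overwritten

print(a)      # still contains -0.5 and -1.0
print(out_a)  # negatives replaced by 0
print(b)      # negatives replaced by 0 in the original variable
print(out_b.data_ptr() == b.data_ptr())  # True: the result shares b's memory
```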

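3.2 Sigmoid activation function

As the figure above shows, sigmoid maps the originally linear values onto a smooth S-shaped curve with outputs between 0 and 1.

$$\mathrm{Sigmoid}(x) = \sigma(x) = \frac{1}{1 + \exp(-x)}$$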
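
As a quick check of the formula, the sketch below evaluates 1 / (1 + exp(-x)) by hand and compares it with `torch.sigmoid`. The sample values are arbitrary.

```python
import torch

x = torch.tensor([-2.0, 0.0, 2.0])

manual = 1 / (1 + torch.exp(-x))  # the formula written out
builtin = torch.sigmoid(x)        # PyTorch's implementation

print(manual)
print(builtin)
print(torch.allclose(manual, builtin))  # True
```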

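- Batch normalization: BatchNorm2d

Applies batch normalization to a 4D input. The formula is

$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$

Normalization is applied per channel over all samples in the batch: with C channels, no matter how many samples the batch holds, the mean and variance are computed for each channel separately, C times in total.

The output below shows an input tensor followed by the result of normalizing it: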

tensor([[[[1., 2., 0., 3., 1.],
[0., 1., 2., 3., 1.],
[1., 2., 1., 0., 0.],
[5., 2., 3., 1., 1.],
[2., 1., 0., 1., 1.]]]])
tensor([[[[-0.3430, 0.5145, -1.2005, 1.3720, -0.3430],
[-1.2005, -0.3430, 0.5145, 1.3720, -0.3430],
[-0.3430, 0.5145, -0.3430, -1.2005, -1.2005],
[ 3.0870, 0.5145, 1.3720, -0.3430, -0.3430],
[ 0.5145, -0.3430, -1.2005, -0.3430, -0.3430]]]],
grad_fn=<NativeBatchNormBackward0>)
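
The printout above can be reproduced with a sketch along these lines. It assumes a freshly created `nn.BatchNorm2d(1)` in its default training mode, which matches the numbers shown but is not stated in the original.

```python
import torch
from torch import nn

x = torch.tensor([[1, 2, 0, 3, 1],
                  [0, 1, 2, 3, 1],
                  [1, 2, 1, 0, 0],
                  [5, 2, 3, 1, 1],
                  [2, 1, 0, 1, 1]], dtype=torch.float32).reshape(1, 1, 5, 5)

bn = nn.BatchNorm2d(num_features=1)  # one channel; modules start in training mode
y = bn(x)

print(x)
print(y)  # mean 1.4 and variance 1.36 are removed, e.g. 1 -> about -0.3430
```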

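4. Worked example

As in the example above, during the forward pass we can compute the mean and variance of each channel.

With affine=False, the output is simply the input normalized by that mean and variance. With affine=True (the default in PyTorch), the normalized value is additionally scaled and shifted to give y_i.

The parameters γ and β are adjusted continually through backpropagation; they are initialized to 1 and 0 respectively.

Batch normalization rescales the feature values produced by a convolution so that each channel has roughly zero mean and unit variance. This is similar in spirit to resizing every image to 224*224 during preprocessing: the goal is to speed up training.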
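
The per-channel computation can be written out by hand and compared with the layer. This is a sketch under the assumption of a random input with 4 samples and 3 channels; the variable names are illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.randn(4, 3, 5, 5)  # N=4 samples, C=3 channels

bn = nn.BatchNorm2d(3)       # affine=True by default
print(bn.weight)             # gamma, initialized to ones
print(bn.bias)               # beta, initialized to zeros

# One mean and one (biased) variance per channel, taken over batch, height and width
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
x_hat = (x - mean) / torch.sqrt(var + bn.eps)
manual = bn.weight.view(1, 3, 1, 1) * x_hat + bn.bias.view(1, 3, 1, 1)

print(torch.allclose(bn(x), manual, atol=1e-5))  # True

bn_plain = nn.BatchNorm2d(3, affine=False)
print(bn_plain.weight is None)  # True: no gamma or beta to learn
```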

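4.1 Dropout

Dropout randomly sets elements of the input to 0 (Dropout2d zeroes entire channels instead). Put simply, during the forward pass each neuron's activation is switched off with probability p, sampled from a Bernoulli distribution. This helps the model generalize because it cannot rely too heavily on particular local features.

It is used to prevent overfitting. It takes two parameters: p sets the drop probability, and inplace works the same way as in ReLU.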


# -*- coding: utf-8 -*-
# @Author  : 大叔azhe
# @Time    : 2023/1/16 20:05
# @Function:
import torch
from torch import nn
from torch.nn import Dropout

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=float)
# reshape returns a new tensor, so the result has to be assigned back
input = torch.reshape(input, (-1, 1, 5, 5))
print(input)

# element-wise dropout with p=0.6; inplace=True overwrites input itself
m = Dropout(0.6, inplace=True)

m(input)
print(input)

# Dropout2d zeroes whole channels with probability p
m = nn.Dropout2d(p=0.6)
input = torch.randn(1, 1, 5, 5)  # random tensor: batch size 1, 1 channel, 5*5
output = m(input)
print(input)
print(output)
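
One detail the script above does not show is that dropout behaves differently in training and evaluation mode. The sketch below illustrates this with a tensor of ones (an assumption chosen to make the effect easy to read): in training mode the surviving elements are scaled by 1 / (1 - p), and in evaluation mode the layer passes the input through unchanged.

```python
import torch
from torch import nn

torch.manual_seed(0)
p = 0.6
x = torch.ones(1000)
drop = nn.Dropout(p)

drop.train()                   # training mode: elements are dropped at random
y_train = drop(x)
kept = y_train != 0
print(kept.float().mean())     # roughly 1 - p = 0.4
print(y_train[kept].unique())  # a single value, close to 1 / (1 - p) = 2.5

drop.eval()                    # evaluation mode: dropout is switched off
y_eval = drop(x)
print(torch.equal(y_eval, x))  # True
```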

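5. Linear layer

6. Parameters

The parameters are in_features (input size), out_features (output size), and bias (whether to add a bias). The b1, b2 ... bn in the figure above are the biases.

With PyTorch's built-in Linear we only need to pass the number of input features and the number of output features. The bias is enabled by default, and keeping it enabled is recommended. The k1, k2 ... kn in the figure above are handled by PyTorch for us: as shown below, weight corresponds to k and is learned during training, and bias is learned as well. Per the PyTorch documentation, both are initialized from U(-sqrt(k), sqrt(k)) with k = 1/in_features.

Before a Linear layer, the image is usually reshaped to 1*1*1*in_features first.

The flatten() function can be used to flatten the image, as the figure below shows.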


# -*- coding: utf-8 -*-
# Author: 小土堆
# WeChat official account: 土堆碎念
import torch
import torchvision
from torch import nn
from torch.nn import Linear
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10("../azhe/data", train=False, transform=torchvision.transforms.ToTensor(),
                                       download=True)

# drop_last=True discards the final partial batch, so every batch has exactly 64 images
dataloader = DataLoader(dataset, batch_size=64, drop_last=True)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        # 196608 = 64 * 3 * 32 * 32: the whole batch is flattened into one vector
        self.linear1 = Linear(196608, 10)

    def forward(self, input):
        output = self.linear1(input)
        return output

tudui = Tudui()

for data in dataloader:
    imgs, targets = data
    print(imgs.shape)    # torch.Size([64, 3, 32, 32])
    output = torch.flatten(imgs)
    print(output.shape)  # torch.Size([196608])
    output = tudui(output)
    print(output.shape)  # torch.Size([10])
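
The script above flattens the whole batch into one vector. A common alternative, sketched here as an assumption rather than as the author's method, is to flatten each image separately so the layer produces one output row per image. The sketch also prints the shapes of weight and bias and checks the initialization range.

```python
import math
import torch
from torch import nn

torch.manual_seed(0)
in_features = 3 * 32 * 32            # one flattened CIFAR10-sized image
linear = nn.Linear(in_features, 10)

print(linear.weight.shape)           # torch.Size([10, 3072])
print(linear.bias.shape)             # torch.Size([10])

# Initial weights lie within [-1/sqrt(in_features), 1/sqrt(in_features)];
# the small tolerance allows for float32 rounding
bound = 1 / math.sqrt(in_features)
print(linear.weight.abs().max().item() <= bound + 1e-6)

imgs = torch.randn(64, 3, 32, 32)    # stand-in for one batch of images
flat = torch.flatten(imgs, start_dim=1)  # keep the batch dimension: (64, 3072)
out = linear(flat)
print(out.shape)                     # torch.Size([64, 10])
```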

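7. Other commonly used PyTorch utility functions

7.1 torch.flatten

As the figure shows, you can specify the range of dimensions (start_dim to end_dim) over which the input tensor is flattened.

After flattening, the selected dimensions collapse into a single dimension, as shown below.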
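
A short sketch of how start_dim and end_dim change the resulting shape, using a tensor of zeros with a batch-like shape (the shape is an assumption for illustration):

```python
import torch

t = torch.zeros(64, 3, 32, 32)

flat_all = torch.flatten(t)                           # every dimension merged
flat_from_1 = torch.flatten(t, start_dim=1)           # keep dim 0, merge the rest
flat_1_to_2 = torch.flatten(t, start_dim=1, end_dim=2)  # merge only dims 1 and 2

print(flat_all.shape)     # torch.Size([196608])
print(flat_from_1.shape)  # torch.Size([64, 3072])
print(flat_1_to_2.shape)  # torch.Size([64, 96, 32])
```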

7.2 torch.randn

7.2.1 rand


torch.rand(1, 1, 5, 5)  # generates a tensor with batch size 1, 1 channel, 5*5
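
For comparison, here is a minimal sketch of the two generators: `torch.rand` samples uniformly from [0, 1), while `torch.randn` samples from a standard normal distribution. The seed and sample sizes are arbitrary choices for this example.

```python
import torch

torch.manual_seed(0)

u = torch.rand(1, 1, 5, 5)  # uniform on [0, 1)
g = torch.randn(10000)      # standard normal: mean 0, standard deviation 1

print(u.shape)                                # torch.Size([1, 1, 5, 5])
print(u.min().item() >= 0, u.max().item() < 1)  # True True
print(g.mean().item(), g.std().item())        # close to 0 and 1
```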
