[PyTorch] A Summary of Commonly Used Network Layers

Preface

In PyTorch, networks are built mainly by composing layers. This article summarizes the interfaces and parameters of the most commonly used layer types in PyTorch.


1. Convolution Layers

The official PyTorch documentation describes a large number of convolution layers. Taking the recent PyTorch 2.4 as an example, the following layers are available, depending on the dimensionality of the data being processed:

| Layer | Description |
| --- | --- |
| nn.Conv1d | Applies a 1D convolution over an input signal composed of several input planes. |
| nn.Conv2d | Applies a 2D convolution over an input signal composed of several input planes. |
| nn.Conv3d | Applies a 3D convolution over an input signal composed of several input planes. |
| nn.ConvTranspose1d | Applies a 1D transposed convolution operator over an input image composed of several input planes. |
| nn.ConvTranspose2d | Applies a 2D transposed convolution operator over an input image composed of several input planes. |
| nn.ConvTranspose3d | Applies a 3D transposed convolution operator over an input image composed of several input planes. |
| … | … |
The basic convolution algorithm is introduced here through the most widely used case, the 2D convolution.
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)

Parameters:

  • in_channels (int) – number of input channels
  • out_channels (int) – number of output channels
  • kernel_size (int or tuple) – size of the convolution kernel
  • stride (int or tuple, optional) – stride of the convolution
  • padding (int, tuple or str, optional) – controls the padding applied to all four sides of the input; either a number of pixels or a string mode such as "same" or "valid"
  • padding_mode (str, optional) – type of padding fill
  • dilation (int or tuple, optional) – spacing between kernel elements, for dilated (atrous) convolution
  • groups (int, optional) – number of groups for grouped convolution (see the sketch after this list)
  • bias (bool, optional) – whether to add a learnable bias
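
For instance, a minimal sketch of two of the less common parameters (the channel counts and input size below are arbitrary example values): padding='same' keeps the spatial size unchanged (it requires the default stride=1), and groups=4 splits the 8 channels into 4 independently convolved groups.

>>> import torch
>>> import torch.nn as nn
>>> conv = nn.Conv2d(in_channels=8, out_channels=8, kernel_size=3, padding='same', groups=4)
>>> conv(torch.randn(1, 8, 32, 32)).shape  # 'same' padding preserves the 32x32 spatial size
torch.Size([1, 8, 32, 32])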

For the input/output shape arithmetic of a convolution layer, the official documentation gives the following formulas:

  • Input shape: $(N, C_{in}, H_{in}, W_{in})$
  • Output shape: $(N, C_{out}, H_{out}, W_{out})$
  • Shape relationship:

$$H_{out}=\left\lfloor\frac{H_{in}+2\times\text{padding}[0]-\text{dilation}[0]\times(\text{kernel size}[0]-1)-1}{\text{stride}[0]}+1\right\rfloor\\W_{out}=\left\lfloor\frac{W_{in}+2\times\text{padding}[1]-\text{dilation}[1]\times(\text{kernel size}[1]-1)-1}{\text{stride}[1]}+1\right\rfloor$$

kernel_size can be passed as an int or a tuple; when an int is given, both kernel size[0] and kernel size[1] in the formulas take that value. The other tuple parameters (stride, padding, dilation) behave the same way.
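
As a quick check of the formulas, a minimal sketch (the batch size, channel counts, and input size are arbitrary example values): with H_in = W_in = 32, kernel_size=3, stride=2, padding=1, and dilation=1, the formula gives floor((32 + 2 - 2 - 1)/2 + 1) = 16.

>>> conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=2, padding=1)
>>> conv(torch.randn(8, 3, 32, 32)).shape
torch.Size([8, 16, 16, 16])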


2. Pooling Layers

Pooling layers reduce the spatial resolution of the input (information downsampling). For 2D inputs, PyTorch provides the following pooling layers:

| Layer | Description |
| --- | --- |
| nn.MaxPool2d | Applies a 2D max pooling over an input signal composed of several input planes. |
| nn.MaxUnpool2d | Computes a partial inverse of MaxPool2d. |
| nn.AvgPool2d | Applies a 2D average pooling over an input signal composed of several input planes. |
| nn.FractionalMaxPool2d | Applies a 2D fractional max pooling over an input signal composed of several input planes. |
| nn.LPPool2d | Applies a 2D power-average pooling over an input signal composed of several input planes. |
| nn.AdaptiveMaxPool2d | Applies a 2D adaptive max pooling over an input signal composed of several input planes. |
| nn.AdaptiveAvgPool2d | Applies a 2D adaptive average pooling over an input signal composed of several input planes. |

Here the common max pooling layer, nn.MaxPool2d, serves as the example (the others are similar):

torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)

Parameters:

  • kernel_size (Union[int, Tuple[int, int]]) – size of the pooling window
  • stride (Union[int, Tuple[int, int]]) – stride of the pooling window; defaults to kernel_size
  • padding (Union[int, Tuple[int, int]]) – padding added to both sides of the input
  • dilation (Union[int, Tuple[int, int]]) – spacing between elements within the pooling window
  • return_indices (bool) – whether to also return the positions of the max values (see the sketch after this list)
  • ceil_mode (bool) – when the window does not divide the input evenly, whether to round the output size up or down; the default rounds down (floor)
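
The return_indices flag pairs with nn.MaxUnpool2d from the table above. A minimal sketch (sizes are arbitrary example values): unpooling puts each max value back at its original position and fills the remaining entries with zeros.

>>> pool = nn.MaxPool2d(2, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool2d(2, stride=2)
>>> out, indices = pool(torch.randn(1, 1, 4, 4))
>>> unpool(out, indices).shape  # max values restored in place, zeros elsewhere
torch.Size([1, 1, 4, 4])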

The output size of a pooling layer is computed the same way as for a convolution; note that pooling leaves the channel count unchanged ($C_{out}=C_{in}$).

  • Input shape: $(N, C, H_{in}, W_{in})$
  • Output shape: $(N, C, H_{out}, W_{out})$
  • Shape relationship:

$$H_{out}=\left\lfloor\frac{H_{in}+2\times\text{padding}[0]-\text{dilation}[0]\times(\text{kernel size}[0]-1)-1}{\text{stride}[0]}+1\right\rfloor\\W_{out}=\left\lfloor\frac{W_{in}+2\times\text{padding}[1]-\text{dilation}[1]\times(\text{kernel size}[1]-1)-1}{\text{stride}[1]}+1\right\rfloor$$
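
ceil_mode simply replaces the floor in these formulas with a ceiling. A minimal sketch (sizes are arbitrary example values): for a 5×5 input with kernel_size=2 and stride=2, (5 - 2)/2 + 1 = 2.5, so flooring yields 2 while ceiling yields 3.

>>> x = torch.randn(1, 1, 5, 5)
>>> nn.MaxPool2d(2, stride=2)(x).shape
torch.Size([1, 1, 2, 2])
>>> nn.MaxPool2d(2, stride=2, ceil_mode=True)(x).shape
torch.Size([1, 1, 3, 3])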

3. Padding Layers

PyTorch provides several padding strategies: ReflectionPad, ReplicationPad, ZeroPad, ConstantPad, and CircularPad. Taking 2D inputs as an example, they are described as follows:

| Layer | Description |
| --- | --- |
| nn.ReflectionPad2d | Pads the input tensor using the reflection of the input boundary. |
| nn.ReplicationPad2d | Pads the input tensor using replication of the input boundary. |
| nn.ZeroPad2d | Pads the input tensor boundaries with zero. |
| nn.ConstantPad2d | Pads the input tensor boundaries with a constant value. |
| nn.CircularPad2d | Pads the input tensor using circular padding of the input boundary. |

The padding layers are simple to use: most take a single padding argument (nn.ConstantPad2d additionally takes the constant fill value). Taking nn.ZeroPad2d as an example:

torch.nn.ZeroPad2d(padding)

Parameters:

  • padding (int, tuple) – the size of the padding. If an int, the same padding is applied on all four boundaries; if a 4-tuple, its elements give (padding_left, padding_right, padding_top, padding_bottom).

Consider the example from the official documentation:

>>> m = nn.ZeroPad2d(2)
>>> input = torch.randn(1, 1, 3, 3)
>>> input
tensor([[[[-0.1678, -0.4418,  1.9466],
          [ 0.9604, -0.4219, -0.5241],
          [-0.9162, -0.5436, -0.6446]]]])
>>> m(input)
tensor([[[[ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000, -0.1678, -0.4418,  1.9466,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.9604, -0.4219, -0.5241,  0.0000,  0.0000],
          [ 0.0000,  0.0000, -0.9162, -0.5436, -0.6446,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])
>>> # using different paddings for different sides
>>> m = nn.ZeroPad2d((1, 1, 2, 0))
>>> m(input)
tensor([[[[ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000, -0.1678, -0.4418,  1.9466,  0.0000],
          [ 0.0000,  0.9604, -0.4219, -0.5241,  0.0000],
          [ 0.0000, -0.9162, -0.5436, -0.6446,  0.0000]]]])
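
For comparison, a minimal sketch of the reflection and replication strategies on a small arange tensor (the values are chosen just to make the copied entries easy to trace):

>>> x = torch.arange(9.).reshape(1, 1, 3, 3)
>>> nn.ReflectionPad2d(1)(x)  # mirrors interior values across each border
tensor([[[[4., 3., 4., 5., 4.],
          [1., 0., 1., 2., 1.],
          [4., 3., 4., 5., 4.],
          [7., 6., 7., 8., 7.],
          [4., 3., 4., 5., 4.]]]])
>>> nn.ReplicationPad2d(1)(x)  # repeats the nearest edge value
tensor([[[[0., 0., 1., 2., 2.],
          [0., 0., 1., 2., 2.],
          [3., 3., 4., 5., 5.],
          [6., 6., 7., 8., 8.],
          [6., 6., 7., 8., 8.]]]])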

Summary

This article introduced the layers most frequently used in PyTorch, focusing on their interfaces and parameters. For the many other layer types, space does not permit coverage here; the official PyTorch documentation is the recommended reference.
