A Detailed Explanation of torch.nn.functional.conv2d

Una_zh

Article link: https://blog.csdn.net/jingOlivia/article/details/104728222

Throughout this post, torch.nn.functional is abbreviated as F.

The difference between `F.conv1d` and `F.conv2d`:

Let the size of input be channel * in_height * in_width, and the size of the filter be channel * f_height * f_width.

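F.conv1d
Conceptually this corresponds to the case in_height = f_height: in a single convolution the filter scans only along the last dimension of input, so stride is a single int (or a one-element tuple). Note that `F.conv1d` itself takes a 3-D input of shape (minibatch, in_channels, iW), so the "height" here is only an analogy.
F.conv2d
In a single convolution the filter can scan along both spatial dimensions of input, so stride may be a tuple. For example, stride=(2, 3) means a step of 2 along the height and a step of 3 along the width. A single int applies the same step to both dimensions.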
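The analogy above can be checked numerically. The following is a minimal sketch, assuming arbitrary illustrative shapes (2 channels, height 3, width 8, kernel width 4) that are not from the original post: a `F.conv2d` whose kernel height equals the input height should give the same numbers as a `F.conv1d` in which the height has been folded into the channel dimension.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Illustrative shapes (assumptions, not from the post).
C, H, W, kW = 2, 3, 8, 4
x = torch.randn(1, C, H, W)
f = torch.randn(1, C, H, kW)   # kernel height equals input height

# conv2d: the kernel spans the full height, so it can only slide along the width.
out2d = F.conv2d(x, f, stride=1)               # shape (1, 1, 1, W - kW + 1)

# conv1d: fold the height into the channel dimension, then slide along the width.
out1d = F.conv1d(x.reshape(1, C * H, W), f.reshape(1, C * H, kW), stride=1)

print(out2d.shape, out1d.shape)
print(torch.allclose(out2d.squeeze(2), out1d, atol=1e-5))

# A tuple stride in conv2d: step 2 along the height, step 3 along the width.
y = F.conv2d(torch.randn(1, 1, 7, 10), torch.randn(1, 1, 3, 3), stride=(2, 3))
print(y.shape)
```

With a 7 x 10 input and a 3 x 3 kernel, stride=(2, 3) leaves 3 positions along each dimension, which is why the last shape is (1, 1, 3, 3).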

The role of the `groups` parameter, explained with `F.conv2d`

torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
Args:
    input: input tensor of shape :math:`(\text{minibatch} , \text{in\_channels} , iH , iW)`
    weight: filters of shape :math:`(\text{out\_channels} , \frac{\text{in\_channels}}{\text{groups}} , kH , kW)`
    bias: optional bias tensor of shape :math:`(\text{out\_channels})`. Default: ``None``
    stride: the stride of the convolving kernel. Can be a single number or a tuple `(sH, sW)`. Default: 1
    padding: implicit paddings on both sides of the input. Can be a single number or a tuple `(padH, padW)`. Default: 0
    dilation: the spacing between kernel elements. Can be a single number or a tuple `(dH, dW)`. Default: 1
    groups: split input into groups, :math:`\text{in\_channels}` should be divisible by the number of groups. Default: 1

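input
    minibatch: the number of samples in the batch
    in_channels: the number of channels of each sample
    iH: the height (number of rows) of each sample
    iW: the width (number of columns) of each sample
weight (i.e. the filter)
    out_channels: the number of convolution kernels
    in_channels/groups: the number of channels of each kernel
    kH: the height (number of rows) of each kernel
    kW: the width (number of columns) of each kernel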
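To make the shape conventions concrete, here is a small sketch that compares the output shape of `F.conv2d` with the standard output-size formula, floor((size + 2*padding - dilation*(k - 1) - 1) / stride) + 1. The helper `conv2d_out_size` and all tensor sizes are hypothetical, chosen only for illustration.

```python
import torch
import torch.nn.functional as F

def conv2d_out_size(size, k, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (hypothetical helper)."""
    return (size + 2 * padding - dilation * (k - 1) - 1) // stride + 1

x = torch.randn(4, 6, 32, 40)    # (minibatch, in_channels, iH, iW)
w = torch.randn(8, 6, 3, 5)      # (out_channels, in_channels/groups, kH, kW), groups=1
b = torch.zeros(8)               # one bias value per output channel

out = F.conv2d(x, w, bias=b, stride=(2, 1), padding=(1, 2), dilation=(1, 2))

expected = (
    4,                                                        # minibatch is unchanged
    8,                                                        # out_channels
    conv2d_out_size(32, 3, stride=2, padding=1, dilation=1),  # output height
    conv2d_out_size(40, 5, stride=1, padding=2, dilation=2),  # output width
)
print(tuple(out.shape), expected)
```

The batch dimension passes through unchanged, the channel dimension becomes out_channels, and each spatial dimension follows the formula independently.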

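What groups does: for each sample in input, the channels are split into groups equal parts, so each sample becomes groups sub-samples of size (in_channels/groups, iH, iW). Each sub-sample is convolved with kernels of size (in_channels/groups, kH, kW). The out_channels kernels are likewise divided evenly among the groups, so out_channels must also be divisible by groups. The result for the whole sample is the concatenation, along the channel dimension, of the per-group convolution results.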
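The description of groups can be sketched as an explicit split, convolve and concatenate. This is a minimal check under assumed shapes (4 input channels, 6 output channels, groups=2), not the way PyTorch implements grouped convolution internally.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

groups = 2
x = torch.randn(3, 4, 6, 6)      # in_channels = 4
w = torch.randn(6, 2, 3, 3)      # out_channels = 6, in_channels/groups = 2

# Grouped convolution in a single call.
grouped = F.conv2d(x, w, groups=groups)

# The same computation done by hand: split the input along the channel
# dimension and the kernels along the out_channels dimension.
x_parts = torch.chunk(x, groups, dim=1)   # each part has shape (3, 2, 6, 6)
w_parts = torch.chunk(w, groups, dim=0)   # each part has shape (3, 2, 3, 3)
manual = torch.cat(
    [F.conv2d(xp, wp) for xp, wp in zip(x_parts, w_parts)],
    dim=1,                                # concatenate along the channel dimension
)

print(grouped.shape)
print(torch.allclose(grouped, manual, atol=1e-5))
```

Each group only sees its own slice of the input channels, which is why the second dimension of weight is in_channels/groups rather than in_channels.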

Example:

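import torch
import torch.nn.functional as F

# One sample with 2 channels, height 2, width 5
inputs = torch.arange(1, 21).reshape(1, 2, 2, 5)
# Two kernels, each with 1 channel, height 1, width 3
filters = torch.arange(1, 7).reshape(2, 1, 1, 3)
print(inputs)
print(filters)
# groups=2: each input channel is convolved with its own kernel
res = F.conv2d(input=inputs, weight=filters, stride=(1, 1), groups=2)
print(res)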

The output is:

tensor([[[[ 1,  2,  3,  4,  5],
          [ 6,  7,  8,  9, 10]],

         [[11, 12, 13, 14, 15],
          [16, 17, 18, 19, 20]]]])

tensor([[[[1, 2, 3]]],

        [[[4, 5, 6]]]])
        
tensor([[[[ 14,  20,  26],
          [ 44,  50,  56]],

         [[182, 197, 212],
          [257, 272, 287]]]])

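Input data: the batch contains a single sample, which has 2 channels, height 2 and width 5.
Kernels: weight has shape (2, 1, 1, 3), i.e. there are 2 kernels (out_channels = 2), each with 1 channel (in_channels/groups = 2/2), height 1 and width 3.
Because groups is 2, the sample and the kernels are each split into two groups. The sizes within each group are:
Data: 1 channel, height 2, width 5
Kernel: 1 kernel with 1 channel, height 1, width 3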

The sample data:

[ [ [ 1,  2,  3,  4,  5],
    [ 6,  7,  8,  9, 10] ],

  [ [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20] ] ]

is split into:

[ [ [ 1,  2,  3,  4,  5],
    [ 6,  7,  8,  9, 10] ] ]

and

[ [ [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20] ] ]

The kernels:

[ [ [ [ 1, 2, 3 ] ] ],

  [ [ [ 4, 5, 6 ] ] ] ]

are split into:

[ [ [ 1, 2, 3 ] ] ]	and	[ [ [ 4, 5, 6 ] ] ]

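In the result,

[ [ 14,  20,  26],
  [ 44,  50,  56] ]

is obtained by convolving

[ [ [ 1,  2,  3,  4,  5],
    [ 6,  7,  8,  9, 10] ] ]	with	[ [ [ 1, 2, 3 ] ] ].

Likewise, the second output channel, [[182, 197, 212], [257, 272, 287]], is obtained by convolving the second group of data with [ [ [ 4, 5, 6 ] ] ].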
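The per-group arithmetic can be reproduced by hand with a plain sliding window. The sketch below uses a hypothetical helper, `cross_correlate_2d`, written in pure Python for a single channel with stride 1 and no padding. Like `F.conv2d`, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def cross_correlate_2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products.

    Hypothetical helper for illustration; both arguments are lists of rows.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Group 1: first input channel with the first kernel.
group1 = cross_correlate_2d([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], [[1, 2, 3]])
# Group 2: second input channel with the second kernel.
group2 = cross_correlate_2d([[11, 12, 13, 14, 15], [16, 17, 18, 19, 20]], [[4, 5, 6]])

print(group1)  # [[14, 20, 26], [44, 50, 56]]
print(group2)  # [[182, 197, 212], [257, 272, 287]]
```

For instance, the first entry is 1*1 + 2*2 + 3*3 = 14, and the first entry of the second group is 11*4 + 12*5 + 13*6 = 182, matching the two output channels of `res`.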
