A Simple Understanding of nn.Conv1d

Article link: https://blog.csdn.net/chumingqian/article/details/130125900

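This article explains how the output of a convolution layer is computed, covering input and output sizes and the roles of the key parameters stride, padding, dilation, and groups. It demonstrates the layer with an example and gives the formula for the output size.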

1. Definition from the official documentation

In the simplest case, the output value of the layer with input size
$(N, C_{\text{in}}, L)$ and output $(N, C_{\text{out}}, L_{\text{out}})$ can be precisely described as:
$\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k = 0}^{C_{\text{in}} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{input}(N_i, k)$
where $\star$ is the valid cross-correlation operator,
$N$ is the batch size, $C$ denotes the number of channels, and
$L$ is the length of the signal sequence.
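The definition can be checked numerically. The following is a minimal sketch, assuming the simplest case only (stride 1, no padding, dilation 1, groups 1). The helper name `conv1d_reference` is hypothetical and not part of PyTorch; it sums the valid cross-correlation over the input channels and compares the result with `torch.nn.functional.conv1d`.

```python
import torch
import torch.nn.functional as F


def conv1d_reference(x, weight, bias):
    """Naive Conv1d for stride=1, padding=0, dilation=1, groups=1.

    x:      (N, C_in, L)
    weight: (C_out, C_in, kernel_size)
    bias:   (C_out,)
    """
    n, c_in, length = x.shape
    c_out, _, ksize = weight.shape
    l_out = length - ksize + 1  # "valid": the kernel never leaves the input
    out = torch.zeros(n, c_out, l_out)
    for i in range(n):          # batch index N_i
        for j in range(c_out):  # output channel C_out_j
            acc = torch.full((l_out,), bias[j].item())
            for k in range(c_in):  # sum over input channels k
                for t in range(l_out):
                    # Cross-correlation: slide the kernel without flipping it.
                    acc[t] += torch.dot(weight[j, k], x[i, k, t:t + ksize])
            out[i, j] = acc
    return out


torch.manual_seed(0)
x = torch.rand(2, 4, 10)
weight = torch.rand(3, 4, 3)
bias = torch.rand(3)

expected = F.conv1d(x, weight, bias)
actual = conv1d_reference(x, weight, bias)
matches = torch.allclose(actual, expected, atol=1e-5)
print(actual.shape, matches)
```

The explicit loops are slow and only meant to mirror the formula term by term; the built-in layer should be used in practice.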

This module supports TensorFloat32.

* stride controls the stride for the cross-correlation; it is a single
  number or a one-element tuple.

* padding controls the amount of implicit zero padding added to both sides
  of the input, padding points per side.

* dilation controls the spacing between the kernel points; it is also
  known as the à trous algorithm. It is harder to describe in words, but the
  visualization linked from the PyTorch documentation shows what dilation does.

* groups controls the connections between inputs and outputs.
  in_channels and out_channels must both be divisible by
  groups. For example,

    * At groups=1, all inputs are convolved to all outputs.
    * At groups=2, the operation becomes equivalent to having two conv
      layers side by side, each seeing half the input channels,
      and producing half the output channels, and both subsequently
      concatenated.
    * At groups=in_channels, each input channel is convolved with
      its own set of filters, of size
      $\left\lfloor\frac{out\_channels}{in\_channels}\right\rfloor$.
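The groups=2 case can be verified directly. This is a minimal sketch with arbitrarily chosen sizes (4 input channels, 6 output channels, kernel size 3): it copies the parameters of one grouped layer into two smaller layers and checks that running them side by side and concatenating the results gives the same output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# One grouped layer: 4 input channels, 6 output channels, 2 groups.
grouped = nn.Conv1d(4, 6, kernel_size=3, groups=2)

# Two independent layers, each seeing half the input channels
# and producing half the output channels.
first = nn.Conv1d(2, 3, kernel_size=3)
second = nn.Conv1d(2, 3, kernel_size=3)

# grouped.weight has shape (out_channels, in_channels // groups, kernel_size),
# i.e. (6, 2, 3); the first 3 filters belong to group 0, the last 3 to group 1.
with torch.no_grad():
    first.weight.copy_(grouped.weight[:3])
    first.bias.copy_(grouped.bias[:3])
    second.weight.copy_(grouped.weight[3:])
    second.bias.copy_(grouped.bias[3:])

x = torch.rand(5, 4, 20)
side_by_side = torch.cat([first(x[:, :2]), second(x[:, 2:])], dim=1)
same = torch.allclose(grouped(x), side_by_side, atol=1e-5)
print(grouped.weight.shape, same)
```

Note that the weight's second dimension is in_channels // groups, which is why a grouped layer has fewer parameters than an ungrouped one of the same channel counts.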

1.1 Parameter explanation

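Input: $(N, C_{in}, L_{in})$
Output: $(N, C_{out}, L_{out})$

As described above:

N is the batch size.
C is the number of channels. For a sequence, it is the dimension of each column vector.
$C_{in}$ is the encoding dimension of each column vector in the input sequence.
$C_{out}$ is the desired encoding dimension of each column vector in the output sequence.
L is the sequence length, i.e. how many column vectors the sequence contains.
$L_{in}$ is the number of column vectors in the input sequence.
$L_{out}$ is the number of column vectors in the output sequence.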
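To make the layout concrete, here is a minimal sketch. It assumes the sequence is initially stored as (N, L, C), which is common for embedding outputs but is an assumption made for illustration, and it moves the channel axis into the position that nn.Conv1d expects. The sizes are arbitrary.

```python
import torch
import torch.nn as nn

N, L_in, C_in, C_out = 8, 12, 16, 32

# Assumed starting layout: N sequences, each with L_in vectors of dimension C_in.
tokens = torch.rand(N, L_in, C_in)

# nn.Conv1d expects (N, C_in, L_in): each position is a column vector of size C_in.
x = tokens.transpose(1, 2)

conv = nn.Conv1d(C_in, C_out, kernel_size=3)
y = conv(x)

print(x.shape)  # (N, C_in, L_in)
print(y.shape)  # (N, C_out, L_out); with kernel_size=3 and defaults, L_out = L_in - 2
```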

The output length is given by:

$L_{out} = \left\lfloor\frac{L_{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1}{\text{stride}} + 1\right\rfloor$
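The formula is easy to evaluate by hand or in plain Python. The sketch below uses a hypothetical helper name, `conv1d_out_len`, which is not part of PyTorch; it relies on integer floor division, which is equivalent to the floor in the formula because the added 1 is an integer.

```python
def conv1d_out_len(l_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution, following the formula above."""
    numerator = l_in + 2 * padding - dilation * (kernel_size - 1) - 1
    return numerator // stride + 1


print(conv1d_out_len(50, kernel_size=3, stride=2))    # 24
print(conv1d_out_len(50, kernel_size=3, padding=1))   # 50
print(conv1d_out_len(50, kernel_size=3, dilation=2))  # 46
```

The second call shows why padding=1 is commonly paired with kernel_size=3 at stride 1: the output keeps the input length.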

1.2 Worked example

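Here padding defaults to 0, dilation to 1, and groups to 1.

The output length follows from the formula above: with $L_{in} = 50$, kernel_size = 3, and stride = 2, $L_{out} = \lfloor (50 + 0 - 1 \times (3 - 1) - 1) / 2 + 1 \rfloor = \lfloor 24.5 \rfloor = 24$.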

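import torch
import torch.nn as nn

# 16 input channels, 33 output channels, kernel_size=3, stride=2
m = nn.Conv1d(16, 33, 3, stride=2)
# Batch of 20 sequences, 16 channels each, length 50
input = torch.rand(20, 16, 50)

output = m(input)

print(output.shape)
# torch.Size([20, 33, 24])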