[PyTorch Learning] nn.Conv1d() Explained

nn.Conv1d()

When working on deep learning with the PyTorch framework, we often write something like

self.conv1 = torch.nn.Conv1d(128, 128, 1)

Beginners may not be familiar with how nn.Conv1d is being used here.
First, let's look at the explanation from the official documentation.

torch.nn.modules.conv.Conv1d
def __init__(self,
             in_channels: int,
             out_channels: int,
             kernel_size: Union[int, tuple],
             stride: Any = 1,
             padding: Any = 0,
             dilation: Any = 1,
             groups: int = 1,
             bias: bool = True,
             padding_mode: str = 'zeros') -> None
Applies a 1D convolution over an input signal composed of several input planes.
In the simplest case, the output value of the layer with input size (N, Cin, L) and output (N, Cout, Lout) can be precisely described as:
out(N_i, C_out_j) = bias(C_out_j) + ∑_{k=0}^{C_in − 1} weight(C_out_j, k) ⋆ input(N_i, k)
where ⋆ is the valid cross-correlation operator, N is a batch size, C denotes a number of channels, and L is a length of signal sequence.
stride controls the stride for the cross-correlation, a single number or a one-element tuple.
padding controls the amount of implicit zero-paddings on both sides for padding number of points.
dilation controls the spacing between the kernel points; also known as the à trous algorithm. It is harder to describe, but the dilated-convolution visualizations linked from the official documentation show what dilation does.
groups controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups. For example,
 
At groups=1, all inputs are convolved to all outputs.
At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.
At groups= in_channels, each input channel is convolved with its own set of filters, of size ⌊(out_channels)/(in_channels)⌋.
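
The effect of groups can also be seen directly from the weight shapes. A minimal sketch (the channel counts below are illustrative only):

import torch
import torch.nn as nn

x = torch.randn(1, 4, 10)                       # batch=1, 4 input channels, length 10
g1 = nn.Conv1d(4, 8, kernel_size=3, groups=1)   # every output channel sees all 4 inputs
g2 = nn.Conv1d(4, 8, kernel_size=3, groups=2)   # two convs side by side, each 2 in -> 4 out
print(g1(x).shape, g2(x).shape)                 # both: torch.Size([1, 8, 8])
print(g1.weight.shape)                          # torch.Size([8, 4, 3])  (out, in/groups, kernel)
print(g2.weight.shape)                          # torch.Size([8, 2, 3])  -> half the parameters
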
Note
Depending on the size of your kernel, several (of the last) columns of the input might be lost, because it is a valid cross-correlation, and not a full cross-correlation. It is up to the user to add proper padding.
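
A quick sketch of that behaviour (sizes chosen only for illustration): with no padding and a stride of 2, the last input column never contributes to the output.

import torch
import torch.nn as nn

conv = nn.Conv1d(1, 1, kernel_size=3, stride=2)   # no padding, i.e. valid cross-correlation
x = torch.randn(1, 1, 10)
print(conv(x).shape)   # torch.Size([1, 1, 4]) -- windows end at index 8, position 9 is dropped
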
Note
When groups == in_channels and out_channels == K * in_channels, where K is a positive integer, this operation is also termed in literature as depthwise convolution.
In other words, for an input of size (N, Cin, Lin), a depthwise convolution with a depthwise multiplier K, can be constructed by arguments (Cin = Cin, Cout = Cin × K, ..., groups = Cin).
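
A small sketch of the depthwise case described above (C_in and the multiplier K are chosen arbitrarily):

import torch
import torch.nn as nn

C_in, K = 16, 2
depthwise = nn.Conv1d(C_in, C_in * K, kernel_size=3, groups=C_in)
x = torch.randn(8, C_in, 50)
print(depthwise(x).shape)       # torch.Size([8, 32, 48])
print(depthwise.weight.shape)   # torch.Size([32, 1, 3]) -- each filter sees a single input channel
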
Note
In some circumstances when using the CUDA backend with CuDNN, this operator may select a nondeterministic algorithm to increase performance. If this is undesirable, you can try to make the operation deterministic (potentially at a performance cost) by setting torch.backends.cudnn.deterministic = True. Please see the PyTorch notes on randomness for background.
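
If reproducibility matters more than speed, the flags mentioned in the note can be set up front (a minimal sketch):

import torch

torch.manual_seed(0)
torch.backends.cudnn.deterministic = True   # force deterministic CuDNN algorithms
torch.backends.cudnn.benchmark = False      # disable algorithm auto-tuning
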
Shape:
Input: (N, Cin, Lin)
Output: (N, Cout, Lout) where
Lout = ⌊(Lin + 2 × padding − dilation × (kernel_size − 1) − 1)/stride + 1⌋
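
This formula can be checked against an actual layer; a small sketch (the parameter values below are picked only to exercise padding, dilation and stride together):

import math
import torch
import torch.nn as nn

L_in, padding, dilation, kernel_size, stride = 50, 2, 2, 3, 2
L_out = math.floor((L_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)

m = nn.Conv1d(16, 33, kernel_size, stride=stride, padding=padding, dilation=dilation)
out = m(torch.randn(20, 16, L_in))
print(L_out, out.shape)   # 25 torch.Size([20, 33, 25])
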
Examples:
>>> m = nn.Conv1d(16, 33, 3, stride=2)
>>> input = torch.randn(20, 16, 50)
>>> output = m(input)
Parameters:
in_channels – Number of channels in the input image
out_channels – Number of channels produced by the convolution
kernel_size – Size of the convolving kernel
stride – Stride of the convolution. Default: 1
padding – Zero-padding added to both sides of the input. Default: 0
dilation – Spacing between kernel elements. Default: 1
groups – Number of blocked connections from input channels to output channels. Default: 1
bias – If ``True``, adds a learnable bias to the output. Default: ``True``
padding_mode – ``'zeros'``, ``'reflect'``, ``'replicate'`` or ``'circular'``. Default: ``'zeros'``

Readers who are comfortable with the official explanation above can stop reading this post here to save time.

Parameters:
in_channels: the number of channels in the input
out_channels: the number of channels produced by the convolution
kernel_size: the size of the convolving kernel
stride: the stride of the convolution. Default: 1
padding: the amount of zero-padding added to both sides of the input. Default: 0
dilation: the spacing between kernel elements. Default: 1
groups: the number of blocked connections from input channels to output channels. Default: 1
bias: if True, adds a learnable bias to the output. Default: True
padding_mode: 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'
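
The padding_mode options only change how the border is filled; the output shape stays the same. A small sketch (a toy 1-channel input for illustration):

import torch
import torch.nn as nn

x = torch.arange(10, dtype=torch.float32).view(1, 1, 10)
for mode in ("zeros", "reflect", "replicate", "circular"):
    conv = nn.Conv1d(1, 1, kernel_size=3, padding=1, padding_mode=mode, bias=False)
    print(mode, conv(x).shape)   # every mode: torch.Size([1, 1, 10]); only the border values differ
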

Example

Take the usage mentioned at the beginning:

self.conv1 = torch.nn.Conv1d(128, 128, 1)

Here the number of input channels is 128, the number of output channels is 128, and the kernel size is 1.
What does this actually do? Let's borrow the official example to illustrate:

Examples:
>>> m = nn.Conv1d(16, 33, 3, stride=2)
>>> input = torch.randn(20, 16, 50)
>>> output = m(input)

Out: torch.Size([20, 33, 24])

We randomly create a tensor of shape 20*16*50,
then create a Conv1d with in_channels=16, out_channels=33 and kernel_size=3.
The convolution matches channels along the second dimension, so the two 16s must correspond, while the number of output channels is set to 33. Along the third dimension, the output length is
⌊(50 − 1 × (3 − 1) − 1)/2 + 1⌋ = 24, so the final output shape becomes 20*33*24.
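
Applying the same arithmetic to the layer from the beginning of the post: with kernel_size=1 the length term is ⌊(L − 1 × (1 − 1) − 1)/1 + 1⌋ = L, so the sequence length is unchanged and the layer acts as a learned linear mixing of the 128 channels at every position. A minimal sketch (the batch size and length below are arbitrary):

import torch
import torch.nn as nn

conv1 = nn.Conv1d(128, 128, 1)
x = torch.randn(4, 128, 100)
print(conv1(x).shape)   # torch.Size([4, 128, 100]) -- per-position linear map across channels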
