One-dimensional convolution shows up constantly in deep learning on point clouds. I read plenty of explanations but still couldn't quite pin it down, so I decided to reproduce the Conv1d computation by hand.
Creating the point cloud
import torch

# 4 points, each with (x, y, z) coordinates; use floats so Conv1d accepts them
points = torch.tensor([
    [1., 2., 3.],
    [4., 5., 6.],
    [7., 8., 9.],
    [10., 11., 12.]
])
# points.shape
# torch.Size([4, 3])
points = points.unsqueeze(0)  # add a batch dimension: batch = 1
# points.shape
# torch.Size([1, 4, 3])
points = points.permute(0, 2, 1)  # Conv1d expects (batch, channels, length)
# points.shape
# torch.Size([1, 3, 4])
Creating the Conv1d layer
conv1d = torch.nn.Conv1d(in_channels=3, out_channels=5, kernel_size=1)
# inspect the randomly initialized parameters
for k, v in conv1d.named_parameters():
    print(k)
    print(v)
    print(v.shape)
'''
weight
Parameter containing:
tensor([[[-0.0527],
         [-0.4512],
         [ 0.1197]],

        [[ 0.4636],
         [ 0.4103],
         [-0.3061]],

        [[-0.0603],
         [ 0.4536],
         [ 0.5522]],

        [[-0.2800],
         [ 0.1951],
         [ 0.5738]],

        [[-0.3367],
         [ 0.4430],
         [ 0.0143]]], requires_grad=True)
torch.Size([5, 3, 1])
bias
Parameter containing:
tensor([ 0.2423, -0.1086, -0.0243,  0.2874, -0.1059], requires_grad=True)
torch.Size([5])
'''
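The weight has shape (out_channels, in_channels, kernel_size) = (5, 3, 1). Since the kernel dimension is 1, it can be viewed as a plain 5 × 3 matrix with one row of 3 coefficients per output channel, plus one bias per channel; this is exactly the weights matrix used in the walkthrough below. A quick look (a sketch; squeeze just drops the size-1 kernel axis):

# (out_channels=5, in_channels=3, kernel_size=1) -> (5, 3)
print(conv1d.weight.squeeze(-1).shape)  # torch.Size([5, 3])
print(conv1d.bias.shape)                # torch.Size([5])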
Running the computation and inspecting the result
out = conv1d(points)
'''
>>> out
tensor([[[-0.3538, -1.5065, -2.6592, -3.8118],
         [ 0.2573,  1.9608,  3.6644,  5.3679],
         [ 2.4792,  5.3157,  8.1521, 10.9886],
         [ 2.1188,  3.5853,  5.0519,  6.5184],
         [ 0.4864,  0.8483,  1.2102,  1.5721]]],
       grad_fn=<ConvolutionBackward0>)
>>> out.shape
torch.Size([1, 5, 4])
'''
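The shape went from (1, 3, 4) to (1, 5, 4): still 4 points, but each is now described by 5 channels instead of 3. One property worth checking before the walkthrough: with kernel_size=1, every point (every column) is transformed independently, so feeding a single point through the layer reproduces the corresponding output column. A small sanity check (a sketch; first_point is my own name):

# a single point, shape (1, 3, 1): batch of 1, 3 channels, length 1
first_point = points[:, :, 0:1]
print(conv1d(first_point))  # matches out[:, :, 0:1], the first output column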
Walkthrough
Ignoring the batch dimension, the input matrix is (the three rows, from top to bottom, hold the x, y, and z coordinates; each column is one point):
$$points = \begin{pmatrix} 1 & 4 & 7 & 10 \\ 2 & 5 & 8 & 11 \\ 3 & 6 & 9 & 12 \end{pmatrix}$$
The bias is:
$$bias = \begin{pmatrix} 0.2423 \\ -0.1086 \\ -0.0243 \\ 0.2874 \\ -0.1059 \end{pmatrix}$$
The weights are:
$$weights = \begin{pmatrix} -0.0527 & -0.4512 & 0.1197 \\ 0.4636 & 0.4103 & -0.3061 \\ -0.0603 & 0.4536 & 0.5522 \\ -0.2800 & 0.1951 & 0.5738 \\ -0.3367 & 0.4430 & 0.0143 \end{pmatrix}$$
The output is:
$$output = \begin{pmatrix} -0.3538 & -1.5065 & -2.6592 & -3.8118 \\ 0.2573 & 1.9608 & 3.6644 & 5.3679 \\ 2.4792 & 5.3157 & 8.1521 & 10.9886 \\ 2.1188 & 3.5853 & 5.0519 & 6.5184 \\ 0.4864 & 0.8483 & 1.2102 & 1.5721 \end{pmatrix}$$
Each row of $points$, $bias$, $weights$, and $output$ is one channel. The key to the analysis is understanding these points:
- every channel of $output$ is independent of the others
- each channel of $output$ corresponds to the matching channel of $weights$ and $bias$
- computing each channel of $output$ uses the data of all the points

So we only need to trace how the first rows of $bias$, $weights$, and $output$ relate; the other rows follow the same pattern (see the sketch right below).
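Here is that first row computed in code, before writing out the arithmetic (a sketch; w0, b0, and row0 are my own names):

w0 = conv1d.weight[0].squeeze(-1)  # weight row of output channel 0, shape (3,)
b0 = conv1d.bias[0]                # bias of output channel 0
row0 = points[0].T @ w0 + b0       # dot each point (column) with w0, add b0
print(row0)                        # matches out[0, 0], the first output row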
At this point, just from the shapes of these variables, we can already infer how $output$ must be computed:
$$\begin{aligned} -0.3538 &\approx 1 \times (-0.0527) + 2 \times (-0.4512) + 3 \times 0.1197 + 0.2423 \\ -1.5065 &\approx 4 \times (-0.0527) + 5 \times (-0.4512) + 6 \times 0.1197 + 0.2423 \\ -2.6592 &\approx 7 \times (-0.0527) + 8 \times (-0.4512) + 9 \times 0.1197 + 0.2423 \\ -3.8118 &\approx 10 \times (-0.0527) + 11 \times (-0.4512) + 12 \times 0.1197 + 0.2423 \end{aligned}$$
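In other words, with kernel_size=1 the whole layer reduces to $output = weights \cdot points + bias$: a single matrix multiply with the bias broadcast across the points. A minimal sketch to confirm this against the layer's own output (continuing the session above; W, b, and manual are my own names):

W = conv1d.weight.squeeze(-1)          # (5, 3): one weight row per output channel
b = conv1d.bias.unsqueeze(-1)          # (5, 1): broadcasts across the 4 points
manual = W @ points[0] + b             # (5, 4): matrix multiply plus bias
print(torch.allclose(manual, out[0]))  # True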