Testing the actual effect of the groups parameter in PyTorch's Conv2d:
First, define a 3×3×3 array that we can verify by hand:
import torch
import numpy as np
array1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]).reshape(3, 3)
array2 = np.array([2, 3, 4, 5, 6, 7, 8, 9, 1]).reshape(3, 3)
array3 = np.array([3, 4, 5, 6, 7, 8, 9, 1, 2]).reshape(3, 3)
array1 = array1[np.newaxis, :, :]   # add a channel axis: (1, 3, 3)
array2 = array2[np.newaxis, :, :]
array3 = array3[np.newaxis, :, :]
array = np.vstack((array1, array2, array3))   # stack into 3 channels: (3, 3, 3)
array = array[np.newaxis, :, :, :]            # add a batch axis: (1, 3, 3, 3)
array = torch.tensor(array).float()
print(array)
tensor([[[[1., 2., 3.],
          [4., 5., 6.],
          [7., 8., 9.]],

         [[2., 3., 4.],
          [5., 6., 7.],
          [8., 9., 1.]],

         [[3., 4., 5.],
          [6., 7., 8.],
          [9., 1., 2.]]]])
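As a quick sanity check (a small sketch), the stacked tensor has the NCHW layout that Conv2d expects:

print(array.shape)  # torch.Size([1, 3, 3, 3]) -- batch, channels, height, width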
Next, define two convolution layers that differ only in groups: one with groups=1, the other with groups=3.
conv1 = torch.nn.Conv2d(3, 6, 1, 1, 0, groups=1, bias=False)  # in=3, out=6, 1x1 kernel, stride 1, padding 0
conv2 = torch.nn.Conv2d(3, 6, 1, 1, 0, groups=3, bias=False)
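Before looking at any outputs, the weight shapes already reveal what groups does: each filter spans only in_channels // groups input channels (a quick sketch):

print(conv1.weight.shape)  # torch.Size([6, 3, 1, 1]) -- each filter sees all 3 channels
print(conv2.weight.shape)  # torch.Size([6, 1, 1, 1]) -- each filter sees only 1 channel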
Output of the first convolution (groups=1):
print('conv1.weight', conv1.weight)
print('conv1(array)', conv1(array))
conv1.weight Parameter containing:
tensor([[[[-0.21770]],
         [[ 0.55208]],
         [[-0.19912]]],

        [[[-0.10678]],
         [[-0.11752]],
         [[ 0.56264]]],

        [[[ 0.48454]],
         [[-0.29961]],
         [[-0.00285]]],

        [[[-0.43829]],
         [[-0.42696]],
         [[-0.34118]]],

        [[[ 0.03800]],
         [[ 0.36227]],
         [[ 0.50197]]],

        [[[-0.40956]],
         [[-0.28049]],
         [[-0.14078]]]], requires_grad=True)
conv1(array) tensor([[[[ 0.28909,  0.42435,  0.55960],
          [ 0.69486,  0.83012,  0.96537],
          [ 1.10063,  3.02800, -1.80544]],

         [[ 1.34611,  1.68446,  2.02280],
          [ 2.36115,  2.69949,  3.03784],
          [ 3.37618, -1.34927,  0.04675]],

         [[-0.12323,  0.05885,  0.24093],
          [ 0.42301,  0.60509,  0.78717],
          [ 0.96925,  1.17697,  4.05551]],

         [[-2.31577, -3.52222, -4.72866],
          [-5.93510, -7.14154, -8.34798],
          [-9.55442, -7.69021, -5.05398]],

         [[ 2.26845,  3.17069,  4.07293],
          [ 4.97516,  5.87740,  6.77964],
          [ 7.68188,  4.06638,  1.70823]],

         [[-1.39288, -2.22370, -3.05453],
          [-3.88535, -4.71618, -5.54700],
          [-6.37783, -5.94164, -4.24805]]]], grad_fn=<ThnnConv2DBackward>)
Verify the first element:
0.28909 = 1 × (-0.21770) + 2 × 0.55208 + 3 × (-0.19912)
So with groups=1, each output element is the sum over all three input channels of the channel value times its kernel weight: every filter mixes all the input channels together.
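The same check in code, using the values printed above (a minimal sketch):

# Output channel 0 at position (0, 0): each input channel value
# times its kernel weight, summed over the 3 channels.
w = conv1.weight[0, :, 0, 0]   # the 3 weights of output channel 0
x = array[0, :, 0, 0]          # the 3 input channels at position (0, 0)
print((w * x).sum())           # ~0.28909, matches conv1(array)[0, 0, 0, 0]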
Output of the second convolution (groups=3):
print('conv2.weight', conv2.weight)
print('conv2(array)', conv2(array))
conv2.weight Parameter containing:
tensor([[[[ 0.06971]]],

        [[[ 0.50235]]],

        [[[-0.92221]]],

        [[[-0.07961]]],

        [[[-0.02840]]],

        [[[ 0.12154]]]], requires_grad=True)
conv2(array) tensor([[[[ 0.06971,  0.13942,  0.20912],
          [ 0.27883,  0.34854,  0.41825],
          [ 0.48795,  0.55766,  0.62737]],

         [[ 0.50235,  1.00471,  1.50706],
          [ 2.00941,  2.51176,  3.01412],
          [ 3.51647,  4.01882,  4.52117]],

         [[-1.84443, -2.76664, -3.68885],
          [-4.61107, -5.53328, -6.45549],
          [-7.37771, -8.29992, -0.92221]],

         [[-0.15921, -0.23882, -0.31843],
          [-0.39803, -0.47764, -0.55725],
          [-0.63686, -0.71646, -0.07961]],

         [[-0.08519, -0.11359, -0.14199],
          [-0.17039, -0.19879, -0.22718],
          [-0.25558, -0.02840, -0.05680]],

         [[ 0.36461,  0.48615,  0.60768],
          [ 0.72922,  0.85076,  0.97229],
          [ 1.09383,  0.12154,  0.24307]]]], grad_fn=<MkldnnConvolutionBackward>)
By inspection, output channels 1 and 2 of the result are the first input channel,
[[1., 2., 3.],
 [4., 5., 6.],
 [7., 8., 9.]],
multiplied by the kernels 0.06971 and 0.50235 respectively; output channels 3, 4, 5, and 6 follow the same pattern with the second and third input channels.
So the 3 input channels were split into 3 groups, and each group is convolved only with its own kernels (6 output channels / 3 groups = 2 kernels per group).
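This grouping can also be verified programmatically (a small sketch): output channels 0 and 1 in zero-based indexing should depend only on input channel 0, and so on.

out = conv2(array)
print(torch.allclose(out[0, 0], array[0, 0] * conv2.weight[0, 0, 0, 0]))  # True
print(torch.allclose(out[0, 2], array[0, 1] * conv2.weight[2, 0, 0, 0]))  # True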
Summary
My personal understanding: when the input is an image, an ordinary convolution blends the information of all three channels into every output channel, breaking the wall between channels; a grouped convolution instead restricts each filter to its own group of channels, which is equivalent to processing only certain layers in isolation.
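As a final illustration (going one step beyond the experiment above), the extreme case groups == in_channels is the depthwise convolution used by architectures such as MobileNet: every channel keeps its own filter, and no cross-channel mixing happens at all.

# Depthwise convolution: one 3x3 filter per input channel, zero channel mixing.
dw = torch.nn.Conv2d(3, 3, 3, 1, 1, groups=3, bias=False)
print(dw.weight.shape)   # torch.Size([3, 1, 3, 3])
print(dw(array).shape)   # torch.Size([1, 3, 3, 3])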