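Output H and W after convolution:
$H_{out} = \lfloor (H_{in} + 2p - k) / s \rfloor + 1$, $W_{out} = \lfloor (W_{in} + 2p - k) / s \rfloor + 1$, where $k$ is the kernel size, $p$ the padding and $s$ the stride (dilation 1). Any fractional part is rounded down.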
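As a quick check of the output-size rule, here is a minimal sketch in plain Python. The helper name `conv_output_size` is hypothetical, and the optional `dilation` argument is an addition that follows the formula given in the PyTorch `Conv2d` documentation; with `dilation=1` it reduces to the kernel-size form.

```python
def conv_output_size(n, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis (H or W) of a convolution.

    Integer floor division implements the "round down" rule.
    """
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (n + 2 * padding - effective_kernel) // stride + 1


print(conv_output_size(10, 3))                        # 8
print(conv_output_size(10, 3, stride=2, padding=1))   # 5 (9 / 2 = 4.5 is rounded down to 4, then + 1)
print(conv_output_size(224, 7, stride=2, padding=3))  # 112
```

The same function is applied separately to H and W when the kernel, stride or padding differ per axis.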
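Parameter: groups
Without grouping:
Input shape: $c_1 \times H \times W$; target output shape: $c_2 \times H' \times W'$; ungrouped kernel shape: $c_2 \times c_1 \times k \times k$. Parameter count (ignoring bias): $c_1 c_2 k^2$
With grouping:
Setting groups=g splits the input feature map along the channel dimension into g groups (both $c_1$ and $c_2$ must be divisible by g), so each group has shape $(c_1/g) \times H \times W$. The kernel for each group therefore has shape $(c_2/g) \times (c_1/g) \times k \times k$, and each group produces an output feature map of shape $(c_2/g) \times H' \times W'$.
Finally, the g group outputs are concatenated to give the $c_2 \times H' \times W'$ output.
Parameter count: $g \cdot \frac{c_2}{g} \cdot \frac{c_1}{g} \cdot k^2 = \frac{c_1 c_2 k^2}{g}$
The parameter count of a grouped convolution is $1/g$ of that of a standard convolution.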
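The split, convolve, concat description can be checked against PyTorch directly. The sketch below assumes illustrative sizes (8 input channels, 16 output channels, a 3x3 kernel, g = 4) and compares `F.conv2d(..., groups=g)` with a manual per-group convolution; it also compares the two parameter counts.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
c1, c2, k, g = 8, 16, 3, 4               # illustrative sizes, not from the notes
x = torch.rand(2, c1, 10, 10)
weight = torch.rand(c2, c1 // g, k, k)   # grouped weight layout: (c2, c1/g, k, k)

# Reference: let PyTorch handle the grouping.
ref = F.conv2d(x, weight, groups=g)

# Manual version: split the input channels and the filters into g groups,
# convolve each pair independently, then concatenate along the channel axis.
x_groups = x.chunk(g, dim=1)             # g tensors of shape (2, c1/g, 10, 10)
w_groups = weight.chunk(g, dim=0)        # g tensors of shape (c2/g, c1/g, k, k)
manual = torch.cat([F.conv2d(xi, wi) for xi, wi in zip(x_groups, w_groups)], dim=1)

grouped_params = weight.numel()          # c2 * (c1/g) * k * k
standard_params = c2 * c1 * k * k

print(ref.shape)                                # torch.Size([2, 16, 8, 8])
print(torch.allclose(ref, manual, atol=1e-5))   # True
print(grouped_params, standard_params)          # 288 1152
```

With g = 4 the grouped layer holds a quarter of the weights of the standard one, matching the $1/g$ ratio.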
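Depthwise Convolution
When groups = in_channel = out_channel = $c_1$, each input feature map is paired with exactly one kernel, so the parameter count is $c_1 \times k \times k$, which reduces the number of parameters even further.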
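A minimal sketch of the depthwise case using `nn.Conv2d`, assuming illustrative values of 8 channels and a 3x3 kernel and leaving out the bias so that only kernel weights are counted:

```python
import torch
from torch import nn

c1, k = 8, 3  # illustrative sizes, not from the notes

# groups == in_channels == out_channels: one k x k kernel per input channel.
depthwise = nn.Conv2d(c1, c1, kernel_size=k, groups=c1, bias=False)
standard = nn.Conv2d(c1, c1, kernel_size=k, bias=False)

depthwise_params = depthwise.weight.numel()  # c1 * 1 * k * k
standard_params = standard.weight.numel()    # c1 * c1 * k * k

x = torch.rand(2, c1, 10, 10)
y = depthwise(x)

print(depthwise.weight.shape)             # torch.Size([8, 1, 3, 3])
print(depthwise_params, standard_params)  # 72 576
print(y.shape)                            # torch.Size([2, 8, 8, 8])
```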
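Deformable Convolution
CNNs are limited in modeling unknown geometric transformations because their modules have a fixed geometric structure, i.e. the receptive field is fixed. This hurts tasks that need precise localization, such as segmentation. Adding learnable offsets to the convolution lets the sampling locations of the kernel shift across the feature map, so the network can better learn the features of the ROI.
(a) is an ordinary 3x3 convolution kernel; (b) is a deformable conv, where adding the offsets changes the sampling points; (c) and (d) are special cases of deformable conv.
The green box is the original convolution window. A deformable conv can be viewed as two branches. The first branch learns the offsets with an extra conv (output HxWx2N, where N is the number of kernel sampling points and the factor 2 covers the x and y directions). The resulting offsets are fed together with the feature map into the second branch (equivalent to convolving the feature map over the blue box).
Note: the offsets are learned for every position of the feature map, not for the contents of the kernel!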
torchvision.ops.DeformConv2d.forward(input: Tensor, offset: Tensor, mask: Optional[Tensor] = None)
- input (Tensor[batch_size, in_channels, in_height, in_width]): input tensor
- offset (Tensor[batch_size, 2 * offset_groups * kernel_height * kernel_width, out_height, out_width]): offsets to be applied for each position in the convolution kernel.
- mask (Tensor[batch_size, offset_groups * kernel_height * kernel_width, out_height, out_width]): masks to be applied for each position in the convolution kernel.
import torch
from torchvision.ops import DeformConv2d, deform_conv2d

# Module form: normally created in __init__ (dim, kernel_size and deform_groups come from the surrounding model):
# self.deform_conv = DeformConv2d(dim, dim, kernel_size, padding=2, groups=deform_groups)

# Functional form: the weight is passed in explicitly.
input = torch.rand(4, 3, 10, 10)
kh, kw = 3, 3
weight = torch.rand(5, 3, kh, kw)
# offset and mask must have the same spatial size as the output of the convolution.
# If input h, w = 10, k = 3, s = 1, p = 0 -> output h, w = 8
offset = torch.rand(4, 2 * kh * kw, 8, 8)
mask = torch.rand(4, kh * kw, 8, 8)
out = deform_conv2d(input, offset, weight, mask=mask)
print(out.shape)
# returns torch.Size([4, 5, 8, 8])
Functional implementation of Conv2d: torch.nn.functional.conv2d
torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
Parameters:
- input – input tensor of shape (minibatch, in_channels, inH, inW)
- weight – filters of shape (out_channels, in_channels / groups, kernel_H, kernel_W)
- bias – optional bias tensor of shape (out_channels). Default: None
- stride – the stride of the convolving kernel. Can be a single number or a tuple (sH, sW). Default: 1.
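nn.Conv2d is a subclass of nn.Module, whereas torch.nn.functional.conv2d is a plain function.
As a result, nn.Conv2d can be placed inside nn.Sequential while nn.functional.conv2d cannot. In addition, nn.functional.conv2d needs the weight to be defined by the caller and passed in by hand on every call.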
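The difference can be seen in a short sketch, assuming illustrative shapes: the module owns its weight and bias, while the functional call receives them as arguments. Reusing the module's parameters in the functional call gives the same result.

```python
import torch
from torch import nn
import torch.nn.functional as F

x = torch.rand(4, 3, 10, 10)

# Module form: weight and bias are created and stored by the layer itself.
conv = nn.Conv2d(3, 5, kernel_size=3, stride=1, padding=1)
out_module = conv(x)

# Functional form: weight and bias are passed in explicitly on every call.
out_functional = F.conv2d(x, conv.weight, conv.bias, stride=1, padding=1)

# Only the module can be used as a layer inside nn.Sequential.
model = nn.Sequential(conv, nn.ReLU())
out_seq = model(x)

print(torch.allclose(out_module, out_functional))  # True
print(out_seq.shape)                               # torch.Size([4, 5, 10, 10])
```

The functional form is convenient when the weight comes from somewhere else, for example a weight shared between layers or generated by another network.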