Deformable Convolution

qq_478377515

Last modified: 2022-09-18 06:52:21

Column: Daily paper reading and practical tips (computer vision). Tags: deep learning, cnn, machine learning

First published: 2022-09-04 23:06:14

Article link: https://blog.csdn.net/qq_33031419/article/details/126692960

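References:

Deformable convolution, from concept to implementation (Clark-dj's blog, CSDN)

Learning deformable convolution (RepPoints) (清梦枕星河~'s blog, CSDN)

Lesson 36: Deformable convolution (tzc_fly's blog, CSDN)

A more flexible, more individual convolution: Deformable Conv (bilibili)

Deformable Convolution: on deformable convolution (Zhihu)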

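Note: there is one point here that is very, very easy to get confused about. What exactly is "deformable" in a deformable convolution? Many people assume that deformable conv learns a deformable kernel. It does not! The paper does not learn offsets for the kernel weights; it learns offsets for every position of the feature map, so the same kernel samples the input at different shifted locations at each position.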
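To make the distinction concrete, here is a minimal pure-Python sketch of a single-channel 3x3 deformable convolution evaluated at one output location. The helper names `bilinear` and `deform_conv_at` are made up for this illustration and are not part of any library. The kernel weights are shared across all locations; only the sampling points move, and a fractional sampling point is read with bilinear interpolation.

```python
import math

def bilinear(feat, y, x):
    """Sample the 2D list `feat` at fractional (y, x); points outside count as 0."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feat[yy][xx]
    return value

def deform_conv_at(feat, kernel, offsets, cy, cx):
    """Output of a 3x3 deformable convolution at the single location (cy, cx).

    `kernel` is the shared 3x3 weight grid. `offsets` holds one (dy, dx)
    pair per kernel tap, and belongs to THIS output location only.
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i * 3 + j]
            out += kernel[i][j] * bilinear(feat, cy + i - 1 + dy, cx + j - 1 + dx)
    return out

# Toy 4x4 feature map with feat[r][c] = 4*r + c, and an all-ones kernel.
feat = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]

zero = [(0.0, 0.0)] * 9      # no shift: behaves like an ordinary convolution
shifted = [(0.0, 0.5)] * 9   # every tap moved half a pixel to the right

plain = deform_conv_at(feat, kernel, zero, 1, 1)
moved = deform_conv_at(feat, kernel, shifted, 1, 1)
print(plain, moved)
```

In a real layer the nine offset pairs are not fixed by hand; they are predicted from the features, separately for each output location.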

Example results

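In the figure above, when the green point lies on an object, the red points (the sampling locations) also concentrate on that object and largely cover objects of different sizes. Deformable convolution therefore lets us extract more complete features of the object of interest, and the results are quite good.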

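DCN sounds good, but it has a problem of its own: deformable convolution may pull in useless context (regions) that interferes with feature extraction, which clearly hurts performance. The comparison results in the figure above (polygon region boxes, third row) show that DCN does introduce irrelevant information; for example, the wall region picks up too much information from the cat.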

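Perhaps we should stop using convolution to learn the offsets and instead consider a self-attention mechanism that learns more global offsets, in order to remove the interference from irrelevant context.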
————————————————
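Copyright notice: this is an original article by CSDN blogger "tzc_fly", released under the CC 4.0 BY-SA license. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/qq_40943760/article/details/124984314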

An empirical analysis of the deform_conv2d operation in torchvision

Reference: An empirical analysis of the deform_conv2d operation in torchvision (Zhihu)

Parameters

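input (Tensor[batch_size, in_channels, in_height, in_width]): input tensor. The input data.
offset (Tensor[batch_size, 2 * offset_groups * kernel_height * kernel_width, out_height, out_width]): offsets to be applied for each position in the convolution kernel. These shift the locations on the input feature map where each kernel element is applied, i.e. they adjust the sampling points. The offset groups correspond to groups of input channels, so offset_groups is at most in_channels (one set of offsets per channel) and at least 1 (one set shared by all channels).
weight (Tensor[out_channels, in_channels // groups, kernel_height, kernel_width]): convolution weights, split into groups of size (in_channels // groups). The actual kernel parameters. Keep in mind that a deformable convolution is still a convolution; only the sampling points differ. In addition, v2 adds a spatial modulation to every convolution operation (it can be understood as spatial attention).
bias (Tensor[out_channels]): optional bias of shape (out_channels,). Default: None. The bias of the convolution.
stride (int or Tuple[int, int]): distance between convolution centers. Default: 1. The stride of the sliding window.
padding (int or Tuple[int, int]): height/width of padding of zeroes around each image. Default: 0. The number of zeros padded around the input. Note that this padding is symmetric. If you only want to pad one side, preprocess the input feature map directly with F.pad.
dilation (int or Tuple[int, int]): the spacing between kernel elements. Default: 1. The dilation rate of the convolution.
mask (Tensor[batch_size, offset_groups * kernel_height * kernel_width, out_height, out_width]): masks to be applied for each position in the convolution kernel. Default: None. A mask applied to the elements inside the window that actually take part in the computation; it can be understood simply as local spatial attention. The offset_groups implied by mask must match the offset_groups implied by offset, otherwise an error is raised. It is therefore reasonable to infer that mask and offset correspond strictly to each other.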

import torch.nn as nn
from torchvision.ops import DeformConv2d


class DCNConv(nn.Module):
    # A stride-2 3x3 convolution followed by a 3x3 deformable convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        # Note: k, s and p are accepted but unused; both convolutions are hard-coded to 3x3.
        super().__init__()
        self.conv1 = nn.Conv2d(c1, c2, 3, 2, 1, groups=g, bias=False)
        deformable_groups = 1
        offset_channels = 2 * 3 * 3  # one (dy, dx) pair per tap of the 3x3 kernel = 18
        # Predicts the offsets from the features; padding=1 keeps the spatial size
        self.conv2_offset = nn.Conv2d(c2, deformable_groups * offset_channels, kernel_size=3, padding=1)
        self.conv2 = DeformConv2d(c2, c2, kernel_size=3, padding=1, bias=False)

        self.bn1 = nn.BatchNorm2d(c2)
        self.act1 = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        self.bn2 = nn.BatchNorm2d(c2)
        self.act2 = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

    def forward(self, x):
        # Ordinary convolution: halves the spatial resolution
        x = self.act1(self.bn1(self.conv1(x)))
        # Offsets are predicted from the same feature map they are applied to
        offset = self.conv2_offset(x)
        x = self.act2(self.bn2(self.conv2(x, offset)))
        return x