
Refining activation downsampling with SoftPool




$\frac{e^{p_i}}{\sum_{j=1}^{4} e^{p_j}}$
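As a minimal sketch of the weight formula above, the snippet below computes the four weights for one 2×2 region in plain Python. The function name `softpool_weights` and the sample activations are illustrative assumptions and are not taken from the SoftPool code base. Subtracting the maximum before exponentiating is a standard trick that leaves the weights unchanged.

```python
import math

def softpool_weights(activations):
    """Return e^{p_i} / sum_j e^{p_j} for the activations of one pooling region."""
    m = max(activations)  # shift by the maximum so exp() cannot overflow
    exps = [math.exp(p - m) for p in activations]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2x2 region, flattened to four activations p_1..p_4.
region = [1.0, 2.0, 3.0, 4.0]
weights = softpool_weights(region)
print([round(w, 4) for w in weights])  # [0.0321, 0.0871, 0.2369, 0.6439]
```

The weights sum to 1 and grow with the activation value, so larger activations contribute more to the pooled output.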


3.1 Pooling variants

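  The figure above shows several pooling variants: Average Pooling, Max Pooling, Power Average Pooling, Stochastic Pooling, S3 Pooling, Local Importance Pooling, and SoftPool. A few observations follow: (1) most of these operations are variants of either max pooling or average pooling; (2) S3 pooling follows an idea similar to max pooling; (3) most of the remaining operations are variants of average pooling; (4) Local Importance Pooling and SoftPool follow a similar idea: both compute a weight for each position in a region of the input and accumulate the weighted values.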
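To make the relationship between these variants concrete, here is a small sketch that applies average pooling, max pooling, and SoftPool to the same region. The helper names and the sample activations are assumptions chosen for illustration; the other variants listed above are left out.

```python
import math

def average_pool(region):
    return sum(region) / len(region)

def max_pool(region):
    return max(region)

def soft_pool(region):
    # Softmax-weighted sum of the activations in the region.
    m = max(region)  # shift for numerical stability; the result is unchanged
    exps = [math.exp(a - m) for a in region]
    return sum(a * e for a, e in zip(region, exps)) / sum(exps)

region = [0.0, 1.0, 2.0, 5.0]  # hypothetical activations of one region
avg, soft, mx = average_pool(region), soft_pool(region), max_pool(region)
print(avg, soft, mx)
```

For this region the SoftPool output lies between the average and the maximum, which fits the view of SoftPool as a weighted average that favors larger activations.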

3.2 SoftPool computation

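  The forward pass consists of two steps: (1) compute the weights $w$ for the candidate 3×3 region; (2) multiply the weights $w$ element-wise with the activation map $a$ and sum the products to obtain $\tilde{a}$.
  The backward pass consists of two steps: (1) compute the gradient $\nabla \tilde{a}$ of $\tilde{a}$; (2) multiply $\nabla \tilde{a}$ by the weights $w$ to obtain $\nabla a$.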
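The two passes described above can be sketched for a single region as follows. This is an illustration in plain Python, not the library's implementation: the function names and the sample region are assumptions, and the backward step follows the description literally by treating the weights as constants rather than differentiating through them.

```python
import math

def softpool_forward(region):
    """Forward pass: weights w, then a_tilde = sum_i w_i * a_i."""
    m = max(region)  # shift for numerical stability
    exps = [math.exp(a - m) for a in region]
    total = sum(exps)
    w = [e / total for e in exps]
    a_tilde = sum(wi * ai for wi, ai in zip(w, region))
    return a_tilde, w

def softpool_backward(grad_a_tilde, w):
    """Backward pass as described above: grad_a_i = w_i * grad_a_tilde.

    Assumption: the weights are treated as constants, so the derivative of
    w with respect to the activations is not included.
    """
    return [wi * grad_a_tilde for wi in w]

# Hypothetical 3x3 region, flattened row by row.
region = [0.5, -1.0, 2.0, 0.0, 1.5, 0.3, -0.2, 0.8, 1.1]
a_tilde, w = softpool_forward(region)
grad_a = softpool_backward(1.0, w)
```

Under this reading every activation in the region receives a share of the incoming gradient, and the largest activation receives the largest share.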



import torch
import torch.nn.functional as F
from torch.nn.modules.utils import _single, _pair, _triple

# NOTE: CUDA_SOFTPOOL1d, CUDA_SOFTPOOL2d and CUDA_SOFTPOOL3d are the custom CUDA ops
# mentioned in the docstrings; they are not defined in this listing.

# ---  S T A R T  O F  F U N C T I O N  S O F T _ P O O L 1 D  ---
def soft_pool1d(x, kernel_size=2, stride=None, force_inplace=False):
    '''
    Function for downsampling based on the exponential proportion rate of pixels (soft pooling).
    If the tensor is in CUDA the custom operation is used. Alternatively, the function uses
    standard (mostly) in-place PyTorch operations for speed and reduced memory consumption.
    It is also possible to use non-inplace operations in order to improve stability.

    Args:
        - x: PyTorch Tensor, either on CPU or on CUDA. If on CUDA, the extension of the same name is used.
        - kernel_size: Integer or Tuple, for the kernel size to be used for downsampling. If an `Integer`
                       is used, a `Tuple` is created for the rest of the dimensions. Defaults to 2.
        - stride: Integer or Tuple, for the steps taken between kernels (i.e. strides). If `None` the
                  strides become equal to the `kernel_size` tuple. Defaults to `None`.
        - force_inplace: Bool, determines if in-place operations are to be used regardless of the CUDA
                         custom op. Mostly useful for time monitoring. Defaults to `False`.

    Returns:
        - PyTorch Tensor, subsampled based on the specified `kernel_size` and `stride`.
    '''
    if x.is_cuda and not force_inplace:
        x = CUDA_SOFTPOOL1d.apply(x, kernel_size, stride)
        # Replace NaNs if found
        if torch.isnan(x).any():
            return torch.nan_to_num(x)
        return x
    kernel_size = _single(kernel_size)
    # Default the stride to the kernel size; otherwise normalize it to a tuple
    stride = kernel_size if stride is None else _single(stride)
    # Get input sizes
    _, c, d = x.size()
    # Per-element exponential of the input: Tensor [b x c x d]
    e_x = torch.exp(x)
    # Weight the input by its exponential, pool, and divide by the pooled exponentials
    # Tensor: [b x c x d] -> [b x c x d']
    # The factor sum(kernel_size) scales numerator and denominator alike, so it cancels in the ratio
    return F.avg_pool1d(x.mul(e_x), kernel_size, stride=stride).mul_(sum(kernel_size)).div_(F.avg_pool1d(e_x, kernel_size, stride=stride).mul_(sum(kernel_size)))
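As a cross-check of the PyTorch fallback path, the following pure-Python sketch pools a single 1D row window by window. The name `soft_pool1d_reference` is hypothetical. It assumes no padding and drops a trailing partial window, which is how `F.avg_pool1d` behaves with its default `ceil_mode=False`.

```python
import math

def soft_pool1d_reference(row, kernel_size=2, stride=None):
    """Pure-Python SoftPool over one 1D row (no padding, full windows only)."""
    stride = kernel_size if stride is None else stride
    out = []
    for start in range(0, len(row) - kernel_size + 1, stride):
        window = row[start:start + kernel_size]
        exps = [math.exp(v) for v in window]
        # Ratio of averages, mirroring avg_pool(x * e^x) / avg_pool(e^x).
        num = sum(v * e for v, e in zip(window, exps)) / kernel_size
        den = sum(exps) / kernel_size
        out.append(num / den)
    return out

row = [0.0, 1.0, 2.0, 3.0, 4.0]
pooled = soft_pool1d_reference(row)
print(pooled)  # two windows: [0, 1] and [2, 3]; the trailing 4.0 is dropped
```

Because the same constant divides both `num` and `den`, any common scale factor applied to the two pooled tensors has no effect on the result.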


# ---  S T A R T  O F  F U N C T I O N  S O F T _ P O O L 2 D  ---
def soft_pool2d(x, kernel_size=2, stride=None, force_inplace=False):
    '''
    Function for downsampling based on the exponential proportion rate of pixels (soft pooling).
    If the tensor is in CUDA the custom operation is used. Alternatively, the function uses
    standard (mostly) in-place PyTorch operations for speed and reduced memory consumption.
    It is also possible to use non-inplace operations in order to improve stability.

    Args:
        - x: PyTorch Tensor, either on CPU or on CUDA. If on CUDA, the extension of the same name is used.
        - kernel_size: Integer or Tuple, for the kernel size to be used for downsampling. If an `Integer`
                       is used, a `Tuple` is created for the rest of the dimensions. Defaults to 2.
        - stride: Integer or Tuple, for the steps taken between kernels (i.e. strides). If `None` the
                  strides become equal to the `kernel_size` tuple. Defaults to `None`.
        - force_inplace: Bool, determines if in-place operations are to be used regardless of the CUDA
                         custom op. Mostly useful for time monitoring. Defaults to `False`.

    Returns:
        - PyTorch Tensor, subsampled based on the specified `kernel_size` and `stride`.
    '''
    if x.is_cuda and not force_inplace:
        x = CUDA_SOFTPOOL2d.apply(x, kernel_size, stride)
        # Replace NaNs if found
        if torch.isnan(x).any():
            return torch.nan_to_num(x)
        return x
    kernel_size = _pair(kernel_size)
    # Default the stride to the kernel size; otherwise normalize it to a tuple
    stride = kernel_size if stride is None else _pair(stride)
    # Get input sizes
    _, c, h, w = x.size()
    # Per-element exponential of the input: Tensor [b x c x h x w]
    e_x = torch.exp(x)
    # Weight the input by its exponential, pool, and divide by the pooled exponentials
    # Tensor: [b x c x h x w] -> [b x c x h' x w']
    # The factor sum(kernel_size) scales numerator and denominator alike, so it cancels in the ratio
    return F.avg_pool2d(x.mul(e_x), kernel_size, stride=stride).mul_(sum(kernel_size)).div_(F.avg_pool2d(e_x, kernel_size, stride=stride).mul_(sum(kernel_size)))
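The fallback path exponentiates the raw activations, so very large inputs can exceed the floating-point range. The sketch below illustrates this in plain Python for one window and shows the common max-subtraction remedy. Both helper names are assumptions and this is not the approach used in the listing above. In PyTorch, `torch.exp` returns `inf` instead of raising, and `inf / inf` evaluates to NaN.

```python
import math

def soft_pool_naive(window):
    exps = [math.exp(v) for v in window]  # overflows for large v
    return sum(v * e for v, e in zip(window, exps)) / sum(exps)

def soft_pool_stable(window):
    m = max(window)
    exps = [math.exp(v - m) for v in window]  # every exponent is <= 0
    return sum(v * e for v, e in zip(window, exps)) / sum(exps)

small = [0.1, 0.7, -0.3, 1.2]            # hypothetical 2x2 window, flattened
large = [1000.0, 999.0, 998.0, 1001.0]   # activations large enough to overflow exp()

try:
    soft_pool_naive(large)
    naive_failed = False
except OverflowError:
    naive_failed = True  # math.exp(1000.0) exceeds the float range

print(naive_failed, soft_pool_stable(large))
```

On inputs of moderate size the two versions agree, since multiplying numerator and denominator by `exp(-m)` does not change the ratio.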


# ---  S T A R T  O F  F U N C T I O N  S O F T _ P O O L 3 D  ---
def soft_pool3d(x, kernel_size=2, stride=None, force_inplace=False):
    '''
    Function for downsampling based on the exponential proportion rate of pixels (soft pooling).
    If the tensor is in CUDA the custom operation is used. Alternatively, the function uses
    standard (mostly) in-place PyTorch operations for speed and reduced memory consumption.
    It is also possible to use non-inplace operations in order to improve stability.

    Args:
        - x: PyTorch Tensor, either on CPU or on CUDA. If on CUDA, the extension of the same name is used.
        - kernel_size: Integer or Tuple, for the kernel size to be used for downsampling. If an `Integer`
                       is used, a `Tuple` is created for the rest of the dimensions. Defaults to 2.
        - stride: Integer or Tuple, for the steps taken between kernels (i.e. strides). If `None` the
                  strides become equal to the `kernel_size` tuple. Defaults to `None`.
        - force_inplace: Bool, determines if in-place operations are to be used regardless of the CUDA
                         custom op. Mostly useful for time monitoring. Defaults to `False`.

    Returns:
        - PyTorch Tensor, subsampled based on the specified `kernel_size` and `stride`.
    '''
    if x.is_cuda and not force_inplace:
        x = CUDA_SOFTPOOL3d.apply(x, kernel_size, stride)
        # Replace NaNs if found
        if torch.isnan(x).any():
            return torch.nan_to_num(x)
        return x
    kernel_size = _triple(kernel_size)
    # Default the stride to the kernel size; otherwise normalize it to a tuple
    stride = kernel_size if stride is None else _triple(stride)
    # Get input sizes
    _, c, d, h, w = x.size()
    # Per-element exponential of the input: Tensor [b x c x d x h x w]
    e_x = torch.exp(x)
    # Weight the input by its exponential, pool, and divide by the pooled exponentials
    # Tensor: [b x c x d x h x w] -> [b x c x d' x h' x w']
    # The factor sum(kernel_size) scales numerator and denominator alike, so it cancels in the ratio
    return F.avg_pool3d(x.mul(e_x), kernel_size, stride=stride).mul_(sum(kernel_size)).div_(F.avg_pool3d(e_x, kernel_size, stride=stride).mul_(sum(kernel_size)))
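The comments above write the pooled dimensions as d', h', and w'. As a rough guide, the sketch below computes them for the PyTorch fallback path, assuming no padding and the default floor rounding of `F.avg_pool3d`. The helper names are hypothetical, and the CUDA path is not covered by this sketch.

```python
def pooled_size(size, kernel_size, stride=None):
    """Output length along one dimension: floor((size - kernel) / stride) + 1."""
    stride = kernel_size if stride is None else stride
    return (size - kernel_size) // stride + 1

def soft_pool3d_output_shape(shape, kernel_size=2, stride=None):
    """Shape (b, c, d', h', w') for an input of shape (b, c, d, h, w)."""
    b, c, d, h, w = shape
    # Expand integers to 3-tuples, as _triple does in the listing above.
    k = (kernel_size,) * 3 if isinstance(kernel_size, int) else tuple(kernel_size)
    if stride is None:
        s = k
    else:
        s = (stride,) * 3 if isinstance(stride, int) else tuple(stride)
    return (b, c) + tuple(pooled_size(n, ki, si) for n, ki, si in zip((d, h, w), k, s))

print(soft_pool3d_output_shape((8, 16, 10, 32, 32)))        # (8, 16, 5, 16, 16)
print(soft_pool3d_output_shape((8, 16, 10, 32, 32), 3, 2))  # (8, 16, 4, 15, 15)
```

With the default `kernel_size=2` and `stride=None`, each of the three pooled dimensions is halved, while the batch and channel dimensions are unchanged.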











