Bipolar ReLU Activation Function
Paper link: [Shifting Mean Activation Towards Zero with Bipolar Activation Functions](https://arxiv.org/abs/1709.04054)
Year: 2018
Introduction
Bipolar ReLU is an extension of ReLU that shifts a layer's mean activation towards zero; combined with appropriate weight initialization, it can reduce the need for BatchNorm layers. In CNNs trained without batch normalization, Bipolar ReLU speeds up the reduction of training error.
Before Bipolar ReLU was proposed, several ReLU variants had already replaced the zero output in the negative region with negative values, allowing the mean activation to stay closer to zero.
Method
In a neural network, ReLU keeps only the positive part of its input, which biases the mean activation in the positive direction: negative inputs are mapped to zero, so the output cannot remain zero-centered. To correct this, the paper defines the bipolar version $f_B$ of an activation function $f$ as:
$$
f_B(x_i) = \begin{cases} f(x_i), & i \bmod 2 = 0 \\ -f(-x_i), & i \bmod 2 \neq 0 \end{cases}
$$
For convolutional layers, the activation function is flipped for half of the feature maps. This ensures that the mean of the output activations is shifted towards zero regardless of the input. For example, with $f = \text{ReLU}$ and input $x = (-1, -1, 2, 2)$, the bipolar output is $(0, -1, 2, 0)$ with mean 0.25, whereas plain ReLU gives $(0, 0, 2, 2)$ with mean 1.
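As a sketch of what "flipping the activation for half of the feature maps" looks like in code, here is a minimal autograd-friendly version that alternates along the channel dimension of an (N, C, H, W) tensor; the helper name `bipolar_relu` and the choice of dim 1 are illustrative assumptions, not the paper's reference code:

```python
import torch
import torch.nn.functional as F

def bipolar_relu(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: bipolar ReLU along the channel dimension (dim 1)."""
    out = x.clone()
    out[:, 0::2] = F.relu(x[:, 0::2])    # even channels: f(x_i) = max(x_i, 0)
    out[:, 1::2] = -F.relu(-x[:, 1::2])  # odd channels: -f(-x_i) = min(x_i, 0)
    return out

# e.g. a conv feature map: half of the 16 channels keep ReLU, half are flipped
y = bipolar_relu(torch.randn(8, 16, 32, 32))
```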
(Figure: the Bipolar ReLU curve)
(Figure: the Bipolar ELU curve)
Bipolar ReLU and Bipolar ELU exhibit more stable dynamics and are less prone to exploding means and variances.
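To make the mean-shift claim concrete, a quick check (reusing the `bipolar_relu` sketch above) on standard-normal input: plain ReLU pushes the mean to roughly $1/\sqrt{2\pi} \approx 0.40$, while the bipolar version stays near zero because the flipped halves cancel on average:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(10_000, 64)  # standard-normal activations

print(F.relu(x).mean())        # roughly 0.40: ReLU shifts the mean upward
print(bipolar_relu(x).mean())  # roughly 0.00: flipped channels cancel the shift
```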
PyTorch Code
```python
import torch
import torch.nn as nn
from torch.autograd import Function

# Implementation of the BReLU activation function with a custom backward step
class brelu(Function):
    '''
    Implementation of the BReLU activation function.

    Shape:
        - Input: (N, *) where * means any number of additional dimensions
        - Output: (N, *), same shape as the input

    References:
        - BReLU paper: https://arxiv.org/pdf/1709.04054.pdf

    Examples:
        >>> brelu_activation = brelu.apply
        >>> t = torch.randn((5, 5), dtype=torch.float, requires_grad=True)
        >>> t = brelu_activation(t)
    '''

    # both forward and backward are @staticmethods
    @staticmethod
    def forward(ctx, input):
        """
        In the forward pass we receive a Tensor containing the input and return
        a Tensor containing the output. ctx is a context object that can be used
        to stash information for the backward computation via the
        ctx.save_for_backward method.
        """
        ctx.save_for_backward(input)  # save input for the backward pass
        # even/odd indices along the first dimension of the input
        input_shape = input.shape[0]
        even_indices = list(range(0, input_shape, 2))
        odd_indices = list(range(1, input_shape, 2))
        # clone the input tensor so the input itself is left untouched
        output = input.clone()
        # apply f(x) = ReLU(x) where i mod 2 == 0
        output[even_indices] = output[even_indices].clamp(min=0)
        # apply -f(-x) = -ReLU(-x) = min(x, 0) where i mod 2 != 0
        output[odd_indices] = output[odd_indices].clamp(max=0)
        return output

    @staticmethod
    def backward(ctx, grad_output):
        """
        In the backward pass we receive a Tensor containing the gradient of the
        loss with respect to the output, and we need to compute the gradient of
        the loss with respect to the input.
        """
        grad_input = None  # default output
        input, = ctx.saved_tensors  # restore input from context
        # if the input does not require grad, return None to save computation
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.clone()
            # even/odd indices along the first dimension of the input
            input_shape = input.shape[0]
            even_indices = list(range(0, input_shape, 2))
            odd_indices = list(range(1, input_shape, 2))
            # ReLU branch: gradient flows where the input is non-negative
            mask_even = (input[even_indices] >= 0).to(grad_input.dtype)
            grad_input[even_indices] = mask_even * grad_input[even_indices]
            # -ReLU(-x) branch: gradient flows where the input is negative
            mask_odd = (input[odd_indices] < 0).to(grad_input.dtype)
            grad_input[odd_indices] = mask_odd * grad_input[odd_indices]
        return grad_input
```
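A short usage sketch for the class above: wrapping `brelu.apply` in a small `nn.Module` (the wrapper name `BReLU` is my own, not from the paper) and checking the hand-written backward against PyTorch's numerical gradient checker:

```python
# Wrap the Function in a module and sanity-check the hand-written backward.
class BReLU(nn.Module):
    def forward(self, x):
        return brelu.apply(x)

# gradcheck compares the custom backward against numerical gradients;
# it requires double precision inputs.
t = torch.randn(6, 4, dtype=torch.double, requires_grad=True)
assert torch.autograd.gradcheck(brelu.apply, (t,))

layer = BReLU()
print(layer(torch.randn(4, 3)))  # rows alternate between ReLU and -ReLU(-x)
```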
- In a quick experiment of my own, the loss decreased rather slowly with Bipolar ReLU; the results were not as good as the paper claims.