Bipolar ReLU Activation Function

Paper link: [Shifting Mean Activation Towards Zero with Bipolar Activation Functions]

Year: 2018

Introduction

Bipolar ReLU is an extension of ReLU that shifts a layer's mean activation towards zero. Combined with appropriate weight initialization, this reduces the need for BatchNorm layers. In CNNs without batch normalization, Bipolar ReLU helps the training error drop quickly.

Before Bipolar ReLU was proposed, several ReLU variants had already replaced the zero output in the negative region with negative values, allowing the mean activation to move closer to zero.

Method

In a neural network, ReLU keeps only the positive part of its input, which shifts a layer's mean activation in the positive direction, so the outputs are no longer zero-centered. The paper therefore defines the bipolar version $f_B$ of an activation $f$ as:
$$f_B(x_i) = \begin{cases} f(x_i), & i \bmod 2 = 0 \\ -f(-x_i), & i \bmod 2 \neq 0 \end{cases}$$
For convolutional layers, the activation function is flipped for half of the feature maps. This ensures that, regardless of the input, the mean activation of the output is shifted towards zero.
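As a concrete illustration of the definition above, the bipolar construction can be sketched for any elementwise activation by alternating f(x) and -f(-x) along the feature dimension. This is only a minimal sketch: the helper name bipolar and the choice of the last dimension as the feature dimension are assumptions made here, not something prescribed by the paper.

import torch
import torch.nn.functional as F

def bipolar(f, x, dim=-1):
    # Apply f to even-indexed units and -f(-x) to odd-indexed units along `dim`
    # (assumed here to be the feature/channel dimension).
    idx = torch.arange(x.size(dim), device=x.device)
    shape = [1] * x.dim()
    shape[dim] = -1                          # broadcast the parity mask along `dim`
    even = (idx % 2 == 0).view(shape)
    return torch.where(even, f(x), -f(-x))

x = torch.randn(8, 6)
y_brelu = bipolar(F.relu, x)   # bipolar ReLU
y_belu = bipolar(F.elu, x)     # bipolar ELU
print(F.relu(x).mean().item(), y_brelu.mean().item())  # the bipolar mean is typically much closer to zero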
(Figure: the Bipolar ReLU curve)
(Figure: the Bipolar ELU curve)
Bipolar ReLU and Bipolar ELU exhibit more stable dynamics and are less prone to exploding means and variances.
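To make this claim concrete, the sketch below (an illustration only; the depth, width, and 1/sqrt(fan_in) weight scaling are arbitrary choices, not the paper's exact setup) compares how the mean activation drifts when random linear layers are stacked with plain ReLU versus Bipolar ReLU.

import torch

def bipolar_relu(x):
    # even-indexed features: ReLU(x); odd-indexed features: -ReLU(-x) == min(x, 0)
    out = x.clamp(min=0)
    out[..., 1::2] = x[..., 1::2].clamp(max=0)
    return out

torch.manual_seed(0)
x = torch.randn(256, 128)
h_relu, h_bip = x, x
for _ in range(10):
    w = torch.randn(128, 128) / 128 ** 0.5   # simple random init (an assumption for this sketch)
    h_relu = torch.relu(h_relu @ w)
    h_bip = bipolar_relu(h_bip @ w)

print("ReLU mean / std:        ", h_relu.mean().item(), h_relu.std().item())
print("Bipolar ReLU mean / std:", h_bip.mean().item(), h_bip.std().item())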

PyTorch Code

import torch
from torch.autograd import Function  # base class for defining a custom autograd Function
# Implementation of BReLU activation function with custom backward step
class brelu(Function):
    '''
    Implementation of the BReLU (bipolar ReLU) activation function.
    Even-indexed units along the last (feature) dimension get ReLU(x),
    odd-indexed units get -ReLU(-x).
    Shape:
        - Input: (N, *) where * means any number of additional dimensions
        - Output: (N, *), same shape as the input
    References:
        - See BReLU paper:
        https://arxiv.org/pdf/1709.04054.pdf
    Examples:
        >>> brelu_activation = brelu.apply
        >>> t = torch.randn((5,5), dtype=torch.float, requires_grad = True)
        >>> t = brelu_activation(t)
    '''
    #both forward and backward are @staticmethods
    @staticmethod
    def forward(ctx, input):
        """
        In the forward pass we receive a Tensor containing the input and return
        a Tensor containing the output. ctx is a context object that can be used
        to stash information for backward computation. You can cache arbitrary
        objects for use in the backward pass using the ctx.save_for_backward method.
        """
        ctx.save_for_backward(input) # save input for backward pass

        # indices of even- and odd-numbered units along the feature (last) dimension
        num_features = input.shape[-1]
        even_indices = [i for i in range(0, num_features, 2)]
        odd_indices = [i for i in range(1, num_features, 2)]

        # clone the input tensor so the original is left untouched
        output = input.clone()

        # apply ReLU to units where i mod 2 == 0
        output[..., even_indices] = output[..., even_indices].clamp(min=0)

        # apply -ReLU(-x) to units where i mod 2 != 0
        output[..., odd_indices] = 0 - output[..., odd_indices]            # negate units with odd indices
        output[..., odd_indices] = -output[..., odd_indices].clamp(min=0)  # apply the reversed ReLU

        return output

    @staticmethod
    def backward(ctx, grad_output):
        """
        In the backward pass we receive a Tensor containing the gradient of the loss
        with respect to the output, and we need to compute the gradient of the loss
        with respect to the input.
        """
        grad_input = None # set output to None

        input, = ctx.saved_tensors # restore input from context

        # check that input requires grad
        # if not requires grad we will return None to speed up computation
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.clone()

            # indices of even- and odd-numbered units along the feature (last) dimension
            num_features = input.shape[-1]
            even_indices = [i for i in range(0, num_features, 2)]
            odd_indices = [i for i in range(1, num_features, 2)]

            # gradient of ReLU for even-indexed units: pass the gradient where input >= 0
            grad_input[..., even_indices] = (input[..., even_indices] >= 0).float() * grad_input[..., even_indices]

            # gradient of -ReLU(-x) for odd-indexed units: pass the gradient where input < 0
            grad_input[..., odd_indices] = (input[..., odd_indices] < 0).float() * grad_input[..., odd_indices]

        return grad_input
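As a quick sanity check (not part of the original post), the custom backward can be compared against PyTorch's numerical gradients with torch.autograd.gradcheck; gradcheck requires double-precision inputs, and the (4, 6) shape below is arbitrary.

brelu_activation = brelu.apply
x = torch.randn(4, 6, dtype=torch.double, requires_grad=True)
# gradcheck compares the Function's analytic backward with numerical gradients
print(torch.autograd.gradcheck(brelu_activation, (x,)))  # expected to print True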
  • In a quick experiment, the loss with Bipolar ReLU decreased more slowly than expected; it did not perform as well as the paper suggests.