Rectified Linear Unit (ReLU)

Reposted on 2015-11-18 15:57:21

The Rectified Linear Unit (ReLU) computes the function f(x) = max(0, x), which simply thresholds the activation at zero.
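
As a concrete illustration, here is a minimal NumPy sketch (the function and array names are my own, not from any particular library) of ReLU as an element-wise threshold at zero:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: f(x) = max(0, x)."""
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))  # -> [0., 0., 0., 1.5, 3.]
```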

There are several pros and cons to using ReLUs:

  1. (Pros) Compared to sigmoid/tanh neurons that involve expensive operations (exponentials, etc.), the ReLU can be implemented by simply thresholding a matrix of activations at zero. Moreover, ReLUs do not suffer from saturation.
  2. (Pros) It was found to greatly accelerate the convergence of stochastic gradient descent compared to the sigmoid/tanh functions. It is argued that this is due to its linear, non-saturating form.
  3. (Cons) Unfortunately, ReLU units can be fragile during training and can “die” (see the sketch after this list). For example, a large gradient flowing through a ReLU neuron could cause the weights to update in such a way that the neuron never activates on any datapoint again. If this happens, the gradient flowing through the unit will be zero forever from that point on; that is, ReLU units can irreversibly die during training since they can get knocked off the data manifold. For example, you may find that as much as 40% of your network is “dead” (i.e., neurons that never activate across the entire training dataset) if the learning rate is set too high. With a proper setting of the learning rate this is less frequently an issue.
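
To make the “dying” failure mode concrete, the following sketch (NumPy only, names assumed) shows that the ReLU gradient is zero wherever the pre-activation is not positive, so a unit whose pre-activation is negative on every training example receives no further weight updates:

```python
import numpy as np

def relu_grad(x):
    """Derivative of ReLU with respect to its input: 1 where x > 0, else 0."""
    return (x > 0).astype(x.dtype)

# A "dead" unit: its pre-activation is negative on every datapoint,
# so the gradient flowing back through it is identically zero.
pre_activations = np.array([-3.2, -0.7, -1.1, -0.05])
print(relu_grad(pre_activations))  # [0. 0. 0. 0.] -> the weights never change again
```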

Leaky ReLU

Leaky ReLUs are one attempt to fix the “dying ReLU” problem. Instead of the function being zero when x < 0, a leaky ReLU has a small negative slope (of 0.01, or so). That is, the function computes f(x) = ax if x < 0 and f(x) = x if x ≥ 0, where a is a small constant. Some people report success with this form of activation function, but the results are not always consistent.
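
A minimal sketch of the leaky variant, assuming the commonly quoted slope of 0.01 (the function name is my own):

```python
import numpy as np

def leaky_relu(x, a=0.01):
    """Leaky ReLU: f(x) = x for x >= 0 and f(x) = a*x for x < 0, with a small fixed a."""
    return np.where(x >= 0, x, a * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(leaky_relu(x))  # -> [-0.02, -0.005, 0., 1.5]
```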

Parametric ReLU

The rectified unit family includes several variants beyond the original ReLU. The first is the parametric rectified linear unit (PReLU), in which the slope of the negative part is learned from the data rather than pre-defined.
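
As a rough sketch of the idea (array shapes and names are my own assumptions, not the formulation from the PReLU paper), the forward pass is the same as Leaky ReLU except that the negative slope a is a trainable parameter, so the backward pass must also return a gradient for a:

```python
import numpy as np

def prelu(x, a):
    """PReLU: f(x) = x for x >= 0 and f(x) = a*x for x < 0, where a is learned."""
    return np.where(x >= 0, x, a * x)

def prelu_backward(x, a, grad_out):
    """Gradients with respect to the input x and the learned slope a."""
    grad_x = grad_out * np.where(x >= 0, 1.0, a)
    grad_a = np.sum(grad_out * np.where(x >= 0, 0.0, x))  # only negative inputs contribute
    return grad_x, grad_a

x = np.array([-2.0, 0.5, -1.0, 3.0])
print(prelu(x, a=0.25))                                     # [-0.5, 0.5, -0.25, 3.]
print(prelu_backward(x, a=0.25, grad_out=np.ones_like(x)))  # grad_a = -3.0
```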

Randomized ReLU

In RReLU, the slope of the negative part is randomized within a given range during training and then fixed during testing. As mentioned in [B. Xu, N. Wang, T. Chen, and M. Li. Empirical Evaluation of Rectified Activations in Convolutional Network. In ICML Deep Learning Workshop, 2015.], in a recent Kaggle National Data Science Bowl (NDSB) competition it was reported that RReLU could reduce overfitting due to its randomized nature. Moreover, as suggested by the NDSB competition winner, the random slope a_i is sampled from 1/U(3, 8) during training, and at test time it is fixed to its expectation, i.e., 2/(l+u) = 2/11.
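
As a hedged illustration of the setup described above (a minimal NumPy sketch, not the competition code; the function name and the training flag are my own), the negative slope is drawn as 1/a with a ~ U(3, 8) during training and fixed to 2/(l+u) = 2/11 at test time:

```python
import numpy as np

rng = np.random.default_rng(0)

def rrelu(x, training, l=3.0, u=8.0):
    """RReLU: negative slope is 1/a with a ~ U(l, u) during training,
    and is fixed to 2/(l+u) at test time."""
    if training:
        slope = 1.0 / rng.uniform(l, u, size=x.shape)
    else:
        slope = 2.0 / (l + u)  # 2/11 for the NDSB setting l=3, u=8
    return np.where(x >= 0, x, slope * x)

x = np.array([-2.0, -0.5, 1.0])
print(rrelu(x, training=True))   # randomized negative slopes
print(rrelu(x, training=False))  # deterministic slope 2/11 ≈ 0.1818
```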

In conclusion, all three ReLU variants consistently outperform the original ReLU on the three data sets evaluated in that paper, and PReLU and RReLU seem to be the better choices.
