AI / Machine Learning Fundamentals: A Summary of Activation Functions

ZreviaX

Published 2024-04-13 14:00:00

Category: AI / Machine Learning Fundamentals
Tags: machine learning, artificial intelligence, deep learning, activation functions

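Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/WindGrin_/article/details/137690000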

Activation Functions

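Sigmoid (Logistic)

Mathematical description

$\sigma(x)=\frac{1}{1+\exp (-x)}$
Because the sigmoid flattens out at both ends, a poor parameter initialization can leave neurons fully saturated: their outputs sit very close to 0 or 1, where the derivative $\sigma(x)(1-\sigma(x))$ is close to 0, and the gradient vanishes.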
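To make the saturation effect concrete, here is a minimal sketch in plain Python. The helper names `sigmoid` and `sigmoid_grad` and the sample inputs are illustrative choices, not part of the original article; the derivative uses the standard identity $\sigma'(x)=\sigma(x)(1-\sigma(x))$.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, branching on the sign of x to avoid overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# The gradient peaks at x = 0 and shrinks quickly as |x| grows.
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigma={sigmoid(x):.6f}  grad={sigmoid_grad(x):.6f}")
```

The derivative is largest at $x=0$, where it equals 0.25, so stacking many saturated sigmoid layers multiplies many small factors together during backpropagation.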

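Tanh

Mathematical description

$\tanh (x)=\frac{\exp (x)-\exp (-x)}{\exp (x)+\exp (-x)}$
Like the sigmoid, Tanh flattens out at both ends, so a poor parameter initialization can leave neurons fully saturated: their outputs sit very close to -1 or 1, and the gradient vanishes.
The output of Tanh is zero-centered, whereas the output of the Logistic function is always greater than 0. A non-zero-centered output introduces a bias shift in the inputs of the neurons in the next layer, which in turn slows down the convergence of gradient descent.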
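The zero-centering difference can be checked numerically. The sketch below averages both functions over a symmetric grid of inputs (the grid itself is an arbitrary choice for illustration) and also checks the well-known identity $\tanh(x)=2\sigma(2x)-1$, which shows that Tanh is a rescaled and shifted sigmoid.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid (the plain formula is fine for the small inputs used here)."""
    return 1.0 / (1.0 + math.exp(-x))

# Symmetric grid on [-3, 3]; chosen only for illustration.
xs = [i / 10.0 for i in range(-30, 31)]

mean_tanh = sum(math.tanh(x) for x in xs) / len(xs)
mean_sigmoid = sum(sigmoid(x) for x in xs) / len(xs)
print("mean tanh output:   ", mean_tanh)
print("mean sigmoid output:", mean_sigmoid)

# Largest deviation from the identity tanh(x) = 2 * sigmoid(2x) - 1 on the grid.
max_gap = max(abs(math.tanh(x) - (2.0 * sigmoid(2.0 * x) - 1.0)) for x in xs)
print("max identity gap:   ", max_gap)
```

On inputs that are symmetric around 0, Tanh outputs average to roughly 0 while sigmoid outputs average to roughly 0.5, which is the offset the next layer sees.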

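Hard-Logistic & Hard-Tanh

Both functions approximate the original with its first-order Taylor expansion around $x=0$ and then clip the result to the original output range. For the Logistic function the expansion is $g_{l}(x)=\sigma(0)+\sigma^{\prime}(0)\,x=0.5+0.25x$; for Tanh it is $g_{t}(x)=\tanh(0)+\tanh^{\prime}(0)\,x=x$.

$\begin{aligned} \text { hard-logistic }(x) &= \begin{cases}1 & g_{l}(x) \geq 1 \\ g_{l}(x) & 0<g_{l}(x)<1 \\ 0 & g_{l}(x) \leq 0\end{cases} \\ &=\max \left(\min \left(g_{l}(x), 1\right), 0\right) \\ &=\max (\min (0.25 x+0.5,1), 0) . \end{aligned}$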

$\begin{aligned} \text { hard-tanh }(x) &=\max \left(\min \left(g_{t}(x), 1\right),-1\right) \\ &=\max (\min (x, 1),-1) \end{aligned}$
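The two clipped forms translate directly into code. This is a minimal sketch; the function names `hard_logistic` and `hard_tanh` are illustrative.

```python
def hard_logistic(x: float) -> float:
    """Clip the linear approximation 0.25 * x + 0.5 to the range [0, 1]."""
    return max(min(0.25 * x + 0.5, 1.0), 0.0)

def hard_tanh(x: float) -> float:
    """Clip the identity approximation x to the range [-1, 1]."""
    return max(min(x, 1.0), -1.0)

for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:5.1f}  hard_logistic={hard_logistic(x):.2f}  hard_tanh={hard_tanh(x):.2f}")
```

Neither function needs an exponential, which is the main practical appeal: hard-logistic is exactly linear on $-2<x<2$ and hard-tanh on $-1<x<1$, and both are constant outside those ranges.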

ReLU

Rectified Linear Unit

Mathematical description

$\begin{aligned} \operatorname{ReLU}(x) &= \begin{cases}x & x \geq 0 \\ 0 & x<0\end{cases} \\ &=\max (0, x) . \end{aligned}$
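Dying ReLU Problem

- If, after a bad parameter update, a ReLU neuron in the first hidden layer is not activated on any training example, the gradient with respect to that neuron's own parameters is always 0, so the neuron can never be activated again for the rest of training. The same can happen in other hidden layers.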
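The mechanism can be reproduced with a single one-input neuron. The sketch below is a toy setup, not the article's experiment: the inputs, the weight, and the bias of -5.0 standing in for "a bad update" are all made-up values, and the loss is simply the sum of the neuron's outputs.

```python
def relu_grad(z: float) -> float:
    """Derivative of ReLU with respect to its input (taken as 0 at z == 0)."""
    return 1.0 if z > 0 else 0.0

def param_grads(w: float, b: float, inputs: list) -> tuple:
    """Gradients of sum_i ReLU(w * x_i + b) with respect to w and b."""
    grad_w = sum(relu_grad(w * x + b) * x for x in inputs)
    grad_b = sum(relu_grad(w * x + b) for x in inputs)
    return grad_w, grad_b

inputs = [0.1, 0.5, 0.9, 1.3]  # made-up 1-D training inputs

# Healthy neuron: some pre-activations are positive, so gradients flow.
print(param_grads(1.0, 0.0, inputs))

# Hypothetical state after a bad update: every pre-activation is negative,
# so both gradients are exactly zero and gradient descent cannot recover.
print(param_grads(1.0, -5.0, inputs))
```

Because both gradients are zero on every training example, no later gradient step can change `w` or `b`, which is why the neuron stays dead.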

Leaky ReLU

Mathematical description

$\begin{aligned} \operatorname{LeakyReLU}(x) &= \begin{cases}x & \text { if } x>0 \\ \gamma x & \text { if } x \leq 0\end{cases} \\ &=\max (0, x)+\gamma \min (0, x) \end{aligned}$
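A minimal sketch of the formula and its derivative follows. The default slope `gamma=0.01` is an assumption for illustration (the article does not fix a value); in practice $\gamma$ is a small positive constant.

```python
def leaky_relu(x: float, gamma: float = 0.01) -> float:
    """max(0, x) + gamma * min(0, x); gamma=0.01 is an assumed default."""
    return max(0.0, x) + gamma * min(0.0, x)

def leaky_relu_grad(x: float, gamma: float = 0.01) -> float:
    """Derivative: 1 for x > 0, gamma otherwise (never exactly zero)."""
    return 1.0 if x > 0 else gamma

for x in (-2.0, -0.5, 0.5, 2.0):
    print(f"x={x:5.1f}  y={leaky_relu(x):8.4f}  dy/dx={leaky_relu_grad(x):.2f}")
```

Since the derivative is $\gamma$ rather than 0 for negative inputs, a neuron that is inactive on all training data still receives a small gradient, which mitigates the dying ReLU problem.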

PReLU

Parametric ReLU

Mathematical description

$\begin{aligned} \operatorname{PReLU}_{i}(x) &= \begin{cases}x & \text { if } x>0 \\ \gamma_{i} x & \text { if } x \leq 0\end{cases} \\ &=\max (0, x)+\gamma_{i} \min (0, x) \end{aligned}$
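PReLU has the same form as Leaky ReLU, except that the slope $\gamma_{i}$ is a learnable parameter of the $i$-th neuron (or channel). The sketch below applies a per-element slope and also computes the gradient of the output with respect to each $\gamma_{i}$, which is what a training step would use. Function names and sample values are illustrative.

```python
def prelu(xs: list, gammas: list) -> list:
    """Apply PReLU element-wise, with one slope gamma_i per input x_i."""
    return [max(0.0, x) + g * min(0.0, x) for x, g in zip(xs, gammas)]

def prelu_grad_gamma(xs: list) -> list:
    """d PReLU_i / d gamma_i = min(0, x_i): non-zero only for negative inputs."""
    return [min(0.0, x) for x in xs]

xs = [2.0, -2.0]
gammas = [0.1, 0.25]  # made-up slopes; in training these would be learned
print(prelu(xs, gammas))
print(prelu_grad_gamma(xs))
```

Setting every $\gamma_{i}$ to 0 recovers ReLU, and setting them all to the same small constant recovers Leaky ReLU.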

ELU

Exponential Linear Unit

Mathematical description

$\begin{aligned} \operatorname{ELU}(x) &= \begin{cases}x & \text { if } x>0 \\ \gamma(\exp (x)-1) & \text { if } x \leq 0\end{cases} \\ &=\max (0, x)+\min (0, \gamma(\exp (x)-1)) \end{aligned}$
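A minimal sketch of ELU is shown below. It uses the piecewise form rather than the max/min form so that `exp` is never evaluated for large positive inputs, and it uses `math.expm1` for accuracy near 0. The default `gamma=1.0` is an assumption, and the two forms agree only when $\gamma \geq 0$.

```python
import math

def elu(x: float, gamma: float = 1.0) -> float:
    """x for x > 0, gamma * (exp(x) - 1) otherwise; gamma=1.0 is an assumed default."""
    return x if x > 0 else gamma * math.expm1(x)

for x in (-50.0, -1.0, 0.0, 1.0):
    print(f"x={x:6.1f}  elu={elu(x):.6f}")
```

For very negative inputs the output approaches $-\gamma$, so ELU saturates on the negative side while staying linear on the positive side.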

SELU

Scaled Exponential Linear Unit

Mathematical description

$\text{SELU}(x) = \text{scale} * (\max(0,x) + \min(0, \alpha * (\exp(x) - 1))) = \text{scale} * \text{ELU}(x, \alpha)$
Here, $\text{scale}$ and $\alpha$ are hyperparameters, fixed to the following values:

$\alpha = 1.6732632423543772848170429916717$

$\text{scale} = 1.0507009873554804934193349852946$
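With the two constants fixed, SELU is just a scaled ELU. The sketch below truncates the constants to double precision; the names `ALPHA`, `SCALE`, and `selu` are illustrative.

```python
import math

# Constants from the definition above, truncated to double precision.
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x: float) -> float:
    """scale * ELU(x, alpha), written piecewise to avoid overflow in exp."""
    return SCALE * (x if x > 0 else ALPHA * math.expm1(x))

for x in (-100.0, -1.0, 0.0, 1.0):
    print(f"x={x:7.1f}  selu={selu(x):.6f}")
```

On the positive side the slope is $\text{scale}$, slightly above 1, and on the negative side the output saturates at $-\text{scale} \cdot \alpha \approx -1.7581$.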

Softplus

Mathematical description

$\text { Softplus }(x)=\log (1+\exp (x))$
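Evaluating $\log(1+\exp(x))$ literally overflows for large $x$. The sketch below uses the algebraically equivalent form $\max(x, 0)+\log(1+\exp(-|x|))$, a common rewriting rather than something stated in the article.

```python
import math

def softplus(x: float) -> float:
    """log(1 + exp(x)), computed as max(x, 0) + log1p(exp(-|x|)) for stability."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

for x in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
    print(f"x={x:8.1f}  softplus={softplus(x):.6f}")
```

Softplus can be viewed as a smooth version of ReLU: it approaches 0 for very negative inputs and approaches $x$ for very positive inputs, and its derivative is the sigmoid $\sigma(x)$.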

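Swish

Mathematical description

$\operatorname{swish}(x)=x \sigma(\beta x)$
Here, $\sigma(\cdot)$ is the Sigmoid and $\beta$ is either a learnable parameter or a fixed hyperparameter.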
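The role of $\beta$ is easiest to see at its extremes. In the sketch below (function names are illustrative), $\beta=0$ gives the linear function $x/2$, and a large $\beta$ makes the sigmoid gate nearly binary so the result approaches ReLU.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, branching on the sign of x to avoid overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def swish(x: float, beta: float = 1.0) -> float:
    """x * sigmoid(beta * x); the sigmoid acts as a soft gate on x."""
    return x * sigmoid(beta * x)

for beta in (0.0, 1.0, 100.0):
    values = [round(swish(x, beta), 4) for x in (-2.0, 0.0, 2.0)]
    print(f"beta={beta:6.1f}  swish(-2, 0, 2) = {values}")
```

Intermediate values of $\beta$ interpolate between these two behaviours, which is why Swish is often described as a self-gated activation.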

GELU

Gaussian Error Linear Unit

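Mathematical description

$\operatorname{GELU}(x)=x P(X \leq x)$
Here, $P(X \leq x)$ is the cumulative distribution function of a Gaussian random variable $X$, usually taken to be standard normal, $X \sim \mathcal{N}(0, 1)$.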
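Assuming $X$ is standard normal, $P(X \leq x)$ is the standard normal CDF $\Phi(x)=\frac{1}{2}\left(1+\operatorname{erf}(x/\sqrt{2})\right)$, which can be evaluated exactly with `math.erf`. The sketch below makes that assumption explicit; the function names are illustrative.

```python
import math

def standard_normal_cdf(x: float) -> float:
    """P(X <= x) for X ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu(x: float) -> float:
    """GELU under the standard-normal assumption: x * Phi(x)."""
    return x * standard_normal_cdf(x)

for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:5.1f}  gelu={gelu(x):.6f}")
```

Like Swish, GELU multiplies the input by a smooth gate between 0 and 1; here the gate is the Gaussian CDF instead of the sigmoid.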

Maxout

Mathematical description

$z_{k}=\boldsymbol{w}_{k}^{\top} \boldsymbol{x}+b_{k}$

$\operatorname{maxout}(\boldsymbol{x})=\max _{k \in[1, K]}\left(z_{k}\right)$
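A Maxout unit computes $K$ affine functions of the whole input vector and returns the largest. The sketch below uses plain lists for the weights; the specific weight values are made up to show that Maxout can reproduce ReLU and the absolute-value function as special cases.

```python
def maxout(x: list, weights: list, biases: list) -> float:
    """max over k of (w_k . x + b_k), where weights holds K weight vectors."""
    zs = [sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in zip(weights, biases)]
    return max(zs)

# K = 2 pieces with weights (1) and (0): max(x, 0), i.e. ReLU on a 1-D input.
print(maxout([2.0], [[1.0], [0.0]], [0.0, 0.0]))
print(maxout([-2.0], [[1.0], [0.0]], [0.0, 0.0]))

# K = 2 pieces with weights (1, 0) and (-1, 0): |x_1| on a 2-D input.
print(maxout([-3.0, 5.0], [[1.0, 0.0], [-1.0, 0.0]], [0.0, 0.0]))
```

Unlike the other functions in this summary, the input to Maxout is the full vector $\boldsymbol{x}$ rather than a scalar pre-activation, and each unit carries $K$ sets of parameters.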

GLU

Gated Linear Unit

Mathematical description

$h_{l}(\mathbf{X})=(\mathbf{X} * \mathbf{W}+\mathbf{b}) \otimes \sigma(\mathbf{X} * \mathbf{V}+\mathbf{c})$
Here, $\mathbf{W}, \mathbf{V}, \mathbf{b}, \mathbf{c}$ are learnable parameters, $\sigma(\cdot)$ is the Sigmoid, and $\otimes$ denotes the element-wise product.
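The sketch below implements the gating for a single input vector. It treats the $*$ operator as an ordinary vector-matrix product, which is a simplifying assumption (in gated convolutional networks it is a convolution); the helper `affine` and all parameter values are illustrative.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid (plain formula; inputs here are small)."""
    return 1.0 / (1.0 + math.exp(-x))

def affine(x: list, W: list, b: list) -> list:
    """Row vector x (length n) times matrix W (n x m), plus bias b (length m)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j] for j in range(len(b))]

def glu(x: list, W: list, b: list, V: list, c: list) -> list:
    """(x W + b) multiplied element-wise by the gate sigmoid(x V + c)."""
    linear = affine(x, W, b)
    gate = [sigmoid(g) for g in affine(x, V, c)]
    return [l * g for l, g in zip(linear, gate)]

x = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the linear path passes x through
V = [[0.0, 0.0], [0.0, 0.0]]  # zero gate weights, so every gate equals sigmoid(0)
print(glu(x, W, [0.0, 0.0], V, [0.0, 0.0]))
```

The linear path carries the signal without a squashing nonlinearity, while the sigmoid path decides, per element, how much of it passes through.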

GTU

Gated Tanh Unit

Mathematical description

$h_{l}(\mathbf{X})=\tanh (\mathbf{X} * \mathbf{W}+\mathbf{b}) \otimes \sigma(\mathbf{X} * \mathbf{V}+\mathbf{c})$
Here, $\mathbf{W}, \mathbf{V}, \mathbf{b}, \mathbf{c}$ are learnable parameters, $\sigma(\cdot)$ is the Sigmoid, and $\tanh (\cdot)$ is Tanh.
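GTU differs from GLU only in applying Tanh to the first path before gating. The sketch below again treats $*$ as a vector-matrix product for a single input vector, which is a simplifying assumption; the helper `affine` and the parameter values are illustrative.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid (plain formula; inputs here are small)."""
    return 1.0 / (1.0 + math.exp(-x))

def affine(x: list, W: list, b: list) -> list:
    """Row vector x (length n) times matrix W (n x m), plus bias b (length m)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j] for j in range(len(b))]

def gtu(x: list, W: list, b: list, V: list, c: list) -> list:
    """tanh(x W + b) multiplied element-wise by the gate sigmoid(x V + c)."""
    signal = [math.tanh(s) for s in affine(x, W, b)]
    gate = [sigmoid(g) for g in affine(x, V, c)]
    return [s * g for s, g in zip(signal, gate)]

x = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights on the tanh path
V = [[0.0, 0.0], [0.0, 0.0]]  # zero gate weights, so every gate equals sigmoid(0)
print(gtu(x, W, [0.0, 0.0], V, [0.0, 0.0]))
```

Because both factors are bounded, every GTU output lies strictly between -1 and 1, whereas a GLU output is unbounded along its linear path.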
