TensorFlow provides a variety of activation functions:
1. The sigmoid function
tf.sigmoid(x, name=None) == tf.nn.sigmoid(x, name=None)
# y = 1 / (1 + exp(-x))
Computes sigmoid of x element-wise.
Specifically, y = 1 / (1 + exp(-x)).
x: A Tensor with type float, double, int32, complex64, int64, or qint32.
name: A name for the operation (optional).
y = 1 / (1 + exp(-x))
- Pros and cons of the sigmoid function:
  - Pros:
    - Maps the input to the interval (0, 1), so the output can be interpreted as a probability (e.g. logistic regression)
    - Of the common activations, it is the closest in physical terms to a biological neuron
  - Cons:
    - The vanishing-gradient problem
      - First, be clear on one point: during backpropagation the gradient contains two multiplicative factors, f′(z^l) and the error term of the layer above (which itself contains f′(z^(l+1)); z is the weighted sum of the inputs). See the backpropagation derivation.
      - Since the derivative of sigmoid, f′(z^l), lies in (0, 0.25], the unit easily falls into the saturated region where the gradient is tiny, so the weights barely change and cannot update normally (see the sketch after this list)
      - As the error propagates down toward the bottom layers, the f′(z^l) factors accumulate multiplicatively; because each lies in (0, 0.25], the gradient shrinks exponentially the further back it travels, and eventually the weights cannot update normally
    - The sigmoid output is not zero-mean: every output is positive, which increases the instability of the gradient updates
    - When the output is near saturation or changing sharply, this squashing of the output range tends to have some adverse effects
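To make the (0, 0.25] derivative range concrete, here is a minimal sketch (assuming TensorFlow 2.x eager execution; the input values are just illustrative):

import tensorflow as tf
x = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0])
with tf.GradientTape() as tape:
    tape.watch(x)              # x is a constant, so watch it explicitly
    y = tf.nn.sigmoid(x)
grad = tape.gradient(y, x)     # element-wise dy/dx = y * (1 - y)
print(y.numpy())               # all outputs lie in (0, 1)
print(grad.numpy())            # peaks at 0.25 at x = 0, near 0 in the saturated tails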
2. The tanh function
tf.tanh(x, name=None) == tf.nn.tanh(x, name=None)
# y = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
Computes hyperbolic tangent of x element-wise.
x: A Tensor with type float, double, int32, complex64, int64, or qint32.
name: A name for the operation (optional).
tanh(x) = sinh(x)/cosh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
Pros and cons of the tanh function:
  - Pros:
    - Tanh outputs are zero-centered; the input is mapped to the interval (-1, 1)
  - Cons:
    - Although the derivative f′(z^l) of tanh lies in the larger range (0, 1], it still leads to the vanishing-gradient problem! (demonstrated in the sketch below)
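The same kind of sketch for tanh (again assuming TensorFlow 2.x eager execution) shows the zero-centered outputs and a derivative 1 - tanh(x)^2 that reaches 1 at x = 0 but still vanishes in the tails:

import tensorflow as tf
x = tf.constant([-3.0, -1.0, 0.0, 1.0, 3.0])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.nn.tanh(x)
grad = tape.gradient(y, x)     # element-wise dy/dx = 1 - tanh(x)^2
print(y.numpy())               # zero-centered outputs in (-1, 1)
print(grad.numpy())            # 1.0 at x = 0, but already ~0.01 at |x| = 3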
3. The ReLU function
tf.nn.relu(features, name=None)
# y = max(features, 0)
Computes rectified linear: max(features, 0).
features: A Tensor. Must be one of the following types: float32, float64, int32, int64, uint8, int16, int8.
name: A name for the operation (optional).
ReLU(x) = max(0, x)
Pros and cons of the ReLU function:
  - Pros:
    - Converges faster than sigmoid/tanh (roughly 6x), and creates sparse representations with true zeros (which are more likely to be linearly separable)
    - Its derivative is 1 wherever the weighted sum z is greater than 0, so the error propagates well and the weights update normally
  - Cons:
    - Its derivative is 0 wherever the weighted sum z is less than 0, so the gradient there is 0 and the weights cannot update (the "dying ReLU" problem; see the sketch after this list)
    - The output is biased: the mean of the outputs is always greater than zero
    - When a large learning rate is used, it is easily affected by neurons that saturate (die and stay at zero).
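Both the sparsity and the dead-gradient behavior are easy to see in a minimal sketch (assuming TensorFlow 2.x eager execution):

import tensorflow as tf
x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.nn.relu(x)
grad = tape.gradient(y, x)
print(y.numpy())               # [0.  0.  0.  0.5 2. ] -- true zeros, sparse output
print(grad.numpy())            # [0. 0. 0. 1. 1.] -- gradient is 0 wherever z <= 0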
4. Variants of the ReLU function
- Leaky-ReLU
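A hedged sketch of the corresponding op, in the same format as the entries below (assuming TensorFlow ≥ 1.4, where tf.nn.leaky_relu was introduced; alpha is the negative-slope coefficient):
tf.nn.leaky_relu(features, alpha=0.2, name=None)
# y = max(alpha * features, features) -- the small slope alpha keeps a nonzero gradient for z < 0, avoiding dead neurons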
- ReLU6
tf.nn.relu6(features, name=None)
# y = min(max(features, 0), 6)
Computes Rectified Linear 6: min(max(features, 0), 6).
features: A Tensor with type float, double, int32, int64, uint8, int16, or int8.
name: A name for the operation (optional).
relu6(x) = min(max(x, 0), 6)
- ELU
tf.nn.elu(features, name=None)
# y = exp(features) - 1 if features < 0, features otherwise
Computes exponential linear: exp(features) - 1 if < 0, features otherwise.
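A quick side-by-side of the three variants on the same inputs (a sketch assuming TensorFlow 2.x eager execution): relu6 caps the output at 6, while elu stays smooth and nonzero below zero:

import tensorflow as tf
x = tf.constant([-2.0, -1.0, 0.0, 3.0, 8.0])
print(tf.nn.relu6(x).numpy())                  # [0. 0. 0. 3. 6.] -- clipped at 6
print(tf.nn.elu(x).numpy())                    # [-0.8647 -0.6321 0. 3. 8.] -- exp(x) - 1 for x < 0
print(tf.nn.leaky_relu(x, alpha=0.2).numpy())  # [-0.4 -0.2 0. 3. 8.]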
5. The softplus function
tf.nn.softplus(features, name=None)
# y = log(exp(features) + 1)
Computes softplus: log(exp(features) + 1).
features: A Tensor. Must be one of the following types: float32, float64, int32, int64, uint8, int16, int8.
name: A name for the operation (optional).
softplus(x) = log(exp(x) + 1)
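Softplus is a smooth approximation of ReLU, and its derivative is exactly the sigmoid; a minimal check (assuming TensorFlow 2.x eager execution):

import tensorflow as tf
x = tf.constant([-3.0, 0.0, 3.0])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.nn.softplus(x)      # log(exp(x) + 1)
grad = tape.gradient(y, x)
print(y.numpy())               # ~[0.0486 0.6931 3.0486] -- always positive, ReLU-like for large x
print(grad.numpy())            # identical to tf.nn.sigmoid(x).numpy()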
6. The softsign function
tf.nn.softsign(features, name=None)
# y = features / (abs(features) + 1)
Computes softsign: features / (abs(features) + 1).
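Softsign squashes the input into (-1, 1) like tanh, but it approaches its asymptotes polynomially rather than exponentially, so it saturates more gently; a small comparison sketch (assuming TensorFlow 2.x eager execution):

import tensorflow as tf
x = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0])
print(tf.nn.softsign(x).numpy())  # [-0.9091 -0.5 0. 0.5 0.9091]
print(tf.nn.tanh(x).numpy())      # tanh is already ~±1 at |x| = 10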