TensorFlow (20): Activation Functions and Their Gradients
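1. sigmoid / logistic
"""
Fatal flaw of this function: where it saturates (large |x|) the gradient is close to zero,
so the parameters can go a long time without a meaningful update (vanishing gradient).
"""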
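import tensorflow as tf
import matplotlib.pyplot as plt

# A tf.Variable is watched by the tape automatically
x = tf.Variable(tf.linspace(-10., 10., 100))
with tf.GradientTape() as tape:
    y = tf.nn.sigmoid(x)
# y is a vector, so this is d(sum(y))/dx; each y[i] depends only on x[i],
# so the result is the element-wise derivative
[dy_dx] = tape.gradient(y, [x])

plt.figure()
plt.plot(x.numpy(), y.numpy())
plt.title("sigmoid")
plt.figure()
plt.plot(x.numpy(), dy_dx.numpy())
plt.title("derivative for sigmoid")
plt.show()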
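The derivative plotted above has a closed form, sigmoid(x) * (1 - sigmoid(x)), which makes the saturation easy to check by hand. The following is a minimal framework-free sketch using only the standard library; the helper names `sigmoid` and `sigmoid_grad` are illustrative and not part of TensorFlow.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Closed-form derivative: sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# The gradient peaks at x = 0 and shrinks quickly as |x| grows.
for x in (0.0, 3.0, -3.0, 10.0):
    print(f"x = {x:6.1f}  d(sigmoid)/dx = {sigmoid_grad(x):.6f}")
```

The peak value is 0.25 at x = 0; at |x| = 3 the gradient is already below 0.05, and at x = 10 it is below 1e-4, which is the saturation visible at both ends of the derivative plot.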
"""
梯度适合[-3,3]
"""
2. Tanh (widely used in RNNs)
x = tf.linspace(-10., 10., 100)
with tf.GradientTape() as tape:
    # x is a plain tensor, not a tf.Variable, so it must be watched explicitly
    tape.watch([x])
    y = tf.nn.tanh(x)
[dy_dx] = tape.gradient(y, [x])

fig = plt.figure(figsize=[20, 24])
ax1 = fig.add_subplot(211)
ax1.plot(x.numpy(), y.numpy())
ax1.set_title("tanh")
ax2 = fig.add_subplot(212)
ax2.plot(x.numpy(), dy_dx.numpy())
ax2.set_title("derivative for tanh")
plt.show()
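As a sanity check on the derivative curve plotted above, the closed form 1 - tanh(x)^2 can be compared with a finite-difference estimate. This is only a sketch in plain Python; `tanh_grad` and `numeric_grad` are hypothetical helper names, and the step size `h` is an arbitrary choice.

```python
import math

def tanh_grad(x: float) -> float:
    """Closed-form derivative of tanh: 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t

def numeric_grad(f, x: float, h: float = 1e-5) -> float:
    """Central finite difference, used only as a cross-check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (0.0, 1.0, 3.0, 10.0):
    analytic = tanh_grad(x)
    numeric = numeric_grad(math.tanh, x)
    print(f"x = {x:5.1f}  analytic = {analytic:.6f}  numeric = {numeric:.6f}")
```

The derivative reaches 1 at x = 0, four times the sigmoid's peak of 0.25, but tanh still saturates: far from the origin its gradient also falls toward zero.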
3. ReLU
"""
ReLU: rectified linear unit
tf.nn.relu()
tf.nn.leaky_relu(): for x < 0 the output is y = kx, where k is a small positive slope
"""
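x = tf.linspace(-10., 10., 100)
# persistent=True allows tape.gradient() to be called more than once
with tf.GradientTape(persistent=True) as tape:
    tape.watch([x])
    y1 = tf.nn.relu(x)
    y2 = tf.nn.leaky_relu(x)  # default alpha (the slope k for x < 0) is 0.2
[dy1_dx] = tape.gradient(y1, [x])
[dy2_dx] = tape.gradient(y2, [x])
del tape  # release the resources held by the persistent tape

# Top row: the functions; bottom row: their derivatives
fig = plt.figure(figsize=[40, 24])
ax1 = fig.add_subplot(221)
ax1.plot(x.numpy(), y1.numpy())
ax1.set_title("relu")
ax2 = fig.add_subplot(222)
ax2.plot(x.numpy(), y2.numpy())
ax2.set_title("leaky_relu")
ax3 = fig.add_subplot(223)
ax3.plot(x.numpy(), dy1_dx.numpy())
ax3.set_title("derivative for relu")
ax4 = fig.add_subplot(224)
ax4.plot(x.numpy(), dy2_dx.numpy())
ax4.set_title("derivative for leaky_relu")
plt.show()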
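The piecewise definitions behind these four plots are short enough to write out directly. The sketch below is plain Python rather than TensorFlow; the function names are illustrative, `k` plays the role of the `alpha` argument of `tf.nn.leaky_relu` (whose default is 0.2), and treating the gradient at exactly x = 0 as the negative-side value is an assumption made for this sketch.

```python
def relu(x: float) -> float:
    """max(0, x)."""
    return x if x > 0 else 0.0

def relu_grad(x: float) -> float:
    """1 for x > 0, otherwise 0 (assumed convention at x = 0)."""
    return 1.0 if x > 0 else 0.0

def leaky_relu(x: float, k: float = 0.2) -> float:
    """x for x > 0, otherwise k * x with a small slope k."""
    return x if x > 0 else k * x

def leaky_relu_grad(x: float, k: float = 0.2) -> float:
    """1 for x > 0, otherwise k (assumed convention at x = 0)."""
    return 1.0 if x > 0 else k

for x in (-10.0, -1.0, 1.0, 10.0):
    print(
        f"x = {x:6.1f}  relu' = {relu_grad(x):.1f}  "
        f"leaky_relu' = {leaky_relu_grad(x):.1f}"
    )
```

Unlike sigmoid and tanh, the gradient for x > 0 stays at 1 no matter how large x is. For x < 0 the ReLU gradient is exactly 0, while leaky ReLU keeps a small nonzero slope k, so some gradient still flows for negative inputs.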
These are study notes written while following the course “深度学习与TensorFlow 2入门实战” taught by 龙龙老师.
by CyrusMay 2022 04 17