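Background:
We want the gradient of the sigmoid function $\sigma(x)$, which is defined as
$$\mathrm{sigmoid}(x) = \sigma(x) = \frac{1}{1 + e^{-x}}$$
In Python it can be implemented with the numpy module: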
# GRADED FUNCTION: sigmoid
import numpy as np
# this means you can access numpy functions by writing np.function() instead of numpy.function()

def sigmoid(x):
    """
    Compute the sigmoid of x

    Arguments:
    x -- A scalar or numpy array of any size

    Return:
    s -- sigmoid(x)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    s = 1 / (1 + np.exp(-x))
    ### END CODE HERE ###
    return s
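As a quick illustration of the "scalar or numpy array" behaviour described in the docstring, here is a minimal sketch. The name `sigmoid_demo` is a hypothetical stand-in (not from the original) that repeats the same formula so the snippet runs on its own.

```python
import numpy as np

def sigmoid_demo(x):
    # Hypothetical stand-in using the same formula as sigmoid above
    return 1.0 / (1.0 + np.exp(-x))

# Scalar input: sigmoid(0) is exactly 0.5
scalar_out = sigmoid_demo(0.0)

# Array input: np.exp works element-wise, so the output has the input's shape
array_out = sigmoid_demo(np.array([1.0, 2.0, 3.0]))

print(scalar_out)
print(array_out)  # approximately 0.731, 0.881, 0.953
```

Because `np.exp` is applied element-wise, no explicit loop is needed for array inputs.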
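Next, the corresponding derivative:
$$\mathrm{sigmoid\_derivative}(x) = \sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr) \tag{1}$$
How is this derived? Start from
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
Let the temporary variable $t = 1 + e^{-x}$, so that $\sigma = t^{-1}$. By the chain rule for composite functions,
$$
\begin{aligned}
\sigma'(x) &= \frac{d\,(t^{-1})}{dt} \cdot \frac{dt}{dx} = -t^{-2} \cdot \left(-e^{-x}\right) \\
&= \frac{1}{(1 + e^{-x})^{2}} \cdot e^{-x} \\
&= \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} \\
&= \frac{1}{1 + e^{-x}} \cdot \frac{1 + e^{-x} - 1}{1 + e^{-x}} \\
&= \frac{1}{1 + e^{-x}} \cdot \left(1 - \frac{1}{1 + e^{-x}}\right) \\
&= \sigma(x) \cdot \bigl(1 - \sigma(x)\bigr)
\end{aligned}
$$
This proves (1).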
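The identity can also be sanity-checked numerically. The following is a minimal sketch, assuming a central finite difference with step `h = 1e-5`. The helper names `sigmoid_ref` and `central_difference` and the test points are illustrative choices, not part of the original.

```python
import numpy as np

def sigmoid_ref(x):
    # Same formula as in the text: 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def central_difference(f, x, h=1e-5):
    # Approximate f'(x) by (f(x + h) - f(x - h)) / (2h)
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Arbitrary test points (an assumption for illustration)
x = np.array([-2.0, 0.0, 1.0, 3.0])

# Closed form from the derivation: sigma(x) * (1 - sigma(x))
analytic = sigmoid_ref(x) * (1.0 - sigmoid_ref(x))

# Numerical approximation of the same derivative
numeric = central_difference(sigmoid_ref, x)

max_abs_error = np.max(np.abs(analytic - numeric))
print(max_abs_error)
```

If the derivation is right, the two results should agree up to the small truncation and rounding error of the finite-difference approximation.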
Python implementation:
def sigmoid_derivative(x):
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.

    Arguments:
    x -- A scalar or numpy array

    Return:
    ds -- Your computed gradient.
    """
    ### START CODE HERE ### (≈ 2 lines of code)
    s = 1 / (1 + 1 / np.exp(x))  # sigmoid(x), since 1 / np.exp(x) == np.exp(-x)
    ds = s * (1 - s)
    ### END CODE HERE ###
    return ds

x = np.array([1, 2, 3])
print("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))
Output:
sigmoid_derivative(x) = [ 0.19661193 0.10499359 0.04517666]