Gradient Descent: The Code
Previously, we saw that a single weight update is calculated as:
$$\Delta w_i = \eta \, \delta \, x_i$$
where the error term $\delta$ is
$$\delta = (y - \hat{y}) \, f'(h) = (y - \hat{y}) \, f'\!\left(\sum_i w_i x_i\right)$$
Remember that in the formula above, $(y - \hat{y})$ is the output error, and $f'(h)$ is the derivative of the activation function $f(h)$. We call this derivative the output gradient.
Now, assuming there is only one output unit, let's write this out in code. We'll again use the sigmoid as the activation function $f(h)$.
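Since $f(h)$ is the sigmoid here, its derivative has a convenient closed form, which is exactly what the sigmoid_prime function below computes:

$$f'(h) = f(h)\,\bigl(1 - f(h)\bigr)$$

This follows directly from differentiating $f(h) = 1 / (1 + e^{-h})$.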
import numpy as np
def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1 / (1 + np.exp(-x))

def sigmoid_prime(x):
    """
    Derivative of the sigmoid function
    """
    return sigmoid(x) * (1 - sigmoid(x))
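# Quick sanity check (an illustrative addition, not part of the original
# exercise): sigmoid(0) = 0.5, so its derivative there is 0.5 * 0.5 = 0.25.
assert np.isclose(sigmoid_prime(0), 0.25)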
learnrate = 0.5
x = np.array([1, 2, 3, 4])
y = np.array(0.5)
# Initial weights
w = np.array([0.5, -0.5, 0.3, 0.1])
### Calculate one gradient descent step for each weight
### Note: Some steps have been consolidated, so there are
### fewer variable names than in the above sample code
# Calculate the node's linear combination of inputs and weights
h = np.dot(x, w)
# Calculate output of neural network
nn_output = sigmoid(h)
# Calculate error of neural network
error = y - nn_output
# Calculate the error term
# Remember, this requires the output gradient, which we haven't
# specifically added a variable for.
error_term = error * sigmoid_prime(h)
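# Note: since sigmoid'(h) = sigmoid(h) * (1 - sigmoid(h)), the error term
# could equivalently be computed as error * nn_output * (1 - nn_output),
# reusing the output computed above.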
# Calculate change in weights
del_w = learnrate * error_term * x
print('Neural Network output:')
print(nn_output)
print('Amount of Error:')
print(error)
print('Change in Weights:')
print(del_w)
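The code above takes a single gradient descent step. As a rough sketch of how the same update would be applied repeatedly (the loop below and its iteration count of 100 are illustrative choices, not part of the original exercise), we can keep applying the weight change, reusing x, y, w, and learnrate from above:

# Illustrative sketch: repeat the single gradient descent step above.
# The iteration count (100) is an arbitrary choice for demonstration.
for _ in range(100):
    h = np.dot(x, w)
    nn_output = sigmoid(h)
    error = y - nn_output
    error_term = error * sigmoid_prime(h)
    w = w + learnrate * error_term * x

print('Output after 100 steps:')
print(sigmoid(np.dot(x, w)))

With repeated updates, the error shrinks and the output should move toward the target of 0.5.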