Gradient Descent
Gradient descent update rule: x' = x - lr · dy/dx, i.e., subtract from the current x the derivative of y at x. The derivative is scaled by a learning rate so that each adjustment is not too large; different learning rates behave differently, and the goal is for x to approach the optimal solution gradually.
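The update rule above can be sketched on a one-variable function; the choice f(x) = (x - 3)^2 here is purely illustrative:

```python
# Minimal sketch of the update x' = x - lr * f'(x),
# minimizing f(x) = (x - 3)^2, whose derivative is f'(x) = 2(x - 3).
def gradient_descent_1d(x0, lr, steps):
    x = x0
    for _ in range(steps):
        grad = 2 * (x - 3)   # derivative of f at the current x
        x = x - lr * grad    # scaled step toward the minimum
    return x

print(gradient_descent_1d(0.0, lr=0.1, steps=100))  # approaches 3
```

With lr = 0.1 the gap to the minimum shrinks by a factor of 0.8 per step; a learning rate that is too large would overshoot instead of converging.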
1. How do we solve a simple linear equation?
- y = w * x + b: with exact data this can be solved precisely by elimination, as taught in middle school (a closed-form solution).
- In real problems, however, we usually settle for an approximate solution, because real-world data always carries some noise or measurement error.
- A slightly more realistic model: y = w * x + b + θ, adding random Gaussian noise drawn from θ ~ N(0.01, 1).
- Goal: make y approximately equal to w*x + b.
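The contrast between exact and approximate solving can be illustrated by generating noisy data from hypothetical true parameters (1.5 and 0.5 below are made-up values) and solving the least-squares problem in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
# Hypothetical true parameters, plus Gaussian observation noise
y = 1.5 * x + 0.5 + rng.normal(0, 0.1, 100)

# Closed-form least-squares fit of y = w*x + b
A = np.stack([x, np.ones_like(x)], axis=1)
w, b = np.linalg.lstsq(A, y, rcond=None)[0]
print(w, b)  # close to 1.5 and 0.5, but not exact, because of the noise
```

Even the best possible fit recovers the true parameters only approximately; this is why the text speaks of approximate rather than exact solutions.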
Loss function: to evaluate how well the model fits, we measure the degree of fit with a loss function. Minimizing the loss corresponds to the best fit, and the parameters at the minimum are the optimal parameters. In linear regression, the loss is usually the squared difference between the sample outputs and the hypothesis function.
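The squared-error loss described above (averaged over samples, i.e. mean squared error) can be written as a small helper; the arrays here are toy values:

```python
import numpy as np

def mse(w, b, x, y):
    # Mean squared error between predictions w*x + b and targets y
    return np.mean((w * x + b - y) ** 2)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(mse(2.0, 0.0, x, y))  # 0.0 -- a perfect fit
print(mse(1.0, 0.0, x, y))  # (1 + 4 + 9) / 3, a worse fit
```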
2. Let's see an example
- y = 1.477 * x + 0.089 + θ
- In practice we do not know the model parameters; the observations appear to follow a linear relationship, and what we need to solve for are w and b.
- We do so by minimizing the loss function.
Code implementation:
Training data download: https://pan.baidu.com/s/1DtvXnIyfLQLiNZobHky_pQ (extraction code: kb6i)
- Gradient with respect to w: ∂loss/∂w = 2(w*x + b - y) · x
- Gradient with respect to b: ∂loss/∂b = 2(w*x + b - y)
(These are per-sample gradients; the code below averages them over all samples.)
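The two formulas can be verified numerically with central finite differences; the sample values below are arbitrary:

```python
# Self-check of the analytic gradients for loss = (w*x + b - y)^2
def loss(w, b, x, y):
    return (w * x + b - y) ** 2

def analytic_grads(w, b, x, y):
    dw = 2 * (w * x + b - y) * x   # d(loss)/dw
    db = 2 * (w * x + b - y)       # d(loss)/db
    return dw, db

w, b, x, y = 0.5, 0.2, 3.0, 2.0   # arbitrary test point
eps = 1e-6
dw_num = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
db_num = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
dw, db = analytic_grads(w, b, x, y)
print(dw - dw_num, db - db_num)  # both close to 0
```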
import numpy as np

# Compute the loss: the mean of the squared differences
# between predictions and observed values.
# loss = mean((w*x + b - y) ^ 2)
def compute_error_for_line_given_points(b, w, points):
    totalError = 0
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        totalError += (y - (w * x + b)) ** 2
    return totalError / float(len(points))
# Compute the gradient information (w, b) and take one update step
def step_gradient(b_current, w_current, points, learningRate):
    b_gradient = 0
    w_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        b_gradient += -(2/N) * (y - ((w_current * x) + b_current))      # gradient w.r.t. b
        w_gradient += -(2/N) * x * (y - ((w_current * x) + b_current))  # gradient w.r.t. w
    new_b = b_current - (learningRate * b_gradient)
    new_w = w_current - (learningRate * w_gradient)
    return [new_b, new_w]
# Iterate to optimize
def gradient_descent_runner(points, starting_b, starting_w, learningRate, num_iterations):
    b = starting_b
    w = starting_w
    for i in range(num_iterations):  # num_iterations: number of update steps
        b, w = step_gradient(b, w, np.array(points), learningRate)
    return [b, w]
def run():
    points = np.genfromtxt("data.csv", delimiter=",")
    learningRate = 0.0001
    initial_b = 0  # initial y-intercept guess
    initial_w = 0  # initial slope guess
    num_iterations = 1000
    print("Starting gradient descent at b = {0}, w = {1}, error = {2}"
          .format(initial_b, initial_w,
                  compute_error_for_line_given_points(initial_b, initial_w, points)))
    print("Running.....")
    [b, w] = gradient_descent_runner(points, initial_b, initial_w, learningRate, num_iterations)
    print("After {0} iterations b = {1}, w = {2}, error = {3}"
          .format(num_iterations, b, w,
                  compute_error_for_line_given_points(b, w, points)))

if __name__ == '__main__':
    run()
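If data.csv is not at hand, the same pipeline can be sanity-checked on synthetic data drawn from the example model y = 1.477x + 0.089 plus noise. This standalone sketch uses a vectorized form of the same averaged-gradient update:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 200)
y = 1.477 * x + 0.089 + rng.normal(0, 0.1, 200)  # synthetic observations

w, b, lr = 0.0, 0.0, 0.001
for _ in range(20000):
    err = w * x + b - y
    w -= lr * np.mean(2 * err * x)   # dLoss/dw, averaged over samples
    b -= lr * np.mean(2 * err)       # dLoss/db, averaged over samples
print(w, b)  # should approach 1.477 and 0.089
```

Recovering parameters close to the ones used to generate the data is a quick way to confirm the gradient formulas and the update loop are correct.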