Given a set of sample points:
x = [0.95, 3, 4, 5.07, 6.03, 8.21, 8.85, 12.02, 15]
y = [5.1, 8.7, 11.5, 13, 15.3, 18, 21, 26.87, 32.5]
Find the values of w and b in y = wx + b.
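For a single input variable, this least-squares problem also has a closed-form solution, which is handy as a reference value when checking the two methods in this post. The following is only a minimal sketch in plain Python; the function name fit_line is illustrative and does not come from any library.

```python
# Closed-form least squares for y = w*x + b.
# fit_line is a hypothetical helper name used only for this sketch.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    w = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - w * mean_x
    return w, b


x = [0.95, 3, 4, 5.07, 6.03, 8.21, 8.85, 12.02, 15]
y = [5.1, 8.7, 11.5, 13, 15.3, 18, 21, 26.87, 32.5]
w, b = fit_line(x, y)
print("y={}*x+{}".format(w, b))
```

Both methods in this post should land close to the values this prints, up to floating-point and convergence tolerance.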
-----------------------------------------------------------------------------------------------------
Method 1: call Python's sklearn library. The actual logic is not visible, because everything is already encapsulated.
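from sklearn import linear_model
import time
start = time.time()
# Ordinary least squares with an intercept term
reg = linear_model.LinearRegression(fit_intercept=True, copy_X=False)
# sklearn expects X as a 2-D structure: one row per sample, one column per feature
x = [[0.95], [3], [4], [5.07], [6.03], [8.21], [8.85], [12.02], [15]]
y = [5.1, 8.7, 11.5, 13, 15.3, 18, 21, 26.87, 32.5]
reg.fit(x, y)
w = reg.coef_       # array holding one slope per feature
b = reg.intercept_  # intercept
print(str(w) + "," + str(b))
end = time.time()
print(end - start, 'seconds')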
The resulting values of w and b are as follows:
-----------------------------------------------------------------------------------------------------
Method 2: batch gradient descent, implemented in Python
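The update rule behind the code in this method can be written out as follows. This is a sketch under one assumption: the post does not state the loss function, so the mean squared error with a factor of 1/2 is assumed here because its gradient matches the derivative expressions in the code. Here n is the number of samples and \eta is the learning rate.

```latex
J(w, b) = \frac{1}{2n} \sum_{i=1}^{n} (w x_i + b - y_i)^2

\frac{\partial J}{\partial w} = \frac{1}{n} \sum_{i=1}^{n} (w x_i + b - y_i)\, x_i
\qquad
\frac{\partial J}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} (w x_i + b - y_i)

w \leftarrow w - \eta \frac{\partial J}{\partial w}
\qquad
b \leftarrow b - \eta \frac{\partial J}{\partial b}
```

Note that the code stores the negatives of these partial derivatives in derivative_w and derivative_b, which is why it adds them to w and b instead of subtracting.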
import matplotlib.pyplot as plt
import matplotlib
import time
start = time.time()
x = [0.95, 3, 4, 5.07, 6.03, 8.21, 8.85, 12.02, 15]
y = [5.1, 8.7, 11.5, 13, 15.3, 18, 21, 26.87, 32.5]
# Parameter definitions
w = 1.5
b = 1.5
learningRate = 0.01
length = len(x)
# Batch gradient descent
for num in range(10000):
    derivative_b = 0  # negative partial derivative of the loss with respect to b
    derivative_w = 0  # negative partial derivative of the loss with respect to w
    # Accumulate the derivatives over all samples
    for i in range(length):
        derivative_w += -(w * x[i] + b - y[i]) * x[i] / length
        derivative_b += -(w * x[i] + b - y[i]) / length
    # delta w, delta b = learning rate * partial derivative
    w = w + learningRate * derivative_w
    b = b + learningRate * derivative_b
    # Stop once both derivatives are numerically zero.
    # Compare magnitudes, because either derivative can be negative.
    if abs(derivative_w) <= 1e-13 and abs(derivative_b) <= 1e-13:
        break
print("y={}*x+{}".format(w, b))
end = time.time()
print(end - start, 'seconds')
# SimHei is only needed when the labels contain Chinese characters
matplotlib.rcParams['font.sans-serif'] = ['SimHei']
# Use the 'o' marker only; the color is set once through the color argument
plt.plot(x, y, 'o', label='sample data', color='black')
plt.plot(x, [b + w * xi for xi in x], label='linear regression line', color='red')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
plt.xlabel('top: w, bottom: b')
plt.show()
The output is as follows:
----------------
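To see how w and b evolve during training, the batch gradient descent loop from Method 2 could be wrapped in a function that records both parameters at every iteration. The following is only a sketch: the function name batch_gradient_descent, its parameters, and the history lists are hypothetical names introduced for illustration, and the loss is assumed to be the mean squared error with a factor of 1/2.

```python
# Batch gradient descent for y = w*x + b, recording the parameter history.
# All names here are illustrative and not part of the original script.
def batch_gradient_descent(xs, ys, w=1.5, b=1.5, learning_rate=0.01,
                           max_iters=10000, tol=1e-13):
    n = len(xs)
    w_history, b_history = [w], [b]
    for _ in range(max_iters):
        # Gradient of the assumed loss, averaged over the whole batch.
        grad_w = sum((w * xi + b - yi) * xi for xi, yi in zip(xs, ys)) / n
        grad_b = sum((w * xi + b - yi) for xi, yi in zip(xs, ys)) / n
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        w_history.append(w)
        b_history.append(b)
        # Stop when both gradient components are numerically zero.
        if abs(grad_w) <= tol and abs(grad_b) <= tol:
            break
    return w, b, w_history, b_history


x = [0.95, 3, 4, 5.07, 6.03, 8.21, 8.85, 12.02, 15]
y = [5.1, 8.7, 11.5, 13, 15.3, 18, 21, 26.87, 32.5]
w, b, w_history, b_history = batch_gradient_descent(x, y)
print("y={}*x+{}".format(w, b))
print("iterations:", len(w_history) - 1)
```

The two history lists can then be passed to plt.plot to draw w and b against the iteration number.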