Linear regression test
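import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load the comma-separated data: column 0 is the feature, column 1 is the target.
data = np.loadtxt(r'D:\data\machine-learning-ex1\machine-learning-ex1\ex1\ex1data1.txt', delimiter=',')
data_x = data[:, [0]]  # indexing with [0] keeps the 2-D shape (n_samples, 1)
data_y = data[:, [1]]
# Hold out 30% of the rows for testing; the split is shuffled randomly on every run.
x_train, x_test, y_train, y_test = train_test_split(data_x, data_y, test_size=0.3)
model = LinearRegression()
model.fit(x_train, y_train)
# score() returns the coefficient of determination (R^2) on the test set.
print(model.score(x_test, y_test))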
The test data is the same as before.
C:\Users\G3\Anaconda3\python.exe D:/test/tool/tool.py
0.8616960429597703
Process finished with exit code 0
C:\Users\G3\Anaconda3\python.exe D:/test/tool/tool.py
0.5450383718610818
Process finished with exit code 0
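The scores from the two runs differ considerably.
The reason is that the data is split into training and test sets at random, so each run trains and evaluates on different rows.
Passing a seed when splitting (the random_state parameter of train_test_split) makes the training result reproducible.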
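The seeding idea can be sketched as follows. This is only an illustration: the data is synthetic (a stand-in for ex1data1.txt, which is not reproduced here), and the helper name split_and_score and the seed values are made up for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data file (assumption: one feature, one target).
rng = np.random.default_rng(0)
data_x = rng.uniform(5, 25, size=(97, 1))
data_y = 1.2 * data_x - 4 + rng.normal(0, 3, size=(97, 1))


def split_and_score(seed):
    """Split with a fixed seed, fit a linear model, return R^2 on the test set."""
    # random_state fixes the shuffle, so the same seed always yields the same split.
    x_train, x_test, y_train, y_test = train_test_split(
        data_x, data_y, test_size=0.3, random_state=seed)
    model = LinearRegression()
    model.fit(x_train, y_train)
    return model.score(x_test, y_test)


score_a = split_and_score(42)
score_b = split_and_score(42)  # same seed, same split, same score
score_c = split_and_score(7)   # a different seed generally gives a different score
print(score_a, score_b, score_c)
```

Fixing the seed makes a run repeatable, but it does not make the score more trustworthy: it only pins down one particular split.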
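The result can also be optimized by running stochastic gradient descent several times. (Where the loss has local optima, this also helps with that problem to some extent.) Note that LinearRegression fits by least squares rather than by gradient descent, and the squared-error loss of linear regression is convex, so it has no local optima other than the global one; in scikit-learn, stochastic gradient descent for this model is available as SGDRegressor.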
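A minimal sketch of repeated stochastic gradient descent runs is shown here. It swaps in scikit-learn's SGDRegressor, which is not used in the original script, and again relies on synthetic data; the number of runs, the seeds, and the scaling step are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data file (assumption: one feature, one target).
rng = np.random.default_rng(0)
data_x = rng.uniform(5, 25, size=(97, 1))
data_y = 1.2 * data_x[:, 0] - 4 + rng.normal(0, 3, size=97)  # SGDRegressor expects 1-D y

# Keep the split fixed so that only the optimizer's randomness varies between runs.
x_train, x_test, y_train, y_test = train_test_split(
    data_x, data_y, test_size=0.3, random_state=0)

scores = []
for seed in range(5):
    # Standardizing the feature helps SGD converge; each run shuffles differently.
    model = make_pipeline(
        StandardScaler(),
        SGDRegressor(max_iter=1000, tol=1e-3, random_state=seed))
    model.fit(x_train, y_train)
    scores.append(model.score(x_test, y_test))

print(scores)
print(np.mean(scores))
```

Because the objective is convex, the runs should land close to one another; the spread between them reflects optimizer noise rather than distinct local optima.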