Linear Regression
Linear regression:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

lr = LinearRegression(fit_intercept=True)
lr.fit(x, y)
p = lr.predict(x)            # predict on the training set
e = p - y                    # residuals
total_error = np.sum(e * e)  # sum of squared errors
rmse_train = np.sqrt(total_error / len(p))
```
Cross-validation:

```python
import numpy as np
from sklearn.model_selection import KFold  # sklearn.cross_validation was removed; KFold now lives here

kf = KFold(n_splits=10)
err = 0
for train, test in kf.split(x):
    lr.fit(x[train], y[train])
    p = lr.predict(x[test])
    e = p - y[test]
    err += np.sum(e * e)
rmse_10cv = np.sqrt(err / len(x))
```
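The fold loop can also be written with sklearn's `cross_val_score` helper. A minimal sketch on synthetic data (the dataset and coefficients below are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# made-up regression data: 100 samples, 3 features, small Gaussian noise
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))
y = x @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

lr = LinearRegression()
# scoring="neg_mean_squared_error" returns the negative MSE of each fold
scores = cross_val_score(lr, x, y, cv=KFold(n_splits=10),
                         scoring="neg_mean_squared_error")
rmse_10cv = np.sqrt(-scores.mean())
print(rmse_10cv)
```

With equal-sized folds, averaging the per-fold MSEs and then taking the square root gives the same number as pooling the squared errors as in the explicit loop.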
Ordinary least squares trains very quickly, which makes it well suited for:

- quickly building a small model
- learning curves, error analysis, and ceiling analysis
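The learning-curve point can be illustrated with sklearn's `learning_curve` helper; a sketch on made-up synthetic data (dataset and sizes are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# made-up regression data: 200 samples, 3 features, small Gaussian noise
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = x @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), x, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5),
    scoring="neg_mean_squared_error")
# per-size validation MSE, averaged over the 5 folds;
# it should shrink as the training size grows
val_mse = -val_scores.mean(axis=1)
print(sizes, val_mse)
```

Because OLS fits are cheap, recomputing the model at many training sizes like this is practical even for quick exploratory analysis.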
Taking squared error as the loss function, the optimization objective is:

$$\min_w \sum_{i=1}^{m} (y_i - w^T x_i)^2$$
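This objective has the well-known closed-form (normal-equation) solution $w = (X^T X)^{-1} X^T y$. A sketch checking it against sklearn on made-up data (`fit_intercept=False` so the coefficient vectors are directly comparable):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# made-up regression data: 50 samples, 3 features, tiny noise
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.01, size=50)

# normal equation w = (X^T X)^{-1} X^T y; lstsq solves the same
# least-squares problem but is numerically safer than an explicit inverse
w, *_ = np.linalg.lstsq(X, y, rcond=None)

lr = LinearRegression(fit_intercept=False).fit(X, y)
print(w, lr.coef_)
```

Both routes minimize the same squared-error objective, so the coefficient vectors agree up to floating-point tolerance.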