Linear regression:
1. Model assumptions
A linear model and a linear relationship are not the same thing: a linear relationship is always a linear model, but a linear model is not necessarily a linear relationship, because a model only needs to be linear in its weights, not in its input features.
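For instance, y = w1·x + w2·x² + b traces a curve in x (not a linear relationship), yet it is still a linear model because it is linear in the weights. A minimal sketch with made-up coefficients:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# y = 2x + 3x^2 + 1: nonlinear in x, but linear in the weights (2, 3)
x = np.linspace(-3, 3, 50).reshape(-1, 1)
y = 2 * x.ravel() + 3 * x.ravel() ** 2 + 1

# Expanding x into [x, x^2] turns the curve fit into ordinary linear regression
x_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(x_poly, y)
print(model.coef_, model.intercept_)  # recovers [2, 3] and 1
```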
2. Optimization algorithms
Normal equation
The normal equation can be compared to a genius: it computes all the weights and the bias in a single closed-form step.
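Concretely, the closed-form step is w = (XᵀX)⁻¹Xᵀy. A minimal NumPy sketch on synthetic data (the coefficients here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 4.0  # bias of 4.0, no noise

# Append a column of ones so the bias is learned as one more weight
Xb = np.hstack([X, np.ones((X.shape[0], 1))])

# Normal equation: solve (X^T X) w = X^T y in a single step
w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
print(w)  # recovers [1.5, -2.0, 0.5, 4.0]
```

Using `np.linalg.solve` rather than explicitly inverting XᵀX is numerically safer, but the cost is still roughly cubic in the number of features.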
Gradient descent
Gradient descent can be compared to a diligent, hard-working ordinary person: it reaches the solution through repeated iteration and trial and error.
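The iterate-and-correct loop can be sketched in a few lines of NumPy (learning rate and iteration count below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -1.0]) + 2.0  # true weights [3, -1], bias 2

w = np.zeros(2)
b = 0.0
lr = 0.1  # learning rate: how big a step each update takes

for _ in range(500):  # many small steps toward the minimum of the MSE loss
    err = X @ w + b - y
    # Gradients of the MSE loss with respect to w and b
    grad_w = 2 * X.T @ err / len(y)
    grad_b = 2 * err.mean()
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # approaches [3.0, -1.0] and 2.0
```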
3. sklearn implementation
LinearRegression
LinearRegression uses the normal equation. Solving it is expensive (roughly cubic in the number of features), so it is generally avoided on large datasets.
SGDRegressor
SGDRegressor uses (stochastic) gradient descent. When the dataset exceeds roughly 1000K (one million) samples, SGDRegressor is the recommended choice. Its tunable hyperparameters include the learning-rate schedule (learning_rate), the initial step size (eta0), and the maximum number of iterations (max_iter), so we can tune them with grid search plus cross-validation.
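A sketch of that tuning, assuming synthetic data from `make_regression` in place of a real dataset, and a small illustrative grid over eta0 and max_iter:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler

# Synthetic regression data stands in for a real dataset here
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X = StandardScaler().fit_transform(X)  # SGD is sensitive to feature scale

# Grid over the tunable quantities mentioned above
param_grid = {
    'eta0': [0.001, 0.01, 0.1],
    'max_iter': [1000, 5000],
}
search = GridSearchCV(
    SGDRegressor(learning_rate='constant', random_state=0),
    param_grid, cv=3, scoring='neg_mean_squared_error',
)
search.fit(X, y)
print(search.best_params_)
```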
4. Model evaluation: use MSE (mean squared error)
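MSE is the average of the squared prediction errors, MSE = (1/m) Σᵢ (ŷᵢ − yᵢ)². A quick check with made-up values that the manual formula matches sklearn's `mean_squared_error`:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.5, 5.0, 4.0])

# Mean of the squared errors: (0.25 + 0 + 2.25) / 3
mse_manual = np.mean((y_pred - y_true) ** 2)
print(mse_manual)  # 0.8333...
print(mean_squared_error(y_true, y_pred))  # same value
```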
Predicting Boston housing prices with linear regression
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
# 1) Load the data (note: load_boston was removed in scikit-learn 1.2)
data = load_boston()
# 2) Split into training and test sets
x_train, x_test, y_train, y_test = train_test_split(data.data, data.target, random_state=22)
# 3) Feature engineering: standardize the features
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
# 4) LinearRegression model (normal equation)
estimator = LinearRegression()
estimator.fit(x_train, y_train)
print('Normal equation: weights\n', estimator.coef_)
print('Normal equation: intercept\n', estimator.intercept_)
# 5) Model evaluation
y_predict = estimator.predict(x_test)
print('Predictions\n', y_predict)
print('MSE\n', mean_squared_error(y_test, y_predict))
# 4) SGDRegressor model (gradient descent)
estimator = SGDRegressor(learning_rate='constant', eta0=0.01, max_iter=10000)
estimator.fit(x_train, y_train)
print('Gradient descent: weights\n', estimator.coef_)
print('Gradient descent: intercept\n', estimator.intercept_)
# 5) Model evaluation
y_predict = estimator.predict(x_test)
print('Predictions\n', y_predict)
print('MSE\n', mean_squared_error(y_test, y_predict))
The results are as follows:
Normal equation: weights
 [-0.64817766  1.14673408 -0.05949444  0.74216553 -1.95515269  2.70902585
  -0.07737374 -3.29889391  2.50267196 -1.85679269 -1.75044624  0.87341624
  -3.91336869]
Normal equation: intercept
 22.62137203166228
Predictions
 [28.22944896 31.5122308  21.11612841 32.6663189  20.0023467  ...]
MSE
 20.6275137630954
Gradient descent: weights
 [-0.20772372  0.92645947  0.08913743  0.67508683 -1.81886124  3.31301882
  -0.09589654 -3.41350815  2.40213736 -1.81839293 -1.99261014  0.19576098
  -3.96829135]
Gradient descent: intercept
 [22.90221188]
Predictions
 [28.69926108 32.0915701  20.79223736 32.8937585  19.84098535 ...]
MSE
 24.03864194527857