Linear Regression

Latest recommended article published on 2024-06-07 18:20:20

心沉海枯…

Article link: https://blog.csdn.net/brighttaoist/article/details/88804555

Predicting Boston House Prices with Linear Regression

Loading the dataset

import numpy as np
from sklearn import datasets
from sklearn import metrics
from sklearn import model_selection as modsel
from sklearn import linear_model
import matplotlib.pyplot as plt
plt.style.use('ggplot')

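# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this call needs an older scikit-learn release.
boston = datasets.load_boston()
dir(boston)  # dir() lists the attribute names of the returned object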

Training the model

linreg = linear_model.LinearRegression()  
x_train, x_test, y_train, y_test = modsel.train_test_split(
    boston.data, boston.target, test_size=0.1, random_state=42)
linreg.fit(x_train, y_train)
# Mean squared error on the training set
metrics.mean_squared_error(y_train, linreg.predict(x_train))
# The score method of linreg returns the coefficient of determination (R^2)
linreg.score(x_train, y_train)
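
As a rough illustration of what the fit step computes, the sketch below solves an ordinary least-squares problem directly with NumPy. The data is synthetic and the names `X`, `y`, `X_aug`, `coef`, and `intercept` are assumptions made for this example only; it is not the Boston data and not scikit-learn's internal implementation.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not the Boston data):
# y = 2*x0 - 3*x1 + 5 plus a little Gaussian noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = 2.0 * X[:, 0] - 3.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

# Append a column of ones so the intercept is fitted as one extra weight.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])

# Least-squares solution: the w that minimizes ||X_aug @ w - y||^2.
w, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
coef, intercept = w[:-1], w[-1]

# Predictions from the fitted weights.
y_hat = X_aug @ w
print("coefficients:", coef)
print("intercept:", intercept)
```

On the same data, `linear_model.LinearRegression().fit(X, y)` should give `coef_` and `intercept_` values that agree with these up to numerical precision, since both solve the same least-squares problem.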

Testing the model
Compute the mean squared error on the test data.

y_pred = linreg.predict(x_test)  # predicted values
metrics.mean_squared_error(y_test, y_pred)
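
To make the metric concrete, here is a minimal sketch of the mean squared error computed by hand. The helper name `mean_squared_error_np` and the four demo values are made up for illustration; they are not taken from the Boston data.

```python
import numpy as np

def mean_squared_error_np(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical targets and predictions, chosen so the arithmetic is easy to follow.
y_true_demo = np.array([24.0, 21.6, 34.7, 33.4])
y_pred_demo = np.array([25.0, 20.6, 32.7, 33.4])

# Errors are -1, 1, 2, 0, so the squared errors sum to 6 and the mean is 1.5.
demo_mse = mean_squared_error_np(y_true_demo, y_pred_demo)

# The square root puts the error back into the units of the target.
demo_rmse = demo_mse ** 0.5
print(demo_mse, demo_rmse)
```

Because the errors are squared, the MSE is in squared target units; taking the square root is a common way to read it on the same scale as the target.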

Plotting the data

plt.figure(figsize=(10, 6))  # figsize is (width, height) in inches
plt.plot(y_test, linewidth=1, label='ground truth')
plt.plot(y_pred, linewidth=1, label='predicted')
plt.legend(loc='best')
plt.xlabel('test data points')
plt.ylabel('target value')
plt.show()

[Figure: ground truth and predicted target values plotted over the test data points]

plt.plot(y_test, y_pred, '^')  # '^' draws upward triangles; 'o' would draw circles
plt.plot([-10, 60], [-10, 60], 'k--')  # dashed black diagonal: perfect predictions lie on it
plt.axis([-10, 60, -10, 60])
plt.xlabel('ground truth')
plt.ylabel('predicted')
scorestr = r'R$^2$ = %.3f' % linreg.score(x_test, y_test)  # R^2 value
errstr = 'MSE = %.3f' % metrics.mean_squared_error(y_test, y_pred)
# Show the R^2 value and the mean squared error as text on the plot
plt.text(-5, 50, scorestr, fontsize=10)
plt.text(-5, 45, errstr, fontsize=10)
plt.show()

[Figure: scatter plot of predicted values against ground truth, annotated with R^2 and MSE]
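From the figure: the R^2 value indicates that the model explains about 76% of the variance in the test targets, and the mean squared error is 15.011.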
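
The "share of variance explained" reading of R^2 can be sketched with the standard formula 1 - SS_res / SS_tot. The helper name `r2_score_np` and the toy numbers below are assumptions for illustration, not values from the Boston experiment.

```python
import numpy as np

def r2_score_np(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # variation left unexplained
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return float(1.0 - ss_res / ss_tot)

# Hypothetical toy values: ss_res = 0.75 and ss_tot = 20.
y_true_toy = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_toy = np.array([2.5, 5.5, 7.0, 9.5])
toy_r2 = r2_score_np(y_true_toy, y_pred_toy)

# A model that always predicts the mean of the targets explains none of the variance.
baseline_r2 = r2_score_np(y_true_toy, np.full(4, y_true_toy.mean()))
print(toy_r2, baseline_r2)
```

Under this definition, a score near 0.76 means the squared error of the model is roughly 24% of the squared error of a constant predictor that always outputs the mean of the test targets.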