Evaluating Regression Models
Mean squared error (MSE): the smaller, the better.
R²: the closer to 1, the better.
# load libraries
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

# generate a features matrix and target vector
features, target = make_regression(n_samples=100,
                                   n_features=3,
                                   n_informative=3,
                                   n_targets=1,
                                   noise=50,
                                   coef=False,
                                   random_state=1)

# create a linear regression object
ols = LinearRegression()

# cross-validate the linear regression using (negative) MSE
cross_val_score(ols, features, target, scoring='neg_mean_squared_error')
array([-1718.22817783, -3103.4124284 , -1377.17858823])
Another common regression metric is the coefficient of determination, R²:
cross_val_score(ols, features, target, scoring='r2')
array([0.87804558, 0.76395862, 0.89154377])
Discussion
MSE is one of the most common evaluation metrics for regression models. Formally, MSE is:
MSE = (1/n) ∑_{i=1}^{n} (ŷ_i − y_i)²

where n is the number of observations, y_i is the true value of the target we are trying to predict for observation i, and ŷ_i is the model's predicted value.
MSE measures the average squared distance between predicted and true values.
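To make the formula concrete, MSE can be computed directly from its definition and checked against scikit-learn's mean_squared_error (the sample values below are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# illustrative true and predicted values (not from this recipe's data)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

# MSE by hand: the mean of the squared residuals
mse_manual = np.mean((y_pred - y_true) ** 2)  # 0.375

# scikit-learn's implementation of the same formula
mse_sklearn = mean_squared_error(y_true, y_pred)  # 0.375
```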
The higher the value of MSE, the greater the total squared error and thus the worse the model. There are a number of mathematical benefits to squaring the error term, including that it forces all error values to be positive, but one often unrealized implication is that squaring penalizes a few large errors more than many small errors, even if the absolute value of the errors is the same.
For example, imagine two models, A and B, each with two observations:
Model A has errors of 0 and 10, so its MSE is (0² + 10²)/2 = 50.
Model B has errors of 5 and 5, so its MSE is (5² + 5²)/2 = 25.
Both models have the same total absolute error, 10; however, MSE would consider Model A (MSE = 50) worse than Model B (MSE = 25). In practice this implication is rarely an issue (and indeed can be theoretically beneficial), and MSE works perfectly fine as an evaluation metric.
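The two-model comparison can be sketched in code. Note that mean_squared_error divides by the number of observations, so with two observations it returns half the sum of squared errors; either way, Model A scores worse than Model B. The true values of 0 below are an assumption made so each prediction equals its error:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.zeros(2)                  # assume true values are 0 for illustration
model_a_pred = np.array([0.0, 10.0])  # errors of 0 and 10
model_b_pred = np.array([5.0, 5.0])   # errors of 5 and 5

mse_a = mean_squared_error(y_true, model_a_pred)  # (0² + 10²) / 2 = 50.0
mse_b = mean_squared_error(y_true, model_b_pred)  # (5² + 5²) / 2 = 25.0
# same total absolute error (10), but MSE ranks Model A as worse
```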
One important note: by default in scikit-learn, arguments of the scoring parameter assume that higher values are better than lower values. However, this is not the case for MSE, where higher values mean a worse model. For this reason, scikit-learn reports the negative MSE via the neg_mean_squared_error argument.
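Because the scores come back negated, recovering the usual positive MSE is just a sign flip. A minimal sketch reusing the setup from this recipe:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

# same synthetic data as in this recipe
features, target = make_regression(n_samples=100, n_features=3,
                                   n_informative=3, n_targets=1,
                                   noise=50, coef=False, random_state=1)
ols = LinearRegression()

# scores are negative MSE values, one per fold
scores = cross_val_score(ols, features, target,
                         scoring='neg_mean_squared_error')

# negate the mean to recover a positive average MSE across folds
mse = -scores.mean()
```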
A common alternative regression evaluation metric is R², which measures the amount of variance in the target vector that is explained by the model:
R² = 1 − ∑_{i=1}^{n} (y_i − ŷ_i)² / ∑_{i=1}^{n} (y_i − ȳ)²

where y_i is the true target value of the ith observation, ŷ_i is the predicted value for the ith observation, and ȳ is the mean value of the target vector.
The closer to 1.0, the better the model.
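R² can likewise be computed directly from the definition above and checked against scikit-learn's r2_score (the sample values are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import r2_score

# illustrative true and predicted values (not from this recipe's data)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

# residual sum of squares: error left unexplained by the model
ss_res = np.sum((y_true - y_pred) ** 2)
# total sum of squares: variance of the target around its mean
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r2_manual = 1 - ss_res / ss_tot
r2_sklearn = r2_score(y_true, y_pred)  # should match the manual value
```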