Python: Evaluating Regression Models

Mean squared error (MSE): the smaller, the better.

MSE = (1/n) Σ_{i=1}^{n} (ŷ_i − y_i)²

R²: the closer to 1, the better.

# Load libraries
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Generate a features matrix and target vector
features, target = make_regression(n_samples=100,
                                   n_features=3,
                                   n_informative=3,
                                   n_targets=1,
                                   noise=50,
                                   coef=False,
                                   random_state=1)

# Create a linear regression object
ols = LinearRegression()

# Cross-validate the linear regression using (negative) MSE
cross_val_score(ols, features, target, scoring='neg_mean_squared_error')
array([-1718.22817783, -3103.4124284 , -1377.17858823])
Another common regression metric is the coefficient of determination, R²:
cross_val_score(ols, features, target, scoring='r2')
array([0.87804558, 0.76395862, 0.89154377])
Discussion
MSE is one of the most common evaluation metrics for regression models. Formally, MSE is:

MSE = (1/n) Σ_{i=1}^{n} (ŷ_i − y_i)²

where n is the number of observations, y_i is the true value of the target we are trying to predict for observation i, and ŷ_i is the model's predicted value.
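As a sanity check, the formula translates directly into NumPy; the toy values below are made up purely for illustration:

```python
import numpy as np

# Hypothetical toy data: true targets y_i and predictions ŷ_i
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MSE = (1/n) * Σ (ŷ_i − y_i)²
n = len(y_true)
mse = np.sum((y_pred - y_true) ** 2) / n
print(mse)  # 0.375
```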

MSE is a measurement of the squared sum of all distances between predicted and true values.

The higher the value of MSE, the greater the total squared error and thus the worse the model. There are a number of mathematical benefits to squaring the error term, including that it forces all error values to be positive, but one often unrealized implication is that squaring penalizes a few large errors more than many small errors, even if the absolute value of the errors is the same.

For example, imagine two models, A and B, each with two observations:

Model A has errors of 0 and 10, and thus its MSE is 0² + 10² = 100.
Model B has two errors of 5 each, and thus its MSE is 5² + 5² = 50.
Both models have the same total error, 10; however, MSE would consider Model A (MSE = 100) worse than Model B (MSE = 50). In practice this implication is rarely an issue (and indeed can be theoretically beneficial), and MSE works perfectly fine as an evaluation metric.
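The arithmetic above can be checked in a couple of lines. Following the text, the 1/n factor is dropped; this does not affect the comparison, since both models have two observations:

```python
# Model A: one error of 0 and one of 10; Model B: two errors of 5
errors_a = [0, 10]
errors_b = [5, 5]

sse_a = sum(e ** 2 for e in errors_a)  # 0² + 10² = 100
sse_b = sum(e ** 2 for e in errors_b)  # 5² + 5² = 50

# Same total absolute error (10), but squaring punishes Model A more
print(sse_a, sse_b)  # 100 50
```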

One important note: by default, arguments of the scoring parameter in scikit-learn assume that higher values are better than lower values. However, this is not the case for MSE, where higher values mean a worse model. For this reason, scikit-learn reports the negative MSE via the neg_mean_squared_error argument.
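A quick sketch of recovering ordinary (positive) MSE values from the negated scores; the data-generation settings mirror the example above:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

features, target = make_regression(n_samples=100, n_features=3,
                                   noise=50, random_state=1)

# scikit-learn negates MSE so that its "higher is better" convention holds
scores = cross_val_score(LinearRegression(), features, target,
                         scoring='neg_mean_squared_error')

# Flip the sign to get back conventional, non-negative MSE per fold
mse_per_fold = -scores
```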

A common alternative regression evaluation metric is R², which measures the amount of variance in the target vector that is explained by the model:

R² = 1 − Σ_{i=1}^{n} (y_i − ŷ_i)² / Σ_{i=1}^{n} (y_i − ȳ)²

where y_i is the true target value of the ith observation, ŷ_i is the predicted value for the ith observation, and ȳ is the mean value of the target vector.

The closer to 1.0, the better the model.
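Computing R² by hand from this formula and comparing against scikit-learn's r2_score (the toy values are illustrative):

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical toy data
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

ss_res = np.sum((y_true - y_pred) ** 2)         # Σ (y_i − ŷ_i)²
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # Σ (y_i − ȳ)²
r2 = 1 - ss_res / ss_tot

print(np.isclose(r2, r2_score(y_true, y_pred)))  # True
```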