Evaluating Regression Models
Mean squared error (MSE): the smaller, the better.
R²: the closer to 1, the better.
# load libraries
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

# generate a features matrix and target vector
features, target = make_regression(n_samples=100,
                                   n_features=3,
                                   n_informative=3,
                                   n_targets=1,
                                   noise=50,
                                   coef=False,
                                   random_state=1)

# create a linear regression object
ols = LinearRegression()

# cross-validate the linear regression using (negative) MSE
cross_val_score(ols, features, target, scoring='neg_mean_squared_error')
array([-1718.22817783, -3103.4124284 , -1377.17858823])
Another common regression metric is the coefficient of determination, R²:
cross_val_score(ols, features, target, scoring='r2')
array([0.87804558, 0.76395862, 0.89154377])
Discussion
MSE is one of the most common evaluation metrics for regression models. Formally, MSE is:
MSE = (1/n) ∑_{i=1}^{n} (ŷ_i − y_i)²

where n is the number of observations, y_i is the true value of the target we are trying to predict for observation i, and ŷ_i is the model's predicted value.
MSE measures the average squared distance between predicted and true values.
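To make the formula concrete, MSE can be computed directly from its definition and checked against scikit-learn's mean_squared_error (the sample values below are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# illustrative true and predicted values (not from this recipe's data)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

# MSE by hand: the mean of the squared residuals
mse_manual = np.mean((y_pred - y_true) ** 2)  # 0.375

# scikit-learn's implementation of the same formula
mse_sklearn = mean_squared_error(y_true, y_pred)  # 0.375
```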
The higher the value of MSE, the greater the total squared error and thus the worse the model. There are a number of mathematical benefits to squaring the error term, including that it forces all error values to be positive, but one often unrealized implication is that squaring penalizes a few large errors more than many small errors, even if the absolute value of the errors is the same.
For example, imagine two models, A and B, each with two observations:
Model A has errors of 0 and 10, so its MSE is (0² + 10²)/2 = 50.
Model B has errors of 5 and 5, so its MSE is (5² + 5²)/2 = 25.
Both models have the same total absolute error, 10; however, MSE would consider Model A (MSE = 50) worse than Model B (MSE = 25). In practice this implication is rarely an issue (and indeed can be theoretically beneficial), and MSE works perfectly fine as an evaluation metric.
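The two-model comparison can be sketched in code. Note that mean_squared_error divides by the number of observations, so with two observations it returns half the sum of squared errors; either way, Model A scores worse than Model B. The true values of 0 below are an assumption made so each prediction equals its error:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.zeros(2)                  # assume true values are 0 for illustration
model_a_pred = np.array([0.0, 10.0])  # errors of 0 and 10
model_b_pred = np.array([5.0, 5.0])   # errors of 5 and 5

mse_a = mean_squared_error(y_true, model_a_pred)  # (0² + 10²) / 2 = 50.0
mse_b = mean_squared_error(y_true, model_b_pred)  # (5² + 5²) / 2 = 25.0
# same total absolute error (10), but MSE ranks Model A as worse
```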
One important note: by default in scikit-learn, arguments of the scoring parameter assume that higher values are better than lower values. However, this is not the case for MSE, where higher values mean a worse model. For this reason, scikit-learn reports the negative MSE via the neg_mean_squared_error argument.
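Because the scores come back negated, recovering the usual positive MSE is just a sign flip. A minimal sketch reusing the setup from this recipe:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

# same synthetic data as in this recipe
features, target = make_regression(n_samples=100, n_features=3,
                                   n_informative=3, n_targets=1,
                                   noise=50, coef=False, random_state=1)
ols = LinearRegression()

# scores are negative MSE values, one per fold
scores = cross_val_score(ols, features, target,
                         scoring='neg_mean_squared_error')

# negate the mean to recover a positive average MSE across folds
mse = -scores.mean()
```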
A common alternative regression evaluation metric is R², which measures the amount of variance in the target vector that is explained by the model:
R² = 1 − ∑_{i=1}^{n} (y_i − ŷ_i)² / ∑_{i=1}^{n} (y_i − ȳ)²

where y_i is the true target value of the ith observation, ŷ_i is the predicted value for the ith observation, and ȳ is the mean value of the target vector.
The closer to 1.0, the better the model.
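R² can likewise be computed directly from the definition above and checked against scikit-learn's r2_score (the sample values are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import r2_score

# illustrative true and predicted values (not from this recipe's data)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

# residual sum of squares: error left unexplained by the model
ss_res = np.sum((y_true - y_pred) ** 2)
# total sum of squares: variance of the target around its mean
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r2_manual = 1 - ss_res / ss_tot
r2_sklearn = r2_score(y_true, y_pred)  # should match the manual value
```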