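I. Installing scikit-learn
If Anaconda is installed, you can search for the package and add it from Anaconda Navigator under Environments. Otherwise, install or upgrade it with pip: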
pip install -U scikit-learn
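To confirm that the installation worked, a quick check along the following lines can help. It is only a minimal sketch: it assumes scikit-learn is importable in the active environment, and the version printed depends on what was installed.

```python
# Quick sanity check after installation
import sklearn
from sklearn import metrics

# The version string differs from machine to machine
print("scikit-learn version:", sklearn.__version__)

# The metrics module should expose the functions used later in this post
print(hasattr(metrics, "mean_squared_error"), hasattr(metrics, "r2_score"))
```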
II. Importing and calling sklearn.metrics
There are two ways to import the metric functions.
Method 1:
from sklearn.metrics import <metric function name>
For example:
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
The functions are then called directly by name:
mse = mean_squared_error(y_test, y_pre)
R2 = r2_score(y_test,y_pre)
Method 2:
from sklearn import metrics
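Functions are then called as metrics.<metric function name>(parameters).
For example, to compute the mean squared error and the coefficient of determination R2 of a regression:
mse = metrics.mean_squared_error(y_test, y_pre)
R2 = metrics.r2_score(y_test, y_pre)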
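The two import styles can be compared side by side in a small runnable sketch. The values of y_test and y_pre below are made-up illustration data, not the output of a real model; both styles call the same underlying functions, so the results should match.

```python
from sklearn import metrics
from sklearn.metrics import mean_squared_error, r2_score

# Made-up illustration data (assumption, not from a real model)
y_test = [3.0, -0.5, 2.0, 7.0]  # true target values
y_pre = [2.5, 0.0, 2.0, 8.0]    # predicted values

# Method 1: call the imported functions directly
mse_1 = mean_squared_error(y_test, y_pre)
r2_1 = r2_score(y_test, y_pre)

# Method 2: call the same functions through the metrics module
mse_2 = metrics.mean_squared_error(y_test, y_pre)
r2_2 = metrics.r2_score(y_test, y_pre)

print(mse_1, mse_2)  # both equal 0.375
print(r2_1, r2_2)    # identical values, roughly 0.9486
```

Method 1 keeps call sites short, while Method 2 makes it obvious at every call where a function comes from; which one to use is largely a matter of taste.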
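【Classification metrics】
1.accuracy_score(y_true, y_pred): accuracy, the fraction of samples that are classified correctly.
2.auc(x, y): area under the curve defined by the points (x, y), computed with the trapezoidal rule. Applied to the points of a ROC curve it gives the ROC AUC; a larger AUC indicates better performance. Older releases also accepted a reorder=False parameter, which has since been removed.
3.average_precision_score(y_true, y_score, average='macro', sample_weight=None): average precision (AP), computed from prediction scores.
4.brier_score_loss(y_true, y_prob, sample_weight=None, pos_label=None): Brier score, the mean squared difference between the predicted probability and the actual outcome. The smaller the Brier score, the better.
5.confusion_matrix(y_true, y_pred, labels=None, sample_weight=None): computes and returns the confusion matrix, used to evaluate the accuracy of a classification. Entry (i, j) counts the samples whose true class is i and predicted class is j.
6.f1_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None): F1 score.
F1 = 2 * (precision * recall) / (precision + recall)
precision = TP/(TP+FP)
recall = TP/(TP+FN)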
7.log_loss(y_true, y_pred, eps=1e-15, normalize=True, sample_weight=None, labels=None): log loss, also called logistic loss or cross-entropy loss. Here y_pred holds predicted probabilities, not class labels.
8.precision_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None): precision; precision = TP/(TP+FP)
9.recall_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None): recall; recall = TP/(TP+FN)
10.roc_auc_score(y_true, y_score, average='macro', sample_weight=None): area under the ROC curve (AUC), computed from prediction scores; the larger the better.
11.roc_curve(y_true, y_score, pos_label=None, sample_weight=None, drop_intermediate=True): computes the coordinates of the ROC curve and returns FPR (x axis), TPR (y axis) and the corresponding thresholds.
TPR = TP/(TP+FN) = recall (true positive rate, sensitivity)
FPR = FP/(FP+TN) (false positive rate, 1 - specificity)
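The classification metrics listed above can be tried out on a toy binary problem. The sketch below uses made-up labels and scores, and the 0.5 threshold for turning scores into hard labels is an assumption chosen for illustration. Label-based metrics take y_pred, while score-based metrics take y_score.

```python
import numpy as np
from sklearn import metrics

# Made-up binary labels and predicted probabilities of the positive class
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.6])

# Hard labels from an assumed decision threshold of 0.5
y_pred = (y_score >= 0.5).astype(int)

# Label-based metrics (need predicted classes)
cm = metrics.confusion_matrix(y_true, y_pred)  # rows: true, columns: predicted
tn, fp, fn, tp = cm.ravel()                    # binary case only
accuracy = metrics.accuracy_score(y_true, y_pred)
precision = metrics.precision_score(y_true, y_pred)  # TP/(TP+FP)
recall = metrics.recall_score(y_true, y_pred)        # TP/(TP+FN)
f1 = metrics.f1_score(y_true, y_pred)

# Score-based metrics (need probabilities or scores)
roc_auc = metrics.roc_auc_score(y_true, y_score)
fpr, tpr, thresholds = metrics.roc_curve(y_true, y_score)
auc_from_curve = metrics.auc(fpr, tpr)  # same area, via the generic auc()
ap = metrics.average_precision_score(y_true, y_score)
brier = metrics.brier_score_loss(y_true, y_score)
ll = metrics.log_loss(y_true, y_score)

print(cm)
print("accuracy:", accuracy, "precision:", precision, "recall:", recall, "F1:", f1)
print("ROC AUC:", roc_auc, "auc(fpr, tpr):", auc_from_curve)
print("AP:", ap, "Brier:", brier, "log loss:", ll)
```

With this data TP = 3, FP = 1, FN = 1 and TN = 3, so accuracy, precision, recall and F1 all come out at 0.75, and both routes to the ROC AUC give 0.875.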
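【Regression metrics】
1.explained_variance_score(y_true, y_pred, sample_weight=None, multioutput='uniform_average'): explained variance score, 1 - Var(y_true - y_pred) / Var(y_true). It measures how much of the variance of the target is accounted for by the predictions; the best value is 1.0.
2.mean_absolute_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average'): mean absolute error (MAE).
3.mean_squared_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average'): mean squared error (MSE).
4.median_absolute_error(y_true, y_pred): median absolute error, which is less sensitive to outliers than the mean-based errors.
5.r2_score(y_true, y_pred, sample_weight=None, multioutput='uniform_average'): coefficient of determination R2 (R squared). The best value is 1.0, and it can be negative for a model that fits worse than predicting the mean.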
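To make the definitions of the regression metrics concrete, the following sketch computes each one with sklearn.metrics and again by hand with NumPy. The data is made up for illustration, and the manual formulas are the standard unweighted, single-output definitions, so they correspond to the default arguments only.

```python
import numpy as np
from sklearn import metrics

# Made-up illustration data (assumption, not from a real model)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
err = y_true - y_pred  # residuals

# Library versions
ev = metrics.explained_variance_score(y_true, y_pred)
mae = metrics.mean_absolute_error(y_true, y_pred)
mse = metrics.mean_squared_error(y_true, y_pred)
medae = metrics.median_absolute_error(y_true, y_pred)
r2 = metrics.r2_score(y_true, y_pred)

# Manual versions from the textbook definitions
ev_manual = 1 - np.var(err) / np.var(y_true)
mae_manual = np.mean(np.abs(err))
mse_manual = np.mean(err ** 2)
medae_manual = np.median(np.abs(err))
r2_manual = 1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)

for name, lib, manual in [
    ("explained variance", ev, ev_manual),
    ("MAE", mae, mae_manual),
    ("MSE", mse, mse_manual),
    ("median AE", medae, medae_manual),
    ("R2", r2, r2_manual),
]:
    print(f"{name}: sklearn={lib:.4f} manual={manual:.4f}")
```

Explained variance and R2 differ only in that R2 also penalizes a systematic bias in the residuals; when the residuals have zero mean the two scores coincide.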