I.inspection
1.Overview:
This module provides tools for model inspection
2.Usage
(1)Inspection:
Compute the partial dependence (PD) of features:[<predictions>,<values>=]sklearn.inspection.partial_dependence(<estimator>,<X>,<features>[,response_method='auto',percentiles=(0.05,0.95),grid_resolution=100,method='auto',kind='legacy'])
Compute the permutation importance (PI) used for feature evaluation:[<result>=]sklearn.inspection.permutation_importance(<estimator>,<X>,<y>[,scoring=None,n_repeats=5,n_jobs=None,random_state=None,sample_weight=None])
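#Example: a minimal sketch of calling permutation_importance. The synthetic data set, the LinearRegression model,
#and all variable names are assumptions chosen for illustration; they do not come from the notes above.

```python
from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression

# Synthetic regression data: 4 features, only 2 of which carry signal (illustrative choice).
X, y = make_regression(n_samples=200, n_features=4, n_informative=2, random_state=0)
model = LinearRegression().fit(X, y)

# Shuffle each feature column n_repeats times and record the drop in the model's score.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

# result.importances holds one row per feature and one column per repeat;
# result.importances_mean averages over the repeats.
for idx, mean in enumerate(result.importances_mean):
    print(f"feature {idx}: mean importance {mean:.3f}")
```

#Features the model does not rely on should show a mean importance close to zero.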
(2)Plotting:
Partial dependence plot (PDP):class sklearn.inspection.PartialDependenceDisplay(<pd_results>,<features>,<feature_names>,<target_idx>,<pdp_lim>,<deciles>[,kind='average',subsample=1000,random_state=None])
######################################################################################################################
Draw partial dependence (PD) and individual conditional expectation (ICE) plots:sklearn.inspection.plot_partial_dependence(<estimator>,<X>,<features>[,feature_names=None,target=None,response_method='auto',n_cols=3,grid_resolution=100,percentiles=(0.05,0.95),method='auto',n_jobs=None,verbose=0,line_kw=None,contour_kw=None,ax=None,kind='average',subsample=1000,random_state=None])
II.metrics
1.Overview:
This module contains score functions, performance metrics, pairwise metrics, and distance computations, used to evaluate model performance quantitatively
2.Model selection interface:
Determine the scorer from the user's options:[<scoring>=]sklearn.metrics.check_scoring(<estimator>[,scoring=None,allow_none=False])
Get a scorer from a str:[<scorer>=]sklearn.metrics.get_scorer(<scoring>)
Create a scorer from a performance metric or loss function:[<scorer>=]sklearn.metrics.make_scorer(<score_func>[,greater_is_better=True,needs_proba=False,needs_threshold=False,**kwargs])
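#Example: a minimal sketch of building and calling scorers. The data set, the LogisticRegression model, and the
#choice of beta=2 are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, get_scorer, make_scorer

# Small synthetic binary classification problem (illustrative).
X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap a metric as a scorer; extra keyword arguments (beta here) are forwarded to the metric.
f2_scorer = make_scorer(fbeta_score, beta=2)

# Look up a predefined scorer by its string name.
acc_scorer = get_scorer("accuracy")

# A scorer is called as scorer(estimator, X, y), not as metric(y_true, y_pred).
f2 = f2_scorer(clf, X, y)
acc = acc_scorer(clf, X, y)
print(f"F2 = {f2:.3f}, accuracy = {acc:.3f}")
```

#The same scorer objects can be passed as the scoring argument of model selection tools such as cross_val_score.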
3.Classification metrics:
Compute the accuracy classification score:[<score>=]sklearn.metrics.accuracy_score(<y_true>,<y_pred>[,normalize=True,sample_weight=None])
Compute the area under the curve (AUC) using the trapezoidal rule:[<auc>=]sklearn.metrics.auc(<x>,<y>)
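#Example: what the trapezoidal rule computes, written in plain Python. This is a sketch of the formula, not the
#library's implementation; trapezoid_auc and the sample ROC points are hypothetical names and values.

```python
def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with at least 2 points")
    area = 0.0
    for i in range(1, len(x)):
        # Each segment contributes width * average height.
        area += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0
    return area


# Points of a small ROC curve: false positive rate on x, true positive rate on y.
fpr = [0.0, 0.0, 0.5, 0.5, 1.0]
tpr = [0.0, 0.5, 0.5, 1.0, 1.0]
area = trapezoid_auc(fpr, tpr)
print(area)  # 0.75
```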
Compute the average precision from prediction scores:[<average_precision>=]sklearn.metrics.average_precision_score(<y_true>,<y_score>[,average='macro',pos_label=1,sample_weight=None])
Compute the balanced accuracy:[<balanced_accuracy>=]sklearn.metrics.balanced_accuracy_score(<y_true>,<y_pred>[,sample_weight=None,adjusted=False])
Compute the Brier score:[<score>=]sklearn.metrics.brier_score_loss(<y_true>,<y_prob>[,sample_weight=None,pos_label=None])
Build a report of the main classification metrics:[<report>=]sklearn.metrics.classification_report(<y_true>,<y_pred>[,labels=None,target_names=None,sample_weight=None,digits=2,output_dict=False,zero_division='warn'])
Compute Cohen's kappa statistic:[<kappa>=]sklearn.metrics.cohen_kappa_score(<y1>,<y2>[,labels=None,weights=None,sample_weight=None])
Compute the confusion matrix:[<C>=]sklearn.metrics.confusion_matrix(<y_true>,<y_pred>[,labels=None,sample_weight=None,normalize=None])
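#Example: how a confusion matrix is laid out, written in plain Python. Entry [i][j] counts the samples whose true
#label is labels[i] and whose predicted label is labels[j]. confusion_counts is a hypothetical helper, not a library call.

```python
def confusion_counts(y_true, y_pred, labels):
    """Count (true label, predicted label) pairs; rows are true labels, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
C = confusion_counts(y_true, y_pred, labels=[0, 1, 2])
print(C)  # [[2, 0, 0], [0, 0, 1], [1, 0, 2]]

# Accuracy is the sum of the diagonal divided by the number of samples.
accuracy = sum(C[i][i] for i in range(len(C))) / len(y_true)
```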
Compute the discounted cumulative gain (DCG):[<discounted_cumulative_gain>=]sklearn.metrics.dcg_score(<y_true>,<y_score>[,k=None,log_base=2,sample_weight=None,ignore_ties=False])
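#Example: the DCG formula for a single query, written in plain Python. This sketch assumes there are no tied scores
#and handles one sample only; dcg_single and the sample values are hypothetical.

```python
import math


def dcg_single(relevance, scores, k=None, log_base=2):
    """Sum the true relevances in predicted-score order, discounted by log(rank + 1)."""
    # Indices sorted from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if k is not None:
        order = order[:k]  # keep only the k highest-ranked items
    # rank is 0-based here, so the discount for the first item is log(2).
    return sum(relevance[i] / math.log(rank + 2, log_base) for rank, i in enumerate(order))


relevance = [3, 2, 0, 1]        # true relevance of each item
scores = [0.9, 0.8, 0.1, 0.3]   # predicted scores used for ranking
dcg = dcg_single(relevance, scores)
dcg_at_2 = dcg_single(relevance, scores, k=2)
```

#With the default log_base=2 the first item is not discounted, the second is divided by log2(3), and so on.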
Compute the detection error tradeoff curve (DET curve):[<fpr>,<fnr>,<thresholds>=]sklearn.metrics.det_curve(<y_true>,<y_score>[,pos_label=None,sample_weight=None])
#That is, the curve formed by the (false positive rate, false negative rate) pairs at different probability thresholds
Compute the F1 score:[<f1_score>=]sklearn.metrics.f1_score(<y_true>,<y_pred>[,labels=None,pos_label=1,average='binary',sample_weight=None,zero_division='warn'])
Compute the F-beta score:[<fbeta_score>=]sklearn.metrics.fbeta_score(<y_true>,<y_pred>,<beta>[,labels=None,pos_label=1,average='binary',sample_weight=None,zero_division='warn'])
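#Example: the binary F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), written in plain Python.
#fbeta_binary is a hypothetical helper; returning 0.0 when there are no true positives is a simplification made here.

```python
def fbeta_binary(y_true, y_pred, beta, pos_label=1):
    """F-beta score for the positive class of a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0]
f1 = fbeta_binary(y_true, y_pred, beta=1)  # precision 2/3, recall 1/2 -> 4/7
f2 = fbeta_binary(y_true, y_pred, beta=2)  # recall weighted more heavily -> 10/19
```

#beta=1 gives the F1 score; beta>1 weights recall more heavily and beta<1 weights precision more heavily.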
Compute the average Hamming loss:[<loss>=]sklearn.metrics.hamming_loss(<y_true>,<y_pred>[,sample_weight=None])
Compute the average hinge loss:[<loss>=]sklearn.metrics.hinge_loss(<y_true>,<pred_decision>[,labels=None,sample_weight=None])
Compute the Jaccard similarity coefficient score:[<score>=]sklearn.metrics.jaccard_score(<y_true