I use this code to compare the performance of several models:
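from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# X = input data, Y = binary labels (both assumed to be defined already)

# Candidate models to compare
models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
results = []
names = []
scoring = 'accuracy'
for name, model in models:
    # random_state only takes effect with shuffle=True
    # (recent scikit-learn versions raise an error otherwise)
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=7)
    cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    # Report the mean and standard deviation of the fold scores
    msg = "%s: %.2f (%.2f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)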
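I can use 'accuracy' and 'recall' as the scoring, which give me accuracy and sensitivity. How can I create a scorer that gives me specificity?
Specificity = TN / (TN + FP)
where TN and FP are the true negative and false positive counts in the confusion matrix.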
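To make the specificity formula concrete, here is a minimal sketch that counts TN and FP directly from two label lists. The helper name `specificity_from_counts`, the toy labels, and the assumption that the negative class is encoded as 0 are all illustrative and not part of the original code.

```python
def specificity_from_counts(y_true, y_pred, negative=0):
    """Return TN / (TN + FP) for binary labels (hypothetical helper).

    Assumes at least one true negative-class sample is present;
    otherwise the division below raises ZeroDivisionError.
    """
    # True negatives: actual negative, predicted negative
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p == negative)
    # False positives: actual negative, predicted positive
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p != negative)
    return tn / (tn + fp)


# Toy labels, made up for illustration
y_true = [0, 0, 0, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 0]

print(specificity_from_counts(y_true, y_pred))  # 3 TN, 1 FP -> 0.75
```

Only the samples whose true label is negative enter the calculation, which is why specificity is undefined for a set of samples that contains no negatives.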
I tried this:
from sklearn.metrics import confusion_matrix, make_scorer

def tp(y_true, y_pred):
    # Row 0 of the confusion matrix holds the true negatives and false positives
    cm = confusion_matrix(y_true, y_pred)
    error = cm[0, 0] / (cm[0, 0] + cm[0, 1])
    return error
my_scorer = make_scorer(tp, greater_is_better=True)
Then:
cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=my_scorer)
But it does not work for n_splits >= 10. I get this error when my_scorer is computed:
IndexError: index 1 is out of bounds for axis 1 with size 1
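One plausible reading of this error, offered as a sketch and not a confirmed diagnosis: with many splits, a test fold may contain (and be predicted as) a single class only, in which case `confusion_matrix` returns a 1x1 array and indexing column 1 fails. The snippet below reproduces that shape on made-up labels and shows how the `labels` parameter of `confusion_matrix` forces a 2x2 matrix. The names `specificity` and `specificity_scorer`, the 0/1 label encoding, and the choice to return NaN for folds without negatives are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, make_scorer


def specificity(y_true, y_pred):
    # labels=[0, 1] forces a 2x2 matrix even if only one class appears
    # (assumes binary labels encoded as 0 = negative, 1 = positive)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    # Specificity is undefined without negatives; returning NaN here is
    # an illustrative choice, not the only reasonable one
    return tn / (tn + fp) if (tn + fp) > 0 else float("nan")


# A made-up fold in which every true and predicted label is 1
y_true = np.array([1, 1, 1])
y_pred = np.array([1, 1, 1])

# Without labels, only the classes actually present are used: shape (1, 1)
shape_default = confusion_matrix(y_true, y_pred).shape
# With labels, the matrix always covers both classes: shape (2, 2)
shape_fixed = confusion_matrix(y_true, y_pred, labels=[0, 1]).shape
print(shape_default, shape_fixed)

# Scorer that could be passed as scoring=... to cross_val_score
specificity_scorer = make_scorer(specificity, greater_is_better=True)
```

If single-class folds are the cause, `model_selection.StratifiedKFold` may also help, since it keeps the class proportions roughly equal across folds.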