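Record a score table of accuracy, precision, recall, F1-score, and AUC for five models (logistic regression, SVM, decision tree, random forest, XGBoost), and plot their ROC curves. References: https://www.jianshu.com/p/5df19746daf9 https://blog.csdn.net/huacha__/article/details/81029680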
First, define the evaluation function.
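# Score five models (logistic regression, SVM, decision tree, random forest, XGBoost)
# on accuracy, precision, recall, F1-score and AUC, then plot their ROC curves.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

def eva(y_real, y_predict):
    """Return (acc, precision, recall, f1, auc) for binary labels."""
    acc = accuracy_score(y_real, y_predict)
    precision = precision_score(y_real, y_predict)  # positive class only (default average='binary')
    recall = recall_score(y_real, y_predict)        # positive class only
    # Note: this F1 is support-weighted over both classes, unlike precision/recall above.
    f1 = f1_score(y_real, y_predict, average='weighted')
    # Note: with hard 0/1 predictions this AUC reflects a single threshold;
    # pass predicted probabilities or decision scores for a threshold-free AUC.
    auc = roc_auc_score(y_real, y_predict)
    return acc, precision, recall, f1, auc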
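To make clear what the numbers returned by `eva` measure, the following sketch recomputes accuracy, precision, recall and F1 by hand from confusion-matrix counts. The helper name `binary_metrics` and the toy labels are illustrative assumptions, not part of the original code. It computes the binary (positive-class) F1, which can differ from the `average='weighted'` F1 used in `eva`.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    acc = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return acc, precision, recall, f1


# Toy data (made up for illustration): 3 TP, 3 TN, 1 FP, 1 FN
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(acc, precision, recall, f1)
```

On this toy input all four values equal 0.75, because the errors are symmetric (one false positive, one false negative).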
Evaluation metrics on the training and test sets
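import warnings
import pandas as pd

warnings.filterwarnings('ignore')  # one of "error", "ignore", "always", "default", "module", "once"
names = ["LR", "SVC", "DT", "RFC", "Xgb"]
evals = ["acc", "precision", "recall", "f1", "auc"]

# y_train_pred and y_pred are assumed to hold one prediction array per model,
# for the training and test sets respectively, in the same order as `names`.
scores = []
for pred in y_train_pred:
    score = eva(y_train, pred)
    scores.append(score)
df_train = pd.DataFrame(scores, columns=evals, index=names)

# Same table for the test set
df_test = pd.DataFrame([eva(y_test, pred) for pred in y_pred], columns=evals, index=names)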
Plot ROC curves for the training and test sets
from sklearn.metrics import auc, roc_curve
import matplotlib.pyplot as plt

# Training set: swap in y_train_pred and y_train below
# for pred in y_train_pred:
#     fpr, tpr, thresholds = roc_curve(y_train, pred)
# Test set: one figure per model, in the order given by `names`
for name, pred in zip(names, y_pred):
    fpr, tpr, thresholds = roc_curve(y_test, pred)
    roc_auc = auc(fpr, tpr)  # AUC is the area under the ROC curve
    # Draw the ROC curve
    plt.plot(fpr, tpr, 'b', label='%s (AUC = %0.2f)' % (name, roc_auc))
    plt.plot([0, 1], [0, 1], 'r--')  # chance-level diagonal
    plt.legend(loc='lower right')
    plt.xlim([-0.1, 1.1])
    plt.ylim([-0.1, 1.1])
    plt.xlabel('False Positive Rate')  # x-axis is FPR
    plt.ylabel('True Positive Rate')   # y-axis is TPR
    plt.title('Receiver operating characteristic example')
    plt.show()
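The plotting code passes hard 0/1 predictions to `roc_curve`, so each curve has only one intermediate point. The sketch below is a pure-Python illustration of how ROC points and the trapezoidal AUC are obtained by sweeping a threshold, and of how the result changes when continuous scores are used instead of hard labels. The function names `roc_points` and `trapezoid_auc` and the toy data are assumptions made for illustration; this is not scikit-learn's implementation.

```python
def roc_points(y_true, scores):
    """Sweep thresholds from high to low and return (fpr, tpr) lists."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(order):
        thr = scores[order[i]]
        # Consume every sample tied at this score before emitting a point
        while i < len(order) and scores[order[i]] == thr:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear ROC curve."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))


# Toy data (made up): true labels and hypothetical model scores
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.6, 0.4, 0.8, 0.9, 0.3]
hard = [1 if s >= 0.5 else 0 for s in scores]  # labels after a 0.5 cutoff

auc_scores = trapezoid_auc(*roc_points(y_true, scores))
auc_hard = trapezoid_auc(*roc_points(y_true, hard))
print(round(auc_scores, 3), round(auc_hard, 3))
```

With these toy values the score-based AUC is 8/9 while the hard-label AUC is 2/3. If the models expose `predict_proba` or `decision_function`, passing those outputs to `roc_curve` would give the fuller curve.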