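Algorithm Practice, Day 3
Data
The same data as in day01: data_all.csv
Task: Model Evaluation
For the 7 models (building on Task 1), record a score table of accuracy, precision, recall, F1-score, and AUC, and plot the ROC curves.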
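As a reminder of what the five scores measure, here is a minimal sketch that computes them by hand. The labels and scores are invented purely for illustration (they are not taken from data_all.csv), and the 0.5 threshold is an assumption.

```python
# Toy labels and predicted probabilities, made up for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1]

# Hard predictions from an assumed 0.5 threshold.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

# Confusion-matrix counts.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

accuracy = (tp + tn) / len(y_true)   # share of all samples predicted correctly
precision = tp / (tp + fp)           # share of predicted positives that are real
recall = tp / (tp + fn)              # share of real positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

# AUC: probability that a random positive gets a higher score than a
# random negative (ties count as half). It uses the scores, not y_pred.
pos = [s for t, s in zip(y_true, y_score) if t == 1]
neg = [s for t, s in zip(y_true, y_score) if t == 0]
wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
auc = wins / (len(pos) * len(neg))

print(accuracy, precision, recall, f1, auc)
```

Note that accuracy, precision, recall, and F1 depend on the chosen threshold, while AUC only depends on how the scores rank the samples.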
Code Implementation
Import packages
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import metrics
import matplotlib.pyplot as plt
Load the data
# Raw string so the backslashes in the Windows path are not treated as escape sequences
file_path = r'G:\DatawhaleWeek01\Data\data_all.csv'
row_data = pd.read_csv(file_path)
Split the dataset
X = row_data.drop(columns=['status']).values
y = row_data['status'].values
X_train,X_test,y_train,y_test = train_test_split(X, y, test_size=0.3,random_state=2018)
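The split above can be sanity-checked with a small sketch like the one below. It uses synthetic stand-in data (the shapes and the 25% positive rate are assumptions, not properties of data_all.csv) to show that a fixed random_state reproduces the same split. The stratify argument is an optional variation that is not used in the code above; it keeps the class ratio roughly equal in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real features and labels (hypothetical sizes).
rng = np.random.RandomState(0)
X_demo = rng.rand(100, 5)
y_demo = np.array([1] * 25 + [0] * 75)

# Same arguments as in the post: 30% test set, fixed seed.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=2018)

# Repeating the call with the same seed yields the identical split.
_, _, _, y_te_again = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=2018)

# Optional variation (not in the original code): stratify on the labels
# so that train and test keep about the same share of positives.
_, _, y_tr_s, y_te_s = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=2018, stratify=y_demo)

print(X_tr.shape, X_te.shape)
print("test positives, plain split:", int(y_te.sum()))
print("test positives, stratified:", int(y_te_s.sum()))
```

With imbalanced labels, a stratified split makes precision and recall on the test set less sensitive to how the random split happened to fall.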
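Define the scoring and ROC curve function
def get_scores(y_true, y_predict, y_predict_pro):
    # Threshold-based metrics, computed from the hard class predictions
    accuracy_score = metrics.accuracy_score(y_true, y_predict)
    precision_score = metrics.precision_score(y_true, y_predict)
    recall_score = metrics.recall_score(y_true, y_predict)
    f1_score = metrics.f1_score(y_true, y_predict)
    # AUC and the ROC curve use the predicted scores or probabilities
    auc_score = metrics.roc_auc_score(y_true, y_predict_pro)
    # Use the y_true argument rather than the global y_test, so the function works for any split
    test_fprs, test_tprs, test_thresholds = metrics.roc_curve(y_true, y_predict_pro)
    plt.plot(test_fprs, test_tprs)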
plt