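Model Building for Overdue Loan Prediction (1)
1. Data: financial data (not the raw data)
2. Task type: model building
Predict whether a loan user will become overdue ("status" is the target label: 0 means not overdue, 1 means overdue).
Main content: build three models (logistic regression, decision tree, and SVM) and evaluate their classification performance, using accuracy (ACC) as the metric.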
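ACC (accuracy) is the fraction of samples whose predicted label matches the true label. The sketch below uses made-up labels, not the loan data, to show how it can be computed by hand and checked against scikit-learn's `accuracy_score`; the arrays `y_true` and `y_pred` are purely illustrative.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels for illustration only (0 = not overdue, 1 = overdue).
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([0, 0, 1, 0, 0, 1, 1, 0])

# Accuracy = number of correct predictions / total number of predictions.
manual_acc = float(np.mean(y_true == y_pred))
sklearn_acc = accuracy_score(y_true, y_pred)

print("manual: %f, sklearn: %f" % (manual_acc, sklearn_acc))
```

For scikit-learn classifiers, the `score` method returns this same mean accuracy on the data passed to it.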
3. Code and comments
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC
## Import the required packages
df = pd.read_csv(r"D:\datawhale\12.17\data_all.csv")
y = df["status"]
X = df.drop(columns=["status"])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=2018)
## Load the data, separate the "status" label from the features, and split into training and test sets at a 7:3 ratio with random seed 2018
lr = LogisticRegression()
lr.fit(X_train, y_train)
y_train_predictions = lr.predict(X_train)
y_test_predictions = lr.predict(X_test)
## Build the logistic regression model and predict on the training and test sets
Svc = LinearSVC()
Svc.fit(X_train, y_train)
## Build the SVM (a linear SVM, via LinearSVC)
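Linear models such as `LinearSVC` are generally sensitive to the scale of the input features, so standardizing the features before fitting is a common step. The following is a minimal sketch of that idea on synthetic data, not on `data_all.csv`; the names `X_demo`, `y_demo`, and `scaled_svc` are assumptions made for illustration, and whether scaling changes the result on the loan data is not something this note establishes.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for the loan data (the real file is not used here).
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=2018)
X_demo[:, 0] *= 1000.0  # exaggerate the scale of one feature

Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.3, random_state=2018)

# The scaler is fitted on the training split only, then applied before the SVM.
scaled_svc = make_pipeline(StandardScaler(), LinearSVC())
scaled_svc.fit(Xtr, ytr)
scaled_acc = scaled_svc.score(Xte, yte)

print("scaled LinearSVC Acc: %f" % scaled_acc)
```

Wrapping the scaler and the classifier in one pipeline keeps test-set statistics out of the scaling step.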
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)
## Build the decision tree
lr_acc = lr.score(X_test, y_test)
Svc_acc = Svc.score(X_test, y_test)
dt_acc = dt.score(X_test, y_test)
print("LogisticRegression Acc: %f, SVM Acc: %f, tree Acc: %f" % (lr_acc, Svc_acc, dt_acc))
## Score the models on the test set and print the results
## LogisticRegression Acc: 0.748423, SVM Acc: 0.748423, tree Acc: 0.669937
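An accuracy value is easier to interpret next to a baseline that always predicts the most frequent class, particularly when the two classes are imbalanced. The sketch below shows one way to obtain such a baseline with scikit-learn's `DummyClassifier`. It runs on synthetic, deliberately imbalanced data rather than on the loan data, so the number it prints says nothing about the results above; `X_demo`, `y_demo`, and `baseline` are illustrative names.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data: roughly 75% class 0 and 25% class 1.
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=10, weights=[0.75, 0.25], random_state=2018
)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.3, random_state=2018)

# Always predicts the class that is most frequent in the training labels.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(Xtr, ytr)
baseline_acc = baseline.score(Xte, yte)

print("majority-class baseline Acc: %f" % baseline_acc)
```

On the real data, the same comparison could be made by fitting the baseline on `X_train`, `y_train` and scoring it on `X_test`, `y_test`.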