Five-fold cross-validation
Cross-validation
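Cross-validation (CV) splits the dataset into k equal parts (folds). In each of k rounds, k-1 folds are used as the training set and the remaining fold is used as the validation set, so every fold serves as the validation set exactly once. With k=5 this is five-fold cross-validation. In XGBoost, cross-validation can be run with xgboost.cv: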
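To make the splitting concrete, here is a minimal sketch of k-fold index generation in plain Python. The function name k_fold_indices and its parameters are hypothetical and used only for illustration; xgboost.cv does its own splitting internally and does not use this code.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, valid_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Split the shuffled indices into k folds whose sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(indices[start:start + size])
        start += size

    # Each fold is the validation set once; the other k-1 folds form the training set.
    for i in range(k):
        valid_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, valid_idx


splits = list(k_fold_indices(10, k=5))
for fold_no, (train_idx, valid_idx) in enumerate(splits):
    print(fold_no, len(train_idx), len(valid_idx))
```

With 10 samples and k=5, each round trains on 8 samples and validates on 2, and every sample appears in a validation set exactly once.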
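import xgboost as xgb

# Track the best result across the grid (assumes a higher metric is better, as with AUC).
best_auc, best_param, best_iter_round = 0.0, None, 0
for i, param in enumerate(param_grid):
    cv_result = xgb.cv(param, self.train_matrix,
                       num_boost_round=self.num_boost_round,  # maximum number of boosting rounds
                       nfold=self.nfold,  # number of folds
                       stratified=self.stratified,
                       metrics=self.metrics,  # evaluation metric(s) to track
                       early_stopping_rounds=self.early_stopping_rounds)  # stop when the metric stops improving
    # Mean validation AUC of the last kept round; assumes self.metrics is 'auc'.
    # Selecting the column by name avoids depending on the column order.
    cur_auc = cv_result['test-auc-mean'].iloc[-1]
    cur_iter_round = len(cv_result)
    if cur_auc > best_auc:
        best_auc, best_param, best_iter_round = cur_auc, param, cur_iter_round
    print('Param select {}, auc: {}, iter_round: {}, params: {}, now best auc: {}'
          .format(i, cur_auc, cur_iter_round, param, best_auc))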
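The loop above iterates over param_grid, whose construction is not shown. Judging by the printed params, it is a list of dicts. One possible way to build such a list is sketched below. The names search_space, fixed_params and build_param_grid are hypothetical, and the candidate values other than eta=0.1, max_depth=4 and subsample=0.5 (which appear in the output) are made up for illustration. The 'silent' key seen in the output is left out here, since newer XGBoost releases use 'verbosity' instead.

```python
from itertools import product

# Candidate values per hyperparameter (illustrative assumptions, not the author's grid).
search_space = {
    'eta': [0.1, 0.3],
    'max_depth': [4, 6],
    'subsample': [0.5, 0.8],
}
# Parameters shared by every combination.
fixed_params = {'objective': 'binary:logistic'}


def build_param_grid(search_space, fixed_params):
    """Return one param dict per combination of candidate values."""
    keys = sorted(search_space)
    grid = []
    for values in product(*(search_space[key] for key in keys)):
        param = dict(fixed_params)        # copy so each combination gets its own dict
        param.update(zip(keys, values))   # fill in this combination's values
        grid.append(param)
    return grid


param_grid = build_param_grid(search_space, fixed_params)
print(len(param_grid))
print(param_grid[0])
```

Each dict in the resulting list can be passed directly as the first argument of xgb.cv. The grid grows multiplicatively with the number of candidate values, so every extra value multiplies the number of cross-validation runs.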
Output
E:\PycharmProject\interrogative\venv\Scripts\python.exe E:/PycharmProject/interrogative/manage.py
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\dell\AppData\Local\Temp\jieba.cache
Loading model cost 0.596 seconds.
Prefix dict has been built successfully.
Param select 0, auc: 0.9873519999999999, iter_round: 270, params: {'eta': 0.1, 'max_depth': 4, 'objective': 'binary:logistic', 'silent': 0, 'subsample': 0.5}, now best auc: 0.9873519999999999
Param select 1