import lightgbm as lgb
from sklearn.metrics import roc_auc_score
lgb_train = lgb.Dataset(x_train,y_train)
lgb_eval = lgb.Dataset(x_test,y_test,reference = lgb_train)
params = {
'boosting_type':'gbdt', # boosting type (gradient boosted decision trees)
'objective':'binary',
'metric':{'auc'},
'num_leaves':32,
'learning_rate':0.01,
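'feature_fraction':0.9, # randomly select 90% of the features before training each tree
'bagging_fraction':0.8, # like feature_fraction, but samples rows; speeds up training and helps against overfitting
'bagging_freq':5, # perform bagging every 5 iterations
'verbose':0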
}
gbm = lgb.train(params,
lgb_train,
num_boost_round = 4000, #number of boosting iterations,
valid_sets = lgb_eval,
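# verbose_eval / early_stopping_rounds were removed from lgb.train() in LightGBM 4.0;
# the callbacks below are the equivalent form and also work in 3.3+
callbacks=[lgb.log_evaluation(period=250),
lgb.early_stopping(stopping_rounds=50)])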
y_pred = gbm.predict(x_test, num_iteration=gbm.best_iteration)
Data loaded through the LightGBM Python module is stored in a Dataset object.
Creating the validation data:
lgb_eval = lgb.Dataset(x_test,y_test,reference = lgb_train)
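In LightGBM, the validation data should be consistent with the training data (same format); passing reference = lgb_train makes the validation set reuse the feature binning of the training set.
Or, in an alternative form: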
lgb_eval = lgb_train.create_valid(x_test,y_test)