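Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/weixin_41089007/article/details/90510248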
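Data
import numpy as np
import lightgbm as lgb
from sklearn import metrics
from sklearn.model_selection import KFold, train_test_split

# data (features) and target (labels) are assumed to be pandas objects loaded beforehand
train_x, test_x, train_y, test_y = train_test_split(data, target, shuffle=True, random_state=2019)
X_train = train_x.values
X_test = test_x.values
y_train = train_y.values
y_test = test_y.values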
Method 1: Train directly
params = {
    'boosting_type': 'gbdt',
    'objective': 'multiclass',
    'num_class': 7,
    'metric': 'multi_error',
    'num_leaves': 120,
    'min_data_in_leaf': 100,
    'learning_rate': 0.06,
    'feature_fraction': 0.8,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'lambda_l1': 0.4,
    'lambda_l2': 0.5,
    'min_gain_to_split': 0.2,
    'verbose': -1,
}
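print('Training...')
trn_data = lgb.Dataset(X_train, y_train)
val_data = lgb.Dataset(X_test, y_test)
# Logging and early stopping are passed as callbacks (the verbose_eval and
# early_stopping_rounds arguments of lgb.train() were removed in LightGBM 4.0)
clf = lgb.train(params,
                trn_data,
                num_boost_round=1000,
                valid_sets=[trn_data, val_data],
                callbacks=[lgb.log_evaluation(100),
                           lgb.early_stopping(100)])
print('Predicting...')
y_prob = clf.predict(X_test, num_iteration=clf.best_iteration)
# Predicted class = index of the highest probability in each row
y_pred = [list(x).index(max(x)) for x in y_prob]
print("Accuracy score: {:<8.5f}".format(metrics.accuracy_score(y_test, y_pred)))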
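The prediction step above picks, for each row, the class with the highest probability and then scores the result by accuracy. A minimal sketch of that step with NumPy, assuming a small made-up probability matrix and made-up labels in place of the real clf.predict output (both are hypothetical, not from the original data):

```python
import numpy as np

# Hypothetical stand-in for clf.predict(X_test):
# one row per sample, one column per class.
y_prob = np.array([
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
])
y_true = np.array([1, 0, 1])  # made-up labels for illustration

# Index of the largest probability in each row is the predicted class.
y_pred = np.argmax(y_prob, axis=1)

# Accuracy is the fraction of rows where prediction and label agree.
accuracy = float(np.mean(y_pred == y_true))
print(y_pred, accuracy)
```

np.argmax(..., axis=1) gives the same labels as the per-row list(x).index(max(x)) lookup, but works on the whole array at once.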
Method 2: Add cross-validation
param = {
    'boosting_type': 'gbdt',
    'objective': 'multiclass',
    'num_class': 7,
    'metric': 'multi_error',
    'num_leaves': 300,
    'min_data_in_leaf': 500,
    'learning_rate': 0.01,
    'feature_fraction': 0.8,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'lambda_l1': 0.4,
    'lambda_l2': 0.5,
    'min_gain_to_split': 0.2,
    'verbose': -1,
    'num_threads': 4,
}
# 5-fold cross-validation
# random_state only has an effect with shuffle=True; recent scikit-learn rejects it otherwise
folds = KFold(n_splits=5, shuffle=True, random_state=2019)
oof = np.zeros([len(X_train), 7])
predictions = np.zeros([len(X_test), 7])
for fold_, (trn_idx, val_idx) in enumerate(folds.split(X_train, y_train)):
    print("fold n°{}".format(fold_ + 1))
    trn_data = lgb.Dataset(X_train[trn_idx], y_train[trn_idx])
    val_data = lgb.Dataset(X_train[val_idx], y_train[val_idx])
    num_round = 1000
    # Logging and early stopping are passed as callbacks (the verbose_eval and
    # early_stopping_rounds arguments of lgb.train() were removed in LightGBM 4.0)
    clf = lgb.train(param,
                    trn_data,
                    num_round,
                    valid_sets=[trn_data, val_data],
                    callbacks=[lgb.log_evaluation(100),
                               lgb.early_stopping(100)])
    # oof[val_idx] = clf.predict(X_train[val_idx], num_iteration=clf.best_iteration)
    # Average the test-set class probabilities over the folds
    predictions += clf.predict(X_test, num_iteration=clf.best_iteration) / folds.n_splits
# print(predictions)
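The loop above accumulates test-set probabilities fold by fold, and the commented-out oof line would store out-of-fold predictions for the training rows. A minimal sketch of that bookkeeping, assuming a hypothetical fake_predict function in place of a trained model and contiguous validation blocks instead of KFold (the sizes are made up as well):

```python
import numpy as np

n_train, n_test, n_class, n_splits = 10, 4, 3, 5
rng = np.random.default_rng(2019)

def fake_predict(n_rows):
    # Hypothetical stand-in for clf.predict: random rows that each sum to 1.
    p = rng.random((n_rows, n_class))
    return p / p.sum(axis=1, keepdims=True)

oof = np.zeros((n_train, n_class))
predictions = np.zeros((n_test, n_class))

# Contiguous validation blocks, similar to KFold without shuffling.
for val_idx in np.array_split(np.arange(n_train), n_splits):
    # Each training row is validated in exactly one fold, so it is filled once.
    oof[val_idx] = fake_predict(len(val_idx))
    # Adding 1/n_splits of each fold's output yields the mean over all folds.
    predictions += fake_predict(n_test) / n_splits

# Turn averaged probabilities into class labels.
oof_pred = np.argmax(oof, axis=1)
test_pred = np.argmax(predictions, axis=1)
```

With real models, oof_pred can be compared against y_train for a cross-validated accuracy estimate, and test_pred against y_test for the averaged ensemble.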
Original author: CSDN blogger 睡熊猛醒