Model Fusion: Boosting Methods

1. GBDT Model

  • Use grid search to find the GBDT model with the optimal hyperparameters, make predictions on the data, and evaluate the model with the MSE metric. The code is as follows:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, RepeatedKFold
from sklearn.metrics import mean_squared_error

# number of k-fold splits and repeats
splits = 5
repeats = 1

gbdt = GradientBoostingRegressor()
rkfold = RepeatedKFold(n_splits=splits, n_repeats=repeats)

param_grid = {
    'n_estimators': [150, 250, 350],
    'max_depth': [1, 2, 3],
    'min_samples_split': [5, 6, 7]
}

gsearch = GridSearchCV(gbdt, param_grid, cv=rkfold, scoring='neg_mean_squared_error', verbose=1, return_train_score=True)
gsearch.fit(X, y)

model = gsearch.best_estimator_

y_pred = model.predict(X)

print('mse: ', mean_squared_error(y, y_pred))

"""
0.029349091720938178
"""

2. XGBoost Model

  • Use grid search to find the XGBoost model with the optimal hyperparameters, make predictions on the data, and evaluate the model with the MSE metric. The code is as follows:
from xgboost import XGBRegressor
from sklearn.model_selection import GridSearchCV, RepeatedKFold
from sklearn.metrics import mean_squared_error

# number of k-fold splits and repeats
splits = 5
repeats = 1

xgb = XGBRegressor(objective='reg:squarederror')
rkfold = RepeatedKFold(n_splits=splits, n_repeats=repeats)

param_grid = {
    'n_estimators': [100, 200, 300, 400, 500],
    'max_depth': [1, 2, 3]
}

gsearch = GridSearchCV(xgb, param_grid, cv=rkfold, scoring='neg_mean_squared_error', verbose=1, return_train_score=True)
gsearch.fit(X, y)

model = gsearch.best_estimator_

y_pred = model.predict(X)

print('mse: ', mean_squared_error(y, y_pred))

"""
mse:  0.028142921912724782
"""

3. Random Forest

  • Use grid search to find the random forest model with the optimal hyperparameters, make predictions on the data, and evaluate the model with the MSE metric. The code is as follows:
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, RepeatedKFold
from sklearn.metrics import mean_squared_error

# number of k-fold splits and repeats
splits = 5
repeats = 1

# X, y = get_training_data_omitoutliers()  # required, otherwise the fit fails; see the Alibaba Cloud Tianchi competition walkthrough for this function's implementation

rfr = RandomForestRegressor()
rkfold = RepeatedKFold(n_splits=splits, n_repeats=repeats)

param_grid = {
    'n_estimators': [100, 150, 200],
    'max_features': [8, 12, 16, 20, 24],
    'min_samples_split': [2, 4, 6]
}

gsearch = GridSearchCV(rfr, param_grid, cv=rkfold, scoring='neg_mean_squared_error', verbose=1, return_train_score=True)
gsearch.fit(X, y)

model = gsearch.best_estimator_

y_pred = model.predict(X)

print('mse: ', mean_squared_error(y, y_pred))

"""
mse:  0.013881205786668335
"""