Gradient boosting trees: implementation, feature importance, partial dependence, and notes on the related packages (compiled)

Latest recommended article published on 2023-12-23 14:14:13

Leon_124

Categories: Python, Machine Learning

Article link: https://blog.csdn.net/SDAU_LY124/article/details/109018390


Principles:

https://www.cnblogs.com/pinard/p/6140514.html

https://zhuanlan.zhihu.com/p/108641227

Examples:

https://zhuanlan.zhihu.com/p/40356430

https://www.pythonf.cn/read/5079

Random train/test splitting and parameter explanations:

https://www.cnblogs.com/pinard/p/6143927.html

https://www.cnblogs.com/Yanjy-OnlyOne/p/11288098.html
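As a rough illustration of what the links above cover, the sketch below splits a data set at random, fits a gradient boosting regressor, and reads its feature importances. The synthetic data from `make_regression` and the names `X_train`, `X_test`, and `model` are assumptions made for this example; the post's own data set is not shown.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; the original data set is not shown).
X, y = make_regression(n_samples=300, n_features=5, n_informative=3,
                       noise=5.0, random_state=10)

# Hold out 25% of the rows at random; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=10)

model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                  max_depth=3, random_state=10)
model.fit(X_train, y_train)

# R^2 on the held-out rows, and impurity-based importances (normalized to sum to 1).
test_r2 = model.score(X_test, y_test)
importances = model.feature_importances_
print("test R^2:", round(test_r2, 3))
print("feature importances:", np.round(importances, 3))
```

The importances here are the impurity-based values that scikit-learn exposes as `feature_importances_`; they rank features relative to each other and say nothing about the direction of an effect.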

Hyperparameter tuning

Grid search with GridSearchCV: https://zhuanlan.zhihu.com/p/37310443

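# Hyperparameter tuning with an exhaustive grid search
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Candidate values to search (3 * 5 * 3 = 45 combinations)
cv_params = {'learning_rate': [0.1, 0.05, 0.01], 'max_depth': [1, 3, 5, 7, 10], 'n_estimators': [100, 200, 300]}
# Parameters held fixed during the search
ind_params = {'random_state': 10}

# 5-fold CV scored by negative MSE (higher is better); n_jobs=-1 uses all cores
optimized_GBM = GridSearchCV(GradientBoostingRegressor(**ind_params), cv_params,
                             scoring='neg_mean_squared_error', cv=5, n_jobs=-1, verbose=10)

# X_pr_train, y_pr_train: the training features and targets prepared beforehand
optimized_GBM.fit(X_pr_train, y_pr_train)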
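Once a grid search has been fitted, the results are read from attributes of the search object. The sketch below shows this on a deliberately small grid so it runs quickly; the synthetic data and the names `small_grid` and `search` are assumptions for illustration, not the post's own setup.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (an assumption; the original data set is not shown).
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=10)

# A reduced grid: 2 * 2 * 2 = 8 candidate settings.
small_grid = {'learning_rate': [0.1, 0.05],
              'max_depth': [1, 3],
              'n_estimators': [50, 100]}

search = GridSearchCV(GradientBoostingRegressor(random_state=10), small_grid,
                      scoring='neg_mean_squared_error', cv=3)
search.fit(X, y)

best_params = search.best_params_        # winning combination
best_mse = -search.best_score_           # flip the sign to recover the MSE
n_candidates = len(search.cv_results_['params'])
best_model = search.best_estimator_      # refitted on all of X, y (refit=True by default)

print("best parameters:", best_params)
print("best CV MSE:", round(best_mse, 2))
print("candidates evaluated:", n_candidates)
```

Because the scorer is the negated MSE, `best_score_` is negative; negating it gives the error in the original units squared.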

Randomized search with RandomizedSearchCV: https://blog.csdn.net/juezhanangle/article/details/80051256

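# Hyperparameter tuning with a randomized search
from time import time
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

param_dist = {'learning_rate': [0.1, 0.05, 0.01], 'max_depth': [10, 50, 100], 'n_estimators': [100, 200, 300]}
ind_params = {'random_state': 10}

# Sample 20 of the 27 possible combinations
n_iter_search = 20
# Score with negative MSE, since the estimator is a regressor ('roc_auc' applies to classifiers)
random_search = RandomizedSearchCV(GradientBoostingRegressor(**ind_params), param_distributions=param_dist,
                                   n_iter=n_iter_search, cv=3, scoring='neg_mean_squared_error', n_jobs=-1)

start = time()
random_search.fit(X_nopr_train, y_nopr_train)

print("RandomizedSearchCV took %.2f seconds for %d candidates"
      " parameter settings." % ((time() - start), n_iter_search))
# Best parameter combination and its mean cross-validated score
print(random_search.best_params_, random_search.best_score_)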
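A randomized search can also draw from continuous distributions instead of fixed lists, which is where it tends to differ most from a grid search. The following is a minimal sketch under that assumption: the `scipy.stats` distributions, their ranges, the synthetic data, and the small top-3 listing are all choices made for this example rather than anything taken from the post.

```python
import numpy as np
from scipy.stats import loguniform, randint
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (an assumption; the original data set is not shown).
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=10)

# Distributions to sample from; the ranges are illustrative only.
param_dist = {
    'learning_rate': loguniform(0.01, 0.3),  # continuous, log-scaled
    'max_depth': randint(1, 6),              # integers 1..5
    'n_estimators': randint(50, 151),        # integers 50..150
}

search = RandomizedSearchCV(GradientBoostingRegressor(random_state=10),
                            param_distributions=param_dist, n_iter=5, cv=3,
                            scoring='neg_mean_squared_error', random_state=10)
search.fit(X, y)

# List the three best-ranked candidates with their mean CV score.
results = search.cv_results_
top = np.argsort(results['rank_test_score'])[:3]
for i in top:
    print(results['rank_test_score'][i],
          round(results['mean_test_score'][i], 2),
          results['params'][i])

best_params = search.best_params_
```

Setting `random_state` on the search itself makes the sampled candidates repeatable across runs.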

Bayesian optimization: a better way to tune hyperparameters

https://zhuanlan.zhihu.com/p/29779000

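# Bayesian optimization (requires the bayesian-optimization package)
from bayes_opt import BayesianOptimization
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Objective: mean 5-fold CV score for one hyperparameter setting
def rf_cv(n_estimators, min_samples_split, learning_rate, max_depth):
    val = cross_val_score(
        GradientBoostingRegressor(n_estimators=int(n_estimators),  # the optimizer proposes floats
            min_samples_split=int(min_samples_split),
            learning_rate=min(learning_rate, 0.999),  # float
            max_depth=int(max_depth),
            random_state=2
        ),
        # Negative MSE suits a regressor; parallelism is set here because
        # BayesianOptimization itself takes no n_jobs argument
        X_nopr_train, y_nopr_train, scoring='neg_mean_squared_error', cv=5, n_jobs=-1
    ).mean()
    return val

# Build the Bayesian optimization object with the search bounds
rf_bo = BayesianOptimization(
        rf_cv,
        {'n_estimators': (100, 300),
        'min_samples_split': (2, 25),
        'learning_rate': (0.01, 0.999),
        'max_depth': (10, 150)}
    )

rf_bo.maximize()

# Best score found and the parameters that produced it
rf_bo.max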

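Official documentation for plot_partial_dependence() (deprecated in scikit-learn 1.0 and removed in 1.2 in favor of PartialDependenceDisplay.from_estimator):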

https://scikit-learn.org/stable/modules/generated/sklearn.inspection.plot_partial_dependence.html
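To make the idea of partial dependence concrete without plotting, the sketch below computes the averaged dependence of the prediction on a single feature using `sklearn.inspection.partial_dependence`. The synthetic data, the choice of feature 0, and the 20-point grid are assumptions made for this example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

# Synthetic stand-in data (an assumption; the original data set is not shown).
X, y = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=10)
model = GradientBoostingRegressor(n_estimators=50, random_state=10).fit(X, y)

# Average model output as feature 0 is swept over a 20-point grid,
# with the other features marginalized out.
pd_result = partial_dependence(model, X, features=[0],
                               grid_resolution=20, kind="average")
averaged = pd_result["average"]   # shape: (n_outputs, n_grid_points)

print("partial dependence shape:", averaged.shape)
print("first grid values:", averaged[0][:3])
```

In recent scikit-learn versions, the same curve can be drawn with `PartialDependenceDisplay.from_estimator(model, X, [0])`, which needs matplotlib installed.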
