调研 auto-sklearn 2.0

This post reviews auto-sklearn 2.0's hyperparameter-optimization strategy under tight time constraints, including early stopping and intermittent results retrieval to prevent overfitting and improve efficiency. The hyperparameter space is cut from 110 to 42, focusing on iterative models. It also covers building the BO model on the highest available budget and the use of importance sampling, with the goal of robust configuration selection across diverse datasets.

To mitigate this risk, for algorithms that can be trained iteratively (e.g., gradient boosting and linear models trained with stochastic gradient descent), we implemented two measures.

  1. Firstly, we allow a pipeline to stop training based on a heuristic at any time, i.e. early stopping, which prevents overfitting.
  2. Secondly, we make use of intermittent results retrieval, e.g., saving the results at checkpoints spaced at geometrically increasing iteration numbers, thereby ensuring that every evaluation returns a performance and thus yields information for the optimizer.

Vocabulary: intermittent results retrieval — periodically saving partial results and reading them back; geometrically increasing iteration numbers — checkpoints at 1, 2, 4, 8, …

Current hyperparameter optimization algorithms can cope with such spaces, given enough time, but, in this work, we consider a heavily time-bounded setting. Therefore, we reduced our space to 42 hyperparameters only including iterative models to benefit from the early stopping and intermittent results retrieval.

In other words, AS2's full space has 153 hyperparameters, 43 more than AS1's 110. An algorithm like SMAC can certainly optimize such a space given enough time, but under a heavily time-bounded setting the space is reduced to 42 hyperparameters, keeping only iterative algorithms.
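As a rough illustration of "keeping only iterative algorithms", one could filter candidate estimators by whether they support iterative fitting (`partial_fit` or `warm_start`); the candidate set and the helper below are hypothetical, not the paper's selection code:

```python
# Hypothetical sketch: keep only estimators that can be trained iteratively,
# so early stopping and intermittent results retrieval apply to all of them.
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.svm import SVC

candidates = {
    "sgd": SGDClassifier,               # iterative via partial_fit
    "gb": GradientBoostingClassifier,   # iterative via warm_start
    "rf": RandomForestClassifier,       # iterative via warm_start
    "svc": SVC,                         # not iterative -> dropped
}

def supports_iterative_fit(cls):
    est = cls()
    return hasattr(est, "partial_fit") or "warm_start" in est.get_params()

iterative_only = {k: v for k, v in candidates.items() if supports_iterative_fit(v)}
print(sorted(iterative_only))  # -> ['gb', 'rf', 'sgd']
```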

Note (open questions): Is AS2 the engineering realization of PoSH-Auto-sklearn? Does it use the BOHB algorithm? Is the number of model starts 42 / 0.15 ≈ 280?

We build the model for BO on the highest available budget where we have observed the performance of $\frac{|\Lambda|}{2}$ pipelines.

This sentence carries a lot of information. "Highest available budget" means the largest budget for which enough observations exist. Key follow-up: trace the origin of $\frac{|\Lambda|}{2}$ by rereading the BOHB paper and code.
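A minimal sketch of the rule as quoted, assuming $|\Lambda|$ is the number of hyperparameters and $\frac{|\Lambda|}{2}$ is the observation threshold; the function name and data layout below are made up for illustration:

```python
# Sketch: pick the highest budget with at least |Lambda|/2 observed
# configurations, and fit the BO model on those observations only.
def modeling_budget(observations, n_hyperparameters):
    """observations maps budget -> list of observed performances."""
    threshold = n_hyperparameters / 2
    eligible = [b for b, runs in observations.items() if len(runs) >= threshold]
    return max(eligible) if eligible else None

# Toy data: with 42 hyperparameters the threshold is 21 observations,
# so budget 16 (only 10 runs) is not yet eligible.
obs = {1: [0.6] * 30, 4: [0.7] * 25, 16: [0.8] * 10}
print(modeling_budget(obs, 42))  # -> 4
```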

SH potentially provides large speedups, but it could also too aggressively cut away good configurations that need a higher budget to perform best.

Thus, we expect SH to work best for large datasets, for which there is not enough time to train many ML pipelines for the full budget, but for which training a ML pipeline on a small budget already yields a good indication of the generalization error.
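Plain successive halving, as described above, can be sketched like this. It is a toy version with an assumed halving rate η = 2, not the BOHB/auto-sklearn implementation:

```python
# Toy successive halving: evaluate all configs on a small budget,
# keep the best 1/eta fraction, multiply the budget by eta, repeat.
def successive_halving(configs, evaluate, min_budget=1, eta=2, rounds=3):
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        scores = {c: evaluate(c, budget) for c in survivors}
        survivors = sorted(survivors, key=scores.get, reverse=True)
        survivors = survivors[: max(1, len(survivors) // eta)]
        budget *= eta                     # survivors get a larger budget
    return survivors

# Toy objective: config 10 is best regardless of budget.
winners = successive_halving(range(16), lambda c, b: -abs(c - 10))
print(winners)
```

The aggressiveness mentioned in the quote is visible here: a configuration that only shines at a large budget would be eliminated in the first round, before it ever sees one.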

4Paradigm modified SH, replacing the halving step with importance sampling and also drawing on samples from other brackets.

Therefore, here we propose a meta-feature-free approach which does not warmstart with a set of configurations specific to a new dataset, but which uses a portfolio – a set of complementary configurations that covers as many diverse datasets as possible and minimizes the risk of failure when facing a new task.
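A hypothetical greedy sketch of how such a complementary portfolio can be built offline; the loss-matrix layout and the greedy criterion are assumptions for illustration, not the paper's exact algorithm:

```python
# Sketch: greedily add the configuration that most reduces the total loss
# across meta-datasets, where each dataset counts only its best portfolio
# member -- this rewards complementary rather than individually-best configs.
def greedy_portfolio(loss, candidates, datasets, size=4):
    """loss[c][d] is the validation loss of configuration c on meta-dataset d."""
    portfolio = []
    for _ in range(size):
        def portfolio_loss(extra):
            return sum(min(loss[c][d] for c in portfolio + [extra])
                       for d in datasets)
        portfolio.append(min(candidates, key=portfolio_loss))
    return portfolio

# Toy matrix: "A" and "B" each dominate one dataset; "C" is mediocre on both.
loss = {"A": {0: 0.1, 1: 0.9}, "B": {0: 0.9, 1: 0.1}, "C": {0: 0.5, 1: 0.5}}
print(greedy_portfolio(loss, ["A", "B", "C"], [0, 1], size=2))
```

Note how "C", despite being a reasonable all-rounder, never enters the portfolio: once "A" is chosen, the complementary "B" lowers the total loss more.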

Training set

experiment_scripts/portfolio/portfolio_util.py


_training_task_ids = [
    232, 236, 241, 245, 253, 254, 256, 258, 260, 262, 267, 271, 273, 275, 279, 288, 336, 340, 2119,
    2120, 2121, 2122, 2123, 2125, 2356, 3044, 3047, 3048, 3049, 3053, 3054, 3055, 75089, 75092,
    75093, 75098, 75100, 75108, 75109, 75112, 75114, 75115, 75116, 75118, 75120, 75121, 75125,
    75126, 75129, 75131, 75133, 75134, 75136, 75139, 75141, 75142, 75143, 75146, 75147, 75148,
    75149,
    # … (list truncated in the original)
]