TPOT (Tree-based Pipeline Optimization Tool) API Overview

故园稻香

Tags: python, machine learning, artificial intelligence, TPOT, AML

Article link: https://blog.csdn.net/sjtulgl/article/details/129382592

Contents

Introduction to TPOT
TPOT API

Introduction to TPOT

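TPOT is a Python automated machine learning (AML) tool that uses genetic programming to optimize machine learning pipelines.
When TPOT finishes searching, or reaches its maximum allowed time, it can provide the Python code for the best pipeline it found.
TPOT is built on top of scikit-learn (sklearn), so its API follows a similar style.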

TPOT API

Classification

Signature:

tpot.TPOTClassifier(generations=100, population_size=100,
                          offspring_size=None, mutation_rate=0.9,
                          crossover_rate=0.1,
                          scoring='accuracy', cv=5,
                          subsample=1.0, n_jobs=1,
                          max_time_mins=None, max_eval_time_mins=5,
                          random_state=None, config_dict=None,
                          template=None,
                          warm_start=False,
                          memory=None,
                          use_dask=False,
                          periodic_checkpoint_folder=None,
                          early_stop=None,
                          verbosity=0,
                          disable_update_check=False,
                          log_file=None
                          )
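As an illustration of calling this constructor, the sketch below restricts the search space through config_dict and caps the run with max_time_mins. It assumes the classic TPOT 0.x API described in this article; `CUSTOM_CONFIG` and `build_classifier` are hypothetical names introduced here, and the chosen estimators and hyperparameter values are only examples.

```python
# Hypothetical restricted search space: keys are import paths of
# scikit-learn estimators, values map hyperparameter names to candidates.
CUSTOM_CONFIG = {
    "sklearn.naive_bayes.GaussianNB": {},
    "sklearn.naive_bayes.BernoulliNB": {
        "alpha": [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0],
        "fit_prior": [True, False],
    },
}


def build_classifier(**overrides):
    """Create a TPOTClassifier limited to the estimators in CUSTOM_CONFIG."""
    # Imported here so that loading this module does not require TPOT.
    from tpot import TPOTClassifier

    settings = {
        "generations": 5,          # small budget, illustrative only
        "population_size": 20,
        "config_dict": CUSTOM_CONFIG,
        "max_time_mins": 10,       # stop after roughly ten minutes
        "random_state": 42,
        "verbosity": 2,
    }
    settings.update(overrides)
    return TPOTClassifier(**settings)
```

Calling `build_classifier(cv=3)` would return an unfitted estimator with the cross-validation setting overridden; the search itself only starts when `fit` is called.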

Parameters:

generations: int or None, optional (default=100) [Number of iterations of the optimization process. If set to None, max_time_mins must be defined.]
population_size: int, optional (default=100) [Number of individuals retained in each generation.]
offspring_size: int, optional (default=None) [Number of offspring produced in each generation. Defaults to population_size.]
mutation_rate: float, optional (default=0.9) [Mutation rate.]
crossover_rate: float, optional (default=0.1) [Crossover rate. mutation_rate + crossover_rate must not exceed 1.]
scoring: string or callable, optional (default='accuracy') [Evaluation metric. Built-in metrics include: 'accuracy', 'adjusted_rand_score', 'average_precision', 'balanced_accuracy', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'neg_log_loss', 'precision' etc. (suffixes apply as with 'f1'), 'recall' etc. (suffixes apply as with 'f1'), 'jaccard' etc. (suffixes apply as with 'f1'), 'roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'roc_auc_ovr_weighted', 'roc_auc_ovo_weighted'. A custom scoring function with the signature scorer(estimator, X, y) can also be passed.]
cv: int, cross-validation generator, or an iterable, optional (default=5)
subsample: float, optional (default=1.0)
n_jobs: integer, optional (default=1)
max_time_mins: integer or None, optional (default=None) [Time limit for the whole search, in minutes.]
max_eval_time_mins: float, optional (default=5) [Maximum time allowed for evaluating a single pipeline, in minutes.]
random_state: integer or None, optional (default=None)
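config_dict: Python dictionary, string, or None, optional (default=None) [Accepted inputs: (1) a custom configuration dictionary; (2) 'TPOT light', which uses only fast models; (3) 'TPOT MDR', a configuration for genomics studies; (4) 'TPOT sparse', a configuration dictionary that includes a one-hot encoder and supports sparse matrices; (5) None, which uses the default configuration.]
template: string (default=None) [Template of a predefined pipeline structure.]
warm_start: boolean, optional (default=False) [Flag indicating whether to reuse the population from the previous call to fit. This makes it possible to stop a search midway, inspect the results, and then continue searching.]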
memory: a joblib.Memory object or string, optional (default=None)
use_dask: boolean, optional (default: False)
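periodic_checkpoint_folder: path string, optional (default: None) [Useful when (1) TPOT is interrupted unexpectedly; (2) you want to track the progress of the search; (3) you want to grab pipelines while the optimization is still running.]
early_stop: integer, optional (default: None) [Number of generations without improvement after which the optimization process stops.]
verbosity: integer, optional (default=0) [0: print nothing; 1: print minimal information; 2: print more information and show a progress bar; 3: print everything and show a progress bar.]
disable_update_check: boolean, optional (default=False) [Whether to disable the TPOT version check. When the check is enabled, TPOT reports if a newer version is available.]
log_file: file-like class (io.TextIOWrapper or io.StringIO) or string, optional (default: None) [File to which the progress output is written.]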
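The constraints among generations, population_size, offspring_size, max_time_mins and the two rates can be checked before committing to a long search. The helpers below are a minimal sketch and are not part of TPOT: `total_pipelines` and `check_search_settings` are hypothetical names. The pipeline count uses the formula stated in the TPOT documentation, population_size + generations × offspring_size.

```python
def total_pipelines(generations, population_size=100, offspring_size=None):
    """Pipelines evaluated by a search that runs a fixed number of generations."""
    if offspring_size is None:
        # offspring_size defaults to population_size
        offspring_size = population_size
    return population_size + generations * offspring_size


def check_search_settings(generations=100, max_time_mins=None,
                          mutation_rate=0.9, crossover_rate=0.1):
    """Raise ValueError if the settings break the constraints listed above."""
    if generations is None and max_time_mins is None:
        raise ValueError("generations=None requires max_time_mins to be set")
    if mutation_rate + crossover_rate > 1.0:
        raise ValueError("mutation_rate + crossover_rate must not exceed 1")
    return True
```

With the defaults (100 generations, 100 individuals), `total_pipelines(100)` gives 10100 pipelines, each of which is cross-validated cv times, which is why small values are common for quick experiments.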

Attributes:

fitted_pipeline_: scikit-learn Pipeline object [The best pipeline found.]
pareto_front_fitted_pipelines_: Python dictionary [Only available when verbosity=3.]
evaluated_individuals_: Python dictionary
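To illustrate how evaluated_individuals_ might be inspected after a search, the sketch below ranks the evaluated pipelines by score. It assumes that the dictionary is keyed by the pipeline's string representation and that each value is a dictionary containing an 'internal_cv_score' entry, as in recent TPOT 0.x releases; if your version stores the score differently, pass another `score_key`. `top_pipelines` is a hypothetical helper, not a TPOT function.

```python
def top_pipelines(evaluated_individuals, k=3, score_key="internal_cv_score"):
    """Return the k best (pipeline_string, score) pairs, best first."""
    ranked = sorted(
        evaluated_individuals.items(),
        key=lambda item: item[1][score_key],  # assumed layout of each entry
        reverse=True,
    )
    return [(name, info[score_key]) for name, info in ranked[:k]]
```

After fitting, a call such as `top_pipelines(tpot.evaluated_individuals_, k=5)` would list the five highest-scoring candidates.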

Functions:

fit(features, classes, sample_weight=None, groups=None)
predict(features)
predict_proba(features)
score(testing_features, testing_classes)
export(output_file_name)
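The functions above are typically used together in a fit, score, predict, export sequence. The following is a minimal sketch of that workflow on the scikit-learn digits dataset, assuming the classic TPOT 0.x API described in this article. `SEARCH_SETTINGS`, `run_digits_search` and the output file name are hypothetical, and the small budget is chosen only so the search finishes quickly.

```python
# Small search budget for illustration; real searches usually need more.
SEARCH_SETTINGS = {
    "generations": 5,
    "population_size": 20,
    "cv": 5,
    "random_state": 42,
    "verbosity": 2,
}


def run_digits_search(export_path="tpot_digits_pipeline.py"):
    """Fit TPOT on the digits data and export the best pipeline as code."""
    # Imported here so that loading this module does not require TPOT.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from tpot import TPOTClassifier

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=0.75, random_state=42
    )

    tpot = TPOTClassifier(**SEARCH_SETTINGS)
    tpot.fit(X_train, y_train)               # run the search
    test_score = tpot.score(X_test, y_test)  # evaluated with the `scoring` metric
    predictions = tpot.predict(X_test)       # labels from the best pipeline
    tpot.export(export_path)                 # write the best pipeline as Python code
    return test_score, predictions
```

Calling `run_digits_search()` starts the search. predict_proba is left out of the sketch because it only works when the final estimator of the best pipeline supports probability estimates.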

Regression

Signature:

tpot.TPOTRegressor(generations=100, population_size=100,
                         offspring_size=None, mutation_rate=0.9,
                         crossover_rate=0.1,
                         scoring='neg_mean_squared_error', cv=5,
                         subsample=1.0, n_jobs=1,
                         max_time_mins=None, max_eval_time_mins=5,
                         random_state=None, config_dict=None,
                         template=None,
                         warm_start=False,
                         memory=None,
                         use_dask=False,
                         periodic_checkpoint_folder=None,
                         early_stop=None,
                         verbosity=0,
                         disable_update_check=False)

Parameters: (only those that differ from the classification task are listed)

scoring: string or callable, optional (default='neg_mean_squared_error') [Built-in metrics include: 'neg_median_absolute_error', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'r2'.]
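The error-based regression metrics are negated so that a higher score always means a better pipeline. The sketch below shows a custom callable that follows the same convention, using the scorer(estimator, X, y) signature mentioned for the classifier. `neg_rmse_scorer` and `build_regressor` are hypothetical names, and the example assumes the classic TPOT 0.x API.

```python
import math


def neg_rmse_scorer(estimator, X, y):
    """Negative root mean squared error; higher (closer to 0) is better."""
    predictions = estimator.predict(X)
    squared_errors = [(float(p) - float(t)) ** 2 for p, t in zip(predictions, y)]
    return -math.sqrt(sum(squared_errors) / len(squared_errors))


def build_regressor(**overrides):
    """Create a TPOTRegressor that ranks pipelines with neg_rmse_scorer."""
    # Imported here so that loading this module does not require TPOT.
    from tpot import TPOTRegressor

    settings = {
        "generations": 5,            # small budget, illustrative only
        "population_size": 20,
        "scoring": neg_rmse_scorer,  # callable instead of a metric name
        "random_state": 42,
    }
    settings.update(overrides)
    return TPOTRegressor(**settings)
```

The scorer only relies on the estimator exposing `predict`, so it can be tried on any fitted scikit-learn style model before being handed to TPOT.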
