CatBoost official English documentation: https://catboost.ai/docs/concepts/python-reference_parameters-list.html
Training parameters
Python package training parameters
Several parameters have aliases. For example, the iterations parameter has the following synonyms: num_boost_round, n_estimators, num_trees. Simultaneous usage of different names of one parameter raises an error.
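The alias rule can be sketched in plain Python. The helper below is hypothetical and is not part of the CatBoost package. It maps the aliases listed on this page to their canonical names and rejects a parameter that is passed under two different names. The use of ValueError is an assumption made for the sketch; the library raises its own error type.

```python
# Hypothetical helper, not part of the catboost package.
# Alias -> canonical name, taken from the aliases listed on this page.
ALIASES = {
    "num_boost_round": "iterations",
    "n_estimators": "iterations",
    "num_trees": "iterations",
    "eta": "learning_rate",
    "random_state": "random_seed",
    "reg_lambda": "l2_leaf_reg",
    "max_depth": "depth",
    "colsample_bylevel": "rsm",
    "max_bin": "border_count",
    "objective": "loss_function",
    "verbose_eval": "verbose",
}


def resolve_aliases(params):
    """Return a copy of params keyed by canonical parameter names.

    Raises ValueError if one parameter is given under two different names.
    """
    resolved = {}
    seen_as = {}  # canonical name -> the name the caller actually used
    for name, value in params.items():
        canonical = ALIASES.get(name, name)
        if canonical in resolved:
            raise ValueError(
                f"'{seen_as[canonical]}' and '{name}' are names of the same "
                f"parameter '{canonical}'"
            )
        resolved[canonical] = value
        seen_as[canonical] = name
    return resolved


print(resolve_aliases({"n_estimators": 500, "eta": 0.1}))
```

Passing both iterations and num_trees to this helper raises an error, which mirrors the behavior described above.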
Training on GPU requires NVIDIA driver version 390.xx or higher.
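One way to check the driver requirement before requesting GPU training is sketched below. It assumes that the nvidia-smi utility is installed and on the PATH, and it interprets "390.xx or higher" as a comparison of the major version component only. Both function names are hypothetical.

```python
import re
import subprocess

MIN_DRIVER_MAJOR = 390  # minimum driver version stated above


def driver_meets_minimum(version, minimum_major=MIN_DRIVER_MAJOR):
    """Return True if a version string such as '390.48' is new enough.

    Only the leading (major) component is compared; this is an assumption.
    """
    match = re.match(r"\s*(\d+)", version)
    return bool(match) and int(match.group(1)) >= minimum_major


def installed_driver_version():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    return lines[0].strip() if lines else None


version = installed_driver_version()
if version is None:
    print("nvidia-smi is not available; cannot confirm GPU support.")
else:
    print(version, "OK" if driver_meets_minimum(version) else "too old")
```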
Parameter | Type | Description | Default value | Supported processing units |
---|---|---|---|---|
Common parameters | ||||
loss_function Alias: objective |
| The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric). Format: Supported metrics:
A custom python object can also be set as the value of this parameter (see an example). For example, use the following construction to calculate the value of Quantile with the coefficient : | Depends on the class | CPU and GPU |
custom_metric |
| Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric). Format: Supported metrics:
Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv, respectively). The directory for these files is specified in the --train-dir (train_dir) parameter. Use the visualization tools to see a live chart with the dynamics of the specified metrics. | None | CPU and GPU |
eval_metric |
| The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric). Format: Supported metrics:
A user-defined function can also be set as the value (see an example). Examples: | Optimized objective is used | CPU and GPU |
iterations Aliases: num_boost_round, n_estimators, num_trees
| int | The maximum number of trees that can be built when solving machine learning problems. When using other parameters that limit the number of iterations, the final number of trees may be less than the number specified in this parameter. | 1000 | CPU and GPU |
learning_rate Alias: eta | float | The learning rate. Used for reducing the gradient step. | The default value is defined automatically for binary classification based on the dataset properties and the number of iterations if none of these parameters is set. In this case, the selected learning rate is printed to stdout and saved in the model. In other cases, the default value is 0.03. | CPU and GPU |
random_seed Alias: random_state | int | The random seed used for training. | None (0) | CPU and GPU |
l2_leaf_reg Alias: reg_lambda | float | Coefficient at the L2 regularization term of the cost function. Any positive value is allowed. | 3.0 | CPU and GPU |
bootstrap_type | string | Bootstrap type. Defines the method for sampling the weights of objects. Supported methods:
| Bayesian | CPU and GPU |
bagging_temperature | float | Defines the settings of the Bayesian bootstrap. It is used by default in classification and regression modes. Use the Bayesian bootstrap to assign random weights to objects. The weights are sampled from an exponential distribution if the value of this parameter is set to “1”. All weights are equal to 1 if the value of this parameter is set to “0”. Possible values are in the range [0; inf). The higher the value, the more aggressive the bagging is. This parameter can be used if the selected bootstrap type is Bayesian. | 1 | CPU and GPU |
subsample | float | Sample rate for bagging. This parameter can be used if one of the following bootstrap types is selected:
| 0.66 | CPU and GPU |
sampling_frequency | string | Frequency to sample weights and objects when building trees. Supported values:
| PerTreeLevel | CPU and GPU |
sampling_unit | string | The sampling scheme. Possible values:
| Object | CPU and GPU |
mvs_head_fraction | float | Controls the fraction of the highest by absolute value gradients taken for the minimal variance sampling. Possible values are in the range . This parameter can be used if the selected bootstrap type is MVS. | 1.0 | CPU |
random_strength | float | The amount of randomness to use for scoring splits when the tree structure is selected. Use this parameter to avoid overfitting the model. The value of this parameter is used when selecting splits. On every iteration each possible split gets a score (for example, the score indicates how much adding this split will improve the loss function for the training dataset). The split with the highest score is selected. The scores have no randomness. A normally distributed random variable is added to the score of the feature. It has a zero mean and a variance that decreases during the training. The value of this parameter is the multiplier of the variance. Note. This parameter is not supported for the following loss functions:
| 1 | CPU |
use_best_model | bool | If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:
No trees are saved after this iteration. This option requires a validation dataset to be provided. | True if a validation set is input (the eval_set parameter is defined) and at least one of the label values of objects in this set differs from the others. False otherwise. | CPU and GPU |
best_model_min_trees | int | The minimal number of trees that the best model should have. If set, the output model contains at least the given number of trees even if the best model is located within these trees. Should be used with the use_best_model parameter. | None (The minimal number of trees for the best model is not set) | CPU and GPU |
depth Alias: max_depth | int | Depth of the tree. The range of supported values depends on the processing unit type and the type of the selected loss function:
| 6 (16 if the growing policy is set to Lossguide) | CPU and GPU |
grow_policy | string | The tree growing policy. Defines how to perform greedy tree construction. Possible values:
Note. The Depthwise and Lossguide growing policies are currently supported only in training and prediction modes. They are not supported for model analysis (such as Feature importance and ShapValues) and exporting to different model formats (such as AppleCoreML, onnx and json). | SymmetricTree | GPU |
min_data_in_leaf | int | The minimum number of training samples in a leaf. CatBoost does not search for new splits in leaves with a sample count less than the specified value. Can be used only with the Lossguide and Depthwise growing policies. | 1 | GPU |
max_leaves | int | The maximum number of leaves in the resulting tree. Can be used only with the Lossguide growing policy. Tip. It is not recommended to use values greater than 64, since it can significantly slow down the training process. | 31 | GPU |
ignored_features | list | Feature indices or names to exclude from the training. It is assumed that all passed values are feature names if at least one of the passed values cannot be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices. Specifics:
| None (use all features) | CPU and GPU |
one_hot_max_size | int | Use one-hot encoding for all categorical features with a number of different values less than or equal to the given parameter value. Ctrs are not calculated for such features. See details. | The default value depends on various conditions:
| CPU and GPU |
has_time | bool | Use the order of objects in the input data (do not perform random permutations during the Transforming categorical features to numerical features and Choosing the tree structure stages). The Timestamp column type is used to determine the order of objects if specified in the input data. | False (not used; generates random permutations) | CPU and GPU |
rsm Alias: colsample_bylevel | float (0;1] | Random subspace method. The percentage of features to use at each split selection, when features are selected over again at random. The value must be in the range (0;1]. | None (set to 1) | CPU |
nan_mode | string | The method for processing missing values in the input dataset. Possible values:
Using the Min or Max value of this parameter guarantees that a split between missing values and other values is considered when selecting a new split in the tree. Note. The method for processing missing values can be set individually for each feature in the Custom quantization borders and missing value modes input file. Such values override the ones specified in this parameter. | Min | CPU and GPU |
input_borders | string | Load Custom quantization borders and missing value modes from a file (do not generate them). Borders are automatically generated before training if this parameter is not set. | None | CPU and GPU |
output_borders | string | Save quantization borders for the current dataset to a file. Refer to the file format description. | None | CPU and GPU |
fold_permutation_block | int | Objects in the dataset are grouped in blocks before the random permutations. This parameter defines the size of the blocks. The smaller the value, the slower the training. Large values may result in quality degradation. | 1 | CPU and GPU |
leaf_estimation_method | string | The method used to calculate the values in leaves. Possible values:
| Gradient | CPU and GPU |
leaf_estimation_iterations | int | The number of gradient steps when calculating the values in leaves. | None (Depends on the training objective) | CPU and GPU |
leaf_estimation_backtracking | string | The type of backtracking to use during the gradient descent. Possible values:
| AnyImprovement | Depends on the selected value |
fold_len_multiplier | float | Coefficient for changing the length of folds. The value must be greater than 1. The best validation result is achieved with minimum values. With values close to 1, each iteration takes an amount of memory and time that is quadratic in the number of objects in the iteration. Thus, low values are possible only when there is a small number of objects. | 2 | CPU and GPU |
approx_on_full_history | bool | The principles for calculating the approximated values. Possible values:
| False | CPU |
class_weights | list | Class weights. The values are used as multipliers for the object weights. This parameter can be used for solving classification and multiclassification problems. Tip. For imbalanced datasets with binary classification, the weight multiplier can be set to 1 for class 0 and to (sum_negative / sum_positive) for class 1. | None (the weight for all classes is set to 1) | CPU and GPU |
scale_pos_weight | float | The weight for class 1 in binary classification. The value is used as a multiplier for the weights of objects from class 1. Tip. For imbalanced datasets, the weight multiplier can be set to (sum_negative / sum_positive). | 1.0 | CPU and GPU |
boosting_type | string | Boosting scheme. Possible values:
| Depends on the number of objects in the training dataset and the selected learning mode | CPU and GPU. Only the Plain mode is supported for the MultiClass loss on GPU |
allow_const_label | bool | Use it to train models with datasets that have equal label values for all objects. | False | CPU and GPU |
score_function | string | The score type used to select the next split during the tree construction. Possible values:
| Correlation (NewtonL2 if the growing policy is set to Lossguide) | GPU |
Overfitting detection settings | ||||
early_stopping_rounds | int | Set the overfitting detector type to Iter and stop the training after the specified number of iterations since the iteration with the optimal metric value. | False | CPU and GPU |
od_type | string | The type of the overfitting detector to use. Possible values:
| IncToDec | CPU and GPU |
od_pval | float | The threshold for the IncToDec overfitting detector type. The training is stopped when the specified value is reached. Requires that a validation dataset was input. For best results, it is recommended to set a value in the range [10^-10; 10^-2]. The larger the value, the earlier overfitting is detected. Restriction. Do not use this parameter with the Iter overfitting detector type. | 0 (the overfitting detection is turned off) | CPU and GPU |
od_wait | int | The number of iterations to continue the training after the iteration with the optimal metric value. The purpose of this parameter differs depending on the selected overfitting detector type:
| 20 | CPU and GPU |
Quantization settings | ||||
target_border | float | If set, defines the border for converting target values to 0 and 1. Depending on the specified value:
| None | CPU and GPU |
border_count Alias: max_bin | int | The number of splits for numerical features. Allowed values depend on the processing unit type:
| 254 (if training is performed on CPU) or 128 (if training is performed on GPU) | CPU and GPU |
feature_border_type | string | The quantization mode for numerical features. Possible values:
| GreedyLogSum | CPU and GPU |
Multiclassification settings | ||||
classes_count | int | The upper limit for the numeric class label. Defines the number of classes for multiclassification. Only non-negative integers can be specified. The given integer should be greater than any of the label values. If this parameter is specified, the labels for all classes in the input dataset should be smaller than the given value. | None. Calculation principles | CPU and GPU |
Performance settings | ||||
thread_count | int | The number of threads to use during training.
| -1 (the number of threads is equal to the number of processor cores) | CPU and GPU |
used_ram_limit | int | Attempt to limit the amount of used CPU RAM. Restriction.
Supported measures of information (non case-sensitive):
| None (memory usage is not limited) | CPU |
gpu_ram_part | float | How much of the GPU RAM to use for training. | 0.95 | GPU |
pinned_memory_size | int | How much pinned (page-locked) CPU RAM to use per GPU. | 1073741824 | GPU |
gpu_cat_features_storage | string | The method for storing the categorical features' values. Possible values:
Use the CpuPinnedMemory value if feature combinations are used and the available GPU RAM is not sufficient. | None (set to GpuRam) | GPU |
data_partition | string | The method for splitting the input dataset between multiple workers. Possible values:
| Depends on the learning mode and the input dataset | GPU |
Processing unit settings | ||||
task_type | string | The processing unit type to use for training. Possible values:
| CPU | CPU and GPU |
devices | string | IDs of the GPU devices to use for training (indices are zero-based). Format
| NULL (all GPU devices are used if the corresponding processing unit type is selected) | GPU |
Visualization settings | ||||
name | string | The experiment name to display in visualization tools. | experiment | CPU and GPU |
Output settings | ||||
logging_level | string | The logging level to output to stdout. Possible values:
| None (corresponds to the Verbose logging level) | CPU and GPU |
metric_period | int | The frequency of iterations to calculate the values of objectives and metrics. The value should be a positive integer. The usage of this parameter speeds up the training. Note. It is recommended to increase the value of this parameter to maintain training speed if a GPU processing unit type is used. | 1 | CPU and GPU |
verbose Alias: verbose_eval |
| The purpose of this parameter depends on the type of the given value:
Restriction. Do not use this parameter with the logging_level parameter. | 1 | CPU and GPU |
train_dir | string | The directory for storing the files generated during training. | catboost_info | CPU and GPU |
model_size_reg | float | The model size regularization coefficient. The larger the value, the smaller the model size. Refer to the Model size regularization coefficient section for details. Possible values are in the range [0; inf). This regularization is needed only for models with categorical features (other models are small). Models with categorical features might weigh tens of gigabytes or more if categorical features have a lot of values. If the value of the regularizer differs from zero, then the usage of categorical features or feature combinations with a lot of values has a penalty, so fewer of them are used in the resulting model. Note that the resulting quality of the model can be affected. Set the value to 0 to turn off the model size optimization option. | None (Turned on and set to 0.5 on CPU and turned off for GPU) | CPU |
allow_writing_files | bool | Allow writing analytical and snapshot files during training. If set to “False”, the snapshot and data visualization tools are unavailable. | True | CPU and GPU |
save_snapshot | bool | Enable snapshotting for restoring the training progress after an interruption. If enabled, the default period for making snapshots is 600 seconds. Use the snapshot_interval parameter to change this period. Note. This parameter is not supported in the params parameter of the cv function. | None | CPU and GPU |
snapshot_file | string | The name of the file to save the training progress information in. This file is used for recovering training after an interruption. Depending on whether the specified file exists in the file system:
Note. This parameter is not supported in the params parameter of the cv function. | experiment... | CPU and GPU |
snapshot_interval | int | The interval between saving snapshots in seconds. The first snapshot is taken after the specified number of seconds since the start of training. Every subsequent snapshot is taken after the specified number of seconds since the previous one. The last snapshot is taken at the end of the training. Note. This parameter is not supported in the params parameter of the cv function. | 600 | CPU and GPU |
roc_file | string | The name of the output file to save the ROC curve points to. This parameter can only be set in cross-validation mode if the Logloss loss function is selected. The ROC curve points are calculated for the test fold. The output file is saved to the catboost_info directory. | None (the file is not saved) | CPU and GPU |
CTR settings | ||||
simple_ctr | string | Quantization settings for simple categorical features. Use this parameter to specify the principles for defining the class of the object for regression tasks. By default, it is considered that an object belongs to the positive class if its label value is greater than the median of all label values of the dataset. Format: Components:
| CPU and GPU | |
combinations_ctr | string | Quantization settings for combinations of categorical features. Components:
| CPU and GPU | |
per_feature_ctr | string | Per-feature quantization settings for categorical features. Components:
| CPU and GPU | |
ctr_target_border_count | int | The maximum number of borders to use in target quantization for categorical features that need it. Allowed values are integers from 1 to 255 inclusively. The value of the
| Number_of_classes - 1 for Multiclassification problems when training on CPU, 1 otherwise | CPU and GPU |
counter_calc_method | string | The method for calculating the Counter CTR type. Possible values:
| None (Full is used) | CPU and GPU |
max_ctr_complexity | int | The maximum number of features that can be combined. Each resulting combination consists of one or more categorical features and can optionally contain binary features in the following form: “numeric feature > value”. | 4 | CPU and GPU |
ctr_leaf_count_limit | int | The maximum number of leaves with categorical features. If the quantity exceeds the specified value a part of leaves is discarded. The leaves to be discarded are selected as follows:
This option reduces the resulting model size and the amount of memory required for training. Note that the resulting quality of the model can be affected. | None (the number of different category values is not limited) | CPU |
store_all_simple_ctr | bool | Ignore categorical features, which are not used in feature combinations, when choosing candidates for exclusion. Use this parameter with ctr_leaf_count_limit only. | None (set to False). Both simple features and feature combinations are taken into account when limiting the number of leaves with categorical features | CPU |
final_ctr_computation_mode | string | Final CTR computation mode. Possible values:
| Default | CPU and GPU |
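Several rows in the table above state restrictions between parameters (for example, max_leaves only with the Lossguide growing policy, or verbose not together with logging_level). A minimal sketch of how these could be checked before training follows. The check_params helper is hypothetical, encodes only restrictions stated in the table, and is not how CatBoost itself validates parameters. The values Bernoulli and Silent are real CatBoost option values, but the lists they belong to are not shown in the table above.

```python
def check_params(params):
    """Return a list of problems found in a training parameter dict.

    Hypothetical helper: covers only restrictions stated in the table above.
    """
    problems = []
    bootstrap = params.get("bootstrap_type", "Bayesian")  # default from the table
    policy = params.get("grow_policy", "SymmetricTree")   # default from the table

    if "bagging_temperature" in params and bootstrap != "Bayesian":
        problems.append("bagging_temperature requires bootstrap_type='Bayesian'")
    if "mvs_head_fraction" in params and bootstrap != "MVS":
        problems.append("mvs_head_fraction requires bootstrap_type='MVS'")
    if "min_data_in_leaf" in params and policy not in ("Lossguide", "Depthwise"):
        problems.append("min_data_in_leaf requires Lossguide or Depthwise")
    if "max_leaves" in params and policy != "Lossguide":
        problems.append("max_leaves requires grow_policy='Lossguide'")
    if "verbose" in params and "logging_level" in params:
        problems.append("verbose and logging_level must not be used together")
    if "od_pval" in params and params.get("od_type") == "Iter":
        problems.append("od_pval must not be used with od_type='Iter'")

    rsm = params.get("rsm")
    if rsm is not None and not 0 < rsm <= 1:
        problems.append("rsm must be in the range (0;1]")
    flm = params.get("fold_len_multiplier")
    if flm is not None and not flm > 1:
        problems.append("fold_len_multiplier must be greater than 1")
    l2 = params.get("l2_leaf_reg")
    if l2 is not None and not l2 > 0:
        problems.append("l2_leaf_reg must be positive")
    return problems


good = {"grow_policy": "Lossguide", "max_leaves": 31, "min_data_in_leaf": 5,
        "od_type": "IncToDec", "od_pval": 1e-3}
bad = {"bootstrap_type": "Bernoulli", "bagging_temperature": 1,
       "verbose": 100, "logging_level": "Silent", "rsm": 1.5}

print(check_params(good))
for problem in check_params(bad):
    print(problem)
```

Returning a list of messages instead of raising on the first problem lets a caller report every conflict in a configuration at once.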