Function | XGBoost | CatBoost | LightGBM |
---|---|---|---|
Important parameters that control overfitting | 1. `learning_rate` (alias `eta`): optimal values usually lie between 0.01 and 0.2 <br>2. `max_depth` <br>3. `min_child_weight`: similar to a minimum-samples-per-leaf constraint; default is 1 | 1. `learning_rate` <br>2. `depth`: can be any integer up to 16; the recommended range is [1, 10] <br>3. No feature like `min_child_weight` <br>4. `l2_leaf_reg`: L2 regularization coefficient used in the leaf value calculation (any positive value is allowed) | 1. `learning_rate` <br>2. `max_depth`: default is -1 (no limit). Note that trees still grow leaf-wise, so it is important to tune `num_leaves` (the number of leaves in a tree), which should be smaller than 2^(max_depth); this is a very important parameter for LightGBM <br>3. `min_data_in_leaf`: default = 20; aliases are `min_data` and `min_child_samples` |
Parameters for categorical features | Not available | 1. `cat_features`: the indices of the categorical features <br>2. `one_hot_max_size`: use one-hot encoding for all features whose number of distinct values is less than or equal to this value (max 255) | 1. `categorical_feature`: specifies the categorical features to use for training the model |
Parameters for controlling speed | 1. `colsample_bytree`: subsample ratio of columns <br>2. `subsample`: subsample ratio of the training instances <br>3. `n_estimators`: maximum number of decision trees; a high value can lead to overfitting | 1. `rsm` (random subspace method): the percentage of features to use at each split selection <br>2. No parameter for subsampling the data <br>3. `iterations`: maximum number of trees that can be built; a high value can lead to overfitting | 1. `feature_fraction`: fraction of features to be used in each iteration <br>2. `bagging_fraction`: fraction of data to be used in each iteration; generally used to speed up training and avoid overfitting <br>3. `num_iterations`: number of boosting iterations to perform; default = 100 |
Key parameters that need tuning in these commonly used ensemble learning algorithms (minimal usage sketches follow below).
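To make the first and third rows of the table concrete, here is a minimal sketch of where these parameters appear in each library's Python API. The toy dataset and every parameter value below are illustrative assumptions, not recommendations from the table:

```python
# Minimal sketch (assumes xgboost, catboost, and lightgbm are installed;
# all parameter values are illustrative, not tuned) showing where the
# overfitting- and speed-related parameters from the table appear.
import numpy as np
import xgboost as xgb
import catboost as cb
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# XGBoost
xgb_model = xgb.XGBClassifier(
    learning_rate=0.1,      # eta; typical range 0.01-0.2
    max_depth=6,
    min_child_weight=1,     # default
    colsample_bytree=0.8,   # column subsampling
    subsample=0.8,          # row subsampling
    n_estimators=200,       # maximum number of trees
)
xgb_model.fit(X, y)

# CatBoost
cb_model = cb.CatBoostClassifier(
    learning_rate=0.1,
    depth=6,                # any integer up to 16; [1, 10] recommended
    l2_leaf_reg=3,          # L2 regularization on leaf values
    rsm=0.8,                # random subspace method: feature fraction per split
    iterations=200,         # maximum number of trees
    verbose=False,
)
cb_model.fit(X, y)

# LightGBM (native API, so the table's parameter names apply verbatim)
lgb_params = {
    "objective": "binary",
    "learning_rate": 0.1,
    "max_depth": 7,
    "num_leaves": 63,        # keep below 2**max_depth
    "min_data_in_leaf": 20,  # default
    "feature_fraction": 0.8,
    "bagging_fraction": 0.8,
    "bagging_freq": 1,       # bagging_fraction only takes effect when > 0
    "verbose": -1,
}
lgb_booster = lgb.train(
    lgb_params,
    lgb.Dataset(X, label=y),
    num_boost_round=200,     # alias of num_iterations (default 100)
)
```

Note the design difference the table highlights: because LightGBM grows trees leaf-wise, `num_leaves` rather than `max_depth` is the primary complexity control, which is why it is kept below 2^(max_depth) above.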
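A similar hedged sketch for the categorical-feature row; the toy DataFrame is an assumption, and, as the table notes, XGBoost offers no such parameter here, so its categorical columns must be encoded beforehand:

```python
# Sketch of categorical-feature handling; the DataFrame is illustrative.
import numpy as np
import pandas as pd
import catboost as cb
import lightgbm as lgb

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "num": rng.normal(size=300),
    "city": rng.choice(["tokyo", "paris", "lima"], size=300),  # categorical
})
y = (df["num"] > 0).astype(int)

# CatBoost: pass the categorical columns via cat_features (names or indices);
# one_hot_max_size one-hot encodes any categorical with <= that many levels.
cb_model = cb.CatBoostClassifier(one_hot_max_size=10, iterations=50, verbose=False)
cb_model.fit(df, y, cat_features=["city"])

# LightGBM: declare categoricals with categorical_feature; the column must
# be integer- or category-typed, not raw strings.
df_lgb = df.copy()
df_lgb["city"] = df_lgb["city"].astype("category")
booster = lgb.train(
    {"objective": "binary", "verbose": -1},
    lgb.Dataset(df_lgb, label=y, categorical_feature=["city"]),
    num_boost_round=50,
)

# XGBoost (per the table): no dedicated parameter, so encode manually first,
# e.g. with one-hot encoding:
df_xgb = pd.get_dummies(df, columns=["city"])
```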