Function | XGBoost | CatBoost | LightGBM |
---|---|---|---|
Important parameters that control overfitting | 1. `learning_rate` (alias `eta`): optimal values usually lie between 0.01 and 0.2 <br>2. `max_depth` <br>3. `min_child_weight`: similar to a minimum-samples-per-leaf constraint; default is 1 | 1. `learning_rate` <br>2. `depth`: can be any integer up to 16; the recommended range is [1, 10] <br>3. No feature like `min_child_weight` <br>4. `l2_leaf_reg`: L2 regularization coefficient used in the leaf value calculation (any positive value is allowed) | 1. `learning_rate` <br>2. `max_depth`: default is -1 (no limit). Note that trees still grow leaf-wise, so it is important to tune `num_leaves` (the number of leaves in a tree), which should be smaller than 2^(max_depth); this is a very important parameter for LightGBM <br>3. `min_data_in_leaf`: default = 20; aliases are `min_data` and `min_child_samples` |
Parameters for categorical features | Not available | 1. `cat_features`: the indices of the categorical features <br>2. `one_hot_max_size`: use one-hot encoding for all features whose number of distinct values is less than or equal to this value (max 255) | 1. `categorical_feature`: specifies the categorical features to use for training the model |
Parameters for controlling speed | 1. `colsample_bytree`: subsample ratio of columns <br>2. `subsample`: subsample ratio of the training instances <br>3. `n_estimators`: maximum number of decision trees; a high value can lead to overfitting | 1. `rsm` (random subspace method): the percentage of features to use at each split selection <br>2. No parameter for subsampling the data <br>3. `iterations`: maximum number of trees that can be built; a high value can lead to overfitting | 1. `feature_fraction`: fraction of features to be used in each iteration <br>2. `bagging_fraction`: fraction of data to be used in each iteration; generally used to speed up training and avoid overfitting <br>3. `num_iterations`: number of boosting iterations to perform; default = 100 |
Key parameters that need tuning in these commonly used ensemble learning algorithms (minimal usage sketches follow below).
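To make the first and third rows of the table concrete, here is a minimal sketch of where these parameters appear in each library's Python API. The toy dataset and every parameter value below are illustrative assumptions, not recommendations from the table:

```python
# Minimal sketch (assumes xgboost, catboost, and lightgbm are installed;
# all parameter values are illustrative, not tuned) showing where the
# overfitting- and speed-related parameters from the table appear.
import numpy as np
import xgboost as xgb
import catboost as cb
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# XGBoost
xgb_model = xgb.XGBClassifier(
    learning_rate=0.1,      # eta; typical range 0.01-0.2
    max_depth=6,
    min_child_weight=1,     # default
    colsample_bytree=0.8,   # column subsampling
    subsample=0.8,          # row subsampling
    n_estimators=200,       # maximum number of trees
)
xgb_model.fit(X, y)

# CatBoost
cb_model = cb.CatBoostClassifier(
    learning_rate=0.1,
    depth=6,                # any integer up to 16; [1, 10] recommended
    l2_leaf_reg=3,          # L2 regularization on leaf values
    rsm=0.8,                # random subspace method: feature fraction per split
    iterations=200,         # maximum number of trees
    verbose=False,
)
cb_model.fit(X, y)

# LightGBM (native API, so the table's parameter names apply verbatim)
lgb_params = {
    "objective": "binary",
    "learning_rate": 0.1,
    "max_depth": 7,
    "num_leaves": 63,        # keep below 2**max_depth
    "min_data_in_leaf": 20,  # default
    "feature_fraction": 0.8,
    "bagging_fraction": 0.8,
    "bagging_freq": 1,       # bagging_fraction only takes effect when > 0
    "verbose": -1,
}
lgb_booster = lgb.train(
    lgb_params,
    lgb.Dataset(X, label=y),
    num_boost_round=200,     # alias of num_iterations (default 100)
)
```

Note the design difference the table highlights: because LightGBM grows trees leaf-wise, `num_leaves` rather than `max_depth` is the primary complexity control, which is why it is kept below 2^(max_depth) above.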
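A similar hedged sketch for the categorical-feature row; the toy DataFrame is an assumption, and, as the table notes, XGBoost offers no such parameter here, so its categorical columns must be encoded beforehand:

```python
# Sketch of categorical-feature handling; the DataFrame is illustrative.
import numpy as np
import pandas as pd
import catboost as cb
import lightgbm as lgb

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "num": rng.normal(size=300),
    "city": rng.choice(["tokyo", "paris", "lima"], size=300),  # categorical
})
y = (df["num"] > 0).astype(int)

# CatBoost: pass the categorical columns via cat_features (names or indices);
# one_hot_max_size one-hot encodes any categorical with <= that many levels.
cb_model = cb.CatBoostClassifier(one_hot_max_size=10, iterations=50, verbose=False)
cb_model.fit(df, y, cat_features=["city"])

# LightGBM: declare categoricals with categorical_feature; the column must
# be integer- or category-typed, not raw strings.
df_lgb = df.copy()
df_lgb["city"] = df_lgb["city"].astype("category")
booster = lgb.train(
    {"objective": "binary", "verbose": -1},
    lgb.Dataset(df_lgb, label=y, categorical_feature=["city"]),
    num_boost_round=50,
)

# XGBoost (per the table): no dedicated parameter, so encode manually first,
# e.g. with one-hot encoding:
df_xgb = pd.get_dummies(df, columns=["city"])
```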