XGBoost Regression Usage and Official Parameter Reference

This article walks through XGBoost's parameter settings, covering general parameters, tree booster parameters, and learning task parameters, with practical training code examples. Tuning parameters such as `eta`, `gamma`, `max_depth`, and `subsample` lets you adapt the model to different tasks, such as regression.

XGBoost Parameters
Official documentation: https://xgboost.readthedocs.io/en/latest/parameter.html

Before running XGBoost, we must set three types of parameters: general parameters, booster parameters and task parameters.
General parameters relate to which booster we are using to do boosting, commonly tree or linear model
Booster parameters depend on which booster you have chosen
Learning task parameters decide on the learning scenario. For example, regression tasks may use different parameters than ranking tasks.
Command line parameters relate to behavior of CLI version of XGBoost.
Note
Parameters in R package

In R-package, you can use . (dot) to replace underscore in the parameters, for example, you can use max.depth to indicate max_depth. The underscore parameters are also valid in R.

General Parameters

Parameters for Tree Booster

Additional parameters for Dart Booster (booster=dart)

Parameters for Linear Booster (booster=gblinear)

Parameters for Tweedie Regression (objective=reg:tweedie)

Learning Task Parameters

Command Line Parameters

General Parameters

booster [default= gbtree ]

Which booster to use. Can be gbtree, gblinear or dart; gbtree and dart use tree based models while gblinear uses linear functions.

silent [default=0] [Deprecated]

Deprecated. Please use verbosity instead.

verbosity [default=1]

Verbosity of printing messages. Valid values are 0 (silent), 1 (warning), 2 (info), 3 (debug). Sometimes XGBoost tries to change configurations based on heuristics, which is displayed as warning message. If there’s unexpected behaviour, please try to increase value of verbosity.

nthread [default to maximum number of threads available if not set]

Number of parallel threads used to run XGBoost

disable_default_eval_metric [default=0]

Flag to disable default metric. Set to >0 to disable.

num_pbuffer [set automatically by XGBoost, no need to be set by user]

Size of prediction buffer, normally set to number of training instances. The buffers are used to save the prediction results of last boosting step.

num_feature [set automatically by XGBoost, no need to be set by user]

Feature dimension used in boosting, set to maximum dimension of the feature

Parameters for Tree Booster

eta [default=0.3, alias: learning_rate]

Step size shrinkage used in update to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative.

range: [0,1]
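The shrinkage effect can be sketched in plain Python (a toy model, not XGBoost's actual update): each round's tree output is scaled by `eta` before being added to the running prediction, so a smaller `eta` takes smaller, more conservative steps.

```python
def boosted_prediction(tree_outputs, eta):
    """Accumulate per-round tree outputs, each shrunk by eta."""
    pred = 0.0
    for out in tree_outputs:
        pred += eta * out
    return pred

outputs = [1.0, 0.8, 0.5]  # hypothetical per-round leaf values
small_step = boosted_prediction(outputs, 0.1)  # 0.23
large_step = boosted_prediction(outputs, 0.3)  # 0.69
```

With `eta=0.1` the model moves less far per round than with `eta=0.3`, which is why a lower `eta` typically needs more boosting rounds to fit the same signal.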

gamma [default=0, alias: min_split_loss]

Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger gamma is, the more conservative the algorithm will be.

range: [0,∞]
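In XGBoost's split-gain formula, `gamma` is subtracted from the candidate gain, so a split is only kept when the loss reduction exceeds it. A sketch of that calculation (using the summed gradient/hessian statistics from the XGBoost paper; `lam` stands in for the L2 regularization term `lambda`, and the numbers are illustrative):

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Gain of splitting a node into left/right children.

    g_*, h_* are summed gradients and hessians of the instances
    falling into each child. A split is made only if gain > 0.
    """
    def score(g, h):
        return g * g / (h + lam)

    return 0.5 * (score(g_left, h_left) + score(g_right, h_right)
                  - score(g_left + g_right, h_left + h_right)) - gamma

loose = split_gain(-4.0, 3.0, 5.0, 4.0, gamma=0.0)   # split accepted
strict = split_gain(-4.0, 3.0, 5.0, 4.0, gamma=2.0)  # smaller gain
```

A larger `gamma` shifts every candidate gain downward, so fewer splits clear the `gain > 0` bar, yielding a more conservative tree.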

max_depth [default=6]

Maximum depth of a tree. Increasing this value will make the model more complex and more likely to overfit. 0 is only accepted in lossguided growing policy when tree_method is set as hist and it indicates no limit on depth. Beware that XGBoost aggressively consumes memory when training a deep tree.

range: [0,∞] (0 is only accepted in lossguided growing policy when tree_method is set as hist)

min_child_weight [default=1]

Minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with the sum of instance weight less than min_child_weight, then the building process will give up further partitioning. In linear regression task, this simply corresponds to minimum number of instances needed to be in each node. The larger min_child_weight is, the more conservative the algorithm will be.

range: [0,∞]
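The rule can be sketched as: sum the hessians of the instances that would land in a candidate child, and give up the split if that sum falls below the threshold. For squared-error regression each instance's hessian is 1, which is why the check reduces to a minimum instance count.

```python
def child_is_valid(hessians, min_child_weight=1.0):
    """A candidate child is acceptable only if its summed hessian
    (instance weight) reaches min_child_weight."""
    return sum(hessians) >= min_child_weight

# Squared-error loss: every instance contributes a hessian of 1,
# so min_child_weight=3 means "at least 3 instances per child".
ok = child_is_valid([1, 1, 1], min_child_weight=3.0)
too_small = child_is_valid([1, 1], min_child_weight=3.0)
```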

max_delta_step [default=0]

Maximum delta step we allow each leaf output to be. If the value is set to 0, it means there is no constraint. If it is set to a positive value, it can help make the update step more conservative. Usually this parameter is not needed, but it might help in logistic regression when classes are extremely imbalanced. Setting it to a value of 1-10 might help control the update.

range: [0,∞]

subsample [default=1]

Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost would randomly sample half of the training data prior to growing trees, which helps prevent overfitting. Subsampling occurs once in every boosting iteration.

range: (0,1]
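The sampling step itself can be sketched in plain Python (a conceptual illustration, not XGBoost's internal sampler): a fresh draw of `ratio * n_rows` distinct row indices, as would happen at every boosting iteration.

```python
import random

def subsample_rows(n_rows, ratio, seed=0):
    """Draw ratio * n_rows distinct row indices without replacement."""
    rng = random.Random(seed)
    k = int(n_rows * ratio)
    return rng.sample(range(n_rows), k)

# subsample=0.5 over 100 training rows -> 50 distinct rows per iteration
rows = subsample_rows(100, 0.5)
```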

colsample_bytree, colsample_bylevel, colsample_bynode [default=1]

This is a family of parameters for subsampling of columns. All `colsample_by*` parameters have a range of (0, 1], the default value of 1, and specify the fraction of columns to be subsampled: `colsample_bytree` samples columns once for every tree constructed, `colsample_bylevel` once for every new depth level reached in a tree, and `colsample_bynode` once for every split evaluated.
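The three ratios apply cumulatively, which can be sketched as follows (the 64-feature figures are an illustration taken from the official documentation's example):

```python
def features_per_split(n_features, bytree=1.0, bylevel=1.0, bynode=1.0):
    """Columns surviving each sampling stage; the ratios multiply."""
    n = int(n_features * bytree)   # sampled once per tree
    n = int(n * bylevel)           # re-sampled at each depth level
    n = int(n * bynode)            # re-sampled at each split
    return n

# 64 features with all three ratios at 0.5 leaves 8 columns
# to choose from at each split.
n = features_per_split(64, 0.5, 0.5, 0.5)
```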
