Without further ado, let's first look at how the official documentation explains this parameter.
The API reference says:
scale_pos_weight : float
Balancing of positive and negative weights.
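
The one-line API description does not say what value to use. The XGBoost parameter documentation suggests sum(negative instances) / sum(positive instances) as a typical starting point. Below is a minimal sketch of that calculation. The helper name `suggested_scale_pos_weight` and the toy labels are made up for illustration, and the resulting dict is only assumed to be passed to XGBoost later as training parameters.

```python
from collections import Counter


def suggested_scale_pos_weight(labels):
    """Return (number of negatives) / (number of positives) for 0/1 labels.

    Hypothetical helper, not part of XGBoost itself.
    """
    counts = Counter(labels)
    n_pos, n_neg = counts[1], counts[0]
    if n_pos == 0:
        raise ValueError("no positive examples; the ratio is undefined")
    return n_neg / n_pos


# Toy, made-up labels: 95 negatives and 5 positives.
y = [0] * 95 + [1] * 5

# Parameter dict in the form XGBoost accepts for binary classification.
params = {
    "objective": "binary:logistic",
    "eval_metric": "auc",
    "scale_pos_weight": suggested_scale_pos_weight(y),
}

print(params["scale_pos_weight"])  # 19.0
```

The ratio is only a starting value; it is usually worth tuning it against a validation set rather than treating it as fixed.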
The "Handle Imbalanced Dataset" section of the Parameter Tuning notes says:
For common cases such as ads clickthrough log, the dataset is extremely imbalanced. This can affect the training of xgboost model, and there are two ways to improve it.
- If you care only about the ranking order (AUC) of your prediction
  - Balance the positive and negative weights, via scale_pos_weight
  - Use AUC for evaluation
- If you care about predicting the right probability
  - In such a case, you cannot re-balance the dataset
  - In such a case, set parameter max_delta_step to a finite number (say 1) will help con