3.2.4.3.6. sklearn.ensemble.GradientBoostingRegressor
-
class
sklearn.ensemble.
GradientBoostingRegressor
(
loss='ls',
learning_rate=0.1,
n_estimators=100,
subsample=1.0,
min_samples_split=2,
min_samples_leaf=1,
min_weight_fraction_leaf=0.0,
max_depth=3,
init=None,
random_state=None,
max_features=None,
alpha=0.9,
verbose=0,
max_leaf_nodes=None,
warm_start=False
)
[source]
-
Gradient Boosting for regression.
GB builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.
Read more in the User Guide.
P
a
r
a
m
e
t
e
r
s
:
loss : {‘ls’, ‘lad’, ‘huber’, ‘quantile’}, optional (default=’ls’)
loss function to be optimized. ‘ls’ refers to least squares regression. ‘lad’ (least absolute deviation) is a highly robust loss function solely based on order information of the input variables. ‘huber’ is a combination of the two. ‘quantile’ allows quantile regression (usealpha to specify the quantile).
learning_rate : float, optional (default=0.1)
learning rate shrinks the contribution of each tree by learning_rate. There is a trade-off between learning_rate and n_estimators.
n_estimators : int (default=100)
The number of boosting stages to perform. Gradient boosting is fairly robust to over-fitting so a large number usually results in better performance.
max_depth : integer, optional (default=3)
maximum depth of the individual regression estimators. The maximum depth limits the number of nodes in the tree. Tune this parameter for best performance; the best value depends on the interaction of the input variables. Ignored if max_leaf_nodes is not None.
min_samples_split : integer, optional (default=2)
The minimum number of samples required to split an internal node.
min_samples_leaf : integer, optional (default=1)
The minimum number of samples required to be at a leaf node.
min_weight_fraction_leaf : float, optional (default=0.)
The minimum weighted fraction of the input samples required to be at a leaf node.
subsample : float, optional (default=1.0)
The fraction of samples to be used for fitting the individual base learners. If smaller than 1.0 this results in Stochastic Gradient Boosting. subsample interacts with the parametern_estimators. Choosing subsample < 1.0 leads to a reduction of variance and an increase in bias.
max_features : int, float, string or None, optional (default=None)
-
The number of features to consider when looking for the best split:
-
- If int, then consider max_features features at each split.
- If float, then max_features is a percentage and int(max_features * n_features)features are considered at each split.
- If “auto”, then max_features=n_features.
- If “sqrt”, then max_features=sqrt(n_features).
- If “log2”, then max_features=log2(n_features).
- If None, then max_features=n_features.
Choosing max_features < n_features leads to a reduction of variance and an increase in bias.
Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features.
max_leaf_nodes : int or None, optional (default=None)
Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes.
alpha : float (default=0.9)
The alpha-quantile of the huber loss function and the quantile loss function. Only ifloss='huber' or loss='quantile'.
init : BaseEstimator, None, optional (default=None)
An estimator object that is used to compute the initial predictions. init has to provide fitand predict. If None it uses loss.init_estimator.
verbose : int, default: 0
Enable verbose output. If 1 then it prints progress and performance once in a while (the more trees the lower the frequency). If greater than 1 then it prints progress and performance for every tree.
warm_start : bool, default: False
When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just erase the previous solution.
random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
A
t
t
r
i
b
u
t
e
s
:
-