Study link: https://edu.csdn.net/course/play/10581/236112?utm_source=blogtoedu
Decision trees in sklearn
Exhaustively searches every candidate split point of every feature; pruning is not implemented (use cross-validation to choose the best complexity parameters instead).
Classification: DecisionTreeClassifier
Regression: DecisionTreeRegressor
criterion: "gini", "entropy"
splitter: "best", "random"
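A minimal sketch of the two estimators with the parameters above (assumes scikit-learn is installed; the iris data and the petal-width regression target are just illustrative choices, not from the notes):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X, y = load_iris(return_X_y=True)

# Classification tree: Gini impurity, exhaustive "best" split search at each node
clf = DecisionTreeClassifier(criterion="gini", splitter="best", random_state=0)
clf.fit(X, y)

# Regression tree: predict the last feature (petal width) from the first three
reg = DecisionTreeRegressor(random_state=0)
reg.fit(X[:, :3], X[:, 3])

# An unpruned tree memorizes the training set, so training accuracy is ~1.0,
# which is exactly why cross-validation is needed to pick complexity parameters.
print(clf.score(X, y))
```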
max_features: int, float or {"auto", "sqrt", "log2"}, default=None
The number of features to consider when looking for the best split:
- If int, then consider max_features features at each split.
- If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split.
- If "auto", then max_features=sqrt(n_features).
- If "sqrt", then max_features=sqrt(n_features).
- If "log2", then max_features=log2(n_features).
- If None, then max_features=n_features.
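The mapping from each max_features value to an actual feature count can be checked via the fitted tree's max_features_ attribute. A small sketch on synthetic data with 9 features ("auto" is omitted because recent scikit-learn versions removed it for trees):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X = rng.rand(100, 9)                      # 9 features
y = (X[:, 0] > 0.5).astype(int)

results = {}
for mf in [3, 0.5, "sqrt", "log2", None]:
    tree = DecisionTreeClassifier(max_features=mf, random_state=0).fit(X, y)
    results[mf] = tree.max_features_      # resolved number of features per split

# 3 -> 3, 0.5 -> int(0.5 * 9) = 4, "sqrt" -> 3, "log2" -> int(log2(9)) = 3, None -> 9
print(results)
```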
Parameters that limit model complexity:
max_depth:int
max_leaf_nodes:int
min_samples_split: int or float, default=2
The minimum number of samples required to split an internal node.
min_samples_leaf:int or float, default=1
The minimum number of samples required to be at a leaf node.
min_impurity_decrease:float, default=0.0
A node will be split if this split induces a decrease of the impurity greater than or equal to this value.