sklearn.ensemble: RandomForestClassifier Source Code Walkthrough (Part 1)

class RandomForestClassifier(ForestClassifier)

    A random forest classifier.

    A random forest is a meta estimator that fits a number of decision tree
    classifiers on various sub-samples of the dataset and uses averaging to
    improve the predictive accuracy and control over-fitting.

    The sub-sample size is always the same as the original
    input sample size but the samples are drawn with replacement if
    `bootstrap=True` (default).

    # Several sub-samples are drawn from the dataset.
    # Each sub-sample serves as the training data for one decision tree.
    # The value of the bootstrap parameter affects how the sub-samples are drawn.
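
    The sampling described above can be sketched in a few lines. This is a
    minimal illustration using only the Python standard library, not
    scikit-learn's actual code; the helper name `draw_sub_sample` is
    hypothetical.

```python
import random


def draw_sub_sample(n_samples, bootstrap=True, seed=None):
    """Return the row indices one tree would train on (hypothetical helper)."""
    rng = random.Random(seed)
    if bootstrap:
        # Draw n_samples indices with replacement, so the sub-sample has the
        # same size as the dataset but may contain duplicates.
        return [rng.randrange(n_samples) for _ in range(n_samples)]
    # Without bootstrap, every tree is trained on the whole dataset.
    return list(range(n_samples))


indices = draw_sub_sample(10, bootstrap=True, seed=0)
# The length is always 10; the number of distinct rows is usually smaller.
print(len(indices), len(set(indices)))
```

    With `bootstrap=True`, rows that happen not to be drawn for a given tree
    are left out of that tree's training set.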

Parameters:

[ bootstrap ] ==> boolean, optional (default=True)

    Whether bootstrap samples are used when building trees.

    # Whether samples are drawn with replacement when building each tree (i.e., each sub-classifier).

[ criterion ] ==> string, optional (default="gini")

    The function to measure the quality of a split. Supported criteria are
    "gini" for the Gini impurity and "entropy" for the information gain.

    Note: this parameter is tree-specific.

    # Impurity criterion: the function used to score candidate splits of a decision tree node.
    # The default is gini; it can be changed to entropy.
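
    To make the two criteria concrete, the sketch below computes the Gini
    impurity and the entropy of a list of class labels using the textbook
    definitions. The function names `gini` and `entropy` are illustrative
    only; scikit-learn implements its criteria internally and differently.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


mixed = [0, 0, 1, 1]   # evenly mixed node: both measures are at their binary maximum
pure = [1, 1, 1, 1]    # pure node: both measures are zero
print(gini(mixed), entropy(mixed))
print(gini(pure), entropy(pure))
```

    A split is considered better the more it lowers the impurity of the
    child nodes relative to the parent.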

[ max_features ] ==> int, float, string or None, optional (default="auto")

    The number of features to consider when looking for the best split:
        - If int, then consider `max_features` features at each split.
        - If float, then `max_features` is a fraction and
          `int(max_features * n_features)` features are considered at each split.
        - If "auto", then `max_features=sqrt(n_features)`.
        - If "sqrt", then `max_features=sqrt(n_features)` (same as "auto").
        - If "log2", then `max_features=log2(n_features)`.
        - If None, then `max_features=n_features`.

    Note: the search for a split does not stop until at least one
        valid partition of the node samples is found, even if it requires to
        effectively inspect more than ``max_features`` features.

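    # Maximum number of features considered when splitting a node; the default is "auto".
    #        int: an absolute number of features
    #        float: a fraction of all features
    #        auto: square root of the number of features
    #        sqrt: square root of the number of features
    #        log2: log2 of the number of features
    #        None: all features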
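
    The rules listed above can be written as a small function. This is a
    sketch of the documented behavior, not scikit-learn's own code; the name
    `resolve_max_features` is hypothetical, and clamping the "auto", "sqrt"
    and "log2" results to at least 1 is an assumption made here so that very
    small feature counts still yield a usable value.

```python
import math


def resolve_max_features(max_features, n_features):
    """Map a max_features setting to a concrete feature count (hypothetical helper)."""
    if max_features is None:
        return n_features
    if isinstance(max_features, str):
        if max_features in ("auto", "sqrt"):
            # Assumption: never return fewer than one feature.
            return max(1, int(math.sqrt(n_features)))
        if max_features == "log2":
            return max(1, int(math.log2(n_features)))
        raise ValueError("unsupported max_features: %r" % max_features)
    if isinstance(max_features, int):
        # An int is used as the number of features directly.
        return max_features
    # A float is treated as a fraction of n_features.
    return int(max_features * n_features)


for setting in (3, 0.5, "auto", "sqrt", "log2", None):
    print(setting, resolve_max_features(setting, 16))
```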

[ max_depth ] ==> integer or None, optional (default=None)

    The maximum depth of the tree. If None, then nodes are expanded until
    all leaves are pure or until all leaves contain less than
    mi