weka[6] - Random Forest

杨之之

Category: weka. Tags: kenny, weka, algorithm, randomforest

Article link: https://blog.csdn.net/u011292007/article/details/31488343

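This article walks through Weka's implementation of the Random Forest algorithm, focusing on how a RandomTree is built, including how features are chosen and how subtrees are constructed. Random sampling of the instances does not show up in this code, but feature selection at each node is random. Random forest extends Bagging mainly by changing the base learner and its strategy. DecisionStump and Adaboost will be covered next.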

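We have finally arrived at Random Forests. Random forest should not be hard to understand, so I will not go over the algorithm itself; let's go straight to the code!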

buildClassifier:

  public void buildClassifier(Instances data) throws Exception {

    // can classifier handle the data?
    getCapabilities().testWithFail(data);

    // remove instances with missing class
    data = new Instances(data);
    data.deleteWithMissingClass();

    m_bagger = new Bagging();
    RandomTree rTree = new RandomTree();

    // set up the random tree options
    m_KValue = m_numFeatures;
    if (m_KValue < 1)
      m_KValue = (int) Utils.log2(data.numAttributes()) + 1;
    rTree.setKValue(m_KValue);
    rTree.setMaxDepth(getMaxDepth());

    // set up the bagger and build the forest
    m_bagger.setClassifier(rTree);
    m_bagger.setSeed(m_randomSeed);
    m_bagger.setNumIterations(m_numTrees);
    m_bagger.setCalcOutOfBag(true);
    m_bagger.buildClassifier(data);
  }

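The first three statements are all too familiar by now. The fourth initializes m_bagger as a Bagging instance (in fact, the only real difference between random forests and bagging is the base learner).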
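What Bagging contributes is the bootstrap: each base learner is trained on a sample drawn with replacement from the training data, and the rows that never get drawn are "out of bag", which is what setCalcOutOfBag(true) relies on. Below is a minimal sketch of that idea only. The class and method names (BootstrapSketch, bootstrapIndices, outOfBag) are made up for illustration and are not Weka's code.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only: names here are hypothetical, not part of Weka.
public class BootstrapSketch {

    // Draw n row indices with replacement from 0..n-1 (one bootstrap sample).
    static int[] bootstrapIndices(int n, Random rng) {
        int[] picked = new int[n];
        for (int i = 0; i < n; i++) {
            picked[i] = rng.nextInt(n);
        }
        return picked;
    }

    // Mark the rows that never appear in the sample: these are out of bag.
    static boolean[] outOfBag(int n, int[] picked) {
        boolean[] oob = new boolean[n];
        Arrays.fill(oob, true);
        for (int idx : picked) {
            oob[idx] = false;
        }
        return oob;
    }

    public static void main(String[] args) {
        Random rng = new Random(1); // fixed seed, similar in spirit to m_randomSeed
        int n = 10;
        int[] sample = bootstrapIndices(n, rng);
        boolean[] oob = outOfBag(n, sample);
        System.out.println("sample = " + Arrays.toString(sample));
        System.out.println("oob    = " + Arrays.toString(oob));
    }
}
```

Each tree in the forest would get its own sample like this, and the out-of-bag rows can then be used to estimate error without a separate test set.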

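RandomTree is simply a random tree, which we will get to below (readers who already know random forests have probably guessed what kind of tree this is).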
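For readers who have not guessed yet: the "random" part is that each node only looks at a random subset of K attributes when searching for a split. The following is a rough sketch of that selection step, not RandomTree's actual code; chooseAttributes and its parameters are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: names here are hypothetical, not part of Weka.
public class RandomAttributeSubset {

    // Pick up to k distinct attribute indices at random, skipping the class attribute.
    static List<Integer> chooseAttributes(int numAttributes, int classIndex,
                                          int k, Random rng) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) {
            if (i != classIndex) {
                candidates.add(i);
            }
        }
        Collections.shuffle(candidates, rng);
        int take = Math.min(k, candidates.size());
        return new ArrayList<>(candidates.subList(0, take));
    }

    public static void main(String[] args) {
        // 10 attributes, the last one is the class, consider 4 per node.
        List<Integer> chosen = chooseAttributes(10, 9, 4, new Random(1));
        System.out.println("attributes considered at this node: " + chosen);
    }
}
```

A real tree would call something like this once per node and then evaluate splits only on the chosen attributes.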

The remaining steps just set parameters. It is really the same as bagging, except that we add a few parameters and swap in a different base learner.
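The one parameter worth a closer look is the K value: if m_numFeatures is less than 1, the code falls back to (int) log2(numAttributes) + 1. The sketch below reproduces just that arithmetic outside of Weka. It assumes Utils.log2 is the natural log divided by ln 2, and resolveK is a hypothetical helper name. Keep in mind that data.numAttributes() counts the class attribute as well.

```java
// Illustrative sketch only: resolveK is a hypothetical helper, not part of Weka.
public class DefaultKSketch {

    // Assumed to match Utils.log2: base-2 logarithm via natural logs.
    static double log2(double a) {
        return Math.log(a) / Math.log(2);
    }

    // Mirrors the fallback in buildClassifier: use numFeatures if it is
    // at least 1, otherwise (int) log2(numAttributes) + 1.
    static int resolveK(int numFeatures, int numAttributes) {
        int k = numFeatures;
        if (k < 1) {
            k = (int) log2(numAttributes) + 1;
        }
        return k;
    }

    public static void main(String[] args) {
        System.out.println(resolveK(0, 5));   // prints 3
        System.out.println(resolveK(0, 10));  // prints 4
        System.out.println(resolveK(0, 100)); // prints 7
        System.out.println(resolveK(2, 100)); // prints 2 (explicit setting wins)
    }
}
```

So even with 100 attributes, each node only examines 7 of them by default, which is what keeps the individual trees cheap and decorrelated.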

Now let's look at random forest's base learner: RandomTree.

buildClassifier:

  public void buildClassifier(Instances data) throws Exception {

    // Make sure K value is in range
    // m_KValue: number of randomly chosen attributes considered at each split
    if (m_KValue > data.numAttributes() - 1)
      m_KValue = data.numAttribute