Weka 学习:J48(C4.5)

Before writing:To improve my english,I will write my blog in English.

 Section 1: J48

J48 is a class to implement C4.5 algorithm.Look at part of the code.In thebuildClassifier(...) function,there are two important classes:ModelSelection (It is extended by the class of BinC45ModelSelectionand the class of C45ModelSelection)& ClassifierTree (It is extended by the class of C45PruneableClassifierTreeandthe class of PruneableClassifierTree) .theModelSelection is used to select sons of the node and theClassifierTree  is used with ModelSelection to build the tree.

 public void buildClassifier(Instances instances) 
       throws Exception {

    ModelSelection modSelection;	 

    if (m_binarySplits)
      modSelection = new BinC45ModelSelection(m_minNumObj, instances);
    else
      modSelection = new C45ModelSelection(m_minNumObj, instances);
    if (!m_reducedErrorPruning)
      m_root = new C45PruneableClassifierTree(modSelection, !m_unpruned, m_CF,
					    m_subtreeRaising, !m_noCleanup);
    else
      m_root = new PruneableClassifierTree(modSelection, !m_unpruned, m_numFolds,
					   !m_noCleanup, m_Seed);
    m_root.buildClassifier(instances);
    if (m_binarySplits) {
      ((BinC45ModelSelection)modSelection).cleanup();
    } else {
      ((C45ModelSelection)modSelection).cleanup();
    }
  }

Section 2: ModelSelection

Secition 2.1 Split

In this part,I will discuss class BinC45ModelSelection and class C45ModelSelection.In general,these two classes are the iterations of selecting the best attribute with the highest info gain ratio.To calculate the info gain of a certain attribute  is what the class BinC45Split or C45Split dose.So firstly,let us give a glance of BinC45Split  and C45Split .


These two classes are so alike that the only difference isthat for a nominal attribute, the former splits it into two subsets while the latter splits it into multiple subsets.So i will only explain the former(the latter is easier).

 public void buildClassifier(Instances trainInstances) 
       throws Exception {

    // Initialize the remaining instance variables.
    m_numSubsets = 0;
    m_splitPoint = Double.MAX_VALUE;
    m_infoGain = 0;
    m_gainRatio = 0;

    // Different treatment for enumerated and numeric
    // attributes.
    if (trainInstances.attribute(m_attIndex).isNominal()) {
      m_complexityIndex = trainInstan
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值