diabetes prediction dataset
https://archive.ics.uci.edu/ml/datasets/Early+stage+diabetes+risk+prediction+dataset.
Open it in Weka
How to use Weka to run a classifier (a classification model)
Choose classifier
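This is the implementation of the C4.5 decision tree algorithm (called J48 in Weka).
Here -C 0.25 sets the confidence factor to 0.25.
-M 2 sets minNumObj to 2, i.e. the minimum number of instances per leaf.
The options can be changed here (click the classifier name next to the Choose button to open its options dialog).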
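To give a feel for what the confidence factor does, here is a minimal sketch of the pessimistic error estimate that C4.5-style pruning is usually described with: an upper confidence bound on a leaf's error rate, using the normal approximation from the textbook treatment. The function name pessimistic_error is made up for these notes, and J48 computes its estimate with its own routine, so the numbers need not match Weka exactly. The point is only the direction: a smaller confidence factor gives a larger (more pessimistic) estimate, and therefore more pruning.

```python
from math import sqrt
from statistics import NormalDist


def pessimistic_error(errors, n, confidence=0.25):
    """Upper confidence bound on the error rate of a leaf.

    errors: misclassified training instances at the leaf
    n: total training instances at the leaf
    confidence: the confidence factor (J48's -C), strictly between 0 and 1
    """
    f = errors / n  # observed error rate at the leaf
    z = NormalDist().inv_cdf(1.0 - confidence)  # one-sided normal quantile
    numerator = f + z * z / (2 * n) + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)


if __name__ == "__main__":
    # Hypothetical leaf: 2 errors out of 6 training instances.
    for c in (0.1, 0.25, 0.5):
        print(c, round(pessimistic_error(2, 6, c), 3))
```

For this hypothetical leaf the observed error rate is 0.33, while the estimate at -C 0.25 is about 0.47. Lowering the confidence factor pushes the estimate higher still.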
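Classifier evaluation
For several classifier evaluation methods, see
As shown here, there are several test options to choose from: Use training set, Supplied test set, Cross-validation, and Percentage split.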
k-fold Cross-validation in Weka
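As a rough illustration of what k-fold cross-validation does with the data, the sketch below splits instance indices into k folds and uses each fold once as the test set while the remaining folds form the training set. It is plain Python, not Weka code. The function name k_fold_splits and the instance count of 20 are made up for the example, and unlike Weka, which stratifies the folds when the class is nominal, this sketch only shuffles.

```python
import random


def k_fold_splits(n_instances, k=10, seed=1):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_instances:
        raise ValueError("k must be between 2 and the number of instances")
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # fixed seed for repeatable folds
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, test in enumerate(folds):
        # Everything outside the current fold is training data.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fold_no, (train, test) in enumerate(k_fold_splits(20, k=5), start=1):
        print(f"fold {fold_no}: {len(train)} training, {len(test)} test instances")
```

The classifier is trained and evaluated once per fold, and the reported figure is the performance aggregated over all k test folds.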
meta-classifier
Weka provides a set of meta-classifiers, which wrap an existing (base) classifier and add functionality on top of it
CVParameterSelection
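It uses cross-validation to select the best value for a parameter.
To use the J48 algorithm with CVParameterSelection,
first choose CVParameterSelection as the classifier, then select J48 in the classifier option of CVParameterSelection.
The output shows the value of C that the classifier selected, i.e. the optimal C value.
Here C = 0.2 is the optimal value.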
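The idea behind this kind of parameter selection can be sketched in a few lines: score every candidate value by cross-validation and keep the one with the best average accuracy. This is a generic illustration, not Weka's implementation. The names cv_accuracy, select_parameter and learn_knn, the toy 1-D nearest-neighbour learner standing in for J48, and the data are all invented for the example.

```python
def cv_accuracy(data, k, learn, param):
    """Mean accuracy of learn(train, param) over k folds of data."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        predict = learn(train, param)
        scores.append(sum(predict(x) == y for x, y in test) / len(test))
    return sum(scores) / k


def select_parameter(data, k, learn, candidates):
    """Return (best_param, scores), where best_param has the highest CV accuracy."""
    scores = {p: cv_accuracy(data, k, learn, p) for p in candidates}
    best = max(candidates, key=lambda p: scores[p])  # on ties, the first candidate wins
    return best, scores


def learn_knn(train, n_neighbors):
    """Toy stand-in learner: 1-D k-nearest-neighbours with majority vote."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:n_neighbors]
        labels = [y for _, y in nearest]
        return max(sorted(set(labels)), key=labels.count)
    return predict


if __name__ == "__main__":
    # Made-up data: the label is 1 when x >= 10, with two mislabeled points as noise.
    data = [(x, int(x >= 10)) for x in range(20)]
    data[3] = (3, 1)
    data[16] = (16, 0)
    best, scores = select_parameter(data, 5, learn_knn, [1, 3, 5])
    print("selected:", best, "scores:", scores)
```

In the Weka run described above, the candidates are values of J48's C and the base learner is J48. The selected value is then what appears in the output.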
Weka Knowledge Flow