Code example
package test;

import java.io.File;

import weka.classifiers.Classifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

public class WekaTest {

    public static void main(String[] args) throws Exception {
        Classifier classifier = new J48();

        // Training data file
        File inputFile = new File("D:/Program Files/Weka-3-6/data/cpu.with.vendor.arff");
        ArffLoader loader = new ArffLoader();
        loader.setFile(inputFile);
        // Load the training data
        Instances instancesTrain = loader.getDataSet();
        // Use the first attribute (index 0) as the class attribute
        instancesTrain.setClassIndex(0);

        // Train the classifier
        classifier.buildClassifier(instancesTrain);

        // Test data file (here the same file as the training data)
        inputFile = new File("D:/Program Files/Weka-3-6/data/cpu.with.vendor.arff");
        loader.setFile(inputFile);
        // Load the test data
        Instances instancesTest = loader.getDataSet();
        // Set the index of the class attribute (the first attribute has index 0);
        // instancesTest.numAttributes() returns the total number of attributes
        instancesTest.setClassIndex(0);

        // Number of test instances
        double sum = instancesTest.numInstances();
        double right = 0.0;
        // Check the prediction for every test instance
        for (int i = 0; i < sum; i++) {
            // Count the instance if the predicted class equals the labelled class
            // (the class column of the test data must hold the correct answers,
            // otherwise the result is meaningless)
            if (classifier.classifyInstance(instancesTest.instance(i))
                    == instancesTest.instance(i).classValue()) {
                right++;
            }
        }
        // The printed ratio is correct predictions / all instances
        System.out.println("J48 classification precision:" + (right / sum));
    }
}
Steps
- Create a new Java project and add a class named WekaTest.
- Add weka.jar to the project's build path (it is in the Weka installation directory: D:\Program Files\Weka-3-6\weka.jar).
Problem
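The calls run without errors, but the result differs from the one I get in Weka itself. Both outputs are posted below; I would appreciate an explanation from anyone who understands why.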
Program output:
J48 classification precision:0.8373205741626795
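The value printed by the program is the share of test instances whose predicted class equals the labelled class, which is normally called accuracy rather than precision. The sketch below reproduces that counting loop in plain Java without Weka. The class name AccuracySketch, the method accuracy, and the sample arrays are hypothetical and only illustrate the calculation. The printed value is consistent with 175 correct predictions out of 209, which assumes cpu.with.vendor.arff holds 209 instances (the post does not say so). Note also that the Weka run below reports the relation bank-data with 600 instances, while the code loads cpu.with.vendor.arff, so the two numbers would not be expected to agree unless both runs use the same file and class attribute.

```java
// Hypothetical helper, not part of Weka: mirrors the counting loop in WekaTest.
final class AccuracySketch {

    /** Returns the fraction of positions where predicted[i] equals actual[i]. */
    static double accuracy(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || actual.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int right = 0;
        for (int i = 0; i < actual.length; i++) {
            // Nominal class values are small indices stored as doubles, so == is safe here.
            if (predicted[i] == actual[i]) {
                right++;
            }
        }
        return (double) right / actual.length;
    }

    public static void main(String[] args) {
        // Made-up class indices, for illustration only.
        double[] predicted = {0, 1, 1, 2, 0};
        double[] actual    = {0, 1, 2, 2, 0};
        System.out.println("accuracy = " + accuracy(predicted, actual));
        // Assumed counts (not stated in the post): 175 correct out of 209 instances.
        System.out.println("175 / 209 = " + (175.0 / 209.0));
    }
}
```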
Weka output:
=== Run information ===
Scheme:       weka.classifiers.trees.J48 -C 0.25 -M 2
Relation:     bank-data-weka.filters.unsupervised.attribute.Remove-R1
Instances:    600
Attributes:   11
              age
              sex
              region
              income
              married
              children
              car
              save_act
              current_act
              mortgage
              pep
Test mode:    evaluate on training data
=== Classifier model (full training set) ===
J48 pruned tree
------------------
children <= 1
| children <= 0
| | married = NO
| | | mortgage = NO: YES (48.0/3.0)
| | | mortgage = YES
| | | | save_act = NO: YES (12.0)
| | | | save_act = YES: NO (23.0)
| | married = YES
| | | save_act = NO
| | | | mortgage = NO
| | | | | income <= 21506.2
| | | | | | age <= 41: NO (11.0/1.0)
| | | | | | age > 41: YES (5.0/1.0)
| | | | | income > 21506.2: NO (20.0)
| | | | mortgage = YES: YES (25.0/3.0)
| | | save_act = YES: NO (119.0/12.0)
| children > 0
| | income <= 15538.8
| | | age <= 41: NO (22.0/2.0)
| | | age > 41: YES (2.0)
| | income > 15538.8: YES (111.0/5.0)
children > 1
| income <= 30404.3: NO (124.0/12.0)
| income > 30404.3
| | children <= 2: YES (51.0/5.0)
| | children > 2
| | | income <= 44288.3: NO (19.0/2.0)
| | | income > 44288.3: YES (8.0)
Number of Leaves : 15
Size of the tree : 29
Time taken to build model: 0.01 seconds
=== Evaluation on training set ===
=== Summary ===
Correctly Classified Instances         554               92.3333 %
Incorrectly Classified Instances        46                7.6667 %
Kappa statistic                          0.845
K&B Relative Info Score              45010.1705 %
K&B Information Score                  447.6762 bits      0.7461 bits/instance
Class complexity | order 0             596.7451 bits      0.9946 bits/instance
Class complexity | scheme              222.7757 bits      0.3713 bits/instance
Complexity improvement (Sf)            373.9693 bits      0.6233 bits/instance
Mean absolute error                      0.1389
Root mean squared error                  0.2636
Relative absolute error                 27.9979 %
Root relative squared error             52.9137 %
Total Number of Instances              600
=== Detailed Accuracy By Class ===
               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
                 0.894     0.052       0.935    0.894       0.914      0.936   YES
                 0.948     0.106       0.914    0.948       0.931      0.936   NO
Weighted Avg.    0.923     0.081       0.924    0.923       0.923      0.936
=== Confusion Matrix ===
   a   b   <-- classified as
 245  29 |   a = YES
  17 309 |   b = NO
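As a cross-check on the Weka output above, the headline figures can be recomputed from the confusion matrix alone. The sketch below does this in plain Java. The class ConfusionMatrixSketch and its methods are hypothetical helpers, not Weka API, and it assumes rows are the actual classes and columns the predicted classes, as the "classified as" header suggests. It derives accuracy (554 of 600), per-class recall and precision, and Cohen's kappa.

```java
// Hypothetical helper, not part of Weka: recomputes summary figures
// from a confusion matrix (rows = actual class, columns = predicted class).
final class ConfusionMatrixSketch {

    /** Fraction of instances on the diagonal, i.e. correctly classified. */
    static double accuracy(int[][] m) {
        int correct = 0;
        int total = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                total += m[i][j];
                if (i == j) {
                    correct += m[i][j];
                }
            }
        }
        return (double) correct / total;
    }

    /** Recall (TP rate) of class c: diagonal cell divided by its row sum. */
    static double recall(int[][] m, int c) {
        int rowSum = 0;
        for (int j = 0; j < m[c].length; j++) {
            rowSum += m[c][j];
        }
        return (double) m[c][c] / rowSum;
    }

    /** Precision of class c: diagonal cell divided by its column sum. */
    static double precision(int[][] m, int c) {
        int colSum = 0;
        for (int i = 0; i < m.length; i++) {
            colSum += m[i][c];
        }
        return (double) m[c][c] / colSum;
    }

    /** Cohen's kappa: (observed - chance agreement) / (1 - chance agreement). */
    static double kappa(int[][] m) {
        int n = m.length;
        double total = 0;
        double[] rowSum = new double[n];
        double[] colSum = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                rowSum[i] += m[i][j];
                colSum[j] += m[i][j];
                total += m[i][j];
            }
        }
        double chance = 0;
        for (int k = 0; k < n; k++) {
            chance += (rowSum[k] / total) * (colSum[k] / total);
        }
        double observed = accuracy(m);
        return (observed - chance) / (1 - chance);
    }

    public static void main(String[] args) {
        // Confusion matrix copied from the Weka output: a = YES, b = NO.
        int[][] m = {{245, 29}, {17, 309}};
        System.out.println(String.format("accuracy      = %.4f", accuracy(m)));
        System.out.println(String.format("recall(YES)   = %.3f", recall(m, 0)));
        System.out.println(String.format("precision(YES)= %.3f", precision(m, 0)));
        System.out.println(String.format("recall(NO)    = %.3f", recall(m, 1)));
        System.out.println(String.format("precision(NO) = %.3f", precision(m, 1)));
        System.out.println(String.format("kappa         = %.3f", kappa(m)));
    }
}
```

Because the Test mode line says "evaluate on training data", these figures describe how well the tree fits the 600 instances it was built from, not performance on unseen data.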
Source: http://blog.csdn.net/felomeng/article/details/4688257#comments