Data Mining: A Source-Code Analysis of the Apriori Association-Rule Algorithm in Weka

Compared with machine learning in general, the Apriori algorithm for association rules belongs more to data mining.

1) The test code calls Weka's Apriori association-rule algorithm as follows

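import java.io.File;

import weka.associations.Apriori;
import weka.core.Instances;
import weka.core.converters.ArffLoader;
import weka.filters.Filter;
// Assumption: the unsupervised Discretize filter is the one intended here.
import weka.filters.unsupervised.attribute.Discretize;

try {
    // Load the ARFF file into an Instances object.
    File file = new File("F:\\tools/lib/data/contact-lenses.arff");
    ArffLoader loader = new ArffLoader();
    loader.setFile(file);
    Instances m_instances = loader.getDataSet();

    // Discretize numeric attributes; Apriori works on nominal data.
    Discretize discretize = new Discretize();
    discretize.setInputFormat(m_instances);
    m_instances = Filter.useFilter(m_instances, discretize);

    // Build the associator and print its textual summary.
    Apriori apriori = new Apriori();
    apriori.buildAssociations(m_instances);
    System.out.println(apriori.toString());
} catch (Exception e) {
    e.printStackTrace();
}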

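Steps

1 Read the data set and load it as an Instances sample set

2 Discretize the attributes with Discretize

3 Create the Apriori association-rule model

4 Output the large (frequent) item sets and the set of association rules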
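To make step 4 more concrete, here is a minimal sketch of the two quantities that frequent item sets and association rules are built on: the support of an item set, and the confidence of a rule A => B. It is plain Java and does not use Weka. The class name SupportConfidenceDemo and the four-transaction data set are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration of support and confidence; this is not Weka code.
class SupportConfidenceDemo {

    // Fraction of transactions that contain every item in `items`.
    static double support(List<Set<String>> transactions, Set<String> items) {
        int hits = 0;
        for (Set<String> t : transactions) {
            if (t.containsAll(items)) {
                hits++;
            }
        }
        return (double) hits / transactions.size();
    }

    // confidence(A => B) = support(A union B) / support(A).
    static double confidence(List<Set<String>> transactions,
                             Set<String> premise, Set<String> consequence) {
        Set<String> both = new HashSet<String>(premise);
        both.addAll(consequence);
        return support(transactions, both) / support(transactions, premise);
    }

    // Hypothetical data set, used only for this example.
    static List<Set<String>> sampleTransactions() {
        List<Set<String>> data = new ArrayList<Set<String>>();
        data.add(new HashSet<String>(Arrays.asList("bread", "milk")));
        data.add(new HashSet<String>(Arrays.asList("bread", "butter")));
        data.add(new HashSet<String>(Arrays.asList("bread", "milk", "butter")));
        data.add(new HashSet<String>(Arrays.asList("milk")));
        return data;
    }

    public static void main(String[] args) {
        List<Set<String>> data = sampleTransactions();
        Set<String> bread = new HashSet<String>(Arrays.asList("bread"));
        Set<String> milk = new HashSet<String>(Arrays.asList("milk"));
        System.out.println("support({bread}) = " + support(data, bread));
        System.out.println("confidence(bread => milk) = " + confidence(data, bread, milk));
    }
}
```

An item set counts as frequent when its support reaches the minimum support, and a rule is kept when its metric (confidence in this sketch) reaches the minimum metric value.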

2) When the associator is created, the method that sets the default parameters is called

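public void resetOptions() {
    m_removeMissingCols = false;   // keep columns even if all their values are missing
    m_verbose = false;             // no verbose output
    m_delta = 0.05;                // step by which the minimum support is decreased
    m_minMetric = 0.90;            // minimum value of the rule metric (e.g. confidence)
    m_numRules = 10;               // number of rules to find
    m_lowerBoundMinSupport = 0.1;  // lower bound for the minimum support
    m_upperBoundMinSupport = 1.0;  // upper bound for the minimum support
    m_significanceLevel = -1;      // -1: no significance test
    m_outputItemSets = false;      // do not print the item sets
    m_car = false;                 // mine general rules, not class association rules
    m_classIndex = -1;             // -1: use the last attribute as the class
}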

For a detailed explanation of the parameters, see Note 1 later on.
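As a rough illustration of how m_delta, m_lowerBoundMinSupport, m_upperBoundMinSupport and m_numRules work together: Weka's documentation describes Apriori as iteratively reducing the minimum support until the required number of rules is found. The sketch below imitates that idea in plain Java. It assumes the search starts at the upper bound and steps down by delta; the exact starting value and stopping conditions in Weka's source may differ. The class MinSupportSearchDemo and the rulesFoundAt callback are hypothetical stand-ins for the real mining step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleToIntFunction;

// Simplified sketch of an iterative minimum-support search; this is not Weka's code.
class MinSupportSearchDemo {

    // Tolerance for comparing floating-point support values against the lower bound.
    static final double EPS = 1e-9;

    // Returns every minimum-support value that was tried, in order.
    // `rulesFoundAt` is a hypothetical stand-in for "mine rules at this support
    // and report how many satisfy the minimum metric".
    static List<Double> search(double upper, double lower, double delta,
                               int numRules, DoubleToIntFunction rulesFoundAt) {
        List<Double> tried = new ArrayList<Double>();
        // Steps are counted as integers so repeated subtraction cannot accumulate drift.
        for (int step = 0; ; step++) {
            double minSupport = upper - step * delta;
            if (minSupport < lower - EPS) {
                break; // lower bound reached without finding enough rules
            }
            tried.add(minSupport);
            if (rulesFoundAt.applyAsInt(minSupport) >= numRules) {
                break; // enough rules found at this support level
            }
        }
        return tried;
    }

    public static void main(String[] args) {
        // Hypothetical rule counter: pretend 10 rules appear once support drops below 0.825.
        List<Double> tried = search(1.0, 0.1, 0.05, 10, s -> s < 0.825 ? 10 : 0);
        System.out.println("Support values tried: " + tried);
    }
}
```

With the defaults shown in resetOptions, a smaller m_delta gives a finer search at the cost of more cycles, and m_lowerBoundMinSupport caps how far the support may fall.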

3) Analysis of the buildAssociations method; the source is as follows

public void buildAssociations(Instances instances) throws Exception {
    double[] confidences, supports;
    int[] indices;
    FastVector[] sortedRuleSet;
    int necSupport = 0;

    instances = new Instances(instances);
    if (m_removeMissingCols) {
        instances = removeMissingColumns(instances);
    }
    if (m_car && m_metricType != CONFIDENCE)
        throw new Exception("For CAR-Mining metric type has to be confidence!");

    // only set class index if CAR is requested
    if (m_car) {
        if (m_classIndex == -1) {
            instances.setClassIndex(instances.numAttributes() - 1);
        } else if (m_classIndex <= instances.numAttributes() && m_classIndex > 0) {
            instances.setClassIndex(m_classIndex - 1);
        } else {
            throw new Exception("Invalid class index.");
        }
    }

    // can associator handle the data?
    getCapabilities().testWithFail(instances);

    m_cycles = 0;

    // make sure that the lower bound is equal to at least one instance
    double lowerBoundMinSupportToUse =
        (m_lowerBoundMinSupport * instances.numInstances() < 1.0)
            ? 1.0 / instances.numInstances()
            : m_lowerBoundMinSupport;

    if (m_car) {
        // m_instances does not contain the class attribute
        m_instances = LabeledItemSet.divide(instances, false);
        // m_onlyClass contains only the class attribute
        m_onlyClass = LabeledItemSet.divide(instances, true);
    } else
        m_instances = instances;

    if (m_car && m_numRules == Integer.MAX_VALUE) {
        // Set desired minimum support
        m_minSupport = lowerBoundMinSupportToUse;
    } else {
        // Decrease minimum support until desired number of rules foun
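Two small decisions in the opening part of buildAssociations are easy to miss: the class index option is 1-based (with -1 meaning the last attribute), and the lower bound of the minimum support is raised so that it always corresponds to at least one instance. The sketch below restates both checks as standalone helpers so they can be tried in isolation. The class and method names are made up for this illustration, and it throws IllegalArgumentException where the listing throws a plain Exception.

```java
// Restates two checks from the buildAssociations listing as standalone helpers.
// This is an illustration, not Weka code.
class AprioriPreambleDemo {

    // Converts the 1-based class index option to a 0-based attribute index.
    // -1 selects the last attribute; anything outside 1..numAttributes is rejected.
    static int resolveClassIndex(int classIndexOption, int numAttributes) {
        if (classIndexOption == -1) {
            return numAttributes - 1;
        }
        if (classIndexOption > 0 && classIndexOption <= numAttributes) {
            return classIndexOption - 1;
        }
        // The listing throws a checked Exception here; an unchecked one keeps the sketch short.
        throw new IllegalArgumentException("Invalid class index.");
    }

    // If lowerBound * numInstances is below 1, use 1 / numInstances instead,
    // so the bound is never smaller than a single instance.
    static double lowerBoundToUse(double lowerBound, int numInstances) {
        return (lowerBound * numInstances < 1.0) ? 1.0 / numInstances : lowerBound;
    }

    public static void main(String[] args) {
        System.out.println("class index for option -1, 5 attributes: " + resolveClassIndex(-1, 5));
        System.out.println("class index for option 3, 5 attributes: " + resolveClassIndex(3, 5));
        System.out.println("lower bound for 24 instances: " + lowerBoundToUse(0.1, 24));
        System.out.println("lower bound for 5 instances: " + lowerBoundToUse(0.1, 5));
    }
}
```

For a data set with only 5 instances, for example, a lower bound of 0.1 would correspond to half an instance, so the helper returns 1/5 instead.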
