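Association rules
Mining goal
Discover relationship patterns among products (which items tend to be purchased together)
Metrics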
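Support: support(X) = P(X), the share of transactions that contain X
Confidence: confidence(X -> Y) = support(X,Y)/support(X), where support(X,Y) is the share of transactions that contain both X and Y
Lift: lift(X -> Y) = confidence(X -> Y)/support(Y)
Association rule: a rule X -> Y that meets both the minimum support threshold and the minimum confidence threshold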
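The three metrics can be checked by hand on a small data set. The sketch below uses base R only. The toy transactions and the helper names `support`, `confidence` and `lift` are made up for illustration and are not part of any package.

```r
# Toy transactions (hypothetical data, for illustration only)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# support(items): share of transactions that contain every item in `items`
support <- function(items) {
  mean(sapply(transactions, function(tr) all(items %in% tr)))
}

# confidence(X -> Y) = support(X,Y) / support(X)
confidence <- function(x, y) support(c(x, y)) / support(x)

# lift(X -> Y) = confidence(X -> Y) / support(Y)
lift <- function(x, y) confidence(x, y) / support(y)

s_xy <- support(c("bread", "milk"))   # 3 of 5 transactions
conf <- confidence("bread", "milk")   # (3/5) / (4/5)
lft  <- lift("bread", "milk")         # conf / (4/5)

print(c(support = s_xy, confidence = conf, lift = lft))
```

Here the lift of bread -> milk comes out slightly below 1, so in this toy data buying bread does not make milk more likely than its baseline frequency.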
Apriori algorithm
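A rough sketch of the level-wise idea behind Apriori, written in base R on toy data. It is not the arules implementation. The function names `frequent_itemsets` and `all_subsets_frequent`, the data and the 0.3 threshold are all assumptions chosen for illustration. The key point is that a k-itemset can only be frequent if all of its (k-1)-subsets are frequent, so candidates are built only from the previous level.

```r
# Toy transactions (hypothetical data)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# Share of transactions that contain every item of `items`
support <- function(items) {
  mean(sapply(transactions, function(tr) all(items %in% tr)))
}

# TRUE if every subset of `cand` with one item removed is in `prev`
all_subsets_frequent <- function(cand, prev) {
  keys <- sapply(prev, paste, collapse = "|")
  all(sapply(seq_along(cand), function(i) {
    paste(cand[-i], collapse = "|") %in% keys
  }))
}

frequent_itemsets <- function(min_support) {
  items <- sort(unique(unlist(transactions)))
  # Level 1: frequent single items
  current <- Filter(function(s) support(s) >= min_support, as.list(items))
  result <- current
  while (length(current) > 1) {
    k <- length(current[[1]]) + 1
    # Join step: unions of two frequent itemsets that have exactly k items
    candidates <- list()
    for (i in seq_along(current)) {
      for (j in seq_along(current)) {
        if (i < j) {
          cand <- sort(union(current[[i]], current[[j]]))
          if (length(cand) == k) candidates <- c(candidates, list(cand))
        }
      }
    }
    candidates <- unique(candidates)
    # Prune step: drop candidates that have an infrequent (k-1)-subset
    candidates <- Filter(function(s) all_subsets_frequent(s, current), candidates)
    # Count step: keep candidates that reach the support threshold
    current <- Filter(function(s) support(s) >= min_support, candidates)
    result <- c(result, current)
  }
  result
}

freq <- frequent_itemsets(min_support = 0.3)
labels <- sapply(freq, paste, collapse = ",")
print(labels)
```

Rules are then generated from each frequent itemset by splitting it into X and Y and keeping the splits whose confidence reaches the minimum confidence threshold.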
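R functions
arules package
read.transactions: read a transaction file into a transactions object
apriori: mine association rules with the Apriori algorithm
inspect: print transactions or rules in readable form
R example: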
# install.packages('arules')
> library(arules)
> gd = read.transactions('e:/groceries.csv',sep = ',')
> # data("Groceries")
> # inspect(Groceries)
> summary(gd)
> inspect(gd)
> inspect(gd[1:5])
> itemFrequency(gd)
> itemFrequencyPlot(gd,support = 0.1) # plot items with support of at least 0.1
> itemFrequencyPlot(gd,topN = 30) # plot the 30 most frequent items
> myrules = apriori(data = gd,parameter = list(support = 0.01,confidence = 0.3,minlen = 1))
> inspect(myrules)
> inspect(sort(myrules,by = 'lift'))
> summary(myrules)
> write(myrules,file = 'e:/grules.txt',sep = ',',col.names = NA) # write the generated rules to a file
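As a possible follow-up, the rules can be filtered before they are inspected or exported. The sketch below assumes the `Groceries` data set bundled with arules instead of the CSV file above. The lift cutoff of 2 and `minlen = 2` (which excludes rules with an empty left-hand side) are arbitrary choices for illustration.

```r
library(arules)

data("Groceries")  # sample transactions shipped with arules

rules <- apriori(Groceries,
                 parameter = list(support = 0.01, confidence = 0.3, minlen = 2))

# Keep only rules whose lift exceeds 2, sorted with the highest lift first
strong <- sort(subset(rules, subset = lift > 2), by = "lift")
inspect(head(strong, 5))

# Convert to a data frame for further processing in base R
strong_df <- as(strong, "data.frame")
```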