Apriori Algorithm

Overview


  • Apriori is an algorithm for mining frequent item sets and association rules in transactional databases.
  • Apriori uses breadth-first search and a hash tree structure to count candidate item sets efficiently.
  • If {X} is a frequent item set, then all subsets of {X} are also frequent item sets.
  • If {X} is an infrequent item set, then all supersets of {X} are also infrequent item sets.
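
The last two points say that support can only shrink as an item set grows. A minimal sketch of that property, assuming a hypothetical helper `support` and made-up toy transactions (neither comes from the original article):

```python
from itertools import combinations

def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= t)

# Hypothetical toy transactions, for illustration only.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]

x = ("a", "b", "c")
# Every non-empty proper subset of x occurs at least as often as x itself,
# so a frequent x implies frequent subsets, and an infrequent subset
# implies an infrequent x.
for size in range(1, len(x)):
    for sub in combinations(x, size):
        assert support(sub, transactions) >= support(x, transactions)
```

This is the property that lets Apriori discard a candidate as soon as one of its subsets is known to be infrequent.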

Preliminary


Hash Tree Structure

Pseudo Code


Pseudo Code of Apriori Algorithm

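  • $T$ is a transactional database and $\epsilon$ is the support threshold, that is, the minimum number of occurrences an item set needs in order to be called a frequent item set.
  • $L_1 \leftarrow \{\text{large 1-itemsets}\}$ is the collection of frequent item sets that contain exactly one item.
  • $C_k \leftarrow \{a \cup \{b\} \mid a \in L_{k-1} \land b \notin a\} - \{c \mid \{s \mid s \subseteq c \land |s| = k-1\} \nsubseteq L_{k-1}\}$ is the candidate set for level $k$. The first term, $\{a \cup \{b\} \mid a \in L_{k-1} \land b \notin a\}$, extends each frequent item set $a$ in $L_{k-1}$ with one item $b$ that $a$ does not contain. The second term, $\{c \mid \{s \mid s \subseteq c \land |s| = k-1\} \nsubseteq L_{k-1}\}$, collects every candidate $c$ that has some $(k-1)$-item subset $s$ which is not in $L_{k-1}$. These candidates are subtracted because they cannot be frequent, which follows from the fourth point of the Overview.
  • $C_t \leftarrow \{c \mid c \in C_k \land c \subseteq t\}$, for transactions $t \in T$, is the set of candidates that actually occur in transaction $t$.
  • The remaining steps count the occurrences of each candidate, keep those that reach $\epsilon$ as $L_k$, and repeat for the next $k$ until no frequent item sets are left.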
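
The definitions above can be put together roughly as follows. This is only a sketch under stated assumptions: the function name `apriori` and its return shape are illustrative, and candidates are counted with a plain dictionary rather than the hash tree mentioned in the Overview.

```python
from itertools import combinations

def apriori(transactions, epsilon):
    """Return {itemset: support} for all item sets with support >= epsilon."""
    transactions = [frozenset(t) for t in transactions]

    # L1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {c: n for c, n in counts.items() if n >= epsilon}
    frequent = dict(level)

    k = 2
    while level:
        items = set().union(*level)
        # Join: extend each frequent (k-1)-set a by one item b not in a.
        candidates = {a | {b} for a in level for b in items if b not in a}
        # Prune: drop c if any of its (k-1)-item subsets is not frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # Count: each transaction t contributes to the candidates it contains.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= epsilon}
        frequent.update(level)
        k += 1
    return frequent
```

Checking every candidate against every transaction is the simplest possible counting strategy; a hash tree serves the same purpose while avoiding most of those comparisons.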

Example


Itemsets
{1,2,3,4}
{1,2,4}
{1,2}
{2,3,4}
{2,3}
{3,4}
{2,4}

Step 1

$L_1 \leftarrow \{\text{large 1-itemsets}\}$, so we can obtain the following table:

Item    Support
{1}     3
{2}     6
{3}     4
{4}     5
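
The counts in this table can be reproduced with a few lines. The threshold `epsilon = 3` is an assumption: the article never states it, but it is the value consistent with the tables in the later steps.

```python
from collections import Counter

# The seven transactions from the Itemsets list.
transactions = [{1, 2, 3, 4}, {1, 2, 4}, {1, 2}, {2, 3, 4}, {2, 3}, {3, 4}, {2, 4}]
epsilon = 3  # assumed threshold, not stated in the article

# Count how many transactions contain each single item.
item_counts = Counter(item for t in transactions for item in t)

# L1 keeps the single-item sets that reach the threshold.
L1 = {frozenset([i]): n for i, n in item_counts.items() if n >= epsilon}

for item in sorted(item_counts):
    print({item}, item_counts[item])
# {1} 3
# {2} 6
# {3} 4
# {4} 5
```

With this threshold all four items are frequent, so nothing is discarded at level 1.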

Step 2

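$C_k \leftarrow \{a \cup \{b\} \mid a \in L_{k-1} \land b \notin a\} - \{c \mid \{s \mid s \subseteq c \land |s| = k-1\} \nsubseteq L_{k-1}\}$, with $k = 2$

Item     Support
{1,2}    3
{1,3}    1
{1,4}    2
{2,3}    3
{2,4}    4
{3,4}    3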

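Keeping the candidates whose support is at least 3 (the threshold implied by these tables), the frequent item sets $L_k$ for $k = 2$ are:

Item     Support
{1,2}    3
{2,3}    3
{2,4}    4
{3,4}    3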

The pruned sets are listed below. Any larger set that contains one of them cannot be frequent:

Item     Support
{1,3}    1
{1,4}    2
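
A short sketch of this step: build every pair from the frequent single items, count it, and split the result into frequent and pruned sets. The names `C2`, `L2`, and `pruned` are illustrative, and `epsilon = 3` is again an assumption read off the tables.

```python
from itertools import combinations

transactions = [{1, 2, 3, 4}, {1, 2, 4}, {1, 2}, {2, 3, 4}, {2, 3}, {3, 4}, {2, 4}]
epsilon = 3                   # assumed threshold, not stated in the article
frequent_items = [1, 2, 3, 4]  # the single items found frequent in Step 1

# Candidates for k = 2: all pairs of frequent single items.
C2 = [frozenset(pair) for pair in combinations(frequent_items, 2)]

# Support of a candidate = number of transactions that contain it.
support = {c: sum(1 for t in transactions if c <= t) for c in C2}

L2 = {c: n for c, n in support.items() if n >= epsilon}     # frequent pairs
pruned = {c: n for c, n in support.items() if n < epsilon}  # infrequent pairs
```

Here `pruned` holds {1,3} and {1,4}, which is the information the next step uses to reject larger candidates.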

Step 3

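$C_k \leftarrow \{a \cup \{b\} \mid a \in L_{k-1} \land b \notin a\} - \{c \mid \{s \mid s \subseteq c \land |s| = k-1\} \nsubseteq L_{k-1}\}$, with $k = 3$

First, $\{a \cup \{b\} \mid a \in L_{k-1} \land b \notin a\}$ is: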

Item
{1,2,3}
{2,3,4}
{1,2,4}
{1,3,4}

Next, we remove the candidates that contain any of the pruned sets, which leaves:

Item       Support
{2,3,4}    2

Finally, the support of {2,3,4} is below the threshold, so there is no frequent item set in $L_k$ for $k = 3$, and the calculation is over.
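
The join and prune for $k = 3$ can be traced with the sketch below. It applies the candidate formula literally, so the join only produces 3-item sets. The variable names are illustrative and `epsilon = 3` is an assumed threshold consistent with the tables.

```python
from itertools import combinations

transactions = [{1, 2, 3, 4}, {1, 2, 4}, {1, 2}, {2, 3, 4}, {2, 3}, {3, 4}, {2, 4}]
epsilon = 3  # assumed threshold, not stated in the article

# Frequent pairs from Step 2.
L2 = {frozenset(s) for s in [(1, 2), (2, 3), (2, 4), (3, 4)]}

# Join: extend each frequent pair a with one item b that it does not contain.
items = set().union(*L2)
joined = {a | {b} for a in L2 for b in items if b not in a}

# Prune: keep a candidate only if all of its 2-item subsets are frequent.
C3 = {c for c in joined if all(frozenset(s) in L2 for s in combinations(c, 2))}

# Count the surviving candidates and apply the threshold.
support = {c: sum(1 for t in transactions if c <= t) for c in C3}
L3 = {c for c, n in support.items() if n >= epsilon}
```

Only {2,3,4} survives the prune, its support is 2, and `L3` ends up empty, which is the stopping condition of the algorithm.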
