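Overview
An efficient pure-Python implementation of the Apriori algorithm. Created for Python 3.6 and Python 3.7.
The Apriori algorithm uncovers hidden structure in categorical data. The classic example is a database of purchases from a supermarket. Each purchase has a number of items associated with it. We would like to discover association rules such as {bread, eggs} -> {bacon} from the data. This is the goal of association rule learning, and the Apriori algorithm is arguably the most famous algorithm for this problem.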
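To make the idea concrete, here is a minimal sketch of the level-wise search that Apriori is known for, written with the standard library only. It is not the efficient-apriori implementation: the function name `frequent_itemsets` and the toy `baskets` data are assumptions made up for illustration.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Hypothetical helper: return {itemset: support} for all frequent itemsets."""
    rows = [frozenset(t) for t in transactions]
    n = len(rows)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= row for row in rows) / n

    # Level 1: frequent single items.
    items = {item for row in rows for item in row}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return result

# Made-up toy data, not taken from the package documentation.
baskets = [("bread", "eggs", "bacon"),
           ("bread", "eggs", "bacon"),
           ("bread", "milk"),
           ("eggs", "bacon")]
found = frequent_itemsets(baskets, min_support=0.5)

# Confidence of {bread, eggs} -> {bacon} = support(lhs and rhs) / support(lhs).
lhs = frozenset({"bread", "eggs"})
full = frozenset({"bread", "eggs", "bacon"})
confidence = found[full] / found[lhs]
print(len(found), confidence)
```

The prune step is what gives the algorithm its name: an itemset can only be frequent if all of its subsets are frequent, so most candidates can be discarded before their support is counted.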
Installation
The package is distributed on PyPI. Run the following command from your terminal to install it.
pip install efficient-apriori
Importing the efficient-apriori package
import efficient_apriori
from efficient_apriori import apriori
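Example
Below is an example. Note that in every transaction containing eggs, bacon is also present. Therefore, the rule {eggs} -> {bacon} is returned with 100% confidence. The same holds for soup, so {soup} -> {bacon} is returned as well.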
transactions = [('eggs', 'bacon', 'soup'),
                ('eggs', 'bacon', 'apple'),
                ('soup', 'bacon', 'banana')]
itemsets, rules = apriori(transactions, min_support=0.5, min_confidence=1)
print(rules) # [{eggs} -> {bacon}, {soup} -> {bacon}]
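The 100% confidence can be checked by hand. The sketch below recounts the same three transactions with plain Python; the helpers `count` and `confidence` are hypothetical names introduced here and are not part of the efficient-apriori API.

```python
transactions = [('eggs', 'bacon', 'soup'),
                ('eggs', 'bacon', 'apple'),
                ('soup', 'bacon', 'banana')]

def count(items):
    """Number of transactions that contain every item in `items`."""
    return sum(set(items) <= set(t) for t in transactions)

def confidence(lhs, rhs):
    """Share of transactions containing `lhs` that also contain `rhs`."""
    return count(lhs + rhs) / count(lhs)

conf_eggs_bacon = confidence(('eggs',), ('bacon',))  # 2 of 2 egg transactions
conf_bacon_eggs = confidence(('bacon',), ('eggs',))  # 2 of 3 bacon transactions
print(conf_eggs_bacon, conf_bacon_eggs)
```

The reverse rule {bacon} -> {eggs} holds in only two of the three bacon transactions, which is why it does not appear when min_confidence=1.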
Filtering and sorting association rules
from efficient_apriori import apriori
transactions = [('eggs', 'bacon', 'soup'),
                ('eggs', 'bacon', 'apple'),
                ('soup', 'bacon', 'banana')]
itemsets, rules = apriori(transactions, min_support=0.2, min_confidence=1)
# Print out every rule with 2 items on the left hand side,
# 1 item on the right hand side, sorted by lift
rules_rhs = filter(lambda rule: len(rule.lhs) == 2 and len(rule.rhs) == 1, rules)
for rule in sorted(rules_rhs, key=lambda rule: rule.lift):
    print(rule)  # Prints the rule and its confidence, support, lift, ...
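Since the rules are sorted by lift, it may help to see how lift is derived. The following sketch uses the standard textbook definition, support(lhs and rhs) / (support(lhs) * support(rhs)), computed from raw counts on the same transactions. The `count` and `lift` helpers are hypothetical and stand in for the library's own computation, which is assumed to follow the same definition.

```python
transactions = [('eggs', 'bacon', 'soup'),
                ('eggs', 'bacon', 'apple'),
                ('soup', 'bacon', 'banana')]
n = len(transactions)

def count(items):
    """Number of transactions that contain every item in `items`."""
    return sum(set(items) <= set(t) for t in transactions)

def lift(lhs, rhs):
    """Observed co-occurrence divided by what independence would predict."""
    # Written with counts rather than fractions to limit rounding error.
    return count(lhs + rhs) * n / (count(lhs) * count(rhs))

lift_apple_bacon_eggs = lift(('apple', 'bacon'), ('eggs',))
lift_eggs_soup_bacon = lift(('eggs', 'soup'), ('bacon',))
print(lift_apple_bacon_eggs, lift_eggs_soup_bacon)
```

A lift above 1 means the right-hand side shows up more often together with the left-hand side than it does overall. Bacon is in every transaction, so any rule that ends in {bacon} has a lift of exactly 1. Passing reverse=True to sorted would list the rules with the highest lift first.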