6. Association Analysis
Given a set of records, each of which contains some number of items from a given collection, produce dependency rules that predict the occurrence of an item based on the occurrences of other items.
Application areas: marketing and sales promotion, content-based recommendation, customer loyalty programs.
Initially used for market basket analysis, to find how items purchased by customers are related. Later extended to more complex data structures: sequential patterns and subgraph patterns.
6.1 Simple Approach: Pearson’s correlation coefficient
Correlation does not imply causation.
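A minimal sketch of this simple approach: encode each item as a 0/1 indicator per transaction and compute Pearson's correlation coefficient between two items. The toy baskets and the helper names (pearson, indicator) are illustrative assumptions, not part of the original notes.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length numeric sequences.

    Assumes both sequences have non-zero variance (otherwise this divides by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy market baskets (one set of items per transaction).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def indicator(item):
    """1 if the item appears in the transaction, else 0, for every transaction."""
    return [1 if item in t else 0 for t in transactions]

r_diapers_beer = pearson(indicator("diapers"), indicator("beer"))
print(f"corr(diapers, beer) = {r_diapers_beer:.3f}")
```

A positive coefficient only says the two items tend to appear together. It is symmetric, so it gives no direction for a rule and no evidence that buying one item causes the purchase of the other.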
6.2 Definitions
6.2.1 Frequent Itemset
6.2.2 Association Rule
6.2.3 Evaluation Metrics
6.3 Association Rule Mining Task
Given a set of transactions T, the goal of association rule mining is to find all rules having
– support ≥ minsup threshold
– confidence ≥ minconf threshold
minsup and minconf are provided by the user
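The two thresholds can be checked for a single rule X -> Y as sketched below. This assumes the usual definitions: support is the fraction of transactions containing every item of X and Y, and confidence is support(X and Y together) divided by support(X). The toy data, the function names, and the threshold values are illustrative assumptions.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(X and Y together) / support(X) for the rule X -> Y.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Hypothetical toy market baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

minsup, minconf = 0.4, 0.6  # user-provided thresholds (example values)

lhs, rhs = {"milk", "diapers"}, {"beer"}
sup = support(lhs | rhs, transactions)
conf = confidence(lhs, rhs, transactions)
rule_passes = sup >= minsup and conf >= minconf
print(f"support = {sup:.2f}, confidence = {conf:.2f}, passes = {rule_passes}")
```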
Brute-force approach
Step 1: List all possible association rules
Step 2: Compute the support and confidence for each rule
Step 3: Remove rules that fail the minsup and minconf thresholds
However, this is computationally prohibitive because the number of candidate rules grows exponentially with the number of items.
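The three brute-force steps can be sketched as follows, again assuming the usual definitions of support and confidence. The function name brute_force_rules, the toy data, and the thresholds are hypothetical; the point is to show how quickly the candidate count grows, not to suggest a practical miner.

```python
from itertools import combinations

def brute_force_rules(transactions, minsup, minconf):
    """Enumerate every rule X -> Y (X, Y non-empty and disjoint) and filter it.

    Returns (rules, candidates), where rules is a list of
    (antecedent, consequent, support, confidence) tuples.
    """
    items = sorted(set().union(*transactions))
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    rules = []
    candidates = 0
    # Step 1: every itemset of size >= 2, split in every way into X -> Y.
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            for k in range(1, size):
                for lhs in combinations(combo, k):
                    antecedent = frozenset(lhs)
                    consequent = itemset - antecedent
                    candidates += 1
                    # Step 2: compute support and confidence for the rule.
                    sup = support(itemset)
                    if sup == 0 or sup < minsup:
                        continue  # Step 3: fails minsup (or never occurs)
                    conf = sup / support(antecedent)
                    if conf >= minconf:  # Step 3: keep only if minconf holds
                        rules.append((antecedent, consequent, sup, conf))
    return rules, candidates

# Hypothetical toy market baskets over 6 distinct items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

rules, candidates = brute_force_rules(transactions, minsup=0.4, minconf=0.6)
print(f"{candidates} candidate rules examined, {len(rules)} kept")
```

For d distinct items the number of candidate rules is 3^d - 2^(d+1) + 1, so the 6 items here already give 602 candidates, each needing a pass over the transactions. This sketch also recomputes the support of the same itemset for every split, which a real implementation would avoid.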