Pattern Discovery Basic Concepts

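This article explains the basic concepts of pattern mining, including frequent patterns and association rules, and closed patterns and max-patterns. Pattern mining discovers inherent regularities in a data set and underlies many data mining tasks, such as market basket analysis and biological sequence analysis. A frequent pattern must meet a support threshold, while closed patterns and max-patterns are compressed representations of the frequent patterns. Association rule mining looks for rules that satisfy both a minimum support and a minimum confidence.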

@(Pattern Discovery in Data Mining)[Pattern Discovery]
This article introduces the basic concepts of pattern mining.

Pattern: A set of items, subsequences, or substructures that occur
frequently together (or strongly correlated) in a data set.

Motivation for pattern discovery in data:
* To find what customers may buy after purchasing one or more goods;
* To find which code segments are likely to contain copy-and-paste bugs;
* To find what kinds of events may follow the posting of some news;
* What products were often purchased together?
* What are the subsequent purchases after buying an iPad?
* What code segments likely contain copy-and-paste bugs?
* What word sequences likely form phrases in this corpus?
* …

In summary, pattern discovery is important for several reasons:
* Finding inherent regularities in a data set
* Foundation for many essential data mining tasks
    * Association, correlation, and causality analysis
    * Mining sequential, structural (e.g., sub-graph) patterns
    * Pattern analysis in spatiotemporal, multimedia, time-series, and stream data
    * Classification: Discriminative pattern-based analysis
    * Cluster analysis: Pattern-based subspace clustering
* Broad applications
    * Market basket analysis, cross-marketing, catalog design, sale campaign analysis, Web log analysis, biological sequence analysis

TODO: elaborate on the specific applications listed above

Frequent Pattern and Association Rule

Itemset: A set of one or more items
k-itemset: X=x1,...
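
To make the terms itemset, support, and confidence concrete, here is a minimal Python sketch. It assumes the usual definitions: the support of an itemset is the number (or fraction) of transactions that contain it, and the confidence of a rule A -> B is support(A ∪ B) / support(A). The transaction data and all function names are hypothetical, chosen only for illustration, and the brute-force enumeration is not an efficient mining algorithm.

```python
from itertools import combinations

# Hypothetical toy transactions (illustrative only, not from the article).
transactions = [
    {"beer", "nuts", "diaper"},
    {"beer", "coffee", "diaper"},
    {"beer", "diaper", "eggs"},
    {"nuts", "eggs", "milk"},
    {"nuts", "coffee", "diaper", "eggs", "milk"},
]


def support_count(itemset, transactions):
    """Absolute support: number of transactions containing every item."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def relative_support(itemset, transactions):
    """Relative support: fraction of transactions containing the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def frequent_k_itemsets(transactions, k, min_support):
    """Brute-force enumeration of all k-itemsets whose relative support
    is at least min_support. Fine for toy data, far too slow for real data."""
    items = sorted(set().union(*transactions))
    result = {}
    for candidate in combinations(items, k):
        s = relative_support(candidate, transactions)
        if s >= min_support:
            result[frozenset(candidate)] = s
    return result


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent.
    Assumes the antecedent occurs in at least one transaction."""
    both = support_count(set(antecedent) | set(consequent), transactions)
    return both / support_count(antecedent, transactions)


if __name__ == "__main__":
    print(frequent_k_itemsets(transactions, 1, 0.5))
    print(frequent_k_itemsets(transactions, 2, 0.5))
    print(confidence({"beer"}, {"diaper"}, transactions))
    print(confidence({"diaper"}, {"beer"}, transactions))
```

With a minimum support of 0.5, an itemset must appear in at least three of the five transactions to count as frequent. A rule is then kept only if its confidence also meets a chosen minimum confidence threshold.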
