Introduction
@(Pattern Discovery in Data Mining)[Data Mining, Notes]
Notes on Jiawei Han's Pattern Discovery course
Why data mining?
Data explosion: abundant (but largely unstructured) data is everywhere.
We are drowning in data but starving for knowledge.
keyword: interdisciplinary
Data mining is the process of extracting knowledge from data.
Data Mining Process
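Q: Patterns in DM come in two kinds: those people infer from their own hypotheses, and those mined purely by machine. Under what circumstances is each kind more useful?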
Q: What is “Pattern”?
A: A set of items, subsequences, or substructures that occur frequently together (or are strongly correlated) in a data set.
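To make the definition concrete, here is a minimal sketch of counting itemsets that "occur frequently together". It is a brute-force enumeration, not an algorithm from the course; the toy `transactions`, the function name `frequent_itemsets`, and the parameters `min_support` and `max_size` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each transaction is the set of items bought together.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"milk", "eggs"},
]


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every itemset of size 1..max_size and keep those that
    appear in at least `min_support` transactions (brute force)."""
    counts = Counter()
    for t in transactions:
        items = sorted(t)  # sort so each itemset has one canonical tuple form
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


patterns = frequent_itemsets(transactions, min_support=3)
for itemset, count in sorted(patterns.items()):
    print(itemset, count)
```

With a support threshold of 3, the pairs `('bread', 'butter')` and `('bread', 'milk')` survive as frequent patterns, while `('butter', 'milk')` (2 transactions) does not. Enumerating all subsets this way is only workable for tiny data sets.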
Data Mining in different views
Data View: What kinds of data?
- Structured data (relational data, object-relational data, transaction data, data warehouse data)
- Unstructured data (text data, web data, stream data, social network data, information networks, multimedia data, time-series data, temporal data, sequence data)
Knowledge View
Methodology View
DM is a confluence of many different disciplines.
* Machine learning, statistics, and pattern recognition play a major role in DM.
* Application is the driver of DM.
* Algorithms, database technology, and distributed/cloud computing are also important methodologies in DM.
Application View
- Mining text data and mining the Web: Web page classification and ranking, Weblog analysis, recommender systems, …
- Mining business data: transaction data, market basket analysis, fraud detection, …
- Data mining and software/system engineering, e.g., mining software bugs
- Mining biological and medical data: gene, protein, microarray data, biological networks
- Mining social and information networks: community discovery, information propagation, …
- Invisible data mining
Reference Book
Download: please email to request the extraction code.
Textbooks on Data Mining:
Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann, 3rd ed., 2011
Mohammed J. Zaki and Wagner Meira, Jr., Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2014
Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining, 2nd ed., Wiley, 2014
Reference book on Pattern Discovery:
Charu Aggarwal and Jiawei Han (eds.), Frequent Pattern Mining, Springer, 2014