Introduction - Notes of Data Mining

Introduction

@(Pattern Discovery in Data Mining)[Data Mining, Notes]
Notes on Jiawei Han's Pattern Discovery course

Why data mining?

Data explosion: abundant (but unstructured) data is everywhere.
We are drowning in data but starving for knowledge.
Keyword: interdisciplinary

Data mining is the process of extracting knowledge from data.

Data Mining Process

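Q: Patterns in DM come in two kinds: those that people infer from their own hypotheses, and those mined purely by machine. Under what circumstances is each kind more useful?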

Q: What is a “pattern”?
A: A set of items, subsequences, or substructures that occur frequently together (or are strongly correlated) in a data set.
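To make the definition concrete, here is a minimal sketch that counts itemsets occurring together in a toy transaction list. The function name `frequent_itemsets`, the `min_support` threshold, and the sample baskets are all assumptions made for illustration; this is a brute-force enumeration, not the algorithms taught in the course.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: count} for itemsets of up to max_size items
    that appear in at least min_support transactions (brute force)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates within a basket
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical market-basket data, invented for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]

patterns = frequent_itemsets(transactions, min_support=3)
for itemset, count in sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a threshold of 3, every single item qualifies, but the only pair that does is {bread, milk}, so under this threshold it is the one "frequent pattern" of size two in the toy data.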

Data Mining in different views

Data View: What kinds of data?
  1. Structured data (relational data, object-relational data, transaction data, data warehouse data)
  2. Unstructured data (text data, web data, stream data, social network data, information networks, multimedia data, time-series data, temporal data, sequence data)
Knowledge View

Methodology View

DM is a confluence of many different disciplines.
* Machine learning, statistics, and pattern recognition play a major role in DM.
* Applications are the driver of DM.
* Algorithms, database technology, and distributed/cloud computing are also important methodologies in DM.

Application View
  • Mining text data and mining the Web
    Web page classification and ranking, Weblog analysis, recommender systems, …
  • Mining business data
    Transaction data, market basket analysis, fraud detection, …
  • Data mining and software/system engineering, e.g., mining software bugs
  • Mining biological and medical data
    Gene, protein, microarray data, biological networks
  • Mining social and information networks
    Community discovery, information propagation, …
  • Invisible data mining

Reference Book

Download: email me for the extraction code
Textbooks on data mining:

Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann, 3rd ed., 2011
Mohammed J. Zaki and Wagner Meira, Jr., Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2014
Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining, 2nd ed., Wiley, 2014

Reference book on Pattern Discovery:

Charu Aggarwal and Jiawei Han (eds.), Frequent Pattern Mining, Springer, 2014
