Data Mining - Concepts and Techniques.pdf
Chapter 1 Introduction
1.1 What Motivated Data Mining? Why Is It Important?
1.2 So, What Is Data Mining?
1.3 Data Mining—On What Kind of Data?
1.3.1 Relational Databases
1.3.2 Data Warehouses
1.3.3 Transactional Databases
1.3.4 Advanced Data and Information Systems and Advanced Applications
1.4 Data Mining Functionalities—What Kinds of Patterns Can Be Mined?
1.4.1 Concept/Class Description: Characterization and Discrimination
1.4.2 Mining Frequent Patterns, Associations, and Correlations
1.4.3 Classification and Prediction
1.4.4 Cluster Analysis
1.4.5 Outlier Analysis
1.4.6 Evolution Analysis
1.5 Are All of the Patterns Interesting?
1.6 Classification of Data Mining Systems
1.7 Data Mining Task Primitives
1.8 Integration of a Data Mining System with a Database or Data Warehouse System
1.9 Major Issues in Data Mining
1.10 Summary
Exercises
Bibliographic Notes
Chapter 2 Data Preprocessing
2.1 Why Preprocess the Data?
2.2 Descriptive Data Summarization
2.2.1 Measuring the Central Tendency
2.2.2 Measuring the Dispersion of Data
2.2.3 Graphic Displays of Basic Descriptive Data Summaries
2.3 Data Cleaning
2.3.1 Missing Values
2.3.2 Noisy Data
2.3.3 Data Cleaning as a Process
2.4 Data Integration and Transformation
2.4.1 Data Integration
2.4.2 Data Transformation
2.5 Data Reduction
2.5.1 Data Cube Aggregation
2.5.2 Attribute Subset Selection
2.5.3 Dimensionality Reduction
2.5.4 Numerosity Reduction
2.6 Data Discretization and Concept Hierarchy Generation
2.6.1 Discretization and Concept Hierarchy Generation for Numerical Data
2.6.2 Concept Hierarchy Generation for Categorical Data
2.7 Summary
Exercises
Bibliographic Notes
Chapter 3 Data Warehouse and OLAP Technology: An Overview
3.1 What Is a Data Warehouse?
3.1.1 Differences between Operational Database Systems and Data Warehouses
3.1.2 But, Why Have a Separate Data Warehouse?
3.2 A Multidimensional Data Model
3.2.1 From Tables and Spreadsheets to Data Cubes
3.2.2 Stars, Snowflakes, and Fact Constellations: Schemas for Multidimensional Databases
3.2.3 Examples for Defining Star, Snowflake, and Fact Constellation Schemas
3.2.4 Measures: Their Categorization and Computation
3.2.5 Concept Hierarchies
3.2.6 OLAP Operations in the Multidimensional Data Model
3.2.7 A Starnet Query Model for Querying Multidimensional Databases
2012-10-23