
memory's Column

CS master's student; my research area is CV, and I am very interested in PR, ML, and AI.

  • Blog posts (19)
  • Resources (5)
  • Favorites
  • Following

[Reposted] The differences among orthogonal, independent, and uncorrelated

1. Definitions. Suppose X is a random process; the correlation between its random variables at times t1 and t2 is defined as follows (the same holds for two random processes): (1) define Rx(t1,t2) = E{X(t1)X(t2)} as the correlation function; if R = 0, we say the variables are orthogonal (note: a correlation function of zero means orthogonal, not uncorrelated). Orthogonality describes not only relations between deterministic functions but also random processes: two random processes X(t) and Y(t) are orthogonal when E[X(t)Y(t)] = 0; if E[X(t)Y(t)] = E[X…

2013-12-26 19:27:01 · 24156 views · 1 comment
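
The excerpt is cut off mid-definition; for reference, the three standard relationships it is contrasting are:

    R_X(t_1, t_2) = E\{X(t_1) X(t_2)\}                (correlation function)
    orthogonal:     E[X(t) Y(t)] = 0
    uncorrelated:   E[X(t) Y(t)] = E[X(t)] E[Y(t)]    (zero covariance)
    independent:    F_{XY}(x, y) = F_X(x) F_Y(y)

Independence implies uncorrelatedness (when the moments exist) but not conversely, and when either process has zero mean, orthogonal and uncorrelated coincide.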

[Reposted] Linear discriminant analysis (LDA)

First, what is discriminant analysis? Discriminant Analysis is a multivariate statistical method for deciding which class an object belongs to from its feature values. Depending on the criterion, it divides into distance-based discriminants, Fisher's discriminant, the Bayes discriminant, and so on. KNN, for example, uses a distance discriminant, and the "distance" itself comes in several flavors: Euclidean distance, city-block distance, or even the Pearson correlation coefficient. Naive Bayes classification uses the Bayes discriminant. The linear discriminant analysis covered in this post uses F…

2013-12-12 17:51:17 · 1471 views
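
As a concrete anchor for the Fisher variant the excerpt is leading into, here is a minimal two-class Fisher LDA sketch in NumPy (my own naming, not code from the post):

    import numpy as np

    def fisher_lda_direction(X0, X1):
        # class means (X0, X1: samples of each class, one row per sample)
        mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
        # within-class scatter: sum of the two classes' scatter matrices
        Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
        # Fisher's criterion is maximized by w proportional to Sw^{-1} (mu1 - mu0)
        return np.linalg.solve(Sw, mu1 - mu0)

Projecting samples onto w (X @ w) and thresholding the projection gives the linear discriminant.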

[Reposted] Understanding the Pearson correlation coefficient

Source: http://segmentfault.com/q/1010000000094674. The Pearson correlation coefficient can be understood from two angles. First, at a high-school math level it is simple: first convert both data series to Z-scores, then take the sum of the two series' products divided by the sample size. A Z-score measures how far a data point deviates from the center of a normal distribution: the variable minus the mean, divided by the standard deviation (much like the standardized scores used in the college entrance exam). The standard deviation, in turn, equals the variable minus the mean, squared and summed,…

2013-11-27 22:56:41 · 12108 views · 2 comments
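
A minimal NumPy rendering of exactly that description, z-score both series, multiply, and divide by the sample size (function and variable names are mine):

    import numpy as np

    def pearson_r(x, y):
        # z-scores: subtract the mean, divide by the (population) standard deviation
        zx = (x - x.mean()) / x.std()
        zy = (y - y.mean()) / y.std()
        # sum of products divided by the sample size = mean of the products
        return np.mean(zx * zy)

Using np.std with its default ddof=0 matches the divide-by-n convention in the excerpt.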

[Reposted] Why the naive Bayes classifier is essentially a linear classifier

Source: http://blog.163.com/rustle_go_go/blog/static/20294501420122110431306/ While preparing a group-meeting report, I stumbled on the claim that "the naive Bayes classifier is essentially a linear classifier." After going through the related material I came to understand the claim much better and wrote it up in this post; criticism from the experts is welcome. It first introduces the definitions of the naive Bayes classifier and of linear classifiers, and then presents two ways in which it is a linear…

2013-11-27 20:26:28 · 6300 views
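
The standard argument for why this holds with binary features, sketched here for reference (the post's own derivation may differ in notation):

    \log\frac{P(C=1 \mid x)}{P(C=0 \mid x)}
        = \log\frac{P(C=1)}{P(C=0)} + \sum_i \log\frac{P(x_i \mid C=1)}{P(x_i \mid C=0)}

For Bernoulli features, P(x_i | C) = p_{iC}^{x_i} (1 - p_{iC})^{1 - x_i}, so each term in the sum is affine in x_i; the log-odds therefore has the linear form w_0 + \sum_i w_i x_i, and the decision boundary (log-odds = 0) is a hyperplane.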

[Reposted] A brief introduction to Bayesian networks

Introduction: A Bayesian network, also called a belief network or a directed acyclic graphical model, is a probabilistic graphical model that uses a directed acyclic graph (DAG) to represent a set of random variables and their n conditional probability distri…

2013-11-26 23:18:29 · 88750 views · 2 comments
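
The defining property the excerpt is building up to is the standard DAG factorization, stated here for reference:

    P(X_1, ..., X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{parents}(X_i))

For example, in a three-variable network with edges A -> B and A -> C, the joint collapses to P(A, B, C) = P(A) P(B | A) P(C | A), which needs far fewer parameters than the full joint table.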

[Original] Resources on the HTM cortical learning algorithms

Introduction: HTM (Hierarchical Temporal Memory), known in full as the HTM Cortical Learning Algorithms, is a new generation of artificial intelligence algorithm published by Numenta (later renamed Grok), the company founded by Jeff Hawkins, author of On Intelligence. Jeff Hawkins is an engineer, serial entrepreneur, scientist, inventor, and author. He was once…

2013-11-26 21:07:26 · 7352 views

[Reposted] An overview of logistic regression

Source: http://hi.baidu.com/hehehehello/item/40025c33d7d9b7b9633aff87 Logistic regression is a machine learning method widely used in industry to estimate the likelihood of some event: the likelihood that a user buys a given product, that a patient has a given disease, or that an ad gets clicked, for example. (Note the word "likelihood" here, as opposed to the mathematical notion of "prob…

2013-11-26 16:44:58 · 1572 views
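
A minimal sketch of the model being overviewed, a linear score squashed into a "likelihood" in (0, 1) (names mine):

    import numpy as np

    def predict_proba(w, b, x):
        # logistic (sigmoid) link: maps the linear score w.x + b into (0, 1)
        return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

The output is read as the estimated probability that the event (purchase, disease, click) occurs.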

[Reposted] A "Beauty of Mathematics" side story: the plain yet magical Bayesian method

Source: http://mindhacks.cn/2008/09/21/the-magical-bayesian-method/ By 刘未鹏, September 21, 2008. Posted in: mathematics, machine learning and AI, computer science. "Probability theory is nothing but common sense reduced to calculation." (Laplace) I remember that in my undergraduate days my favorite pastime was wandering around the computer bookstores in town, and on one such stroll…

2013-11-26 16:32:55 · 2594 views

[Reposted] On the physical meaning of SVD

Source: http://www.ams.org/samplings/feature-column/fcarc-svd (partially modified). "We Recommend a Singular Value Decomposition": in this article, we will offer a geometric explanation of singular value decompositions a…

2013-11-26 16:24:15 · 4443 views
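
The article's geometric reading (rotate, scale along orthogonal axes, rotate again) can be checked numerically with NumPy:

    import numpy as np

    # any real matrix factors as A = U diag(s) V^T with U and V orthogonal
    A = np.array([[3.0, 0.0],
                  [4.0, 5.0]])
    U, s, Vt = np.linalg.svd(A)
    print(np.allclose(A, U @ np.diag(s) @ Vt))  # True
    # s holds the singular values: the semi-axis lengths of the ellipse
    # that A maps the unit circle onto
    print(s)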

[Original] Compiling from the command line with vs2008's cl

[Background] Win7 + vs2008. (If VC6 is also installed, the problem below generally does not occur.) Assume vs2008 is installed under D:\Microsoft Visual Studio 9.0. [Problem and solution] 1. "cl is not recognized as an internal or external command." Cause: the Path environment variable is not set up. Fix: add VC's bin directory to Path, i.e. Path = D:\Microsoft Visual Stu…

2013-08-01 09:48:14 · 3607 views

[Reposted] Some views on PCA

PCA, principal component analysis, extracts the dominant information in data and discards the rest in order to reduce the data volume. The concrete steps: 1. extract the useful information from each sample into a vector; 2. compute the mean of all the sample vectors; 3. subtract the mean from each sample vector and assemble the results into a matrix; 4. multiply that matrix by its transpose to get the covariance matrix, which is diagonalizable; after diagonalization the remaining entries are the eigenvalues, each with a corresponding eigenvector (the eigenvectors must be normalized); 5. take the N largest eigenvalues (…

2013-03-31 14:08:33 · 1124 views
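
The steps above translate almost line by line into NumPy; a minimal sketch (my naming):

    import numpy as np

    def pca(X, n_components):
        Xc = X - X.mean(axis=0)                 # steps 2-3: subtract the mean vector
        C = Xc.T @ Xc / len(X)                  # step 4: covariance matrix (via the transpose)
        eigvals, eigvecs = np.linalg.eigh(C)    # symmetric matrix, so eigh; eigenvectors come unit-length
        top = np.argsort(eigvals)[::-1][:n_components]  # step 5: the N largest eigenvalues
        return Xc @ eigvecs[:, top]             # project onto the principal directions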

[Reposted] Kinect for Windows SDK development

A few days ago I happened to see that Microsoft had released the Kinect for Windows sensor. Browsing the example Kinect applications, I found that besides serving as an Xbox 360 game peripheral, Kinect can power some very cool applications, and Microsoft has also released the final Kinect for Windows SDK 1.0 for developing against it. I originally meant to buy a Kinect for Windows sensor to play with, but it had not been out for long, and on Taobao there were only…

2013-03-29 13:33:07 · 1035 views

[Reposted] FAQ of LIBSVM

Q: Which schools' courses have already made use of libsvm? University of Freiburg, Germany, Department of Mathematics and Computer Science, Faculty of Applied Sciences. Faculteit der Exacte Wetenschappen, Vrije Universiteit (VU), the Netherlands. University of Wisconsin-Madison, Department of Electrical and Computer Engineering. Technion, Israel Institute of Technology. Florida State University, Department of Computer and Information Sciences. University of Nairobi, Kenya, School of Computing. University of Iceland,…

2013-03-28 14:42:21 · 1323 views

[Reposted] Reading txt files in MATLAB with textread

I set out to run a program today and suddenly realized, embarrassingly, that I could not read data in; a simple Iris.txt defeated me for a whole morning. So here is a summary of MATLAB's textread function (the textscan function to follow). The main content is taken from http://linux.chinaitlab.com/administer/872894.html; I ran and revised it to produce what follows, and additions are welcome. The basic syntax of textread is:…

2013-03-26 22:00:31 · 22441 views
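
The post's examples are MATLAB; purely as a cross-language analogue, and assuming a whitespace-separated Iris.txt with four numeric columns followed by a text label (a guess at the file's layout), the same read in Python/NumPy would be:

    import numpy as np

    # analogue of MATLAB's: [a, b, c, d, label] = textread('Iris.txt', '%f %f %f %f %s')
    features = np.loadtxt("Iris.txt", usecols=(0, 1, 2, 3))   # the numeric columns
    labels = np.loadtxt("Iris.txt", usecols=4, dtype=str)     # the label column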

[Reposted] Gabor filtering + the multi-scale problem

The Gabor function. The Gabor transform is a windowed Fourier transform; Gabor functions can extract features at different scales and orientations in the frequency domain. Gabor functions also behave much like the human visual system, so they are often applied to texture recognition, with good results. The 2D Gabor function can be written as: [formula given as an image in the original] where the value of v determines the wavelength of the Gabor filter, the value of u indexes the orientation of the Gabor kernel, and K is the total number of orientations. A further parameter sets the size of the Gaussian window [value given as an image in the original]. The program uses four frequencies (v = 0,…

2013-03-24 14:12:38 · 20092 views
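
Since the post's formula did not survive as text, here is one common real-valued parameterization of a 2D Gabor kernel, a Gaussian window times an oriented sinusoid (one of several conventions; parameter names are mine):

    import numpy as np

    def gabor_kernel(size, wavelength, theta, sigma):
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        # rotate coordinates to orientation theta (the role of u in the excerpt)
        x_theta = x * np.cos(theta) + y * np.sin(theta)
        envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))   # Gaussian window; sigma sets its size
        carrier = np.cos(2.0 * np.pi * x_theta / wavelength)   # wavelength is set by v in the excerpt
        return envelope * carrier

A filter bank then varies the wavelength over the chosen frequencies and theta over the K orientations.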

[Reposted] 2D convolution for Gabor transforms

At heart, the Gabor transform is still a convolution over a 2D image, so the efficiency of 2D convolution directly determines the efficiency of the Gabor transform. Here I first discuss 2D convolution and how the 2D Fourier transform speeds it up; in the next part we apply this to the Gabor transform to extract handwriting-texture features. 1. Discrete 2D superposition and convolution. Plenty of books cover discrete 2D superposition and convolution; I recommend Digital Image Processing by William K. Pratt (translated by 邓鲁华, 张延恒, et al.)…

2013-03-23 11:45:32 · 2820 views
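
The speed-up the post describes rests on the convolution theorem: convolution in the spatial domain is multiplication in the frequency domain. A minimal NumPy sketch (this computes circular convolution; real code zero-pads both arrays to avoid wraparound at the borders):

    import numpy as np

    def conv2_fft(image, kernel):
        # zero-pad the kernel to the image size, multiply spectra, invert
        H = np.fft.fft2(kernel, s=image.shape)
        return np.real(np.fft.ifft2(np.fft.fft2(image) * H))

For an N x N image and K x K kernel this costs O(N^2 log N) instead of the O(N^2 K^2) of direct summation, which is exactly why it matters for large Gabor filter banks.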

[Reposted] Data normalization methods

Before data analysis we usually normalize the data first and analyze the normalized values. Normalization is, in essence, the indexing of statistical data. It has two main parts: aligning the direction of the indicators and removing their dimensions. Direction alignment deals with indicators of different natures: directly summing indicators that act in different directions does not correctly reflect their combined effect, so inverse indicators must first have their direction of action flipped, making all indicators pull on the evaluation in the same direction, before they can be summed into a correct result. Dimension removal…

2013-03-21 15:33:37 · 3341 views · 1 comment
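
Two common transforms that implement the dimension-removal half of this taxonomy (a sketch, not from the post):

    import numpy as np

    def min_max(x):
        # rescale to [0, 1]: removes units, keeps relative spacing
        return (x - x.min()) / (x.max() - x.min())

    def z_score(x):
        # center and scale to unit variance: removes units, keeps shape
        return (x - x.mean()) / x.std()

    # an inverse (smaller-is-better) indicator can be aligned in direction
    # with min_max(-x), i.e. 1 - min_max(x)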

[Original] Notes on using LIBSVM in MATLAB

Installation: for the installation itself, you can follow the write-ups online or the bundled readme file; the steps are all much the same, and a concrete walkthrough is at http://blog.csdn.net/abcjennifer/article/details/7370177. Below are the difficulties I ran into while installing. 1. An incomplete MATLAB installation left libsvm unusable even after compiling. On 32-bit systems, libsvm ships no ready-made executables for MATLA…

2013-03-20 16:51:26 · 3482 views

[Reposted] Learning about LBP

Texture classification is a very old topic, but some of its methods laid the foundation for later image classification. First, define a texture image: it is a function of the following variables: the surface material of the texture, its reflectance, the illumination, and the camera and its angle. Two families of methods are currently popular for texture classification: global features, such as LBP and Gabor, and local features, such as Harris-Laplace. The local-feature methods mostly build on the texton framework, which today's image cl…

2013-03-19 15:35:25 · 1917 views
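
For concreteness, the basic 3x3 LBP that the post's "global feature" family starts from: threshold each pixel's eight neighbors against the center and pack the results into an 8-bit code (a NumPy sketch; the bit-ordering convention varies):

    import numpy as np

    def lbp_codes(img):
        # eight neighbor offsets, in a fixed clockwise order
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                   (1, 1), (1, 0), (1, -1), (0, -1)]
        h, w = img.shape
        center = img[1:-1, 1:-1]
        codes = np.zeros_like(center, dtype=np.uint8)
        for bit, (dy, dx) in enumerate(offsets):
            neighbor = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
            codes |= (neighbor >= center).astype(np.uint8) << bit
        return codes  # the histogram of these codes is the texture descriptor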

weka - Data Mining (Third Edition)

A good book on data mining: the one from the weka official site, in its latest third edition, as a text PDF with an index! Contents:

PART I: INTRODUCTION TO DATA MINING
Chapter 1, What's It All About? (Data Mining and Machine Learning; Simple Examples: The Weather Problem and Others; Fielded Applications; Machine Learning and Statistics; Generalization as Search; Data Mining and Ethics; Further Reading)
Chapter 2, Input: Concepts, Instances, and Attributes (What's a Concept?; What's in an Example?; What's in an Attribute?; Preparing the Input; Further Reading)
Chapter 3, Output: Knowledge Representation (Tables; Linear Models; Trees; Rules; Instance-Based Representation; Clusters; Further Reading)
Chapter 4, Algorithms: The Basic Methods (Inferring Rudimentary Rules; Statistical Modeling; Divide-and-Conquer: Constructing Decision Trees; Covering Algorithms: Constructing Rules; Mining Association Rules; Linear Models; Instance-Based Learning; Clustering; Multi-Instance Learning; Further Reading; Weka Implementations)
Chapter 5, Credibility: Evaluating What's Been Learned (Training and Testing; Predicting Performance; Cross-Validation; Other Estimates; Comparing Data Mining Schemes; Predicting Probabilities; Counting the Cost; Evaluating Numeric Prediction; Minimum Description Length Principle; Applying the MDL Principle to Clustering; Further Reading)
PART II: ADVANCED DATA MINING
Chapter 6, Implementations: Real Machine Learning Schemes (Decision Trees; Classification Rules; Association Rules; Extending Linear Models; Instance-Based Learning; Numeric Prediction with Local Linear Models; Bayesian Networks; Clustering; Semisupervised Learning; Multi-Instance Learning; Weka Implementations)
Chapter 7, Data Transformations (Attribute Selection; Discretizing Numeric Attributes; Projections; Sampling; Cleansing; Transforming Multiple Classes to Binary Ones; Calibrating Class Probabilities; Further Reading; Weka Implementations)
Chapter 8, Ensemble Learning (Combining Multiple Models; Bagging; Randomization; Boosting; Additive Regression; Interpretable Ensembles; Stacking; Further Reading; Weka Implementations)
Chapter 9, Moving On: Applications and Beyond (Applying Data Mining; Learning from Massive Datasets; Data Stream Learning; Incorporating Domain Knowledge; Text Mining; Web Mining; Adversarial Situations; Ubiquitous Data Mining; Further Reading)
PART III: THE WEKA DATA MINING WORKBENCH
Chapter 10, Introduction to Weka (What's in Weka?; How Do You Use It?; What Else Can You Do?; How Do You Get It?)
Chapter 11, The Explorer (Getting Started; Exploring the Explorer; Filtering Algorithms; Learning Algorithms; Metalearning Algorithms; Clustering Algorithms; Association-Rule Learners; Attribute Selection)
Chapter 12, The Knowledge Flow Interface (Getting Started; Components; Configuring and Connecting the Components; Incremental Learning)
Chapter 13, The Experimenter (Getting Started; Simple Setup; Advanced Setup; The Analyze Panel; Distributing Processing over Several Machines)
Chapter 14, The Command-Line Interface (Getting Started; The Structure of Weka; Command-Line Options)
Chapter 15, Embedded Machine Learning (A Simple Data Mining Application)
Chapter 16, Writing New Learning Schemes (An Example Classifier; Conventions for Implementing Classifiers)
Chapter 17, Tutorial Exercises for the Weka Explorer (Introduction to the Explorer Interface; Nearest-Neighbor Learning and Decision Trees; Classification Boundaries; Preprocessing and Parameter Tuning; Document Classification; Mining Association Rules)
References; Index

2014-04-02

探索推荐引擎内部的秘密 (Exploring the Secrets Inside the Recommendation Engine), Parts 1-3

A very good introductory series on collaborative filtering and recommender systems: all three parts of "Exploring the Secrets Inside the Recommendation Engine".

2013-11-30

Win7 HyperTerminal

A HyperTerminal that runs on Windows 7 for debugging serial ports, similar to the HyperTerminal bundled with XP. Works well.

2011-09-20

USB-to-serial driver for XP/Vista/7

USB-to-serial driver in three versions: Windows XP, Vista, and Windows 7.

2011-07-30
