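MATLAB decision tree modeling: an example of decision tree classification with the MATLAB Statistics Toolbox

This example starts with the LDA linear classification algorithm and eventually leads to decision tree classification. It is a good walkthrough, and beginners may find it a useful reference.

Many decision tree implementations found online come without examples: they are just a pile of code, with no indication of how to pass the parameters. It is simpler to use the decision tree algorithm in the toolbox directly; if something is unclear, run help on it.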

Classification

Suppose you have a data set containing observations with measurements on different variables (called predictors) and their known class labels. If you obtain predictor values for new observations, could you determine to which classes those observations probably belong? This is the problem of classification. This demo illustrates how to perform some classification algorithms in MATLAB® using Statistics Toolbox™ by applying them to Fisher's iris data.

Fisher's Iris Data

Fisher's iris data consists of measurements on the sepal length, sepal width, petal length, and petal width for 150 iris specimens. There are 50 specimens from each of three species. Load the data and see how the sepal measurements differ between species. You can use the two columns containing sepal measurements.

load fisheriris

gscatter(meas(:,1), meas(:,2), species,'rgb','osd');

xlabel('Sepal length');

ylabel('Sepal width');

N = size(meas,1);
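Before modeling, it can help to confirm that the loaded variables match the description above. The following is a minimal sketch; the names nObs, nVars, speciesNames, idx, and counts are illustrative and not part of the original demo.

```matlab
load fisheriris   % provides meas (numeric matrix) and species (cell array of labels)

% meas has one row per specimen and one column per measurement
[nObs, nVars] = size(meas);
fprintf('%d observations, %d measurements each\n', nObs, nVars);

% Count how many specimens belong to each species
[speciesNames, ~, idx] = unique(species);
counts = accumarray(idx, 1);

for k = 1:numel(speciesNames)
    fprintf('%s: %d\n', speciesNames{k}, counts(k));
end
```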

Suppose you measure a sepal and petal from an iris, and you need to determine its species on the basis of those measurements. One approach to solving this problem is known as discriminant analysis.

Linear and Quadratic Discriminant Analysis

The classify function can perform classification using different types of discriminant analysis. First classify the data using the default linear discriminant analysis (LDA).

ldaClass = classify(meas(:,1:2),meas(:,1:2),species);
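In the call above, the same sepal measurements are passed both as the observations to classify and as the training data. To classify a new, unlabeled specimen, pass its measurements as the first argument instead. This is a sketch under assumptions: newSepal and newClass are illustrative names, and the measurement values are made up.

```matlab
load fisheriris

% Hypothetical new specimen: sepal length and sepal width (made-up values)
newSepal = [5.8 2.8];

% First argument: observations to classify
% Second and third arguments: training predictors and their known labels
newClass = classify(newSepal, meas(:,1:2), species);

disp(newClass);
```

The result is a one-element cell array holding the predicted species name, in the same form as the entries of species.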

The observations with known class labels are usually called the training data. Now compute the resubstitution error, which is the misclassification error (the proportion of misclassified observations) on the training set.

bad = ~strcmp(ldaClass,species);

ldaResubErr = sum(bad) / N

ldaResubErr =

0.2000

You can also compute the confusion matrix on the training set. A confusion matrix contains information about known class labels and predicted class labels. Generally speaking, the (i,j) element in the confusion matrix is the number of samples whose known class label is class i and whose predicted class is j. The diagonal elements represent correctly classified observations.

[ldaResubCM,grpOrder] = confusionmat(species,ldaClass)

ldaResubCM =

49 1 0

0 36 14

0 15 35

grpOrder =

'setosa'

'versicolor'

'virginica'
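As a cross-check, the resubstitution error can be recovered from the confusion matrix, because every off-diagonal entry counts misclassified observations. The sketch below uses the matrix printed above; nBad and errFromCM are illustrative names, not part of the original demo.

```matlab
% Confusion matrix copied from the output above
% (rows: known class, columns: predicted class)
ldaResubCM = [49  1  0;
               0 36 14;
               0 15 35];

% Off-diagonal entries are the misclassified observations: 1 + 14 + 15
nBad = sum(ldaResubCM(:)) - trace(ldaResubCM);

% Proportion misclassified; this should agree with ldaResubErr
errFromCM = nBad / sum(ldaResubCM(:));

fprintf('%d misclassified, error rate %.4f\n', nBad, errFromCM);
```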

Of the 150 training observations, 20%, or 30 observations, are misclassified by the linear discriminant function. You can see which ones they are by drawing an X through the misclassified points.
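One way to mark the misclassified points is to overlay black X markers on the scatter plot from earlier. This is a minimal sketch that recomputes ldaClass and bad so it can run on its own; it is one possible way to draw the markers, not necessarily how the original demo does it.

```matlab
load fisheriris

% Recompute the LDA labels and the misclassification mask
ldaClass = classify(meas(:,1:2), meas(:,1:2), species);
bad = ~strcmp(ldaClass, species);

% Redraw the scatter plot of sepal measurements by species
gscatter(meas(:,1), meas(:,2), species, 'rgb', 'osd');
xlabel('Sepal length');
ylabel('Sepal width');

% Overlay a black X on each misclassified observation
hold on;
plot(meas(bad,1), meas(bad,2), 'kx');
hold off;
```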
