Classification methods (R)

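This post introduces five classification algorithms (LDA, QDA, LR, SVM and KNN) and builds and analyses the corresponding models in R. On the test set, LDA and QDA reach accuracies of 0.71 and 0.75 respectively, SVM performs well, and KNN improves its predictions by selecting the optimal value of k. The algorithms differ in sensitivity and specificity.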

Preface

This post introduces five classification algorithms:

Linear discriminant analysis (LDA),

Quadratic discriminant analysis (QDA),

Logistic regression (LR),

Support vector machines (SVM),

K-nearest neighbour (KNN).

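To show how these five algorithms work, we use a simulated data set. For each algorithm we first describe the underlying idea, then build the model in R, then assess how well the model fits, and finally compare the algorithms with one another.

The first draft was written in English, so the rest of this post is kept in English; a Chinese version may follow later.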

Linear discriminant analysis (LDA)

Description of the method:

The LDA algorithm first finds the directions that maximize the separation between classes, then uses these directions to predict the class of each observation. These directions, called linear discriminants, are linear combinations of the predictor variables.

LDA assumes that the predictors are normally distributed (Gaussian) within each class, and that the classes have class-specific means but share a common covariance matrix.
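To make these assumptions concrete, the standard textbook form of the LDA decision rule is sketched below. The symbols are notation introduced here for illustration and do not appear elsewhere in this post: \mu_k is the mean of class k, \Sigma is the covariance matrix shared by all classes, and \pi_k is the prior probability of class k.

```latex
% Class-conditional model assumed by LDA: Gaussian with a shared covariance
X \mid (Y = k) \sim \mathcal{N}(\mu_k, \Sigma)

% Linear discriminant score of class k for an observation x
\delta_k(x) = x^{\top} \Sigma^{-1} \mu_k
              - \tfrac{1}{2}\, \mu_k^{\top} \Sigma^{-1} \mu_k
              + \log \pi_k

% Predicted class: the one with the largest score
\hat{y}(x) = \arg\max_k \, \delta_k(x)
```

Because \Sigma is the same for every class, the quadratic term in x cancels when two scores are compared, so the boundary between any two classes is linear in x.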

Analysis and results:

The model is built with the function “lda()” from the “MASS” package on trainSet and then used to predict testSet. The prediction object contains a “class” component, which holds the predicted class of each observation and is used to compute the confusion matrix.
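The workflow just described can be tried end to end with the minimal sketch below. The simulated data are a hypothetical stand-in: the class means, covariance, sample sizes, seed and 80/20 split are all assumptions made for illustration, not the settings behind the results reported in this post. Only the variable names Group, X1, X2, trainSet and testSet are taken from the original code.

```r
library(MASS)  # provides lda() and mvrnorm()

set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical simulated data: two Gaussian classes sharing one covariance.
# Means, covariance and sample sizes are illustrative assumptions.
n0 <- 300; n1 <- 700
Sigma <- matrix(c(1, 0.3, 0.3, 1), nrow = 2)
x0 <- mvrnorm(n0, mu = c(0, 0), Sigma = Sigma)
x1 <- mvrnorm(n1, mu = c(1, 1), Sigma = Sigma)
dat <- data.frame(Group = factor(rep(c(0, 1), c(n0, n1))),
                  X1 = c(x0[, 1], x1[, 1]),
                  X2 = c(x0[, 2], x1[, 2]))

# Random 80/20 split into training and test sets.
idx <- sample(nrow(dat), size = 0.8 * nrow(dat))
trainSet <- dat[idx, ]
testSet  <- dat[-idx, ]

# Fit on the training set, predict on the test set.
model <- lda(Group ~ X1 + X2, data = trainSet)
pred  <- predict(model, newdata = testSet)

# pred$class holds the predicted labels; cross-tabulate against the truth.
confMat <- table(Prediction = pred$class, Reference = testSet$Group)
print(confMat)
```

The sketch uses base R's table() so that it depends on MASS only; the numbers it prints will differ from those reported here because the data are not the same.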

The results show:

  1. The model reaches an accuracy of 0.71 on testSet, only marginally above the no-information rate of 0.705 obtained by always predicting the majority class;
  2. Sensitivity is 0.27 and specificity is 0.89, so sensitivity is low (Group0 is treated as the positive class);
  3. In the confusion matrix, 43 of the 59 actual Group0 points were predicted as Group1, so most of them were misclassified. This is another way of expressing sensitivity (1 - 43/59 = 0.27). Of the 141 actual Group1 points, only 15 were predicted as Group0, so only a small share were misclassified. This is another way of expressing specificity (1 - 15/141 = 0.89). Specificity is good, but sensitivity is too low.
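The figures in the list above can be re-derived from the four cell counts of the confusion matrix reported in this section. The sketch below does the arithmetic in base R; the variable names are illustrative, and the choice of "0" as the positive class is inferred from the reported sensitivity rather than stated in the original.

```r
# Cell counts copied from the confusion matrix reported for the LDA model
# (rows = predicted class, columns = actual class; filled column by column).
cm <- matrix(c(16, 43, 15, 126), nrow = 2,
             dimnames = list(Prediction = c("0", "1"),
                             Reference  = c("0", "1")))

# Class "0" is taken as the positive class (assumption, see lead-in).
accuracy    <- sum(diag(cm)) / sum(cm)        # (16 + 126) / 200
sensitivity <- cm["0", "0"] / sum(cm[, "0"])  # 16 / 59
specificity <- cm["1", "1"] / sum(cm[, "1"])  # 126 / 141
nir         <- max(colSums(cm)) / sum(cm)     # 141 / 200, majority-class rate

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, no_information_rate = nir), 3)
```

Comparing accuracy with the no-information rate is a quick check on whether the model does better than always predicting the larger group.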
> library(MASS)      # lda()
> library(caret)     # confusionMatrix()
> library(magrittr)  # %>% pipe
> model1 <- lda(Group ~ X1 + X2, data = trainSet)
> prediction1 <- model1 %>% predict(testSet)
> confusionMatrix(as.factor(prediction1$class), as.factor(testSet$Group))
Confusion Matrix and Statistics

          Reference
Prediction   0   1
         0  16  15
         1  43 126
                                          
               Accuracy : 0.71            
                 95% CI : (0.6418, 0.7718)
    No Information Rate : 0.705           
    P-V