Before we begin
This post introduces five classification algorithms, namely
Linear discriminant analysis (LDA),
Quadratic discriminant analysis (QDA),
Logistic regression (LR),
Support vector machines (SVM),
K-nearest neighbour (KNN).
To show how these five algorithms work, we will use a simulated data example: first explain the idea behind each method, then build the model in R, then check how well the model fits, and finally compare the different algorithms (a sketch of the data set-up is given right below).
My first draft was written in English, so I will keep the rest of this post in English; a Chinese version may follow later.
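Before diving into the algorithms, we need the data. The lines below are only a minimal sketch of how a two-class data set like the one used here could be simulated and split into trainSet and testSet; the names X1, X2, Group, trainSet and testSet match the ones used later, but the class means, sample sizes and the 80/20 split are illustrative assumptions of mine, not necessarily the settings behind the results shown in this post.

# Hypothetical data set-up (parameters below are assumptions, not the post's actual settings)
library(MASS)    # lda(), qda(), mvrnorm()
library(caret)   # createDataPartition(), confusionMatrix()
library(dplyr)   # the %>% pipe used later

set.seed(123)
n0 <- 300; n1 <- 700                          # roughly a 30/70 class mix
g0 <- mvrnorm(n0, mu = c(0, 0),     Sigma = diag(2))
g1 <- mvrnorm(n1, mu = c(1.5, 1.5), Sigma = diag(2))
simData <- data.frame(X1    = c(g0[, 1], g1[, 1]),
                      X2    = c(g0[, 2], g1[, 2]),
                      Group = factor(rep(c(0, 1), c(n0, n1))))

# 80/20 split into training and test sets (the split ratio is also an assumption)
trainIdx <- createDataPartition(simData$Group, p = 0.8, list = FALSE)
trainSet <- simData[trainIdx, ]
testSet  <- simData[-trainIdx, ]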
Linear discriminant analysis (LDA)
Description of the method:
The LDA algorithm starts by finding the directions that maximize the separation between classes, then uses these directions to predict the class of new observations. These directions, called linear discriminants, are linear combinations of the predictor variables.
LDA assumes that the predictors are normally distributed (Gaussian distribution) and that the classes have class-specific means but a common variance/covariance matrix.
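To make this concrete, the following sketch computes the two-class linear discriminant scores by hand. Under the equal-covariance Gaussian assumption, an observation x is assigned to the class k with the largest score delta_k(x) = x' inv(Sigma) mu_k - 0.5 * mu_k' inv(Sigma) mu_k + log(pi_k), where mu_k are the class means, Sigma is the pooled covariance matrix and pi_k are the class priors. This is not the code used later (we will simply call lda()), just an illustration of the rule, assuming the trainSet/testSet objects sketched above.

# Sketch: two-class LDA scores computed by hand (illustrative only)
X  <- as.matrix(trainSet[, c("X1", "X2")])
y  <- trainSet$Group
mu0 <- colMeans(X[y == 0, ]); mu1 <- colMeans(X[y == 1, ])
# Pooled (within-class) covariance matrix, shared by both classes under LDA
S    <- ((sum(y == 0) - 1) * cov(X[y == 0, ]) +
         (sum(y == 1) - 1) * cov(X[y == 1, ])) / (nrow(X) - 2)
Sinv <- solve(S)
prior0 <- mean(y == 0); prior1 <- mean(y == 1)

# Linear discriminant score delta_k(x) for each test point, one per class
Xtest  <- as.matrix(testSet[, c("X1", "X2")])
delta0 <- Xtest %*% Sinv %*% mu0 - 0.5 * drop(t(mu0) %*% Sinv %*% mu0) + log(prior0)
delta1 <- Xtest %*% Sinv %*% mu1 - 0.5 * drop(t(mu1) %*% Sinv %*% mu1) + log(prior1)
predClass <- ifelse(delta1 > delta0, 1, 0)   # assign the class with the larger score

With default priors, these hand-computed predictions should essentially match what predict() on an lda() model returns below.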
Analysis and results:
We use the lda() function from the MASS package to build the model on trainSet and then make predictions on testSet. The prediction object contains class, the predicted class of each observation, which we use to compute the confusion matrix with caret's confusionMatrix().
From the output below we find:
- The model reaches an accuracy of 0.71 on testSet, which is only moderately good;
- Sensitivity is 0.27 and Specificity is 0.89, so Sensitivity is very low;
- In the confusion matrix, of the 59 actual Group0 points, 43 were predicted as Group1, so most of them were misallocated; this is just another way of reading the Sensitivity (1 - 43/59 ≈ 0.27). Of the 141 actual Group1 points, only 15 were predicted as Group0, so only a small fraction were misallocated; this is another way of reading the Specificity (1 - 15/141 ≈ 0.89). Again, Specificity is good but Sensitivity is far too low (a quick manual check of these numbers follows right after this list).
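As a sanity check, the reported Sensitivity, Specificity and Accuracy can be recomputed directly from the four cells of the confusion matrix (a small sketch; note that caret treats the first factor level, Group0, as the "positive" class here):

# Manual check of the metrics reported by confusionMatrix() below
TP <- 16; FN <- 43    # actual Group0: 16 correctly predicted, 43 sent to Group1
TN <- 126; FP <- 15   # actual Group1: 126 correctly predicted, 15 sent to Group0
sensitivity <- TP / (TP + FN)                    # 16 / 59   ≈ 0.27
specificity <- TN / (TN + FP)                    # 126 / 141 ≈ 0.89
accuracy    <- (TP + TN) / (TP + FN + TN + FP)   # 142 / 200 = 0.71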
> model1 <- lda(Group ~ X1+X2, data = trainSet)
> prediction1 <- model1 %>% predict(testSet)
> confusionMatrix(as.factor(prediction1$class),as.factor(testSet$Group))
Confusion Matrix and Statistics
          Reference
Prediction   0   1
         0  16  15
         1  43 126
Accuracy : 0.71
95% CI : (0.6418, 0.7718)
No Information Rate : 0.705
P-V