K-NN Notes 1.0

K-NN is a nonparametric, instance-based algorithm for classification and regression, most often used for classification tasks in supervised learning. The algorithm relies on feature similarity; ideally, the data should be labeled, noise-free, and form a relatively small data set. The choice of K affects model accuracy and is usually set to the square root of the sample size or to an odd number. Common distance metrics include Euclidean distance, Manhattan distance, Minkowski distance, Chebyshev distance, and cosine similarity. The workflow consists of loading the data, splitting it into training and validation sets, initializing K, evaluating a confusion matrix, and choosing the best K for prediction. In R, classification takes a majority vote among the neighbors, while regression takes their mean.
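As a concrete illustration of this workflow, here is a minimal sketch in R. It is an assumption on my part rather than code from the original note: it uses the built-in iris data set, the knn() function from the class package, a 70/30 split, and an arbitrary set of candidate K values.

```r
# Minimal K-NN workflow sketch (assumes the 'class' package is installed).
library(class)

set.seed(42)
data(iris)

# Scale the features so no single dimension dominates the distance.
features <- scale(iris[, 1:4])
labels   <- iris$Species

# Split into training and validation sets (roughly 70/30).
train_idx <- sample(seq_len(nrow(iris)), size = 0.7 * nrow(iris))
train_x <- features[train_idx, ];  train_y <- labels[train_idx]
valid_x <- features[-train_idx, ]; valid_y <- labels[-train_idx]

# Try several K values and keep the one with the best validation accuracy.
for (k in c(1, 3, 5, 7, 11)) {
  pred <- knn(train = train_x, test = valid_x, cl = train_y, k = k)
  cm   <- table(Predicted = pred, Actual = valid_y)   # confusion matrix
  acc  <- sum(diag(cm)) / sum(cm)
  cat(sprintf("K = %2d  accuracy = %.3f\n", k, acc))
}
```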

K-NN (the k-nearest neighbors algorithm)

K-NN is a nonparametric, instance-based method used for both classification and regression (robust and versatile), and it is one of the simplest supervised machine learning algorithms, most often used for classification. K-NN is based on feature similarity: a query point is predicted from the labels of its K nearest neighbors in feature space. The data used with K-NN should ideally be labeled, noise-free, and form a relatively small data set.
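"Feature similarity" is measured with a distance metric. The snippet below (an illustrative sketch with made-up vectors, not code from the original note) computes the metrics listed in the summary above: Euclidean, Manhattan, Minkowski, and Chebyshev via base R's dist(), plus cosine similarity by hand, since dist() does not provide it.

```r
# Two example feature vectors (hypothetical values, chosen only for illustration).
a <- c(1, 2, 3)
b <- c(4, 0, 3)
m <- rbind(a, b)

# Base R's dist() covers most of the common metrics.
dist(m, method = "euclidean")          # sqrt(sum((a - b)^2))
dist(m, method = "manhattan")          # sum(|a - b|)
dist(m, method = "minkowski", p = 3)   # (sum(|a - b|^p))^(1/p)
dist(m, method = "maximum")            # Chebyshev: max(|a - b|)

# Cosine similarity is not built into dist(), so compute it directly.
cosine_sim <- sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
cosine_sim
```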

Initialize K: The choice of K affects the accuracy of the model. If K is too small, noise strongly influences the prediction; if K is too large, prediction becomes computationally expensive. A common heuristic is to set K to the square root of the number of samples n, or to an odd number (which avoids ties in binary voting).
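A small helper like the one below captures this heuristic; choose_k is a hypothetical name, not a standard R function.

```r
# Heuristic for initializing K: sqrt(n), rounded and forced to be odd.
choose_k <- function(n_samples) {
  k <- round(sqrt(n_samples))
  if (k %% 2 == 0) k <- k + 1   # odd K avoids ties in binary voting
  max(k, 1)
}

choose_k(150)   # e.g. for 150 samples, sqrt(150) ≈ 12.2 -> K = 13
```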

Feature          Output
Classification   Class (discrete value)
Regression       Value (real number)
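The difference between the two output types can be seen in a short from-scratch sketch (again an assumption, not code from the original note): given the K nearest neighbors of a query point, classification returns the majority class, while regression returns the mean of the neighbors' values.

```r
# From-scratch K-NN prediction for a single query point (illustrative only).
set.seed(1)
train_x <- matrix(rnorm(20 * 2), ncol = 2)          # 20 points, 2 features
class_y <- factor(sample(c("A", "B"), 20, TRUE))    # labels for classification
value_y <- rnorm(20)                                # targets for regression
query   <- c(0, 0)
k       <- 5

# Euclidean distance from the query to every training point.
d       <- sqrt(rowSums((train_x - matrix(query, 20, 2, byrow = TRUE))^2))
nearest <- order(d)[1:k]

# Classification: majority vote among the K nearest labels.
votes      <- table(class_y[nearest])
class_pred <- names(which.max(votes))

# Regression: mean of the K nearest target values.
value_pred <- mean(value_y[nearest])

class_pred   # a discrete class label
value_pred   # a real number
```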