K-NN Notes 1.0

K-NN is a nonparametric, instance-based algorithm for classification and regression, most often used for classification tasks in supervised learning. The algorithm relies on feature similarity; ideally, the data should be labeled, noise-free, and form a relatively small data set. The choice of K affects model accuracy and is usually set to the square root of the sample size or to an odd number. Common distance metrics include Euclidean distance, Manhattan distance, Minkowski distance, Chebyshev distance, and cosine similarity. The workflow consists of loading the data, splitting it into training and validation sets, initializing K, evaluating a confusion matrix, and choosing the best K for prediction. In R, classification takes a majority vote among the neighbors, while regression takes their mean.
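As a concrete illustration of this workflow, here is a minimal sketch in R. It is an assumption on my part rather than code from the original note: it uses the built-in iris data set, the knn() function from the class package, a 70/30 split, and an arbitrary set of candidate K values.

```r
# Minimal K-NN workflow sketch (assumes the 'class' package is installed).
library(class)

set.seed(42)
data(iris)

# Scale the features so no single dimension dominates the distance.
features <- scale(iris[, 1:4])
labels   <- iris$Species

# Split into training and validation sets (roughly 70/30).
train_idx <- sample(seq_len(nrow(iris)), size = 0.7 * nrow(iris))
train_x <- features[train_idx, ];  train_y <- labels[train_idx]
valid_x <- features[-train_idx, ]; valid_y <- labels[-train_idx]

# Try several K values and keep the one with the best validation accuracy.
for (k in c(1, 3, 5, 7, 11)) {
  pred <- knn(train = train_x, test = valid_x, cl = train_y, k = k)
  cm   <- table(Predicted = pred, Actual = valid_y)   # confusion matrix
  acc  <- sum(diag(cm)) / sum(cm)
  cat(sprintf("K = %2d  accuracy = %.3f\n", k, acc))
}
```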

K-NN (the k-nearest neighbors algorithm)

K-NN is a nonparametric, instance-based method used for both classification and regression (robust and versatile), and it is one of the simplest supervised machine learning algorithms, most often used for classification. K-NN is based on feature similarity: a query point is predicted from the labels of its K nearest neighbors in feature space. The data used with K-NN should ideally be labeled, noise-free, and form a relatively small data set.
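"Feature similarity" is measured with a distance metric. The snippet below (an illustrative sketch with made-up vectors, not code from the original note) computes the metrics listed in the summary above: Euclidean, Manhattan, Minkowski, and Chebyshev via base R's dist(), plus cosine similarity by hand, since dist() does not provide it.

```r
# Two example feature vectors (hypothetical values, chosen only for illustration).
a <- c(1, 2, 3)
b <- c(4, 0, 3)
m <- rbind(a, b)

# Base R's dist() covers most of the common metrics.
dist(m, method = "euclidean")          # sqrt(sum((a - b)^2))
dist(m, method = "manhattan")          # sum(|a - b|)
dist(m, method = "minkowski", p = 3)   # (sum(|a - b|^p))^(1/p)
dist(m, method = "maximum")            # Chebyshev: max(|a - b|)

# Cosine similarity is not built into dist(), so compute it directly.
cosine_sim <- sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
cosine_sim
```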

Initialize K: The choice of K affects the accuracy of the model. If K is too small, noise strongly influences the prediction; if K is too large, prediction becomes computationally expensive. A common heuristic is to set K to the square root of the number of samples n, or to an odd number (which avoids ties in binary voting).
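A small helper like the one below captures this heuristic; choose_k is a hypothetical name, not a standard R function.

```r
# Heuristic for initializing K: sqrt(n), rounded and forced to be odd.
choose_k <- function(n_samples) {
  k <- round(sqrt(n_samples))
  if (k %% 2 == 0) k <- k + 1   # odd K avoids ties in binary voting
  max(k, 1)
}

choose_k(150)   # e.g. for 150 samples, sqrt(150) ≈ 12.2 -> K = 13
```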

Feature          Output
Classification   Class (discrete value)
Regression       Value (real number)
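The difference between the two output types can be seen in a short from-scratch sketch (again an assumption, not code from the original note): given the K nearest neighbors of a query point, classification returns the majority class, while regression returns the mean of the neighbors' values.

```r
# From-scratch K-NN prediction for a single query point (illustrative only).
set.seed(1)
train_x <- matrix(rnorm(20 * 2), ncol = 2)          # 20 points, 2 features
class_y <- factor(sample(c("A", "B"), 20, TRUE))    # labels for classification
value_y <- rnorm(20)                                # targets for regression
query   <- c(0, 0)
k       <- 5

# Euclidean distance from the query to every training point.
d       <- sqrt(rowSums((train_x - matrix(query, 20, 2, byrow = TRUE))^2))
nearest <- order(d)[1:k]

# Classification: majority vote among the K nearest labels.
votes      <- table(class_y[nearest])
class_pred <- names(which.max(votes))

# Regression: mean of the K nearest target values.
value_pred <- mean(value_y[nearest])

class_pred   # a discrete class label
value_pred   # a real number
```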