KNN Algorithm - Finding Nearest Neighbors

Introduction

The K-nearest neighbors (KNN) algorithm is a supervised ML algorithm that can be used for both classification and regression problems. In industry, however, it is mainly used for classification. The following two properties characterize KNN well:

  • Lazy learning algorithm − KNN is a lazy learning algorithm because it has no specialized training phase. It simply stores the training data and uses all of it at classification time.

  • Non-parametric learning algorithm − KNN is also a non-parametric learning algorithm because it makes no assumptions about the distribution of the underlying data.
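
To make the "lazy" property concrete, here is a minimal sketch in Python. The class name `LazyKNN` and its methods are hypothetical, chosen only for illustration, and the sketch is limited to a single nearest neighbor (K = 1) with squared Euclidean distance. The point it tries to show is that "training" only stores the data, while all computation is deferred to prediction time.

```python
class LazyKNN:
    """Hypothetical illustration: 'training' only memorizes the data."""

    def fit(self, rows, labels):
        # No model parameters are estimated here; the data is simply stored.
        self.rows = list(rows)
        self.labels = list(labels)
        return self

    def nearest_label(self, query):
        # All of the work happens at prediction time: scan every stored row.
        def squared_distance(row):
            return sum((a - b) ** 2 for a, b in zip(row, query))

        best = min(range(len(self.rows)),
                   key=lambda i: squared_distance(self.rows[i]))
        return self.labels[best]


model = LazyKNN().fit([(1.0, 1.0), (5.0, 5.0)], ["a", "b"])
print(model.nearest_label((1.5, 0.5)))  # prints "a"
```

Because nothing is fitted in advance, the cost of each prediction grows with the size of the stored training set.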

Working of KNN Algorithm

The K-nearest neighbors (KNN) algorithm uses "feature similarity" to predict the values of new data points. In other words, a new data point is assigned a value based on how closely it matches the points in the training set. The following steps describe how it works:

  • Step 1 − Any implementation needs a dataset, so the first step of KNN is to load the training data as well as the test data.

  • Step 2 − Next, choose the value of K, i.e. the number of nearest data points to consider. K must be a positive integer no larger than the number of training points.

  • Step 3 − For each point in the test data, do the following:

    3.1 − Calculate the distance between the test point and each row of the training data using one of the following methods: Euclidean, Manhattan, or Hamming distance. Euclidean distance is the most commonly used.

    3.2 − Sort the training rows in ascending order of their distance values.

    3.3 − Choose the top K rows from the sorted array.
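
Steps 3.1 to 3.3 can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the function names (`euclidean`, `manhattan`, `hamming`, `k_nearest`) and the toy data are assumptions made for the example, and rows are assumed to be equal-length sequences.

```python
import math


def euclidean(a, b):
    # Straight-line distance between two numeric rows.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    # Number of positions at which the two rows differ.
    return sum(x != y for x, y in zip(a, b))


def k_nearest(train_rows, query, k, distance=euclidean):
    # Step 3.1: distance from the query point to every training row.
    distances = [(distance(row, query), index)
                 for index, row in enumerate(train_rows)]
    # Step 3.2: sort in ascending order of distance (ties broken by row index).
    distances.sort()
    # Step 3.3: keep the top K rows of the sorted list.
    return distances[:k]


# Hypothetical toy data: four training rows and one query point.
train = [(1.0, 1.0), (2.0, 2.0), (6.0, 6.0), (7.0, 7.0)]
neighbors = k_nearest(train, (1.5, 1.5), k=2)
print([index for _, index in neighbors])  # prints [0, 1]
```

Passing `distance=manhattan` or `distance=hamming` swaps the metric without changing the rest of the procedure. Hamming distance counts mismatched positions, so it suits categorical or binary features rather than continuous ones.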
