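Algorithm implementation
1. Main function
Compute the Euclidean distance from the test sample to every training sample
Collect the k smallest distances
Get the labels of the training samples that correspond to those k smallest distances
Return the label that occurs most often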
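For reference, the distance in the first step can be written out explicitly. Assuming each sample has n features, and writing x for the test sample and x^{(i)} for the i-th training sample (this notation is introduced here for illustration and is not in the original), the Euclidean distance is:

```latex
d\left(x, x^{(i)}\right) = \sqrt{\sum_{j=1}^{n} \left(x_j - x^{(i)}_j\right)^2}
```

The predicted label is then the majority label among the k training samples with the smallest d.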
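import numpy as np
from collections import Counter

def kNN_classify(k, x, x_train, y_train):
    # Assertions: check that the arguments are consistent
    assert 1 <= k <= x_train.shape[0], \
        "k must be at least 1 and no larger than the number of training samples"
    assert x_train.shape[0] == y_train.shape[0], \
        "every training sample must have exactly one class label"
    assert x_train.shape[1] == x.shape[1], \
        "training and test samples must have the same number of features"
    # Compute the distance from the test sample to every training sample
    # (x is expected to be a 2-D array of shape (1, n_features))
    data_size = x_train.shape[0]
    new_x = np.tile(x, (data_size, 1))
    dist = (np.sum((new_x - x_train) ** 2, axis=1)) ** 0.5
    # Sort the training-sample indices by increasing distance
    nearest = np.argsort(dist)
    # Labels of the k nearest training samples
    topK_y = [y_train[i] for i in nearest[:k]]
    # Return the most common label among the k nearest training samples
    return Counter(topK_y).most_common(1)[0][0]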
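As a rough sketch of how the steps above behave on data, the block below runs the same four steps on a small made-up data set. The function name knn_classify_broadcast, the toy arrays, and the reshape of x are assumptions added for illustration. The sketch also swaps np.tile for NumPy broadcasting, which should give the same distances without building the repeated copy of x.

```python
import numpy as np
from collections import Counter


def knn_classify_broadcast(k, x, x_train, y_train):
    """Hypothetical variant of kNN_classify that relies on broadcasting."""
    x_train = np.asarray(x_train, dtype=float)
    # Accept either a 1-D sample or a (1, n_features) array (an assumption)
    x = np.asarray(x, dtype=float).reshape(1, -1)

    assert 1 <= k <= x_train.shape[0], "k must be in [1, number of training samples]"
    assert x_train.shape[0] == len(y_train), "one label per training sample"
    assert x_train.shape[1] == x.shape[1], "feature counts must match"

    # Broadcasting subtracts the single row x from every row of x_train
    dist = np.sqrt(np.sum((x_train - x) ** 2, axis=1))

    # Indices of the training samples, nearest first
    nearest = np.argsort(dist)
    top_k_y = [y_train[i] for i in nearest[:k]]

    # Majority vote among the k nearest labels
    return Counter(top_k_y).most_common(1)[0][0]


# Made-up toy data: two well-separated clusters labelled 0 and 1
x_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])

x_new = np.array([[5.1, 5.0]])  # lies inside the cluster labelled 1
predicted = knn_classify_broadcast(3, x_new, x_train, y_train)
print(predicted)
```

With k=3 the three nearest neighbours of x_new all come from the second cluster, so the vote is unanimous. On real data it is usually worth trying several values of k, since a very small k is sensitive to noisy neighbours.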
2. Building your own kNN classifier
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter
class kNNClassifier:
    def __init__(self, k