Python Implementation of the k-Nearest Neighbors Algorithm

Chiak1

Article link: https://blog.csdn.net/qq_43116030/article/details/109181559

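Training: store the training set.
Prediction:
For each test sample xi:
    For each training sample fit_xi:
        Compute the distance from xi to fit_xi and save it.
    Select the k training samples fit_xi with the smallest distance to xi.
    For a classification task, use the most frequent label among those k samples as the label of the test sample.
    For a regression task, use the mean of the labels of those k samples as the label of the test sample.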
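
The prediction steps above can be traced on a toy problem. This is only a minimal sketch under made-up assumptions: the data, the choice k = 3, and the names `X_train`, `y_class`, `y_reg`, and `query` are hypothetical and do not come from the article.

```python
import numpy as np
from collections import Counter

# Hypothetical toy data: five 1-D training samples
X_train = np.array([[1.0], [2.0], [3.0], [8.0], [9.0]])
y_class = np.array(["a", "a", "b", "b", "b"])  # labels for a classification task
y_reg = np.array([1.0, 2.0, 3.0, 8.0, 9.0])    # targets for a regression task

query = np.array([2.5])  # a single test sample xi
k = 3

# Step 1: Euclidean distance from the query to every training sample
dists = np.sqrt(np.sum((X_train - query) ** 2, axis=1))

# Step 2: indices of the k smallest distances
nearest = np.argsort(dists, kind="stable")[:k]

# Step 3a: classification, the most common label among the k neighbors
predicted_label = Counter(y_class[nearest].tolist()).most_common(1)[0][0]

# Step 3b: regression, the mean target of the k neighbors
predicted_value = float(np.mean(y_reg[nearest]))

print(predicted_label, predicted_value)
```

The three nearest neighbors of 2.5 are the samples at 1, 2, and 3, so the vote is two "a" against one "b" and the regression output is the mean of 1, 2, and 3.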

2.2 Distance Metrics

The distance between two samples is commonly measured in one of the following ways:

Euclidean distance
$d(x, y) = \sqrt{(x_1-y_1)^2+(x_2-y_2)^2+\dots}$
Cosine distance
$d(x, y) = 1-\frac{x^Ty}{\|x\|_2\|y\|_2}$
Manhattan distance
$d(x, y) = |x_1-y_1|+|x_2-y_2|+\dots$
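
As a rough illustration of these three metrics, they could be written with NumPy as follows. The function names `euclidean`, `cosine_distance`, and `manhattan` are chosen here for the example and are not part of the article's code.

```python
import numpy as np

def euclidean(x, y):
    # Square root of the sum of squared coordinate differences
    return np.sqrt(np.sum((x - y) ** 2))

def cosine_distance(x, y):
    # One minus the cosine of the angle between x and y.
    # Undefined when either vector has zero norm.
    return 1.0 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def manhattan(x, y):
    # Sum of absolute coordinate differences
    return np.sum(np.abs(x - y))

# Two orthogonal unit vectors as a quick check
x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
print(euclidean(x, y), cosine_distance(x, y), manhattan(x, y))
```

For these two orthogonal unit vectors the Euclidean distance is the square root of 2, the cosine distance is 1, and the Manhattan distance is 2.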

The implementation below uses the Euclidean distance as an example.

3. Python Implementation

k-nearest neighbors classifier:

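import numpy as np

class KNNClassifier:
    def __init__(self, k=1):
        self.k = k
        self.fit_X = None
        self.fit_y = None

    # Euclidean distance
    def _d(self, x, y):
        return np.sqrt(np.sum((x - y)**2))

    # Lazy learning: fit only stores the training set
    def fit(self, X, y):
        self.fit_X = np.asarray(X)
        self.fit_y = np.asarray(y)
        return self

    # Most of the cost is paid at prediction time
    def _predict(self, x):
        # Distance from x to every training sample
        d = [self._d(fit_x, x) for fit_x in self.fit_X]
        # Indices of the k nearest training samples. argsort keeps the
        # indices aligned with fit_y; popping the minimum from the list
        # one at a time would shift the positions of the remaining items.
        indices = np.argsort(d)[:self.k]
        # Majority vote; ties go to the smallest label
        labels, counts = np.unique(self.fit_y[indices], return_counts=True)
        return labels[np.argmax(counts)]

    def predict(self, X):
        return np.array([self._predict(x) for x in X])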
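
The classifier above computes one distance per Python loop iteration. As an illustrative alternative, not part of the original code, the distances for all test samples can be computed at once with NumPy broadcasting. `knn_predict_labels` is a hypothetical helper name, and the toy data is made up for the example.

```python
import numpy as np

def knn_predict_labels(fit_X, fit_y, X, k=1):
    """Hypothetical helper: label each row of X by majority vote
    among its k nearest rows of fit_X (Euclidean distance)."""
    fit_X = np.asarray(fit_X, dtype=float)
    fit_y = np.asarray(fit_y)
    X = np.asarray(X, dtype=float)

    # Pairwise squared distances, shape (n_test, n_train).
    # The square root is skipped because it does not change the ordering.
    diff = X[:, None, :] - fit_X[None, :, :]
    sq_dists = np.sum(diff ** 2, axis=2)

    # Indices of the k nearest training samples for every test sample
    nearest = np.argsort(sq_dists, axis=1)[:, :k]

    predictions = []
    for row in nearest:
        labels, counts = np.unique(fit_y[row], return_counts=True)
        predictions.append(labels[np.argmax(counts)])  # ties go to the smallest label
    return np.array(predictions)

# Made-up data: two well-separated clusters
fit_X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
fit_y = [0, 0, 0, 1, 1, 1]
pred = knn_predict_labels(fit_X, fit_y, [[0.5, 0.5], [5.5, 5.5]], k=3)
print(pred)
```

The trade-off is memory: the intermediate `diff` array holds n_test × n_train × n_features values, so this sketch only suits small data sets.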

k-nearest neighbors regression:

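import numpy as np

class KNNRegression:
    def __init__(self, k=1):
        self.k = k
        self.fit_X = None
        self.fit_y = None

    # Euclidean distance
    def _d(self, x, y):
        return np.sqrt(np.sum((x - y)**2))

    # Lazy learning: fit only stores the training set
    def fit(self, X, y):
        self.fit_X = np.asarray(X)
        self.fit_y = np.asarray(y)
        return self

    # Most of the cost is paid at prediction time
    def _predict(self, x):
        # Distance from x to every training sample
        d = [self._d(fit_x, x) for fit_x in self.fit_X]
        # Indices of the k nearest training samples. argsort keeps the
        # indices aligned with fit_y; popping the minimum from the list
        # one at a time would shift the positions of the remaining items.
        indices = np.argsort(d)[:self.k]
        # Mean of the neighbors' target values
        return np.mean(self.fit_y[indices])

    def predict(self, X):
        return np.array([self._predict(x) for x in X])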