4-1 k-Nearest Neighbors Algorithm Basics

# First, import the libraries we need
import numpy as np
import sklearn
import matplotlib.pyplot as plt

# Generate 20 random numbers
raw_data_x = np.random.rand(20)
raw_data_x
array([0.73926217, 0.85530604, 0.57330886, 0.10024075, 0.86413266,
       0.75800531, 0.90690065, 0.41902893, 0.2067514 , 0.31596071,
       0.79943476, 0.31684349, 0.08522602, 0.51840576, 0.83998825,
       0.28529047, 0.56895044, 0.65566675, 0.02615133, 0.92569249])
# Reshape the data into 10 samples with 2 features each
raw_data_x = raw_data_x.reshape(10,2)
# Label data
raw_data_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Assign the data to the training features and the training labels
x_train = np.array(raw_data_x)
y_train = np.array(raw_data_y)

Take a quick look at the resulting arrays:

x_train
array([[0.73926217, 0.85530604],
       [0.57330886, 0.10024075],
       [0.86413266, 0.75800531],
       [0.90690065, 0.41902893],
       [0.2067514 , 0.31596071],
       [0.79943476, 0.31684349],
       [0.08522602, 0.51840576],
       [0.83998825, 0.28529047],
       [0.56895044, 0.65566675],
       [0.02615133, 0.92569249]])
y_train
array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
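
Because np.random.rand returns different values on every run, re-running the cell will not reproduce the numbers printed above. One way to follow along with exactly the same data is to hard-code the printed values, as in the minimal sketch below. The array literal is copied from the x_train output above; nothing else is assumed.

```python
import numpy as np

# Feature values copied from the x_train output shown above
x_train = np.array([
    [0.73926217, 0.85530604],
    [0.57330886, 0.10024075],
    [0.86413266, 0.75800531],
    [0.90690065, 0.41902893],
    [0.2067514, 0.31596071],
    [0.79943476, 0.31684349],
    [0.08522602, 0.51840576],
    [0.83998825, 0.28529047],
    [0.56895044, 0.65566675],
    [0.02615133, 0.92569249],
])
# First five samples belong to class 0, last five to class 1
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

print(x_train.shape)  # (10, 2)
print(y_train.shape)  # (10,)
```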

Visualize the data with a scatter plot, class 0 in green and class 1 in red:

plt.scatter( x_train[y_train == 0, 0], x_train[y_train == 0,  1], color = 'g' )
plt.scatter( x_train[y_train == 1, 0], x_train[y_train == 1,  1], color = 'r' )

[Figure: scatter plot of the training points, class 0 in green and class 1 in red]

Now add one more point, drawn in blue, to the plot above:

# Coordinates of the new point
x = np.array([0.8, 0.5])
plt.scatter( x_train[y_train == 0, 0], x_train[y_train == 0,  1], color = 'g' )
plt.scatter( x_train[y_train == 1, 0], x_train[y_train == 1,  1], color = 'r' )
plt.scatter(x[0], x[1], color = 'b')
plt.show()

[Figure: the same scatter plot with the new point shown in blue]

KNN

from math import sqrt

# Euclidean distance from the new point x to every training point
distance = []
for xtrain in x_train:
    d = sqrt(np.sum((xtrain - x) ** 2))
    distance.append(d)

distance
[0.3604600793002089,
 0.45956101424338214,
 0.265856610529157,
 0.1341046688862789,
 0.6211395713879941,
 0.18315737983164557,
 0.7150109186989645,
 0.21840155714132786,
 0.27859655394067245,
 0.8832077063446624]
# The same computation written as a list comprehension
distance = [sqrt(np.sum((xtrain - x) ** 2)) for xtrain in x_train]
distance
[0.3604600793002089,
 0.45956101424338214,
 0.265856610529157,
 0.1341046688862789,
 0.6211395713879941,
 0.18315737983164557,
 0.7150109186989645,
 0.21840155714132786,
 0.27859655394067245,
 0.8832077063446624]
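
The loop and the list comprehension both visit the training points one at a time. As a sketch of an alternative, NumPy broadcasting can compute all distances in a single expression. The three toy points below are hypothetical, chosen only so that the distances are easy to check by hand; they are not the data used in this post.

```python
import numpy as np

# Hypothetical toy data: distances to the origin are 0, 5 and 1
x_train = np.array([[0.0, 0.0],
                    [3.0, 4.0],
                    [1.0, 0.0]])
x = np.array([0.0, 0.0])

# Broadcasting subtracts x from every row; the norm along axis=1
# yields one Euclidean distance per training point
distances = np.linalg.norm(x_train - x, axis=1)
print(distances)  # [0. 5. 1.]

# Indices of the training points, nearest first
nearest = np.argsort(distances)
print(nearest)  # [0 2 1]
```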
np.argsort(distance)
array([3, 5, 7, 2, 8, 0, 1, 4, 6, 9], dtype=int64)
nearest = np.argsort(distance)
k = 6
topK_y = [y_train[i] for i in nearest[:k]]
topK_y
[0, 1, 1, 0, 1, 0]
from collections import Counter
Counter(topK_y)
Counter({0: 3, 1: 3})
votes = Counter(topK_y)
votes.most_common(2)
[(0, 3), (1, 3)]
votes.most_common(1)
[(0, 3)]
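
Note that with k = 6 the vote above is tied 3 to 3. Label 0 comes first only because Counter.most_common lists equal counts in the order they were first encountered, and the nearest neighbor happens to have label 0. For a two-class problem, a common way to avoid such ties is to pick an odd k. The sketch below compares k = 5, 6 and 7; the label list is read off from the argsort result and y_train above, and the variable names are illustrative.

```python
from collections import Counter

# Labels of the training points sorted by distance to x (nearest first),
# obtained by applying the argsort order above to y_train
labels_by_distance = [0, 1, 1, 0, 1, 0, 0, 0, 1, 1]

winners = {}
for k in (5, 6, 7):
    votes = Counter(labels_by_distance[:k])
    winners[k] = votes.most_common(1)[0][0]
    print(k, votes.most_common())
# 5 [(1, 3), (0, 2)]
# 6 [(0, 3), (1, 3)]
# 7 [(0, 4), (1, 3)]
```

With k = 5 the prediction would be 1 rather than 0, which shows how sensitive the result is to the choice of k when the new point lies near the boundary between the classes.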
votes.most_common(1)[0][0]
0
predict_y = votes.most_common(1)[0][0]
predict_y
0
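
The steps above (compute distances, sort, take the k nearest labels, vote) can be collected into a single function. The following is a minimal sketch, not a reference implementation: the name knn_predict and the toy data are hypothetical, and ties are resolved in favor of the label that appears first among the nearest neighbors, as Counter does.

```python
import numpy as np
from collections import Counter


def knn_predict(x_train, y_train, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    if not 1 <= k <= len(x_train):
        raise ValueError("k must be between 1 and the number of training points")
    # Euclidean distance from x to every training point
    distances = np.sqrt(np.sum((x_train - x) ** 2, axis=1))
    # Indices of the training points, nearest first
    nearest = np.argsort(distances)
    # Labels of the k nearest points, then a majority vote
    top_k_labels = y_train[nearest[:k]].tolist()
    return Counter(top_k_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters
toy_x = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
toy_y = np.array([0, 0, 1, 1, 1])

print(knn_predict(toy_x, toy_y, np.array([0.2, 0.1]), k=3))  # 0
print(knn_predict(toy_x, toy_y, np.array([5.0, 5.2]), k=3))  # 1
```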