A Simple KNN Implementation in Python

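The dataset is the MNIST handwritten-digit data, which I downloaded myself. It consists of train.csv (the training data), test.csv (the test data), and submission.csv (which holds the labels for test.csv). The KNN principle is simple, so without further ado, here is the code.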
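Since the principle is only mentioned in passing, here is a minimal sketch of the idea on toy data: measure the distance from a query point to every stored point, take the k closest, and let their labels vote. The function name `knn_vote`, the 2D points, and k=3 are assumptions made for illustration; they are not part of the original script.

```python
import numpy as np
from collections import Counter


def knn_vote(query, points, labels, k=3):
    # Euclidean distance from the query to every stored point.
    distances = np.sqrt(np.sum((points - query) ** 2, axis=1))
    # Indices of the k closest points.
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of those neighbours.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labelled 0 and 1.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
labels = [0, 0, 0, 1, 1, 1]

print(knn_vote(np.array([0.5, 0.5]), points, labels))  # 0
print(knn_vote(np.array([5.5, 5.5]), points, labels))  # 1
```

The full script follows the same three steps, with each 784-pixel image row playing the role of a point.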


#author:zhouchao
#date:2017-12-07 11:13
#description:use KNN to recognize handwritten digits

import numpy as np


def load_train_data(path):
    # Each row is "label,pixel,pixel,..."; raise skiprows to 1 if the file has a header row.
    train = np.loadtxt(path, delimiter=",", skiprows=0)
    vec = train[:, 1:]                          # pixel values, one row per sample
    labels = train[:, 0].astype(int).tolist()   # first column holds the digit label
    return vec, labels


def predict(line, vec, labels, k=20):
    # Euclidean distance from the query row to every training sample.
    # Broadcasting subtracts `line` from each row, so np.tile is not needed.
    diff = vec - line
    squaredDist = np.sum(diff ** 2, axis=1)
    distance = squaredDist ** 0.5
    sortedDistIndices = np.argsort(distance)

    # Count the labels of the k nearest neighbours.
    classCount = {}
    for i in sortedDistIndices[:k]:
        voteLabel = labels[i]
        classCount[voteLabel] = classCount.get(voteLabel, 0) + 1

    # Return the label with the most votes.
    return max(classCount, key=classCount.get)


if __name__ == "__main__":
    vec, labels = load_train_data("../../data/handwrite/train.csv")
    with open("../../data/handwrite/test.txt") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            nums = [int(x) for x in line.split(",")]
            matrix = np.array(nums)
            print(predict(matrix, vec, labels))
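The script only prints its predictions. Since submission.csv is said to hold the labels for the test file, one natural follow-up is to measure how many predictions match. The sketch below is a hedged illustration: the helper `accuracy` and the sample label lists are hypothetical, and reading submission.csv is left out because the post does not describe its column layout.

```python
import numpy as np


def accuracy(predicted, expected):
    # Fraction of positions where the predicted label equals the reference label.
    predicted = np.asarray(predicted)
    expected = np.asarray(expected)
    if predicted.shape != expected.shape:
        raise ValueError("predicted and expected must have the same length")
    return float(np.mean(predicted == expected))


# Hypothetical labels; in practice `predicted` would collect the values returned
# by predict() and `expected` would come from submission.csv.
predicted = [7, 2, 1, 0, 4]
expected = [7, 2, 1, 0, 9]
print(accuracy(predicted, expected))  # 0.8
```

Collecting the predictions in a list instead of printing them one at a time would make this comparison a one-line call at the end of the script.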



