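Prerequisites
numpy.argsort(a, axis=-1, kind='quicksort', order=None)
Returns the indices that would sort the array in ascending order.
Parameters:
a: the array to sort
axis: the axis along which to sort; the default, -1, is the last axis
kind: the sorting algorithm to use: quicksort, mergesort, or heapsort
For a one-dimensional array: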
>>> import numpy as np
>>> x = np.array([1, 4, 3, -1, 5, 9])
>>> x.argsort()
array([3, 0, 2, 1, 4, 5])
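The axis parameter only matters once the array has more than one dimension. The following is a minimal sketch of that case, plus two common idioms built on argsort; the sample arrays are made up for illustration.

```python
import numpy as np

# Hypothetical 2-D sample data, chosen only for illustration.
m = np.array([[3, 1, 2],
              [9, 7, 8]])

# axis=-1 (the default) sorts within each row.
row_order = np.argsort(m, axis=-1)

# axis=0 sorts within each column.
col_order = np.argsort(m, axis=0)

# Indexing a 1-D array with its argsort result yields the sorted values.
x = np.array([1, 4, 3, -1, 5, 9])
sorted_x = x[np.argsort(x)]

# Reversing the ascending order gives indices from largest to smallest.
desc_order = np.argsort(x)[::-1]
```

Here row_order is [[1, 2, 0], [1, 2, 0]] and col_order is [[0, 0, 0], [1, 1, 1]], since each column is already in ascending order.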
numpy.tile(array, (dim))
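Builds a new array by repeating array along each axis the number of times given by dim. dim is a tuple with one repeat count per axis; a single integer also works.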
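A short sketch of how the repeat counts map to output shapes may help here. The array a below is an arbitrary example, not data from these notes.

```python
import numpy as np

a = np.array([1, 2])

# An integer repeats the array along its last axis.
t1 = np.tile(a, 3)        # [1 2 1 2 1 2], shape (6,)

# A tuple gives one repeat count per axis:
# 2 copies along axis 0, 1 copy along axis 1.
t2 = np.tile(a, (2, 1))   # [[1 2], [1 2]], shape (2, 2)

# 2 copies along axis 0 and 2 copies along axis 1.
t3 = np.tile(a, (2, 2))   # [[1 2 1 2], [1 2 1 2]], shape (2, 4)
```

The (n, 1) form is the one used in the k-nearest neighbors code: it stacks n copies of a query vector as rows so that the data set can be subtracted from it row by row.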
Outline of the k-nearest neighbors algorithm
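import operator
import numpy as np

# The function name is an assumption; the notes give only the body.
# inX: query point; dataSet: one sample per row; labels: class of each row
def classify(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # Repeat inX once per sample, then subtract to get per-feature differences
    diffMat = np.tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat**2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances**0.5  # Euclidean distance to every sample
    sortedDistIndicies = distances.argsort()  # nearest samples first
    classCount = {}
    # Count the labels of the k nearest samples
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    # Return the label with the most votes
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]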
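As a usage illustration, here is a self-contained sketch of the same procedure (distance, sort, vote). It is a variant, not the code above: it relies on NumPy broadcasting instead of np.tile and on collections.Counter instead of a hand-built dictionary. The helper name knn_classify and the sample data are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_classify(in_x, data_set, labels, k):
    """Hypothetical helper: majority vote among the k nearest samples."""
    # Broadcasting subtracts in_x from every row, so np.tile is not needed.
    distances = np.sqrt(((data_set - in_x) ** 2).sum(axis=1))
    # Indices of the k samples with the smallest distances.
    nearest = distances.argsort(kind="stable")[:k]
    # Tally the labels of those samples and return the most frequent one.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Small illustrative data set: two points near (1, 1), two near (0, 0).
group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ["A", "A", "B", "B"]

result_near_b = knn_classify(np.array([0.0, 0.0]), group, labels, 3)
result_near_a = knn_classify(np.array([1.0, 1.2]), group, labels, 3)
```

With k=3, the query (0, 0) has two "B" neighbors and one "A" neighbor, so it is classified as "B"; the query (1, 1.2) is classified as "A" by the same reasoning.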