Create KNN.py
Step 1: create the data
Code:
import numpy as np
import operator

def createDataSet():
    group = np.array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
    labels = ['A','A','B','B']
    return group,labels
This creates the dataset and its labels: there are four data points, and each data point has two attributes.
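As a quick check of that description, the sketch below rebuilds the same arrays and inspects their shape. It is standalone and assumes Python 3 with NumPy installed; the names n_points and n_features are illustrative and not part of KNN.py.

```python
import numpy as np

# Same four sample points as createDataSet():
# one row per data point, one column per attribute.
group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']

# shape is (rows, columns), i.e. (data points, attributes).
n_points, n_features = group.shape
print(n_points, n_features)

# Each row of group is paired with the label at the same index.
for point, label in zip(group, labels):
    print(point, label)
```

The i-th label belongs to the i-th row, so the two containers must always have the same length.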
Step 2: classify with KNN
Algorithm:
for each point in our dataset:
    calculate the distance between inX and the current point
sort the distances in increasing order
take the k items with the lowest distances to inX
find the majority class among these items
return the majority class
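Code:
def classify0(inX,dataSet,labels,k):
    dataSetSize = dataSet.shape[0]
    # Repeat inX once per row so it can be subtracted from every data point
    diffMat = np.tile(inX,(dataSetSize,1)) - dataSet
    # Euclidean distance: square the differences, sum per row, take the root
    sqDiffMat = diffMat ** 2
    sqDistance = sqDiffMat.sum(axis=1)
    distance = sqDistance**0.5
    # Indices of the data points, ordered from nearest to farthest
    sortedDistIndicie = distance.argsort()
    # Count the labels of the k nearest points
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicie[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel,0)+1
    # items() works in both Python 2 and Python 3 (iteritems() is Python 2 only)
    sortedClassCount = sorted(classCount.items(),key=operator.itemgetter(1),reverse=True)
    return sortedClassCount[0][0]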
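For comparison, here is a more compact sketch of the same idea. It is not the code from KNN.py: the function name classify_knn is hypothetical, and it swaps np.tile for NumPy broadcasting and the hand-written vote dictionary for collections.Counter. It assumes Python 3.

```python
from collections import Counter

import numpy as np


def classify_knn(in_x, data_set, labels, k):
    """Illustrative variant of classify0; the name is an assumption."""
    # Broadcasting subtracts in_x from every row, so np.tile is not needed.
    diffs = np.asarray(data_set, dtype=float) - np.asarray(in_x, dtype=float)
    # Euclidean distance from in_x to each data point.
    distances = np.sqrt((diffs ** 2).sum(axis=1))
    # Indices of the k closest data points.
    nearest = distances.argsort()[:k]
    # Majority vote among the labels of those k points.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']
print(classify_knn([0, 0], group, labels, 3))
```

Both versions follow the steps listed in the algorithm; the difference is only in how much NumPy and the standard library do on our behalf.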
Testing it:
import KNN
group,labels = KNN.createDataSet()
print(KNN.classify0([0,0],group,labels,3))
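With the sample data, the call above should return B. The sketch below traces the intermediate values to show why. It is standalone, assumes Python 3 with NumPy, and uses illustrative variable names (in_x, order, nearest_labels) that do not appear in KNN.py.

```python
import numpy as np

group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']
in_x = np.array([0, 0])

# Distance from in_x to each of the four points:
# roughly 1.487, 1.414, 0.0 and 0.1.
distances = np.sqrt(((group - in_x) ** 2).sum(axis=1))
print(distances)

# Point indices from nearest to farthest: 2, 3, 1, 0.
order = distances.argsort()
print(order)

# Labels of the k = 3 nearest points: two votes for B, one for A.
nearest_labels = [labels[i] for i in order[:3]]
print(nearest_labels)  # ['B', 'B', 'A']
```

Two of the three nearest neighbours carry the label B, so B wins the vote.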
To be continued...