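# Data: the Iris data set from the UCI Machine Learning Repository.
# Nearest-neighbor algorithm: each test sample is classified by computing the
# distance between its features and those of every training sample, then
# taking the label of the closest training sample.
# "Data Analysis with Open Source Tools" does not include a program that
# produces the figure above, so I also reproduced that figure myself, but for
# now without any smoothing.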
import numpy as np
import matplotlib.pylab as pl  # not used in this snippet

# Columns 0-3 hold the four measurements; column 4 holds the species label
train = np.loadtxt("D:\\iris.trn", delimiter=',', usecols=(0, 1, 2, 3))
trainlabel = np.loadtxt("D:\\iris.trn", delimiter=',', usecols=(4,), dtype=str)
test = np.loadtxt("D:\\iris.tst", delimiter=',', usecols=(0, 1, 2, 3))
testlabel = np.loadtxt("D:\\iris.tst", delimiter=',', usecols=(4,), dtype=str)

hit, miss = 0, 0
for i in range(test.shape[0]):
    # Euclidean distance from test sample i to every training sample
    dist = np.sqrt(np.sum((test[i] - train) ** 2, axis=1))
    # Index of the nearest training sample
    k = np.argmin(dist)
    if trainlabel[k] == testlabel[i]:
        flag = '+'
        hit += 1
    else:
        flag = '-'
        miss += 1
    print(flag, "\t Predicted:", trainlabel[k], "\t True:", testlabel[i])
print()
print(hit, "out of", hit + miss, "correct-Accuracy:", hit / (hit + miss + 0.0))
Output:
+ Predicted: Iris-setosa True: Iris-setosa
+ Predicted: Iris-setosa True: Iris-setosa
+ Predicted: Iris-versicolor True: Iris-versicolor
+ Predicted: Iris-versicolor True: Iris-versicolor
+ Predicted: Iris-virginica True: Iris-virginica
5 out of 5 correct-Accuracy: 1.0
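
The script above needs the iris.trn and iris.tst files on disk. For readers who just want to see the nearest-neighbor step in isolation, here is a minimal sketch of the same idea on a small in-memory data set. The function name nearest_neighbor_predict and the toy arrays are made up for illustration and are not part of the original script; the distances for all test samples are computed at once with NumPy broadcasting instead of a Python loop.

```python
import numpy as np


def nearest_neighbor_predict(train, train_labels, test):
    """Return, for each test row, the label of the closest training row.

    Hypothetical helper for illustration; uses Euclidean distance.
    """
    # Broadcasting gives an array of shape (n_test, n_train, n_features)
    diff = test[:, np.newaxis, :] - train[np.newaxis, :, :]
    # Euclidean distance between every test row and every training row
    dist = np.sqrt(np.sum(diff ** 2, axis=2))
    # For each test row, the index of the nearest training row
    nearest = np.argmin(dist, axis=1)
    return train_labels[nearest]


# Made-up toy data: two well-separated clusters labeled "a" and "b"
train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]])
train_labels = np.array(["a", "a", "b", "b"])
test = np.array([[0.9, 1.1], [5.1, 5.1], [4.0, 4.0]])
# The last true label is deliberately "a" so that one prediction misses
test_labels = np.array(["a", "b", "a"])

predicted = nearest_neighbor_predict(train, train_labels, test)
accuracy = np.mean(predicted == test_labels)
print(predicted, accuracy)
```

The third test point lies closer to the "b" cluster, so it is predicted as "b" and counted as a miss, which gives two hits out of three on this toy data.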