A Beginner's Road: A Personal Take on the KNN Algorithm in Machine Learning, with a Python Implementation

KNN(K Nearest Neighbor)

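As usual, let's start by noting a few key formulas.

Distance: usually the Euclidean distance, $E(x, y) = \sqrt{\sum_i (x_i - y_i)^2}$. The name sounds fancy, but it is just the distance between two points that we learned in middle school.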

There are other ways to measure distance as well: cosine similarity, correlation, and Manhattan distance. For KNN, I find Euclidean distance the best choice because it is the most intuitive.
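To make the measures above concrete, here is a minimal sketch of three of them in plain Python. The function names and the sample points are my own assumptions for illustration, not part of the original post.

```python
import math

def euclidean(x, y):
    """Straight-line distance: sqrt(sum((xi - yi)^2))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine_similarity(x, y):
    """Cosine of the angle between x and y; undefined for a zero vector."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q))   # 5.0
print(manhattan(p, q))   # 7.0
print(cosine_similarity((1.0, 0.0), (0.0, 1.0)))  # 0.0
```

Note that cosine similarity grows as points get more alike, while the two distances shrink, so a KNN that uses it has to pick the largest values rather than the smallest.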

Then select the K nearest points and classify the sample by majority vote among them.
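The two steps (pick the K nearest, then vote) can be sketched on a toy one-dimensional dataset. The data, the `classify` helper and the use of `collections.Counter` are assumptions made for this illustration only.

```python
from collections import Counter

# Hypothetical toy data: (feature value, label) pairs on a number line.
training = [(1.0, "A"), (1.5, "A"), (2.0, "A"), (6.0, "B"), (6.5, "B")]

def classify(query, training, k):
    # Sort the training points by distance to the query and keep the k closest.
    nearest = sorted(training, key=lambda item: abs(item[0] - query))[:k]
    # Count the labels among those k neighbors; the most common one wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(classify(1.8, training, k=3))  # A
print(classify(5.0, training, k=3))  # B
```

For the query 5.0 the three nearest points are 6.0 (B), 6.5 (B) and 2.0 (A), so B wins two votes to one.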

First, let's run the KNN classifier that ships with sklearn on its built-in iris dataset.

from sklearn import neighbors  # the KNN algorithm lives in the neighbors package
from sklearn import datasets   # bundles commonly used machine-learning datasets

knn = neighbors.KNeighborsClassifier()  # create the KNN classifier

iris = datasets.load_iris()  # load the iris flower dataset
# print(iris)  # a dict-like object whose keys include data, target and target_names; too long to print here

# fit() builds the model. For KNN that essentially means storing the training data
# so that neighbors can be looked up at prediction time. It returns the classifier itself.
print(knn.fit(iris.data, iris.target))

predictedLabel = knn.predict([[0.1, 0.2, 0.3, 0.4]])  # predict the class of a new sample
print(predictedLabel)
print("predictedName:", iris.target_names[predictedLabel[0]])

Next, let's write the KNN algorithm ourselves.

import csv
import random
import math
import operator


# Load the data
def LoadDataset(filename, split):
    # split lies in [0, 1] and separates training from test data:
    # each row goes to the training set with probability split, otherwise to the test set
    trainingSet = []
    testSet = []
    with open(filename, 'rt') as csvfile:
        lines = csv.reader(csvfile)
        dataset = [row for row in lines if row]  # skip blank rows such as a trailing empty line
        for x in range(len(dataset)):
            for y in range(4):
                dataset[x][y] = float(dataset[x][y])
            if random.random() < split:
                trainingSet.append(dataset[x])
            else:
                testSet.append(dataset[x])
    return [trainingSet, testSet]


# Compute the distance between two points
def euclideanDistance(instance1, instance2, length):  # length is the number of dimensions
    distance = 0
    for x in range(length):
        distance += pow((instance1[x] - instance2[x]), 2)
    return math.sqrt(distance)


# Pick the k instances in trainingSet that are closest to testInstance
def getNeighbors(trainingSet, testInstance, k):
    distances = []
    length = len(testInstance) - 1  # the last field is the label, not a feature
    for x in range(len(trainingSet)):
        dist = euclideanDistance(testInstance, trainingSet[x], length)
        distances.append([trainingSet[x], dist])
    # operator.itemgetter does not fetch a value; it builds a function that returns
    # the given field of its argument. The key argument of sort tells it which part
    # of each element to sort by, so this sorts distances by its second field.
    distances.sort(key=operator.itemgetter(1))
    neighbors = []
    for x in range(k):
        neighbors.append(distances[x][0])
    return neighbors


# Classify by majority vote among the k nearest instances
def getResponse(neighbors):
    classVotes = {}
    for x in range(len(neighbors)):
        response = neighbors[x][-1]
        if response in classVotes:
            classVotes[response] += 1
        else:
            classVotes[response] = 1
    sortedVotes = sorted(classVotes.items(), key=operator.itemgetter(1), reverse=True)
    return sortedVotes[0][0]


# Compute the accuracy (in percent) from the predictions and the true labels
def getAccuracy(testSet, predictions):
    correct = 0
    for x in range(len(testSet)):
        if testSet[x][-1] == predictions[x]:
            correct += 1
    return (correct / float(len(testSet))) * 100


def main():
    split = 0.67  # each row has a 67% chance of going to the training set
    [trainingSet, testSet] = LoadDataset('irisdata.txt', split)
    print("trainingSet:", len(trainingSet), trainingSet)
    print("testSet", len(testSet), testSet)

    predictions = []
    k = 3  # use the three nearest instances
    # Classify every instance in the test set
    for x in range(len(testSet)):
        neighbors = getNeighbors(trainingSet, testSet[x], k)
        result = getResponse(neighbors)
        predictions.append(result)
        print(">predicted", result, ",actual=", testSet[x][-1])

    accuracy = getAccuracy(testSet, predictions)
    print("Accuracy:", accuracy, "%")


if __name__ == '__main__':
    main()

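The comments in the code record my understanding of it.

Here is the output of one run. Because the split is random, the exact numbers differ from run to run.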

trainingSet: 110 [[4.9, 3.0, 1.4, 0.2, 'Iris-setosa'], [4.7, 3.2, 1.3, 0.2, 'Iris-setosa'], [5.0, 3.6, 1.4, 0.2, 'Iris-setosa'], [5.4, 3.9, 1.7, 0.4, 'Iris-setosa'], [4.6, 3.4, 1.4, 0.3, 'Iris-setosa'], [4.4, 2.9, 1.4, 0.2, 'Iris-setosa'], [4.9, 3.1, 1.5, 0.1, 'Iris-setosa'], [5.4, 3.7, 1.5, 0.2, 'Iris-setosa'], [4.8, 3.4, 1.6, 0.2, 'Iris-setosa'], [4.3, 3.0, 1.1, 0.1, 'Iris-setosa'], [5.8, 4.0, 1.2, 0.2, 'Iris-setosa'], [5.7, 4.4, 1.5, 0.4, 'Iris-setosa'], [5.4, 3.9, 1.3, 0.4, 'Iris-setosa'], [5.7, 3.8, 1.7, 0.3, 'Iris-setosa'], [5.4, 3.4, 1.7, 0.2, 'Iris-setosa'], [4.6, 3.6, 1.0, 0.2, 'Iris-setosa'], [4.8, 3.4, 1.9, 0.2, 'Iris-setosa'], [5.0, 3.0, 1.6, 0.2, 'Iris-setosa'], [5.0, 3.4, 1.6, 0.4, 'Iris-setosa'], [5.2, 3.5, 1.5, 0.2, 'Iris-setosa'], [4.7, 3.2, 1.6, 0.2, 'Iris-setosa'], [4.8, 3.1, 1.6, 0.2, 'Iris-setosa'], [5.4, 3.4, 1.5, 0.4, 'Iris-setosa'], [5.2, 4.1, 1.5, 0.1, 'Iris-setosa'], [4.9, 3.1, 1.5, 0.1, 'Iris-setosa'], [5.0, 3.2, 1.2, 0.2, 'Iris-setosa'], [5.5, 3.5, 1.3, 0.2, 'Iris-setosa'], [4.4, 3.0, 1.3, 0.2, 'Iris-setosa'], [5.0, 3.5, 1.3, 0.3, 'Iris-setosa'], [4.5, 2.3, 1.3, 0.3, 'Iris-setosa'], [4.4, 3.2, 1.3, 0.2, 'Iris-setosa'], [5.1, 3.8, 1.9, 0.4, 'Iris-setosa'], [4.8, 3.0, 1.4, 0.3, 'Iris-setosa'], [5.1, 3.8, 1.6, 0.2, 'Iris-setosa'], [4.6, 3.2, 1.4, 0.2, 'Iris-setosa'], [5.3, 3.7, 1.5, 0.2, 'Iris-setosa'], [7.0, 3.2, 4.7, 1.4, 'Iris-versicolor'], [6.4, 3.2, 4.5, 1.5, 'Iris-versicolor'], [5.5, 2.3, 4.0, 1.3, 'Iris-versicolor'], [6.5, 2.8, 4.6, 1.5, 'Iris-versicolor'], [5.7, 2.8, 4.5, 1.3, 'Iris-versicolor'], [4.9, 2.4, 3.3, 1.0, 'Iris-versicolor'], [6.6, 2.9, 4.6, 1.3, 'Iris-versicolor'], [5.0, 2.0, 3.5, 1.0, 'Iris-versicolor'], [5.9, 3.0, 4.2, 1.5, 'Iris-versicolor'], [6.0, 2.2, 4.0, 1.0, 'Iris-versicolor'], [5.6, 2.9, 3.6, 1.3, 'Iris-versicolor'], [6.7, 3.1, 4.4, 1.4, 'Iris-versicolor'], [5.6, 3.0, 4.5, 1.5, 'Iris-versicolor'], [5.8, 2.7, 4.1, 1.0, 'Iris-versicolor'], [5.6, 2.5, 3.9, 1.1, 'Iris-versicolor'], [5.9, 3.2, 4.8, 1.8, 
'Iris-versicolor'], [6.3, 2.5, 4.9, 1.5, 'Iris-versicolor'], [6.4, 2.9, 4.3, 1.3, 'Iris-versicolor'], [6.8, 2.8, 4.8, 1.4, 'Iris-versicolor'], [6.7, 3.0, 5.0, 1.7, 'Iris-versicolor'], [6.0, 2.9, 4.5, 1.5, 'Iris-versicolor'], [5.7, 2.6, 3.5, 1.0, 'Iris-versicolor'], [5.5, 2.4, 3.8, 1.1, 'Iris-versicolor'], [5.8, 2.7, 3.9, 1.2, 'Iris-versicolor'], [6.0, 2.7, 5.1, 1.6, 'Iris-versicolor'], [5.4, 3.0, 4.5, 1.5, 'Iris-versicolor'], [6.0, 3.4, 4.5, 1.6, 'Iris-versicolor'], [6.3, 2.3, 4.4, 1.3, 'Iris-versicolor'], [5.6, 3.0, 4.1, 1.3, 'Iris-versicolor'], [5.5, 2.6, 4.4, 1.2, 'Iris-versicolor'], [6.1, 3.0, 4.6, 1.4, 'Iris-versicolor'], [5.8, 2.6, 4.0, 1.2, 'Iris-versicolor'], [5.0, 2.3, 3.3, 1.0, 'Iris-versicolor'], [5.6, 2.7, 4.2, 1.3, 'Iris-versicolor'], [5.7, 3.0, 4.2, 1.2, 'Iris-versicolor'], [5.7, 2.9, 4.2, 1.3, 'Iris-versicolor'], [6.2, 2.9, 4.3, 1.3, 'Iris-versicolor'], [5.1, 2.5, 3.0, 1.1, 'Iris-versicolor'], [5.7, 2.8, 4.1, 1.3, 'Iris-versicolor'], [6.3, 3.3, 6.0, 2.5, 'Iris-virginica'], [5.8, 2.7, 5.1, 1.9, 'Iris-virginica'], [7.1, 3.0, 5.9, 2.1, 'Iris-virginica'], [6.5, 3.0, 5.8, 2.2, 'Iris-virginica'], [7.6, 3.0, 6.6, 2.1, 'Iris-virginica'], [4.9, 2.5, 4.5, 1.7, 'Iris-virginica'], [6.5, 3.2, 5.1, 2.0, 'Iris-virginica'], [6.4, 2.7, 5.3, 1.9, 'Iris-virginica'], [5.8, 2.8, 5.1, 2.4, 'Iris-virginica'], [6.4, 3.2, 5.3, 2.3, 'Iris-virginica'], [6.5, 3.0, 5.5, 1.8, 'Iris-virginica'], [7.7, 2.6, 6.9, 2.3, 'Iris-virginica'], [6.0, 2.2, 5.0, 1.5, 'Iris-virginica'], [6.9, 3.2, 5.7, 2.3, 'Iris-virginica'], [7.7, 2.8, 6.7, 2.0, 'Iris-virginica'], [6.3, 2.7, 4.9, 1.8, 'Iris-virginica'], [7.2, 3.2, 6.0, 1.8, 'Iris-virginica'], [6.2, 2.8, 4.8, 1.8, 'Iris-virginica'], [6.1, 3.0, 4.9, 1.8, 'Iris-virginica'], [6.4, 2.8, 5.6, 2.1, 'Iris-virginica'], [7.4, 2.8, 6.1, 1.9, 'Iris-virginica'], [6.4, 2.8, 5.6, 2.2, 'Iris-virginica'], [6.1, 2.6, 5.6, 1.4, 'Iris-virginica'], [7.7, 3.0, 6.1, 2.3, 'Iris-virginica'], [6.3, 3.4, 5.6, 2.4, 'Iris-virginica'], [6.4, 3.1, 5.5, 1.8, 
'Iris-virginica'], [6.9, 3.1, 5.4, 2.1, 'Iris-virginica'], [6.7, 3.1, 5.6, 2.4, 'Iris-virginica'], [6.9, 3.1, 5.1, 2.3, 'Iris-virginica'], [5.8, 2.7, 5.1, 1.9, 'Iris-virginica'], [6.8, 3.2, 5.9, 2.3, 'Iris-virginica'], [6.7, 3.0, 5.2, 2.3, 'Iris-virginica'], [6.3, 2.5, 5.0, 1.9, 'Iris-virginica'], [6.5, 3.0, 5.2, 2.0, 'Iris-virginica'], [6.2, 3.4, 5.4, 2.3, 'Iris-virginica']]

testSet 40 [[5.1, 3.5, 1.4, 0.2, 'Iris-setosa'], [4.6, 3.1, 1.5, 0.2, 'Iris-setosa'], [5.0, 3.4, 1.5, 0.2, 'Iris-setosa'], [4.8, 3.0, 1.4, 0.1, 'Iris-setosa'], [5.1, 3.5, 1.4, 0.3, 'Iris-setosa'], [5.1, 3.8, 1.5, 0.3, 'Iris-setosa'], [5.1, 3.7, 1.5, 0.4, 'Iris-setosa'], [5.1, 3.3, 1.7, 0.5, 'Iris-setosa'], [5.2, 3.4, 1.4, 0.2, 'Iris-setosa'], [5.5, 4.2, 1.4, 0.2, 'Iris-setosa'], [4.9, 3.1, 1.5, 0.1, 'Iris-setosa'], [5.1, 3.4, 1.5, 0.2, 'Iris-setosa'], [5.0, 3.5, 1.6, 0.6, 'Iris-setosa'], [5.0, 3.3, 1.4, 0.2, 'Iris-setosa'], [6.9, 3.1, 4.9, 1.5, 'Iris-versicolor'], [6.3, 3.3, 4.7, 1.6, 'Iris-versicolor'], [5.2, 2.7, 3.9, 1.4, 'Iris-versicolor'], [6.1, 2.9, 4.7, 1.4, 'Iris-versicolor'], [6.2, 2.2, 4.5, 1.5, 'Iris-versicolor'], [6.1, 2.8, 4.0, 1.3, 'Iris-versicolor'], [6.1, 2.8, 4.7, 1.2, 'Iris-versicolor'], [6.6, 3.0, 4.4, 1.4, 'Iris-versicolor'], [5.5, 2.4, 3.7, 1.0, 'Iris-versicolor'], [6.7, 3.1, 4.7, 1.5, 'Iris-versicolor'], [5.5, 2.5, 4.0, 1.3, 'Iris-versicolor'], [6.3, 2.9, 5.6, 1.8, 'Iris-virginica'], [7.3, 2.9, 6.3, 1.8, 'Iris-virginica'], [6.7, 2.5, 5.8, 1.8, 'Iris-virginica'], [7.2, 3.6, 6.1, 2.5, 'Iris-virginica'], [6.8, 3.0, 5.5, 2.1, 'Iris-virginica'], [5.7, 2.5, 5.0, 2.0, 'Iris-virginica'], [7.7, 3.8, 6.7, 2.2, 'Iris-virginica'], [5.6, 2.8, 4.9, 2.0, 'Iris-virginica'], [6.7, 3.3, 5.7, 2.1, 'Iris-virginica'], [7.2, 3.0, 5.8, 1.6, 'Iris-virginica'], [7.9, 3.8, 6.4, 2.0, 'Iris-virginica'], [6.3, 2.8, 5.1, 1.5, 'Iris-virginica'], [6.0, 3.0, 4.8, 1.8, 'Iris-virginica'], [6.7, 3.3, 5.7, 2.5, 'Iris-virginica'], [5.9, 3.0, 5.1, 1.8, 'Iris-virginica']]

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-setosa ,actual= Iris-setosa

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-versicolor ,actual= Iris-versicolor

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-versicolor ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

>predicted Iris-virginica ,actual= Iris-virginica

Accuracy: 97.5 %
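The run above put 110 rows in the training set and 40 in the test set, which is not exactly 67% and 33%. That is expected, because each row is assigned independently with probability split. A small sketch of this behavior follows; the `random_split` helper, the stand-in rows and the seeds are assumptions for illustration.

```python
import random

def random_split(rows, split, rng):
    """Send each row to the training set with probability `split`."""
    training, test = [], []
    for row in rows:
        (training if rng.random() < split else test).append(row)
    return training, test

rows = list(range(150))  # stand-in for the 150 iris rows
for seed in (0, 1, 2):
    training, test = random_split(rows, 0.67, random.Random(seed))
    print(seed, len(training), len(test))  # the sizes vary with the seed

# Accuracy reported in the run above: 39 of the 40 test rows were predicted correctly.
print(39 / 40 * 100)  # 97.5
```

Passing a seeded `random.Random` instance makes a split repeatable, which helps when comparing different values of k on the same data.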

Below are a few related points worth expanding on.

1. Some uses of the random library

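random.randint(1,10) # returns a random integer from 1 to 10, both ends included

random.random() # returns a random float in [0, 1)

random.uniform(1.1,5.4) # returns a random float between 1.1 and 5.4; the bounds need not be integers

random.choice('tomorrow') # picks one element at random from a sequence

random.randrange(1,100,2) # returns a random element of range(1, 100, 2), i.e. an odd integer from 1 to 99

random.shuffle(a) # shuffles the sequence a in place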
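The calls listed above can be tried together in one short script. This is only a sketch: the seed 42 and the variable names are arbitrary choices of mine, and a seeded `random.Random` instance is used so that a run can be repeated.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

n = rng.randint(1, 10)          # integer in [1, 10], both ends included
f = rng.random()                # float in [0.0, 1.0)
u = rng.uniform(1.1, 5.4)       # float between 1.1 and 5.4
c = rng.choice('tomorrow')      # one character of the string
odd = rng.randrange(1, 100, 2)  # one value from 1, 3, 5, ..., 99
a = [1, 2, 3, 4, 5]
rng.shuffle(a)                  # reorders a in place and returns None

print(n, f, u, c, odd, a)
```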

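2. Sorting functions

sorted(example, key=None, reverse=False)

example.sort(key=None, reverse=False)

example is the sequence to sort. sorted returns a new list, while sort reorders the list in place.

key is a function that extracts, from each element, the value to sort by.

reverse requests descending order; it takes a boolean and defaults to False (ascending).

Python 2 also accepted a cmp argument: a function (named or lambda) that compares two elements. Python 3 removed it; functools.cmp_to_key converts such a comparison function into a key.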

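In the voting function of the program above, the statement sortedVotes=sorted(classVotes.items(),key=operator.itemgetter(1),reverse=True) sorts the (label, count) pairs of classVotes by their second field, the vote count, in descending order.

key=operator.itemgetter(n) sorts by the field at index n, that is, the (n+1)-th field.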
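A short sketch of how key and operator.itemgetter behave, plus the Python 3 replacement for a cmp-style comparison function. The vote counts and the `compare_counts` helper are made up for illustration.

```python
import operator
from functools import cmp_to_key

class_votes = {'Iris-setosa': 1, 'Iris-virginica': 2}  # hypothetical vote counts

# itemgetter(1) builds a function that returns element 1 of whatever it is given.
get_count = operator.itemgetter(1)
print(get_count(('Iris-setosa', 1)))  # 1

# Sort the (label, count) pairs by count, largest first.
by_votes = sorted(class_votes.items(), key=get_count, reverse=True)
print(by_votes[0][0])  # Iris-virginica

# A Python 2 style comparison function, adapted for Python 3 with cmp_to_key.
def compare_counts(left, right):
    return left[1] - right[1]

ascending = sorted(class_votes.items(), key=cmp_to_key(compare_counts))
print(ascending)  # [('Iris-setosa', 1), ('Iris-virginica', 2)]
```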

That's it for this post, and the library is about to close anyway. Studying feels great. Next up is the famous SVM algorithm.
