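Using the KNN Algorithm
Tools: PyCharm, Windows 10, Python 3.6.4
1. Problem statement
Given the data below, use the KNN algorithm with K = 3 to choose, from a customer's attributes, the most suitable channel for delivering advertising to that customer. The two rows marked ? are the customers to classify.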
Age m/f Sales Channel
20 f 10 E-mail
30 m 90 Phone
40 m 70 Post
60 f 100 Phone
20 m 30 E-mail
30 f 40 E-mail
70 m 80 Post
20 f 110 Phone
60 m 80 Post
40 f 20 E-mail
30 m 50 ?
90 f 100 ?
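To make the task concrete, here is a minimal sketch of what "KNN with K = 3" means on this table. It is not the implementation developed in the rest of this post: the names `X_train`, `y_train`, and `knn_predict` are hypothetical, the m/f column is assumed to be encoded as f = 1 and m = 0, and distances are plain Euclidean distances on the raw values, with no feature scaling.

```python
from collections import Counter

import numpy as np

# Labelled rows from the table as (Age, sex, Sales); assumption: f -> 1, m -> 0.
X_train = np.array([
    [20, 1, 10],
    [30, 0, 90],
    [40, 0, 70],
    [60, 1, 100],
    [20, 0, 30],
    [30, 1, 40],
    [70, 0, 80],
    [20, 1, 110],
    [60, 0, 80],
    [40, 1, 20],
], dtype=float)
y_train = ["E-mail", "Phone", "Post", "Phone", "E-mail",
           "E-mail", "Post", "Phone", "Post", "E-mail"]


def knn_predict(query, X, y, k=3):
    """Return the majority label among the k rows of X closest to query."""
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every training row (unscaled).
    distances = np.sqrt(((X - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances; a stable sort keeps table order on ties.
    nearest = np.argsort(distances, kind="stable")[:k]
    # Majority vote over the labels of those k neighbours.
    votes = Counter(y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# The two unlabelled customers from the table: (30, m, 50) and (90, f, 100).
for query in ([30, 0, 50], [90, 1, 100]):
    print(query, knn_predict(query, X_train, y_train))
```

Because Sales and Age span much larger ranges than the 0/1 sex column, the unscaled distance is dominated by those two features. Whether to normalize the columns first is a design choice this sketch leaves open.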
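2. Python code
1. First, read the data from the file and convert it into vectors. While reading, the non-numeric fields must be converted: the f/m attribute is replaced with 0/1 (the code maps f to 1 and m to 0), and Channel, which has three possible values, is represented as 1, 2, 3. The code and results are as follows: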
import numpy as np

def file2matrix(filename):
    # Read every line of the data file (fields are separated by single spaces).
    fr = open(filename)
    array_lines = fr.readlines()
    number_lines = len(array_lines)
    # One row per record, three feature columns: Age, m/f, Sales.
    returnMat = np.zeros((number_lines, 3))
    LabelVector = []
    index = 0
    for line in array_lines:
        line = line.strip()
        list_line = line.split(' ')
        # print(list_line)
        # Encode the m/f column numerically: f -> 1, m -> 0.
        if list_line[1] == 'f':
            list_line[1] = '1'
        else:
            list_line[1] = '0'
        returnMat[index, :] = list_line[0:3]
        # Encode the Channel label: E-mail -> 1, Phone -> 2.
        if list_line[-1] == 'E-mail':
            LabelVector.append(1)
        elif list_line[-1] == 'Phone':
            LabelVector.append(2)
        else: