Given a two-dimensional dataset, partition it into 2 clusters.
Point x y
1 0 0
2 1 1
3 5 5
4 5 6
5 2 0
File contents (path: D:\name.txt):
1,0,0
2,1,1
3,5,5
4,5,6
5,2,0
Code:
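from sklearn.cluster import KMeans


def loadData(filePath):
    """Read lines of the form 'name,x,y' and return (data, names)."""
    retData = []
    retName = []
    # Read-only mode is enough; 'with' closes the file automatically.
    with open(filePath, 'r') as fr:
        for line in fr:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            items = line.split(",")  # split each record into its fields
            retName.append(items[0])
            retData.append([float(v) for v in items[1:]])
    return retData, retName


if __name__ == '__main__':
    # Raw string: in a normal literal, the '\n' in 'D:\name.txt'
    # would be interpreted as a newline character.
    data, Name = loadData(r'D:\name.txt')
    # n_clusters=2: split into two clusters. random_state fixes the
    # initialization so that repeated runs give the same labels.
    km = KMeans(n_clusters=2, n_init=10, random_state=0)
    label = km.fit_predict(data)
    # Two clusters, so create two sublists of point names.
    NameCluster = [[], []]
    for i in range(len(Name)):
        NameCluster[label[i]].append(Name[i])
    for i in range(len(NameCluster)):
        print(NameCluster[i])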
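Output, which matches what you would conclude from plotting the points (the two lists may appear in the opposite order, since k-means cluster numbering is arbitrary):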
['1', '2', '5']
['3', '4']
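
To see why these five points split this way, here is a minimal sketch of the basic k-means iteration (Lloyd's algorithm) in pure Python. It is not how scikit-learn's KMeans is implemented. The function name kmeans_sketch and the choice of points 1 and 3 as starting centers are assumptions made for illustration only; the data is the same five points as above, written inline instead of being read from the file.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_sketch(points, init_centers, max_iter=100):
    """Hypothetical helper: alternate assignment and update steps until stable."""
    centers = [tuple(float(v) for v in c) for c in init_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break  # centers stopped moving, so the assignment is stable
        centers = new_centers
    return labels, centers


names = ['1', '2', '3', '4', '5']
points = [(0, 0), (1, 1), (5, 5), (5, 6), (2, 0)]

# Assumed starting centers: point 1 and point 3.
labels, centers = kmeans_sketch(points, [points[0], points[2]])

groups = [[], []]
for name, l in zip(names, labels):
    groups[l].append(name)
for g in groups:
    print(g)
# ['1', '2', '5']
# ['3', '4']
```

Point 5 at (2, 0) is at distance 2 from point 1 but about 5.8 from point 3, so it joins the first group, and the centers settle after one update. With less fortunate starting centers the basic iteration can stop at a worse split, which is one reason library implementations use smarter initialization and several restarts.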