I. K-means Clustering
1. Load the data
import pandas as pd
beer = pd.read_csv('data.txt', sep=' ')
beer
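To make the expected file layout concrete, here is a minimal sketch that parses a small in-memory stand-in for data.txt. The rows and the `name` column are assumptions made up for illustration. Only the four numeric column names come from this walkthrough.

```python
import io
import pandas as pd

# Hypothetical stand-in for data.txt: space-separated, with a header row.
# The brand names and values are invented for illustration only.
sample = io.StringIO(
    "name calories sodium alcohol cost\n"
    "Brand_A 144 15 4.7 0.43\n"
    "Brand_B 151 19 4.9 0.43\n"
    "Brand_C 68 15 2.3 0.38\n"
)

# sep=" " splits on single spaces, matching the read_csv call above.
beer_sample = pd.read_csv(sample, sep=" ")
print(beer_sample.dtypes)
```

If the real file separates fields with runs of spaces or tabs, `sep=r"\s+"` is likely the safer choice.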
2. Keep only the four useful columns
X = beer[["calories","sodium","alcohol","cost"]]
X.head()
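3. Cluster X with KMeans
from sklearn.cluster import KMeans
# Fit two models so the 3-cluster and 2-cluster solutions can be compared
km = KMeans(n_clusters=3).fit(X)
km2 = KMeans(n_clusters=2).fit(X)
# Cluster index assigned to each row of X
km.labels_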
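K-means starts from random initial centers, so the label numbers (and occasionally the grouping) can change between runs. The sketch below uses made-up 2-D points, not the beer data, to show how fixing `random_state` makes a fit repeatable. The values chosen for `n_init` and `random_state` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up points forming two well-separated groups (illustration only).
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]])

# Same data and same seed for both fits.
km_a = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
km_b = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

# With an identical seed the label assignments match exactly.
print((km_a.labels_ == km_b.labels_).all())
print(km_a.cluster_centers_)
```

The label values themselves (0, 1, ...) are arbitrary identifiers. Only the grouping they define carries meaning.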
4. Sort by cluster label
beer['cluster'] = km.labels_
beer['cluster2'] = km2.labels_
beer.sort_values('cluster')
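Note that `sort_values` returns a new, sorted DataFrame and leaves the original untouched. A small sketch on a hand-built toy frame (the names and labels are invented) also counts how many rows fall in each cluster:

```python
import pandas as pd

# Toy frame with hand-assigned cluster labels (illustration only).
toy = pd.DataFrame({"name": ["a", "b", "c", "d"], "cluster": [2, 0, 1, 0]})

# sort_values returns a sorted copy; toy keeps its original row order.
by_cluster = toy.sort_values("cluster")
print(by_cluster)

# Number of rows per cluster, ordered by cluster label.
sizes = toy["cluster"].value_counts().sort_index()
print(sizes)
```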
5. Per-cluster means for the 3-cluster model
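# numeric_only=True skips any text columns, which recent pandas versions refuse to average
beer.groupby("cluster").mean(numeric_only=True)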
6. Per-cluster means for the 2-cluster model
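# numeric_only=True skips any text columns, which recent pandas versions refuse to average
beer.groupby("cluster2").mean(numeric_only=True)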
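The per-cluster mean is a group-then-average operation. The following sketch runs it on a toy frame with hand-assigned labels (all values are invented) and assumes the data includes a text column such as `name`, which `numeric_only=True` leaves out of the result.

```python
import pandas as pd

# Toy frame with a text column; labels are assigned by hand for illustration.
toy = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "calories": [100, 110, 150, 160],
    "cost": [0.40, 0.50, 0.70, 0.80],
    "cluster": [0, 0, 1, 1],
})

# Average every numeric column within each cluster; the text column is dropped.
means = toy.groupby("cluster").mean(numeric_only=True)
print(means)
```

Each row of the result can be read as a rough profile of one cluster, which is how the two tables above are meant to be compared.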
7. reset_index
centers = beer