1. Data format
Id (index) | R (feature) | F (feature) | M (feature) |
1 | 27 | 6 | 232.61 |
2 | 3 | 5 | 1507.11 |
3 | 4 | 16 | 817.62 |
4 | 3 | 11 | 232.81 |
5 | 14 | 7 | 1913.05 |
6 | 19 | 6 | 220.07 |
7 | 5 | 2 | 615.83 |
8 | 26 | 2 | 1059.66 |
... | ... | ... | ... |
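To try the code below without the original spreadsheet, the eight rows shown above can be typed into a DataFrame by hand. This is only a sketch: the name `sample` is hypothetical, and the real file contains more rows than are listed here.

```python
import pandas as pd

# The eight rows visible in the table above; the elided rows are not reproduced.
sample = pd.DataFrame(
    {
        "Id": [1, 2, 3, 4, 5, 6, 7, 8],
        "R": [27, 3, 4, 3, 14, 19, 5, 26],
        "F": [6, 5, 16, 11, 7, 6, 2, 2],
        "M": [232.61, 1507.11, 817.62, 232.81, 1913.05, 220.07, 615.83, 1059.66],
    }
)

# Id is only an index; R, F and M are the features used for clustering.
features = sample.drop("Id", axis=1)
print(features.describe())
```

The three features are on very different scales (M runs into the thousands, F stays below 20), which is why the features are standardized before clustering.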
2. Code
(1) Main body (output the cluster centers)
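# Note: reading .xls files with pandas requires the xlrd package
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
import numpy as np

# Load the data and separate the R/F/M features from the Id index column
df = pd.read_excel("consumption_data.xls")
features = df.drop('Id', axis=1)
# Standardize each feature to zero mean and unit variance
scaler = StandardScaler()
df_scaled = scaler.fit_transform(features)
# Fit k-means with 3 clusters
model = KMeans(n_clusters=3)
model.fit(df_scaled)
# Cluster centers (in standardized units) and the number of samples per cluster
centers = pd.DataFrame(model.cluster_centers_, columns=features.columns)
r1 = pd.Series(model.labels_).value_counts()
r = pd.concat([centers, r1], axis=1)
r.columns = list(features.columns) + ['Cluster size']
print(r)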
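The centers printed above are in standardized units, because the model was fitted on the scaled data. If the centers are wanted in the original R/F/M units, one option is `StandardScaler.inverse_transform`. The sketch below is an illustration, not part of the original code: it uses the eight sample rows from the data-format section in place of `consumption_data.xls`, and `random_state`, `n_init` and the column name `size` are assumptions added here.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Stand-in data: the eight sample rows from the table, not the full file.
features = pd.DataFrame(
    {
        "R": [27, 3, 4, 3, 14, 19, 5, 26],
        "F": [6, 5, 16, 11, 7, 6, 2, 2],
        "M": [232.61, 1507.11, 817.62, 232.81, 1913.05, 220.07, 615.83, 1059.66],
    }
)

scaler = StandardScaler()
scaled = scaler.fit_transform(features)

# random_state and n_init are assumptions, fixed here so reruns give the same result.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(scaled)

# Undo the standardization so each center is expressed in the original units.
centers_original = pd.DataFrame(
    scaler.inverse_transform(model.cluster_centers_),
    columns=features.columns,
)
# Count the samples assigned to each cluster label 0..2.
centers_original["size"] = np.bincount(model.labels_, minlength=3)
print(centers_original)
```

Standardized centers are convenient for comparing clusters on a common scale (as in the radar chart), while the original-unit centers are easier to read as actual recency, frequency and monetary values.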
(2) Visualization (radar chart of the cluster centers)
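# One axis per feature, evenly spaced around the circle
attributes = ['R', 'F', 'M']
angles = np.linspace(0, 2 * np.pi, len(attributes), endpoint=False).tolist()
# Repeat the first angle so each outline closes on itself
angles += angles[:1]
fig, ax = plt.subplots(figsize=(8, 8), subplot_kw=dict(polar=True))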
# Draw one closed outline per cluster center
for i in range(len(centers)):
    values = centers.iloc[i].values.tolist()
    values += values[:1]
    ax.plot(angles, values, linewidth=1, linestyle='solid', label='Cluster {}'.format(i))
ax.set_xticks(angles[:-1])
ax.set_xticklabels(attributes)
ax.legend(loc='upper right', bbox_to_anchor=(1.1, 1.0))
plt.show()
![](https://img-blog.csdnimg.cn/direct/5b9070eb6e434b6c84e0f3adf37c6b44.png)
3. More updates to follow