Section I: Code and Result Analysis
Code
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings("ignore")
np.random.seed(123)
#Section 1: Generate random data
variables=['X','Y','Z']
labels=['ID_1','ID_2','ID_3','ID_4','ID_5']
X=np.random.random_sample([5,3])*10
df=pd.DataFrame(X,columns=variables,index=labels)
print("Original DataFrame:\n",df)
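#Section 2: Fit the agglomerative clustering model
from sklearn.cluster import AgglomerativeClustering
# scikit-learn 1.2+ names the distance argument `metric`; older releases call it `affinity`
ac=AgglomerativeClustering(n_clusters=3,
                           metric='euclidean',
                           linkage='complete')
# Use a separate name so the ID list stored in `labels` is not overwritten
cluster_labels=ac.fit_predict(X)
print("Cluster Labels: %s" % cluster_labels)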
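As a cross-check on the scikit-learn model, the same complete-linkage clustering with Euclidean distance can be sketched with SciPy. This is an illustration under stated assumptions, not part of the original code: the names `ids`, `Z`, `scipy_labels`, and `partition` are introduced here, and cutting the tree with `criterion='maxclust'` and `t=3` is assumed to play the role of `n_clusters=3`. SciPy numbers its clusters differently from scikit-learn, so the sketch compares the groupings rather than the raw label values.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Regenerate the same random data as in the code above
np.random.seed(123)
ids = ['ID_1', 'ID_2', 'ID_3', 'ID_4', 'ID_5']  # hypothetical name for the ID list
X = np.random.random_sample([5, 3]) * 10

# Build the merge tree using complete linkage on Euclidean distances
Z = linkage(X, method='complete', metric='euclidean')

# Cut the tree so that at most 3 flat clusters remain
scipy_labels = fcluster(Z, t=3, criterion='maxclust')

# Collect the IDs that share a label; label numbering itself is arbitrary
groups = {}
for sample_id, label in zip(ids, scipy_labels):
    groups.setdefault(int(label), []).append(sample_id)

partition = sorted(groups.values())
print(partition)
```

If the two libraries agree, `partition` should contain the same three groups that the scikit-learn labels describe.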
Results
Cluster Labels: [1 0 0 2 1]
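From the result above, ID_1 and ID_5 form one cluster, ID_2 and ID_3 form another, and ID_4 is a cluster by itself. The label numbers (0, 1, 2) are arbitrary identifiers and carry no ordering; only the grouping matters.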
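The grouping can be read off programmatically by pairing each ID with its label. The sketch below is a minimal illustration that copies the printed label array by hand rather than rerunning the model; the names `ids`, `cluster_labels`, and `clusters` are introduced here for the example.

```python
from collections import defaultdict

ids = ['ID_1', 'ID_2', 'ID_3', 'ID_4', 'ID_5']
cluster_labels = [1, 0, 0, 2, 1]  # copied from the printed result

# Map each cluster label to the list of IDs assigned to it
clusters = defaultdict(list)
for sample_id, label in zip(ids, cluster_labels):
    clusters[label].append(sample_id)

for label in sorted(clusters):
    print(f"Cluster {label}: {clusters[label]}")
# Cluster 0: ['ID_2', 'ID_3']
# Cluster 1: ['ID_1', 'ID_5']
# Cluster 2: ['ID_4']
```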
References
Sebastian Raschka, Vahid Mirjalili. Python Machine Learning (2nd edition). Nanjing: Southeast University Press, 2018.