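After clustering, you can plot the result directly with the code below, and the clusters will be colored by matplotlib's default color mapping. When there are many clusters, though, the colors of different clusters become too similar (or effectively repeat), so the clusters are hard to tell apart visually.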
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
# Illustrative only: train is the training data, a 2-D array with at least two columns
clusters_number = 20
y_pred = KMeans(n_clusters=clusters_number, random_state=9).fit_predict(train)
fig, ax = plt.subplots()
# Passing the labels as c colors the points through the default colormap
ax.scatter(train[:,1],train[:,0], c=y_pred)
plt.savefig("cluster.png")
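The snippets in this post assume that train already exists. To try them without real data, something like the following might serve as a stand-in. This is only a sketch: make_demo_data, its parameters, and the blob layout are made up for illustration and are not part of the original workflow.

```python
import numpy as np

def make_demo_data(n_clusters=20, points_per_cluster=30, seed=0):
    """Hypothetical helper: build 2-D points scattered around random centers."""
    rng = np.random.default_rng(seed)
    # One random center per cluster inside a 20 x 20 square
    centers = rng.uniform(-10, 10, size=(n_clusters, 2))
    # Gaussian noise around each center
    noise = rng.normal(scale=0.5, size=(n_clusters, points_per_cluster, 2))
    # Broadcast centers over the points, then flatten to (n_samples, 2)
    return (centers[:, None, :] + noise).reshape(-1, 2)

train = make_demo_data()
print(train.shape)
```

With the defaults this gives 600 points in two columns, which is enough for the train[:,0] and train[:,1] indexing used here.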
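Instead, we can assign the color of each cluster ourselves when plotting and label the clusters with a legend. The code below is a sketch of this. (Apart from the color labeling it does no other styling, so there is still plenty of room to improve how the figure looks.)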
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import rgb2hex
from sklearn.cluster import KMeans
# Illustrative only: train is the training data, a 2-D array with at least two columns
clusters_number = 20
y_pred = KMeans(n_clusters=clusters_number, random_state=9).fit_predict(train)
fig, ax = plt.subplots()
# Single-letter color abbreviations give only a handful of colors, so they repeat a lot when there are many clusters
# c = ['b','c','y','r','g','m','w','k']
# colors = [c[i%len(c)] for i in range(clusters_number)]
# Generate one random RGB color per cluster, then convert it to a hex string
colors = [(np.random.random(), np.random.random(), np.random.random()) for i in range(clusters_number)]
colors = [rgb2hex(x) for x in colors]
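for i, color in enumerate(colors):
    # Row indices of the samples assigned to cluster i
    need_idx = np.where(y_pred==i)[0]
    # One scatter call per cluster so that each cluster gets its own legend entry
    ax.scatter(train[need_idx,1],train[need_idx,0], c=color, label=i)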
# Optional: change the axis tick spacing (needs: from matplotlib.ticker import MultipleLocator)
# x_locator = MultipleLocator(0.01)
# y_locator = MultipleLocator(0.01)
# ax = plt.gca()
# ax.xaxis.set_major_locator(x_locator)
# ax.yaxis.set_major_locator(y_locator)
legend = ax.legend(loc='upper right')
plt.savefig("cluster.png")
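Random colors change on every run and can still land close to each other by chance. One possible alternative, shown here only as a sketch, is to space the hues evenly around the color wheel using the standard-library colorsys module. The helper name distinct_hex_colors and its saturation and value defaults are assumptions chosen for illustration, not something from the code above.

```python
import colorsys

def distinct_hex_colors(n, saturation=0.65, value=0.9):
    """Hypothetical helper: n hex colors with evenly spaced hues."""
    colors = []
    for i in range(n):
        # Hue i/n walks once around the color wheel; saturation and value stay fixed
        r, g, b = colorsys.hsv_to_rgb(i / n, saturation, value)
        # Convert the 0..1 floats to a '#rrggbb' string
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors

colors = distinct_hex_colors(20)
print(colors[:3])
```

The resulting list can replace the randomly generated colors in the loop above, since ax.scatter accepts hex strings for c. With 20 clusters, neighboring hues are still fairly close, so combining color with different marker shapes may help further.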