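For simplicity, I have hardcoded the filename as "image.jpg". The image is resized to keep things fast: if you don't mind the wait, comment out the resize call. When run on the sample image, it usually says the dominant colour is #d8c865, which corresponds roughly to the bright yellowish area to the lower left of the two peppers. I say "usually" because the clustering algorithm used has a degree of randomness to it. There are various ways you could change this, but for your purposes it may suit well. (If you need deterministic results, check out the options on the kmeans2() variant.)

from __future__ import print_function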
import binascii
from PIL import Image
import numpy as np
import scipy.cluster.vq

NUM_CLUSTERS = 5

print('reading image')
im = Image.open('image.jpg')
im = im.resize((150, 150))  # optional, to reduce time
ar = np.asarray(im)
shape = ar.shape
# flatten to one row per pixel, one column per colour channel
ar = ar.reshape(-1, shape[2]).astype(float)

print('finding clusters')
codes, dist = scipy.cluster.vq.kmeans(ar, NUM_CLUSTERS)
print('cluster centres:\n', codes)

vecs, dist = scipy.cluster.vq.vq(ar, codes)  # assign each pixel to its nearest code
counts = np.bincount(vecs, minlength=len(codes))  # count occurrences
index_max = np.argmax(counts)  # find most frequent
peak = codes[index_max]
colour = binascii.hexlify(bytearray(int(c) for c in peak)).decode('ascii')
print('most frequent is %s (#%s)' % (peak, colour))
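
On the remark about deterministic results: a minimal sketch of one way to get them, assuming SciPy 1.7 or later, where kmeans2() accepts a `seed` keyword. The helper name `dominant_colour` and the synthetic pixel data are made up for illustration; in the script above you would pass the flattened array `ar` instead.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def dominant_colour(pixels, k=5, seed=0):
    """Return the centroid of the most populated cluster (hypothetical helper)."""
    # minit='++' selects k-means++ initialisation; a fixed seed makes the
    # random draws, and therefore the result, repeatable between runs.
    centroids, labels = kmeans2(pixels, k, minit='++', seed=seed)
    counts = np.bincount(labels, minlength=k)
    return centroids[np.argmax(counts)]

# Synthetic stand-in for the flattened image array `ar` (assumption: no real image needed)
rng = np.random.default_rng(42)
pixels = np.vstack([
    rng.normal((216, 200, 101), 5, size=(600, 3)),  # yellowish pixels
    rng.normal((40, 120, 40), 5, size=(300, 3)),    # green pixels
    rng.normal((30, 30, 150), 5, size=(100, 3)),    # blue pixels
])

first = dominant_colour(pixels, k=3, seed=0)
second = dominant_colour(pixels, k=3, seed=0)
print('same result on both runs:', np.array_equal(first, second))
```

A fixed seed only makes the run repeatable; a different seed can still land on a different dominant colour.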
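Note: when I expand the number of clusters from 5 to 10 or 15, it frequently gives results that are greenish or bluish. Given the input image, those are reasonable results too... I can't tell which colour is really dominant in that image either, so I don't fault the algorithm!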
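
To see how close the runner-up colours are for a given cluster count, one option is to rank the clusters by their share of pixels. This is only a sketch: `cluster_shares` is a hypothetical helper, the pixel data is synthetic, and the `seed` keyword of kmeans() is assumed to be available (SciPy 1.7 or later).

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

def cluster_shares(pixels, k, seed=0):
    """Return cluster centres sorted by pixel share, largest first (hypothetical helper)."""
    codes, _ = kmeans(pixels, k, seed=seed)
    labels, _ = vq(pixels, codes)  # nearest centre for every pixel
    # kmeans may return fewer than k centres, so size the counts by len(codes)
    counts = np.bincount(labels, minlength=len(codes))
    order = np.argsort(counts)[::-1]
    return codes[order], counts[order] / counts.sum()

# Synthetic stand-in for the flattened image array `ar`
rng = np.random.default_rng(1)
pixels = np.vstack([
    rng.normal((216, 200, 101), 20, size=(500, 3)),  # yellowish
    rng.normal((40, 120, 40), 20, size=(450, 3)),    # green
    rng.normal((30, 30, 150), 20, size=(400, 3)),    # blue
])

for k in (5, 10, 15):
    codes, shares = cluster_shares(pixels, k)
    print(k, 'clusters, top two shares:', np.round(shares[:2], 3))
```

When the top two shares are nearly equal, a change in the winning colour between runs or cluster counts is to be expected.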
There is also a small bonus: saving the reduced-size image using only the N most common colours:

# bonus: save image using only the N most common colours
import imageio
c = ar.copy()
for i, code in enumerate(codes):
    c[vecs == i, :] = code  # paint every pixel with its cluster centre
imageio.imwrite('clusters.png', c.reshape(*shape).astype(np.uint8))
print('saved clustered image')