Yes, setting the initial centroids via init should work. Here is a quote from the scikit-learn documentation:

    init : {‘k-means++’, ‘random’ or an ndarray}
    Method for initialization, defaults to ‘k-means++’:
    If an ndarray is passed, it should be of shape (n_clusters, n_features)
    and gives the initial centers.

What is the shape (n_clusters, n_features) referring to?
The shape requirement means that init must have exactly n_clusters rows, and the number of elements in each row must match the dimensionality of actual_data_points:

>>> import numpy as np
>>> init = np.array([[-0.12, 0.939, 0.321, 0.011],
...                  [0.0, 0.874, -0.486, 0.862],
...                  [0.0, 1.0, 0.0, 0.033],
...                  [0.12, 0.939, 0.321, -0.7],
...                  [0.0, 1.0, 0.0, -0.203],
...                  [0.12, 0.939, -0.321, 0.25],
...                  [0.0, 0.874, 0.486, -0.575],
...                  [-0.12, 0.939, -0.321, 0.961]],
...                 np.float64)
>>> init.shape[0] == 8
True # n_clusters
>>> init.shape[1] == actual_data_points.shape[1]
True # n_features

What is n_features?
n_features is the dimensionality of the samples. For example, if you are clustering points on a 2D plane, n_features would be 2.
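
To see the shape rule in action, here is a minimal sketch that passes an (n_clusters, n_features) array as init to KMeans. The contents of actual_data_points are not given in the question, so the random data below is only a hypothetical stand-in with 4 features. Setting n_init=1 is my own choice here, on the reasoning that restarting from the same fixed centers would not change the result.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical stand-in for actual_data_points: 200 samples, 4 features each.
rng = np.random.default_rng(0)
actual_data_points = rng.normal(size=(200, 4))

# Eight initial centers: one row per cluster, one column per feature.
init = np.array([[-0.12, 0.939, 0.321, 0.011],
                 [0.0, 0.874, -0.486, 0.862],
                 [0.0, 1.0, 0.0, 0.033],
                 [0.12, 0.939, 0.321, -0.7],
                 [0.0, 1.0, 0.0, -0.203],
                 [0.12, 0.939, -0.321, 0.25],
                 [0.0, 0.874, 0.486, -0.575],
                 [-0.12, 0.939, -0.321, 0.961]],
                dtype=np.float64)

# Read n_clusters and n_features straight off the init array.
n_clusters, n_features = init.shape

# The column count of init must equal the number of features in the data.
assert n_features == actual_data_points.shape[1]

# n_init=1 because the starting centers are fixed by the init array.
kmeans = KMeans(n_clusters=n_clusters, init=init, n_init=1)
labels = kmeans.fit_predict(actual_data_points)

# The fitted centers keep the same (n_clusters, n_features) shape as init.
print(kmeans.cluster_centers_.shape)  # (8, 4)
```

If the column count of init did not match the data, the assertion would fail before the fit is attempted, which makes the mismatch easier to spot.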