K-Means in 5 Lines of Code

Author: ChinaYiqun

Categories: Algorithms, Python. Tags: clustering, K-Means, Python, SciPy

Article link: https://blog.csdn.net/Real_neu/article/details/83933205

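import matplotlib.pyplot as plt
from numpy import array, vstack
from numpy.random import rand
from scipy.cluster.vq import kmeans2, whiten
data = vstack((rand(10, 2) + array([3, 3]), rand(10, 2)))  # two blobs of 10 points each
# data = whiten(data)  # optional: rescale each feature to unit variance
plt.scatter(data[:, 0], data[:, 1])
centroids, _ = kmeans2(data, 2, thresh=0.0001, minit='random')
plt.scatter(centroids[:, 0], centroids[:, 1], c='r')  # centroids in red
plt.show()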

scipy.cluster.vq.kmeans2(data, k, iter=10, thresh=1e-05, minit='random', missing='warn', check_finite=True)

Classify a set of observations into k clusters using the k-means algorithm. The algorithm attempts to minimize the Euclidean distance between observations and centroids. Several initialization methods are included.
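To make the "minimize the distance between observations and centroids" idea concrete, here is a minimal NumPy sketch of the classic assign-then-update (Lloyd) iteration. It is an illustration under stated assumptions, not SciPy's actual implementation; the function name `lloyd_kmeans` and its parameters are hypothetical.

```python
import numpy as np

def lloyd_kmeans(data, init_centroids, n_iter=10):
    """Plain Lloyd iteration (hypothetical helper, not SciPy's code). Needs n_iter >= 1."""
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every observation to every centroid, shape (M, k)
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Assignment step: each observation goes to its nearest centroid
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its members
        for j in range(len(centroids)):
            members = data[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return centroids, labels

data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                 [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
centroids, labels = lloyd_kmeans(data, init_centroids=[[0.0, 0.0], [5.0, 5.0]])
print(labels)
print(centroids)
```

With two well-separated groups and one starting centroid in each, the iteration settles after the first pass: each centroid ends up at the mean of its own group.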

Parameters:

data : ndarray
An ‘M’ by ‘N’ array of ‘M’ observations in ‘N’ dimensions or a length ‘M’ array of ‘M’ one-dimensional observations.

k : int or ndarray
The number of clusters to form as well as the number of centroids to generate. If the minit initialization string is ‘matrix’, or if an ndarray is given instead, it is interpreted as the initial centroids to use.

iter : int, optional
Number of iterations of the k-means algorithm to run. Note that this differs in meaning from the iter parameter of the kmeans function.
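The difference is easy to miss, so here is a small side-by-side sketch. In `kmeans2`, `iter` counts update steps within one run; in `scipy.cluster.vq.kmeans`, `iter` is the number of independent runs, and the codebook with the lowest distortion is returned. The sample data and the seed are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.vq import kmeans, kmeans2

# Illustrative data: two blobs of 10 points each, as in the 5-line example
rng = np.random.default_rng(0)
data = np.vstack((rng.random((10, 2)) + 3.0, rng.random((10, 2))))

# kmeans2: iter is the number of update steps in a single run
centroids2, labels2 = kmeans2(data, 2, iter=10, minit='points')

# kmeans: iter is the number of independent runs; the best codebook is kept
codebook, distortion = kmeans(data, 2, iter=20)

print(centroids2.shape, labels2.shape)
print(codebook.shape, distortion)
```

Note the different return values as well: `kmeans2` returns centroids and per-observation labels, while `kmeans` returns a codebook and a single distortion value.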

thresh : float, optional
(not used yet)

minit : str, optional
Method for initialization. Available methods are ‘random’, ‘points’, and ‘matrix’:

‘random’: generate k centroids from a Gaussian with mean and variance estimated from the data.

‘points’: choose k observations (rows) at random from data for the initial centroids.

‘matrix’: interpret the k parameter as a k by N (or length k array for one-dimensional data) array of initial centroids.
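The three initialization methods can be compared with a short sketch like the following. The data, the seed, and the starting centroids in `init` are illustrative assumptions; only the ‘matrix’ call is deterministic, since the other two draw random starting points.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Illustrative data: 10 points near (3.5, 3.5) followed by 10 points near (0.5, 0.5)
rng = np.random.default_rng(0)
data = np.vstack((rng.random((10, 2)) + 3.0, rng.random((10, 2))))

# 'random': starting centroids drawn from a Gaussian fitted to the data
c_random, l_random = kmeans2(data, 2, minit='random')

# 'points': starting centroids are k rows picked at random from data
c_points, l_points = kmeans2(data, 2, minit='points')

# 'matrix': the starting centroids themselves are passed in place of k
init = np.array([[0.5, 0.5], [3.5, 3.5]])
c_matrix, l_matrix = kmeans2(data, init, minit='matrix')

print(l_matrix)
```

Passing explicit starting centroids is a convenient way to get reproducible results, because the label order then follows the row order of `init`.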

missing : str, optional
Method to deal with empty clusters. Available methods are ‘warn’ and ‘raise’:

‘warn’: give a warning and continue.

‘raise’: raise a ClusterError and terminate the algorithm.
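A small sketch of the ‘raise’ behavior: the second starting centroid below is deliberately placed far from every observation, so its cluster has no members. The data and `init` values are made up for illustration.

```python
import numpy as np
from scipy.cluster.vq import ClusterError, kmeans2

data = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
# The second starting centroid is far from all points, so its cluster stays empty
init = np.array([[0.05, 0.05], [100.0, 100.0]])

try:
    kmeans2(data, init, minit='matrix', missing='raise')
    raised = False
except ClusterError as exc:
    raised = True
    print("empty cluster detected:", exc)
```

With missing='warn' the same call would emit a warning and continue instead of raising.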

check_finite : bool, optional
Whether to check that the input matrices contain only finite numbers. Disabling may give a performance gain, but may result in problems (crashes, non-termination) if the inputs do contain infinities or NaNs. Default: True

Returns:
centroid : ndarray
A ‘k’ by ‘N’ array of centroids found at the last iteration of k-means.

label : ndarray
label[i] is the code or index of the centroid the i’th observation is closest to.
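As a usage sketch for the two return values, the labels can index back into the data to split it by cluster. The sample points and the fixed starting centroids are assumptions chosen so that the result is reproducible.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

data = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                 [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]])
init = np.array([[0.0, 0.0], [3.0, 3.0]])  # fixed start for reproducibility
centroid, label = kmeans2(data, init, minit='matrix')

# Group the observations by the centroid index stored in label
clusters = [data[label == j] for j in range(len(centroid))]
# Count how many observations fall in each cluster
sizes = np.bincount(label, minlength=len(centroid))

print(centroid)
print(label, sizes)
```

Each row of `centroid` is the mean of the observations assigned to it, so `centroid[0]` matches `clusters[0].mean(axis=0)` here.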

Reference: the official SciPy documentation
