Machine Learning Reading Notes (11): Clustering Analysis

K-means

Some basic terms

  1. centroid: the center of mass (the mean) of the points in a cluster
  2. medoid: the most representative or most frequently occurring point

Steps

  1. Randomly pick $k$ centroids from the sample points as initial cluster centers.
  2. Assign each sample to the nearest centroid $\mu^{(j)}, j \in \{1, \dots, k\}$.
  3. Move each centroid to the center (mean) of the samples that were assigned to it.
  4. Repeat steps 2 and 3 until the cluster assignments do not change, or until a user-defined tolerance or a maximum number of iterations is reached.
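
The four steps above can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function name `kmeans`, its parameters (`max_iter`, `tol`, `seed`), and the choice to keep an empty cluster's old centroid are all assumptions made for this example.

```python
import numpy as np

def kmeans(X, k, max_iter=100, tol=1e-4, seed=0):
    """Plain k-means on an (n, m) array X; returns (centroids, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct samples as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each sample to its nearest centroid
        # (squared Euclidean distance, shape (n, k)).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned samples.
        # Assumption: an empty cluster simply keeps its previous centroid.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[new_labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        # Step 4: stop when assignments are unchanged or centroids barely move.
        shift = np.linalg.norm(new_centroids - centroids)
        unchanged = labels is not None and np.array_equal(new_labels, labels)
        centroids, labels = new_centroids, new_labels
        if unchanged or shift < tol:
            break
    return centroids, labels
```

Calling `kmeans(X, k=2)` on two well-separated groups of points should place one centroid in each group; with less clearly separated data the result depends on the random initial pick.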

SSE

Based on the Euclidean distance metric, we can describe the k-means algorithm as a simple optimization problem: an iterative approach for minimizing the within-cluster sum of squared errors (SSE), which is sometimes also called cluster inertia. The squared Euclidean distance between two points $x$ and $y$ in $m$-dimensional space is

$$d(x,y)^2 = \sum_{j=1}^{m} (x_j - y_j)^2 = \|x - y\|_2^2$$

and the SSE is

$$SSE = \sum_{i=1}^{n}\sum_{j=1}^{k} w^{(i,j)} \|x^{(i)} - \mu^{(j)}\|_2^2$$

Here, $\mu^{(j)}$ is the representative point (centroid) for cluster $j$, and $w^{(i,j)} = 1$ if the sample $x^{(i)}$ is in cluster $j$ and $w^{(i,j)} = 0$ otherwise.
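
As a small sketch of how the SSE could be computed for a given assignment, the snippet below uses the fact that $w^{(i,j)}$ is 1 for exactly one cluster per sample, so the double sum collapses to one squared distance per sample. The function name `sse` and the representation of the assignment as an integer array `labels` are assumptions made for illustration.

```python
import numpy as np

def sse(X, centroids, labels):
    """Within-cluster sum of squared errors for a hard cluster assignment."""
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    labels = np.asarray(labels)
    # centroids[labels] picks, for each sample, the centroid of its own
    # cluster; this plays the role of the indicator w^{(i,j)}.
    diffs = X - centroids[labels]
    return float((diffs ** 2).sum())

X = [[0, 0], [2, 0], [10, 0], [12, 0]]
centroids = [[1, 0], [11, 0]]
labels = [0, 0, 1, 1]
print(sse(X, centroids, labels))  # each sample is 1 unit from its centroid: 4.0
```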

K-means++

Place the initial centroids far away from each other via the k-means++ algorithm.

Steps

  1. Initialize an empty set $M$ to store the $k$ centroids being selected.
  2. Randomly choose the first centroid $\mu^{(j)}$ from the input samples and assign it to $M$.
  3. For each sample $x^{(i)}$ that is not in $M$, find the minimum squared distance $d(x^{(i)}, M)^2$ to any of the centroids in $M$.
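
The distance computation in step 3 can be sketched as below. This covers only that one step; the helper name `min_sq_dist_to_set` and the representation of $M$ as a 2-D array of already-chosen centroids are assumptions made for this example.

```python
import numpy as np

def min_sq_dist_to_set(X, M):
    """For each row of X, the smallest squared Euclidean distance to any row of M."""
    X = np.asarray(X, dtype=float)
    M = np.asarray(M, dtype=float)
    # Pairwise squared distances, shape (n_samples, n_chosen_centroids).
    sq = ((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)
    # Keep only the distance to the closest centroid already in M.
    return sq.min(axis=1)

X = [[0, 0], [3, 4], [10, 0]]
M = [[0, 0], [10, 0]]  # hypothetical centroids chosen so far
print(min_sq_dist_to_set(X, M))  # [ 0. 25.  0.]
```

Samples that are already in $M$ get a distance of 0, so they stand out from the samples that are still candidates.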