Paper Appreciation: A Linear Assignment Clustering Algorithm Based on the Least Similar Cluster Representatives

This post lists the paper's main formulas; it works best when read alongside the explanation video.
Explanation video

Similarity and Dissimilarity Coefficients: Definitions

We start with how the two coefficients are defined. Similarity and dissimilarity coefficients measure how similar or how dissimilar two data are.
A typical dissimilarity coefficient is the Minkowski metric
$$d_{ij}=\bigg[\sum_{k=1}^{m}\lvert a_{ki}-a_{kj}\rvert^{q}\bigg]^{\frac{1}{q}}$$
where $q>0$ and $a_{ki}$ is the value of the $k$-th of $m$ attributes of datum $i$.
The similarity coefficient is defined as
$$s_{ij}=\frac{\sum_{k=1}^{m}a_{ki}a_{kj}}{\sum_{k=1}^{m}(a_{ki}+a_{kj}-a_{ki}a_{kj})}$$
For 0/1 data this is the Jaccard coefficient: the number of attributes shared by $i$ and $j$ divided by the number of attributes present in at least one of them.
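The two coefficients above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names `minkowski` and `similarity` are made up here, the Minkowski distance uses absolute differences as in the standard definition, and returning 0 for two all-zero vectors is an assumption, since the formula is undefined in that case.

```python
def minkowski(a, b, q=2.0):
    """Minkowski dissimilarity d_ij between two equal-length attribute vectors."""
    if q <= 0:
        raise ValueError("q must be positive")
    return sum(abs(x - y) ** q for x, y in zip(a, b)) ** (1.0 / q)


def similarity(a, b):
    """Similarity coefficient s_ij (the Jaccard coefficient for 0/1 data)."""
    num = sum(x * y for x, y in zip(a, b))
    den = sum(x + y - x * y for x, y in zip(a, b))
    # Assumption: two all-zero vectors are given similarity 0 (formula is 0/0 there).
    return num / den if den else 0.0


a = [1, 1, 0, 1]
b = [1, 0, 0, 1]
print(minkowski(a, b, q=1))  # 1.0: the vectors differ in one attribute
print(similarity(a, b))      # 2/3: two shared attributes out of three present
```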

Cluster Representatives: Choosing the Representatives

Method 1: pick the least similar pair first, then repeatedly add the datum that is least similar to the representatives chosen so far.
$$\{r_1,r_2\}=\arg\min_{(i,j)}s_{ij}$$
$$r_k=\arg\min_{i\in\{1,\dots,n\}\setminus\{r_1,\dots,r_{k-1}\}}\sum_{j=1}^{k-1}s_{ir_j},\qquad k=3,4,\dots,p$$
where $r_k$ is the index of the datum chosen as the representative of the $k$-th cluster.
Method 2: choose all $p$ representatives jointly so that their total pairwise similarity is minimal.
$$\{r_1,r_2,\dots,r_p\}=\arg\min\bigg\{\sum_{i=1}^{p}\sum_{j<i}s_{r_ir_j}\;\bigg|\;r_i,r_j\in\{1,2,\dots,n\}\bigg\}$$
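To make the two selection rules concrete, here is a small Python sketch. It is an illustration under assumptions rather than the paper's code: the function names and the 5-by-5 similarity matrix `s` are invented for the example, indices are 0-based, `p` is assumed to be at least 2, and Method 1 is read as searching only among data not yet chosen.

```python
from itertools import combinations


def greedy_representatives(s, p):
    """Method 1: least similar pair first, then one datum at a time."""
    n = len(s)
    r1, r2 = min(combinations(range(n), 2), key=lambda ij: s[ij[0]][ij[1]])
    reps = [r1, r2]
    while len(reps) < p:
        candidates = [i for i in range(n) if i not in reps]
        # Pick the datum with the smallest total similarity to the current representatives.
        reps.append(min(candidates, key=lambda i: sum(s[i][r] for r in reps)))
    return reps


def exhaustive_representatives(s, p):
    """Method 2: the p-subset with the smallest sum of pairwise similarities."""
    n = len(s)
    best = min(combinations(range(n), p),
               key=lambda subset: sum(s[i][j] for i, j in combinations(subset, 2)))
    return list(best)


# Hypothetical symmetric similarity matrix for five data.
s = [
    [1.0, 0.8, 0.1, 0.2, 0.3],
    [0.8, 1.0, 0.2, 0.3, 0.5],
    [0.1, 0.2, 1.0, 0.7, 0.4],
    [0.2, 0.3, 0.7, 1.0, 0.6],
    [0.3, 0.5, 0.4, 0.6, 1.0],
]
print(greedy_representatives(s, 3))      # [0, 2, 4]
print(exhaustive_representatives(s, 3))  # [0, 2, 4]
```

On this matrix both methods agree, but in general they need not: Method 1 is greedy and cheap, while Method 2 examines every subset of size p and becomes expensive quickly as n grows.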

Linear Assignment Model

$$\text{Maximize}\quad\sum_{i=1}^{n}\sum_{k=1}^{p}s_{ir_k}x_{ik}\quad\text{or minimize}\quad\sum_{i=1}^{n}\sum_{k=1}^{p}d_{ir_k}x_{ik}$$
$$\text{subject to}\quad\sum_{k=1}^{p}x_{ik}=1,\qquad i=1,2,\dots,n$$
$$\sum_{i=1}^{n}x_{ik}\le u,\qquad k=1,2,\dots,p$$
$$x_{ik}\ge0,\qquad i=1,2,\dots,n;\quad k=1,2,\dots,p$$
where $x_{ik}$ is a binary decision variable ($x_{ik}=1$ if datum $i$ is assigned to cluster $k$, and 0 otherwise) and $u$ is the upper bound on the number of data per cluster. The first constraint assigns every datum to exactly one cluster; the second keeps every cluster within its capacity.
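For a tiny instance the model can be checked by brute force. The sketch below reads the constraints as "each datum goes to exactly one cluster" and "each cluster holds at most u data", and simply enumerates every labelling. The function name `solve_assignment_bruteforce` and the matrix `s_rep` are hypothetical, and the approach is only meant to illustrate the objective and constraints; a real instance would be handed to an LP or assignment solver.

```python
from itertools import product


def solve_assignment_bruteforce(s_rep, u):
    """Maximize total similarity to the representatives under the capacity u.

    s_rep[i][k] is the similarity between datum i and the representative of cluster k.
    Returns (best objective value, labels) or (None, None) if no assignment is feasible.
    """
    n, p = len(s_rep), len(s_rep[0])
    best_value, best_labels = None, None
    # labels[i] = k encodes x_ik = 1, so "one cluster per datum" holds by construction.
    for labels in product(range(p), repeat=n):
        if any(labels.count(k) > u for k in range(p)):
            continue  # violates the capacity constraint sum_i x_ik <= u
        value = sum(s_rep[i][labels[i]] for i in range(n))
        if best_value is None or value > best_value:
            best_value, best_labels = value, labels
    return best_value, best_labels


# Hypothetical similarities of four data to two cluster representatives.
s_rep = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.6], [0.2, 0.9]]
print(solve_assignment_bruteforce(s_rep, u=2)[1])  # (0, 0, 1, 1)
print(solve_assignment_bruteforce(s_rep, u=3)[1])  # (0, 0, 0, 1)
```

With u = 2 the third datum is pushed into the second cluster even though it is slightly closer to the first representative, which is exactly the effect of the capacity constraint.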

Assignment Clustering Algorithm

Step 0: Set $I=\{1,2,\dots,n\}$ and $K=\{1,2,\dots,p\}$.
Step 1: Load the number of clusters $p$ and the upper bound $u$ on the number of data per cluster.
Step 2: Load or compute the similarity coefficients between every pair of data.
Step 3: Determine the cluster representatives using Method 1 (equations (15) and (16) in the paper) or Method 2 (equation (17)), then remove $r_k$ ($k=1,2,\dots,p$) from $I$.
Step 4: Determine $(v,w)=\arg\max_{i\in I,\,k\in K}s_{ir_k}$.
Step 5: If the number of data in cluster $w$ is $u$, remove $w$ from $K$ and go to Step 4; otherwise, assign datum $v$ to cluster $w$ and delete $v$ from $I$.
Step 6: If $I\ne\emptyset$, go to Step 4.
Step 7: Evaluate the clustering result using one or more performance criteria.
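Steps 0 to 6 can be sketched as a short greedy loop. This is a minimal illustration under assumptions, not the authors' implementation: the function name `assignment_clustering` and the example matrix are invented, indices are 0-based, each representative is assumed to count toward its own cluster's capacity u, and raising an error when every cluster is full is a safeguard added here.

```python
def assignment_clustering(s, reps, u):
    """Greedy assignment: repeatedly make the most similar feasible (datum, cluster) match."""
    n, p = len(s), len(reps)
    # Assumption: the representative is the first member of its cluster and counts toward u.
    clusters = {k: [reps[k]] for k in range(p)}
    I = [i for i in range(n) if i not in reps]   # Step 3: representatives leave I
    K = list(range(p))
    while I:                                      # Step 6: repeat until every datum is assigned
        if not K:
            raise ValueError("u * p is too small to hold all data")
        # Step 4: the most similar pair of unassigned datum and open cluster.
        v, w = max(((i, k) for i in I for k in K),
                   key=lambda ik: s[ik[0]][reps[ik[1]]])
        if len(clusters[w]) >= u:                 # Step 5: cluster w is full, close it
            K.remove(w)
        else:                                     # Step 5: otherwise assign v to w
            clusters[w].append(v)
            I.remove(v)
    return clusters


# Hypothetical symmetric similarity matrix for five data.
s = [
    [1.0, 0.8, 0.1, 0.2, 0.3],
    [0.8, 1.0, 0.2, 0.3, 0.5],
    [0.1, 0.2, 1.0, 0.7, 0.4],
    [0.2, 0.3, 0.7, 1.0, 0.6],
    [0.3, 0.5, 0.4, 0.6, 1.0],
]
print(assignment_clustering(s, reps=[0, 2, 4], u=2))  # {0: [0, 1], 1: [2, 3], 2: [4]}
```

Step 7 is left out on purpose: which performance criteria to use depends on the application.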
