Andrew Ng · Machine Learning || Chap 13 Clustering Notes

13 Clustering

13-1 Unsupervised learning introduction

Supervised learning

(figure: labeled training examples)

Training set: $\{ ( x ^ { ( 1 ) } , y ^ { ( 1 ) } ) , ( x ^ { ( 2 ) } , y ^ { ( 2 ) } ) , ( x ^ { ( 3 ) } , y ^ { ( 3 ) } ) , \cdots , ( x ^ { ( m ) } , y ^ { ( m ) } ) \}$

Unsupervised learning:

(figure: unlabeled training examples)

Training set: $\{ x ^ { ( 1 ) } , x ^ { ( 2 ) } , x ^ { ( 3 ) } , \cdots , x ^ { ( m ) } \}$

Applications of clustering

Market segmentation

Social network analysis

Organize computing clusters

Astronomical data analysis

13-2 K-means algorithm

Step 1: cluster assignment

Step 2: move centroids

K-means algorithm

Input:

  • K (number of clusters)
  • Training set: $\{ x ^ { ( 1 ) } , x ^ { ( 2 ) } , x ^ { ( 3 ) } , \cdots , x ^ { ( m ) } \}$

$x ^ { ( i ) } \in \mathbb{R} ^ { n }$ (drop the $x_0=1$ convention)


Randomly initialize K cluster centroids $\mu_1,\mu_2,\cdots,\mu_K\in\mathbb{R}^n$

Repeat {

​ for $i=1$ to $m$

​  $c^{(i)}$ := index (from 1 to $K$) of the cluster centroid closest to $x^{(i)}$

​ for $k=1$ to $K$

​  $\mu_k$ := average (mean) of the points assigned to cluster $k$

}
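The two alternating steps above can be sketched in NumPy (a minimal illustration, not the course's Octave code; the function names are my own, and it assumes every cluster receives at least one point):

```python
import numpy as np

def find_closest_centroids(X, centroids):
    # Cluster assignment step: c[i] = index of the centroid closest to x(i)
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def move_centroids(X, c, K):
    # Move-centroid step: mu_k = mean of the points assigned to cluster k
    # (assumes no cluster ends up empty)
    return np.array([X[c == k].mean(axis=0) for k in range(K)])

def k_means(X, initial_centroids, num_iters=10):
    centroids = initial_centroids.copy()
    for _ in range(num_iters):
        c = find_closest_centroids(X, centroids)
        centroids = move_centroids(X, c, len(centroids))
    return c, centroids
```

In practice an empty cluster is usually handled by either removing it or re-initializing its centroid at random.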

K-means for non-separated clusters

13-3 Optimization objective

K-means optimization objective

$c^{(i)}$ = index of cluster (1, 2, …, K) to which example $x^{(i)}$ is currently assigned

$\mu_k$ = cluster centroid $k$ ($\mu_k\in\mathbb{R}^n$)

$\mu_{c^{(i)}}$ = cluster centroid of the cluster to which example $x^{(i)}$ has been assigned

Optimization objective:
$$J(c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^2$$

$$\min_{\substack{c^{(1)},\cdots,c^{(m)}\\ \mu_1,\cdots,\mu_K}}J(c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K)$$
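As a sanity check, this distortion can be computed directly from the assignments and centroids (a minimal NumPy sketch; the function name is my own):

```python
import numpy as np

def distortion(X, c, centroids):
    # J(c, mu) = (1/m) * sum_i || x(i) - mu_{c(i)} ||^2
    diffs = X - centroids[c]  # pair each point with its own centroid
    return np.mean(np.sum(diffs ** 2, axis=1))
```

Both K-means steps can only decrease this quantity, so J should be non-increasing across iterations; a rising J indicates a bug.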

13-4 Random initialization

Random initialization

Should have K < m
Randomly pick K training examples
Set $\mu_1,\cdots,\mu_K$ equal to these K examples

Local optima and random initialization

K-means can converge to a bad local optimum, so run it many times with different random initializations and keep the run with the lowest cost. This helps most when K is small (roughly 2 to 10); for large K, multiple initializations are usually not necessary.

For i = 1 to 100
{
​ Randomly initialize K-means
​ Run K-means. Get $c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K$
​ Compute cost function (distortion)

​ $J(c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K)$

}

Pick the clustering that gave the lowest cost $J(c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K)$
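The loop above can be sketched end to end in NumPy (a self-contained illustration; the function names and the fallback of keeping a centroid in place when its cluster goes empty are my own assumptions):

```python
import numpy as np

def run_kmeans(X, centroids, num_iters=10):
    # One K-means run: alternate cluster assignment and centroid moves
    for _ in range(num_iters):
        c = np.argmin(np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)
        centroids = np.array([X[c == k].mean(axis=0) if np.any(c == k) else centroids[k]
                              for k in range(len(centroids))])
    return c, centroids

def multi_restart_kmeans(X, K, n_runs=100, seed=0):
    # Requires K < m; each run draws K distinct training examples as centroids
    rng = np.random.default_rng(seed)
    best_J, best = np.inf, None
    for _ in range(n_runs):
        init = X[rng.choice(len(X), size=K, replace=False)]
        c, mu = run_kmeans(X, init)
        J = np.mean(np.sum((X - mu[c]) ** 2, axis=1))  # distortion
        if J < best_J:
            best_J, best = J, (c, mu)
    return best[0], best[1], best_J
```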

13-5 Choosing the number of clusters

In practice, the number of clusters is still most often chosen by hand.

What is the right value of K? There is usually no single right answer.

Choosing the value of K

Elbow method: run K-means for a range of values of K, plot the distortion J against K, and pick the K at the "elbow" where J stops decreasing rapidly. In practice the curve is often smooth and the elbow is ambiguous.
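One way to compute the elbow curve: evaluate J for each candidate K (a minimal sketch using a single random initialization per K; the helper name is my own):

```python
import numpy as np

def kmeans_distortion(X, K, num_iters=10, seed=0):
    # One K-means run from a random training-example init; returns distortion J
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(num_iters):
        c = np.argmin(np.linalg.norm(X[:, None] - mu[None], axis=2), axis=1)
        mu = np.array([X[c == k].mean(axis=0) if np.any(c == k) else mu[k]
                       for k in range(K)])
    return np.mean(np.sum((X - mu[c]) ** 2, axis=1))

# Elbow curve for a hypothetical dataset X: at a good optimum, J shrinks
# as K grows; pick the K where the decrease levels off.
# Js = [kmeans_distortion(X, K) for K in range(1, 7)]
```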

Sometimes, you’re running K-means to get clusters to use for some later/downstream purpose. Evaluate K-means based on a metric for how well it performs for that later purpose.
