Andrew Ng · Machine Learning || Chap 13 Clustering Notes

13 Clustering

13-1 Unsupervised learning introduction

Supervised learning

(figure: labeled training examples)

Training set: $\{ ( x ^ { ( 1 ) } , y ^ { ( 1 ) } ) , ( x ^ { ( 2 ) } , y ^ { ( 2 ) } ) , ( x ^ { ( 3 ) } , y ^ { ( 3 ) } ) , \cdots , ( x ^ { ( m ) } , y ^ { ( m ) } ) \}$

Unsupervised learning:

(figure: unlabeled training examples)

Training set: $\{ x ^ { ( 1 ) } , x ^ { ( 2 ) } , x ^ { ( 3 ) } , \cdots , x ^ { ( m ) } \}$

Applications of clustering

Market segmentation

Social network analysis

Organize computing clusters

Astronomical data analysis

13-2 K-means algorithm

Step 1: cluster assignment

Step 2: move centroids

K-means algorithm

Input:

  • K (number of clusters)
  • Training set: $\{ x ^ { ( 1 ) } , x ^ { ( 2 ) } , x ^ { ( 3 ) } , \cdots , x ^ { ( m ) } \}$

$x ^ { ( i ) } \in \mathbb{R} ^ { n }$ (drop the $x_0=1$ convention)


Randomly initialize K cluster centroids $\mu_1,\mu_2,\cdots,\mu_K\in\mathbb{R}^n$

Repeat {

​ for $i=1$ to $m$

​  $c^{(i)}$ := index (from 1 to $K$) of the cluster centroid closest to $x^{(i)}$

​ for $k=1$ to $K$

​  $\mu_k$ := average (mean) of the points assigned to cluster $k$

}
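The two alternating steps above can be sketched in NumPy (a minimal illustration, not the course's Octave code; the function names are my own, and it assumes every cluster receives at least one point):

```python
import numpy as np

def find_closest_centroids(X, centroids):
    # Cluster assignment step: c[i] = index of the centroid closest to x(i)
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def move_centroids(X, c, K):
    # Move-centroid step: mu_k = mean of the points assigned to cluster k
    # (assumes no cluster ends up empty)
    return np.array([X[c == k].mean(axis=0) for k in range(K)])

def k_means(X, initial_centroids, num_iters=10):
    centroids = initial_centroids.copy()
    for _ in range(num_iters):
        c = find_closest_centroids(X, centroids)
        centroids = move_centroids(X, c, len(centroids))
    return c, centroids
```

In practice an empty cluster is usually handled by either removing it or re-initializing its centroid at random.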

K-means for non-separated clusters

13-3 Optimization objective

K-means optimization objective

$c^{(i)}$ = index of cluster (1, 2, …, K) to which example $x^{(i)}$ is currently assigned

$\mu_k$ = cluster centroid $k$ ($\mu_k\in\mathbb{R}^n$)

$\mu_{c^{(i)}}$ = cluster centroid of the cluster to which example $x^{(i)}$ has been assigned

Optimization objective:
$$J(c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^2$$

$$\min_{\substack{c^{(1)},\cdots,c^{(m)}\\ \mu_1,\cdots,\mu_K}}J(c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K)$$
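As a sanity check, this distortion can be computed directly from the assignments and centroids (a minimal NumPy sketch; the function name is my own):

```python
import numpy as np

def distortion(X, c, centroids):
    # J(c, mu) = (1/m) * sum_i || x(i) - mu_{c(i)} ||^2
    diffs = X - centroids[c]  # pair each point with its own centroid
    return np.mean(np.sum(diffs ** 2, axis=1))
```

Both K-means steps can only decrease this quantity, so J should be non-increasing across iterations; a rising J indicates a bug.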

13-4 Random initialization

Random initialization

Should have K < m
Randomly pick K training examples
Set $\mu_1,\cdots,\mu_K$ equal to these K examples

Local optima and random initialization

K-means can converge to a bad local optimum, so run it many times with different random initializations and keep the run with the lowest cost. This helps most when K is small (roughly 2 to 10); for large K, multiple initializations are usually not necessary.

For i = 1 to 100
{
​ Randomly initialize K-means
​ Run K-means. Get $c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K$
​ Compute cost function (distortion)

​ $J(c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K)$

}

Pick the clustering that gave the lowest cost $J(c^{(1)},\cdots,c^{(m)},\mu_1,\cdots,\mu_K)$
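The loop above can be sketched end to end in NumPy (a self-contained illustration; the function names and the fallback of keeping a centroid in place when its cluster goes empty are my own assumptions):

```python
import numpy as np

def run_kmeans(X, centroids, num_iters=10):
    # One K-means run: alternate cluster assignment and centroid moves
    for _ in range(num_iters):
        c = np.argmin(np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)
        centroids = np.array([X[c == k].mean(axis=0) if np.any(c == k) else centroids[k]
                              for k in range(len(centroids))])
    return c, centroids

def multi_restart_kmeans(X, K, n_runs=100, seed=0):
    # Requires K < m; each run draws K distinct training examples as centroids
    rng = np.random.default_rng(seed)
    best_J, best = np.inf, None
    for _ in range(n_runs):
        init = X[rng.choice(len(X), size=K, replace=False)]
        c, mu = run_kmeans(X, init)
        J = np.mean(np.sum((X - mu[c]) ** 2, axis=1))  # distortion
        if J < best_J:
            best_J, best = J, (c, mu)
    return best[0], best[1], best_J
```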

13-5 Choosing the number of clusters

In practice, the number of clusters is still most often chosen by hand.

What is the right value of K? There is usually no single right answer.

Choosing the value of K

Elbow method: run K-means for a range of values of K, plot the distortion J against K, and pick the K at the "elbow" where J stops decreasing rapidly. In practice the curve is often smooth and the elbow is ambiguous.
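One way to compute the elbow curve: evaluate J for each candidate K (a minimal sketch using a single random initialization per K; the helper name is my own):

```python
import numpy as np

def kmeans_distortion(X, K, num_iters=10, seed=0):
    # One K-means run from a random training-example init; returns distortion J
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(num_iters):
        c = np.argmin(np.linalg.norm(X[:, None] - mu[None], axis=2), axis=1)
        mu = np.array([X[c == k].mean(axis=0) if np.any(c == k) else mu[k]
                       for k in range(K)])
    return np.mean(np.sum((X - mu[c]) ** 2, axis=1))

# Elbow curve for a hypothetical dataset X: at a good optimum, J shrinks
# as K grows; pick the K where the decrease levels off.
# Js = [kmeans_distortion(X, K) for K in range(1, 7)]
```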

Sometimes, you’re running K-means to get clusters to use for some later/downstream purpose. Evaluate K-means based on a metric for how well it performs for that later purpose.
