9.26 Unsupervised learning algorithm

Clustering

Unsupervised learning introduction

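In a supervised learning problem, we are given a training set with labels and fit a hypothesis to it. In unsupervised learning, the training set has no labels, and the algorithm has to find structure in the data on its own, for example by grouping it into clusters.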

K-means algorithm

  • coherent subsets

  • coherent clusters

  • The first step is to randomly initialize two points, called the cluster centroids (two, because I want to group my data into two clusters). K-means is an iterative algorithm that repeats two steps: first a cluster assignment step, second a move centroid step. In the cluster assignment step, go through the data set and give each point the same color as the cluster centroid it is closest to.

  • The second step of the inner loop of K-means is the move centroid step: take the two cluster centroids (red and blue) and move each one to the average of the points that have the same color. In other words, compute the mean of the points assigned to each centroid and move the centroid there.

  • Then repeat the inner loop (cluster assignment, then move centroid) until the centroids stop moving.
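
The two steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the helper names, the toy data, the choice to start from K randomly picked training points, and the handling of empty clusters are all assumptions made for this example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]

def kmeans(X, K, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Assumption: initialize the centroids as K distinct random training points.
    centroids = [list(x) for x in rng.sample(X, K)]
    c = []
    for _ in range(max_iters):
        # Cluster assignment step: c[i] is the index of the centroid closest to X[i].
        c = [min(range(K), key=lambda k: sq_dist(x, centroids[k])) for x in X]
        # Move centroid step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k in range(K):
            members = [x for x, ci in zip(X, c) if ci == k]
            # Assumption: an empty cluster keeps its old centroid.
            new_centroids.append(mean_point(members) if members else centroids[k])
        if new_centroids == centroids:  # centroids stopped moving
            break
        centroids = new_centroids
    return c, centroids

# Toy data: two well-separated groups of three points each.
X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
     [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
c, centroids = kmeans(X, K=2)
print(c, centroids)
```

On this toy data the first three points end up in one cluster and the last three in the other; which cluster gets index 0 depends on the random start.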

Optimization objective

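Optimization objective of K-means, where $c^{(i)}$ is the index of the cluster to which example $x^{(i)}$ is currently assigned and $\mu_k$ is cluster centroid $k$: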

$$J(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^2$$

$$\min_{c^{(1)},\dots,c^{(m)},\ \mu_1,\dots,\mu_K} J(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K)$$

  • This cost function J is also called the distortion cost function, or the distortion of the K-means algorithm.
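
As a small worked example of the distortion, the sketch below evaluates J for a hand-made assignment. The function name `distortion` and the four data points are assumptions for illustration, and cluster indices are 0-based here, unlike the 1-based notation in the formula.

```python
def distortion(X, c, centroids):
    """Average squared distance from each point to its assigned centroid."""
    total = 0.0
    for x, ci in zip(X, c):
        total += sum((a - b) ** 2 for a, b in zip(x, centroids[ci]))
    return total / len(X)

X = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
centroids = [[0.0, 1.0], [4.0, 1.0]]   # mu_k for k = 0, 1

c = [0, 0, 1, 1]                       # each point assigned to its nearest centroid
J = distortion(X, c, centroids)        # every point is at distance 1, so J = 1.0

c_bad = [0, 1, 0, 1]                   # two points assigned to the far centroid
J_bad = distortion(X, c_bad, centroids)  # (1 + 17 + 17 + 1) / 4 = 9.0
```

The cluster assignment step lowers J by changing the assignments with the centroids held fixed, and the move centroid step lowers J by changing the centroids with the assignments held fixed.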

Random initialization

How to initialize K-means, and how to make K-means avoid local optima as well.

Running K-means several times with different random initializations, and keeping the run with the lowest distortion, can give a better result. This helps most when the number of clusters is small.
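
A minimal sketch of the multiple-random-initialization idea follows. The names `kmeans_once` and `kmeans_best_of`, the default of 50 runs, the toy data, and the choice to start each run from K randomly picked training points are assumptions for this example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(X, centroids):
    """Index of the closest centroid for every point."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(x, centroids[k]))
            for x in X]

def kmeans_once(X, K, rng, max_iters=100):
    """One K-means run from a random start; returns (J, c, centroids)."""
    centroids = [list(x) for x in rng.sample(X, K)]  # K random training points
    for _ in range(max_iters):
        c = assign(X, centroids)
        new_centroids = []
        for k in range(K):
            members = [x for x, ci in zip(X, c) if ci == k]
            # Assumption: an empty cluster keeps its old centroid.
            new_centroids.append(
                [sum(col) / len(members) for col in zip(*members)]
                if members else centroids[k])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    c = assign(X, centroids)
    J = sum(sq_dist(x, centroids[ci]) for x, ci in zip(X, c)) / len(X)
    return J, c, centroids

def kmeans_best_of(X, K, n_init=50, seed=0):
    """Run K-means n_init times and keep the run with the lowest distortion."""
    rng = random.Random(seed)
    runs = [kmeans_once(X, K, rng) for _ in range(n_init)]
    return min(runs, key=lambda run: run[0])

# Toy data: three tight pairs of points.
X = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0], [5.0, 9.0], [5.0, 10.0]]
best_J, best_c, best_centroids = kmeans_best_of(X, K=3)
print(best_J, best_c)
```

A single run can get stuck in a local optimum when two initial centroids land in the same pair; taking the minimum over many runs makes that outcome unlikely to be the one that is kept.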


Choosing the number of clusters

How to choose the number of clusters, that is, the value of the parameter K.

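Elbow method: plot the distortion J against the number of clusters K. The curve goes down rapidly at first and then goes down slowly after that; the K at the bend (the elbow of the curve) is a reasonable choice.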
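
The elbow is usually read off the plot by eye. As a rough sketch of the same idea in code, the function below picks the K where the improvement in J drops the most. This "largest bend" rule, the name `elbow_k`, and the J values are assumptions made for illustration; the numbers are made up and are not results from real data.

```python
def elbow_k(distortions):
    """Pick K at the sharpest bend of the J-versus-K curve.

    distortions[i] is the distortion J obtained with K = i + 1.
    Heuristic (an assumption of this sketch): the elbow is the K where
    the drop in J before it exceeds the drop after it by the most.
    """
    if len(distortions) < 3:
        raise ValueError("need J for at least three values of K")
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(distortions) - 1):
        drop_before = distortions[i - 1] - distortions[i]
        drop_after = distortions[i] - distortions[i + 1]
        bend = drop_before - drop_after
        if bend > best_bend:
            best_k, best_bend = i + 1, bend
    return best_k

# Made-up J values for K = 1..6, for illustration only.
J_values = [100.0, 50.0, 10.0, 8.0, 7.0, 6.5]
k = elbow_k(J_values)   # J falls quickly up to K = 3, then flattens, so k == 3
```

When the curve has no clear bend, a rule like this still returns some K, so it is worth looking at the plot before trusting the answer.
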
Sometimes you are running K-means to get clusters to use for some later (downstream) purpose. In that case, evaluate K-means based on a metric for how well it performs for that later purpose.
