GMM & K-means: Gaussian Mixture Models and K-means Clustering Explained

This article gives a detailed introduction to the Gaussian mixture model (GMM) and the K-means clustering algorithm. The GMM is interpreted from two angles, geometry and mixture models, and is fitted via maximum likelihood estimation (MLE) and the EM algorithm. K-means is a common clustering algorithm that iteratively updates centroids, assigning each data point to its nearest cluster. As the more general model, the GMM subsumes K-means and allows probabilistic (soft) assignments.

Gaussian mixture model (GMM)

A Gaussian mixture model is a probabilistic model that assumes all the data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters.

Interpretation from geometry

$p(x)$ is a weighted sum of multiple Gaussian distributions.

$$p(x)=\sum_{k=1}^{K} \alpha_{k} \cdot \mathcal{N}\left(x | \mu_{k}, \Sigma_{k}\right)$$

where the weights satisfy $\alpha_k \geq 0$ and $\sum_{k=1}^{K} \alpha_k = 1$.
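To make the geometric picture concrete, here is a minimal sketch (not from the original post) that evaluates a one-dimensional, two-component mixture density as a weighted sum of Gaussian pdfs; the weights, means, and standard deviations are made-up illustration values.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Univariate normal density N(x | mu, sigma^2)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Made-up illustration values for K = 2 components: alpha_k, mu_k, sigma_k
alphas, mus, sigmas = [0.3, 0.7], [-2.0, 1.0], [0.5, 1.5]

x = np.linspace(-5.0, 5.0, 7)
# p(x) = sum_k alpha_k * N(x | mu_k, sigma_k^2)
p = sum(a * gaussian_pdf(x, m, s) for a, m, s in zip(alphas, mus, sigmas))
print(p)  # the mixture density evaluated at a few points
```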

Interpretation from a mixture model

Setup:

  • $K$, the total number of Gaussian distributions.

  • $x$, a sample (an observed variable).

  • $z$, the component from which the sample $x$ is drawn (a latent variable), where

    • $z \in \{c_1, c_2, \ldots, c_K\}$;

    • $\sum_{k=1}^K p(z=c_k) = 1$. We denote $p(z=c_k)$ by $p_k$.

Mixture models are usually generative models, which means new data can be drawn from the model's distribution. Specifically, in the Gaussian mixture model (GMM), a new data point is generated by first selecting a class $c_k$ according to the prior distribution over classes, and then drawing a value from the Gaussian distribution of that class. Therefore, we can write $p(x)$ as follows:

$$\begin{aligned} p(x) &= \sum_z p(x,z) \\ &= \sum_{k=1}^{K} p(x, z=c_k) \\ &= \sum_{k=1}^{K} p(z=c_k) \cdot p(x|z=c_k) \\ &= \sum_{k=1}^{K} p_k \cdot \mathcal{N}(x | \mu_{k}, \Sigma_{k}) \end{aligned}$$

We see that the two interpretations lead to the same result (with $\alpha_k = p_k$). A code sketch of this generative procedure follows.
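A minimal sketch of the two-step generative process, assuming NumPy; `sample_gmm` and the parameter values are made up for illustration:

```python
import numpy as np

def sample_gmm(n, p, means, covs, seed=0):
    """Draw n samples from a GMM: z ~ Categorical(p), then x ~ N(mu_z, Sigma_z)."""
    rng = np.random.default_rng(seed)
    # Step 1: select a class c_k with probability p_k.
    z = rng.choice(len(p), size=n, p=p)
    # Step 2: draw x from the Gaussian of the selected class.
    x = np.array([rng.multivariate_normal(means[k], covs[k]) for k in z])
    return x, z

# Made-up parameters for K = 2 components in 2 dimensions
p = [0.4, 0.6]
means = [np.zeros(2), np.array([3.0, 3.0])]
covs = [np.eye(2), 0.5 * np.eye(2)]
X, z = sample_gmm(500, p, means, covs)
```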

GMM Derivation

Setup:

  • $X$: observed data, where $X = (x_1, x_2, \ldots, x_N)$.

  • $\theta$: the parameters of the model, where $\theta=\left\{p_{1}, p_{2}, \cdots, p_{K}, \mu_{1}, \mu_{2}, \cdots, \mu_{K}, \Sigma_{1}, \Sigma_{2}, \cdots, \Sigma_{K}\right\}$.

  • $p(x) = \sum_{k=1}^{K} p_k \cdot \mathcal{N}(x | \mu_{k}, \Sigma_{k})$.

  • $p(x,z) = p(z) \cdot p(x|z) = p_z \cdot \mathcal{N}(x | \mu_{z}, \Sigma_{z})$.

  • $p(z|x) = \frac{p(x,z)}{p(x)} = \frac{p_z \cdot \mathcal{N}(x | \mu_{z}, \Sigma_{z})}{\sum_{k=1}^K p_k \cdot \mathcal{N}(x | \mu_{k}, \Sigma_{k})}$, where the sum in the denominator runs over the $K$ components (a code sketch of this posterior follows the list).
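A minimal sketch of the posterior $p(z=c_k | x)$ (the "responsibility" of component $k$ for a point), assuming SciPy's `multivariate_normal`; the function name `responsibilities` is made up for illustration:

```python
import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, p, means, covs):
    """Return the (N, K) matrix of p(z = c_k | x_i)."""
    # Numerator: p_k * N(x_i | mu_k, Sigma_k) for every (i, k) pair
    joint = np.stack([pk * multivariate_normal.pdf(X, m, c)
                      for pk, m, c in zip(p, means, covs)], axis=1)
    # Denominator: sum over the K components, i.e. p(x_i)
    return joint / joint.sum(axis=1, keepdims=True)
```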

Solving by MLE

Assuming the samples in $X$ are i.i.d., so that $\log p(X) = \sum_{i=1}^{N} \log p(x_i)$:

$$\begin{aligned} \hat{\theta}_{MLE} &= \underset{\theta}{\operatorname{argmax}}\ p(X) \\ &= \underset{\theta}{\operatorname{argmax}} \log p(X) \\ &= \underset{\theta}{\operatorname{argmax}} \sum_{i=1}^{N} \log p\left(x_{i}\right) \\ &= \underset{\theta}{\operatorname{argmax}} \sum_{i=1}^{N} \log \left[\sum_{k=1}^{K} p_{k} \cdot \mathcal{N}\left(x_{i} | \mu_{k}, \Sigma_{k}\right)\right] \end{aligned}$$

Because a sum over components sits inside the logarithm, setting the derivatives to zero yields no closed-form solution for $\theta$; this is why the GMM is instead fitted with the EM algorithm.
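A minimal sketch of the MLE objective, evaluating $\log p(X)$ for given parameters (reusing SciPy as above; `gmm_log_likelihood` is a made-up name):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_log_likelihood(X, p, means, covs):
    """log p(X) = sum_i log [ sum_k p_k * N(x_i | mu_k, Sigma_k) ]."""
    dens = np.stack([pk * multivariate_normal.pdf(X, m, c)
                     for pk, m, c in zip(p, means, covs)], axis=1)  # shape (N, K)
    return np.log(dens.sum(axis=1)).sum()
```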
