Semi-Supervised Generative Learning

This article introduces semi-supervised learning, including the cluster assumption and the manifold assumption, and contrasts generative and discriminative models. It then derives the parameter estimation of the Gaussian mixture model (GMM), iteratively optimized via the EM algorithm, which makes it well suited to exploiting large amounts of unlabeled data.


Semi-Supervised Learning

Semi-supervised learning addresses the setting where labeled samples are scarce by seeking ways to make full use of unlabeled samples. It generally rests on two basic assumptions:

  • Cluster assumption: samples in the same cluster are more likely to share the same label;
  • Manifold assumption: samples lying in a small local region are more similar and thus more likely to share the same label;

Generative and Discriminative Model

Discriminative learning models the conditional probability
$$y^* = \arg\max_y p(y\mid\pmb x)$$
Generative learning models the joint probability
$$y^* = \arg\max_y p(y\mid\pmb x) = \arg\max_y\frac{p(\pmb x\mid y)\,p(y)}{p(\pmb x)} = \arg\max_y p(\pmb x\mid y)\,p(y)$$
Generative learning assumes the samples are drawn from some underlying distribution (which gives the model strong generalization ability), but it requires sufficient and reliable domain knowledge.
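The generative prediction rule above can be sketched as follows. This is a minimal illustration, not code from the article: a hypothetical two-class problem with 1-D Gaussian class-conditional densities $p(x\mid y)$ and class priors $p(y)$, all parameters assumed known.

```python
# Minimal sketch (hypothetical parameters): generative classification by Bayes' rule.
import numpy as np

def gaussian_pdf(x, mean, var):
    # 1-D Gaussian density N(x; mean, var)
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Assumed class priors p(y) and per-class (mean, variance) of p(x|y)
priors = np.array([0.6, 0.4])
means = np.array([0.0, 3.0])
vars_ = np.array([1.0, 1.0])

def predict(x):
    # y* = argmax_y p(x|y) p(y); dividing by p(x) does not change the argmax
    scores = np.array([gaussian_pdf(x, means[k], vars_[k]) * priors[k]
                       for k in range(2)])
    return int(np.argmax(scores))

print(predict(0.2), predict(2.8))  # → 0 1 (each sample near its class mean)
```

Note that $p(\pmb x)$ is dropped from the argmax because it is the same for every candidate $y$.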


Likelihood Function of Gaussian Mixture Model

The probability density function of a Gaussian mixture model is
$$p(\pmb x\mid\Theta) =\sum_k p(\pmb x\mid\theta_k)\,p(\theta_k)=\sum_k\alpha_k\,p(\pmb x\mid\theta_k)$$
Predicting the label of $\pmb x$ by maximum a posteriori probability, with $\mathcal Y=\{1, 2, \cdots, K\}$, gives
$$f(\pmb x)=\arg\max_{y\in\mathcal Y}p(y\mid\pmb x)=\arg\max_{y\in\mathcal Y}\sum\nolimits_k p(y, \theta_k\mid\pmb x)=\arg\max_{y\in\mathcal Y}\sum\nolimits_k p(y\mid\theta_k,\pmb x)\,p(\theta_k\mid\pmb x)$$

where

  • $p(y\mid\theta_k,\pmb x)$ is the probability that $\pmb x$, generated by the $k$-th component, has label $y$; it equals 1 if and only if $y=k$;
  • $p(\theta_k\mid\pmb x)$ is the posterior probability that $\pmb x$ was generated by the $k$-th component; a large amount of unlabeled data can improve the accuracy of this estimate;
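
Under the one-to-one cluster/class correspondence, the prediction collapses to the component posterior: $f(\pmb x)=\arg\max_k p(\theta_k\mid\pmb x)$. A minimal sketch with assumed 1-D parameters:

```python
# Minimal sketch (hypothetical parameters): predict the label as the component
# with the largest posterior p(theta_k | x).
import numpy as np

alphas = np.array([0.5, 0.5])   # mixing weights alpha_k (assumed)
means = np.array([-2.0, 2.0])   # component means (assumed)
var = 1.0

def component_posterior(x):
    # p(theta_k | x) = alpha_k N(x; mu_k, var) / sum_j alpha_j N(x; mu_j, var)
    lik = np.exp(-(x - means) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    joint = alphas * lik
    return joint / joint.sum()

def f(x):
    return int(np.argmax(component_posterior(x)))

print(f(-1.5), f(1.5))  # → 0 1
```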

If the clusters correspond one-to-one with the true classes, a labeled sample $\pmb x\in D_l$ belongs only to its specific cluster, so
$$p_{D_l}(\pmb x, y=i\mid\Theta)=\alpha_i\,p(\pmb x\mid\theta_i)=\sum_k\alpha_k\,p(\pmb x\mid\theta_k)\,p(y=i\mid\theta_k,\pmb x)$$
where $p(y=i\mid\theta_k, \pmb x)$ equals 1 only when $k=i$, and 0 otherwise. An unlabeled sample $\pmb x\in D_u$ may belong to any cluster, so
$$p_{D_u}(\pmb x\mid\Theta)=\sum_k\alpha_k\,p(\pmb x\mid\theta_k)$$
The log-likelihood function is
$$\begin{aligned} L(\Theta\mid D_l\cup D_u) &=L(\Theta\mid D_l) + L(\Theta\mid D_u)\\[1ex] &=\sum_{(\pmb x, y=i)\in D_l}\ln p(\pmb x, y=i\mid\Theta) + \sum_{\pmb x\in D_u}\ln p(\pmb x\mid\Theta)\\[1ex] &=\sum_{(\pmb x, y=i)\in D_l}\ln \sum_k\alpha_k\,p(\pmb x\mid\theta_k)\,p(y=i\mid\theta_k,\pmb x) + \sum_{\pmb x\in D_u}\ln \sum_k\alpha_k\,p(\pmb x\mid\theta_k) \end{aligned}$$
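The combined log-likelihood above can be evaluated directly: the labeled term reduces to $\ln \alpha_i\,p(\pmb x\mid\theta_i)$ for each $(\pmb x, y=i)$, and the unlabeled term is the log of the full mixture density. A sketch with hypothetical 1-D data and parameters:

```python
# Minimal sketch (hypothetical data/parameters): L(Theta | D_l ∪ D_u) for a 1-D GMM.
import numpy as np

def gaussian_pdf(x, mean, var):
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def log_likelihood(X_l, y_l, X_u, alphas, means, vars_):
    # Labeled term: sum_{(x, y=i) in D_l} ln alpha_i p(x | theta_i)
    ll = sum(np.log(alphas[i] * gaussian_pdf(x, means[i], vars_[i]))
             for x, i in zip(X_l, y_l))
    # Unlabeled term: sum_{x in D_u} ln sum_k alpha_k p(x | theta_k)
    for x in X_u:
        ll += np.log(sum(alphas[k] * gaussian_pdf(x, means[k], vars_[k])
                         for k in range(len(alphas))))
    return ll

alphas = np.array([0.5, 0.5])
means = np.array([-2.0, 2.0])
vars_ = np.array([1.0, 1.0])
X_l, y_l = [-2.1, 1.9], [0, 1]   # labeled samples with class indices
X_u = [-1.8, 0.1, 2.2]           # unlabeled samples
print(round(log_likelihood(X_l, y_l, X_u, alphas, means, vars_), 3))
```

Maximizing this objective over $\Theta$ is what the EM iteration in the next section carries out.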


Parameter Estimation

The parameters of the GMM are estimated with the EM algorithm, i.e.
$$\Theta = \arg\max_{\Theta} L(\Theta) = \arg\max_{\Theta}Q(\Theta, \Theta_t) =\arg\max_{\Theta}\sum_j\sum_k p(z_k\mid\pmb x_j,\Theta_t)\ln p(\pmb x_j\mid z_k,\Theta)\,p(z_k\mid\Theta)$$
The expectation of the latent variable, i.e. the probability that sample $\pmb x_j$ comes from the $k$-th component, gives the E-step:
$$\lambda_{jk}= p(z_k\mid\pmb x_j,\Theta_t) =\frac{\alpha_k\,p(\pmb x_j\mid\theta_k)}{\sum_k\alpha_k\,p(\pmb x_j\mid\theta_k)}$$
Let $N_k$ denote the number of labeled samples in class $k$; the M-step is
$$\begin{aligned} \pmb\mu_k &=\frac{\sum_{\pmb x_j\in D_u}\lambda_{jk}\pmb x_j+\sum_{(\pmb x_j, y_j)\in D_l\,\wedge\, y_j=k}\pmb x_j}{N_k + \sum_{\pmb x_j \in D_u}\lambda_{jk}}\\ \pmb\Sigma_k &=\frac{\sum_{\pmb x_j\in D_u}\lambda_{jk}(\pmb x_j-\pmb\mu_k)(\pmb x_j-\pmb\mu_k)^T+\sum_{(\pmb x_j, y_j)\in D_l\,\wedge\, y_j=k}(\pmb x_j-\pmb\mu_k)(\pmb x_j-\pmb\mu_k)^T}{N_k + \sum_{\pmb x_j\in D_u}\lambda_{jk}}\\ \alpha_k &= \frac{1}{N}\left(N_k + \sum_{\pmb x_j \in D_u}\lambda_{jk}\right) \end{aligned}$$
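The E-step and M-step above can be sketched end to end. This is a minimal illustration, not the article's code: 1-D data with two components, where labeled samples contribute to their own class with weight 1 and unlabeled samples contribute through their soft responsibilities $\lambda_{jk}$; the synthetic data generator and initialization are assumptions.

```python
# Minimal sketch (assumed 1-D data, two components): semi-supervised EM for a GMM.
import numpy as np

def gaussian_pdf(x, mean, var):
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def semi_supervised_em(X_l, y_l, X_u, K=2, n_iter=50):
    X_l, y_l, X_u = map(np.asarray, (X_l, y_l, X_u))
    N = len(X_l) + len(X_u)
    N_k = np.array([(y_l == k).sum() for k in range(K)])  # labeled counts per class
    means = np.array([X_l[y_l == k].mean() for k in range(K)])  # init from labels
    vars_ = np.ones(K)
    alphas = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: lambda_jk = alpha_k p(x_j|theta_k) / sum_k alpha_k p(x_j|theta_k)
        lik = np.stack([gaussian_pdf(X_u, means[k], vars_[k]) for k in range(K)], axis=1)
        lam = alphas * lik
        lam /= lam.sum(axis=1, keepdims=True)
        # M-step: labeled samples of class k enter with weight 1, unlabeled with lam
        for k in range(K):
            w = N_k[k] + lam[:, k].sum()
            means[k] = (lam[:, k] @ X_u + X_l[y_l == k].sum()) / w
            vars_[k] = (lam[:, k] @ (X_u - means[k]) ** 2
                        + ((X_l[y_l == k] - means[k]) ** 2).sum()) / w
            alphas[k] = w / N
    return alphas, means, vars_

rng = np.random.default_rng(0)
X_l, y_l = [-2.0, 2.0], [0, 1]                           # two labeled samples
X_u = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
alphas, means, vars_ = semi_supervised_em(X_l, y_l, X_u)
print(np.round(means, 1))  # means recovered near the true values -2 and 2
```

Note the denominator $N_k + \sum_{\pmb x_j\in D_u}\lambda_{jk}$ appears in all three updates: it is the effective number of samples assigned to component $k$, counting labeled samples as hard assignments.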
