【ML】_06_EM (Latent Variables)

Latest recommended article published on 2024-02-29 00:32:42

DamonDT

Category: ML

Article link: https://blog.csdn.net/qq_34330456/article/details/104653142

【1】 Latent Variable Model

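An example: suppose the observed values for a person are [charity work, exercise, strong execution], while the corresponding unobserved values are [kind, persistent, knowledgeable]. This is a causal relationship: [kind, persistent, knowledgeable] ⇒ [charity work, exercise, strong execution]. The "cause" is the latent variable and the "effect" is the observed value.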

Complete Case ($X$, $Z$ known; $\theta$ unknown) -- MLE

$\bm {\ell( \theta ; D )} = \log P ( X , Z \,|\, \theta ) = \log P ( Z \,|\, \theta _ { z } ) + \log P ( X \,|\, Z , \theta _ { x } )$

Incomplete Case ($X$ known; $Z$, $\theta$ unknown) -- EM

$\bm {\ell( \theta ; D )} = \log \sum _ { Z } P ( X , Z \,|\, \theta ) = \log \sum _ { Z } P ( Z \,|\, \theta _ { z } ) P ( X \,|\, Z , \theta _ { x } )$
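To make the two cases concrete, here is a minimal Python sketch. It assumes a hypothetical two-coin model that is not part of the original article: $Z \in \{0, 1\}$ picks a coin with probability `pi[z]` (playing the role of $\theta_z$), and $X$ is the number of heads in `m` flips of that coin with head probability `ps[z]` (playing the role of $\theta_x$). All function names and data values are illustrative.

```python
import math

def log_binom_pmf(k, m, p):
    """log P(k heads in m flips | head probability p)."""
    return (math.log(math.comb(m, k))
            + k * math.log(p) + (m - k) * math.log(1.0 - p))

def complete_loglik(xs, zs, pi, ps, m):
    """Complete case: log P(X, Z | theta) = log P(Z | theta_z) + log P(X | Z, theta_x)."""
    total = 0.0
    for x, z in zip(xs, zs):
        total += math.log(pi[z])             # log P(z | theta_z)
        total += log_binom_pmf(x, m, ps[z])  # log P(x | z, theta_x)
    return total

def incomplete_loglik(xs, pi, ps, m):
    """Incomplete case: log sum_Z P(X, Z | theta), summing out each z_i."""
    total = 0.0
    for x in xs:
        marginal = sum(pi[z] * math.exp(log_binom_pmf(x, m, ps[z]))
                       for z in range(len(pi)))
        total += math.log(marginal)
    return total

# Made-up data: heads out of m = 10 flips, plus the coin labels
# that would be hidden in the incomplete case.
xs = [9, 8, 2, 7, 1]
zs = [0, 0, 1, 0, 1]
pi = [0.5, 0.5]   # assumed theta_z
ps = [0.8, 0.3]   # assumed theta_x
m = 10

ll_complete = complete_loglik(xs, zs, pi, ps, m)
ll_incomplete = incomplete_loglik(xs, pi, ps, m)
print(ll_complete, ll_incomplete)
```

In the complete case the logarithm splits into a sum of simple terms, while in the incomplete case the sum over $Z$ sits inside the logarithm. That coupling is what makes direct maximization awkward and motivates EM.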

【2】 Expectation Maximization (EM Algorithm)

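[Unsupervised] An iterative algorithm for maximum likelihood estimation of the parameters of a probabilistic model that contains latent variables (E: compute an expectation + M: maximize it).

Observed data & unobserved data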

$\bm Y = ( Y _ { 1 } , Y _ { 2 } \ldots Y _ { n } ) ^ { T } \;\;\;\;\;\;\; \bm Z = ( Z _ { 1 } , Z _ { 2 } \ldots Z _ { n } ) ^ { T }$

Likelihood function of the observed data (MLE)

$\bm {L ( \theta )} = \bm {P ( Y | \theta ) }= \sum _ { Z } P ( Z | \theta ) P ( Y | Z , \theta )$

Maximum (log-)likelihood estimate of the model parameter $\theta$

$\bm {\hat { \theta }} = \arg\max_{\theta} \, \log P(Y \,|\, \theta)$
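As a rough illustration of what this $\arg\max$ means, the sketch below evaluates $\log P(Y \,|\, \theta)$ on a grid and keeps the best value. It assumes a hypothetical two-coin model in which only the mixing weight `pi` is unknown and the head probabilities `p0`, `p1` are given; the model, the data, and the names are assumptions made for illustration only. Brute-force search like this only works for one or two parameters, which is one reason EM is used instead.

```python
import math

def observed_loglik(ys, pi, p0, p1, m):
    """log P(Y | theta) = sum_i log sum_z P(z | theta) P(y_i | z, theta)."""
    total = 0.0
    for y in ys:
        like0 = math.comb(m, y) * p0**y * (1 - p0)**(m - y)  # P(y | z=0)
        like1 = math.comb(m, y) * p1**y * (1 - p1)**(m - y)  # P(y | z=1)
        total += math.log(pi * like0 + (1 - pi) * like1)
    return total

ys = [9, 8, 2, 7, 1, 8, 3, 9]   # made-up heads counts out of m flips
m = 10
p0, p1 = 0.8, 0.3               # assumed known for this sketch

# Grid search over the single unknown parameter pi = P(z = 0).
grid = [i / 100 for i in range(1, 100)]
pi_hat = max(grid, key=lambda pi: observed_loglik(ys, pi, p0, p1, m))
print(pi_hat)
```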

【3】 Deriving the EM Algorithm by Hand (must master)

Input: observed variable data $\bm Y$, latent variable data $\bm Z$, joint distribution $\bm {P(Y,Z∣θ)}$, conditional distribution $\bm {P(Z∣Y,θ)}$

Output: model parameter $\bm θ$

Derivation (throughout this section $\bm {L ( \theta )}$ denotes the log-likelihood $\log P(Y \,|\, \theta)$). Ideally each iteration would pick $\bm \red {\theta ^ { n + 1 }} = \arg\max_{\theta} \, [\, \bm {L ( \theta )} - \bm {L ( \theta ^ { n } )} \,]$. This difference is hard to maximize directly, so it is bounded from below:

$\bm {L ( \theta )} - \bm {L ( \theta ^ { n } )} = \log P(Y \,|\, \theta) - \log P(Y \,|\, {\theta} ^ n )$
$= \log \sum _ { Z } P(Y \,|\, Z,\theta) P(Z \,|\, \theta) - \log P(Y \,|\, {\theta} ^ n )$
$= \log \sum _ { Z } P(Y \,|\, Z, \theta) P(Z \,|\, \theta) \cdot \bm {\frac { P ( Z \,|\, Y , \theta ^ { n } ) } { P ( Z \,|\, Y , \theta ^ { n } ) }} - \log P(Y \,|\, {\theta} ^ n )$
$= \log \sum _ { Z } \bm {P ( Z \,|\, Y , \theta ^ { n } )} \cdot {\frac { P(Y \,|\, Z, \theta) P(Z \,|\, \theta) } { \bm {P ( Z \,|\, Y , \theta ^ { n } ) }}} - \log P(Y \,|\, {\theta} ^ n )$
$\bm \red {\geq } \, \red {\sum _ { Z } \bm {P ( Z \,|\, Y , \theta ^ { n } )} \cdot \bm {\log} \, {\frac { P(Y \,|\, Z, \theta) P(Z \,|\, \theta) } { \bm {P ( Z \,|\, Y , \theta ^ { n } ) }}} - \log P(Y \,|\, {\theta} ^ n )}$
$\triangleq \bm \red { \triangle ( \theta \,|\, \theta ^ { n } ) } \tag{1}$

The inequality is Jensen's inequality: $\log$ is concave and $\sum _ { Z } P ( Z \,|\, Y , \theta ^ { n } ) = 1$.

$\bm \red {\therefore} \;\; \bm {L ( \theta )} - \bm {L ( \theta ^ { n } ) } \geq \bm \red { \triangle ( \theta \,|\, \theta ^ { n } ) } \;\;\;\Rightarrow\;\; \text{[maximize the lower bound]} \;\; \bm {L ( \theta )} \geq \bm {L ( \theta ^ { n } ) } +\bm \red { \triangle ( \theta \,|\, \theta ^ { n } ) } \tag{2}$
$\bm \red {\therefore} \;\; \bm \red {\theta ^ { n + 1 }} = \arg\max_{\theta} \; [ \, \bm {L ( \theta ^ { n } ) } +\bm \red { \triangle ( \theta \,|\, \theta ^ { n } ) } \, ]$
$= \arg\max_{\theta} \; [ \, \bm {L ( \theta ^ { n } ) } + \red {\sum _ { Z } \bm {P ( Z \,|\, Y , \theta ^ { n } )} \cdot \bm {\log} \, {\frac { P(Y \,|\, Z, \theta) P(Z \,|\, \theta) } { \bm {P ( Z \,|\, Y , \theta ^ { n } ) }}} - \log P(Y \,|\, {\theta} ^ n )} \, ]$
$= \arg\max_{\theta} \; [ \, \bm {L ( \theta ^ { n } ) } + \red {\sum _ { Z } \bm {P ( Z \,|\, Y , \theta ^ { n } )} \cdot \bm {\log} \, {\frac { P(Y \,|\, Z, \theta) P(Z \,|\, \theta) } { \bm {P ( Z \,|\, Y , \theta ^ { n } )} P(Y \,|\, {\theta} ^ n ) }}} \, ]$
$= \arg\max_{\theta} \; [ \, \red {\sum _ { Z } \bm {P ( Z \,|\, Y , \theta ^ { n } )} \cdot \bm {\log} \, { { P(Y \,|\, Z, \theta) P(Z \,|\, \theta) } }} \, ]$
$= \arg\max_{\theta} \; [ \, \red {\sum _ { Z } \bm {P ( Z \,|\, Y , \theta ^ { n } )} \cdot \bm {\log} \, { { P(Y, Z \,|\, \theta) }}} \, ]$
$= \arg\max_{\theta} \; [ \, \bm \red {E _ { Z | Y , \, \theta ^ { n } } \bm [\,{ \log } \, { { P(Y, Z \,|\, \theta) }} \,]} \, ] \tag{3}$

The last steps first use $\sum _ { Z } P ( Z \,|\, Y , \theta ^ { n } ) = 1$ to fold $\log P(Y \,|\, \theta ^ { n })$ into the fraction, then drop every term that does not depend on $\theta$, since such terms do not change the $\arg\max$.
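The lower bound in (1) and (2) can be checked numerically. The sketch below assumes a hypothetical two-coin model with a single observation `y` (heads out of `m` flips) and `theta = (pi, p0, p1)`, where `pi` is the probability of picking coin 0; none of these names or numbers come from the article. It evaluates $L(\theta) - L(\theta^{n}) - \triangle(\theta \,|\, \theta^{n})$ for a few candidate $\theta$, which should never be negative and should be zero at $\theta = \theta^{n}$.

```python
import math

def joint(y, z, theta, m=10):
    """P(y, z | theta) = P(z | theta) * P(y | z, theta) for two coins."""
    pi, p0, p1 = theta
    prior = pi if z == 0 else 1.0 - pi
    p = p0 if z == 0 else p1
    return prior * math.comb(m, y) * p**y * (1.0 - p)**(m - y)

def loglik(y, theta):
    """L(theta) = log P(y | theta) = log sum_z P(y, z | theta)."""
    return math.log(joint(y, 0, theta) + joint(y, 1, theta))

def posterior(y, theta):
    """[P(z=0 | y, theta), P(z=1 | y, theta)]."""
    js = [joint(y, 0, theta), joint(y, 1, theta)]
    total = sum(js)
    return [j / total for j in js]

def delta(y, theta, theta_n):
    """Lower bound from (1): sum_z q(z) log(P(y, z | theta) / q(z)) - log P(y | theta_n)."""
    q = posterior(y, theta_n)
    bound = sum(q[z] * math.log(joint(y, z, theta) / q[z]) for z in (0, 1))
    return bound - loglik(y, theta_n)

y = 7
theta_n = (0.5, 0.6, 0.4)   # current estimate (made up)
candidates = [(0.5, 0.7, 0.3), (0.3, 0.9, 0.5), (0.8, 0.65, 0.2), theta_n]

# gap = [L(theta) - L(theta_n)] - Delta(theta | theta_n), expected >= 0
gaps = [loglik(y, th) - loglik(y, theta_n) - delta(y, th, theta_n)
        for th in candidates]
print(gaps)
```

The gap vanishes at $\theta = \theta^{n}$ because the bound touches $L$ there, so any $\theta$ that raises the bound also raises $L$.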

$\bm \red {E - Step}$: (first deal with $\bm Z$: take the expectation over $\bm Z$ under $P ( Z \,|\, Y , \theta ^ { n } )$)

$\arg\max_{\theta} \; [ \, \bm \red {E _ { Z | Y , \, \theta ^ { n } }} \bm {[\,{ \log } \, { { P(Y, Z \,|\, \theta) }} \,]} \, ] \tag{E}$

$\bm \red {M - Step}$: (then solve for $\bm \theta$: maximize that expectation over $\theta$)

$\arg\max_{\theta} \; [ \, \bm {E _ { Z | Y , \, \theta ^ { n } }} \bm \red { [\,{ \log } \, { { P(Y, Z \,|\, \theta) }} \,]} \, ] \tag{M}$
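The two steps above can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the article's own implementation: it uses a hypothetical two-coin model in which each observation is the number of heads in `m` flips of an unknown coin, with `theta = (pi, p0, p1)`. For this model the M-step has a closed form (responsibility-weighted averages). The function name `em_two_coins` and the data are made up for illustration.

```python
import math

def em_two_coins(ys, m, theta0, n_iter=50):
    """EM for a mixture of two binomials; returns (theta, log-likelihood history)."""
    pi, p0, p1 = theta0
    n = len(ys)
    history = []
    for _ in range(n_iter):
        # E-step: responsibilities r_i = P(z_i = 0 | y_i, theta^n).
        resp = []
        loglik = 0.0
        for y in ys:
            a = pi * p0**y * (1 - p0)**(m - y)        # coin 0, binomial coefficient omitted
            b = (1 - pi) * p1**y * (1 - p1)**(m - y)  # coin 1, binomial coefficient omitted
            resp.append(a / (a + b))
            loglik += math.log(math.comb(m, y) * (a + b))
        history.append(loglik)  # log P(Y | theta^n) before this update

        # M-step: maximize E_{Z|Y,theta^n}[log P(Y, Z | theta)] in closed form.
        r_sum = sum(resp)
        pi = r_sum / n
        p0 = sum(r * y for r, y in zip(resp, ys)) / (m * r_sum)
        p1 = sum((1 - r) * y for r, y in zip(resp, ys)) / (m * (n - r_sum))
    return (pi, p0, p1), history

ys = [9, 8, 2, 7, 1, 8, 3, 9]   # made-up heads counts
theta_hat, history = em_two_coins(ys, m=10, theta0=(0.5, 0.6, 0.4))
print(theta_hat)
```

The recorded log-likelihood should never decrease from one iteration to the next, which is the guarantee that inequality (2) provides. EM only finds a local optimum, so the result depends on the starting point `theta0`.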