A Note on Kaldi's PLDA Implementation

Kaldi’s PLDA implementation is based on [1], i.e. the two-covariance PLDA model introduced in [2]. The authors derive a clean update formula for EM training and provide a detailed comment in the source code. Here we add some explanation to make the derivation easier to follow.

A pdf version of this note can be found here

1. Background

Recall that PLDA assumes a two-stage generative process:
1) generate the class center according to

$$y \sim \mathcal{N}(\mu, \Phi_b)$$

2) then generate the observed data by:

$$x \sim \mathcal{N}(y, \Phi_w)$$
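To make the generative story concrete, here is a minimal NumPy sketch that samples data from this two-stage model; the dimensions, class counts, and covariance values are illustrative assumptions, not values taken from Kaldi.

```python
import numpy as np

rng = np.random.default_rng(0)

dim, num_classes, samples_per_class = 2, 500, 10  # illustrative sizes
mu = np.zeros(dim)                                # global mean
Phi_b = np.diag([4.0, 1.0])                       # between-class covariance
Phi_w = np.diag([1.0, 0.5])                       # within-class covariance

# Stage 1: draw a center y for each class, y ~ N(mu, Phi_b).
centers = rng.multivariate_normal(mu, Phi_b, size=num_classes)

# Stage 2: draw observations around each center, x ~ N(y, Phi_w).
data = np.stack([
    rng.multivariate_normal(y, Phi_w, size=samples_per_class)
    for y in centers
])  # shape: (num_classes, samples_per_class, dim)
```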

Here, $\mu$ is estimated by the global mean of the data:

$$\mu = \frac{1}{N} \sum_{k=1}^{K} \sum_{i=1}^{n_k} z_{ki}$$

where $z_{ki}$ denotes the $i$-th sample of the $k$-th class and $N = \sum_{k=1}^{K} n_k$ is the total number of samples.

So let’s turn to the estimation of $\Phi_b$ and $\Phi_w$.

Note that, as $\mu$ is fixed, we can remove it from all samples. Hereafter, we assume all samples have been pre-processed by subtracting $\mu$ from them.
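Continuing the hypothetical sketch above, the global-mean estimate and the centering step might look like this:

```python
# Estimate the global mean over all N samples, pooled across classes.
mu_hat = data.reshape(-1, dim).mean(axis=0)

# Subtract it from every sample; hereafter mu can be taken as zero.
data_centered = data - mu_hat
```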

The prior distribution of an arbitrary (centered) sample $z$ is:

$$p(z) = \mathcal{N}(0, \Phi_b + \Phi_w)$$

since $z$ is the sum of independent between-class and within-class terms, whose covariances add.
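As a quick sanity check on this marginal (again continuing the sketch, so the numbers are purely illustrative), the pooled covariance of the centered samples should approach $\Phi_b + \Phi_w$:

```python
# Pooled covariance of all centered samples: approximately Phi_b + Phi_w.
flat = data_centered.reshape(-1, dim)
print(np.cov(flat, rowvar=False))
print(Phi_b + Phi_w)
```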

Let’s suppose the mean of a particular class is $m$, and that this class has $n$ examples. Then

$$m = \frac{1}{n} \sum_{i=1}^{n} z_i \sim \mathcal{N}\!\left(0, \Phi_b + \frac{\Phi_w}{n}\right)$$

i.e. $m$ is Gaussian-distributed with zero mean and variance equal to the between-class variance plus $1/n$ times the within-class variance: averaging over $n$ samples leaves the shared class center untouched but shrinks the within-class noise by a factor of $n$. Note that $m$ is observed (it is the average of the observed samples of that class).
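The same kind of Monte Carlo check works for the class means (continuing the sketch above): their empirical covariance should approach $\Phi_b + \Phi_w / n$.

```python
# Per-class means of the centered data; rows are the observed m's.
class_means = data_centered.mean(axis=1)    # shape: (num_classes, dim)

# Their covariance should approach Phi_b + Phi_w / n.
n = samples_per_class
print(np.cov(class_means, rowvar=False))
print(Phi_b + Phi_w / n)
```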

2. EM

We’re doing an EM procedure where we treat $m$ as the sum of two variables:

$$m = x + y$$

where $x \sim \mathcal{N}(0, \Phi_b)$ and $y \sim \mathcal{N}(0, \Phi_w / n)$ (note that $x$ and $y$ here are new hidden variables, not the ones from Section 1).

The distribution of $x$ will contribute to the stats of $\Phi_b$, and the distribution of $y$ to the stats of $\Phi_w$.
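In the E step we then need the posterior of the hidden parts given the observed class mean $m$. For jointly Gaussian variables this is standard conditioning; the following is a minimal sketch of that computation under the assumptions of the earlier snippets, not Kaldi’s actual C++ code.

```python
# Posterior of x given m, with m = x + y, x ~ N(0, Phi_b), y ~ N(0, Phi_w/n).
m = class_means[0]                       # one observed class mean
S = Phi_b + Phi_w / n                    # covariance of m
K = Phi_b @ np.linalg.inv(S)             # gain mapping m to E[x | m]

x_post_mean = K @ m                      # E[x | m]
x_post_cov = Phi_b - K @ Phi_b           # Cov(x | m)

# The second moment E[x x^T | m] feeds the accumulated stats for Phi_b;
# the complementary quantities for y = m - x feed the stats for Phi_w.
x_second_moment = x_post_cov + np.outer(x_post_mean, x_post_mean)
```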
