The Double-Expectation Formula for Variance
So that I don't forget this formula, I am recording its derivation here. Let $X$ and $Y$ be two random variables (or random vectors). Then
$$E_{(X,Y)}[g(X,Y)] = E_Y E_{X|Y}[g(X,Y)] = E_Y E_X[g(X,Y) \mid Y],$$
where the roles of $X$ and $Y$ are symmetric. Then
$$\begin{aligned}
\operatorname{Var}_X(X) &= E_X(X-E_XX)^2 = E_{(X,Y)}(X-E_XX)^2\\
&= E_{(X,Y)}(X-E_X[X|Y]+E_X[X|Y]-E_XX)^2\\
&= E_{(X,Y)}(X-E_X[X|Y])^2\\
&\quad+ 2E_{(X,Y)}(X-E_X[X|Y])(E_X[X|Y]-E_XX)\\
&\quad+ E_{(X,Y)}(E_X[X|Y]-E_XX)^2.
\end{aligned}$$
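The second term equals $0$: given $Y$, the factor $E_X[X|Y]-E_XX$ is a constant and can be pulled out of the inner expectation, while $E_{X|Y}(X-E_X[X|Y])=0$, so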
$$\begin{aligned}
& E_{(X,Y)}(X-E_X[X|Y])(E_X[X|Y]-E_XX)\\
&\quad = E_YE_{X|Y}(X-E_X[X|Y])(E_X[X|Y]-E_XX)\\
&\quad = E_Y\{(E_X[X|Y]-E_XX)\,E_{X|Y}(X-E_X[X|Y])\}=0.
\end{aligned}$$
Moreover, since
$$\begin{aligned}
& E_{(X,Y)}(X-E_X[X|Y])^2 = E_YE_{X|Y}(X-E_X[X|Y])^2 = E_Y[\operatorname{Var}(X|Y)],\\
& E_{(X,Y)}(E_X[X|Y]-E_XX)^2 = E_YE_{X|Y}(E_X[X|Y]-E_XX)^2 = \operatorname{Var}_Y[E(X|Y)],
\end{aligned}$$
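we obtain $\operatorname{Var}_X(X) = E_Y[\operatorname{Var}(X|Y)] + \operatorname{Var}_Y[E(X|Y)]$, which is the double-expectation formula for variance (also known as the law of total variance). Because $\operatorname{Var}_Y[E(X|Y)] \ge 0$, the formula shows that the variance of $X$ is never smaller than the average of its conditional variance, $E_Y[\operatorname{Var}(X|Y)]$. This idea helps us better understand the Rao-Blackwell theorem.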
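As a sanity check on the decomposition, here is a minimal sketch that computes all three quantities exactly for a small discrete joint distribution. The joint pmf `joint` and the helper `total_variance_terms` are hypothetical and exist only for illustration; exact fractions are used so that the identity can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y); the numbers are illustrative only.
joint = {
    (0, 0): Fraction(1, 8),
    (1, 0): Fraction(3, 8),
    (0, 1): Fraction(1, 4),
    (2, 1): Fraction(1, 4),
}


def total_variance_terms(pmf):
    """Return (Var(X), E_Y[Var(X|Y)], Var_Y[E(X|Y)]) for a joint pmf {(x, y): p}."""
    # Marginal distribution of Y.
    p_y = {}
    for (_, y), p in pmf.items():
        p_y[y] = p_y.get(y, 0) + p

    # Unconditional mean and variance of X.
    mean_x = sum(p * x for (x, _), p in pmf.items())
    var_x = sum(p * (x - mean_x) ** 2 for (x, _), p in pmf.items())

    # Conditional mean and variance of X for each value of Y.
    cond_mean, cond_var = {}, {}
    for y0, py in p_y.items():
        m = sum(p * x for (x, y), p in pmf.items() if y == y0) / py
        v = sum(p * (x - m) ** 2 for (x, y), p in pmf.items() if y == y0) / py
        cond_mean[y0], cond_var[y0] = m, v

    # Average the conditional variance, and take the variance of the conditional mean.
    expected_cond_var = sum(py * cond_var[y] for y, py in p_y.items())
    var_cond_mean = sum(py * (cond_mean[y] - mean_x) ** 2 for y, py in p_y.items())
    return var_x, expected_cond_var, var_cond_mean


var_x, e_cond_var, var_cond_mean = total_variance_terms(joint)
print(var_x, e_cond_var, var_cond_mean)
print(var_x == e_cond_var + var_cond_mean)
```

For this particular pmf the two pieces are $19/32$ and $1/64$, which add up to $\operatorname{Var}(X) = 39/64$.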
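A brief statement of the Rao-Blackwell theorem: suppose $T$ is a sufficient statistic and $\psi(\widetilde{X})$ is an unbiased estimator of the parameter $\theta$. Then $\hat{g}(T) = E(\psi(\widetilde{X})|T)$ is an unbiased estimator of $\theta$ that depends on the data only through $T$, and its variance is no larger than that of $\psi(\widetilde{X})$. Indeed, applying the formula above with $\psi(\widetilde{X})$ in place of $X$ and $T$ in place of $Y$ gives $\operatorname{Var}(\psi(\widetilde{X})) = E[\operatorname{Var}(\psi(\widetilde{X})|T)] + \operatorname{Var}(\hat{g}(T)) \ge \operatorname{Var}(\hat{g}(T))$.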
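To make the theorem concrete, the following sketch works through one assumed example that is not part of the original note: an i.i.d. Bernoulli($\theta$) sample of size $n$, the crude unbiased estimator $\psi(\widetilde{X}) = X_1$, and the sufficient statistic $T = \sum_i X_i$. The function name `rao_blackwell_bernoulli` and the choices `n=4`, `theta=1/3` are hypothetical. It enumerates every possible sample, computes $E(\psi|T=t)$ by direct conditioning, and compares the mean and variance of the two estimators.

```python
from fractions import Fraction
from itertools import product


def rao_blackwell_bernoulli(n, theta):
    """Enumerate all 0/1 samples of size n and compare psi = X_1 with E(psi | T)."""
    # Probability of each sample under i.i.d. Bernoulli(theta), with T = sum of the sample.
    outcomes = []
    for xs in product((0, 1), repeat=n):
        t = sum(xs)
        prob = theta ** t * (1 - theta) ** (n - t)
        outcomes.append((xs, t, prob))

    # g_hat(t) = E(psi | T = t), obtained by conditioning on each value of T.
    g_hat = {}
    for t0 in range(n + 1):
        p_t = sum(p for _, t, p in outcomes if t == t0)
        g_hat[t0] = sum(p * xs[0] for xs, t, p in outcomes if t == t0) / p_t

    def mean_var(values):
        # values is a list of (estimate, probability) pairs.
        mean = sum(p * v for v, p in values)
        return mean, sum(p * (v - mean) ** 2 for v, p in values)

    crude = mean_var([(xs[0], p) for xs, _, p in outcomes])
    improved = mean_var([(g_hat[t], p) for _, t, p in outcomes])
    return g_hat, crude, improved


g_hat, crude, improved = rao_blackwell_bernoulli(n=4, theta=Fraction(1, 3))
print("E(psi | T = t):", g_hat)
print("crude    (mean, variance):", crude)
print("improved (mean, variance):", improved)
```

In this example the conditional expectation works out to $t/n$, the sample mean, which does not involve $\theta$ (this is where sufficiency is used). Both estimators have mean $\theta$, while the variance drops from $\theta(1-\theta)$ to $\theta(1-\theta)/n$.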