Sufficient Statistic

Sufficient statistic - Wikipedia

Sufficient statistic - arizona

Definition

A statistic is a function of a random sample $X_1, X_2, \cdots, X_n$:
$$T = r(X_1, X_2, \cdots, X_n).$$
The distribution of the sample $X$, $f_{\theta}(X)=f(X;\theta)$, is determined by an unknown parameter $\theta$, which we usually estimate by maximum likelihood:
$$\max_{\theta} \quad P(X_1,X_2,\cdots, X_n ;\theta) = \prod_{i=1}^n P(X_i;\theta) = \prod_{i=1}^n f_{\theta}(X_i).$$
A sufficient statistic is a statistic $T$ that satisfies
$$P(\{X_i\}|T=t;\theta) = P(\{X_i\}|T=t),$$
that is, given $T(X)=t$, the conditional joint distribution of $\{X_i\}$ does not depend on the unknown parameter $\theta$.

Example: consider a Bernoulli distribution with success probability $p$ and failure probability $1-p$. Given $n$ i.i.d. samples $X_1, X_2,\cdots, X_n$,
$$P(\{X_i\};p) = p^{\sum_i X_i}(1-p)^{n-\sum_i X_i}.$$
$T=\sum_{i=1}^n X_i$ is a sufficient statistic for $p$ (the factorization theorem discussed later gives a quicker argument). Indeed,
$$P(\{X_i\}|T=t;p) = \frac{P(\{X_i\}, T=t; p)}{P(T=t;p)} = \frac{\mathbb{I}[\sum_{i=1}^n X_i=t]\cdot p^t (1-p)^{n-t}}{C_n^t \, p^t (1-p)^{n-t}}=\frac{\mathbb{I}[\sum_{i=1}^n X_i = t]}{C_n^t},$$
which clearly does not depend on the unknown parameter $p$.
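The Bernoulli computation above can be checked by brute force. The following is a minimal sketch, assuming a small hypothetical sample size ($n=4$, $t=2$) and a helper name `conditional_given_sum` introduced only for illustration; it enumerates every binary sequence and computes the conditional probability for several values of $p$.

```python
from itertools import product
from math import comb

def conditional_given_sum(n, t, p):
    """Return P(X = x | sum(X) = t; p) for every binary sequence x with sum t."""
    # Joint probability of each binary sequence under i.i.d. Bernoulli(p).
    joint = {
        x: p ** sum(x) * (1 - p) ** (n - sum(x))
        for x in product((0, 1), repeat=n)
    }
    # P(T = t; p) is the total mass of the sequences whose sum equals t.
    p_t = sum(prob for x, prob in joint.items() if sum(x) == t)
    return {x: prob / p_t for x, prob in joint.items() if sum(x) == t}

n, t = 4, 2  # hypothetical values chosen for illustration
for p in (0.1, 0.5, 0.9):
    cond = conditional_given_sum(n, t, p)
    # Every sequence with sum t should have probability 1 / C(n, t), whatever p is.
    print(p, max(cond.values()), min(cond.values()), 1 / comb(n, t))
```

Whatever value of $p$ is used, each sequence with $\sum_i X_i = t$ receives the same conditional probability $1/C_n^t$, which is what sufficiency of $T$ requires.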

Sufficient statistics are especially useful, for instance in the maximum likelihood estimation mentioned above. Since $T$ is a function of $\{X_i\}$,
$$P(\{X_i\};\theta) = P(\{X_i\}, T;\theta) = P(\{X_i\}|T;\theta) \: P(T;\theta) = P(\{X_i\}|T) \: P(T;\theta),$$
and $P(\{X_i\}|T)$ does not depend on $\theta$, maximizing the expression above is equivalent to
$$\max_{\theta} \quad P(T;\theta) = P(r(X_1, X_2,\cdots, X_n); \theta).$$
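As a sanity check on this equivalence, here is a small sketch for the Bernoulli case. The sample `x` and the parameter grid are hypothetical; the code maximizes the full-sample log-likelihood and the log-likelihood of $T=\sum_i X_i \sim \mathrm{Binomial}(n, p)$ over the same grid.

```python
import math
import numpy as np

# Hypothetical Bernoulli sample, fixed by hand so the example is deterministic.
x = np.array([1, 0, 0, 1, 0, 1, 0, 0, 0, 1])
n, t = len(x), int(x.sum())

# Candidate values of p (a grid is an assumption made for simplicity).
grid = np.linspace(0.001, 0.999, 999)

# Log-likelihood of the full sample: sum_i log p^{x_i} (1 - p)^{1 - x_i}.
ll_full = np.array(
    [np.sum(x * np.log(p) + (1 - x) * np.log(1 - p)) for p in grid]
)

# Log-likelihood of the sufficient statistic T ~ Binomial(n, p).
ll_T = math.log(math.comb(n, t)) + t * np.log(grid) + (n - t) * np.log(1 - grid)

p_full = grid[np.argmax(ll_full)]
p_T = grid[np.argmax(ll_T)]
print(p_full, p_T, t / n)
```

The two log-likelihoods differ only by the constant $\log C_n^t$, so they share the same maximizer, which on this grid coincides with $t/n$ up to the grid resolution.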

In particular, a scalar $T$ is sometimes not sufficient, and a vector $T=(T_1, T_2,\cdots, T_k)$ is needed as a whole. For example, when both $\mu$ and $\sigma$ of a normal distribution are unknown, $T=(\frac{1}{n}\sum_i X_i, \frac{1}{n-1}\sum_i (X_i - \bar{X})^2)$. The properties are the same as in the scalar case, so the vector case is not treated separately below.

In a Bayesian framework, where $\theta$ has a prior $P(\theta)$, we find:
$$P(\theta|\{X_i\}) = \frac{P(\{X_i\}, \theta)}{P(\{X_i\})} = \frac{P(\{X_i\}, T, \theta)}{P(\{X_i\}, T)} = \frac{P(\{X_i\}| T, \theta) P(T|\theta) P(\theta)}{P(\{X_i\}|T) P(T)} = \frac{P(\{X_i\}| T) P(T|\theta) P(\theta)}{P(\{X_i\}|T) P(T)} = \frac{P(T|\theta) P(\theta)}{P(T)} = P(\theta|T).$$
That is, the conditional (posterior) distribution of $\theta$ is the same whether we condition on $\{X_i\}$ or on $T$.
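The posterior identity can be illustrated numerically. The sketch below assumes a hypothetical Bernoulli sample and a discretized uniform prior on $\theta$ (both are illustrative choices, not part of the original text), and compares the posterior computed from the full sample with the one computed from $T=\sum_i X_i$ alone.

```python
from math import comb
import numpy as np

x = np.array([1, 0, 1, 1, 0, 0, 1, 0])      # hypothetical sample
n, t = len(x), int(x.sum())

theta = np.linspace(0.01, 0.99, 99)          # grid over the parameter
prior = np.ones_like(theta) / len(theta)     # uniform prior (an assumption)

# Posterior from the full sample: prior * prod_i theta^{x_i} (1 - theta)^{1 - x_i}.
lik_full = np.prod(theta[:, None] ** x * (1 - theta[:, None]) ** (1 - x), axis=1)
post_full = prior * lik_full
post_full /= post_full.sum()

# Posterior from the sufficient statistic T ~ Binomial(n, theta).
lik_T = comb(n, t) * theta ** t * (1 - theta) ** (n - t)
post_T = prior * lik_T
post_T /= post_T.sum()

# Largest pointwise difference between the two posteriors.
print(np.max(np.abs(post_full - post_T)))
```

The binomial coefficient cancels in the normalization, so the two posteriors agree up to floating-point error.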

In particular, sufficiency can also be defined through mutual information: $T$ is a sufficient statistic if and only if
$$I(\theta;X) = I(\theta;T(X)).$$
Note: in general, $I(\theta;X) \ge I(\theta;T(X))$ (the data processing inequality).
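To make the mutual-information criterion concrete, here is a small sketch in which $\theta$ is treated as random with a hypothetical two-point prior ($\theta \in \{0.2, 0.7\}$, equal weights) and $X$ consists of $n=3$ i.i.d. Bernoulli draws. The helper names are introduced only for this illustration. It compares $I(\theta;X)$ with the mutual information for two statistics: the sum (sufficient) and the first observation alone (not sufficient).

```python
from itertools import product
from math import log2

def mutual_information(joint):
    """I(A; B) in bits, from a dict {(a, b): P(a, b)}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(
        p * log2(p / (pa[a] * pb[b])) for (a, b), p in joint.items() if p > 0
    )

def joint_theta_stat(stat, n=3, thetas=(0.2, 0.7), prior=(0.5, 0.5)):
    """Joint distribution of (theta, stat(X)) for i.i.d. Bernoulli(theta) samples."""
    joint = {}
    for theta, w in zip(thetas, prior):
        for x in product((0, 1), repeat=n):
            p = w * theta ** sum(x) * (1 - theta) ** (n - sum(x))
            key = (theta, stat(x))
            joint[key] = joint.get(key, 0.0) + p
    return joint

i_full = mutual_information(joint_theta_stat(lambda x: x))      # I(theta; X)
i_sum = mutual_information(joint_theta_stat(sum))               # I(theta; sum_i X_i)
i_first = mutual_information(joint_theta_stat(lambda x: x[0]))  # I(theta; X_1)
print(i_full, i_sum, i_first)
```

In this setup the sum retains all the information about $\theta$ that the full sample carries, while keeping only $X_1$ loses some of it.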

Identifying Sufficient Statistics

Checking sufficiency directly from the criteria above is very difficult. Fortunately, we have the Fisher-Neyman factorization theorem:

Factorization Theorem: Let $f_{\theta}(X)$ be the joint density of $\{X_i\}$. Then $T$ is a sufficient statistic for $\theta$ if and only if there exist nonnegative functions $g, h$ such that
$$f(X_1, X_2,\cdots, X_n; \theta) = h(X_1, X_2,\cdots, X_n) \, g(T; \theta).$$
Note: $T$ may be a vector $T=(T_1, T_2,\cdots, T_k)$.

proof:

$\Rightarrow$
$$p(X_1,X_2,\cdots, X_n;\theta) = p(\{X_i\}, T;\theta) = p(\{X_i\}|T;\theta) \, p(T;\theta) = p(\{X_i\}|T) \, p(T;\theta).$$
In this case, take
$$g(T;\theta) = p(T;\theta), \quad h(X_1, X_2,\cdots, X_n) = p(\{X_i\}|T).$$

$\Leftarrow$

For brevity, let $X = \{X_1, X_2,\cdots, X_n\}$.
$$\begin{array}{ll} p(T=t;\theta) &= \int_{T(X)=t} p(X,T=t;\theta) \, \mathrm{d}X \\ &= \int_{T(X)=t} f(X;\theta) \, \mathrm{d}X \\ &= \int_{T(X)=t} h(X) \, g(T=t;\theta) \, \mathrm{d}X \\ &= \int_{T(X)=t} h(X) \, \mathrm{d}X \cdot g(T=t;\theta). \end{array}$$

$$\begin{array}{ll} p(X | T=t;\theta) &= \frac{p(X,T=t;\theta)}{p(T=t;\theta)} \\ &= \frac{p(X;\theta)}{p(T=t;\theta)} \\ &= \frac{h(X) \, g(T=t;\theta)}{\int_{T(X)=t}h(X) \, \mathrm{d} X \cdot g(T=t;\theta)} \\ &= \frac{h(X)}{\int_{T(X)=t}h(X) \, \mathrm{d}X}, \end{array}$$
which does not depend on $\theta$.

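Note: the proof above is not fully rigorous. It holds as written in the discrete case (with the integrals replaced by sums); in the continuous case, conditioning on the event $T=t$ requires a measure-theoretic treatment.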

Minimal Sufficient Statistic

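A minimal sufficient statistic $S$ satisfies:

  1. $S$ is a sufficient statistic;
  2. for every sufficient statistic $T$, there exists a function $f$ such that $S=f(T)$.

Note: if $T$ is a sufficient statistic, then $f(T)$ is also a sufficient statistic for any invertible function $f$.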

Examples

$U[0, \theta]$

For the uniform distribution,
$$p(X_1, X_2,\cdots, X_n;\theta) = \frac{1}{\theta^n} \mathbb{I}[0\le \min \{X_i\}] \cdot \mathbb{I}[\max \{X_i\} \le \theta],$$

$$T = \max \{X_i\}, \quad g(T;\theta) = \mathbb{I}[T \le \theta] \cdot \frac{1}{\theta^n}, \quad h(X) = \mathbb{I}[0\le \min \{X_i\}].$$
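A quick way to see what this factorization means in practice: two samples of the same size with the same maximum should produce the same likelihood function of $\theta$. The sketch below uses two hypothetical samples and a helper `uniform_likelihood` written only for this illustration.

```python
import numpy as np

def uniform_likelihood(x, theta):
    """Joint density of an i.i.d. U[0, theta] sample x, as a function of theta."""
    x = np.asarray(x, dtype=float)
    inside = (x.min() >= 0) and (x.max() <= theta)
    return theta ** (-len(x)) if inside else 0.0

# Two hypothetical samples with the same size and the same maximum (2.4).
a = [0.3, 1.9, 0.8, 2.4]
b = [2.4, 0.1, 0.1, 1.0]

for theta in (1.0, 2.4, 3.0, 5.0):
    print(theta, uniform_likelihood(a, theta), uniform_likelihood(b, theta))
```

The likelihood is zero for $\theta < \max\{X_i\}$ and decreases like $\theta^{-n}$ afterwards, so it depends on the data only through $\max\{X_i\}$ (given that all observations are nonnegative).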

$U[\alpha, \beta]$

$$p(X_1, X_2,\cdots, X_n;\alpha,\beta) = \frac{1}{(\beta - \alpha)^n} \mathbb{I}[\alpha\le \min \{X_i\}] \cdot \mathbb{I}[\max \{X_i\} \le \beta],$$

$$T = (\min \{X_i\}, \max \{X_i\}), \quad g(T;\alpha, \beta) = \frac{1}{(\beta - \alpha)^n} \mathbb{I}[\alpha\le \min \{X_i\}] \cdot \mathbb{I}[\max \{X_i\} \le \beta], \quad h(X) = 1.$$

Poisson

$$P(X;\lambda) = \frac{\lambda^X e^{-\lambda}}{X!}.$$

$$p(X_1, X_2,\cdots, X_n;\lambda) = e^{-n\lambda} \lambda^{\sum_{i}X_i} \cdot \frac{1}{\prod_i X_i!}.$$

$$T = \sum_i X_i, \quad g(T;\lambda) = e^{-n\lambda} \cdot \lambda^T, \quad h(X) = \frac{1}{\prod_{i} X_i!}.$$
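The Poisson factorization can be verified numerically. This is a minimal sketch with hypothetical count data; the function names `poisson_joint`, `g`, and `h` simply mirror the symbols used above.

```python
from math import exp, factorial, prod

def poisson_joint(xs, lam):
    """Joint pmf of an i.i.d. Poisson(lam) sample."""
    return prod(lam ** x * exp(-lam) / factorial(x) for x in xs)

def g(t, n, lam):
    """Parameter-dependent factor; depends on the data only through t = sum(xs)."""
    return exp(-n * lam) * lam ** t

def h(xs):
    """Parameter-free factor."""
    return 1 / prod(factorial(x) for x in xs)

xs = [2, 0, 3, 1]   # hypothetical sample, sum = 6
ys = [1, 1, 2, 2]   # another hypothetical sample with the same sum

for lam in (0.5, 1.0, 2.5):
    lhs = poisson_joint(xs, lam)
    rhs = h(xs) * g(sum(xs), len(xs), lam)
    # The likelihood ratio of two samples with equal sums should not depend on lam.
    ratio = poisson_joint(xs, lam) / poisson_joint(ys, lam)
    print(lam, lhs, rhs, ratio)
```

Because `xs` and `ys` share the same sum, their likelihood ratio reduces to `h(xs) / h(ys)`, a constant in $\lambda$.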

Normal

$$P(X;\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(X-\mu)^2}{2\sigma^2}\right).$$

$$p(X_1, X_2,\cdots, X_n;\mu, \sigma) = (2\pi\sigma^2)^{-\frac{n}{2}} \exp \left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (X_i - \bar{X})^2\right) \exp\left(-\frac{n}{2\sigma^2}(\mu-\bar{X})^2\right).$$
When $\sigma$ is known:
$$T=\frac{1}{n}\sum_i X_i = \bar{X}, \quad g(T;\mu) = (2\pi\sigma^2)^{-\frac{n}{2}} \exp\left(-\frac{n}{2\sigma^2}(\mu-T)^2\right), \quad h(X) = \exp \left(-\frac{1}{2\sigma^2}\sum_{i=1}^n (X_i - \bar{X})^2\right).$$

When $\sigma$ is unknown:
$$T = (\bar{X}, s^2), \quad s^2 = \frac{\sum_{i=1}^n(X_i-\bar{X})^2}{n-1}, \quad g(T;\mu,\sigma) = (2\pi\sigma^2)^{-\frac{n}{2}}\exp\left(-\frac{n-1}{2\sigma^2}s^2\right) \exp\left(-\frac{n}{2\sigma^2}(\mu-\bar{X})^2\right), \quad h(X) = 1.$$
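For the normal case with both parameters unknown, the log-likelihood can be evaluated from $(\bar{X}, s^2)$ alone. The sketch below, using a hypothetical sample and illustrative function names, compares that with the log-likelihood computed from the raw observations.

```python
import numpy as np

def loglik_full(x, mu, sigma):
    """Log-likelihood computed from every observation."""
    return np.sum(
        -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
    )

def loglik_from_T(n, xbar, s2, mu, sigma):
    """Log-likelihood computed only from n and T = (xbar, s2), i.e. log g(T; mu, sigma)."""
    return (
        -0.5 * n * np.log(2 * np.pi * sigma**2)
        - (n - 1) * s2 / (2 * sigma**2)
        - n * (mu - xbar) ** 2 / (2 * sigma**2)
    )

x = np.array([1.2, -0.4, 0.7, 2.1, 0.3])   # hypothetical sample
n, xbar, s2 = len(x), x.mean(), x.var(ddof=1)

for mu, sigma in [(0.0, 1.0), (0.8, 0.5), (-1.0, 2.0)]:
    print(mu, sigma, loglik_full(x, mu, sigma), loglik_from_T(n, xbar, s2, mu, sigma))
```

The agreement rests on the identity $\sum_i (X_i-\mu)^2 = \sum_i (X_i-\bar{X})^2 + n(\bar{X}-\mu)^2$, which is also what produces the two exponential factors in the joint density.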

Exponential

$$p(X;\lambda) = \frac{1}{\lambda} e^{-\frac{X}{\lambda}}, \quad X \ge 0.$$

$$p(X_1, X_2,\cdots, X_n;\lambda) = \frac{1}{\lambda^n} e^{-\frac{\sum_{i=1}^n X_i}{\lambda}}.$$

$$T = \sum_{i=1}^n X_i, \quad g(T;\lambda) = \frac{1}{\lambda^n} e^{-\frac{T}{\lambda}}, \quad h(X) = 1.$$

Gamma

$$p(X;\alpha, \beta) = \frac{1}{\Gamma(\alpha) \beta^{\alpha}}X^{\alpha-1} e^{-\frac{X}{\beta}}.$$

$$p(X_1, X_2,\cdots, X_n;\alpha, \beta) = \frac{1}{(\Gamma(\alpha) \beta^{\alpha})^n}\left(\prod_{i} X_i\right)^{\alpha-1} e^{-\frac{\sum_i X_i}{\beta}}.$$

$$T = \left(\prod_i X_i, \sum_i X_i\right), \quad g(T;\alpha,\beta) = \frac{1}{(\Gamma(\alpha) \beta^{\alpha})^n}\left(\prod_{i} X_i\right)^{\alpha-1} e^{-\frac{\sum_i X_i}{\beta}}, \quad h(X) = 1.$$
