Order Statistic

The Order Statistic

Order statistics are what you obtain by sorting a family of independent observations $X_1, X_2, \ldots, X_n$:
$$X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}.$$
Capital letters are used because each $X_{(i)}$ is itself a random variable: it is a function of the $X_i$, $i=1,\ldots,n$, that is, $X_{(i)} = X_{(i)}(X_1, X_2, \cdots, X_n)$.

Deriving the properties of order statistics relies on a very useful representation. Let $F(x)=P(X\le x)$ be the distribution function and define its inverse as
$$F^{-1}(y) = \inf \{x: F(x) \ge y\}.$$
A nice property is that if $U$ is uniformly distributed on $[0,1]$, then
$$F^{-1}(U) \sim F,$$
that is, $F^{-1}(U)$ has the same distribution as $X$. Indeed, $F^{-1}(U) \le u \Leftrightarrow U \le F(u)$, so $P(F^{-1}(U) \le u) = P(U \le F(u)) = F(u)$.
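As a quick numerical illustration of the identity above, here is a minimal sketch that assumes the standard exponential distribution, whose inverse is available in closed form. The function names `exp_quantile` and `sample_via_inverse` are made up for this example.

```python
import math
import random

def exp_quantile(y):
    """F^{-1}(y) for the standard exponential CDF F(x) = 1 - exp(-x), 0 <= y < 1."""
    return -math.log(1.0 - y)

def sample_via_inverse(n, seed=0):
    """Draw n values as F^{-1}(U), with U uniform on [0, 1)."""
    rng = random.Random(seed)
    return [exp_quantile(rng.random()) for _ in range(n)]

xs = sample_via_inverse(100_000)
# Compare the empirical P(X <= 1) with F(1) = 1 - exp(-1).
empirical = sum(x <= 1.0 for x in xs) / len(xs)
print(empirical, 1.0 - math.exp(-1.0))
```

The two printed numbers should agree to about two decimal places; the agreement is only statistical, so it varies with the seed.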

Hence, given independent random variables $U_1, U_2, \ldots, U_n$ uniform on $[0,1]$ and i.i.d. $X_1, X_2,\ldots, X_n$ with distribution function $F$, we have, in distribution (using that $F^{-1}$ is nondecreasing, so it preserves the ordering),
$$(X_{(1)}, X_{(2)}, \cdots, X_{(n)}) \overset{d}{=} (F^{-1}(U_{(1)}), F^{-1}(U_{(2)}), \cdots, F^{-1}(U_{(n)})).$$

In addition, let $F_n$ denote the empirical distribution function of $X$, given explicitly by
$$F_n(x) = \frac{1}{n}\sum_{i=1}^n \mathbb{I}(X_i \le x),$$
and let
$$\xi_p := F^{-1}(p), \quad \hat{\xi}_{pn} := F_n^{-1}(p).$$
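To make these two definitions concrete, the sketch below computes $F_n(x)$ and $F_n^{-1}(p)$ for a small sample. It relies on the fact that $\inf\{x: F_n(x) \ge p\}$ is the $\lceil np \rceil$-th order statistic; the helper names `ecdf` and `empirical_quantile` are chosen for this example only.

```python
import math

def ecdf(sample, x):
    """F_n(x): the fraction of observations that are <= x."""
    return sum(v <= x for v in sample) / len(sample)

def empirical_quantile(sample, p):
    """F_n^{-1}(p) = inf{x : F_n(x) >= p} for 0 < p <= 1."""
    ordered = sorted(sample)
    k = math.ceil(len(ordered) * p)  # smallest k with k / n >= p
    return ordered[k - 1]            # the k-th order statistic (1-indexed)

data = [2.0, 5.0, 1.0, 4.0, 3.0]
print(ecdf(data, 3.0))                # 0.6
print(empirical_quantile(data, 0.5))  # 3.0
```

Note that `len(ordered) * p` is a floating-point product, so values of `p` for which $np$ is meant to be an exact integer can occasionally round the wrong way; a production version would guard against that.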

Lemma 1: basic properties of $F^{-1}$

Lemma 1: Let $F$ be a distribution function. Then $F^{-1}(t)$, $0 < t < 1$, is nondecreasing and left-continuous, and satisfies

  1. $F^{-1}(F(x)) \le x$, $-\infty < x < \infty$;
  2. $F(F^{-1}(t)) \ge t$, $0 < t < 1$;
  3. $F(x) \ge t$ if and only if $x \ge F^{-1}(t)$.

Note: $F(x)$ is nondecreasing and right-continuous.

The distribution of order statistics

Theorem 1: Suppose $F(x)$ has a density function $f(x)$.

  1. $P(X_{(k)} \le x) = \sum_{i=k}^n \mathrm{C}_n^i [F(x)]^i [1-F(x)]^{n-i}, \quad -\infty < x < \infty.$

  2. The density function of $X_{(k)}$ is
    $$n\mathrm{C}_{n-1}^{k-1} F^{k-1}(x) [1-F(x)]^{n-k} f(x).$$

  3. The joint density function of $X_{(k_1)}, X_{(k_2)}$ (with $x_1<x_2$, $k_1<k_2$) is
    $$\frac{n!}{(k_1-1)!(k_2-k_1-1)!(n-k_2)!}[F(x_1)]^{k_1-1} [F(x_2)-F(x_1)]^{k_2-k_1-1} [1-F(x_2)]^{n-k_2} f(x_1)f(x_2).$$

  4. The joint density function of all the order statistics is
    $$n!f(x_1)f(x_2)\cdots f(x_n), \quad -\infty < x_1<x_2<\cdots <x_n < \infty.$$

proof: Items 1 and 2 are straightforward. For item 3, note that the joint distribution function of $X_{(k_1)}, X_{(k_2)}$ at $(x_1, x_2)$ with $x_1 < x_2$ is
$$\sum_{i=k_2}^n \mathrm{C}_n^i [1-F(x_2)]^{n-i} \Big\{ \sum_{j=k_1}^i \mathrm{C}_{i}^j [F(x_1)]^j [F(x_2)-F(x_1)]^{i-j} \Big\},$$
where $j$ counts the observations that are at most $x_1$ and $i$ counts those that are at most $x_2$.
Differentiating this formula proceeds much like the proofs of items 1 and 2. Item 4 is trivial.
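Item 1 of Theorem 1 is easy to check by simulation. The sketch below assumes Uniform$[0,1]$ data, so that $F(x) = x$, and compares the binomial-sum formula with a Monte Carlo estimate; `order_stat_cdf` and `simulate_order_stat_cdf` are illustrative names, not from the source.

```python
import math
import random

def order_stat_cdf(k, n, Fx):
    """P(X_(k) <= x) = sum_{i=k}^n C(n, i) F(x)^i (1 - F(x))^(n - i), with Fx = F(x)."""
    return sum(math.comb(n, i) * Fx**i * (1.0 - Fx) ** (n - i) for i in range(k, n + 1))

def simulate_order_stat_cdf(k, n, x, trials=50_000, seed=0):
    """Monte Carlo estimate of P(U_(k) <= x) for n i.i.d. Uniform[0, 1) draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        kth = sorted(rng.random() for _ in range(n))[k - 1]
        hits += kth <= x
    return hits / trials

exact = order_stat_cdf(k=2, n=5, Fx=0.3)
approx = simulate_order_stat_cdf(k=2, n=5, x=0.3)
print(exact, approx)
```

For these parameters the exact value is $1 - 0.7^5 - 5 \cdot 0.3 \cdot 0.7^4 = 0.47178$, and the simulated value should land close to it.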

Conditional distributions of order statistics

Theorem 2: Suppose $F(x)$ has a density function $f(x)$. Then the distribution of $X_{(j)} \mid X_{(i)} = x_i$, $i< j$, is that of the $(j-i)$-th order statistic among $n-i$ observations drawn from the distribution function $\frac{F(x)-F(x_i)}{1-F(x_i)}$, $x_i \le x < \infty$.

proof: Write $F_i(x) := \frac{F(x)-F(x_i)}{1-F(x_i)}$, so that $1 - F_i(x) = \frac{1-F(x)}{1-F(x_i)}$ and $F_i'(x) = \frac{f(x)}{1-F(x_i)}$. Then
$$
\begin{aligned}
f(x_j \mid X_{(i)}=x_i)
&= f_{X_{(i)}, X_{(j)}}(x_i, x_j) / f_{X_{(i)}}(x_i) \\
&= \frac{(n-i)!}{(j-i-1)!(n-j)!} \Big\{ \frac{F(x_j)-F(x_i)}{1-F(x_i)} \Big\}^{j-i-1} \times \Big\{ \frac{1-F(x_j)}{1-F(x_i)} \Big\}^{n-j} \frac{f(x_j)}{1-F(x_i)} \\
&= (n-i)\mathrm{C}_{n-i-1}^{j-i-1} [F_i(x_j)]^{j-i-1} [1-F_i(x_j)]^{n-j} [F_i(x_j)]'.
\end{aligned}
$$

Comparing this with the density formula in Theorem 1 gives the claim.

Theorem 3: Suppose $F(x)$ has a density function $f(x)$. Then the distribution of $X_{(i)} \mid X_{(j)} = x_j$, $i<j$, is that of the $i$-th order statistic among $j-1$ observations drawn from the distribution function $\frac{F(x)}{F(x_j)}$, $-\infty < x \le x_j$.

proof: Same argument as above.

Special properties of special distributions

Theorem 4: Let $X_1, X_2, \ldots, X_n$ be independent standard exponential random variables, and set
$$Z_i := (n-i+1) (X_{(i)} - X_{(i-1)}), \quad X_{(0)} \equiv 0.$$
Then $Z_1, Z_2,\ldots,Z_n$ are also independent standard exponential random variables.

proof: Change variables from $x$ to $z$ using the Jacobian determinant, taking care that the two densities live on different regions ($0 < x_1 < \cdots < x_n$ versus $z_i > 0$ for all $i$).
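A simulation cannot replace the change-of-variables argument, but it gives a quick sanity check of Theorem 4. The sketch below (with the made-up helper `normalized_spacings`) only verifies that each $Z_i$ has sample mean close to 1; it does not test independence.

```python
import random

def normalized_spacings(n, rng):
    """Return Z_i = (n - i + 1) * (X_(i) - X_(i-1)), i = 1..n, for standard exponential X's."""
    ordered = sorted(rng.expovariate(1.0) for _ in range(n))
    previous = 0.0  # X_(0) := 0
    spacings = []
    for i, x in enumerate(ordered, start=1):
        spacings.append((n - i + 1) * (x - previous))
        previous = x
    return spacings

rng = random.Random(0)
n, trials = 4, 50_000
sums = [0.0] * n
for _ in range(trials):
    for idx, z in enumerate(normalized_spacings(n, rng)):
        sums[idx] += z
means = [s / trials for s in sums]
print(means)  # each entry should be close to 1, the mean of a standard exponential
```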

Theorem 5: For the uniform distribution on $[0, 1]$, the random variables $V_1 = U_{(i)} / U_{(j)}$ and $V_2=U_{(j)}$, $1 \le i < j \le n$, are independent; the former follows $\mathrm{Beta}(i, j-i)$ and the latter follows $\mathrm{Beta}(j, n-j+1)$.

proof: Again by a change of variables, as above.

Theorem 6: For the uniform distribution on $[0, 1]$, the random variables
$$V_1^* = \frac{U_{(1)}}{U_{(2)}}, \quad V_2^*=\Big(\frac{U_{(2)}}{U_{(3)}}\Big)^2, \quad \cdots, \quad V_{n-1}^*=\Big(\frac{U_{(n-1)}}{U_{(n)}}\Big)^{n-1}, \quad V_n^*=U_{(n)}^n$$
are independent and each uniformly distributed on $[0, 1]$.

proof: This can also be done by a change of variables, although the text instead converts to the exponential distribution and applies the earlier results.
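The following sketch checks the uniformity claim of Theorem 6 numerically. It assumes the exponent on the $i$-th ratio is $i$, so the last ratio $U_{(n-1)}/U_{(n)}$ is raised to the power $n-1$, and it only compares sample means with $1/2$. The name `power_ratio_transform` is hypothetical.

```python
import random

def power_ratio_transform(n, rng):
    """Return V_i = (U_(i) / U_(i+1))^i for i = 1..n-1, followed by V_n = U_(n)^n."""
    # A zero denominator would need two draws of exactly 0.0, which is ignored here.
    u = sorted(rng.random() for _ in range(n))
    v = [(u[i - 1] / u[i]) ** i for i in range(1, n)]
    v.append(u[n - 1] ** n)
    return v

rng = random.Random(0)
n, trials = 5, 50_000
sums = [0.0] * n
for _ in range(trials):
    for idx, value in enumerate(power_ratio_transform(n, rng)):
        sums[idx] += value
means = [s / trials for s in sums]
print(means)  # each entry should be close to 0.5, the mean of Uniform[0, 1]
```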

$\hat{\xi}_{pn}-\xi_p$

Theorem 7: Let $0 < p < 1$. Suppose $\xi_p$ is the unique solution $x$ of $F(x^{-}) \le p \le F(x)$. Then
$$P(|\hat{\xi}_{pn} - \xi_p| > \epsilon) \le 2 \exp (-2n\delta_{\epsilon}^2), \quad \forall \epsilon > 0, \ \forall n,$$
where $\delta_{\epsilon} = \min \{F(\xi_p+\epsilon)-p, p-F(\xi_p-\epsilon)\}$.

proof: The proof splits the probability into its two one-sided parts and applies Hoeffding's inequality to each; it is a fairly technical argument.
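To get a feel for how loose or tight the bound in Theorem 7 is, the sketch below compares it with a Monte Carlo estimate in one assumed setting: Uniform$[0,1]$ data, $p = 0.5$, $n = 100$, $\epsilon = 0.1$, where $\xi_p = p$ and $\delta_\epsilon = \epsilon$. The helper names are invented for this example.

```python
import math
import random

def sample_quantile(sample, p):
    """F_n^{-1}(p): the ceil(n * p)-th order statistic."""
    ordered = sorted(sample)
    return ordered[math.ceil(len(ordered) * p) - 1]

def deviation_probability(n, p, eps, trials=20_000, seed=0):
    """Monte Carlo estimate of P(|xi_hat_pn - xi_p| > eps) for Uniform[0, 1), where xi_p = p."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        q = sample_quantile([rng.random() for _ in range(n)], p)
        exceed += abs(q - p) > eps
    return exceed / trials

n, p, eps = 100, 0.5, 0.1
delta = min((p + eps) - p, p - (p - eps))  # F(x) = x on [0, 1]
bound = 2.0 * math.exp(-2.0 * n * delta**2)
estimate = deviation_probability(n, p, eps)
print(estimate, bound)
```

In this setting the bound equals $2e^{-2} \approx 0.27$, and the simulated probability should come out well below it, as expected for an inequality that holds for every $F$.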

$F_n$

Theorem 11:

  1. $\mathbb{E}(F_n(x)) = F(x)$;
  2. $\mathrm{Var}(F_n(x)) = \frac{F(x)(1-F(x))}{n}\rightarrow 0.$

proof: It suffices to observe that $nF_n(x)$ follows a $\mathrm{binomial}(n, F(x))$ distribution.
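The two moments in Theorem 11 can be checked empirically. This sketch assumes Uniform$[0,1]$ data with $x = 0.3$ and $n = 50$, so that $F(x) = 0.3$ and the target variance is $0.3 \cdot 0.7 / 50 = 0.0042$; `ecdf_at` is an illustrative helper.

```python
import random
import statistics

def ecdf_at(sample, x):
    """F_n(x): the fraction of observations that are <= x."""
    return sum(v <= x for v in sample) / len(sample)

rng = random.Random(0)
n, x, trials = 50, 0.3, 20_000
# Each entry is F_n(x) computed from a fresh sample of size n.
values = [ecdf_at([rng.random() for _ in range(n)], x) for _ in range(trials)]

mean_hat = statistics.fmean(values)
var_hat = statistics.pvariance(values)
print(mean_hat, x)                  # E F_n(x) against F(x)
print(var_hat, x * (1.0 - x) / n)   # Var F_n(x) against F(x)(1 - F(x)) / n
```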

Theorem 12:
$$P\{\sup_x |F_n(x) - F(x)| \rightarrow 0\} = 1.$$

proof: Given $\epsilon >0$, take $k > 1/\epsilon$ and points
$$-\infty =x_0 < x_1 \le \cdots \le x_{k-1} < x_k = \infty$$
such that $F(x_j^-) \le j/k\le F(x_j)$, $j=1,\ldots, k-1$. If $x_{j-1}< x_j$, then $F(x_j^-)-F(x_{j-1}) \le 1/k < \epsilon$.

By the strong law of large numbers,
$$F_n(x_j) \mathop{\rightarrow} \limits^{a.s.} F(x_j), \quad F_n(x_j^-) \mathop{\rightarrow} \limits^{a.s.} F(x_j^-), \quad j=1,\ldots, k-1.$$
Hence
$$\Delta_n = \max(|F_n(x_j) - F(x_j)|, |F_n(x_j^-) - F(x_j^-)|, j=1,\ldots,k-1) \mathop{\rightarrow} \limits^{a.s.} 0.$$
For $x_{j-1}< x < x_j$ (at $x=x_j$ the inequalities below hold trivially):
$$
\begin{aligned}
F_n(x) - F(x) &\le F_n(x_j^-) - F(x_{j-1}) \le F_n(x_j^-)-F(x_j^-)+\epsilon\le \Delta_n + \epsilon, \\
F_n(x) - F(x) &\ge F_n(x_{j-1}) - F(x_j^-) \ge F_n(x_{j-1}) - F(x_{j-1}) -\epsilon \ge -\Delta_n - \epsilon.
\end{aligned}
$$
Therefore
$$\sup_x|F_n(x) - F(x)| \le \Delta_n + \epsilon \mathop{\rightarrow}\limits^{a.s.} \epsilon.$$
This holds for every $\epsilon > 0$, so the theorem follows.
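The uniform convergence in Theorem 12 is easy to watch numerically. The sketch below assumes Uniform$[0,1]$ data, so $F(x) = x$, and uses the fact that the supremum of $|F_n(x) - x|$ is attained at a jump of $F_n$, just before or at an order statistic. The function name `sup_distance_uniform` is made up for this example.

```python
import random

def sup_distance_uniform(sample):
    """sup_x |F_n(x) - x| for data assumed to come from Uniform[0, 1]."""
    ordered = sorted(sample)
    n = len(ordered)
    # At the (i+1)-th order statistic x, F_n jumps from i/n to (i+1)/n.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(ordered))

rng = random.Random(0)
distances = {}
for n in (100, 1_000, 10_000, 100_000):
    distances[n] = sup_distance_uniform([rng.random() for _ in range(n)])
    print(n, distances[n])
```

The printed distances should shrink roughly like $1/\sqrt{n}$, although a single run is not guaranteed to be monotone in $n$.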

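Note: the proof here differs slightly from the one in the text; writing it this way seems more reasonable to me.

Note: the text also covers many other properties, asymptotic ones in particular. I could only follow the general idea, so I have not recorded them here.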

