Throughout this article, $E[\cdot]$ and $D[\cdot]$ denote expectation and variance, respectively.
Discrete Random Variable Distributions
1. The 0-1 Distribution (Bernoulli Distribution)
Consider tossing a coin. Let the probability of heads be $p$ and the probability of tails be $1-p$. The Bernoulli random variable $X$ takes the value $1$ when the toss comes up heads and $0$ when it comes up tails.
Written $X\sim B(1,p)$, the distribution of $X$ is
$$P(X=x)=\begin{cases}p & x=1\\1-p & x=0\end{cases}$$
$$E[X]=p\cdot 1+(1-p)\cdot 0=p$$
Clearly $E[X^2]=0^2\cdot(1-p)+1^2\cdot p=p$, so
$$D[X]=E[X^2]-E[X]^2=p-p^2=p(1-p)$$
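The expectation and variance above can be checked directly from the pmf. A minimal sketch (the value of $p$ is an arbitrary example choice):

```python
# Compute E[X] and D[X] of a Bernoulli(p) variable exactly from its pmf.
# p = 0.3 is an arbitrary example value.
p = 0.3
pmf = {0: 1 - p, 1: p}

EX = sum(x * q for x, q in pmf.items())      # E[X]
EX2 = sum(x**2 * q for x, q in pmf.items())  # E[X^2]
DX = EX2 - EX**2                             # D[X] = E[X^2] - E[X]^2

print(EX, DX)  # should match p and p(1-p)
```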
2. The Binomial Distribution
Toss a coin $n$ times, where on each toss heads appears with probability $p$ and tails with probability $1-p$, and the tosses are mutually independent. Let $X$ be the number of heads obtained in the $n$ tosses. We call $X$ a binomial random variable with parameters $n$ and $p$.
Written $X\sim B(n,p)$, the distribution of $X$ is
$$P(X=k)=C_n^kp^k(1-p)^{n-k}\quad (0\le k\le n,\ k\in \mathbb{Z},\ C_n^k=\dfrac{n!}{k!(n-k)!})$$
We can regard $X=X_1+X_2+\cdots +X_n$, where $X_i\sim B(1,p)$.
$$E[X]=E[X_1+X_2+\cdots +X_n]=\sum\limits^n_{i=1}E[X_i]=np$$
$$D[X]=D\left[\sum\limits^n_{i=1}X_i\right]=\sum\limits^n_{i=1}D[X_i]=np(1-p)$$
(the variances add because the $X_i$ are independent)
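The formulas $E[X]=np$ and $D[X]=np(1-p)$ can be checked exactly by summing over the finite pmf. A minimal sketch ($n$ and $p$ are arbitrary example values):

```python
from math import comb

# Compute E[X] and D[X] of a Binomial(n, p) variable exactly from its pmf
# and compare with np and np(1-p). n = 10, p = 0.3 are example values.
n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

EX = sum(k * q for k, q in enumerate(pmf))
EX2 = sum(k * k * q for k, q in enumerate(pmf))
DX = EX2 - EX**2

print(EX, DX)  # should match np = 3.0 and np(1-p) = 2.1
```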
Related to the binomial distribution are the geometric and hypergeometric distributions; for the hypergeometric distribution, see a standard reference such as Baidu Baike.
The Geometric Distribution
In repeated Bernoulli trials, this is the distribution of the trial number $k$ at which the first success occurs.
Written $X\sim G(p)$, the distribution of $X$ is $P(X=k)=(1-p)^{k-1}p\quad (k\ge 1,\ k\in \mathbb{Z})$
Its expectation is
$$E[X]=\lim\limits_{n\to \infty}\sum\limits_{k=1}^n(1-p)^{k-1}pk=p\cdot \lim\limits_{n\to \infty}\sum\limits_{k=1}^n(1-p)^{k-1}k$$
$$=p\cdot \lim\limits_{n\to \infty}\dfrac{1}{(1-p)-1}\left[(1-p)^n\cdot n +\sum\limits_{k=1}^{n-1}(1-p)^{k}[(k-1)-k]-(1-p)^0\cdot 1\right]$$
$$=\lim\limits_{n\to \infty}(-1)\left[(1-p)^n\cdot n -\dfrac{(1-p)^n-(1-p)}{1-p-1}-1\right]$$
$$=\lim\limits_{n\to \infty}\left[1+\dfrac{1-p}{p}-\dfrac{(1-p)^n}{p}-(1-p)^n\cdot n \right]$$
$$=\dfrac{1}{p}-\lim\limits_{n\to \infty}(1-p)^n\left[\dfrac{1+pn}{p}\right]$$
Since the geometric decay of $(1-p)^n$ dominates the polynomial growth of $n$, the remaining limit is $0$, so
$$E[X]=\dfrac{1}{p}-0=\dfrac{1}{p}$$
Computing the variance by a similar infinite series is quite tedious, so here is the method given in *Introduction to Probability*; the conditional-expectation concepts it uses are introduced further below.
If the first trial succeeds, then
$$E[X|X=1]=1,\quad E[X^2|X=1]=1$$
If the first trial fails, one trial has been wasted and the process starts over:
$$E[X|X>1]=E[1+X]=1+E[X],\quad E[X^2|X>1]=E[(1+X)^2]=1+2E[X]+E[X^2]$$
Therefore
$$E[X]=P(X=1)E[X|X=1]+P(X>1)E[X|X>1]=p\cdot 1+(1-p)(1+E[X])$$
Solving gives $E[X]=\dfrac{1}{p}$. Likewise,
$$E[X^2]=P(X=1)E[X^2|X=1]+P(X>1)E[X^2|X>1]=p\cdot 1+(1-p)(1+2E[X]+E[X^2])$$
so $E[X^2]=\dfrac{1+2(1-p)E[X]}{p}=\dfrac{2}{p^2}-\dfrac{1}{p}$ and
$$D[X]=E[X^2]-E[X]^2=\dfrac{2}{p^2}-\dfrac{1}{p}-\dfrac{1}{p^2}=\dfrac{1-p}{p^2}$$
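Both moments can be approximated by truncating the series; with a few thousand terms the neglected tail is far below floating-point precision. A minimal sketch ($p$ and the truncation length are arbitrary example choices):

```python
# Approximate E[X] and D[X] of a Geometric(p) variable by truncating
# sum_{k>=1} k^m (1-p)^{k-1} p at N terms. p = 0.4, N = 2000 are examples.
p, N = 0.4, 2000
ks = range(1, N + 1)
pmf = [(1 - p)**(k - 1) * p for k in ks]

EX = sum(k * q for k, q in zip(ks, pmf))
EX2 = sum(k * k * q for k, q in zip(ks, pmf))
DX = EX2 - EX**2

print(EX, DX)  # should be close to 1/p = 2.5 and (1-p)/p^2 = 3.75
```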
3. The Poisson Distribution
Written $X\sim \pi(\lambda)$ or $X\sim P(\lambda)$, the distribution of $X$ is
$$P(X=k)=\dfrac{\lambda ^k}{k!}e^{-\lambda}\quad (k\ge 0,\ k\in \mathbb{Z})$$
$$E[X]=\sum\limits^{\infty}_{k=0}k\dfrac{\lambda ^k}{k!}e^{-\lambda}=0\cdot\dfrac{\lambda ^0}{0!}e^{-\lambda}+\lambda e^{-\lambda}\sum\limits^{\infty}_{k=1}\dfrac{\lambda ^{k-1}}{(k-1)!}=\lambda e^{-\lambda}e^{\lambda}=\lambda$$
Note
The Maclaurin series of $e^x$ is $e^x=1+x+\dfrac{x^2}{2!}+\cdots+\dfrac{x^n}{n!}+\cdots=\sum\limits^\infty_{n=0}\dfrac{x^n}{n!}$
Similarly, $E[X(X-1)]=\sum\limits^{\infty}_{k=0}k(k-1)\dfrac{\lambda ^k}{k!}e^{-\lambda}=\lambda^2$,
so $E[X^2]=E[X(X-1)+X]=E[X(X-1)]+E[X]=\lambda^2+\lambda$ and
$$D[X]=E[X^2]-E[X]^2=\lambda^2+\lambda-\lambda^2=\lambda$$
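The equality of mean and variance can be checked by truncating the pmf; the tail is negligible well before the truncation point. A minimal sketch ($\lambda$ and the truncation length are arbitrary example choices):

```python
from math import exp, factorial

# Approximate E[X] and D[X] of a Poisson(lam) variable by truncating its pmf.
# lam = 3.0 and N = 150 are example values (the tail beyond N is negligible).
lam, N = 3.0, 150
pmf = [lam**k / factorial(k) * exp(-lam) for k in range(N)]

EX = sum(k * q for k, q in enumerate(pmf))
EX2 = sum(k * k * q for k, q in enumerate(pmf))
DX = EX2 - EX**2

print(EX, DX)  # both should be close to lam = 3.0
```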
Continuous Random Variable Distributions
4. The Uniform Distribution
Consider a random variable taking values in an interval $[a,b]$. We assume that $X$ falls in any two subintervals of $[a,b]$ of equal length with equal probability. Such a random variable is said to be uniformly distributed.
Written $X\sim U(a,b)$, the probability density of $X$ is
$$f(x)=\begin{cases}\dfrac{1}{b-a} &x\in[a,b]\\0 &\text{otherwise}\end{cases}\quad (a,b\in \mathbb{R},\ a<b)$$
$$E[X]=\int\limits^{\infty}_{-\infty}xf(x)\,dx=\int\limits^b_a\dfrac{x}{b-a}\,dx=\dfrac{a+b}{2}$$
$$D[X]=E[X^2]-E[X]^2=\int\limits^{\infty}_{-\infty}x^2f(x)\,dx-\left(\dfrac{a+b}{2}\right)^2=\dfrac{(b-a)^2}{12}$$
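These two integrals can be checked numerically with a simple midpoint rule over $[a,b]$. A minimal sketch ($a$, $b$ and the grid size are arbitrary example choices):

```python
# Check E[X] = (a+b)/2 and D[X] = (b-a)^2/12 for U(a, b) via midpoint-rule
# integration of x f(x) and x^2 f(x). a = 2.0, b = 5.0 are example values.
a, b, n = 2.0, 5.0, 100_000
h = (b - a) / n
xs = [a + (i + 0.5) * h for i in range(n)]

EX = sum(x / (b - a) for x in xs) * h
EX2 = sum(x * x / (b - a) for x in xs) * h
DX = EX2 - EX**2

print(EX, DX)  # should be close to 3.5 and 0.75
```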
5. The Exponential Distribution
Written $X\sim E(\theta)$, the probability density of $X$ is
$$f(x)=\begin{cases}\dfrac{1}{\theta } e^{-x/\theta }&x> 0\\0 &x\le0\end{cases}\quad (\theta>0)$$
(This is one form; another common parameterization uses the rate $\lambda=1/\theta$:
$$f(x)=\begin{cases}\lambda e^{-\lambda x }&x> 0\\0 &x\le0\end{cases}$$
among others.)
Integrating by parts,
$$E[X]=\int\limits^{\infty}_{-\infty}xf(x)\,dx=\int\limits^{\infty}_{0}x\dfrac{1}{\theta } e^{-x/\theta }\,dx$$
$$=\Big[\dfrac{1}{\theta }x(-\theta)e^{-x/\theta }-\int \dfrac{1}{\theta } (-\theta)e^{-x/\theta }\,dx\Big]^{\infty}_{0}$$
$$=\Big[ -xe^{-x/\theta } -\theta e^{-x/\theta } \Big]^{\infty}_{0}=\theta$$
Similarly, $E[X^2]=\int\limits^{\infty}_{-\infty}x^2f(x)\,dx=2\theta^2$, so
$$D[X]=E[X^2]-E[X]^2=2\theta^2-\theta^2=\theta^2$$
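The moment integrals can be checked by midpoint-rule integration on a finite interval; the tail beyond a large multiple of $\theta$ is negligible. A minimal sketch ($\theta$, the cutoff, and the grid size are arbitrary example choices):

```python
from math import exp

# Check E[X] = theta and D[X] = theta^2 for the density (1/theta) e^{-x/theta}
# by midpoint-rule integration on [0, L]. theta = 2.0 is an example value.
theta, n = 2.0, 200_000
L = 40 * theta        # tail beyond L is negligible
h = L / n
xs = [(i + 0.5) * h for i in range(n)]

EX = sum(x * exp(-x / theta) / theta for x in xs) * h
EX2 = sum(x * x * exp(-x / theta) / theta for x in xs) * h
DX = EX2 - EX**2

print(EX, DX)  # should be close to theta = 2.0 and theta^2 = 4.0
```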
6. The Normal (Gaussian) Distribution
Written $X\sim N(\mu,\sigma^2)$, the probability density of $X$ is
$$f(x)=\dfrac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Let $Z=\dfrac{X-\mu}{\sigma}$; then $Z\sim N(0,1)$.
It is easy to see that $E[Z]=0$ and $D[Z]=1$.
Since $X=\mu+\sigma Z$,
$$E[X]=E[\mu+\sigma Z]=\mu$$
$$D[X]=D[\mu+\sigma Z]=\sigma^2$$
(In fact, working directly from the definition gives the same result.)
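The standardization argument above can be checked numerically: confirm $E[Z]=0$ and $D[Z]=1$ by integrating against the standard normal density, then apply $X=\mu+\sigma Z$. A minimal sketch ($\mu$, $\sigma$, and the integration grid are arbitrary example choices):

```python
from math import exp, pi, sqrt

# Numerically confirm E[Z] = 0, D[Z] = 1 via midpoint-rule integration of the
# standard normal density on [-L, L], then use X = mu + sigma*Z as in the text.
mu, sigma = 1.5, 2.0   # example values
n, L = 200_000, 10.0
h = 2 * L / n
zs = [-L + (i + 0.5) * h for i in range(n)]
phi = [exp(-z * z / 2) / sqrt(2 * pi) for z in zs]

EZ = sum(z * q for z, q in zip(zs, phi)) * h
EZ2 = sum(z * z * q for z, q in zip(zs, phi)) * h

EX = mu + sigma * EZ           # E[X] = mu + sigma*E[Z]
DX = sigma**2 * (EZ2 - EZ**2)  # D[X] = sigma^2 * D[Z]

print(EX, DX)  # should be close to mu = 1.5 and sigma^2 = 4.0
```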
Useful result
$$\dfrac{\overline X-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$$
(the proof is in the sample-mean section below)
Distributions of Sampling Statistics
7. Sampling Distributions (1): The Sample Mean
We assume $X_1,X_2,\cdots ,X_n$ are independent and identically distributed normal random variables (though in fact the particular distribution of the $X_i$ does not affect the expectation and variance of $\overline X$) with mean $\mu$ and variance $\sigma^2$. The average of these variables is called the sample mean $\overline X$.
Then $\overline X \sim N(\mu,\sigma^2/n)$. We usually define
$$\overline X=\dfrac{1}{n}\sum\limits_{i=1}^{n}X_i,\quad X_i\sim N(\mu,\sigma^2)$$
$$E[\overline X]=E\left[\dfrac{1}{n}\sum\limits_{i=1}^{n}X_i\right]=\dfrac{1}{n}\sum\limits_{i=1}^{n}E[X_i]=\dfrac{n\mu}{n}=\mu$$
$$D[\overline X]=D\left[\dfrac{1}{n}\sum\limits_{i=1}^{n}X_i\right]=\left(\dfrac{1}{n}\right)^2\sum\limits_{i=1}^{n}D[X_i]=\dfrac{n\sigma^2}{n^2}=\sigma^2/n$$
Note
We now prove that $\overline X \sim N(\mu,\sigma^2/n)$.
When $X\sim N(\mu_x,\sigma^2_x)$ and $Y\sim N(\mu_y,\sigma^2_y)$ are independent, $aX+bY$ (with coefficients $a,b$ not both zero) also follows a normal distribution: $aX+bY \sim N(a\mu_x+b\mu_y,\ a^2\sigma_x^2+b^2\sigma^2_y)$.
Therefore $\overline X=\dfrac{1}{n}\sum\limits_{i=1}^{n}X_i \sim N\left(\dfrac{1}{n}\sum\limits_{i=1}^{n}\mu_i,\ \dfrac{1}{n^2}\sum\limits_{i=1}^{n}\sigma^2_i\right)=N(\mu,\sigma^2/n)$
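The result $E[\overline X]=\mu$, $D[\overline X]=\sigma^2/n$ can be illustrated by simulation: draw many samples of size $n$ and look at the distribution of their means. A minimal sketch (all parameters and the seed are arbitrary example choices; tolerances are loose because the check is Monte Carlo):

```python
import random
import statistics

# Simulate many size-n samples from N(mu, sigma^2); the sample means should
# have mean ~ mu and variance ~ sigma^2/n.
random.seed(0)
mu, sigma, n, reps = 3.0, 2.0, 25, 20_000

means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(reps)]

print(statistics.fmean(means))     # should be close to mu = 3.0
print(statistics.variance(means))  # should be close to sigma^2/n = 0.16
```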
8. Sampling Distributions (2): The Sample Variance
Written $S^2$; we usually define
$$S^2=\dfrac{1}{n-1}\sum\limits_{i=1}^{n}(X_i-\overline X)^2,\quad X_i\sim N(\mu,\sigma^2)$$
Using the identity $\sum\limits_{i=1}^{n}(X_i-\overline X)^2=\left(\sum\limits_{i=1}^{n}X_i^2\right)-n\overline X^2$,
$$E[S^2]=E\left[\dfrac{1}{n-1}\left(\left(\sum\limits_{i=1}^{n}X_i^2\right)-n\overline X^2\right)\right]$$
$$=\dfrac{1}{n-1}\left[\left(\sum\limits_{i=1}^{n}E[X_i^2]\right)-nE[\overline X^2]\right]$$
$$=\dfrac{1}{n-1}\left[\left(\sum\limits_{i=1}^{n}(\sigma^2+\mu^2)\right)-n(\sigma^2/n+\mu^2)\right]=\sigma^2$$
In fact, as noted under the chi-square distribution below, $S^2$ satisfies $\dfrac{(n-1)S^2}{\sigma^2}=\sum\limits_{i=1}^{n}\left(\dfrac{X_i-\overline X}{\sigma}\right)^2 \sim \chi^2(n-1)$,
so $2(n-1)=D\left[\dfrac{(n-1)S^2}{\sigma^2}\right]=\dfrac{(n-1)^2}{\sigma^4}D[S^2]$, i.e.
$$D[S^2]=\dfrac{2\sigma^4}{n-1}$$
Looking at it the other way,
$$\dfrac{1}{n}E\left[\sum\limits_{i=1}^{n}(X_i-\overline X)^2\right]=\dfrac{n-1}{n}E\left[\dfrac{1}{n-1}\sum\limits_{i=1}^{n}(X_i-\overline X)^2\right]=\dfrac{n-1}{n}E[S^2]=\dfrac{n-1}{n}\sigma^2$$
so estimating the variance by the mean squared deviation (dividing by $n$ instead of $n-1$) gives a biased estimator.
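The bias can be made visible in simulation: the $n-1$ divisor averages to $\sigma^2$, while the $n$ divisor averages to $\dfrac{n-1}{n}\sigma^2$. A minimal sketch (parameters and the seed are arbitrary example choices; tolerances are loose because the check is Monte Carlo):

```python
import random
import statistics

# statistics.variance divides by n-1 (unbiased for sigma^2);
# statistics.pvariance divides by n (biased: mean ~ (n-1)/n * sigma^2).
random.seed(1)
mu, sigma, n, reps = 0.0, 2.0, 5, 40_000

s2, mse = [], []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    s2.append(statistics.variance(xs))    # divides by n-1
    mse.append(statistics.pvariance(xs))  # divides by n

print(statistics.fmean(s2))   # should be close to sigma^2 = 4.0
print(statistics.fmean(mse))  # should be close to (n-1)/n * 4.0 = 3.2
```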
9. Sampling Distributions (3): The Chi-Square Distribution
The chi-square distribution handles statistics involving sums of squares of random variables, such as the sample variance.
Written $\chi^2\sim \chi^2(n)$; $\chi^2$ is defined by
$$\chi^2(n)=\sum\limits^n_{i=1}X_i^2,\quad X_i\sim N(0,1)$$
$$E[X_i^2]=E[(X_i-0)^2]=D[X_i]=1$$
$$E[\chi^2]=E\left[\sum\limits^n_{i=1}X_i^2\right]=n$$
$$D[X_i^2]=E[X^4_i]-E[X_i^2]^2=3-1=2$$
$$D[\chi^2]=D\left[\sum\limits^n_{i=1}X_i^2\right]=2n$$
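The mean $n$ and variance $2n$ can be illustrated by simulating $\chi^2(n)$ directly as a sum of $n$ squared standard normals. A minimal sketch ($n$, the repetition count, and the seed are arbitrary example choices; tolerances are loose because the check is Monte Carlo):

```python
import random
import statistics

# Simulate chi^2(n) as a sum of n squared standard normals and check
# that E ~ n and D ~ 2n.
random.seed(2)
n, reps = 6, 50_000

chi2 = [sum(random.gauss(0, 1)**2 for _ in range(n)) for _ in range(reps)]

print(statistics.fmean(chi2))     # should be close to n = 6
print(statistics.variance(chi2))  # should be close to 2n = 12
```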
Note
For a variable $X\sim N(0,1)$ we have $E[X^4]=3$; this can be proved with $E[g(X)]=\int\limits^{\infty}_{-\infty}g(x)f(x)\,dx$, or see the explanation further below.
Useful results
If $X_i\sim N(\mu,\sigma^2)$, then
1. $\sum\limits^n_{i=1}\left(\dfrac{X_i-\mu}{\sigma}\right)^2 \sim \chi^2(n)$
2. $\dfrac{(n-1)S^2}{\sigma^2}=\sum\limits^n_{i=1}\left(\dfrac{X_i-\overline X}{\sigma}\right)^2 \sim \chi^2(n-1)$ (as in result 1, but with $\overline X$ substituted for the unknown $\mu$, which costs one degree of freedom)
3. $\overline X$ and $S^2$ are independent, and $\dfrac{\overline X-\mu}{S/\sqrt{n}} \sim t(n-1)$ (when standardizing $\overline X$ with $\sigma$ unknown, $S$ replaces $\sigma$; $t$ denotes Student's $t$-distribution)
Inferential Statistics
10. Statistics Involving Powers of a Normal Variable
Assume $X\sim N(\mu,\sigma^2)$.
As before, let $Z=\dfrac{X-\mu}{\sigma}$, so $Z\sim N(0,1)$.
$E[X^2]=D[X]+E[X]^2=\sigma^2+\mu^2$.
Now $E[Z^3]=\int\limits^\infty_{-\infty}z^3\varphi(z)\,dz$, where the standard normal density $\varphi(z)=\dfrac{1}{\sqrt{2\pi}}e^{-z^2/2}$ is an even function and $z^3$ is odd. First we check convergence on the right half-axis:
$$\int\limits^\infty_{0}z^3\varphi(z)\,dz=\int\limits^\infty_{0}z^3\dfrac{1}{\sqrt{2\pi}}e^{-z^2/2}\,dz$$
$$=\int\limits^\infty_{0} \dfrac{z^3}{2^{3/2}} \dfrac{2^{3/2}}{\sqrt{2\pi}} \sqrt{2}e^{-z^2/2}\,d\dfrac{z}{\sqrt{2}}$$
$$=\dfrac{2\sqrt{2}}{\sqrt{\pi}}\int\limits^\infty_{0}t^3e^{-t^2}\,dt\quad (t=z/\sqrt{2})$$
$$=\dfrac{2\sqrt{2}}{\sqrt{\pi}}\cdot\dfrac{1}{2}\Gamma(2)=\dfrac{\sqrt{2}}{\sqrt{\pi}}<\infty$$
Since this integral converges, the odd-function property gives $E[Z^3]=0$.
Similarly, $E[Z^4]=\dfrac{4}{\sqrt{\pi}}\Gamma\left(\dfrac{5}{2}\right)=3$, and so on.
Note
In fact, $\int\limits^\infty_{0}z^n\varphi(z)\,dz=\dfrac{\sqrt{2}^n}{2\sqrt{\pi}}\Gamma\left(\dfrac{n+1}{2}\right)\ (n\in \mathbb{Z}^+)$,
so
$$\int\limits^\infty_{-\infty}z^n\varphi(z)\,dz=\left\{ \begin{aligned} &\dfrac{\sqrt{2}^n}{\sqrt{\pi}}\Gamma\left(\dfrac{n+1}{2}\right) & & (n=2k) \\ &0 & & (n=2k+1) \\ \end{aligned} \right.\quad(k\in \mathbb{Z}^+)$$
Here $\Gamma(x)$ is the gamma function, which generalizes the factorial: $\Gamma(x+1)=x\Gamma(x)$, with $\Gamma(n+1)=n!$ for integer $n$. In particular, $\Gamma(1/2)=\sqrt{\pi}$.
On the other hand,
$$E[Z^3]=E\left[\left(\dfrac{X-\mu}{\sigma}\right)^3\right]=E\left[\dfrac{1}{\sigma^3}(X^3-3\mu X^2+3\mu^2X-\mu^3)\right]$$
$$=\dfrac{1}{\sigma^3}(E[X^3]-3\mu E[X^2]+3\mu^2E[X]-\mu^3)$$
$$=\dfrac{1}{\sigma^3}(E[X^3]-3\mu(\sigma^2+\mu^2)+3\mu^2\cdot\mu-\mu^3)$$
Setting this equal to $0$ gives
$$E[X^3]=3\mu\sigma^2+\mu^3$$
In this way the expectations of higher powers can be computed, and then by $D[X^n]=E[X^{2n}]-E[X^n]^2$ their variances. For example,
$$E[X^4]=3\sigma^4+6\mu^2\sigma^2+\mu^4$$
$$D[X^2]=E[X^4]-E[X^2]^2=2\sigma^4+4\mu^2\sigma^2$$
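The closed forms for $E[X^3]$ and $E[X^4]$ can be checked numerically by integrating $x^m f(x)$ against the $N(\mu,\sigma^2)$ density. A minimal sketch ($\mu$, $\sigma$, and the integration grid are arbitrary example choices):

```python
from math import exp, pi, sqrt

# Check E[X^3] = 3*mu*sigma^2 + mu^3 and E[X^4] = 3*sigma^4 + 6*mu^2*sigma^2
# + mu^4 by midpoint-rule integration against the N(mu, sigma^2) density.
mu, sigma = 1.0, 1.5   # example values
n, span = 200_000, 10.0
lo = mu - span * sigma
h = 2 * span * sigma / n
xs = [lo + (i + 0.5) * h for i in range(n)]
f = [exp(-(x - mu)**2 / (2 * sigma**2)) / (sqrt(2 * pi) * sigma) for x in xs]

EX3 = sum(x**3 * q for x, q in zip(xs, f)) * h
EX4 = sum(x**4 * q for x, q in zip(xs, f)) * h

print(EX3)  # should be close to 3*mu*sigma^2 + mu^3 = 7.75
print(EX4)  # should be close to 3*sigma^4 + 6*mu^2*sigma^2 + mu^4 = 29.6875
```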
11. Conditional Expectation (not covered on the graduate entrance exam)
Let $X,Y$ be two jointly distributed continuous random variables. Given $Y=y$, we have
$$E[X|Y=y]=\int\limits^\infty_{-\infty}xf_{X|Y}(x|y)\,dx$$
The expected value rule still holds:
$$E[g(X)|Y=y]=\int\limits^\infty_{-\infty}g(x)f_{X|Y}(x|y)\,dx$$
Law of total expectation
Let $A_1,\dots,A_n$ be $n$ mutually exclusive events with $P(A_i)>0$ that form a partition of the sample space. Then
$$E[X]=\sum\limits_{i=1}^n P(A_i)E[X|A_i]$$
$$E[X]=\int\limits^\infty_{-\infty}E[X|Y=y]f_{Y}(y)\,dy$$
Proof:
$$\int\limits^\infty_{-\infty}E[X|Y=y]f_{Y}(y)\,dy=\int\limits^\infty_{-\infty}\left[\int\limits^\infty_{-\infty}xf_{X|Y}(x|y)\,dx\right]f_{Y}(y)\,dy$$
$$=\int\limits^\infty_{-\infty}\int\limits^\infty_{-\infty}x\left[f_{X|Y}(x|y)f_{Y}(y)\right]dx\,dy=\int\limits^\infty_{-\infty}\int\limits^\infty_{-\infty}xf_{X,Y}(x,y)\,dx\,dy$$
$$=\int\limits^\infty_{-\infty}x\left[\int\limits^\infty_{-\infty}f_{X,Y}(x,y)\,dy\right]dx=\int\limits^\infty_{-\infty}xf_{X}(x)\,dx=E[X]$$
Looking at this the other way,
$$E[E[X|Y]]=\int\limits^\infty_{-\infty}E[X|Y=y]f_{Y}(y)\,dy$$
($E[X|Y]$ is a function of the random variable $Y$, hence itself a random variable.)
This yields a very important result:
Law of iterated expectations
$$E[E[X|Y]]=E[X]$$
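The law of iterated expectations can be verified on a small discrete example, where the integrals become sums. A minimal sketch (the joint table below is an arbitrary example):

```python
# Verify E[E[X|Y]] = E[X] on a toy discrete joint pmf over (x, y) pairs.
joint = {(0, 0): 0.1, (1, 0): 0.2, (0, 1): 0.3, (1, 1): 0.4}

# Direct expectation E[X]
EX = sum(x * p for (x, y), p in joint.items())

# Marginal of Y, then conditional expectations E[X|Y=y]
fY = {}
for (x, y), p in joint.items():
    fY[y] = fY.get(y, 0.0) + p
EX_given = {y: sum(x * p for (x, yy), p in joint.items() if yy == y) / fY[y]
            for y in fY}

# E[E[X|Y]] = sum_y P(Y=y) * E[X|Y=y]
EEX = sum(fY[y] * EX_given[y] for y in fY)

print(EX, EEX)  # the two values should agree
```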
12. Conditional Variance (not covered on the graduate entrance exam)
Continuing from the previous section: if we view $Y$ as an observation or a sample, we can regard the conditional expectation as an estimator of $X$ given $Y$, written $\hat{X} = E[X|Y]$.
We can then define the estimation error $\tilde{X}=\hat{X}-X$.
The estimation error is also a random variable, and it satisfies
$$E[\tilde{X}|Y]=E[(\hat{X}-X)|Y]=E[\hat{X}|Y]-E[X|Y]=\hat{X}-\hat{X}=0$$
Applying the law of iterated expectations also gives
$$E[\tilde{X}]=E[E[\tilde{X}|Y]]=0$$
That is, $\hat{X}$ is an unbiased estimator.
One can further show that $\tilde{X}$ and $\hat{X}$ are uncorrelated. Proof:
First, $E[\hat{X}\tilde{X}]=E[E[\hat{X}\tilde{X}|Y]]=E[\hat{X}E[\tilde{X}|Y]]=0$ (for a fixed value of $Y$, $\hat{X}$ is also fixed),
so $\mathrm{cov}(\hat{X},\tilde{X})=E[\hat{X}\tilde{X}]-E[\hat{X}]E[\tilde{X}]=0-E[X]\cdot 0=0$.
From $\mathrm{cov}(\hat{X},\tilde{X})=0$ it follows that
$$D[X]=D[\hat{X}-\tilde{X}]=D[\hat{X}]+D[\tilde{X}]$$
For the conditional variance we have
$$D[X|Y]=E[(X-E[X|Y])^2|Y]=E[\tilde{X}^2|Y]$$
i.e., given $Y=y$, the conditional variance of $X$ is
$$D[X|Y=y]=E[\tilde{X}^2|Y=y]$$
Using the result $E[\tilde{X}]=0$ and the law of iterated expectations,
$$D[\tilde{X}]=E[\tilde{X}^2]-0=E[E[\tilde{X}^2|Y]]=E[D[X|Y]]$$
So $D[X]=D[\hat{X}]+D[\tilde{X}]$ can be written in the following form:
Law of total variance
$$D[X]=E[D[X|Y]]+D[E[X|Y]]$$
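The law of total variance can likewise be verified on a small discrete example, with sums in place of integrals. A minimal sketch (the joint table below is an arbitrary example):

```python
# Verify D[X] = E[D[X|Y]] + D[E[X|Y]] on a toy discrete joint pmf.
joint = {(0, 0): 0.1, (1, 0): 0.2, (0, 1): 0.3, (2, 1): 0.4}

# Direct variance D[X] = E[X^2] - E[X]^2
EX = sum(x * p for (x, y), p in joint.items())
EX2 = sum(x * x * p for (x, y), p in joint.items())
DX = EX2 - EX**2

# Conditional mean and variance of X given Y = y
fY = {}
for (x, y), p in joint.items():
    fY[y] = fY.get(y, 0.0) + p
cond_mean, cond_var = {}, {}
for y, py in fY.items():
    m1 = sum(x * p for (x, yy), p in joint.items() if yy == y) / py
    m2 = sum(x * x * p for (x, yy), p in joint.items() if yy == y) / py
    cond_mean[y], cond_var[y] = m1, m2 - m1**2

E_condvar = sum(fY[y] * cond_var[y] for y in fY)  # E[D[X|Y]]
m = sum(fY[y] * cond_mean[y] for y in fY)
D_condmean = sum(fY[y] * (cond_mean[y] - m)**2 for y in fY)  # D[E[X|Y]]

print(DX, E_condvar + D_condmean)  # the two values should agree
```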