[Probability Theory] 5-6: The Normal Distributions, Part III


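Abstract: This post covers the third part of the normal distribution: the standard normal distribution, comparisons of normal distributions, linear combinations of normally distributed variables, and the lognormal distribution.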
Keywords: The Normal Distributions,The Standard Normal Distribution



The Standard Normal Distribution

Definition Standard Normal Distribution. The normal distribution with mean 0 and variance 1 is called the standard normal distribution. The p.d.f. of the standard normal distribution is usually denoted by the symbol $\phi$, and the c.d.f. is denoted by the symbol $\Phi$. Thus,
$$\phi(x)=f(x|0,1)=\frac{1}{(2\pi)^{1/2}}e^{-\frac{1}{2}x^2} \text{ for }-\infty<x<\infty$$
$$\Phi(x)=\int^{x}_{-\infty}\phi(u)\,du \text{ for }-\infty<x<\infty$$

In the second formula, $u$ is a dummy variable of integration. By the fundamental theorem of calculus, the derivative of the c.d.f. written above is the p.d.f.
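
The relationship between the two functions can be checked numerically. The sketch below is only an illustration: the helper names `phi`, `Phi`, and `numerical_derivative` are our own, and writing $\Phi$ through the error function `math.erf` is a standard identity rather than something taken from the text.

```python
import math

def phi(x: float) -> float:
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x: float) -> float:
    """Standard normal c.d.f., expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def numerical_derivative(f, x: float, h: float = 1e-5) -> float:
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# The derivative of the c.d.f. should agree with the p.d.f. at every point.
for x in (-1.5, 0.0, 0.7, 2.0):
    print(x, phi(x), numerical_derivative(Phi, x))
```

The two printed columns should agree to several decimal places, which is the fundamental theorem of calculus at work.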

Theorem Consequences of Symmetry. For all $x$ and all $0<p<1$,
$$\Phi(-x)=1-\Phi(x) \text{ and } \Phi^{-1}(p)=-\Phi^{-1}(1-p)$$

The proof is fairly simple and mainly relies on the shape of the normal distribution discussed in the previous post. The fundamental property of the normal p.d.f. is that it is symmetric about the mean, and the theorem follows from this symmetry. For the standard normal distribution, symmetry means $Pr(X\leq -x)=Pr(X\geq x)$, which is exactly $\Phi(-x)=1-\Phi(x)$. The second identity restates the first in terms of the inverse c.d.f.: let $p=\Phi(x)$, so that $x=\Phi^{-1}(p)$. Then $\Phi(-x)=1-p$, so $-x=\Phi^{-1}(1-p)$, that is, $\Phi^{-1}(p)=-\Phi^{-1}(1-p)$.
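
Both identities are easy to check numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+); the function name `symmetry_gaps` and the test points are our own choices.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def symmetry_gaps(x: float, p: float):
    """Return the residuals of the two symmetry identities (both should be ~0)."""
    cdf_gap = Z.cdf(-x) - (1.0 - Z.cdf(x))             # Phi(-x) - [1 - Phi(x)]
    quantile_gap = Z.inv_cdf(p) + Z.inv_cdf(1.0 - p)   # Phi^{-1}(p) + Phi^{-1}(1-p)
    return cdf_gap, quantile_gap

for x, p in [(0.5, 0.1), (1.0, 0.3), (2.0, 0.975)]:
    print(x, p, symmetry_gaps(x, p))
```

The residuals are zero up to floating-point rounding.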

Theorem Converting Normal Distributions to Standard. Let $X$ have the normal distribution with mean $\mu$ and variance $\sigma^2$. Let $F$ be the c.d.f. of $X$. Then $Z=(X-\mu)/\sigma$ has the standard normal distribution, and, for all $x$ and all $0<p<1$,
$$\begin{aligned} F(x)&=\Phi\left(\frac{x-\mu}{\sigma}\right)\\ F^{-1}(p)&=\mu+\sigma\Phi^{-1}(p) \end{aligned}$$

$$Pr(X\leq x)=Pr\left(Z\leq \frac{x-\mu}{\sigma}\right)$$
This gives the first conclusion, $F(x)=\Phi(\frac{x-\mu}{\sigma})$. Setting $p=F(x)$ and solving $p=\Phi(\frac{x-\mu}{\sigma})$ for $x$ gives the second, $F^{-1}(p)=\mu+\sigma\Phi^{-1}(p)$.
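
As a sketch of how the theorem is used in practice, the code below computes $F$ and $F^{-1}$ for a general normal distribution using only the standard normal c.d.f. and quantile function. The names `normal_cdf` and `normal_quantile` are hypothetical helpers, and the parameter values are arbitrary; `statistics.NormalDist` is from the Python standard library.

```python
from statistics import NormalDist

STANDARD = NormalDist()  # plays the role of Phi and Phi^{-1}

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """F(x) = Phi((x - mu) / sigma)."""
    return STANDARD.cdf((x - mu) / sigma)

def normal_quantile(p: float, mu: float, sigma: float) -> float:
    """F^{-1}(p) = mu + sigma * Phi^{-1}(p)."""
    return mu + sigma * STANDARD.inv_cdf(p)

# Compare with the library's direct computation for mean 5, standard deviation 2.
direct = NormalDist(mu=5.0, sigma=2.0)
print(normal_cdf(6.0, 5.0, 2.0), direct.cdf(6.0))
print(normal_quantile(0.9, 5.0, 2.0), direct.inv_cdf(0.9))
```

Each pair of printed values should agree up to rounding, since the library's result and the standardized formula describe the same distribution.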

Let us work through an example of computing a probability for a normal distribution. Suppose $X$ has the normal distribution with mean 5 and standard deviation 2, and we want $Pr(1<X<8)$.
If we let $Z=(X-5)/2$, then $Z$ has the standard normal distribution and
$$Pr(1<X<8)=Pr\left(\frac{1-5}{2}<\frac{X-5}{2}<\frac{8-5}{2}\right)=Pr(-2<Z<1.5)$$
Furthermore,
$$\begin{aligned} Pr(-2<Z<1.5)&=Pr(Z<1.5)-Pr(Z\leq -2)\\ &=\Phi(1.5)-\Phi(-2)\\ &=\Phi(1.5)-[1-\Phi(2)] \end{aligned}$$
From the table of the standard normal c.d.f. at the back of the book, $\Phi(1.5)=0.9332$ and $\Phi(2)=0.9773$, so
$$Pr(1<X<8)=0.9332-(1-0.9773)=0.9105$$
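
The same calculation can be reproduced without a table. This is a small sketch, taking the standard deviation to be 2 as the standardization $Z=(X-5)/2$ implies; the helper `Phi` is our own, built on the standard identity between the normal c.d.f. and `math.erf`.

```python
import math

def Phi(z: float) -> float:
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 5.0, 2.0
lo, hi = 1.0, 8.0

# Standardize both endpoints, then take the difference of c.d.f. values.
z_lo = (lo - mu) / sigma   # -2.0
z_hi = (hi - mu) / sigma   #  1.5
prob = Phi(z_hi) - Phi(z_lo)
print(f"Pr({lo} < X < {hi}) = {prob:.4f}")
```

The result is about 0.910, matching the table-based value up to the rounding of the table entries.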


Comparisons of Normal Distributions

There is a very important property here. Take the mean as the reference point and measure distance in units of the standard deviation. The probability of falling within $k$ standard deviations of the mean is then the same for every normal distribution. That is, for two normal random variables $X$ and $Y$ with p.d.f.s $f_1(x|\mu_X,\sigma_X^2)$ and $f_2(y|\mu_Y,\sigma_Y^2)$, and any positive number $k$,
$$Pr(\mu_X-k\sigma_X<X<\mu_X+k\sigma_X)=Pr(\mu_Y-k\sigma_Y<Y<\mu_Y+k\sigma_Y)$$

This common value depends only on $k$:
$$p_k=Pr(|X-\mu|\leq k\sigma)=Pr(|Z|\leq k)$$


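In the table, $k$ is the number of standard deviations. We can see that for $k=3$ we already have $Pr(\mu-3\sigma<X<\mu+3\sigma)>0.99$. I also recall a "$3\sigma$ rule" from my university textbook: roughly, a measurement counts as acceptable when it falls within this range. We will come back to the details in the mathematical statistics part. For now, what we need to know is the shape of the normal distribution and this property of deviations from the mean.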
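
The values of $p_k$ can be computed directly, since $p_k=Pr(|Z|\leq k)=2\Phi(k)-1$. The sketch below uses the equivalent form $\operatorname{erf}(k/\sqrt{2})$; the function name `within_k_sigma` and the range of $k$ are our own choices, not a reproduction of the book's table.

```python
import math

def within_k_sigma(k: float) -> float:
    """p_k = Pr(|Z| <= k) = 2*Phi(k) - 1 = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2.0))

# Probability of landing within k standard deviations of the mean.
table = {k: within_k_sigma(k) for k in range(1, 6)}
for k, p in table.items():
    print(f"k = {k}: p_k = {p:.6f}")
```

Because $p_k$ does not involve $\mu$ or $\sigma$, one such table serves every normal distribution.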

Linear Combinations of Normally Distributed Variables


Theorem If the random variables $X_1,\dots,X_k$ are independent and if $X_i$ has the normal distribution with mean $\mu_i$ and variance $\sigma^2_i$ $(i=1,\dots,k)$, then the sum $X_1+\dots+X_k$ has the normal distribution with mean $\mu_1+\dots+\mu_k$ and variance $\sigma^2_1+\dots+\sigma^2_k$.


Let $\Psi_i(t)$ denote the m.g.f. of $X_i$ $(i=1,\dots,k)$, and let $\Psi(t)$ denote the m.g.f. of $X_1+\dots+X_k$. Because the $X_i$ are independent, the m.g.f. of the sum is the product of the individual m.g.f.s:
$$\begin{aligned} \Psi(t)&=\prod^{k}_{i=1}\Psi_i(t)=\prod^k_{i=1}\exp\left[\mu_it+\frac{1}{2}\sigma^2_it^2\right]\\ &=\exp\left[\left(\sum^k_{i=1}\mu_i\right)t+\frac{1}{2}\left(\sum^{k}_{i=1}\sigma^2_i\right)t^2\right] \end{aligned}$$
This is the m.g.f. of the normal distribution with mean $\sum^k_{i=1}\mu_i$ and variance $\sum^k_{i=1}\sigma^2_i$, which proves the theorem.
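
A quick simulation can make the theorem concrete. The sketch below draws independent normal variables, adds them, and compares the sample mean and variance of the sum with the theoretical values. The parameters, the seed, and the function name `simulate_sum` are arbitrary choices of ours. Note that this only checks the first two moments; it does not by itself verify that the sum is normal.

```python
import random

def simulate_sum(mus, sigmas, n_draws=100_000, seed=0):
    """Sample X_1 + ... + X_k with independent X_i ~ N(mu_i, sigma_i^2)."""
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(m, s) for m, s in zip(mus, sigmas))
        for _ in range(n_draws)
    ]
    mean = sum(totals) / n_draws
    var = sum((t - mean) ** 2 for t in totals) / n_draws
    return mean, var

mus = [1.0, -2.0, 0.5]
sigmas = [1.0, 2.0, 3.0]   # standard deviations, so the variances are 1, 4, 9

sample_mean, sample_var = simulate_sum(mus, sigmas)
print("mean:    ", sample_mean, "theory:", sum(mus))
print("variance:", sample_var, "theory:", sum(s * s for s in sigmas))
```

With this many draws the simulated values should land close to the theoretical mean $-0.5$ and variance $14$.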

Corollary If the random variables $X_1,\dots,X_k$ are independent, if $X_i$ has the normal distribution with mean $\mu_i$ and variance $\sigma^2_i$ $(i=1,\dots,k)$, and if $a_1,\dots,a_k$ and $b$ are constants for which at least one of the values $a_1,\dots,a_k$ is different from 0, then the variable $a_1X_1+\dots+a_kX_k+b$ has the normal distribution with mean $a_1\mu_1+\dots+a_k\mu_k+b$ and variance $a_1^2\sigma_1^2+\dots+a^2_k\sigma_k^2$.
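
The corollary translates directly into a small helper. This is a sketch under our own naming (`linear_combination` is hypothetical) with made-up example numbers; it returns a `statistics.NormalDist` from the standard library so that probabilities can be read off immediately.

```python
from statistics import NormalDist

def linear_combination(a, mus, variances, b=0.0) -> NormalDist:
    """Distribution of a_1*X_1 + ... + a_k*X_k + b for independent normal X_i."""
    if not any(a):
        raise ValueError("at least one coefficient a_i must be nonzero")
    mean = sum(ai * mi for ai, mi in zip(a, mus)) + b
    variance = sum(ai * ai * vi for ai, vi in zip(a, variances))
    return NormalDist(mu=mean, sigma=variance ** 0.5)

# Example: X1 ~ N(1, 4), X2 ~ N(2, 9), Y = 2*X1 - X2 + 3.
Y = linear_combination([2.0, -1.0], [1.0, 2.0], [4.0, 9.0], b=3.0)
print(Y.mean, Y.variance)   # mean 2*1 - 2 + 3 = 3, variance 4*4 + 1*9 = 25
print(Y.cdf(8.0))           # Pr(Y <= 8) = Phi((8 - 3) / 5) = Phi(1)
```

The constant $b$ shifts the mean only, while each coefficient enters the variance squared, so the sign of $a_i$ does not matter there.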


Definition Sample Mean. Let $X_1,\dots,X_n$ be random variables. The average of these $n$ random variables, $\frac{1}{n}\sum^{n}_{i=1}X_i$, is called their sample mean and is commonly denoted $\bar{X}_n$.

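Given $n$ random variables, their average is called the sample mean and is written $\bar{X}_n$. Note that the definition says nothing about the distributions of the $X_i$ or whether they are independent, so it applies to arbitrary distributions.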

Corollary Suppose that the random variables $X_1,\dots,X_n$ form a random sample from the normal distribution with mean $\mu$ and variance $\sigma^2$, and let $\bar{X}_n$ denote their sample mean. Then $\bar{X}_n$ has the normal distribution with mean $\mu$ and variance $\sigma^2/n$.

$$\bar{X}_n =\sum^{n}_{i=1}(1/n)X_i$$
This is a linear combination with $a_i=1/n$ and $b=0$. By assumption every $X_i$ has the same mean $\mu$ and variance $\sigma^2$, so the mean is unchanged, $\sum^n_{i=1}\frac{1}{n}\mu=\mu$, and the variance is $\sum^{n}_{i=1}\frac{1}{n^2}\sigma^2=\sigma^2/n$.
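
To see what the $\sigma^2/n$ variance means in practice, the sketch below computes the probability that the sample mean lies within 1 of $\mu$ for several sample sizes. The parameters $\mu=5$, $\sigma=2$, the half-width 1, and the helper name `sample_mean_dist` are illustrative assumptions of ours.

```python
import math
from statistics import NormalDist

def sample_mean_dist(mu: float, sigma: float, n: int) -> NormalDist:
    """Distribution of the sample mean of n i.i.d. N(mu, sigma^2) variables."""
    return NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

mu, sigma = 5.0, 2.0
for n in (1, 4, 16, 64):
    d = sample_mean_dist(mu, sigma, n)
    p = d.cdf(mu + 1.0) - d.cdf(mu - 1.0)   # Pr(|sample mean - mu| < 1)
    print(f"n = {n:2d}: variance = {d.variance:.4f}, Pr = {p:.4f}")
```

As $n$ grows the variance shrinks like $1/n$, so the sample mean concentrates more and more tightly around $\mu$.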

The Lognormal Distributions


Definition Lognormal Distribution. If $\log(X)$ has the normal distribution with mean $\mu$ and variance $\sigma^2$, we say that $X$ has the lognormal distribution with parameters $\mu$ and $\sigma^2$.
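
One direct consequence of the definition is that probabilities for a lognormal $X$ reduce to normal probabilities for $\log(X)$: for $x>0$, $Pr(X\leq x)=Pr(\log X\leq \log x)=\Phi\left(\frac{\log x-\mu}{\sigma}\right)$. The sketch below applies this; the function names and the parameter values are our own, and `log` is taken to be the natural logarithm.

```python
import math
from statistics import NormalDist

STANDARD = NormalDist()

def lognormal_cdf(x: float, mu: float, sigma: float) -> float:
    """Pr(X <= x) when log(X) ~ N(mu, sigma^2)."""
    if x <= 0.0:
        return 0.0  # a lognormal variable takes only positive values
    return STANDARD.cdf((math.log(x) - mu) / sigma)

def lognormal_quantile(p: float, mu: float, sigma: float) -> float:
    """Inverse c.d.f.: exp of the corresponding normal quantile."""
    return math.exp(mu + sigma * STANDARD.inv_cdf(p))

mu, sigma = 1.0, 0.5
print(lognormal_cdf(math.exp(mu), mu, sigma))   # the median of X is exp(mu)
print(lognormal_quantile(0.5, mu, sigma), math.exp(mu))
```

Note that the parameters $\mu$ and $\sigma^2$ are the mean and variance of $\log(X)$, not of $X$ itself.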








