Gaussian Distribution
-
Basic concepts:
-
Likelihood: used to estimate the parameters that describe the nature of something, given the outcomes of some observations.
-
Maximum likelihood estimation: given a probability distribution $D$ with known probability density function $f_D$ and a distribution parameter $\theta$, we can draw a sample of $n$ values $X_1, X_2, \cdots, X_n$ from this distribution and use $f_D$ to compute its likelihood function:
$$
\mathrm{L}\left(\theta \mid x_1, \ldots, x_n\right)=f_\theta\left(x_1, \ldots, x_n\right)
$$
where $f_\theta$ is the joint density of the sampled values. Maximum likelihood estimation finds the $\hat\theta$ that maximizes $\mathrm{L}$.
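As a small illustration of the definition, here is a minimal Python sketch that maximizes a log-likelihood over a grid of candidate parameters. The exponential distribution, the helper names `log_likelihood` and `grid_mle`, and the grid bounds are assumptions chosen for this example; they do not come from the original text.

```python
import math
import random

def log_likelihood(theta, samples):
    """Log-likelihood of the rate theta for i.i.d. exponential samples."""
    # The joint density of independent samples is the product of
    # theta * exp(-theta * x), so its logarithm is n*ln(theta) - theta*sum(x).
    return len(samples) * math.log(theta) - theta * sum(samples)

def grid_mle(samples, lo, hi, steps=2001):
    """Return the grid point in [lo, hi] with the largest log-likelihood."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, samples))

random.seed(0)
samples = [random.expovariate(2.0) for _ in range(500)]

theta_hat = grid_mle(samples, 0.01, 5.0)
closed_form = len(samples) / sum(samples)  # known MLE for the exponential rate
print(theta_hat, closed_form)
```

The grid search is deliberately naive: it only shows that "find the $\hat\theta$ with the largest $\mathrm{L}$" can be carried out directly, and that the result agrees with the closed-form estimate up to the grid spacing.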
-
-
Derivation:
Let the error density function (the probability of an error) be $f$, let $x_1, x_2, \cdots, x_n$ be $n$ independent measurements, and let the true value be $x$.
The errors of the $n$ measurements are $x_1-x, x_2-x, \cdots, x_n-x$, with probability densities $f(x_1-x), f(x_2-x), \cdots, f(x_n-x)$.
The likelihood function is:
$$
L(x) = f(x_1-x)\cdot f(x_2-x) \cdots f(x_n-x)
$$
Apply maximum likelihood estimation and take the logarithm of the likelihood function:
$$
\ln L(x) = \sum_{i=1}^{n} \ln f(x_i-x)
$$
Differentiate $\ln L(x)$ with respect to $x$ (chain rule for a composite function). At the maximum of the likelihood the derivative must be zero, so
$$
\frac{\mathrm d \ln L(x)}{\mathrm d x} = -\sum_{i=1}^n \frac{f'(x_i-x)}{f(x_i-x)} = 0
$$
Writing $g(x) = f'(x)/f(x)$, the condition on the derivative becomes:
$$
\sum_{i=1}^n g(x_i-x) = 0
$$
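The condition $\sum_i g(x_i - x) = 0$ can be solved numerically for any candidate error density. The sketch below does this by bisection for a logistic error density, for which $g(u) = -\tanh(u/2)$. The logistic choice, the sample data, and the function names are assumptions made for this illustration, not part of the original derivation. It shows that for a general $f$ the solution need not equal the arithmetic mean.

```python
import math

def g(u):
    """g = f'/f for the logistic density f(u) = e^{-u} / (1 + e^{-u})^2."""
    return -math.tanh(u / 2.0)

def score(x, data):
    """Left-hand side of the condition sum_i g(x_i - x) = 0."""
    return sum(g(xi - x) for xi in data)

def solve_score(data, iterations=200):
    """Bisection on [min(data), max(data)]; score is increasing in x here."""
    lo, hi = min(data), max(data)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid, data) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

data = [0.0, 0.1, 0.2, 5.0]
root = solve_score(data)
mean = sum(data) / len(data)
print(root, mean)
```

For this data the root lies well below the mean, because the logistic density penalizes the outlier at 5.0 less heavily. Requiring that the root always equals the mean is a strong constraint on $f$.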
We take the estimate of the true value $x$ to be the arithmetic mean $\bar x = \frac{1}{n}\sum_{i=1}^n x_i$ (this is taken as given), and differentiate the condition above with respect to each of $x_1, x_2, \cdots, x_n$, treating $\bar x$ as a function of $x_1, x_2, \cdots, x_n$:
$$
\begin{aligned}
&g^{\prime}\left(x_1-\bar{x}\right)\left(1-\frac{1}{n}\right)+g^{\prime}\left(x_2-\bar{x}\right)\left(-\frac{1}{n}\right)+\cdots+g^{\prime}\left(x_n-\bar{x}\right)\left(-\frac{1}{n}\right)=0 \\
&g^{\prime}\left(x_1-\bar{x}\right)\left(-\frac{1}{n}\right)+g^{\prime}\left(x_2-\bar{x}\right)\left(1-\frac{1}{n}\right)+\cdots+g^{\prime}\left(x_n-\bar{x}\right)\left(-\frac{1}{n}\right)=0 \\
&\qquad\vdots \\
&g^{\prime}\left(x_1-\bar{x}\right)\left(-\frac{1}{n}\right)+g^{\prime}\left(x_2-\bar{x}\right)\left(-\frac{1}{n}\right)+\cdots+g^{\prime}\left(x_n-\bar{x}\right)\left(1-\frac{1}{n}\right)=0 .
\end{aligned}
$$
The coefficient matrix
$$
\left(\begin{array}{cccc}
1-\frac{1}{n} & -\frac{1}{n} & \cdots & -\frac{1}{n} \\
-\frac{1}{n} & 1-\frac{1}{n} & \cdots & -\frac{1}{n} \\
\vdots & \vdots & & \vdots \\
-\frac{1}{n} & -\frac{1}{n} & \cdots & 1-\frac{1}{n}
\end{array}\right)
$$
has rank $n-1$, so the solutions of the system form a one-dimensional space. Writing $\mathbf X = \left(g'(x_1-\bar x), g'(x_2-\bar x), \cdots, g'(x_n-\bar x)\right)^T$, the solution is:
$$
\mathbf X = C(1,1,\cdots,1)^T
$$
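One way to see why the solution space is spanned by the all-ones vector is the short worked step below. The symbols $A$ (the coefficient matrix), $I$ (the identity), and $\mathbf 1$ (the all-ones column vector) are introduced here for illustration only.

$$
\begin{aligned}
A &= I - \frac{1}{n}\mathbf 1 \mathbf 1^T, \\
A\mathbf 1 &= \mathbf 1 - \frac{1}{n}\mathbf 1\left(\mathbf 1^T \mathbf 1\right) = \mathbf 1 - \mathbf 1 = \mathbf 0, \\
A v &= v - \frac{1}{n}\mathbf 1\left(\mathbf 1^T v\right) = v \quad \text{whenever } \mathbf 1^T v = 0 .
\end{aligned}
$$

So $A$ annihilates $\mathbf 1$ and acts as the identity on the $(n-1)$-dimensional subspace orthogonal to $\mathbf 1$. Its rank is therefore $n-1$, and its null space is exactly the multiples of $(1,1,\cdots,1)^T$.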
Therefore:
$$
g^{\prime}\left(x_1-\bar{x}\right)=g^{\prime}\left(x_2-\bar{x}\right)=\cdots=g^{\prime}\left(x_n-\bar{x}\right)=c
$$
Since $g'(x)$ is constant, $g(x)$ is a linear function. Using undetermined coefficients, let $g(x)=cx+b$; then
$$
\sum_{i=1}^n g\left(x_i-\bar{x}\right)=\sum_{i=1}^n c\left(x_i-\bar{x}\right)+n b=0
$$
Since $\sum_{i=1}^n (x_i-\bar x) = 0$, it follows that $b=0$, and therefore:
$$
g(x) = cx
$$
Substituting back to recover $f$:
$$
f'(x) / f(x) = cx
$$
This is a first-order differential equation, whose solution is
$$
f(x) = K \exp\left(\frac{1}{2} c x^2\right)
$$
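The solution step can be spelled out as follows. The integration constant $C_0$ is a name introduced here for illustration.

$$
\begin{aligned}
\frac{f'(x)}{f(x)} = \frac{\mathrm d}{\mathrm d x}\ln f(x) = cx
\quad&\Longrightarrow\quad \ln f(x) = \frac{1}{2} c x^2 + C_0 \\
&\Longrightarrow\quad f(x) = e^{C_0} \exp\left(\frac{1}{2} c x^2\right) = K \exp\left(\frac{1}{2} c x^2\right), \qquad K = e^{C_0} > 0 .
\end{aligned}
$$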
We know that the density function $f(x)$ integrates to 1:
$$
\int_{-\infty}^{\infty} f(x) \,\mathrm d x = 1
$$
From this integral we can determine $K$ and $c$. For the integral to converge we need $c < 0$, so write $c = -1/\sigma^2$.
To make the integral equal to 1, use
$$
\int_{-\infty}^{\infty} e^{-t^2} \,\mathrm d t = \sqrt{\pi}
$$
Substituting $t = x/(\sqrt{2}\sigma)$ gives $K\sqrt{2\pi}\,\sigma = 1$, that is, $K = 1/(\sqrt{2\pi}\,\sigma)$, so
$$
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right)
$$
which is the density of $N(0,\sigma^2)$.
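As a sanity check on the constants, the following Python sketch integrates the resulting density numerically with a plain trapezoid rule. The value $\sigma = 1.5$, the integration range of $\pm 10\sigma$, and the helper names are assumptions chosen for this example.

```python
import math

def gaussian_pdf(x, sigma):
    """Density of N(0, sigma^2) as derived above."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)

def trapezoid(func, a, b, steps):
    """Composite trapezoid rule for func on [a, b] with `steps` intervals."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h

sigma = 1.5
lo, hi = -10.0 * sigma, 10.0 * sigma

# Total probability: should be close to 1 if K is correct.
area = trapezoid(lambda x: gaussian_pdf(x, sigma), lo, hi, 20000)
# Second moment: should be close to sigma^2, matching c = -1/sigma^2.
second_moment = trapezoid(lambda x: x * x * gaussian_pdf(x, sigma), lo, hi, 20000)
print(area, second_moment)
```

The first integral checks the normalization constant $K$. The second checks that $\sigma^2$ really is the variance of the error.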