A Beginner's Understanding of Least Squares Estimation, Maximum Likelihood Estimation, and Maximum A Posteriori Estimation

Chauncy__xu

Category: Fundamentals. Tags: probability theory, machine learning, artificial intelligence

Article link: https://blog.csdn.net/qq_44384577/article/details/105759145

These three estimation methods are closely related.

Least Squares Estimation (LLS)

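First, assume the fitted line is $y = ax + b$. The objective function can then be written as $\chi^2=\sum_{i=1}^{n}(y_i-y(x_i))^2$. Least squares estimation is fairly easy to follow: substitute the data into this formula, set the partial derivatives with respect to $a$ and $b$ to zero, and solve for $a$ and $b$.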
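The procedure just described can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `fit_line` and `chi2` and the sample data are made up here, and the closed-form expressions are what you get when both partial derivatives of $\chi^2$ are set to zero.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing sum((y_i - (a*x_i + b))**2)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Solving d(chi^2)/da = 0 and d(chi^2)/db = 0 yields these closed forms.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = y_mean - a * x_mean
    return a, b


def chi2(a, b, xs, ys):
    """Sum of squared residuals for the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

a, b = fit_line(xs, ys)
print("a =", a, "b =", b, "chi2 =", chi2(a, b, xs, ys))
```

Any other choice of `a` and `b` should give a `chi2` value at least as large as the fitted one, which is an easy way to sanity-check the result.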
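**Improved least squares estimation (MLS):** weight each data point to improve the estimate and make it more reliable. Concretely, add the weight coefficient $w_i=1/\sigma_i^2$ to the formula above, where $\sigma_i$ is the standard deviation (uncertainty) of $y_i$, so that noisier points count less. The objective then becomes $\chi^2=\sum_{i=1}^{n}w_i(y_i-y(x_i))^2$. This approach is commonly known as weighted least squares.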
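As a rough sketch of how the weights change the fit, the example below minimizes the weighted objective in closed form. The function name `fit_line_weighted`, the data, and the per-point uncertainties are all assumptions chosen for illustration; the last point is a deliberate outlier with a large $\sigma_i$.

```python
def fit_line_weighted(xs, ys, sigmas):
    """Return (a, b) minimizing sum(w_i * (y_i - (a*x_i + b))**2), w_i = 1/sigma_i**2."""
    ws = [1.0 / s ** 2 for s in sigmas]
    w_sum = sum(ws)
    # Weighted means replace the plain means of the unweighted case.
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b


# Hypothetical data: the first three points lie on y = 2x + 1,
# the fourth is an outlier that we trust far less (large sigma).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 20.0]
sigmas = [0.1, 0.1, 0.1, 10.0]

a_w, b_w = fit_line_weighted(xs, ys, sigmas)
a_u, b_u = fit_line_weighted(xs, ys, [1.0] * len(xs))  # equal weights = ordinary LS
print("weighted:  ", a_w, b_w)
print("unweighted:", a_u, b_u)
```

With equal weights the outlier drags the line noticeably upward, while the weighted fit stays close to $y = 2x + 1$ because the outlier's weight is tiny.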

Maximum Likelihood Estimation (MLE)

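The principle behind maximum likelihood is: $L(\text{parameter} \mid \text{data}) = p(\text{data} \mid \text{parameter})$
In other words, the probability of the data is determined by the parameters, and conversely the parameters can be inferred from that probability. To mark the distinction, the same quantity viewed as a function of the parameters is called the likelihood, and estimating the parameters from it is called likelihood estimation.
Maximum Likelihood Estimation seeks the solution that “best” explains the observed data set: $\theta^{ML}=\arg\max_\theta P(X\mid\theta)=\arg\max_\theta \log P(X\mid\theta)$. The two maximizers coincide because the logarithm is monotonically increasing.
Here is an example: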
Example: Coin flipping

Suppose we have been given data from a series of m coin flips, and we are not sure if the coin is fair or not.
We might assume that the data were generated by a sequence of independent draws from a Bernoulli distribution, parameterized by $\theta$ , which is the probability of flipping Heads.
But what's the value of $\theta$? That is, which Bernoulli distribution generated these data?
We could estimate $\theta$ as the proportion of the flips that are Heads, and this turns out to be exactly the maximum likelihood estimate. Let $y_i=1$ if flip $i$ was Heads, and $y_i=0$ otherwise. Let $m_H=\sum_{i=1}^m y_i$ be the number of heads in $m$ tosses. Then the likelihood model is $p(y\mid\theta)=\theta^{m_H}(1-\theta)^{m-m_H}$. Taking the logarithm gives $m_H\log\theta+(m-m_H)\log(1-\theta)$, and setting its derivative with respect to $\theta$ to zero yields $\hat\theta=m_H/m$.
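A quick numerical check of the coin example: the sketch below evaluates the Bernoulli log-likelihood on a grid of candidate $\theta$ values and compares the best grid point with the proportion of heads. The flip sequence and the helper name `log_likelihood` are invented for illustration, and the grid search is only a stand-in for a proper optimizer.

```python
import math


def log_likelihood(theta, m_heads, m):
    """Log of theta**m_heads * (1 - theta)**(m - m_heads)."""
    return m_heads * math.log(theta) + (m - m_heads) * math.log(1.0 - theta)


# Hypothetical observations: 1 = Heads, 0 = Tails.
flips = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
m = len(flips)
m_heads = sum(flips)

# Candidate values strictly inside (0, 1) so that both logs are defined.
grid = [k / 1000 for k in range(1, 1000)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, m_heads, m))
theta_closed = m_heads / m

print("grid maximizer:     ", theta_grid)
print("proportion of heads:", theta_closed)
```

The two numbers should agree up to the grid resolution, which is consistent with the proportion of heads being the maximizer of the likelihood.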

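A short aside on the law of large numbers and the central limit theorem:
Law of large numbers: when a random event is repeated a large number of times, an almost certain regularity tends to emerge. If the experiment is repeated many times under unchanged conditions, the frequency with which an event occurs approximates its probability. This is a cornerstone of probability theory.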
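The law of large numbers is easy to see in a small simulation. The sketch below flips a biased coin with an assumed heads probability of 0.3 and prints the observed frequency for increasing numbers of flips; the function name `heads_frequency` and the seed are arbitrary choices made for this illustration.

```python
import random


def heads_frequency(n_flips, p_heads, rng):
    """Fraction of heads in n_flips independent Bernoulli(p_heads) draws."""
    heads = sum(1 for _ in range(n_flips) if rng.random() < p_heads)
    return heads / n_flips


rng = random.Random(0)  # fixed seed so the run is reproducible
p = 0.3                 # assumed true probability of heads

for n in (10, 100, 10_000, 100_000):
    print(n, heads_frequency(n, p, rng))
```

As the number of flips grows, the printed frequency should settle closer and closer to 0.3.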

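Central limit theorem: under certain conditions, the suitably normalized mean of a large number of independent random variables has the normal distribution as its limit.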
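The central limit theorem can also be checked roughly by simulation. The sketch below averages uniform random numbers, which are far from normal individually, and counts how many sample means fall within one standard error of the true mean. For a normal distribution that fraction is about 0.68. The sample sizes, seed, and function name `sample_means` are assumptions for illustration.

```python
import random


def sample_means(n_per_mean, n_means, rng):
    """Return n_means averages, each over n_per_mean Uniform(0, 1) draws."""
    return [
        sum(rng.random() for _ in range(n_per_mean)) / n_per_mean
        for _ in range(n_means)
    ]


rng = random.Random(0)
n = 30
means = sample_means(n, 5000, rng)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the mean of n draws
# has standard deviation sqrt(1/12) / sqrt(n).
sigma_mean = (1.0 / 12.0) ** 0.5 / n ** 0.5
within = sum(abs(mu - 0.5) <= sigma_mean for mu in means) / len(means)

print("fraction within one standard error:", within)
```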
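The connection between maximum likelihood estimation and least squares estimation: if the model is assumed to be $y=\alpha+\beta x+\epsilon$ with $\epsilon \sim N(0, \sigma^2)$, then $y_i \sim N(\alpha+\beta x_i, \sigma^2)$, and the likelihood is:
$L=(2\pi\sigma^2)^{-n/2}\exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^n(y_i-\alpha-\beta x_i)^2\right]$
Because the sum of squares enters only through a negative exponent, maximizing $L$ over $\alpha$ and $\beta$ is the same as minimizing $\sum_{i=1}^n(y_i-\alpha-\beta x_i)^2$. Setting the partial derivatives to zero and solving gives:
$\alpha =\bar y - \beta\bar x$
$\beta=\frac{\sum_i y_i(x_i-\bar x)}{\sum_i (x_i-\bar x)^2}$
At this point you may be surprised to find that this result is identical to that of unweighted least squares. Neat, isn't it?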
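To see the agreement numerically, the sketch below computes $\alpha$ and $\beta$ from the closed-form expressions and then checks that nudging either parameter never increases the Gaussian log-likelihood. The data, the fixed value of $\sigma$, and the function names are assumptions made for this illustration. The perturbation test is a sanity check, not a proof.

```python
import math


def closed_form(xs, ys):
    """alpha and beta from the closed-form solution of the linear model."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    beta = sum(y * (x - x_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    alpha = y_bar - beta * x_bar
    return alpha, beta


def gaussian_log_likelihood(alpha, beta, sigma, xs, ys):
    """log L for y_i ~ N(alpha + beta * x_i, sigma**2)."""
    n = len(xs)
    sse = sum((y - alpha - beta * x) ** 2 for x, y in zip(xs, ys))
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - sse / (2.0 * sigma ** 2)


# Hypothetical data, invented for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
sigma = 1.0  # assumed known; it does not affect where the maximum lies

alpha_hat, beta_hat = closed_form(xs, ys)
best = gaussian_log_likelihood(alpha_hat, beta_hat, sigma, xs, ys)

# Log-likelihood at a small grid of perturbed parameter values.
perturbed = [
    gaussian_log_likelihood(alpha_hat + da, beta_hat + db, sigma, xs, ys)
    for da in (-0.1, 0.0, 0.1)
    for db in (-0.1, 0.0, 0.1)
]

print("alpha =", alpha_hat, "beta =", beta_hat)
print("no perturbation beats the closed form:", all(v <= best for v in perturbed))
```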