Least Squares and Orthogonal Least Squares (OLS): Derivation and a Brief Comparison

Latest recommended article published on 2024-06-04 08:00:00

爱情小飞机



Article tags: least squares, algorithm

Article link: https://blog.csdn.net/qq_41332314/article/details/127477680


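While reading the paper [ ], I noticed that its line-fitting stage uses the orthogonal least squares method. For now, this post simply records the derivation of two mathematical models, least squares and orthogonal least squares, and compares them.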

I. Least Squares (LS)

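Least squares is one of the most common mathematical optimization methods. It finds the best-fitting function for a data set by minimizing the sum of squared errors, where each error is the difference between an observed value and the corresponding predicted value.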

Suppose $\hat{\mathrm{y}}=x w$ describes the prediction of y. Then, over a known set of m samples of x and y, the sum of squared prediction errors is:

$\begin{aligned} \mathbf{E}(w) &=\sum_{i=1}^m\left(\hat{\mathrm{y}}_i-\mathrm{y}_i\right)^2 \\ &=\sum_{i=1}^m\left(x_i w-\mathrm{y}_i\right)^2 \\ &=(X w-Y)^T(X w-Y) \end{aligned}$

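The goal is to find the w that minimizes E. Stated as a mathematical problem: given the data sets X and Y, find the w that minimizes $\mathbf{E}(w)=(X w-Y)^T(X w-Y)$. When $X^T X$ is invertible, the solution is $w=\left(X^T X\right)^{-1} X^T Y$ .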
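To make the closed-form solution concrete, here is a minimal sketch for the straight-line case. It assumes each row of X is $[1, x_i]$, so that w holds an intercept and a slope; the intercept column and the names `fit_line` and `squared_error` are illustrative choices, not part of the original text. With only two unknowns, $X^T X$ is a 2x2 matrix and can be inverted by hand.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ~ w0 + w1 * x via the normal equations.

    With rows [1, x_i] in X:
        X^T X = [[m,  sx ],      X^T Y = [sy,
                 [sx, sxx]]               sxy]
    """
    m = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = m * sxx - sx * sx  # determinant of X^T X
    if det == 0:
        raise ValueError("X^T X is singular (all x values are identical)")
    # w = (X^T X)^{-1} X^T Y, written out for the 2x2 case
    w0 = (sxx * sy - sx * sxy) / det
    w1 = (m * sxy - sx * sy) / det
    return w0, w1


def squared_error(xs, ys, w0, w1):
    """E(w): sum of squared differences between predictions and data."""
    return sum((w0 + w1 * x - y) ** 2 for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # points lying exactly on y = 1 + 2x
w0, w1 = fit_line(xs, ys)
print(w0, w1, squared_error(xs, ys, w0, w1))
```

For larger problems a linear-algebra library would normally solve the normal equations; the hand-written 2x2 inverse is only meant to keep the sketch dependency-free.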

Derivation idea:

The idea is fairly simple: with parameters $w_1, w_2, \ldots, w_n$ , set the partial derivative of E(w) with respect to each of $w_1, w_2, \ldots, w_n$ to 0.

$\left\{\begin{array}{c} \frac{\partial \mathbf{E}(w)}{\partial w_1}=0 \\ \frac{\partial \mathbf{E}(w)}{\partial w_2}=0 \\ \cdots \\ \frac{\partial \mathbf{E}(w)}{\partial w_n}=0 \end{array}\right.$
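Written in matrix form, the step from the zero-derivative condition to the closed-form solution can be sketched as follows, assuming $X^T X$ is invertible:

$\begin{aligned} \mathbf{E}(w) &=w^T X^T X w-2 w^T X^T Y+Y^T Y \\ \frac{\partial \mathbf{E}(w)}{\partial w} &=2 X^T X w-2 X^T Y=0 \\ X^T X w &=X^T Y \\ w &=\left(X^T X\right)^{-1} X^T Y \end{aligned}$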

In MATLAB the corresponding function is polyfit(x,y,n), where n is the degree of the polynomial fitted in the least-squares sense.

II. Orthogonal Least Squares (OLS)

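OLS is a greedy subset-selection method built on top of LS (at every step it takes the best available choice). Ordinary least squares takes every point into account. But if some outlier deviates too far, does it still contribute positively to the final fit? OLS is based on this idea: it uses only part of the data to fit a line that meets the requirements. So how is that part of the data chosen?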

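In the least-squares setting, we are given A and y and want an x that minimizes the least-squares error between Ax and y, that is, an x for which Ax best approximates y. The idea of OLS is to add the columns of A one at a time to $A_s$ (a subset of the columns of A), compute the least-squares error between $A_s$ and y after each addition, and stop adding columns once the stopping condition is met. How exactly are the columns chosen?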

This is where the concept of error reduction comes in.

The least-squares solution is $x=\left(A^T A\right)^{-1} A^T \mathrm{y}$ . When the columns of A are orthonormal, $A^T A=I$ , so $x=\left(A^T A\right)^{-1} A^T \mathrm{y}=A^T \mathrm{y}$ .

Suppose the columns selected so far form $A_s$ , with orthonormal columns. Then $x=A_s^T \mathrm{y}$ .

Before a new column is added, the prediction $A_s x=A_s A_s^T \mathrm{y}$ has the least-squares error:

$\begin{aligned} \mathbf{E} &=\left(A_s A_s^T \mathrm{y}-\mathrm{y}\right)^T\left(A_s A_s^T \mathrm{y}-\mathrm{y}\right) \\ &=\left(\mathrm{y}^T A_s A_s^T-\mathrm{y}^T\right)\left(A_s A_s^T \mathrm{y}-\mathrm{y}\right) \\ &=\mathrm{y}^T A_s A_s^T A_s A_s^T \mathrm{y}-\mathrm{y}^T A_s A_s^T \mathrm{y}-\mathrm{y}^T A_s A_s^T \mathrm{y}+\mathrm{y}^T \mathrm{y} \\ &=\mathrm{y}^T A_s A_s^T \mathrm{y}-\mathrm{y}^T A_s A_s^T \mathrm{y}-\mathrm{y}^T A_s A_s^T \mathrm{y}+\mathrm{y}^T \mathrm{y} \qquad \left(\text{using } A_s^T A_s=I\right) \\ &=\mathrm{y}^T \mathrm{y}-\mathrm{y}^T A_s A_s^T \mathrm{y} \\ &=\mathrm{y}^T \mathrm{y}-\mathrm{y}^T A_s\left(\mathrm{y}^T A_s\right)^T \end{aligned}$
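The last line of this derivation can be checked numerically. The sketch below compares the error computed directly from the projection $A_s A_s^T \mathrm{y}$ with the closed form $\mathrm{y}^T \mathrm{y}-\mathrm{y}^T A_s\left(\mathrm{y}^T A_s\right)^T$. The vectors and the helper names (`dot`, `residual_error`, `closed_form_error`) are made up for illustration, and the check only holds when the columns of $A_s$ are orthonormal.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def residual_error(cols, y):
    """E = ||A_s A_s^T y - y||^2, computed directly from the prediction."""
    coeffs = [dot(c, y) for c in cols]  # x = A_s^T y
    pred = [sum(k * c[i] for k, c in zip(coeffs, cols)) for i in range(len(y))]
    return sum((p - t) ** 2 for p, t in zip(pred, y))


def closed_form_error(cols, y):
    """E = y^T y - (y^T A_s)(y^T A_s)^T, the last line of the derivation."""
    return dot(y, y) - sum(dot(c, y) ** 2 for c in cols)


s = 1 / math.sqrt(2)
cols = [[s, s, 0.0], [0.0, 0.0, 1.0]]  # two orthonormal columns of A_s
y = [3.0, 1.0, 2.0]

print(residual_error(cols, y), closed_form_error(cols, y))
```

Both values agree up to floating-point rounding, which is what the derivation predicts.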

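After adding a column a (taken to have unit norm and to be orthogonal to the columns of $A_s$ , so that $\left[A_s, a\right]$ still has orthonormal columns), the least-squares error is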

$\begin{aligned} \mathbf{E}\left(\left[A_s, a\right]\right) &=\mathrm{y}^T \mathrm{y}-\mathrm{y}^T\left[A_s, a\right]\left(\mathrm{y}^T\left[A_s, a\right]\right)^T \\ &=\mathrm{y}^T \mathrm{y}-\left[\mathrm{y}^T A_s, \mathrm{y}^T a\right]\left[\mathrm{y}^T A_s, \mathrm{y}^T a\right]^T \\ &=\mathrm{y}^T \mathrm{y}-\mathrm{y}^T A_s\left(\mathrm{y}^T A_s\right)^T-\mathrm{y}^T a\left(\mathrm{y}^T a\right)^T \\ &=\mathrm{y}^T \mathrm{y}-\mathrm{y}^T A_s\left(\mathrm{y}^T A_s\right)^T-\left(\mathrm{y}^T a\right)^2 \end{aligned}$
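Comparing the two error expressions, adding the column a lowers the error by exactly $\left(\mathrm{y}^T a\right)^2$ , so a greedy step can simply pick the candidate with the largest such reduction. The sketch below is one possible way to put this together, not necessarily the procedure used in the paper. It assumes each candidate column is first orthogonalized against the already selected columns and normalized (Gram-Schmidt) so that the derivation applies, and it assumes a stopping rule of the form "error at most `tol`", since the text only says that selection stops once a condition is met. All function and parameter names are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(col, basis, eps=1e-12):
    """Gram-Schmidt step: strip the parts of col along basis, then normalize.

    Returns None if col lies (numerically) in the span of basis.
    """
    r = list(col)
    for q in basis:
        k = dot(q, r)
        r = [ri - k * qi for ri, qi in zip(r, q)]
    norm = math.sqrt(dot(r, r))
    if norm < eps:
        return None
    return [ri / norm for ri in r]


def greedy_ols(columns, y, tol):
    """Select columns one at a time, always taking the largest (y^T a)^2.

    Returns (indices of selected columns in order, remaining squared error).
    The stopping rule `error <= tol` is an assumption of this sketch.
    """
    basis = []          # orthonormal directions spanning the chosen A_s
    selected = []
    error = dot(y, y)   # with no columns selected, E = y^T y
    remaining = set(range(len(columns)))
    while remaining and error > tol:
        best = None     # (error reduction, column index, orthonormal direction)
        for j in sorted(remaining):
            a = orthonormalize(columns[j], basis)
            if a is None:
                continue  # column adds no new direction
            reduction = dot(y, a) ** 2
            if best is None or reduction > best[0]:
                best = (reduction, j, a)
        if best is None:
            break       # nothing left that can reduce the error
        reduction, j, a = best
        basis.append(a)
        selected.append(j)
        remaining.remove(j)
        error -= reduction
    return selected, error


columns = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
y = [2.0, 2.0, 0.0]
print(greedy_ols(columns, y, tol=1e-9))
```

In this example the third column alone explains y, so it is chosen first and the loop stops after a single step. Because the error is updated by subtraction, the returned value may be off from zero by a rounding error.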