Learning Regression Models: Understanding & Talking to Myself

Orange_Ao

Last modified: 2023-07-31 14:11:42


Tags: learning

First published: 2023-07-31 14:03:42

Article link: https://blog.csdn.net/Murphy_Ao/article/details/132019721


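This article explores the idea that regression models look for stable patterns in chaotic data, explains the relationship between the rank of a quadratic form's matrix and degrees of freedom, and shows how SST (total sum of squares) decomposes into SSR (regression sum of squares) and SSE (residual sum of squares). It also discusses R-Square as a measure of goodness of fit, and the conditions under which SST = SSR + SSE holds under OLS (ordinary least squares).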


PRELUDE

Regression is a model for learning and uncovering relatively fixed, regular patterns amid chaos. The world of mess and randomness is actually not as elusive as it may seem. There are indeed underlying but arcane laws or principles, or perhaps certain unprovable axioms. If meticulous and scrupulous enough, one will ultimately find predictability in the unpredictable.

UNDERSTANDING OF CONCEPTS

Degrees of Freedom

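The degrees of freedom equal the rank of the matrix of a quadratic form, that is, the number of variables that can take values freely. Among the n variables, r are linearly independent, meaning that a change in one does not affect the values of the others. Those are the "freest" ones.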

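For example, $f=(x_1+x_2)^2+(x_3+x_4)^2$ appears to have 4 variables, but in fact $r=2$: substituting $z_1=x_1+x_2$ and $z_2=x_3+x_4$ reduces it to $f=z_1^2+z_2^2$, so only two quantities can take values freely. Equivalently, the symmetric (block-diagonal) matrix of the quadratic form, $\begin{pmatrix} 1 &1 &0 &0 \\ 1&1 &0 &0 \\ 0& 0 &1 &1\\ 0& 0&1 &1 \end{pmatrix}$, is congruent to the diagonal matrix $\begin{pmatrix} 1 & 0 &0&0\\ 0 & 1 &0 &0 \\ 0&0&0&0\\0&0&0&0 \end{pmatrix}$.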
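The rank claim in this example can be checked numerically. The following is a minimal sketch using NumPy (an assumption, since the post itself uses no code). The variable names `A`, `rank`, `f_direct`, `f_matrix` and `eigvals` are illustrative only.

```python
import numpy as np

# Symmetric matrix A of the quadratic form f = (x1 + x2)^2 + (x3 + x4)^2,
# chosen so that f = x^T A x.
A = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# The rank of A is the number of freely varying "directions" (z1 and z2).
rank = np.linalg.matrix_rank(A)

# Sanity check: evaluate f directly and through the matrix at a random point.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
f_direct = (x[0] + x[1]) ** 2 + (x[2] + x[3]) ** 2
f_matrix = x @ A @ x

# The number of nonzero eigenvalues of a symmetric matrix equals its rank.
eigvals = np.linalg.eigvalsh(A)

print("rank:", rank)
print("f directly:", f_direct, "f via x^T A x:", f_matrix)
print("eigenvalues:", eigvals)
```

Only two eigenvalues are nonzero, which matches the reduction to $f=z_1^2+z_2^2$ with two free variables.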

Decomposition of SST

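The total sum of squares (SST) is n times the sample variance (using the 1/n convention): it measures how far the observed values deviate from their center, $\sum_i (y_i-\bar{y})^2$. "Total" signals that this quantity can be decomposed, into an explained part and a residual part: how far the fitted values deviate from the sample mean, $\sum_i (\hat{y_i}-\bar{y})^2$ (SSR), plus how far the observed values deviate from the fitted values, $\sum_i (y_i-\hat{y_i})^2$ (SSE).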
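The decomposition can be verified on a small synthetic data set. This is a minimal sketch, assuming an ordinary least squares fit with an intercept term (the setting in which the identity holds exactly). The data, the coefficients 2.0 and 1.5, and all variable names are made up for illustration.

```python
import numpy as np

# Synthetic data: a noisy straight line (all numbers are illustrative assumptions).
rng = np.random.default_rng(42)
n = 50
x = rng.uniform(0, 10, size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=n)

# OLS fit with an intercept: the design matrix has a column of ones.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

# The three sums of squares from the text.
sst = np.sum((y - y.mean()) ** 2)      # observed values vs. sample mean
ssr = np.sum((y_hat - y.mean()) ** 2)  # fitted values vs. sample mean
sse = np.sum((y - y_hat) ** 2)         # observed values vs. fitted values

print("SST:", sst)
print("SSR + SSE:", ssr + sse)
print("n * variance:", n * np.var(y))  # np.var uses the 1/n convention by default
```

If the column of ones is dropped from `X`, or if `y_hat` comes from something other than a least squares fit, the cross term no longer vanishes and SST generally differs from SSR + SSE.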