What Will Happen When Adding a New Variable to a Multiple Linear Regression Model
- Conclusions:
- This new variable (assumed different from the existing ones) will never increase the residual sum of squares ($RSS$); in practice it almost always strictly decreases it.
- However, a large $p$-value from the partial t-test on this new variable leads us to conclude that the variable is not statistically significant and thus should not be included in the model.
- Explanations:
- Formula: $RSS=\sum_{i}(y_i-\hat{y}_i)^2$, where $y_i$ is the observed value of the response variable at $x_i$, and $\hat{y}_i$ is the estimated mean value of the unobservable random variable $Y$ at $x_i$, as computed by the fitted regression model. Notice that we assume no error in the observed values of the explanatory variable(s) used as predictor(s), i.e. we treat the predictors’ values as fixed at all times.
- From the formula, we can see that $RSS$ measures how much of the variability in the observed values of the response variable (i.e. in the dataset used) is NOT explained by the fitted regression model.
- Adding a new variable and estimating the model parameters by minimizing $RSS$ will always make the regression model explain at least as large a proportion of the variation in the observed $y$ values (see https://stats.stackexchange.com/questions/179244/is-rss-decreasing-or-non-increasing).
- This means that, regardless of whether a newly added variable makes sense in the model, $RSS$ will decrease (or at least not increase), which causes $R^2$ to increase (or at least not decrease). This leads to the caveat that using $R^2$ to decide whether to add a new variable to the model is not appropriate.
- The proper way to test whether the new variable is really statistically significant is through the $p$-value produced by the partial t-test on this new variable (or an equivalent test such as the partial $F$-test on the new variable). The null hypothesis is always the original model, and the alternative hypothesis is the original model plus the new variable (the new model). If $p$ is large, then we fail to reject the null hypothesis and conclude that the new variable does not significantly improve on the original model, so we still use the “old model”.
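The $RSS$/$R^2$ point above can be sketched numerically. Below is a minimal NumPy demonstration on simulated data (the dataset, variable names, and random seed are illustrative assumptions, not from the original text): even a pure-noise predictor, unrelated to the response, does not increase $RSS$, and so $R^2$ does not decrease.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: y depends only on x1.
n = 100
x1 = rng.normal(size=n)
y = 2.0 + 3.0 * x1 + rng.normal(size=n)

def fit_rss(X, y):
    """Ordinary least squares fit; returns RSS = sum((y_i - yhat_i)^2)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

X_old = np.column_stack([np.ones(n), x1])   # intercept + x1
x_new = rng.normal(size=n)                  # pure noise, unrelated to y
X_new = np.column_stack([X_old, x_new])     # old model + noise predictor

rss_old = fit_rss(X_old, y)
rss_new = fit_rss(X_new, y)

# R^2 = 1 - RSS / TSS, with TSS the total sum of squares.
tss = float(((y - y.mean()) ** 2).sum())
r2_old = 1.0 - rss_old / tss
r2_new = 1.0 - rss_new / tss

# RSS never increases and R^2 never decreases when a column is added,
# regardless of whether the new variable is meaningful.
assert rss_new <= rss_old
assert r2_new >= r2_old
```

The assertions hold by construction: the old model's column space is contained in the new one's, so the minimized $RSS$ cannot go up.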
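The partial test in the last bullet can also be sketched in code. This is a minimal version of the partial $F$-test (equivalent to the partial t-test when exactly one variable is added, since $t^2 = F$), again on illustrative simulated data; it uses `scipy.stats.f` for the $F$ distribution's survival function.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical simulated data: y depends only on x1; x_new is pure noise.
n = 100
x1 = rng.normal(size=n)
y = 2.0 + 3.0 * x1 + rng.normal(size=n)
x_new = rng.normal(size=n)

def rss_of(X, y):
    """Ordinary least squares fit; returns the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

X_old = np.column_stack([np.ones(n), x1])   # H0: original model
X_new = np.column_stack([X_old, x_new])     # H1: original model + new variable

rss0 = rss_of(X_old, y)
rss1 = rss_of(X_new, y)

df_num = 1                    # number of added variables
df_den = n - X_new.shape[1]   # residual degrees of freedom of the larger model
F = ((rss0 - rss1) / df_num) / (rss1 / df_den)
p_value = stats.f.sf(F, df_num, df_den)

# Since x_new is noise, p_value is typically large here, so we would
# fail to reject H0 and keep the original model.
```

Note that $RSS$ still dropped ($rss1 \le rss0$); the test asks whether the drop is larger than chance alone would produce.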