Stepwise Selection

In statistics, stepwise selection is a procedure for building a regression model from a set of predictor variables by entering and removing predictors one at a time until there is no statistically valid reason to enter or remove any more.

Backward Selection

One of the most commonly used stepwise selection methods is known as backward selection, which works as follows:

Step 1: Fit a regression model using all p predictor variables. Calculate the AIC* value for the model.

Step 2: Remove the predictor variable that leads to the largest reduction in AIC and also leads to a statistically significant reduction in AIC compared to the model with all p predictor variables.

Step 3: Remove the predictor variable that leads to the largest reduction in AIC and also leads to a statistically significant reduction in AIC compared to the model with p-1 predictor variables.

Repeat the process until removing any predictor variable no longer leads to a statistically significant reduction in AIC.

*There are several metrics you could use to calculate the quality of fit of a regression model including cross-validation prediction error, Cp, BIC, AIC, or adjusted R2. In the example below we choose to use AIC.
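To see what a single elimination step looks like, here is a minimal sketch using R's built-in drop1() function, which reports the AIC of the model after dropping each predictor in turn:

# fit the full model with all predictors
all <- lm(mpg ~ ., data = mtcars)

# for each predictor, show the AIC of the model with that predictor dropped;
# test = "F" adds an F-test for the significance of each removal
drop1(all, test = "F")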

The following example shows how to perform backward selection in R.

Example: Backward Selection in R

For this example we’ll use the built-in mtcars dataset in R:

# view first six rows of mtcars
head(mtcars)

We will fit a multiple linear regression model using mpg (miles per gallon) as the response variable and all of the other 10 variables in the dataset as potential predictor variables.

The following code shows how to perform backward stepwise selection:

# define model with all predictors
all <- lm(mpg ~ ., data = mtcars)

# perform backward stepwise regression
backward <- step(all, direction = 'backward', scope = formula(all), trace = 0)

# view results of backward stepwise regression
backward$anova

# view final model
backward$coefficients

Here is how to interpret the results:

  • First, we fit a model using all 10 predictor variables and calculated the AIC of the model.

  • Next, we removed the variable (cyl) that led to the greatest reduction in AIC and also had a statistically significant reduction in AIC compared to the 10-predictor model.

  • Next, we removed the variable (vs) that led to the greatest reduction in AIC and also had a statistically significant reduction in AIC compared to the 9-predictor model.

  • Next, we removed the variable (carb) that led to the greatest reduction in AIC and also had a statistically significant reduction in AIC compared to the 8-predictor model.

We repeated this process until removing any variable no longer led to a statistically significant reduction in AIC.

The final model turns out to be:
mpg = 9.62 - 3.92*wt + 1.23*qsec + 2.94*am
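As a quick check, we can refit this final model directly and confirm the coefficients (a minimal sketch using the three retained predictors):

# refit the final backward-selected model directly
final_backward <- lm(mpg ~ wt + qsec + am, data = mtcars)

# coefficients should match those shown above
coef(final_backward)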

A Note on Using AIC

In the previous example, we chose to use AIC as the metric for evaluating the fit of various regression models.

AIC stands for Akaike information criterion and is calculated as:
AIC = 2K - 2ln(L)
where:

  • K: The number of model parameters.
  • ln(L): The log-likelihood of the model. This tells us how likely the model is, given the data.
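To make the formula concrete, here is a minimal sketch that computes AIC by hand for a simple model and compares it to R's built-in AIC() function (note that for a linear model R counts the residual standard deviation as one of the K parameters):

# fit a simple one-predictor model
fit <- lm(mpg ~ wt, data = mtcars)

# extract the log-likelihood and the parameter count K
# (K = regression coefficients plus the residual standard deviation)
ll <- logLik(fit)
K <- attr(ll, "df")

# AIC by hand: 2K - 2ln(L)
2 * K - 2 * as.numeric(ll)

# should match R's built-in calculation
AIC(fit)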

However, there are other metrics you might choose to use to evaluate the fit of regression models, including cross-validation prediction error, Cp, BIC, or adjusted R2.

Fortunately, most statistical software allows you to specify which metric you would like to use when performing backward selection.
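In R, for example, the step() function uses AIC by default, but setting its k argument to log(n) replaces the AIC penalty with the BIC penalty (a minimal sketch, assuming the all model defined earlier):

# backward selection using BIC instead of AIC:
# k = log(n) turns step()'s per-parameter penalty into the BIC penalty
backward_bic <- step(all, direction = 'backward', k = log(nrow(mtcars)), trace = 0)

# view final model
backward_bic$coefficients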

Forward Selection

The goal of stepwise selection is to build a regression model that includes all of the predictor variables that are statistically significantly related to the response variable.

One of the most commonly used stepwise selection methods is known as forward selection, which works as follows:

Step 1: Fit an intercept-only regression model with no predictor variables. Calculate the AIC* value for the model.

Step 2: Fit every possible one-predictor regression model. Identify the model that produced the lowest AIC and also had a statistically significant reduction in AIC compared to the intercept-only model.

Step 3: Fit every possible two-predictor regression model. Identify the model that produced the lowest AIC and also had a statistically significant reduction in AIC compared to the one-predictor model.

Repeat the process until fitting a regression model with more predictor variables no longer leads to a statistically significant reduction in AIC.

*There are several metrics you could use to calculate the quality of fit of a regression model including cross-validation prediction error, Cp, BIC, AIC, or adjusted R2.

In the example below we choose to use AIC.
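To see what a single forward step looks like, here is a minimal sketch using R's built-in add1() function, which reports the AIC after adding each candidate predictor to the current model (this mirrors Step 2 above):

# define the intercept-only model and the full set of candidates
intercept_only <- lm(mpg ~ 1, data = mtcars)
all <- lm(mpg ~ ., data = mtcars)

# for each candidate predictor, show the AIC of the model with it added;
# test = "F" adds an F-test for the significance of each addition
add1(intercept_only, scope = formula(all), test = "F")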

The following example shows how to perform forward selection in R.

Example: Forward Selection in R

For this example we’ll use the built-in mtcars dataset in R:

We will fit a multiple linear regression model using mpg (miles per gallon) as the response variable and all of the other 10 variables in the dataset as potential predictor variables.

The following code shows how to perform forward stepwise selection:

# view first six rows of mtcars
head(mtcars)

# define intercept-only model
intercept_only <- lm(mpg ~ 1, data = mtcars)

# define model with all predictors
all <- lm(mpg ~ ., data = mtcars)

# perform forward stepwise regression
forward <- step(intercept_only, direction = 'forward', scope = formula(all), trace = 0)

# view results of forward stepwise regression
forward$anova

# view final model
forward$coefficients
Here is how to interpret the results:

  • First, we fit the intercept-only model. This model had an AIC of 115.94345.
  • Next, we fit every possible one-predictor model. The model that produced the lowest AIC and also had a statistically significant reduction in AIC compared to the intercept-only model used the predictor wt. This model had an AIC of 73.21736.
  • Next, we fit every possible two-predictor model. The model that produced the lowest AIC and also had a statistically significant reduction in AIC compared to the single-predictor model added the predictor cyl. This model had an AIC of 63.19800.
  • Next, we fit every possible three-predictor model. The model that produced the lowest AIC and also had a statistically significant reduction in AIC compared to the two-predictor model added the predictor hp. This model had an AIC of 62.66456.
  • Next, we fit every possible four-predictor model. It turned out that none of these models produced a significant reduction in AIC, thus we stopped the procedure.

Thus, the final model turns out to be:

mpg = 38.75 - 3.17*wt - 0.94*cyl - 0.02*hp

It turns out that attempting to add more predictor variables to the model does not lead to a statistically significant reduction in AIC.

Thus, we conclude that the best model is the one with three predictor variables: wt, cyl, and hp.
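As before, we can refit the selected model directly to confirm the coefficients (a minimal sketch):

# refit the final forward-selected model directly
final_forward <- lm(mpg ~ wt + cyl + hp, data = mtcars)

# coefficients should match those shown above
coef(final_forward)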
