01 Consider the mtcars data set. Fit a model with mpg as the outcome that includes number of cylinders as a factor variable and weight as a confounder. Give the adjusted estimate for the expected change in mpg comparing 8 cylinders to 4.
In short: using the mtcars data set, regress mpg on cylinders (as a factor) and weight, and read off the estimated change in mpg for 8 cylinders relative to 4 cylinders.
> mtcars$cyl <- factor(mtcars$cyl)
> fit <- lm(mpg ~ cyl + wt, mtcars)
> summary(fit)
Call:
lm(formula = mpg ~ cyl + wt, data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max
-4.5890 -1.2357 -0.5159  1.3845  5.7915
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  33.9908     1.8878  18.006  < 2e-16 ***
cyl6         -4.2556     1.3861  -3.070 0.004718 **
cyl8         -6.0709     1.6523  -3.674 0.000999 ***
wt           -3.2056     0.7539  -4.252 0.000213 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.557 on 28 degrees of freedom
Multiple R-squared: 0.8374, Adjusted R-squared: 0.82
F-statistic: 48.08 on 3 and 28 DF, p-value: 3.594e-11
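In the summary output, the intercept corresponds to the reference level of the factor, 4 cylinders (cyl4). The cyl8 estimate is therefore the expected change in mpg for 8 cylinders relative to 4 cylinders at a fixed weight: -6.0709.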
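The same number can be pulled out of the fitted model without reading the printed table. The following is a minimal sketch, assuming base R only; the names `cars`, `ref_level`, `cyl8_effect`, and `cyl8_ci` are made up for illustration, and `cars` is a local copy so the built-in `mtcars` is left untouched.

```r
# Local copy of mtcars with cyl converted to a factor (hypothetical name `cars`)
cars <- transform(mtcars, cyl = factor(cyl))

# Same model as above: mpg on cylinders (factor) and weight
fit <- lm(mpg ~ cyl + wt, data = cars)

# The first factor level is the reference that the intercept absorbs
ref_level <- levels(cars$cyl)[1]

# Adjusted difference in mpg, 8 cylinders vs. the reference level
cyl8_effect <- coef(fit)[["cyl8"]]

# 95% confidence interval for that difference
cyl8_ci <- confint(fit)["cyl8", ]

print(ref_level)
print(cyl8_effect)
print(cyl8_ci)
```

Checking `levels()` first is a cheap way to confirm which level the coefficients are measured against before interpreting them.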
02 Consider the mtcars data set. Fit a model with mpg as the outcome that includes number of cylinders as a factor variable and weight as a possible confounding variable. Compare the effect of 8 versus 4 cylinders on mpg for the models adjusted and unadjusted for weight. Here, adjusted means including the weight variable as a term in the regression model and unadjusted means the model without weight included. What can be said about the effect comparing 8 and 4 cylinders after looking at models with and without weight included?
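Using the data from the previous question, with and without weight as a confounder, which of the following statements is correct?
A. Holding weight constant, cylinder appears to have less of an impact on mpg than if weight is disregarded.
(Holding weight constant, the effect of cylinders is smaller than when weight is ignored.)
B. Within a given weight, 8 cylinder vehicles have an expected 12 mpg drop in fuel efficiency.
(At a given weight, 8-cylinder cars lose 12 mpg in fuel efficiency.)
C. Including or excluding weight does not appear to change anything regarding the estimated impact of number of cylinders on mpg.
(Weight makes no difference to the result.)
D. Holding weight constant, cylinder appears to have more of an impact on mpg than if weight is disregarded.
(Holding weight constant, the effect of cylinders is larger than when weight is ignored.)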
> fit <- lm(mpg ~ cyl + wt, mtcars)
> summary(fit)
Call:
lm(formula = mpg ~ cyl + wt, data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max
-4.5890 -1.2357 -0.5159  1.3845  5.7915
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  33.9908     1.8878  18.006  < 2e-16 ***
cyl6         -4.2556     1.3861  -3.070 0.004718 **
cyl8         -6.0709     1.6523  -3.674 0.000999 ***
wt           -3.2056     0.7539  -4.252 0.000213 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.557 on 28 degrees of freedom
Multiple R-squared: 0.8374, Adjusted R-squared: 0.82
F-statistic: 48.08 on 3 and 28 DF, p-value: 3.594e-11
> fit2 <- lm(mpg ~ cyl, mtcars)
> summary(fit2)
Call:
lm(formula = mpg ~ cyl, data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max
-5.2636 -1.8357  0.0286  1.3893  7.2364
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  26.6636     0.9718  27.437  < 2e-16 ***
cyl6         -6.9208     1.5583  -4.441 0.000119 ***
cyl8        -11.5636     1.2986  -8.905 8.57e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.223 on 29 degrees of freedom
Multiple R-squared: 0.7325, Adjusted R-squared: 0.714
F-statistic: 39.7 on 2 and 29 DF, p-value: 4.979e-09
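Answer: A. The cyl8 estimate is -11.5636 when weight is left out and -6.0709 when weight is held constant, so the apparent effect of cylinders is smaller after adjusting for weight.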
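The two cyl8 estimates can also be put side by side in a few lines instead of comparing two full summaries. This is a rough sketch, assuming base R only; `cars`, `adjusted`, `unadjusted`, `effects`, and `shrinkage` are illustrative names, not part of the original session.

```r
# Local copy of mtcars with cyl as a factor (hypothetical name `cars`)
cars <- transform(mtcars, cyl = factor(cyl))

# Model with weight (adjusted) and without weight (unadjusted)
adjusted   <- lm(mpg ~ cyl + wt, data = cars)
unadjusted <- lm(mpg ~ cyl, data = cars)

# 8 vs. 4 cylinder effect under each model
effects <- c(
  unadjusted = coef(unadjusted)[["cyl8"]],
  adjusted   = coef(adjusted)[["cyl8"]]
)
print(round(effects, 4))

# Fraction by which the effect shrinks in magnitude once weight is included
shrinkage <- 1 - abs(effects[["adjusted"]]) / abs(effects[["unadjusted"]])
print(shrinkage)
```

With the estimates shown in the summaries above, the adjusted effect is roughly half the size of the unadjusted one.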
03 Consider the mtcars data set. Fit a model with mpg as the outcome that considers number of cylinders as a factor variable and weight as a confounder. Now fit a second model with mpg as the outcome that considers the interaction between number of cylinders (as a factor variable) and weight. Give the P-value for the likelihood ratio test comparing the two models and suggest a model using 0.05 as a type I error rate significance benchmark.
In short: fit a second model for mpg that adds the interaction between cylinders and weight, report the P-value of the likelihood ratio test comparing the two models, and choose a model using 0.05 as the type I error rate.
A. The P-value is small (less than 0.05). So, according to our criterion, we reject, which suggests that the interaction term is necessary.
(The P-value is below 0.05, so the interaction term is necessary.)
B. The P-value is larger than 0.05. So, according to our criterion, we would fail to reject, which suggests that the interaction terms may not be necessary.
(The P-value is above 0.05, so the interaction term may not be necessary.)
C. The P-value is larger than 0.05. So, according to our criterion, we would fail to reject, which suggests that the interaction terms are necessary.
(The P-value is above 0.05, so the interaction term is necessary.)
D. The P-value is small (less than 0.05). Thus it is surely true that there is no interaction term in the true model.
(The P-value is below 0.05, so the true model has no interaction term.)
E. The P-value is small (less than 0.05). So, according to our criterion, we reject, which suggests that the interaction term is not necessary.
(The P-value is below 0.05, so the interaction term is not necessary.)
F. The P-value is small (less than 0.05). Thus it is surely true that there is an interaction term in the true model.
(The P-value is below 0.05, so the true model certainly has an interaction term.)
> fit1 <- lm(mpg ~ cyl + wt, mtcars)
> summary(fit1)$adj.r.squared
[1] 0.8200146
> fit2 <- lm(mpg ~ cyl + wt + cyl:wt, mtcars)
> summary(fit2)$adj.r.squared
[1] 0.8349382
> library(lmtest)
> lrtest(fit1, fit2)
Likelihood ratio test
Model 1: mpg ~ cyl + wt
Model 2: mpg ~ cyl + wt + cyl:wt
  #Df  LogLik Df  Chisq Pr(>Chisq)
1   5 -73.311
2   7 -70.741  2 5.1412    0.07649 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
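Answer: B. The P-value of 0.07649 is larger than 0.05, so we fail to reject the model without the interaction, which suggests that the interaction terms may not be necessary.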
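The likelihood ratio test itself is simple enough to reproduce by hand, which makes it clearer where the Chisq and P-value columns come from. Below is a minimal sketch using only base R (`logLik` and `pchisq`); the names `cars`, `ll1`, `ll2`, `chisq`, `df_diff`, and `p_value` are illustrative. It writes the interaction model as `cyl * wt`, which R expands to `cyl + wt + cyl:wt`.

```r
# Local copy of mtcars with cyl as a factor (hypothetical name `cars`)
cars <- transform(mtcars, cyl = factor(cyl))

# Nested models: without and with the cyl-by-weight interaction
fit1 <- lm(mpg ~ cyl + wt, data = cars)
fit2 <- lm(mpg ~ cyl * wt, data = cars)

# Log-likelihoods; the "df" attribute counts estimated parameters
ll1 <- logLik(fit1)
ll2 <- logLik(fit2)

# Test statistic: twice the gain in log-likelihood from the larger model
chisq <- 2 * (as.numeric(ll2) - as.numeric(ll1))

# Degrees of freedom: number of extra parameters in the larger model
df_diff <- attr(ll2, "df") - attr(ll1, "df")

# Upper-tail chi-squared probability
p_value <- pchisq(chisq, df = df_diff, lower.tail = FALSE)

print(c(chisq = chisq, df = df_diff, p_value = p_value))
```

The result should agree with the `lrtest` table above (Chisq 5.1412 on 2 degrees of freedom, P-value 0.07649), which is above the 0.05 benchmark.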
To be continued.