[Andrew Gelman, Data Analysis Using Regression and Multilevel/Hierarchical Models] 4.9 Exercises: Solutions

Article link: https://blog.csdn.net/tianty1121/article/details/116032673

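This post gives solutions to some of the exercises in Andrew Gelman's book "Data Analysis Using Regression and Multilevel/Hierarchical Models", covering linear regression, log transformations, residual analysis, data cleaning, interaction terms, and the use of special transformations in predictive models. Worked examples are used to discuss how applicable and interpretable the statistical models are in different settings.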


(Some parts are incomplete.)

Exercise 1

Logarithmic transformation and regression: consider the following regression:
log(weight) = −3.5+2.0 log(height) + error
with errors that have standard deviation 0.25. Weights are in pounds and heights are in inches.
(a) Fill in the blanks: approximately 68% of the persons will have weights within a factor of ___ and of ___ their predicted values from the regression.
(b) Draw the regression line and scatterplot of log(weight) versus log(height) that make sense and are consistent with the fitted model. Be sure to label the axes of your graph.

$(a)$

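The errors have standard deviation 0.25 on the log scale, so about 68% of the log(weight) values fall within $\pm 0.25$ of their predictions. Exponentiating gives the multiplicative factors on the weight scale: $e^{-0.25} \approx 0.78$ and $e^{0.25} \approx 1.28$.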
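The factors in part (a) come from exponentiating one residual standard deviation in each direction, since a fixed distance on the log scale is a fixed ratio on the original scale. A minimal check in R (the names `sigma` and `factors` are only illustrative):

```r
# Residual standard deviation on the log scale, from the exercise statement
sigma <- 0.25
# Multiplicative factors on the weight scale: exp(-sigma) and exp(+sigma)
factors <- exp(c(-sigma, sigma))
print(round(factors, 2))
```

This prints 0.78 and 1.28, meaning roughly 68% of weights lie between 78% and 128% of the predicted weight.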

$(b)$

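#Simulate heights in inches (mean 66 and sd 3.8 are assumed values)
height <- rnorm(100,66,3.8)
#Simulate log(weight) from the model: sample size first, then mean and sd
log.weight <- rnorm(100, -3.5 + 2.0*log(height), 0.25)
weight <- exp(log.weight)
#Scatterplot of the simulated data, with labeled axes
plot(log(height), log(weight), xlab="log(height in inches)", ylab="log(weight in pounds)")
#Refit the model
fit.1 <- lm(log(weight) ~ log(height))
#Add the fitted regression line
curve(cbind(1,x) %*% coef(fit.1), add=TRUE)

With heights in inches, the simulated log(weight) values lie around 4.9 (roughly 130 pounds) and scatter about the line with standard deviation 0.25, so the plot is consistent with the fitted model.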
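A quick simulation can be used to check the 68% statement from part (a) against the model in the exercise. This is only a sketch: the height distribution (mean 66 inches, sd 3.8), the sample size, and the seed are assumptions chosen for illustration, not values from the book.

```r
set.seed(1)                                # arbitrary seed, for reproducibility only
n <- 10000
height <- rnorm(n, 66, 3.8)                # assumed heights in inches
log.pred <- -3.5 + 2.0 * log(height)       # predicted log(weight) from the exercise
log.weight <- rnorm(n, log.pred, 0.25)     # add errors with sd 0.25
# Ratio of simulated weight to predicted weight
ratio <- exp(log.weight - log.pred)
# Share of people whose weight is within a factor of exp(-0.25) to exp(0.25)
frac <- mean(ratio > exp(-0.25) & ratio < exp(0.25))
print(frac)
```

The printed share should be close to 0.68, up to simulation noise.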

Exercise 2

The folder earnings has data from the Work, Family, and Well-Being Survey (Ross, 1990). Pull out the data on earnings, sex, height, and weight.
(a) In R, check the dataset and clean any unusually coded data.
(b) Fit a linear regression model predicting earnings from height. What transformation should you perform in order to interpret the intercept from this model as average earnings for people with average height?
(c) Fit some regression models with the goal of predicting earnings from some combination of sex, height, and weight. Be sure to try various transformations and interactions that might make sense. Choose your preferred model and justify.

This is a simple regression. No data are simulated this time; the analysis approach is as follows:

$(a)$

#Drop rows that contain NA
data <- na.omit(data)
#Equivalently, keep only the rows with no missing values
x <- x[complete.cases(x),]
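Dropping NA rows only helps once unusual codes have been turned into NA. The sketch below shows that workflow on a toy data frame; the values and the sentinel codes 99 and 998 are made up for illustration and are not the actual coding of the survey, so the real dataset should be inspected (for example with `summary`) before any recoding.

```r
# Toy data frame standing in for the survey extract (all values are made up)
earnings <- data.frame(
  earn   = c(50000, 0, NA, 30000, 42000),
  sex    = c(1, 2, 2, 1, 2),
  height = c(70, 64, 66, 99, 62),       # 99 is a hypothetical missing-value code
  weight = c(180, 130, 998, 160, 120)   # 998 is a hypothetical missing-value code
)
# Inspect ranges to spot implausible values
print(summary(earnings))
# Recode the assumed sentinel values as NA
earnings$height[which(earnings$height == 99)] <- NA
earnings$weight[which(earnings$weight == 998)] <- NA
# Keep only complete rows
clean <- na.omit(earnings)
print(nrow(clean))
```

Whether zero earnings should also be treated as unusual depends on the later model: a log transformation of earnings, for instance, cannot use those rows.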

$(b)$

#Center height at its mean (earning stays on its original scale)
c.height <- height-mean(height)
#Fit the model
fit.1 <- lm(earning ~ c.height)

When height equals its mean, c.height=0, so the intercept is the average earning for people of average height.
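The interpretation of the intercept follows from a property of least squares: the fitted line passes through the point of means, so when only the predictor is centered, the intercept equals the sample mean of the outcome. A small sketch on simulated data (all numbers are made up, and `c.height`, `fit.raw`, and `fit.c` are illustrative names):

```r
set.seed(2)                                  # arbitrary seed
n <- 200
height <- rnorm(n, 66, 3.8)                  # made-up heights in inches
earning <- -60000 + 1300 * height + rnorm(n, 0, 15000)  # made-up linear relation
# Uncentered fit: intercept is the prediction at height = 0, which is not meaningful
fit.raw <- lm(earning ~ height)
# Centered fit: intercept is the prediction at average height
c.height <- height - mean(height)
fit.c <- lm(earning ~ c.height)
print(coef(fit.c)[1])
print(mean(earning))
```

The two printed values agree, and the slope is unchanged by centering; only the intercept moves.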

$(c)$

Start with a simple model; under normal circumstances the R-s
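As a hedged sketch of the kind of comparison part (c) asks for, the code below fits a sequence of nested models on simulated data, using a log transformation of earnings, centered height, and a height-by-sex interaction. The data-generating numbers and the variable names (`male`, `earn`, `fit.a` to `fit.d`) are assumptions for illustration only and say nothing about the real survey.

```r
set.seed(3)                                   # arbitrary seed
n <- 500
male <- rbinom(n, 1, 0.5)                     # made-up sex indicator
height <- rnorm(n, 64 + 5 * male, 3)          # made-up heights in inches
weight <- rnorm(n, 140 + 30 * male, 25)       # made-up weights in pounds
earn <- exp(9.5 + 0.02 * (height - 66) + 0.4 * male + rnorm(n, 0, 0.8))
c.height <- height - mean(height)
# Nested candidate models on the log scale
fit.a <- lm(log(earn) ~ c.height)
fit.b <- lm(log(earn) ~ c.height + male)
fit.c <- lm(log(earn) ~ c.height + male + c.height:male)
fit.d <- lm(log(earn) ~ c.height + male + c.height:male + weight)
# R-squared can only increase as terms are added, so it cannot choose a model alone
r2 <- sapply(list(fit.a, fit.b, fit.c, fit.d), function(f) summary(f)$r.squared)
print(round(r2, 3))
```

Because R-squared never decreases when predictors are added, the choice among such models is better justified by coefficient estimates relative to their standard errors and by how interpretable each term is.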
