STAT 371 S24 Assignment #3

(Submission deadline: 11:59 pm Fri., July 19th)

In this assignment, we will continue developing a suitable regression model for your CEO dataset from Assignment #2, starting from your fitted model from 2e) of that assignment (i.e. the model fit without the BACKGRD variate).

1)  Plot the residuals vs the fitted values, as well as a QQ plot. Comment on the adequacy of the fitted model, in terms of the model assumptions.
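As a rough illustration of the two plots in 1), the sketch below uses base R on a small synthetic data set. The data frame `ceo`, its simulated values, and the choice of SALES and VAL as the only explanatory variates are assumptions made for the example; they are not the Assignment #2 data or model.

```r
# Synthetic stand-in for the CEO data (hypothetical values, not the real dataset).
set.seed(371)
n <- 100
ceo <- data.frame(SALES = rlnorm(n, meanlog = 7, sdlog = 1),
                  VAL   = rlnorm(n, meanlog = 6, sdlog = 1))
ceo$COMP <- exp(5 + 0.3 * log(ceo$SALES) + 0.1 * log(ceo$VAL) + rnorm(n, sd = 0.5))

# Untransformed model, standing in for the model from 2e) of Assignment #2.
fit <- lm(COMP ~ SALES + VAL, data = ceo)

# Residuals vs fitted values: a funnel shape suggests non-constant variance,
# curvature suggests a misspecified mean function.
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal QQ plot: systematic departures from the line suggest non-normal errors.
qqnorm(resid(fit))
qqline(resid(fit))
```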

2)  One approach to stabilize the variance of the residuals and/or more adequately describe the relationship between a response variate and the explanatory variates is with an appropriate transformation of the response variate.

a)   Create a histogram of CEO compensation. What characteristic of this variate might lead you to suspect that a log transformation may be suitable?

b)  Refit the data using the (natural) log transformation of compensation.

c)   Compare the overall fit of the model and significance of the individual parameters with that of the original (untransformed) model.

d)  Replot the two residual plots in 1). Has the transformation helped to address the issues with the adequacy of the (untransformed) model?
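The steps in 2) can be sketched as follows, again on synthetic data. The data frame `ceo` and the two-variate model are assumptions for illustration only; your own model from Assignment #2 will contain more variates.

```r
# Synthetic stand-in for the CEO data (hypothetical values, not the real dataset).
set.seed(371)
n <- 100
ceo <- data.frame(SALES = rlnorm(n, meanlog = 7, sdlog = 1),
                  VAL   = rlnorm(n, meanlog = 6, sdlog = 1))
ceo$COMP <- exp(5 + 0.3 * log(ceo$SALES) + 0.1 * log(ceo$VAL) + rnorm(n, sd = 0.5))

# Histograms of the response before and after the natural log transformation.
hist(ceo$COMP, main = "COMP", xlab = "Compensation")
hist(log(ceo$COMP), main = "log(COMP)", xlab = "log(Compensation)")

# Original fit and the fit with the log-transformed response.
fit_raw <- lm(COMP ~ SALES + VAL, data = ceo)
fit_log <- lm(log(COMP) ~ SALES + VAL, data = ceo)

# Overall fit and individual parameter tests for each model.
summary(fit_raw)$r.squared
summary(fit_log)$r.squared
coef(summary(fit_log))   # estimates, standard errors, t values, p-values
```

Note that the two R-squared values are measured on different response scales, so a comparison between them is informal.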

3)  We can also investigate the suitability of transformations of one or more of the explanatory variates by looking at scatterplots of the variates vs the response (log(COMP), in this case).

a)   Create a scatterplot of SALES vs log(COMP). Does a linear model seem appropriate for these two variates?

b)  Create a scatterplot of log(SALES) vs log(COMP). Comment.

c)  Refit the model once again, this time taking the log transformation of compensation as well as of the variates SALES, VAL, PCNTOWN and PROF. We will use this model going forward. Comment on the effect these transformations have on the overall fit of the model, and on the p-values of the associated variates.
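A minimal sketch of the scatterplots in 3a) and 3b), using simulated SALES and COMP values (hypothetical, generated only for this example) with the response on the vertical axis:

```r
# Simulated values standing in for two columns of the CEO data.
set.seed(371)
n <- 100
SALES <- rlnorm(n, meanlog = 7, sdlog = 1)
COMP  <- exp(5 + 0.3 * log(SALES) + rnorm(n, sd = 0.5))

# 3a) log(COMP) against SALES on its original scale.
plot(SALES, log(COMP), xlab = "SALES", ylab = "log(COMP)")

# 3b) log(COMP) against log(SALES), with the fitted least squares line added.
plot(log(SALES), log(COMP), xlab = "log(SALES)", ylab = "log(COMP)")
fit_ll <- lm(log(COMP) ~ log(SALES))
abline(fit_ll)
```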

4)  Plot the residuals vs the fitted values and the QQ plot for the model in 3). Comment on the effect of the transformations on the model assumptions.

5)  Replot the plots in 4) using the studentized residuals. Do you notice any major changes in these plots? Are there any outliers present?

6)  Plot the hat values vs index (observation number). Are there any high leverage points?

7)  Investigate the observation with the highest leverage for a possible cause.

8)  Plot the Cook’s Distance values. Are there any influential cases?
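The diagnostics in 5) to 8) are all available in base R. The sketch below runs them on synthetic data; the data frame `ceo`, the two-variate model, and the leverage cutoff of twice the average hat value are assumptions for illustration (the cutoff is a common rule of thumb, so use the one given in your course notes). Here `rstudent()` returns externally studentized residuals; if the course uses the internally studentized version, `rstandard()` is the base R counterpart.

```r
# Synthetic stand-in for the CEO data (hypothetical values, not the real dataset).
set.seed(371)
n <- 100
ceo <- data.frame(SALES = rlnorm(n, meanlog = 7, sdlog = 1),
                  VAL   = rlnorm(n, meanlog = 6, sdlog = 1))
ceo$COMP <- exp(5 + 0.3 * log(ceo$SALES) + 0.1 * log(ceo$VAL) + rnorm(n, sd = 0.5))
fit <- lm(log(COMP) ~ log(SALES) + log(VAL), data = ceo)

r_stud <- rstudent(fit)        # studentized residuals
lev    <- hatvalues(fit)       # leverages (diagonal of the hat matrix)
cook   <- cooks.distance(fit)  # Cook's distance for each observation

# 5) Residual plots based on the studentized residuals.
plot(fitted(fit), r_stud, xlab = "Fitted values", ylab = "Studentized residuals")
abline(h = 0, lty = 2)
qqnorm(r_stud)
qqline(r_stud)

# 6) Hat values vs index, with an assumed rule-of-thumb cutoff of 2 * p / n.
p <- length(coef(fit))         # number of parameters, including the intercept
plot(lev, type = "h", xlab = "Index", ylab = "Hat value")
abline(h = 2 * p / n, lty = 2)

# 7) Inspect the row with the highest leverage.
ceo[which.max(lev), ]

# 8) Cook's distance vs index.
plot(cook, type = "h", xlab = "Index", ylab = "Cook's distance")
```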

9)  Now that we have obtained a more adequate model through transformation of the response and some of the explanatory variables, we can further improve the model by using model selection methods to select which subset of variables to include.

a)  Use backward selection to arrive at a reasonable model (use α = .15). Show your work.

b)  Use the leaps function in R to select a model, based on Mallows’ Cp and adjusted R-squared. (You may first need to install and load the leaps package.) Select the model that yields the largest adjusted R-squared and meets the Mallows’ Cp criterion (Cp < k+1). Comment on the overall fit and the significance of the model parameters.

c)   Confirm the Mallows’ Cp value for this model by calculating the value from information in the summary output of this model and of the full model.

d)  Did the model selection procedures in a) and b) arrive at the same model?

e)   Perform an additional sum of squares test on the full model (model in 3c) and reduced model (model in 9b) using the anova function. Be sure to state the conclusion in the context of the study.

f)   Plot the studentized residuals vs the fitted values and the QQ plot of the studentized residuals to confirm that your preferred model is adequate in terms of the model assumptions.

g)  Finally, recalculate the 95% prediction interval for the CEO in 2e) of Assignment #2, based on your preferred model. Be sure to back-transform to the original units.
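Several parts of 9) can be sketched in base R on synthetic data. Everything specific here is an assumption for illustration: the data frame `ceo`, the choice of full and reduced models, and the values in `new_ceo`. The Cp calculation uses the usual definition Cp = SSE(reduced) / MSE(full) - n + 2(k+1), where k is the number of explanatory variates in the reduced model; check this against the definition in your course notes. The leaps search in 9b) is not reproduced.

```r
# Synthetic stand-in for the CEO data (hypothetical values, not the real dataset).
set.seed(371)
n <- 100
ceo <- data.frame(SALES = rlnorm(n, meanlog = 7, sdlog = 1),
                  VAL   = rlnorm(n, meanlog = 6, sdlog = 1),
                  PROF  = rlnorm(n, meanlog = 4, sdlog = 1))
ceo$COMP <- exp(5 + 0.3 * log(ceo$SALES) + 0.1 * log(ceo$VAL) + rnorm(n, sd = 0.5))

full    <- lm(log(COMP) ~ log(SALES) + log(VAL) + log(PROF), data = ceo)
reduced <- lm(log(COMP) ~ log(SALES) + log(VAL), data = ceo)

# 9a) One step of backward selection: p-value for dropping each term in turn.
# The term with the largest p-value above alpha = 0.15 would be removed, then repeat.
drop1(full, test = "F")

# 9c) Mallows' Cp for the reduced model from the two sets of summary output.
sse_red  <- sum(resid(reduced)^2)      # residual sum of squares, reduced model
mse_full <- summary(full)$sigma^2      # estimated error variance, full model
k        <- length(coef(reduced)) - 1  # explanatory variates in the reduced model
cp       <- sse_red / mse_full - n + 2 * (k + 1)
cp

# 9e) Additional sum of squares test of the reduced model against the full model.
anova(reduced, full)

# 9g) 95% prediction interval on the log scale, back-transformed with exp().
new_ceo <- data.frame(SALES = 1500, VAL = 400)   # hypothetical CEO
pi_log  <- predict(reduced, newdata = new_ceo, interval = "prediction", level = 0.95)
pi_orig <- exp(pi_log)
pi_orig
```

Because exp() is monotone increasing, exponentiating the endpoints of the interval for log(COMP) gives a valid 95% prediction interval for COMP in its original units.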
