MATH377: Financial and Actuarial Modelling in R

Tutorial 5

Exercise 1. Let X = (X1, X2) be a bivariate normally distributed random vector with mean vector µ = (1.5, 1) and covariance matrix

a) Evaluate the density function of X at x = (1, 1) and x = (0, 2).

b) Compute P(−2 ≤ X1 ≤ 4, X2 > 1).

c) Plot the 3D surface of this bivariate normal density and its contours. Hint: You can adapt the code in the lecture notes that plots the log-likelihood of a normal distribution. However, to use outer() you may need to pass a function similar to this one: f <- function(x, y) dmvnorm(cbind(x, y), mu, sigma). Finally, use the functions contour() and persp() to create the plots.

d) Generate 5000 observations from X and create a scatter plot of the generated sample.

e) Compute the empirical mean vector, covariance matrix, and correlation matrix for the generated sample in d).

f) Using your simulated sample in d), approximate the 95% quantile of X1 · X2.
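
The hint in part c) relies on how outer() works: it expands the two grids into equal-length vectors and calls the function once, so the function must return one density value per (x, y) pair. The following is a minimal sketch of that idea, assuming the mvtnorm package is installed. The covariance matrix sigma below is a placeholder chosen only for illustration, not the one from the exercise, and the grid ranges and viewing angles are likewise arbitrary choices.

```r
library(mvtnorm)  # provides dmvnorm(); assumed to be installed

mu <- c(1.5, 1)
# ASSUMPTION: placeholder covariance matrix, NOT the one from the exercise
sigma <- matrix(c(1.0, 0.5,
                  0.5, 2.0), nrow = 2, byrow = TRUE)

# outer() passes two equal-length vectors to f in a single call;
# cbind() turns each (x, y) pair into one row, and dmvnorm() returns
# one density value per row
f <- function(x, y) dmvnorm(cbind(x, y), mean = mu, sigma = sigma)

# Illustrative grids centred on the mean vector
x_grid <- seq(mu[1] - 4, mu[1] + 4, length.out = 60)
y_grid <- seq(mu[2] - 4, mu[2] + 4, length.out = 60)

z <- outer(x_grid, y_grid, f)  # 60 x 60 matrix of density values

# 3D surface and contour lines of the density
persp(x_grid, y_grid, z, theta = 30, phi = 25,
      xlab = "x1", ylab = "x2", zlab = "density")
contour(x_grid, y_grid, z, xlab = "x1", ylab = "x2")
```

Passing dmvnorm() to outer() directly would not work, because outer() supplies two separate vectors while dmvnorm() expects a single matrix with one observation per row.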

Exercise 2. Consider the cars data set in R.

a) Compute the correlation between speed and dist, and create a scatter plot to compare speed vs dist. Do you see any relationship?

b) Fit a linear regression model to explain distance in terms of speed.

c) Add the regression line to your plot in a). Hint: this can be done using the abline() function applied to your regression model in b).

d) Predict dist for values of speed of 28 and 30.

e) Does the model seem to satisfy the assumptions of mean zero, constant variance, and normality for the residuals?
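
The hint in part c) works because abline() can read the intercept and slope directly from a fitted lm object. A possible sketch of the workflow for this exercise is given below, using only base R and the built-in cars data. The object names (fit, new_speeds, pred) are illustrative, and the four default diagnostic plots are just one way of inspecting the residual assumptions.

```r
# Simple linear regression of stopping distance on speed
fit <- lm(dist ~ speed, data = cars)

# Scatter plot with the fitted regression line added on top
plot(dist ~ speed, data = cars, xlab = "speed", ylab = "dist")
abline(fit, col = "red")  # intercept and slope are taken from the lm object

# predict() needs a data frame whose column name matches the predictor
new_speeds <- data.frame(speed = c(28, 30))
pred <- predict(fit, newdata = new_speeds)
pred

# Default diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage
par(mfrow = c(2, 2))
plot(fit)
par(mfrow = c(1, 1))
```

Both speeds lie outside the observed range of the cars data, so these predictions are extrapolations and should be read with caution.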

Exercise 3. Consider the Boston data set available in the MASS package.

a) Create a scatter plot of lstat vs medv. Do you see any relationship?

b) Fit a linear regression model to explain medv in terms of lstat.

c) Add the regression line to your plot in a).

d) In a linear model, we can specify that the relationship between the independent variable and the dependent variable takes the form of an nth-degree polynomial. One way to specify this in R is by using I(). Fit a linear model with medv ~ lstat + I(lstat^2), then predict medv for values of lstat of 0 and 40, and add these values as a line in your plot in a).

e) Use an information criterion to conclude which of the models in b) and d) describes the data better.

f) An alternative way to produce the same model as in d) is by using medv ~ poly(lstat, 2, raw = TRUE). In the previous line of code, 2 can be changed to other integer values to specify polynomials of different degrees. Fit a linear model with a 5th-degree polynomial for lstat, then predict medv for values of lstat of 0 and 40, and add a line in your plot in a) using these values.

g) Use an information criterion to conclude which of the three models best describes the data.

h) Fit a linear model with an 8th-degree polynomial. Conclude, based on the information criterion, whether this model is a better choice (recall the concept of overfitting).
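
The two ways of specifying polynomial terms in parts d) and f), and the model comparison in parts e), g), and h), can be sketched as follows. This is only one possible outline, assuming the MASS package is available. The object names are illustrative, AIC and BIC are used as examples of information criteria, and the fine prediction grid from 0 to 40 is a choice made here so that the fitted curve is drawn smoothly.

```r
library(MASS)  # provides the Boston data set

fit_lin  <- lm(medv ~ lstat, data = Boston)
# I() protects ^ so that it means "square", not a formula operator
fit_quad <- lm(medv ~ lstat + I(lstat^2), data = Boston)
# Same quadratic model via poly(); raw = TRUE keeps the plain powers
fit_quad_poly <- lm(medv ~ poly(lstat, 2, raw = TRUE), data = Boston)
fit_deg5 <- lm(medv ~ poly(lstat, 5, raw = TRUE), data = Boston)

# The two quadratic specifications give the same fitted values
same_fit <- all.equal(unname(fitted(fit_quad)),
                      unname(fitted(fit_quad_poly)))
same_fit

# Lower values indicate a better trade-off between fit and complexity
aic_tab <- AIC(fit_lin, fit_quad, fit_deg5)
bic_tab <- BIC(fit_lin, fit_quad, fit_deg5)
aic_tab
bic_tab

# Scatter plot with the fitted curves over an illustrative grid
plot(medv ~ lstat, data = Boston)
grid_lstat <- data.frame(lstat = seq(0, 40, length.out = 200))
lines(grid_lstat$lstat, predict(fit_quad, newdata = grid_lstat), col = "red")
lines(grid_lstat$lstat, predict(fit_deg5, newdata = grid_lstat), col = "blue")
```

An 8th-degree fit can be added to the comparison in the same way by changing the degree argument of poly(). Because AIC and BIC penalise each extra coefficient, a higher-degree model only wins if the improvement in fit outweighs that penalty.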
