In model development, the "leave-one-out" (LOO) prediction is a form of cross-validation, calculated as follows:
1. After a model is developed, each observation used in the model development is removed in turn, and the model is refitted with the remaining observations.
2. Each refitted model is used to predict the one observation that was removed, and these out-of-sample predictions are assembled into the LOO, i.e. leave-one-out, predicted values for the whole model development sample.
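The two steps above can be sketched serially in base R. This is only a minimal illustration, assuming the built-in mtcars data and an arbitrary Gamma GLM of mpg on wt and hp; neither the data nor the model formula comes from this post.

```r
# Serial leave-one-out sketch on the built-in mtcars data.
# The data set and the formula are assumptions chosen for illustration.
fit <- glm(mpg ~ wt + hp, data = mtcars, family = Gamma(link = "log"))

n <- nrow(mtcars)
loo <- numeric(n)
for (i in seq_len(n)) {
  # Step 1: refit the model without observation i
  refit <- update(fit, data = mtcars[-i, ])
  # Step 2: predict the held-out observation with the refitted model
  loo[i] <- predict(refit, newdata = mtcars[i, ], type = "response")
}

# One out-of-sample prediction per row of the development sample
head(loo)
```

Each element of loo comes from a model that never saw the corresponding row, which is what distinguishes it from the ordinary fitted value.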
The loo_predict() function below is a general routine that calculates the LOO predictions for a GLM object fitted with a data argument (it reads the development sample from obj$data). The resulting predictions can then be used to assess model stability and predictive performance.
> pkgs <- c('doParallel', 'foreach')
> lapply(pkgs, require, character.only = TRUE)
[[1]]
[1] TRUE

[[2]]
[1] TRUE

> registerDoParallel(cores = 8)
>
> data(AutoCollision, package = "insuranceData")
> # A GAMMA GLM #
> model1 <- glm(Severity ~ Age + Vehicle_Use, data = AutoCollision, family = Gamma(link = "log"))
> # A POISSON GLM #
> model2 <- glm(Claim_Count ~ Age + Vehicle_Use, data = AutoCollision, family = poisson(link = "log"))
>
> loo_predict <- function(obj) {
+   yhat <- foreach(i = 1:nrow(obj$data), .combine = rbind) %dopar% {
+     predict(update(obj, data = obj$data[-i, ]), obj$data[i, ], type = "response")
+   }
+   return(data.frame(result = yhat[, 1], row.names = NULL))
+ }
> # TEST CASE 1
> test1 <- loo_predict(model1)
> test1$result
 [1] 303.7393 328.7292 422.6610 375.5023 240.9785 227.6365 288.4404 446.5589
 [9] 213.9368 244.7808 278.7786 443.2256 213.9262 243.2495 266.9166 409.2565
[17] 175.0334 172.0683 197.2911 326.5685 187.2529 215.9931 249.9765 349.3873
[25] 190.1174 218.6321 243.7073 359.9631 192.3655 215.5986 233.1570 348.2781
> # TEST CASE 2
> test2 <- loo_predict(model2)
> test2$result
 [1]  11.15897  37.67273  28.76127  11.54825  50.26364 152.35489 122.23782
 [8]  44.57048 129.58158 465.84173 260.48114 107.23832 167.40672 510.41127
[15] 316.50765 121.75804 172.56928 546.25390 341.03826 134.04303 359.30141
[22] 977.29107 641.69934 251.32547 248.79229 684.86851 574.13994 238.42209
[29] 148.77733 504.12221 422.75047 167.61203
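One way LOO predictions might be used to look at predictability is to compare the in-sample error with the LOO error. The sketch below is an illustration only: loo_serial() and rmse() are hypothetical helpers, not part of this post, and the built-in warpbreaks data with a Poisson GLM is an assumed stand-in so that the example runs without the insuranceData package or a parallel backend.

```r
# Hypothetical helper: serial LOO predictions for a glm object that was
# fitted with an explicit `data` argument (so that obj$data is a data frame).
loo_serial <- function(obj) {
  dat <- obj$data
  vapply(seq_len(nrow(dat)), function(i) {
    refit <- update(obj, data = dat[-i, ])   # refit without row i
    unname(predict(refit, newdata = dat[i, ], type = "response"))
  }, numeric(1))
}

# Hypothetical helper: root mean squared error
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Assumed example model on built-in data, not the models from this post
fit <- glm(breaks ~ wool + tension, data = warpbreaks,
           family = poisson(link = "log"))

y         <- warpbreaks$breaks
in_sample <- fitted(fit)       # predictions from the model fitted on all rows
loo       <- loo_serial(fit)   # each prediction excludes its own row

# Side-by-side error comparison
c(in_sample = rmse(y, in_sample), loo = rmse(y, loo))
```

A LOO error that is much larger than the in-sample error would suggest that the fit depends heavily on individual observations. How large a gap is acceptable is a judgment call that this sketch does not settle.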