Model comparison through leave-one-out cross-validation (LOO, Vehtari and Gelman, 2015) using the dataset of visit lengths at the feeder when the next visit was less than or equal to 60 s, or greater than or equal to 600 s, from 7 trials in pig production.
Pareto smoothed importance sampling leave-one-out cross-validation (PSIS-LOO)
The matrix for each model contains the estimates and their standard errors of:
- \(\widehat{elpd}_{loo}\): expected log pointwise predictive density
- \(\hat{p}_{loo}\): effective number of parameters
- \(looic\): the information criterion on the deviance scale, \(looic = -2\,\widehat{elpd}_{loo}\)
The table shows a summary of the Pareto \(k\) diagnostics, which are used to assess the reliability of the estimates. If we have:
- \(k \ge 1\): neither the variance nor the mean of the raw importance ratios exists, and we are not able to compute an estimate of the Monte Carlo standard error (SE) of \(\widehat{elpd}_{loo}\).
- \(0.5 \le k < 1\): the variance of the raw importance ratios is infinite, but the mean exists. If \(k\) is between 0.5 and 0.7, we still observe practically useful Monte Carlo error estimates with PSIS.
- \(k < 0.5\): the distribution of the raw importance ratios has finite variance and the Monte Carlo standard error is small.
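The sections below only print loo objects that were computed earlier (m160.loo, m260.loo, m360.loo, and their 600 s counterparts). As a minimal sketch of how such a PSIS-LOO object could be obtained with the loo package, assuming a pointwise log-likelihood matrix log_lik_m1 (posterior draws by observations) and a chain-id vector chain_ids extracted from the fitted model (both names are hypothetical placeholders):

library(loo)
# Relative MCMC efficiency of the likelihood draws, used by the PSIS diagnostics
r_eff <- relative_eff(exp(log_lik_m1), chain_id = chain_ids)
# PSIS-LOO: returns elpd_loo, p_loo, looic and the Pareto k values
m160.loo <- loo(log_lik_m1, r_eff = r_eff, cores = 2)
# Summarise the Pareto k values by interval and flag observations above 0.7
pareto_k_table(m160.loo)
pareto_k_ids(m160.loo, threshold = 0.7)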
1. Subset with next visit at the feeder less than or equal to 60 s (immediate replacement)
1.1. M1 mixed model
print(m160.loo,plot_k = T)
##
## Computed from 30000 by 58255 log-likelihood matrix
##
## Estimate SE
## elpd_loo -192339.2 236.7
## p_loo 181.7 2.0
## looic 384678.5 473.4
## ------
## Monte Carlo SE of elpd_loo is 0.1.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

1.2. M2 mixed model
print(m260.loo,plot_k = T)
##
## Computed from 30000 by 58255 log-likelihood matrix
##
## Estimate SE
## elpd_loo -191608.4 240.3
## p_loo 301.2 3.2
## looic 383216.8 480.6
## ------
## Monte Carlo SE of elpd_loo is 0.1.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

1.3. M3 mixed model
print(m360.loo,plot_k = T)
##
## Computed from 30000 by 58255 log-likelihood matrix
##
## Estimate SE
## elpd_loo -191608.7 240.3
## p_loo 301.8 3.2
## looic 383217.5 480.6
## ------
## Monte Carlo SE of elpd_loo is 0.1.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

Diagnostics for PSIS
a<-rbind(mcse_loo(m160.loo),
mcse_loo(m260.loo),
mcse_loo(m360.loo))
rownames(a)<-c("M1", "M2", "M3")
colnames(a)<-"MCSE"
kable(a,caption = "Monte Carlo Standar Error (MCSE) by each model", align = "l")%>%
kable_styling(bootstrap_options = c("striped"),full_width = F, position = "left",font_size = 14)%>%
column_spec(1, bold = T)
Monte Carlo Standard Error (MCSE) by each model

|    | MCSE      |
|----|-----------|
| M1 | 0.0779100 |
| M2 | 0.0915936 |
| M3 | 0.0914269 |
b<-round(rbind(summary(pareto_k_values(m160.loo)),
summary(pareto_k_values(m260.loo)),
summary(pareto_k_values(m360.loo))),4)
rownames(b)<-c("M1", "M2", "M3")
kable(b,caption = "Distribution of Pareto k estimates by each model")%>%
kable_styling(bootstrap_options = c("striped"),full_width = F, position = "left",font_size = 14)%>%
column_spec(1, bold = T)
Distribution of Pareto k estimates by each model

|    | Min.    | 1st Qu. | Median  | Mean    | 3rd Qu. | Max.   |
|----|---------|---------|---------|---------|---------|--------|
| M1 | -0.2336 | -0.0740 | -0.0424 | -0.0418 | -0.0106 | 0.2020 |
| M2 | -0.2773 | -0.0616 | -0.0287 | -0.0284 | 0.0043  | 0.2475 |
| M3 | -0.2491 | -0.0631 | -0.0303 | -0.0298 | 0.0033  | 0.2404 |
a<-round(rbind(summary(psis_n_eff_values(m160.loo)),
summary(psis_n_eff_values(m260.loo)),
summary(psis_n_eff_values(m360.loo))),3)
rownames(a)<-c("M1", "M2", "M3")
kable(a,caption = "Distribution of estimates PSIS effective sample size by each model")%>%
kable_styling(bootstrap_options = c("striped"),full_width = F, position = "left",font_size = 14)%>%
column_spec(1, bold = T)
Distribution of PSIS effective sample size estimates by each model

|    | Min.     | 1st Qu.  | Median   | Mean     | 3rd Qu.  | Max.     |
|----|----------|----------|----------|----------|----------|----------|
| M1 | 12682.88 | 29900.85 | 30477.60 | 30230.40 | 30991.88 | 33600.03 |
| M2 | 12002.38 | 34674.26 | 36590.02 | 36069.05 | 38330.12 | 50839.56 |
| M3 | 11702.08 | 35264.35 | 37078.56 | 36543.50 | 38847.88 | 51909.62 |
1.4. Model comparison
Compare the fitted models based on the expected log pointwise predictive density (\(\widehat{elpd}_{loo}\)) for new data. The matrix contains:
- \(elpd_{diff}\): the difference in elpd between two models; if more than two models are compared, the difference is computed relative to the model with the highest elpd
- \(se_{diff}\): the standard error of the difference
a<-loo_compare(m160.loo, m260.loo, m360.loo)[,1:8]
kable(a,caption = "Model comparison LOO-CV subset Immediate replacement")%>%
kable_styling(bootstrap_options = c("striped"),full_width = F, position = "left",font_size = 14)%>%
column_spec(1, bold = T)%>%
column_spec(6, background = "#AAEEAA")
Model comparison LOO-CV subset Immediate replacement

|        | elpd_diff    | se_diff   | elpd_loo  | se_elpd_loo | p_loo    | se_p_loo | looic    | se_looic |
|--------|--------------|-----------|-----------|-------------|----------|----------|----------|----------|
| model2 | 0.0000000    | 0.000000  | -191608.4 | 240.3140    | 301.2128 | 3.238398 | 383216.8 | 480.6281 |
| model3 | -0.3478399   | 0.522286  | -191608.7 | 240.2925    | 301.7511 | 3.243406 | 383217.5 | 480.5850 |
| model1 | -730.8295297 | 40.992374 | -192339.2 | 236.7156    | 181.7391 | 1.994128 | 384678.5 | 473.4312 |
2. Subset with next visit at the feeder greater than or equal to 600 s (distant replacement)
2.1. M1 mixed model
print(m1600.loo,plot_k = T)
##
## Computed from 30000 by 6258 log-likelihood matrix
##
## Estimate SE
## elpd_loo -21070.4 210.9
## p_loo 178.8 19.8
## looic 42140.9 421.9
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 6256 100.0% 5993
## (0.5, 0.7] (ok) 0 0.0% <NA>
## (0.7, 1] (bad) 0 0.0% <NA>
## (1, Inf) (very bad) 2 0.0% 11
## See help('pareto-k-diagnostic') for details.

2.2. M2 mixed model
print(m2600.loo,plot_k = T)
##
## Computed from 30000 by 6258 log-likelihood matrix
##
## Estimate SE
## elpd_loo -21071.2 210.7
## p_loo 185.5 19.6
## looic 42142.3 421.4
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 6256 100.0% 4607
## (0.5, 0.7] (ok) 0 0.0% <NA>
## (0.7, 1] (bad) 0 0.0% <NA>
## (1, Inf) (very bad) 2 0.0% 8
## See help('pareto-k-diagnostic') for details.

2.3. M3 mixed model
print(m3600.loo,plot_k = T)
##
## Computed from 30000 by 6258 log-likelihood matrix
##
## Estimate SE
## elpd_loo -21071.7 210.5
## p_loo 184.9 19.5
## looic 42143.4 421.1
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 6256 100.0% 5404
## (0.5, 0.7] (ok) 0 0.0% <NA>
## (0.7, 1] (bad) 0 0.0% <NA>
## (1, Inf) (very bad) 2 0.0% 9
## See help('pareto-k-diagnostic') for details.

Diagnostics for PSIS
a<-rbind(mcse_loo(m1600.loo),
mcse_loo(m2600.loo),
mcse_loo(m3600.loo))
rownames(a)<-c("M1", "M2", "M3")
colnames(a)<-"MCSE"
kable(a,caption = "Monte Carlo Standar Error (MCSE) by each model", align = "l")%>%
kable_styling(bootstrap_options = c("striped"),full_width = F, position = "left",font_size = 14)%>%
column_spec(1, bold = T)
Monte Carlo Standard Error (MCSE) by each model

|    | MCSE |
|----|------|
| M1 | NA   |
| M2 | NA   |
| M3 | NA   |
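The Monte Carlo SE is NA for all three models in this subset because two observations per model fall in the (1, Inf) Pareto k range (see the print() output above). A sketch of how those observations could be located for further checking:

# Indices of the observations whose Pareto k estimate is at least 1
which(pareto_k_values(m1600.loo) >= 1)
# Equivalent check with the loo helper (observations with k above the threshold)
pareto_k_ids(m1600.loo, threshold = 1)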
b<-round(rbind(summary(pareto_k_values(m1600.loo)),
summary(pareto_k_values(m2600.loo)),
summary(pareto_k_values(m3600.loo))),4)
rownames(b)<-c("M1", "M2", "M3")
kable(b,caption = "Distribution of Pareto k estimates by each model")%>%
kable_styling(bootstrap_options = c("striped"),full_width = F, position = "left",font_size = 14)%>%
column_spec(1, bold = T)
Distribution of Pareto k estimates by each model

|    | Min.    | 1st Qu. | Median | Mean   | 3rd Qu. | Max.   |
|----|---------|---------|--------|--------|---------|--------|
| M1 | -0.1812 | -0.0146 | 0.0204 | 0.0241 | 0.0587  | 1.2717 |
| M2 | -0.1768 | -0.0060 | 0.0295 | 0.0316 | 0.0650  | 1.1792 |
| M3 | -0.1668 | -0.0112 | 0.0247 | 0.0267 | 0.0617  | 1.1056 |
a<-round(rbind(summary(psis_n_eff_values(m1600.loo)),
summary(psis_n_eff_values(m2600.loo)),
summary(psis_n_eff_values(m3600.loo))),3)
rownames(a)<-c("M1", "M2", "M3")
kable(a,caption = "Distribution of estimates PSIS effective sample size by each model")%>%
kable_styling(bootstrap_options = c("striped"),full_width = F, position = "left",font_size = 14)%>%
column_spec(1, bold = T)
Distribution of PSIS effective sample size estimates by each model

|    | Min.   | 1st Qu.  | Median   | Mean     | 3rd Qu.  | Max.     |
|----|--------|----------|----------|----------|----------|----------|
| M1 | 11.489 | 30141.86 | 34704.61 | 32672.80 | 36968.79 | 42768.85 |
| M2 | 7.632  | 26406.44 | 36104.39 | 33146.32 | 40470.06 | 48204.56 |
| M3 | 8.527  | 23860.77 | 31286.63 | 29637.65 | 35710.26 | 43795.46 |
2.4. Model comparison
Compare the fitted models based on the expected log pointwise predictive density (\(\widehat{elpd}_{loo}\)) for new data. The matrix contains:
- \(elpd_{diff}\): the difference in elpd between two models; if more than two models are compared, the difference is computed relative to the model with the highest elpd
- \(se_{diff}\): the standard error of the difference
a<-loo_compare(m1600.loo, m2600.loo, m3600.loo)[,1:8]
kable(a,caption = "Model comparison LOO-CV subset distant replacement")%>%
kable_styling(bootstrap_options = c("striped"),full_width = F, position = "left",font_size = 14)%>%
column_spec(1, bold = T)%>%
column_spec(6, background = "#AAA0CB")
Model comparison LOO-CV subset distant replacement

|        | elpd_diff  | se_diff   | elpd_loo  | se_elpd_loo | p_loo    | se_p_loo | looic    | se_looic |
|--------|------------|-----------|-----------|-------------|----------|----------|----------|----------|
| model1 | 0.0000000  | 0.0000000 | -21070.43 | 210.9289    | 178.8201 | 19.78814 | 42140.85 | 421.8579 |
| model2 | -0.7394514 | 0.9919643 | -21071.17 | 210.6864    | 185.5074 | 19.61010 | 42142.33 | 421.3728 |
| model3 | -1.2583838 | 0.8014598 | -21071.68 | 210.5318    | 184.9381 | 19.48035 | 42143.37 | 421.0635 |