This is a demo of two new functions I added to bayesplot this week.
Let’s fit a bad model. Bad because we draw far too few MCMC samples from the posterior distribution (iter is set very low).
library(rstanarm)
#> Loading required package: Rcpp
#> Warning: package 'Rcpp' was built under R version 3.4.1
#> rstanarm (Version 2.15.3, packaged: 2017-04-29 06:18:44 UTC)
#> - Do not expect the default priors to remain the same in future rstanarm versions.
#> Thus, R scripts should specify priors explicitly, even if they are just the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#> options(mc.cores = parallel::detectCores())
m <- stan_lm(mpg ~ ., data = mtcars,
             prior = R2(.8), iter = 100)
#> trying deprecated constructor; please alert package maintainer
#>
#> SAMPLING FOR MODEL 'lm' NOW (CHAIN 1).
#>
#> Gradient evaluation took 0 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> WARNING: There aren't enough warmup iterations to fit the
#> three stages of adaptation as currently configured.
#> Reducing each adaptation stage to 15%/75%/10% of
#> the given number of warmup iterations:
#> init_buffer = 7
#> adapt_window = 38
#> term_buffer = 5
#>
#> Iteration: 1 / 100 [ 1%] (Warmup)
#> Iteration: 10 / 100 [ 10%] (Warmup)
#> Iteration: 20 / 100 [ 20%] (Warmup)
#> Iteration: 30 / 100 [ 30%] (Warmup)
#> Iteration: 40 / 100 [ 40%] (Warmup)
#> Iteration: 50 / 100 [ 50%] (Warmup)
#> Iteration: 51 / 100 [ 51%] (Sampling)
#> Iteration: 60 / 100 [ 60%] (Sampling)
#> Iteration: 70 / 100 [ 70%] (Sampling)
#> Iteration: 80 / 100 [ 80%] (Sampling)
#> Iteration: 90 / 100 [ 90%] (Sampling)
#> Iteration: 100 / 100 [100%] (Sampling)
#>
#> Elapsed Time: 0.258 seconds (Warm-up)
#> 0.267 seconds (Sampling)
#> 0.525 seconds (Total)
#>
#>
#> SAMPLING FOR MODEL 'lm' NOW (CHAIN 2).
#>
#> Gradient evaluation took 0 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> WARNING: There aren't enough warmup iterations to fit the
#> three stages of adaptation as currently configured.
#> Reducing each adaptation stage to 15%/75%/10% of
#> the given number of warmup iterations:
#> init_buffer = 7
#> adapt_window = 38
#> term_buffer = 5
#>
#> Iteration: 1 / 100 [ 1%] (Warmup)
#> Iteration: 10 / 100 [ 10%] (Warmup)
#> Iteration: 20 / 100 [ 20%] (Warmup)
#> Iteration: 30 / 100 [ 30%] (Warmup)
#> Iteration: 40 / 100 [ 40%] (Warmup)
#> Iteration: 50 / 100 [ 50%] (Warmup)
#> Iteration: 51 / 100 [ 51%] (Sampling)
#> Iteration: 60 / 100 [ 60%] (Sampling)
#> Iteration: 70 / 100 [ 70%] (Sampling)
#> Iteration: 80 / 100 [ 80%] (Sampling)
#> Iteration: 90 / 100 [ 90%] (Sampling)
#> Iteration: 100 / 100 [100%] (Sampling)
#>
#> Elapsed Time: 0.197 seconds (Warm-up)
#> 0.278 seconds (Sampling)
#> 0.475 seconds (Total)
#>
#>
#> SAMPLING FOR MODEL 'lm' NOW (CHAIN 3).
#>
#> Gradient evaluation took 0 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> WARNING: There aren't enough warmup iterations to fit the
#> three stages of adaptation as currently configured.
#> Reducing each adaptation stage to 15%/75%/10% of
#> the given number of warmup iterations:
#> init_buffer = 7
#> adapt_window = 38
#> term_buffer = 5
#>
#> Iteration: 1 / 100 [ 1%] (Warmup)
#> Iteration: 10 / 100 [ 10%] (Warmup)
#> Iteration: 20 / 100 [ 20%] (Warmup)
#> Iteration: 30 / 100 [ 30%] (Warmup)
#> Iteration: 40 / 100 [ 40%] (Warmup)
#> Iteration: 50 / 100 [ 50%] (Warmup)
#> Iteration: 51 / 100 [ 51%] (Sampling)
#> Iteration: 60 / 100 [ 60%] (Sampling)
#> Iteration: 70 / 100 [ 70%] (Sampling)
#> Iteration: 80 / 100 [ 80%] (Sampling)
#> Iteration: 90 / 100 [ 90%] (Sampling)
#> Iteration: 100 / 100 [100%] (Sampling)
#>
#> Elapsed Time: 0.366 seconds (Warm-up)
#> 0.32 seconds (Sampling)
#> 0.686 seconds (Total)
#>
#>
#> SAMPLING FOR MODEL 'lm' NOW (CHAIN 4).
#>
#> Gradient evaluation took 0 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> WARNING: There aren't enough warmup iterations to fit the
#> three stages of adaptation as currently configured.
#> Reducing each adaptation stage to 15%/75%/10% of
#> the given number of warmup iterations:
#> init_buffer = 7
#> adapt_window = 38
#> term_buffer = 5
#>
#> Iteration: 1 / 100 [ 1%] (Warmup)
#> Iteration: 10 / 100 [ 10%] (Warmup)
#> Iteration: 20 / 100 [ 20%] (Warmup)
#> Iteration: 30 / 100 [ 30%] (Warmup)
#> Iteration: 40 / 100 [ 40%] (Warmup)
#> Iteration: 50 / 100 [ 50%] (Warmup)
#> Iteration: 51 / 100 [ 51%] (Sampling)
#> Iteration: 60 / 100 [ 60%] (Sampling)
#> Iteration: 70 / 100 [ 70%] (Sampling)
#> Iteration: 80 / 100 [ 80%] (Sampling)
#> Iteration: 90 / 100 [ 90%] (Sampling)
#> Iteration: 100 / 100 [100%] (Sampling)
#>
#> Elapsed Time: 0.535 seconds (Warm-up)
#> 0.328 seconds (Sampling)
#> 0.863 seconds (Total)
#> Warning: Markov chains did not converge! Do not analyze results!
bayesplot provides some helper functions for quickly scanning model results using the Rhat diagnostic and the ratio-of-effective-samples diagnostic.
library(bayesplot)
#> This is bayesplot version 1.3.0.9000
#> - Plotting theme set to bayesplot::theme_default()
#> - Online documentation at mc-stan.org/bayesplot
# Rhat
mcmc_rhat(rhat(m))
# Ratio of effective sample size to total post-warmup draws
# (4 chains x 50 post-warmup iterations = 200 draws)
neff_ratio <- m$stan_summary[, "n_eff"] / 200
mcmc_neff(neff_ratio)
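The manual division above shouldn’t be strictly necessary: bayesplot also ships extractor generics for these quantities (rhat() is already being used that way above). A hedged sketch, assuming this bayesplot/rstanarm version pair registers a neff_ratio() method for stanreg objects:
# Alternative (assumed API): same ratios via the extractor generic
ratios <- bayesplot::neff_ratio(m)
mcmc_neff(ratios)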
The mcmc_rhat() and mcmc_neff() plotting functions do a bit of work to transform the data before plotting. I have been working to make this cleaned-up, plotting-ready data available to users by adding *_data() functions to the package. These functions return the data that the corresponding plotting function would draw.
mcmc_rhat_data(rhat(m))
#> # A tibble: 14 x 5
#> diagnostic parameter value rating description
#> <chr> <fctr> <dbl> <fctr> <chr>
#> 1 rhat disp 0.9861645 low hat(R) <= 1.05
#> 2 rhat wt 0.9870194 low hat(R) <= 1.05
#> 3 rhat vs 1.0000586 low hat(R) <= 1.05
#> 4 rhat am 1.0004939 low hat(R) <= 1.05
#> 5 rhat log-fit_ratio 1.0028728 low hat(R) <= 1.05
#> 6 rhat R2 1.0034469 low hat(R) <= 1.05
#> 7 rhat carb 1.0089427 low hat(R) <= 1.05
#> 8 rhat hp 1.0092232 low hat(R) <= 1.05
#> 9 rhat sigma 1.0109818 low hat(R) <= 1.05
#> 10 rhat gear 1.0936643 ok hat(R) <= 1.1
#> 11 rhat drat 1.1321676 high hat(R) > 1.1
#> 12 rhat cyl 1.2516668 high hat(R) > 1.1
#> 13 rhat qsec 1.3432087 high hat(R) > 1.1
#> 14 rhat (Intercept) 1.7675339 high hat(R) > 1.1
mcmc_neff_data(neff_ratio)
#> # A tibble: 16 x 5
#> diagnostic parameter value rating description
#> <chr> <fctr> <dbl> <fctr> <chr>
#> 1 neff (Intercept) 0.05536171 low N[eff]/N <= 0.1
#> 2 neff qsec 0.08904807 low N[eff]/N <= 0.1
#> 3 neff cyl 0.09094522 low N[eff]/N <= 0.1
#> 4 neff log-posterior 0.16647326 ok N[eff]/N <= 0.5
#> 5 neff drat 0.17204797 ok N[eff]/N <= 0.5
#> 6 neff gear 0.25459439 ok N[eff]/N <= 0.5
#> 7 neff R2 0.40085357 ok N[eff]/N <= 0.5
#> 8 neff sigma 0.42027229 ok N[eff]/N <= 0.5
#> 9 neff mean_PPD 0.95593729 high N[eff]/N > 0.5
#> 10 neff disp 1.00000000 high N[eff]/N > 0.5
#> 11 neff hp 1.00000000 high N[eff]/N > 0.5
#> 12 neff wt 1.00000000 high N[eff]/N > 0.5
#> 13 neff vs 1.00000000 high N[eff]/N > 0.5
#> 14 neff am 1.00000000 high N[eff]/N > 0.5
#> 15 neff carb 1.00000000 high N[eff]/N > 0.5
#> 16 neff log-fit_ratio 1.00000000 high N[eff]/N > 0.5
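Because these are ordinary tibbles, they are easy to reuse downstream. As a small sketch (assuming dplyr is installed; it is not loaded anywhere above), we could pull out just the parameters flagged as problematic:
# Illustrative use of the tidy diagnostic data: keep only the "high" ratings
dplyr::filter(mcmc_rhat_data(rhat(m)), rating == "high")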
With these data frames in hand, we can also manually recreate these kinds of plots with ggplot2. For example, here is a hand-rolled version of the Rhat plot.
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 3.4.1
theme_set(theme_grey())
df <- mcmc_rhat_data(rhat(m))

ggplot(df) +
  aes(x = value, y = parameter, color = rating) +
  # lollipop segments anchored at Rhat = 1
  geom_segment(aes(yend = parameter), xend = 1) +
  geom_point() +
  # reuse the stored descriptions (e.g., "hat(R) <= 1.05") as legend labels
  scale_color_discrete(
    breaks = unique(df$rating),
    labels = parse(text = as.character(unique(df$description)))) +
  theme(legend.position = "bottom") +
  labs(color = NULL, x = expression(hat(R)))
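The same approach works for the effective-sample-size plot. A quick sketch (the df_neff name is just illustrative), reusing the neff_ratio vector and the columns shown by mcmc_neff_data() above:
df_neff <- mcmc_neff_data(neff_ratio)

ggplot(df_neff) +
  aes(x = value, y = parameter, color = rating) +
  # segments run from 0 up to each parameter's Neff/N ratio
  geom_segment(aes(yend = parameter), xend = 0) +
  geom_point() +
  theme(legend.position = "bottom") +
  labs(color = NULL, x = expression(N[eff] / N))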