Introduction
This vignette describes how to use the tidybayes package to extract tidy data frames of draws from brms models: draws from the posterior distributions of model variables, fits, and predictions from brms::brm(). For a more general introduction to tidybayes, see vignette("tidybayes").
The following packages are required to run this vignette:
library(magrittr)
library(dplyr)
library(purrr)
library(forcats)
library(tidyr)
library(modelr)
library(ggdist)
library(tidybayes)
library(ggplot2)
library(cowplot)
library(rstan)
library(brms)
library(ggrepel)
library(RColorBrewer)
library(gganimate)
theme_set(theme_tidybayes() + panel_border())
These options help Stan run faster:
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
Extracting distributional regression parameters
brms::brm() also lets us set up submodels for parameters of the response distribution other than the location (e.g., the mean). For example, we can allow a variance parameter, such as the standard deviation, to also be a function of the predictors.
This approach can help in cases of non-constant variance (also known as heteroskedasticity). For example, imagine two groups, each with a different mean response and a different variance:
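As a rough sketch of what such a submodel means for a Gaussian response with a single predictor, the model can be written as below. The symbols $y_i$, $x_i$, $\beta$, and $\gamma$ are notation assumed here for illustration and do not come from the original text. The log link on $\sigma_i$ is the brms default for the Gaussian family; it keeps the standard deviation positive.

```latex
\begin{aligned}
y_i &\sim \mathrm{Normal}(\mu_i, \sigma_i) \\
\mu_i &= \beta_0 + \beta_1 x_i \\
\log \sigma_i &= \gamma_0 + \gamma_1 x_i
\end{aligned}
```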
We simulate such a dataset:
set.seed(1234)
AB = tibble(
  group = rep(c("a", "b"), each = 20),
  response = rnorm(40, mean = rep(c(1, 5), each = 20), sd = rep(c(1, 3), each = 20))
)
To inspect the first ten rows of the dataset:
head(AB, 10)
## # A tibble: 10 x 2
##    group response
##    <chr>    <dbl>
##  1 a       -0.207
##  2 a        1.28
##  3 a        2.08
##  4 a       -1.35
##  5 a        1.43
##  6 a        1.51
##  7 a        0.425
##  8 a        0.453
##  9 a        0.436
## 10 a        0.110
We can also plot the data:
AB %>%
  ggplot(aes(x = response, y = group)) +
  geom_point()
Here is a model that lets both the mean and the standard deviation of response depend on group:
m_ab = brm(
  bf(
    response ~ group,  # submodel for the mean (mu)
    sigma ~ group      # submodel for the standard deviation (sigma)
  ),
  data = AB
)
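To make the structure of this model concrete without running Stan, here is a minimal base R sketch that fits the same mean and standard deviation structure by maximum likelihood using optim(). This is a swapped-in technique, not what brms does: it ignores priors and returns point estimates rather than posterior draws. The names is_b, neg_loglik, mu_hat, and sigma_hat are illustrative assumptions, and the log link on sigma mirrors the brms default for Gaussian models.

```r
# Recreate the simulated data from the text using base R only.
set.seed(1234)
AB <- data.frame(
  group = rep(c("a", "b"), each = 20),
  response = rnorm(40, mean = rep(c(1, 5), each = 20), sd = rep(c(1, 3), each = 20))
)
is_b <- as.numeric(AB$group == "b")  # indicator for group b

# Negative log-likelihood of the model:
#   mu         = par[1] + par[2] * is_b
#   log(sigma) = par[3] + par[4] * is_b
neg_loglik <- function(par) {
  mu    <- par[1] + par[2] * is_b
  sigma <- exp(par[3] + par[4] * is_b)
  -sum(dnorm(AB$response, mean = mu, sd = sigma, log = TRUE))
}

# Start from the pooled mean and pooled standard deviation.
start <- c(mean(AB$response), 0, log(sd(AB$response)), 0)
fit <- optim(start, neg_loglik, method = "BFGS")

# Back-transform to per-group means and standard deviations.
mu_hat    <- c(a = fit$par[1], b = fit$par[1] + fit$par[2])
sigma_hat <- c(a = exp(fit$par[3]), b = exp(fit$par[3] + fit$par[4]))

print(round(mu_hat, 2))
print(round(sigma_hat, 2))
```

Because each group gets its own mean and its own standard deviation, these estimates should agree with the per-group sample means and the per-group maximum likelihood standard deviations. The Bayesian fit from brm() additionally provides full posterior distributions for these quantities.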
We can plot the posterior distribution of the mean response alongside posterior predictive intervals and the data:
grid = AB %>%
  data_grid(group)
fits = grid %>%
  add_fitted_draws(m_ab)
preds = grid %>%
  add_predicted_draws(m_ab)
AB %>%
  ggplot(aes(x = response, y = group)) +
  stat_halfeye(aes(x = .value), scale = 0.6, position = position_nudge(y = 0.175), data = fits) +
  stat_interval(aes(x = .prediction), data = preds) +
  geom_point(data = AB) +
  scale_color_brewer()
This shows the posterior of the mean of each group (black intervals and density plots) and the posterior predictive intervals (blue).
The predictive intervals in group b are wider than in group a because the model fits a different standard deviation for each group. We can see how the corresponding distributional parameter, sigma, changes by extracting it with the dpar argument to add_fitted_draws():
grid %>%
  add_fitted_draws(m_ab, dpar = TRUE) %>%
  ggplot(aes(x = sigma, y = group)) +
  stat_halfeye() +
  geom_vline(xintercept = 0, linetype = "dashed")
Setting dpar = TRUE adds all distributional parameters as additional columns in the result of add_fitted_draws(). If you only want a specific parameter, you can name it (or pass a list of just the parameters you want). In the model above, dpar = TRUE is equivalent to dpar = list("mu", "sigma").
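As a small sketch of requesting a single distributional parameter, the helper below wraps add_fitted_draws() with dpar = "sigma". The helper name fitted_with_sigma is hypothetical and used only for illustration. It assumes a fitted brms model and a data grid such as m_ab and grid from the text. In newer tidybayes releases (3.0 and later), add_fitted_draws() is deprecated in favor of add_epred_draws() and add_linpred_draws(), so the call may emit a deprecation warning there.

```r
# Hypothetical helper: fitted draws for each row of `newdata`, plus a
# `sigma` column holding draws of the standard deviation. The mean is
# still returned in `.value`; only sigma is added as an extra column.
fitted_with_sigma <- function(model, newdata) {
  tidybayes::add_fitted_draws(newdata, model, dpar = "sigma")
}

# Usage with the objects from the text (requires the fitted model m_ab):
# fitted_with_sigma(m_ab, grid)
```

Passing a named vector such as dpar = c(sigma_hat = "sigma") would return the same draws in a column named sigma_hat instead.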
References: