Modelo Poisson mixto con brms

JuveYell

1-Introducción

El paquete brms se ajusta a los modelos multinivel multivariados no-lineales generalizados Bayesianos usando ‘Stan’ para la inferencia bayesiana completa. Se admite una amplia gama de distribuciones y funciones de enlace. Todos los parámetros de la distribución de la respuesta se pueden predecir para realizar una regresión distributiva. Las especificaciones previa son flexibles y alientan explícitamente a los usuarios a aplicar distribuciones anteriores que realmente reflejen sus creencias.

Para obterner más información, consulte la página web.

2-Problema

Vamos a simular ni=50 observaciones para G=10 grupos, es decir, en total 500 observaciones que sigan el siguiente modelo.

\[\begin{align*} y_{ij} &\sim Poisson(\lambda_{ij}) \\ \lambda_{ij} &= 4 - 6 x_{ij} + b_{0i} \\ b_{0} &\sim N(0, \sigma^2_{b0}=625) \\ x_{ij} &\sim U(-5, 6) \end{align*}\]

ni <- 50
G <- 10
nobs <- ni * G
grupo <- factor(rep(x=1:G, each=ni))
obs <- rep(x=1:ni, times=G)
set.seed(1234567)
x <- runif(n=nobs, min=0, max=1)
b0 <- rnorm(n=G, mean=0, sd=sqrt(17)) # Intercepto aleatorio
b0 <- rep(x=b0, each=ni)  # El mismo intercepto aleatorio pero repetido
re_slope<-rnorm(n=G, sd=sqrt(4))
re_slope <- rep(x=re_slope, each=ni)  # La misma pendiente aleatoria pero repetida
media <- exp( 4 + (-6+re_slope) *  x + b0)
set.seed(123456)
y <- rpois(n=obs, lambda = media)
datos <- data.frame(grupo, obs, b0, x, media, y)

A continuación se dibujan los datos simulados.

library(ggplot2)
ggplot(datos, aes(x, y, color=grupo) ) + 
  geom_point() + 
  labs(colour="Grupo/Cluster")

3-Ajustando el modelo

## Warning: package 'brms' was built under R version 4.0.5

## Loading required package: Rcpp

## Loading 'brms' package (version 2.15.0). Useful instructions
## can be found by typing help('brms'). A more detailed introduction
## to the package is available through vignette('brms_overview').

## 
## Attaching package: 'brms'

## The following object is masked from 'package:stats':
## 
##     ar

## Compiling Stan program...

## Start sampling

## 
## SAMPLING FOR MODEL '1835894ba820570efdbdafcbd1cd6547' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 66.256 seconds (Warm-up)
## Chain 1:                80.061 seconds (Sampling)
## Chain 1:                146.317 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL '1835894ba820570efdbdafcbd1cd6547' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 0.001 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 10 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 64.348 seconds (Warm-up)
## Chain 2:                88.297 seconds (Sampling)
## Chain 2:                152.645 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL '1835894ba820570efdbdafcbd1cd6547' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 0 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 67.382 seconds (Warm-up)
## Chain 3:                80.667 seconds (Sampling)
## Chain 3:                148.049 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL '1835894ba820570efdbdafcbd1cd6547' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 0 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 67.639 seconds (Warm-up)
## Chain 4:                88.161 seconds (Sampling)
## Chain 4:                155.8 seconds (Total)
## Chain 4:

## Warning: There were 1 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.

## Warning: There were 3902 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
## http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded

## Warning: Examine the pairs() plot to diagnose sampling problems

## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#bulk-ess

## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#tail-ess

4-Resultados

Una tabla de resumen se puede obtener así:

## Warning: There were 1 divergent transitions after warmup. Increasing adapt_delta
## above 0.8 may help. See http://mc-stan.org/misc/warnings.html#divergent-
## transitions-after-warmup

##  Family: poisson 
##   Links: mu = log 
## Formula: y ~ x + (1 | grupo) + (0 + x | grupo) 
##    Data: datos (Number of observations: 500) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Group-Level Effects: 
## ~grupo (Number of levels: 10) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     3.73      0.98     2.39     6.21 1.02      135      247
## sd(x)             2.97      0.94     1.77     5.34 1.01      256      392
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     4.23      1.05     2.30     6.50 1.05       97      362
## x            -5.53      1.03    -7.74    -3.69 1.02      120      288
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

5-Resultados gráficos

Para obtener las distribuciones aposterioris de los parámetros se usa el siguiente código: