Los datos son de “Skin Cancer Prevention Study”, un estudio aleatorizado de ensayos clínicos controlados para prevenir cáncer de piel no melanoma(células basales o células escamosas) en la piel de sujetos de alto riesgo. A los sujetos fueron asignados al azar un tratamiento placebo o 50 mg de betacaroteno por día durante 5 años. Los sujetos fueron examinados una vez al año y se les realizaba una biopsia si se sospechaba cáncer para determinar el número de nuevas células cencerosas en la piel que ocurrieron desde el último examen.
#Cargar paquetes
library(brms)
library(dplyr)
library(purrr)
library(tidyr)
library(tidybayes)
library(ggplot2)
library(cowplot)
library(rstan)
library(brms)
library(gganimate)
library(gifski)
theme_set(theme_tidybayes() + panel_border())
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
Modelar el número de nuevas células cancerosas en un paciente, además de observar la incedencia de diferentes variables tales como el género, la edad, el tratamiento, tipo de piel, y la exposición. Debido a que hay variables de agrupación, modelar dicha variable mediante modelos lineales jerárquicos puede generar buenos resultados.
Los datos completos tienen 1683 pacientes y un total de 7081 mediciones, en este caso se seleccionará aleatoriamente un subconjunto de los datos, los cuales tendrán en cuenta las mediciones de 100 pacientes.
skin_data <- read.csv2("skin.data.csv") #Lectura base de datos
set.seed(12345678)
pacientes <- sample(unique(skin_data$id),150) #Seleccionar solo 150 pacientes aleatoriamente
skin_data <- skin_data %>% filter(id %in% pacientes) %>%
mutate(tratamiento = ifelse(tratamiento == 0,"Placebo","Beta-carotene")) %>%
mutate(genero = as.factor(genero)) %>%
mutate(genero = as.factor(piel))
head(skin_data)
## id edad piel genero exposicion y tratamiento year
## 1 100067 53 1 1 3 0 Placebo 1
## 2 100067 53 1 1 3 0 Placebo 2
## 3 100067 53 1 1 3 0 Placebo 3
## 4 100067 53 1 1 3 0 Placebo 4
## 5 100067 53 1 1 3 0 Placebo 5
## 6 100214 64 1 1 2 0 Placebo 1
Es importante observar el comportamiento de la variable respuesta para plantear un modelo adecuado, en este caso la variable es de conteos por lo cual deben utilizarse distribuciones que puedan modelar dicho comportamiento.
skin_data %>%
ggplot(aes(y)) +
geom_bar(aes(y = (..count..)/sum(..count..)),fill = "salmon") +
ylab("Frecuencia")+
theme_bw()
Se observa que el número de nuevas células cancerosas en la piel que aparece con más frecuencia es cero células, siendo más del 80% de los datos, además tiene un comportamiento decreciente.
ggplot(skin_data, aes(x=edad, y=y)) +
geom_point() +
theme_bw()
Para la variable edad no se presentan diferencias muy significativas en cuanto al número de células cancerosas observadas, parece haber un comportamiento parecido para cada edad, excepto para los mayores números de nuevas células cancerosas que como se vio anteriormente no representan una frecuenta alta y es aproximadamente 0, las obtienen pacientes de 57 y 60 años.
tabla <- with(skin_data, table(piel, y))
model::plot_prof(tabla, Row=T, cex=1, xlab="Tipo de piel")
Para esta variable es conveniente realizar comparaciones mediante un gráfico de perfiles, en este caso se observa un comportamiento parecido para ambos tipos de piel, por lo cual puede que no sea muy significativo en cuánto al número de nuevas células cancerosas en los pacientes. ### Variable genero
tabla <- with(skin_data, table(genero, y))
model::plot_prof(tabla, Row=T, cex=1, xlab='Genero')
Tal y como se procedió a analizar el tipo de piel, se realiza un gráfico de perfiles para analizar la variable género, ambos tienen un comportamiento muy similar.
ggplot(skin_data, aes(x=exposicion, y=y)) +
geom_point() +
theme_bw()
La exposición,es decir, el recuento del número de nuevas células cancerosas en la piel está relacionado en cierta medida con las células nuevas que aparecen en los pacientes, quienes han presentado mayor número de celulas cancerosas presentan de 5 a 12 células cancerosas nuevas cada año.
tabla <- with(skin_data, table(tratamiento, y))
model::plot_prof(tabla, Row=T, cex=1, xlab='Tratamiento')
En este caso el número de células cancerosas nuevas se comporta de igual manera para quienes reciben placebo o beta-caroteno.
Como se tiene un total de 150 pacientes únicamente se graficarán el número de células cancerosas observadas en cada año discriminadas por tratamiento para los pacientes con id = 418191,107355,103755,404961,413502,106815
skin_data %>% filter(id %in% c(418191,107355,103755,404961,413502,106815)) %>%
ggplot(aes(x = year, y = y, color = tratamiento)) +
geom_point() +
theme_bw() +
facet_wrap(~ id)
Se puede observar que el comportamiento en cada paciente es diferente, por lo cual el uso de modelos jerárquicos tomando como variable de agrupación el id del paciente puede dar mejores resultados que un modelo de regresión lineal.
Se considerarán modelos lineales generalizados y modelos lineales jerárquicos generalizados en ambos casos utilizando distribuciones que puedan modelar conteos, como en la variable respuesta se observó que el número de nuevas células cancerosas que tuvo más frecuencia fue el cero se considerarán las distribuciones Poisson cero-inflado (ZIP) y Binomital negativa cero-inflado(ZINB)
Del análisis descriptivo las covariables que podrían ser significativas en los modelos es la exposición, por la naturaleza del problema también se incluirá la covariable tratamiento, además para los modelos jerárquicos se tendrá en cuenta como variable de agrupación el id del paciente.
\[y_{ij} \sim ZIP(\mu_{ij},\sigma)\] \[log(\mu_{ij}) = \beta_0 + \beta_1Exposición_{ij}+\beta_2 Tratamiento_{placeboi}\]
\[y_{ij} \sim ZIP(\mu_{ij},\sigma)\] \[log(\mu_{ij}) = \beta_0 + \beta_1Exposición_{ij}+\beta_2 Tratamiento_{placeboi} + b_{0i}\] \[b_{0} \sim N(0,\sigma_{b0}^2)\]
\[y_{ij} \sim ZIP(\mu_{ij},\sigma)\] \[log(\mu_{ij}) = \beta_0 + \beta_1Exposición_{ij}+\beta_2 Tratamiento_{placeboi} + b_{0i}+b_{1i}Exposición_i\] \[\begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \sim N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},\begin{pmatrix} \sigma^2_{b0} & \sigma_{b01} \\ \sigma_{b01} & \sigma^2_{b1} \end{pmatrix} \right)\]
\[y_{ij} \sim ZINB(\mu_{ij},\sigma,\nu)\] \[log(\mu_{ij}) = \beta_0 + \beta_1Exposición_{ij}+\beta_2 Tratamiento_{placeboi}\]
\[y_{ij} \sim ZINB(\mu_{ij},\sigma,\nu)\] \[log(\mu_{ij}) = \beta_0 + \beta_1Exposición_i+\beta_2 Tratamiento_{placeboi} + b_{0i}\] \[b_{0} \sim N(0,\sigma_{b0}^2)\]
\[y_{ij} \sim ZINB(\mu_{ij},\sigma,\nu)\] \[log(\mu_{ij}) = \beta_0 + \beta_1Exposición_{ij}+\beta_2 Tratamiento_{placeboi} + b_{0i}+b_{1i}Exposición_i\] \[\begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \sim N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},\begin{pmatrix} \sigma^2_{b0} & \sigma_{b01} \\ \sigma_{b01} & \sigma^2_{b1} \end{pmatrix} \right)\]
Para la construcción de los modelos se utilizará la función brm del paquete brms
#Modelo 4.0.1
mod1 <- brm(y ~ exposicion + tratamiento, data = skin_data, family = zero_inflated_poisson(),cores = 4, seed = 12345678)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -D_REENTRANT -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -include stan/math/prim/mat/fun/Eigen.hpp -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
summary(mod1)
## Family: zero_inflated_poisson
## Links: mu = log; zi = identity
## Formula: y ~ exposicion + tratamiento
## Data: skin_data (Number of observations: 649)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -0.98 0.22 -1.39 -0.55 1.00 1306
## exposicion 0.12 0.01 0.09 0.15 1.00 1522
## tratamientoPlacebo -0.66 0.18 -1.01 -0.32 1.00 2296
## Tail_ESS
## Intercept 1908
## exposicion 2075
## tratamientoPlacebo 2071
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## zi 0.55 0.07 0.41 0.68 1.00 1411 2313
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Modelo lineal ZIP
\[y_{ij} \sim ZIP(\mu_{ij},\sigma)\] \[\mu_{ij} = exp(-0.56 + 0.11Exposición_i-0.71Tratamiento_{placeboi})\] \[\sigma = 0.60\]
Modelo lineal ZIP con intercepto aleatorio
#Modelo 4.0.2
mod2 <- brm(y ~ exposicion + tratamiento + (1|id), data = skin_data, family = zero_inflated_poisson(),cores = 4, seed = 12345678)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -D_REENTRANT -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -include stan/math/prim/mat/fun/Eigen.hpp -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
summary(mod2)
## Family: zero_inflated_poisson
## Links: mu = log; zi = identity
## Formula: y ~ exposicion + tratamiento + (1 | id)
## Data: skin_data (Number of observations: 649)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Group-Level Effects:
## ~id (Number of levels: 150)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.14 0.18 0.81 1.53 1.00 885 1784
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -2.28 0.31 -2.91 -1.68 1.00 1629
## exposicion 0.19 0.03 0.13 0.25 1.00 1355
## tratamientoPlacebo -0.59 0.31 -1.21 0.02 1.00 2406
## Tail_ESS
## Intercept 2016
## exposicion 2017
## tratamientoPlacebo 2933
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## zi 0.32 0.07 0.19 0.46 1.00 3166 2964
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
\[y_{ij} \sim ZIP(\mu_{ij},\sigma)\] \[\mu_{ij} = exp(-2.28 + 0.19Exposición_i-0.59 Tratamiento_{placeboi} + b_{0i})\] \[b_{0} \sim N(0,\sigma_{b0}^2=1.14^2 )\] \[\sigma = 0.32\]
Modelo lineal ZIP con intercepto y pendiente aleatoria
#Modelo 4.0.3
mod3 <- brm(y ~ exposicion + tratamiento + (1+exposicion|id), data = skin_data, family = zero_inflated_poisson(),cores = 4, seed = 12345678)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -D_REENTRANT -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -include stan/math/prim/mat/fun/Eigen.hpp -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
## [[1]]
## Stan model 'c810fb673b85bb84cc8e5c1c6c62cb5c' does not contain samples.
summary(mod3)
## Warning: There were 156 divergent transitions after warmup. Increasing
## adapt_delta above 0.8 may help. See http://mc-stan.org/misc/
## warnings.html#divergent-transitions-after-warmup
## Family: zero_inflated_poisson
## Links: mu = log; zi = identity
## Formula: y ~ exposicion + tratamiento + (1 + exposicion | id)
## Data: skin_data (Number of observations: 649)
## Samples: 3 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 3000
##
## Group-Level Effects:
## ~id (Number of levels: 150)
## Estimate Est.Error l-95% CI u-95% CI Rhat
## sd(Intercept) 0.98 0.28 0.45 1.59 1.01
## sd(exposicion) 0.11 0.06 0.01 0.24 1.01
## cor(Intercept,exposicion) -0.01 0.55 -0.90 0.93 1.00
## Bulk_ESS Tail_ESS
## sd(Intercept) 284 113
## sd(exposicion) 143 340
## cor(Intercept,exposicion) 202 248
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -2.26 0.30 -2.86 -1.70 1.00 745
## exposicion 0.22 0.05 0.14 0.34 1.00 338
## tratamientoPlacebo -0.62 0.30 -1.24 -0.04 1.00 1219
## Tail_ESS
## Intercept 1557
## exposicion 183
## tratamientoPlacebo 1336
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## zi 0.32 0.07 0.19 0.46 1.00 1240 1609
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
\[y_{ij} \sim ZIP(\mu_{ij},\sigma)\] \[\mu_{ij} = exp(-2.26 + 0.22Exposición_{ij}-0.62Tratamiento_{placeboi} + b_{0i}+b_{1i}Exposición_{ij})\]
\[\begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \sim N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},\begin{pmatrix} \sigma^2_{b0}=0.98^2 & \sigma_{b01}=0 \\ \sigma_{b01}=0 & \sigma^2_{b1}=0.11^2 \end{pmatrix} \right)\]
Modelo lineal ZINB
#Modelo 4.1.1
mod4 <- brm(y ~ exposicion + tratamiento, data = skin_data, family = zero_inflated_negbinomial(),cores = 4, seed = 12345678)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -D_REENTRANT -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -include stan/math/prim/mat/fun/Eigen.hpp -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
## [[1]]
## Stan model 'f0ccf07018dc47a02f63a4092e5db6fa' does not contain samples.
##
## [[2]]
## Stan model 'f0ccf07018dc47a02f63a4092e5db6fa' does not contain samples.
##
## [[3]]
## Stan model 'f0ccf07018dc47a02f63a4092e5db6fa' does not contain samples.
summary(mod4)
## Family: zero_inflated_negbinomial
## Links: mu = log; shape = identity; zi = identity
## Formula: y ~ exposicion + tratamiento
## Data: skin_data (Number of observations: 649)
## Samples: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 1000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -1.87 0.24 -2.30 -1.38 1.00 315
## exposicion 0.17 0.02 0.13 0.22 1.00 662
## tratamientoPlacebo -0.58 0.23 -1.05 -0.16 1.00 652
## Tail_ESS
## Intercept 462
## exposicion 547
## tratamientoPlacebo 510
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape 0.63 0.27 0.32 1.40 1.00 370 274
## zi 0.16 0.11 0.01 0.41 1.00 288 322
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
\[y_{ij} \sim ZINB(\mu_{ij},\sigma,\nu)\] \[\mu_{ij} = exp(-1.87 + 0.17Exposición_{ij}-0.58 Tratamiento_{placeboi})\] \[\sigma = 0.63\] \[\nu = 0.16\]
Modelo lineal ZINB con intercepto aleatorio
#Modelo 4.1.2
mod5 <- brm(y ~ exposicion + tratamiento + (1|id), data = skin_data, family = zero_inflated_negbinomial(),cores = 4, seed = 12345678)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -D_REENTRANT -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -include stan/math/prim/mat/fun/Eigen.hpp -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
## [[1]]
## Stan model '062fa29afb279a55cd028f79d403ed29' does not contain samples.
summary(mod5)
## Family: zero_inflated_negbinomial
## Links: mu = log; shape = identity; zi = identity
## Formula: y ~ exposicion + tratamiento + (1 | id)
## Data: skin_data (Number of observations: 649)
## Samples: 3 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 3000
##
## Group-Level Effects:
## ~id (Number of levels: 150)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.12 0.18 0.78 1.48 1.00 877 1374
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -2.28 0.32 -2.94 -1.68 1.00 1098
## exposicion 0.19 0.03 0.13 0.25 1.00 944
## tratamientoPlacebo -0.58 0.31 -1.17 0.03 1.01 1306
## Tail_ESS
## Intercept 1575
## exposicion 1280
## tratamientoPlacebo 1956
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape 42.08 52.47 1.65 194.30 1.00 954 598
## zi 0.31 0.08 0.13 0.46 1.00 877 519
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
\[y_{ij} \sim ZINB(\mu_{ij},\sigma,\nu)\] \[\mu_{ij} = exp(-2.28 + 0.19Exposición_i-0.58 Tratamiento_{placeboi} + b_{0i})\] \[b_{0} \sim N(0,\sigma_{b0}^2 = 1.12^2)\] \[\sigma = 42.08^2\] \[\nu = 0.31^2\]
Modelo lineal ZINB con intercepto y pendiente aleatoria
#Modelo 4.1.2
mod6 <- brm(y ~ exposicion + tratamiento + (1+exposicion|id), data = skin_data, family = zero_inflated_negbinomial(),cores = 4, seed = 1234567)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -D_REENTRANT -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -include stan/math/prim/mat/fun/Eigen.hpp -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
## [[1]]
## Stan model 'dde101458a80bc54782575cf7d6ad328' does not contain samples.
##
## [[2]]
## Stan model 'dde101458a80bc54782575cf7d6ad328' does not contain samples.
##
## [[3]]
## Stan model 'dde101458a80bc54782575cf7d6ad328' does not contain samples.
summary(mod6)
## Family: zero_inflated_negbinomial
## Links: mu = log; shape = identity; zi = identity
## Formula: y ~ exposicion + tratamiento + (1 + exposicion | id)
## Data: skin_data (Number of observations: 649)
## Samples: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 1000
##
## Group-Level Effects:
## ~id (Number of levels: 150)
## Estimate Est.Error l-95% CI u-95% CI Rhat
## sd(Intercept) 0.95 0.27 0.40 1.50 1.00
## sd(exposicion) 0.11 0.06 0.01 0.25 1.02
## cor(Intercept,exposicion) -0.01 0.55 -0.87 0.96 1.00
## Bulk_ESS Tail_ESS
## sd(Intercept) 191 313
## sd(exposicion) 58 84
## cor(Intercept,exposicion) 60 223
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept -2.28 0.32 -2.98 -1.72 1.00 566
## exposicion 0.22 0.05 0.15 0.33 1.00 253
## tratamientoPlacebo -0.60 0.31 -1.24 0.01 1.00 764
## Tail_ESS
## Intercept 911
## exposicion 438
## tratamientoPlacebo 875
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## shape 48.11 65.31 2.86 230.63 1.00 763 453
## zi 0.31 0.07 0.17 0.44 1.00 597 454
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
\[y_{ij} \sim ZINB(\mu_{ij},\sigma,\nu)\] \[\mu_{ij} = exp(-2.28+ 0.22Exposición_{ij}-0.60Tratamiento_{placeboi} + b_{0i}+b_{1i}Exposición_{ij})\] \[\begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \sim N \left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},\begin{pmatrix} \sigma^2_{b0}=0.95^2 & \sigma_{b01}=0 \\ \sigma_{b01}=0 & \sigma^2_{b1}=0.11^2 \end{pmatrix} \right)\]
Para la comparación de modelos se utilizarán los criterios que puedan aplicarse a objetos de tipo brmsfit en este caso dos criterios waic que es la versión generalizada del AIC y loo leave-one-out cross-validation.
modelos <- c(mod1,mod2,mod3,mod4,mod5,mod6)
#Añadir criterios a los modelos
waic_c <- c(WAIC(mod1)$estimates[3],
WAIC(mod2)$estimates[3],
WAIC(mod3)$estimates[3],
WAIC(mod4)$estimates[3],
WAIC(mod5)$estimates[3],
WAIC(mod6)$estimates[3])
loo_c <- c(loo(mod1)$estimates[3],
loo(mod2)$estimates[3],
loo(mod3)$estimates[3],
loo(mod4)$estimates[3],
loo(mod5)$estimates[3],
loo(mod6)$estimates[3])
## Warning: Found 6 observations with a pareto_k > 0.7 in model 'mod2'. It is
## recommended to set 'reloo = TRUE' in order to calculate the ELPD without
## the assumption that these observations are negligible. This will refit
## the model 6 times to compute the ELPDs for the problematic observations
## directly.
## Warning: Found 4 observations with a pareto_k > 0.7 in model 'mod3'. It is
## recommended to set 'reloo = TRUE' in order to calculate the ELPD without
## the assumption that these observations are negligible. This will refit
## the model 4 times to compute the ELPDs for the problematic observations
## directly.
## Warning: Found 1 observations with a pareto_k > 0.7 in model 'mod4'. It is
## recommended to set 'reloo = TRUE' in order to calculate the ELPD without
## the assumption that these observations are negligible. This will refit
## the model 1 times to compute the ELPDs for the problematic observations
## directly.
## Warning: Found 3 observations with a pareto_k > 0.7 in model 'mod5'. It is
## recommended to set 'reloo = TRUE' in order to calculate the ELPD without
## the assumption that these observations are negligible. This will refit
## the model 3 times to compute the ELPDs for the problematic observations
## directly.
## Warning: Found 8 observations with a pareto_k > 0.7 in model 'mod6'. It is
## recommended to set 'reloo = TRUE' in order to calculate the ELPD without
## the assumption that these observations are negligible. This will refit
## the model 8 times to compute the ELPDs for the problematic observations
## directly.
criterios <- data.frame(waic = waic_c,
loo = loo_c)
rownames(criterios) <- c("mod1","mod2","mod3","mod4","mod5","mod6")
criterios
## waic loo
## mod1 724.0727 724.2066
## mod2 636.3050 645.8142
## mod3 637.6824 643.5194
## mod4 692.1062 692.3049
## mod5 641.7868 649.8512
## mod6 642.4121 648.0960
Como puede observarse los modelos que tuvieron mejor rendimiento fueron los que consideraron efectos aleatorios en este caso la pendiente aleatoria parece no aportar mucho a los modelos, por lo tanto los modelos solo con intercepto aleatorio tuvieron mejor rendimiento, los modelos en los que se tuvo en cuenta la distribución ZIP fueron mejores que los de la ZINB, el mejor modelo entre los seis planteados es el modelo 2, modelo lineal ZIP con intercepto aleatorio.
El modelo que tuvo mejor desempeño fue le modelo ZIP con covariables exposición y tratamiento, además con intercepto aleatorio.