Examen N°03

Los datos presentados a continuación contienen información de los casos de personas que padecen cáncer en los 14 distritos de la provincia “Mundo Feliz” (5p).:

#Cargar los datos
datos <- read.table("CASOSCANCER.txt", header = T, sep = "\t")
attach(datos)
head(datos,5) #Observar los 5 datos de la base de datos
##   Casos  Edad Area
## 1    11 30-40    0
## 2    15 40-50    1
## 3    21 40-50    1
## 4    25 30-40    0
## 5    24 40-50    1

Para no tener mayor inconveniente con las variables categoricas o “CHR”, vamos hacer las transformaciones siguientes para edad.

datos$Edad <- ifelse(datos$Edad=="20-30",1,
                      ifelse(datos$Edad=="30-40",2,
                             ifelse(datos$Edad=="40-50",3,
                                    ifelse(datos$Edad=="50-60",4,5))))
head(datos,5) #Observar los 5 datos de la base de datos
##   Casos Edad Area
## 1    11    2    0
## 2    15    3    1
## 3    21    3    1
## 4    25    2    0
## 5    24    3    1

De acuerdo a los datos, Se pide:

Regresión poisson clásico

  1. Estimar el modelo de regresión poisson clásico e interpretar los parámetros.
# Ajusta el modelo de regresión de Poisson
modelo_poisson <- glm(Casos ~   Edad +  Area, data = datos, family = "poisson")

# Obtén un resumen de los resultados del modelo
summary(modelo_poisson)
## 
## Call:
## glm(formula = Casos ~ Edad + Area, family = "poisson", data = datos)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  2.29234    0.18497  12.393  < 2e-16 ***
## Edad         0.14860    0.05113   2.906  0.00366 ** 
## Area         0.10768    0.14238   0.756  0.44947    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 29.477  on 13  degrees of freedom
## Residual deviance: 20.512  on 11  degrees of freedom
## AIC: 90.098
## 
## Number of Fisher Scoring iterations: 4

Nota. Los coeficiente del modelo de regresión poisson clasico, nos expresan los siguiente: * B(Edad)=0.1486 : Nos da ha entender que, si la edad aumenta, y el área donde a recidido los ultimos años se mantiene constantes, Cantidad de personas con cáncer por distrito aumentaría. * B(Área)=0.10768 : Nos da ha entender que, el área donde ha residido el último año sea urbano y la edad se mantienen constantes, los casos de covid por día aumentaría.

  1. Evaluar los supuestos del modelo.
#Evaluación de supuestos
# Verifica la bondad de ajuste del modelo
#Bondad de ajuste del modelo
plot(modelo_poisson, which = 1)

Nota. Se observa que los datos no se ajustan a la recta, lo que nos da ha entender que el modelo es no es bueno.

library(AER)
## Loading required package: car
## Loading required package: carData
## Loading required package: lmtest
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: survival
#Examina la sobredispersión o subdispersión:
dispersion_test <- dispersiontest(modelo_poisson)
dispersion_test
## 
##  Overdispersion test
## 
## data:  modelo_poisson
## z = 0.87546, p-value = 0.1907
## alternative hypothesis: true dispersion is greater than 1
## sample estimates:
## dispersion 
##   1.570365

Nota. De acuerdo a la dispersión de los datos, se observa que, es no significativo, p-value mayor al 5%, dando a entender que la verdadera dispersión es menor que 1.

Regresión poisson bayesiano

Considerando los datos del anterior ejercicio:

  1. Estimar el modelo de regresión bayesiano.
#BAYESIANO
# install.packages("brms")
# Carga el paquete brms
library(brms)
## Loading required package: Rcpp
## Loading 'brms' package (version 2.20.3). Useful instructions
## can be found by typing help('brms'). A more detailed introduction
## to the package is available through vignette('brms_overview').
## 
## Attaching package: 'brms'
## The following object is masked from 'package:survival':
## 
##     kidney
## The following object is masked from 'package:stats':
## 
##     ar
# Utiliza la función brm() para ajustar el modelo
model <- brm(Casos ~ Edad + Area, family = poisson(), data = datos)
## Compiling Stan program...
## Start sampling
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.00025 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.5 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 7.36 seconds (Warm-up)
## Chain 1:                1.261 seconds (Sampling)
## Chain 1:                8.621 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 7.2e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.72 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 1.241 seconds (Warm-up)
## Chain 2:                0.984 seconds (Sampling)
## Chain 2:                2.225 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 7.4e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.74 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.983 seconds (Warm-up)
## Chain 3:                1.007 seconds (Sampling)
## Chain 3:                1.99 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 0.000101 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1.01 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.967 seconds (Warm-up)
## Chain 4:                1.03 seconds (Sampling)
## Chain 4:                1.997 seconds (Total)
## Chain 4:
# Resumen del modelo
summary(model)
##  Family: poisson 
##   Links: mu = log 
## Formula: Casos ~ Edad + Area 
##    Data: datos (Number of observations: 14) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     2.28      0.18     1.91     2.63 1.00     3242     2682
## Edad          0.15      0.05     0.05     0.25 1.00     3499     2821
## Area          0.11      0.15    -0.17     0.41 1.00     3377     2274
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Nota. Se observa que los coeficiente del modelo de regresión poisson bayesiano.

Entonces: el modelo de regresión bayesiano seria: ln(y)=alfa + beta1(x1) + beta2(x2)

ln(y) = 2.28 + 0.15x1 + 0.11x2