Los datos presentados a continuación contienen información de los casos de personas que padecen cáncer en los 14 distritos de la provincia “Mundo Feliz” (5p).:
#Cargar los datos
datos <- read.table("CASOSCANCER.txt", header = T, sep = "\t")
attach(datos)
head(datos,5) #Observar los 5 datos de la base de datos
## Casos Edad Area
## 1 11 30-40 0
## 2 15 40-50 1
## 3 21 40-50 1
## 4 25 30-40 0
## 5 24 40-50 1
Para no tener mayor inconveniente con las variables categoricas o “CHR”, vamos hacer las transformaciones siguientes para edad.
datos$Edad <- ifelse(datos$Edad=="20-30",1,
ifelse(datos$Edad=="30-40",2,
ifelse(datos$Edad=="40-50",3,
ifelse(datos$Edad=="50-60",4,5))))
head(datos,5) #Observar los 5 datos de la base de datos
## Casos Edad Area
## 1 11 2 0
## 2 15 3 1
## 3 21 3 1
## 4 25 2 0
## 5 24 3 1
De acuerdo a los datos, Se pide:
# Ajusta el modelo de regresión de Poisson
modelo_poisson <- glm(Casos ~ Edad + Area, data = datos, family = "poisson")
# Obtén un resumen de los resultados del modelo
summary(modelo_poisson)
##
## Call:
## glm(formula = Casos ~ Edad + Area, family = "poisson", data = datos)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.29234 0.18497 12.393 < 2e-16 ***
## Edad 0.14860 0.05113 2.906 0.00366 **
## Area 0.10768 0.14238 0.756 0.44947
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 29.477 on 13 degrees of freedom
## Residual deviance: 20.512 on 11 degrees of freedom
## AIC: 90.098
##
## Number of Fisher Scoring iterations: 4
Nota. Los coeficiente del modelo de regresión poisson clasico, nos expresan los siguiente: * B(Edad)=0.1486 : Nos da ha entender que, si la edad aumenta, y el área donde a recidido los ultimos años se mantiene constantes, Cantidad de personas con cáncer por distrito aumentaría. * B(Área)=0.10768 : Nos da ha entender que, el área donde ha residido el último año sea urbano y la edad se mantienen constantes, los casos de covid por día aumentaría.
#Evaluación de supuestos
# Verifica la bondad de ajuste del modelo
#Bondad de ajuste del modelo
plot(modelo_poisson, which = 1)
Nota. Se observa que los datos no se ajustan a la recta, lo que nos da ha entender que el modelo es no es bueno.
library(AER)
## Loading required package: car
## Loading required package: carData
## Loading required package: lmtest
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: survival
#Examina la sobredispersión o subdispersión:
dispersion_test <- dispersiontest(modelo_poisson)
dispersion_test
##
## Overdispersion test
##
## data: modelo_poisson
## z = 0.87546, p-value = 0.1907
## alternative hypothesis: true dispersion is greater than 1
## sample estimates:
## dispersion
## 1.570365
Nota. De acuerdo a la dispersión de los datos, se observa que, es no significativo, p-value mayor al 5%, dando a entender que la verdadera dispersión es menor que 1.
Considerando los datos del anterior ejercicio:
#BAYESIANO
# install.packages("brms")
# Carga el paquete brms
library(brms)
## Loading required package: Rcpp
## Loading 'brms' package (version 2.20.3). Useful instructions
## can be found by typing help('brms'). A more detailed introduction
## to the package is available through vignette('brms_overview').
##
## Attaching package: 'brms'
## The following object is masked from 'package:survival':
##
## kidney
## The following object is masked from 'package:stats':
##
## ar
# Utiliza la función brm() para ajustar el modelo
model <- brm(Casos ~ Edad + Area, family = poisson(), data = datos)
## Compiling Stan program...
## Start sampling
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.00025 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.5 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 7.36 seconds (Warm-up)
## Chain 1: 1.261 seconds (Sampling)
## Chain 1: 8.621 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 7.2e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.72 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 1.241 seconds (Warm-up)
## Chain 2: 0.984 seconds (Sampling)
## Chain 2: 2.225 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 7.4e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.74 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.983 seconds (Warm-up)
## Chain 3: 1.007 seconds (Sampling)
## Chain 3: 1.99 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 0.000101 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1.01 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.967 seconds (Warm-up)
## Chain 4: 1.03 seconds (Sampling)
## Chain 4: 1.997 seconds (Total)
## Chain 4:
# Resumen del modelo
summary(model)
## Family: poisson
## Links: mu = log
## Formula: Casos ~ Edad + Area
## Data: datos (Number of observations: 14)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 2.28 0.18 1.91 2.63 1.00 3242 2682
## Edad 0.15 0.05 0.05 0.25 1.00 3499 2821
## Area 0.11 0.15 -0.17 0.41 1.00 3377 2274
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Nota. Se observa que los coeficiente del modelo de regresión poisson bayesiano.
Entonces: el modelo de regresión bayesiano seria: ln(y)=alfa + beta1(x1) + beta2(x2)
ln(y) = 2.28 + 0.15x1 + 0.11x2