Ejercicio 1 logística

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

options(scipen = 999, digits = 3, encoding = 'UTF-8')

library(readr)
datos <- read_csv("C:/Users/Administrador/Downloads/datos.csv")

## Rows: 1363 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (10): age, sbp, dbp, chol, sex, newchd, male, cig, packs, muerte
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

View(datos)

library(data.table)
library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ purrr     1.0.2
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::between()     masks data.table::between()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ dplyr::first()       masks data.table::first()
## ✖ lubridate::hour()    masks data.table::hour()
## ✖ lubridate::isoweek() masks data.table::isoweek()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ dplyr::last()        masks data.table::last()
## ✖ lubridate::mday()    masks data.table::mday()
## ✖ lubridate::minute()  masks data.table::minute()
## ✖ lubridate::month()   masks data.table::month()
## ✖ lubridate::quarter() masks data.table::quarter()
## ✖ lubridate::second()  masks data.table::second()
## ✖ purrr::transpose()   masks data.table::transpose()
## ✖ lubridate::wday()    masks data.table::wday()
## ✖ lubridate::week()    masks data.table::week()
## ✖ lubridate::yday()    masks data.table::yday()
## ✖ lubridate::year()    masks data.table::year()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(broom)
library(vcd)

## Warning: package 'vcd' was built under R version 4.4.3

## Cargando paquete requerido: grid

library(sjPlot)
library(epiR)

## Cargando paquete requerido: survival
## Package epiR 2.0.77 is loaded
## Type help(epi.about) for summary information
## Type browseVignettes(package = 'epiR') to learn how to use epiR for applied epidemiological analyses

datos$sex <- as.factor(datos$sex)
datos$newchd <- as.factor(datos$newchd)

modelo1 <- glm(newchd ~ age, family = "binomial", data = datos)
t1<-tab_model(modelo1, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = F, title = "Newchd - EDAD")
knitr::asis_output(t1$knitr)

Newchd - EDAD
	newchd
Predictors	Odds Ratios	std. Error	CI	p
age	1.07	0.02	1.04 – 1.10	<0.001
Observations	1363
Deviance	1330.165
log-Likelihood	-665.082

summary(modelo1)

## 
## Call:
## glm(formula = newchd ~ age, family = "binomial", data = datos)
## 
## Coefficients:
##             Estimate Std. Error z value      Pr(>|z|)    
## (Intercept)  -4.8843     0.7737   -6.31 0.00000000027 ***
## age           0.0658     0.0145    4.55 0.00000537421 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1351.2  on 1362  degrees of freedom
## Residual deviance: 1330.2  on 1361  degrees of freedom
## AIC: 1334
## 
## Number of Fisher Scoring iterations: 4

modelo2 <- glm(newchd ~ age+sex, family = "binomial", data = datos)
t2<-tab_model(modelo2, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = F, title = "Newchd - EDAD y sexo")
knitr::asis_output(t2$knitr)

Newchd - EDAD y sexo
	newchd
Predictors	Odds Ratios	std. Error	CI	p
age	1.07	0.02	1.04 – 1.10	<0.001
sex1	2.05	0.29	1.56 – 2.70	<0.001
Observations	1363
Deviance	1303.531
log-Likelihood	-651.766

summary(modelo2)

## 
## Call:
## glm(formula = newchd ~ age + sex, family = "binomial", data = datos)
## 
## Coefficients:
##             Estimate Std. Error z value       Pr(>|z|)    
## (Intercept)  -5.3082     0.7863   -6.75 0.000000000015 ***
## age           0.0667     0.0146    4.58 0.000004751511 ***
## sex1          0.7161     0.1405    5.10 0.000000346121 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1351.2  on 1362  degrees of freedom
## Residual deviance: 1303.5  on 1360  degrees of freedom
## AIC: 1310
## 
## Number of Fisher Scoring iterations: 4

anova(modelo1, test = "LRT")

## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: newchd
## 
## Terms added sequentially (first to last)
## 
## 
##      Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
## NULL                  1362       1351              
## age   1     21.1      1361       1330 0.0000044 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

anova(modelo2, test = "LRT")

## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: newchd
## 
## Terms added sequentially (first to last)
## 
## 
##      Df Deviance Resid. Df Resid. Dev   Pr(>Chi)    
## NULL                  1362       1351               
## age   1     21.1      1361       1330 0.00000440 ***
## sex   1     26.6      1360       1304 0.00000025 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Si la variable sex agrega valor predictivo al modelo. Como lo explica? El modelo 1 con edad disminuye la deviance en 21.1 puntos (1351 a 1330) mientras que el agregado de sexo (modelo2) disminuye 26.6 puntos adicionales a 1304.
Como se interpreta el efecto de la edad en el modelo múltiple? Por cada año que aumenta la edad, la enfermedad cardiovascular aumenta un 7%. Este valor se encuentra entre 4 y 10% con un 95% de confianza e independientemente del sexo
Existe efecto confundidor de sex sobre la relación entre edad y enfermedad coronaria? No, dado que el OR de la edad no se modifica al agregar el sexo
Existe interacción entre edad y sex? Sí, existe interacción

modelo3 <- glm(newchd ~ age+sex + age:sex, family = "binomial", data = datos)
t3<-tab_model(modelo3, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = F, title = "Newchd - EDAD y sexo e interacción")
knitr::asis_output(t3$knitr)

Newchd - EDAD y sexo e interacción
	newchd
Predictors	Odds Ratios	std. Error	CI	p
age	1.10	0.03	1.05 – 1.16	<0.001
age:sex1	0.95	0.03	0.89 – 1.01	0.076
sex1	34.63	55.55	1.52 – 824.87	0.027
Observations	1363
Deviance	1300.364
log-Likelihood	-650.182

summary(modelo3)

## 
## Call:
## glm(formula = newchd ~ age + sex + age:sex, family = "binomial", 
##     data = datos)
## 
## Coefficients:
##             Estimate Std. Error z value    Pr(>|z|)    
## (Intercept)  -6.9975     1.2544   -5.58 0.000000024 ***
## age           0.0982     0.0232    4.23 0.000022907 ***
## sex1          3.5446     1.6043    2.21       0.027 *  
## age:sex1     -0.0530     0.0299   -1.77       0.076 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1351.2  on 1362  degrees of freedom
## Residual deviance: 1300.4  on 1359  degrees of freedom
## AIC: 1308
## 
## Number of Fisher Scoring iterations: 4

datos %>% ggplot()+geom_point(aes(age,predict(modelo2, datos), color= sex))+ylab("logodds EC")

datos %>% ggplot()+geom_point(aes(age,predict(modelo3, datos), color= sex))+ylab("logodds EC")

#Sexo y enfermedad cardiovascular e hta
mod_sexocv <- glm(newchd ~ sex, family = "binomial", data = datos)
tsexocv<-tab_model(mod_sexocv, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = F, title = "Newchd - EDAD y sexo")
knitr::asis_output(tsexocv$knitr)

Newchd - EDAD y sexo
	newchd
Predictors	Odds Ratios	std. Error	CI	p
sex1	2.03	0.28	1.55 – 2.67	<0.001
Observations	1363
Deviance	1324.865
log-Likelihood	-662.432

summary(mod_sexocv)

## 
## Call:
## glm(formula = newchd ~ sex, family = "binomial", data = datos)
## 
## Coefficients:
##             Estimate Std. Error z value             Pr(>|z|)    
## (Intercept)   -1.779      0.106  -16.78 < 0.0000000000000002 ***
## sex1           0.707      0.139    5.07           0.00000039 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1351.2  on 1362  degrees of freedom
## Residual deviance: 1324.9  on 1361  degrees of freedom
## AIC: 1329
## 
## Number of Fisher Scoring iterations: 4

mod_sexocvhta<- glm(newchd ~ sex + age, family = "binomial", data = datos)
tsexocvhta<-tab_model(mod_sexocvhta, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = F, title = "Newchd - EDAD y sexo")
knitr::asis_output(tsexocvhta$knitr)

Newchd - EDAD y sexo
	newchd
Predictors	Odds Ratios	std. Error	CI	p
age	1.07	0.02	1.04 – 1.10	<0.001
sex1	2.05	0.29	1.56 – 2.70	<0.001
Observations	1363
Deviance	1303.531
log-Likelihood	-651.766

summary(mod_sexocvhta)

## 
## Call:
## glm(formula = newchd ~ sex + age, family = "binomial", data = datos)
## 
## Coefficients:
##             Estimate Std. Error z value       Pr(>|z|)    
## (Intercept)  -5.3082     0.7863   -6.75 0.000000000015 ***
## sex1          0.7161     0.1405    5.10 0.000000346121 ***
## age           0.0667     0.0146    4.58 0.000004751511 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1351.2  on 1362  degrees of freedom
## Residual deviance: 1303.5  on 1360  degrees of freedom
## AIC: 1310
## 
## Number of Fisher Scoring iterations: 4

mod_sexocvhtai<- glm(newchd ~ sex + age+ sex:age, family = "binomial", data = datos)
tsexocvhtai<-tab_model(mod_sexocvhtai, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = F, title = "Newchd - EDAD y sexo")
knitr::asis_output(tsexocvhtai$knitr)

Newchd - EDAD y sexo
	newchd
Predictors	Odds Ratios	std. Error	CI	p
age	1.10	0.03	1.05 – 1.16	<0.001
sex1	34.63	55.55	1.52 – 824.87	0.027
sex1:age	0.95	0.03	0.89 – 1.01	0.076
Observations	1363
Deviance	1300.364
log-Likelihood	-650.182

summary(mod_sexocvhtai)

## 
## Call:
## glm(formula = newchd ~ sex + age + sex:age, family = "binomial", 
##     data = datos)
## 
## Coefficients:
##             Estimate Std. Error z value    Pr(>|z|)    
## (Intercept)  -6.9975     1.2544   -5.58 0.000000024 ***
## sex1          3.5446     1.6043    2.21       0.027 *  
## age           0.0982     0.0232    4.23 0.000022907 ***
## sex1:age     -0.0530     0.0299   -1.77       0.076 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1351.2  on 1362  degrees of freedom
## Residual deviance: 1300.4  on 1359  degrees of freedom
## AIC: 1308
## 
## Number of Fisher Scoring iterations: 4

No hay confusión en la relación sexo CV al agregar hta. Tampoco interacción entre sexo y hta

#Relación hta con enf cv
modelo4 <- glm(newchd ~ sbp,family = "binomial", data = datos)

t4<-tab_model(modelo4, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = T, title = "Newchd - SBP") 

knitr::asis_output(t4$knitr)

Newchd - SBP
	newchd
Predictors	Odds Ratios	std. Error	p
sbp	1.02 (1.01 – 1.02)	0.00	<0.001
Observations	1363
Deviance	1305.048
log-Likelihood	-652.524

Transforme sbp en una variable dicotómica “sbp_cat” que tome valor 0 si sbp <141 y 1 si sbp >140. Ajuste un modelo entre newchd y sbp_cat y otro agregando sex.

datos <- datos %>%
mutate(sbp_cat = as.factor(case_when(
sbp <140 ~ "0",
sbp>= 140 ~ "1")))
modelo5 <- glm(newchd ~ sbp_cat,family = "binomial", data = datos)

t5<-tab_model(modelo5, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = T, title = "Newchd - SBP_cat") 

knitr::asis_output(t5$knitr)

Newchd - SBP_cat
	newchd
Predictors	Odds Ratios	std. Error	p
sbp_cat1	2.20 (1.65 – 2.97)	0.33	<0.001
Observations	1363
Deviance	1321.509
log-Likelihood	-660.755

modelo6 <- glm(newchd ~ sbp_cat + sex,family = "binomial", data = datos)
t6<-tab_model(modelo6, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = T, title = "Newchd - SBP_cat + sex") 

knitr::asis_output(t6$knitr)

Newchd - SBP_cat + sex
	newchd
Predictors	Odds Ratios	std. Error	p
sbp_cat1	2.40 (1.79 – 3.25)	0.37	<0.001
sex1	2.21 (1.68 – 2.93)	0.31	<0.001
Observations	1363
Deviance	1289.353
log-Likelihood	-644.676

anova(modelo6, test = "LRT")

## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: newchd
## 
## Terms added sequentially (first to last)
## 
## 
##         Df Deviance Resid. Df Resid. Dev    Pr(>Chi)    
## NULL                     1362       1351                
## sbp_cat  1     29.7      1361       1322 0.000000049 ***
## sex      1     32.2      1360       1289 0.000000014 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

El modelo múltiple mejora la predicción al disminuir la deviance Evalúe si sex es un confundidor, un modificador de efecto o ambos de la relación entre sbp_cat y newchd. (Justifique su respuesta usando tablas de 2x2 y regresión logística). 1. Sex no es confundidor, modifica menos del 10% el valor del OR de sbp cat cuando se agrega al modelo 2.Sex es modificador de efecto

modelo7 <- glm(newchd ~ sbp_cat + sex + sbp_cat:sex,family = "binomial", data = datos)
t7<-tab_model(modelo7, show.se = T, show.dev = T, show.loglik = T, show.intercept = F,show.stat = F, show.reflvl = T, digits = 2, digits.p = 3, show.r2 = F,collapse.ci = T, title = "Newchd - SBP_cat + sex + int") 

knitr::asis_output(t7$knitr)

Newchd - SBP_cat + sex + int
	newchd
Predictors	Odds Ratios	std. Error	p
sbp_cat1	3.61 (2.15 – 6.42)	1.00	<0.001
sbp_cat1:sex1	0.54 (0.27 – 1.02)	0.18	0.063
sex1	3.50 (2.02 – 6.36)	1.02	<0.001
Observations	1363
Deviance	1285.728
log-Likelihood	-642.864

library(epiR)
a<-epi.2by2(table(exposure = datos$sbp_cat,outcome = datos$newchd, strata= datos$sex), method = "cohort.count", digits = 3, conf.level = 0.95, outcome = "as.columns", interpret = F)

a

##              Outcome +    Outcome -      Total                 Inc risk *
## Exposed +          495           73        568  87.148 (84.113 to 89.789)
## Exposed -          600          195        795  75.472 (72.327 to 78.426)
## Total             1095          268       1363  80.337 (78.126 to 82.417)
## 
## 
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Inc risk ratio (crude)                         1.155 (1.098, 1.215)
## Inc risk ratio (M-H)                           1.170 (1.112, 1.230)
## Inc risk ratio (crude:M-H)                     0.987
## Inc Odds ratio (crude)                         2.204 (1.642, 2.957)
## Inc Odds ratio (M-H)                           2.424 (1.793, 3.279)
## Inc Odds ratio (crude:M-H)                     0.909
## Attrib risk in the exposed (crude) *           11.676 (7.612, 15.741)
## Attrib risk in the exposed (M-H) *             12.701 (7.809, 17.594)
## Attrib risk (crude:M-H)                        0.919
## -------------------------------------------------------------------
##  M-H test of homogeneity of IRRs: chi2(1) = 0.048 Pr>chi2 = 0.826
##  M-H test of homogeneity of ORs: chi2(1) = 3.320 Pr>chi2 = 0.068
##  Test that M-H adjusted OR = 1:  chi2(1) = 34.007 Pr>chi2 = <0.001
##  Wald confidence limits
##  M-H: Mantel-Haenszel; CI: confidence interval
##  * Outcomes per 100 population units

b<-a$massoc.detail$OR.strata.wald

row.names(b)<-c("sex_0", "sex_1")

b

##        est lower upper
## sex_0 3.61  2.10  6.23
## sex_1 1.94  1.34  2.80

Ejercicio 1 logística

2025-05-18

R Markdown

Including Plots