La inasistencia a citas médicas es un problema recurrente en los
hospitales, que afecta la eficiencia en el uso de recursos y genera
barreras en el acceso oportuno a la atención.
Pregunta de investigación: ¿Qué factores influyen en la
asistencia a citas médicas y cómo impactan en la eficiencia
hospitalaria?
#Cargar librerias#
## Rows: 111488 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): appointment_id, slot_id, status, patient_id, sex, age_group
## dbl (4): scheduling_interval, appointment_duration, waiting_time, age
## date (2): scheduling_date, appointment_date
## time (4): appointment_time, check_in_time, start_time, end_time
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
appointments <- appointments %>% mutate(no_show = ifelse(status == “did not attend”, 1, 0))
# Ver valores únicos de la columna 'status'
unique(appointments$status)
# Ver estructura de la base
str(appointments)
# Contar valores faltantes por columna
colSums(is.na(appointments))
# Exploración inicial
``` r
unique(appointments$status)
## [1] "did not attend" "attended" "cancelled" "unknown"
## [5] "scheduled"
str(appointments)
## tibble [111,488 × 17] (S3: tbl_df/tbl/data.frame)
## $ appointment_id : chr [1:111488] "0000138" "0000146" "0000021" "0000233" ...
## $ slot_id : chr [1:111488] "0000001" "0000023" "0000024" "0000025" ...
## $ scheduling_date : Date[1:111488], format: "2014-12-28" "2014-12-29" ...
## $ appointment_date : Date[1:111488], format: "2015-01-01" "2015-01-01" ...
## $ appointment_time : 'hms' num [1:111488] 08:00:00 13:30:00 13:45:00 14:00:00 ...
## ..- attr(*, "units")= chr "secs"
## $ scheduling_interval : num [1:111488] 4 3 15 1 6 2 2 2 4 4 ...
## $ status : chr [1:111488] "did not attend" "did not attend" "attended" "attended" ...
## $ check_in_time : 'hms' num [1:111488] NA NA 13:36:45 13:59:32 ...
## ..- attr(*, "units")= chr "secs"
## $ appointment_duration: num [1:111488] NA NA 5.2 28.9 NA 7.7 4.2 27.1 NA 1.2 ...
## $ start_time : 'hms' num [1:111488] NA NA 13:37:57 14:00:40 ...
## ..- attr(*, "units")= chr "secs"
## $ end_time : 'hms' num [1:111488] NA NA 13:43:09 14:29:34 ...
## ..- attr(*, "units")= chr "secs"
## $ waiting_time : num [1:111488] NA NA 1.2 1.1 NA 21.7 16.2 1 NA 8.5 ...
## $ patient_id : chr [1:111488] "08285" "05972" "06472" "05376" ...
## $ sex : chr [1:111488] "Male" "Male" "Male" "Female" ...
## $ age : num [1:111488] 37 84 77 37 72 51 28 33 29 90 ...
## $ age_group : chr [1:111488] "35-39" "80-84" "75-79" "35-39" ...
## $ no_show : num [1:111488] 1 1 0 0 0 0 0 0 0 0 ...
colSums(is.na(appointments))
## appointment_id slot_id scheduling_date
## 0 0 0
## appointment_date appointment_time scheduling_interval
## 0 0 0
## status check_in_time appointment_duration
## 0 25456 25456
## start_time end_time waiting_time
## 25456 25456 25456
## patient_id sex age
## 0 0 0
## age_group no_show
## 0 0
str(appointments)
colSums(is.na(appointments))
#ESTADISTICAS DESCRIPTIVAS# par(mfrow=c(2,2))
barplot(table(appointments\(sex), main="Distribución por sexo", col="lightgreen") barplot(table(appointments\)age_group), main=“Distribución por grupos de edad”, col=“lightyellow”) barplot(table(appointments$status), main=“Estado de las citas médicas”, col=“lightpink”)
#Modelo matemático: Regresión logística binaria# appointments <- appointments %>% mutate(no_show = ifelse(status == “did not attend”, 1, 0))
modelo_logistico <- glm(no_show ~ age + sex + waiting_time, data = appointments, family = binomial)
modelo_lineal <- lm(waiting_time ~ age + sex, data = appointments) summary(modelo_lineal)
summary(modelo_logistico) summary(modelo_lineal)
#Validación y testeo del modelo logístico#
par(mfrow=c(2,2)) plot(modelo_logistico)
set.seed(123) train_control <- trainControl(method=“cv”, number=5) modelo_cv <- train(no_show ~ age + sex + waiting_time, data=appointments, method=“glm”, family=“binomial”, trControl=train_control) print(modelo_cv)
#Modelo matemático: Regresión lineal múltiple#
appointments\(sex <- as.factor(appointments\)sex)
modelo_lineal <- lm(waiting_time ~ age + sex, data=appointments) summary(modelo_lineal)
#Comparación de modelos# comparacion <- data.frame( Modelo = c(“Regresión Logística Binaria”,“Regresión Lineal Múltiple”), Indicador = c(“AIC”,“R²”), Valor = c(AIC(modelo_logistico), round(summary(modelo_lineal)$r.squared, 4)) ) comparacion
#Indicadores de eficiencia hospitalaria#
tasa_inasistencia <- mean(appointments$no_show) tasa_inasistencia
tasa_inasistencia_sexo <- appointments %>% group_by(sex) %>% summarise(inasistencia = mean(no_show)) tasa_inasistencia_sexo
#Gráficos de comportamiento y variabilidad#
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'