---
title: "03_EDA_datasets_itu"
author: "Sergio Uribe"
date-modified: last-modified
format:
  html:
    toc: true
    toc-expand: 3
    code-fold: true
    code-tools: true
editor: visual
execute:
  echo: false
  cache: false
  warning: false
  message: false
---
# Packages
```{r}
# Load the required libraries with pacman, installing any that are missing
pacman::p_load(
  tidyverse,   # tools for data science
  visdat,      # visualise missing values (NAs)
  janitor,     # data cleaning and tables
  here,        # project-relative file paths
  gtsummary,   # publication-ready tables
  countrycode, # normalise country data
  easystats,   # see https://easystats.github.io/easystats/
  scales,      # axis and label formatting
  lubridate    # dates and times
)
```
```{r}
theme_set(theme_minimal())
```
# Dataset
```{r}
# Load the dataset
df <- read_rds(here("data", "df_clean.rds")) |>
janitor::remove_empty(which = c("rows", "cols"))
```
## Correct the names
```{r}
df <- df |>
janitor::clean_names()
```
## Select only relevant variables
```{r}
df <- df |>
select( `x26_thinking_about_overall_accessibility_how_difficult_or_easy_was_it_to_access_state_funded_dental_care_for_children_in_the_last_12_months`, # Corrected name
`x27_ow_satisfied_are_you_with_state_funded_dental_care_for_children_in_general`, # Corrected name
`x28_would_you_choose_state_funded_dental_care_for_your_children_in_the_future_if_the_care_system_remained_unchanged`, # Corrected name
x1_age,
x2_sex,
`x3_what_is_your_place_of_residence`,
`x4_in_which_region_do_you_currently_live`,
`x5_monthly_income_of_the_family`,
`x_awareness_of_access_to_public_dental_care_numeric`,
`x17_overall_how_do_you_feel_about_the_availability_of_information_on_state_funded_dental_care_for_children`,
`y_affordability_of_dental_care_for_children_numeric`,
`x22_how_would_you_evaluate_the_statement_in_latvia_the_funding_allocated_from_the_state_budget_for_dental_care_for_children_is_sufficient`,
`x23_what_dental_care_is_available_for_children_in_your_home_neighbourhood`,
`x24_how_long_does_it_take_to_get_a_child_to_a_dental_care_facility_with_the_transport_available_to_you`,
`x25_if_your_child_needs_acute_care_for_pain_or_inflammation_how_accessible_is_dental_care` )
```
Check the distribution of the three dependent variables (accessibility, satisfaction, and future preference).
```{r}
df |>
ggplot(aes(x = x26_thinking_about_overall_accessibility_how_difficult_or_easy_was_it_to_access_state_funded_dental_care_for_children_in_the_last_12_months )) +
geom_bar()
```
```{r}
df |>
ggplot(aes(x = x27_ow_satisfied_are_you_with_state_funded_dental_care_for_children_in_general)) +
geom_bar()
```
```{r}
df |>
ggplot(aes(x = x28_would_you_choose_state_funded_dental_care_for_your_children_in_the_future_if_the_care_system_remained_unchanged)) +
geom_bar()
```
## Prepare the data
```{r}
df <- df |>
rename(
    # Accessibility
accesibility = x26_thinking_about_overall_accessibility_how_difficult_or_easy_was_it_to_access_state_funded_dental_care_for_children_in_the_last_12_months,
# Satisfaction
satisfaction = x27_ow_satisfied_are_you_with_state_funded_dental_care_for_children_in_general,
# Parental preference
parents_preference = x28_would_you_choose_state_funded_dental_care_for_your_children_in_the_future_if_the_care_system_remained_unchanged)
```
### Shorter variable names
1. **Age**: `x1_age` -\> `Age`
2. **Sex**: `x2_sex` -\> `Sex`
3. **Residence**: `x3_what_is_your_place_of_residence` -\> `Residence`
4. **Region**: `x4_in_which_region_do_you_currently_live` -\> `Region`
5. **Income**: `x5_monthly_income_of_the_family` -\> `Income`
6. **Dental Care Awareness**: `x_awareness_of_access_to_public_dental_care_numeric` -\> `DentalCareAwareness`
7. **Information Availability**: `x17_overall_how_do_you_feel_about_the_availability_of_information_on_state_funded_dental_care_for_children` -\> `InfoAvailability`
8. **Affordability**: `y_affordability_of_dental_care_for_children_numeric` -\> `Affordability`
9. **Funding Evaluation**: `x22_how_would_you_evaluate_the_statement_in_latvia_the_funding_allocated_from_the_state_budget_for_dental_care_for_children_is_sufficient` -\> `FundingEval`
10. **Available Care**: `x23_what_dental_care_is_available_for_children_in_your_home_neighbourhood` -\> `AvailableCare`
11. **Access Time**: `x24_how_long_does_it_take_to_get_a_child_to_a_dental_care_facility_with_the_transport_available_to_you` -\> `AccessTime`
12. **Acute Care Accessibility**: `x25_if_your_child_needs_acute_care_for_pain_or_inflammation_how_accessible_is_dental_care` -\> `AcuteCareAccess`
```{r}
df <- df |>
rename(
Age = x1_age,
Sex = x2_sex,
Residence = x3_what_is_your_place_of_residence,
Region = x4_in_which_region_do_you_currently_live,
Income = x5_monthly_income_of_the_family,
DentalCareAwareness = x_awareness_of_access_to_public_dental_care_numeric,
InfoAvailability = x17_overall_how_do_you_feel_about_the_availability_of_information_on_state_funded_dental_care_for_children,
Affordability = y_affordability_of_dental_care_for_children_numeric,
FundingEval = x22_how_would_you_evaluate_the_statement_in_latvia_the_funding_allocated_from_the_state_budget_for_dental_care_for_children_is_sufficient,
AvailableCare = x23_what_dental_care_is_available_for_children_in_your_home_neighbourhood,
AccessTime = x24_how_long_does_it_take_to_get_a_child_to_a_dental_care_facility_with_the_transport_available_to_you,
AcuteCareAccess = x25_if_your_child_needs_acute_care_for_pain_or_inflammation_how_accessible_is_dental_care
)
```
### Relevel variables
```{r}
df <- df |>
mutate(Residence = fct_relevel(Residence, "Rīga"),
Region = fct_relevel(Region, "Rīgā"),
AvailableCare = fct_relevel(AvailableCare, "Dental practice with state-funded services"),
AccessTime = fct_relevel(AccessTime, "Less than half an hour"),
AcuteCareAccess = fct_relevel(AcuteCareAccess, "I can get it at a dental practice near me"))
```
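Releveling matters because R's default treatment contrasts use the first factor level as the reference category, so the odds ratios in the models below are read relative to that level. A minimal sketch on toy data; the `place` vector and its level names are hypothetical and not taken from the survey:

```{r}
#| echo: true
#| eval: false
library(forcats)

# Hypothetical toy factor; levels default to alphabetical order
place <- factor(c("rural", "urban", "capital", "urban", "rural"))
levels(place)

# Move "urban" to the front so it becomes the reference category
place_releveled <- fct_relevel(place, "urban")
levels(place_releveled)

# With treatment contrasts the reference level gets no dummy column
dummy_columns <- colnames(model.matrix(~ place_releveled))
dummy_columns
```

The same logic applies to `Residence`, `Region`, `AvailableCare`, `AccessTime`, and `AcuteCareAccess` after the chunk above.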
### Remove some levels
```{r}
df <- df |>
mutate(AccessTime = na_if(AccessTime, "Don't know" ))
```
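Shorten the long `Residence` level "Cita valstspilsēta (Daugavpils, Jelgava, Jēkabpils, Jūrmala, Liepāja, Ogre, Rēzekne, Valmiera, Ventspils)" ("Other state city") by abbreviating the city names, so that it fits in tables and plots.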
```{r}
df <- df |>
mutate(Residence = fct_recode(Residence,
"Cita valstspilsēta (DGP-JEL-JKB-JUR-LPX-OGRE-REZ-VMA-VNT)" = "Cita valstspilsēta (Daugavpils, Jelgava, Jēkabpils, Jūrmala, Liepāja, Ogre, Rēzekne, Valmiera, Ventspils)"))
```
## -- Accessibility --
```{r}
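df <- df |>
  # Outcome 1: "Easy"/"Very easy" versus the rest of the scale
  mutate(accesibility_1 = fct_collapse(accesibility,
    "Easy" = c("Easy", "Very easy"),
    "Not easy" = c("Moderate", "Difficult", "Very difficult"),
    other_level = "NULL")) |>
  # Treat any leftover "NULL" level as missing; %in% makes this a no-op if the level is absent
  mutate(accesibility_1 = replace(accesibility_1, accesibility_1 %in% "NULL", NA)) |>
  mutate(accesibility_1 = droplevels(accesibility_1))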
```
```{r}
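df <- df |>
  # Outcome 2: "Difficult"/"Very difficult" versus the rest of the scale
  mutate(accesibility_2 = fct_collapse(accesibility,
    "Difficult" = c("Difficult", "Very difficult"),
    "Not difficult" = c("Moderate", "Easy", "Very easy"),
    other_level = "NULL")) |>
  # Treat any leftover "NULL" level as missing; %in% makes this a no-op if the level is absent
  mutate(accesibility_2 = replace(accesibility_2, accesibility_2 %in% "NULL", NA)) |>
  mutate(accesibility_2 = droplevels(accesibility_2))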
```
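The two codings differ only in which side of the split "Moderate" falls on. A minimal sketch of the collapse, set-to-missing, and drop-levels sequence on a hypothetical toy factor; the data, the `easy_vs_rest` name, and the placeholder level "Other" are assumptions made for illustration only:

```{r}
#| echo: true
#| eval: false
library(dplyr)
library(forcats)

# Hypothetical answers, including one that is not part of the scale
toy <- tibble(
  answer = factor(c("Very easy", "Easy", "Moderate",
                    "Difficult", "Very difficult", "No answer"))
)

toy <- toy |>
  mutate(
    # Collapse the five scale points into two groups; everything else becomes "Other"
    easy_vs_rest = fct_collapse(answer,
      "Easy" = c("Easy", "Very easy"),
      "Not easy" = c("Moderate", "Difficult", "Very difficult"),
      other_level = "Other"
    ),
    # Turn the placeholder into a missing value, then drop the now-empty level
    easy_vs_rest = na_if(easy_vs_rest, "Other"),
    easy_vs_rest = droplevels(easy_vs_rest)
  )

table(toy$easy_vs_rest, useNA = "ifany")
```

Moving "Moderate" from the second group to the first gives the alternative coding, which is what separates the two models that follow.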
### MODEL ACCESSIBILITY 1
```{r}
tabyl(df$accesibility_1) |>
adorn_pct_formatting() |>
knitr::kable()
```
```{r}
m_accesibility_1 <- df |>
glm(
formula = accesibility_1 ~
Age +
Sex +
Residence +
Region +
Income +
# DentalCareAwareness +
InfoAvailability +
Affordability +
FundingEval +
AvailableCare +
AccessTime +
AcuteCareAccess,
family = binomial()
)
```
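When the response of a binomial `glm()` is a factor, R treats the first level as "failure" and every other level as "success", so the level order of the collapsed outcome decides the direction of the odds ratios. A minimal sketch with simulated data; all object names and the effect size are hypothetical:

```{r}
#| echo: true
#| eval: false
set.seed(42)
n <- 300
x <- rnorm(n)

# Higher x makes "Easy" more likely
is_easy <- rbinom(n, 1, plogis(1.5 * x)) == 1
y <- factor(ifelse(is_easy, "Easy", "Not easy"), levels = c("Easy", "Not easy"))

# First level "Easy" is the failure: this model describes the odds of "Not easy"
fit_not_easy <- glm(y ~ x, family = binomial())

# Reversing the level order flips the sign of every coefficient
y_rev <- factor(y, levels = c("Not easy", "Easy"))
fit_easy <- glm(y_rev ~ x, family = binomial())

coef(fit_not_easy)
coef(fit_easy)
```

Checking `levels()` of each outcome before fitting makes it clear which category the reported odds ratios refer to.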
```{r}
gtsummary::tbl_regression(m_accesibility_1,
  exponentiate = TRUE, pvalue_fun = ~ style_pvalue(.x, digits = 2)) |>
  add_global_p() |> bold_p(t = 0.10) |> bold_labels() |> italicize_levels()
```
```{r}
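# Store the formatted table under the same name (this overwrites the fitted model) for tbl_merge() below
m_accesibility_1 <- gtsummary::tbl_regression(m_accesibility_1,
  exponentiate = TRUE, pvalue_fun = ~ style_pvalue(.x, digits = 2)) |>
  add_global_p() |> bold_p(t = 0.10) |> bold_labels() |> italicize_levels()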
```
### MODEL ACCESSIBILITY 2
```{r}
tabyl(df$accesibility_2) |>
adorn_pct_formatting() |>
knitr::kable()
```
```{r}
m_accesibility_2 <- df |>
glm(
formula = accesibility_2 ~
Age +
Sex +
Residence +
Region +
Income +
# DentalCareAwareness +
InfoAvailability +
Affordability +
FundingEval +
AvailableCare +
AccessTime +
AcuteCareAccess,
family = binomial()
)
```
```{r}
gtsummary::tbl_regression(m_accesibility_2,
  exponentiate = TRUE, pvalue_fun = ~ style_pvalue(.x, digits = 2)) |>
  add_global_p() |> bold_p(t = 0.10) |> bold_labels() |> italicize_levels()
```
```{r}
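# Store the formatted table under the same name (this overwrites the fitted model) for tbl_merge() below
m_accesibility_2 <- gtsummary::tbl_regression(m_accesibility_2,
  exponentiate = TRUE, pvalue_fun = ~ style_pvalue(.x, digits = 2)) |>
  add_global_p() |> bold_p(t = 0.10) |> bold_labels() |> italicize_levels()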
```
### Models together
```{r}
tbls <-
tbl_merge(
tbls = list(m_accesibility_1, m_accesibility_2),
tab_spanner = c("**Very Easy/Easy**", "**+ Moderate**")
)
```
```{r}
tbls
```
```{r}
rm(m_accesibility_1, m_accesibility_2, tbls)
```
## -- Satisfaction --
```{r}
df <- df |>
mutate(satisfaction_1 = fct_collapse(satisfaction,
"Satisfied" = c("Very satisfied", "Satisfied"),
"Not satisfied" = c("Moderately satisfied", "Dissatisfied", "Very dissatisfied"),
other_level = "NULL")) |>
# mutate(satisfaction_1 = na_if(satisfaction_1, "NULL"))
mutate(satisfaction_1 = droplevels(satisfaction_1) )
```
```{r}
df <- df |>
mutate(satisfaction_2 = fct_collapse(satisfaction,
"Satisfied" = c("Very satisfied", "Satisfied", "Moderately satisfied"),
"Not satisfied" = c("Dissatisfied", "Very dissatisfied"),
other_level = "NULL")) |>
# mutate(satisfaction_2 = na_if(satisfaction_2, "NULL")) |>
mutate(satisfaction_2 = droplevels(satisfaction_2) )
```
### MODEL satisfaction 1
```{r}
tabyl(df$satisfaction_1) |>
adorn_pct_formatting() |>
knitr::kable()
```
```{r}
m_satisfaction_1 <- df |>
glm(
formula = satisfaction_1 ~
Age +
Sex +
Residence +
Region +
Income +
# DentalCareAwareness +
InfoAvailability +
Affordability +
FundingEval +
AvailableCare +
AccessTime +
AcuteCareAccess,
family = binomial()
)
```
```{r}
gtsummary::tbl_regression(m_satisfaction_1,
  exponentiate = TRUE, pvalue_fun = ~ style_pvalue(.x, digits = 2)) |>
  add_global_p() |> bold_p(t = 0.10) |> bold_labels() |> italicize_levels()
```
```{r}
# Plot the model
```
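One possible way to fill in the plot is a forest-style chart of odds ratios. The sketch below is only an illustration on simulated data: the `toy` data frame, the variable names, and the effect size are hypothetical, and the intervals are Wald intervals computed by hand, which may differ slightly from the ones shown in the regression table.

```{r}
#| echo: true
#| eval: false
library(ggplot2)

# Simulated data standing in for the survey (hypothetical)
set.seed(1)
n <- 200
toy <- data.frame(
  x1 = rnorm(n),
  group = factor(sample(c("a", "b"), n, replace = TRUE))
)
toy$y <- rbinom(n, 1, plogis(0.8 * toy$x1))

fit <- glm(y ~ x1 + group, data = toy, family = binomial())

# Odds ratios with 95% Wald intervals from the coefficient table
est <- coef(summary(fit))
or_df <- data.frame(
  term = rownames(est),
  or = exp(est[, "Estimate"]),
  lower = exp(est[, "Estimate"] - 1.96 * est[, "Std. Error"]),
  upper = exp(est[, "Estimate"] + 1.96 * est[, "Std. Error"])
)
or_df <- or_df[or_df$term != "(Intercept)", ]

# Horizontal point ranges on a log scale, with a reference line at OR = 1
p <- ggplot(or_df, aes(x = or, y = term)) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  geom_pointrange(aes(xmin = lower, xmax = upper)) +
  scale_x_log10() +
  labs(x = "Odds ratio (95% Wald CI, log scale)", y = NULL)
p
```

Replacing `fit` with a fitted model such as `m_satisfaction_1` would reuse the same steps on the survey data.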
### MODEL satisfaction 2
```{r}
tabyl(df$satisfaction_2) |>
adorn_pct_formatting() |>
knitr::kable()
```
```{r}
m_satisfaction_2 <- df |>
glm(
formula = satisfaction_2 ~
Age +
Sex +
Residence +
Region +
Income +
# DentalCareAwareness +
InfoAvailability +
Affordability +
FundingEval +
AvailableCare +
AccessTime +
AcuteCareAccess,
family = binomial()
)
```
```{r}
gtsummary::tbl_regression(m_satisfaction_2,
  exponentiate = TRUE, pvalue_fun = ~ style_pvalue(.x, digits = 2)) |>
  add_global_p() |> bold_p(t = 0.10) |> bold_labels() |> italicize_levels()
```
```{r}
# Plot the model
```
## -- Parents preference --
```{r}
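df <- df |>
  # Collapse to "Yes" versus everything from "Not sure" downwards (labelled "Probably no")
  mutate(parents_preference = fct_collapse(parents_preference,
    "Yes" = c("Definitely", "Probably yes"),
    "Probably no" = c("Not sure", "Probably no", "Definitely no"),
    other_level = "NULL")) |>
  # Treat any leftover "NULL" level as missing; %in% makes this a no-op if the level is absent
  mutate(parents_preference = replace(parents_preference, parents_preference %in% "NULL", NA)) |>
  mutate(parents_preference = droplevels(parents_preference))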
```
### MODEL parents preference
```{r}
tabyl(df$parents_preference) |>
adorn_pct_formatting() |>
knitr::kable()
```
```{r}
m_parents_preference <- df |>
glm(
formula = parents_preference ~
Age +
Sex +
Residence +
Region +
Income +
# DentalCareAwareness +
InfoAvailability +
Affordability +
FundingEval +
AvailableCare +
AccessTime +
AcuteCareAccess,
family = binomial()
)
```
```{r}
gtsummary::tbl_regression(m_parents_preference,
  exponentiate = TRUE, pvalue_fun = ~ style_pvalue(.x, digits = 2)) |>
  add_global_p() |> bold_p(t = 0.10) |> bold_labels() |> italicize_levels()
```