1 WIN Threshold & Cognition

1.1 Objective:

Examined WIN (Words-in-Noise) Threshold as the outcome, with cognitive status (PACC & MOCA categories) as key predictors, adjusting for demographic and environmental covariates.

1.2 1. WIN Threshold ~ PACC Classification

-Lower PACC → Higher WIN Thresholds (poorer auditory performance).
- Time significantly increases WIN Threshold across all participants.
- Age, gender, and race influence WIN performance.
- No significant time × PACC interaction (similar progression over time).

1.3 2. WIN Threshold ~ MOCA Classification

  • Lower MOCA → Higher WIN Thresholds (worse performance).
  • Time significantly affects WIN Threshold.
  • No significant time × MOCA interaction (similar rate of change).

1.4 3. WIN Threshold ~ CDR Status

  • Higher CDR (CDR > 0) is associated with higher WIN Thresholds (indicating worse performance).
  • Time significantly affects WIN Threshold (performance changes over time).
  • No significant time × CDR interaction → The rate of change in WIN Thresholds over time is similar for CDR = 0 and CDR > 0.

1.5 Conclusion:

✔ Cognitive impairment (low PACC/MOCA/CDR > 0) is linked to poorer auditory performance.
✔ Hearing difficulties are associated with cognitive status, independent of time-based interactions.
✔ Demographic factors (age, gender, race) also play a role.


Takeaway:
Hearing performance (WIN) is worse in individuals with cognitive impairment, but their rate of decline over time is not significantly different from cognitively intact individuals.

Note The analysis was done on a dataset filtered for non-missing baseline Words in noise value, n=420. Follow up Words in noise threshold were filtered for missing values.

Note There were 107 missing data points in baseline pacc score. 35 matched pacc scores were extracted and filled the missing baseline pacc from follow-up 1 pacc score in the ADRC data.

`

Note time was calculated from baseline to each followup otdate in years.

Words in noise summary

## # A tibble: 1 × 6
##    mean median    sd range_min range_max n_observations
##   <dbl>  <dbl> <dbl>     <dbl>     <dbl>          <int>
## 1  11.0     10  4.15       3.2        26           1165

Note PACC classification was done by first calculating the mean and sd of baseline pacc for the dataset. Those below and equal to -1 sd were called “low PACC” and those above were named “high pacc.

## # A tibble: 2 × 2
##   PACC_class_sick     n
##   <fct>           <int>
## 1 high PACC         256
## 2 low PACC           54

2 Model 1 pacc classified

model_pacc_classification <- lmer(
  WIN_Threshold_Avg ~ time * PACC_class_sick + educ + gender + age + race + ADI_NATRANK + noise_censusblock2020_mean + (1 | ID),
  data = Drives_for_the_WIN_lme_long,
  REML = FALSE
)
# View the summary of the model
summary(model_pacc_classification)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: WIN_Threshold_Avg ~ time * PACC_class_sick + educ + gender +  
##     age + race + ADI_NATRANK + noise_censusblock2020_mean + (1 |      ID)
##    Data: Drives_for_the_WIN_lme_long
## 
##      AIC      BIC   logLik deviance df.resid 
##   3915.1   3971.6  -1945.5   3891.1      808 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9974 -0.4931 -0.0619  0.4333  4.5284 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  ID       (Intercept) 9.629    3.103   
##  Residual             3.016    1.737   
## Number of obs: 820, groups:  ID, 310
## 
## Fixed effects:
##                               Estimate Std. Error t value
## (Intercept)                   0.514455   5.007136   0.103
## time                          0.449637   0.036123  12.447
## PACC_class_sicklow PACC       1.089955   0.533431   2.043
## educ                         -0.203945   0.081464  -2.503
## gender                       -2.662624   0.391622  -6.799
## age                           0.204005   0.039372   5.182
## raceWhite                     2.124973   0.575232   3.694
## ADI_NATRANK                   0.008401   0.008873   0.947
## noise_censusblock2020_mean    0.004328   0.073524   0.059
## time:PACC_class_sicklow PACC -0.019915   0.136055  -0.146
## 
## Correlation of Fixed Effects:
##             (Intr) time   PACC_P educ   gender age    racWht ADI_NA n_2020
## time        -0.028                                                        
## PACC_c_PACC  0.054  0.100                                                 
## educ        -0.332 -0.003  0.108                                          
## gender      -0.242  0.007  0.056  0.147                                   
## age         -0.508 -0.001 -0.199  0.013  0.084                            
## raceWhite   -0.113  0.037  0.068  0.029  0.064 -0.030                     
## ADI_NATRANK -0.324  0.012  0.017  0.272 -0.056  0.142  0.342              
## ns_cns2020_ -0.723  0.017 -0.003  0.017  0.043 -0.104 -0.018  0.075       
## t:PACC__PAC  0.023 -0.264 -0.166 -0.003  0.003 -0.054  0.021 -0.007  0.012

2.1 Model 1 table

sjPlot::plot_model(model_pacc_classification, 
                   title="Effect of Baseline PACC on WIN threshold")

sjPlot::tab_model(model_pacc_classification, 
                   title = "Effect of Baseline PACC on WIN threshold",
                   file = NULL)
Effect of Baseline PACC on WIN threshold
  WIN_Threshold_Avg
Predictors Estimates CI p
(Intercept) 0.51 -9.31 – 10.34 0.918
time 0.45 0.38 – 0.52 <0.001
PACC class sick [low
PACC]
1.09 0.04 – 2.14 0.041
educ -0.20 -0.36 – -0.04 0.012
gender -2.66 -3.43 – -1.89 <0.001
age 0.20 0.13 – 0.28 <0.001
race [White] 2.12 1.00 – 3.25 <0.001
ADI NATRANK 0.01 -0.01 – 0.03 0.344
noise censusblock2020
mean
0.00 -0.14 – 0.15 0.953
time × PACC class sick
[low PACC]
-0.02 -0.29 – 0.25 0.884
Random Effects
σ2 3.02
τ00 ID 9.63
ICC 0.76
N ID 310
Observations 820
Marginal R2 / Conditional R2 0.275 / 0.827

2.2 Model 1 data vis

2.2.1 Model 1 with no adjustment

## Intercept for High PACC: 0.5145
## Intercept for Low PACC: 1.6044

### Model 1 with plot_model

plot_model(model_pacc_classification, 
           type = "pred", 
           terms = c("time", "PACC_class_sick"),
           mdrt.values = "meansd",  # Ensure covariates are set at their means
           show.data = TRUE)  # Overlay raw data points

2.2.2 Model 1 with manual adjustment at the intercept

Adjusting at the Intercept Level (Group-Level Adjustment) When we adjust at the intercept, we assume that all participants share the same average values for covariates.

The intercept is pre-adjusted using the mean values of educ, age, ADI_NATRANK, etc. The same intercept applies to everyone, so predictions are for an “average” participant in each PACC group.

# Extract fixed effects from the model
fixed_effects <- fixef(model_pacc_classification)

# Compute mean values for continuous covariates
mean_educ <- mean(clean_data_PACC_Class$educ, na.rm = TRUE)
mean_age <- mean(clean_data_PACC_Class$age, na.rm = TRUE)
mean_ADI_NATRANK <- mean(clean_data_PACC_Class$ADI_NATRANK, na.rm = TRUE)
mean_noise <- mean(clean_data_PACC_Class$noise_censusblock2020_mean, na.rm = TRUE)

# Compute proportion of White participants
mean_raceWhite <- mean(clean_data_PACC_Class$race == "White", na.rm = TRUE)  # Binary 0/1 = 0.82
print(mean_raceWhite)
## [1] 0.8158537
# Find most common gender
most_common_gender <- as.numeric(names(sort(table(clean_data_PACC_Class$gender), decreasing = TRUE)[1]))

# Base intercept for "high PACC" group (reference)
intercept_high_pacc <- fixed_effects["(Intercept)"] +
                       fixed_effects["educ"] * mean_educ +
                       fixed_effects["age"] * mean_age +
                       fixed_effects["ADI_NATRANK"] * mean_ADI_NATRANK +
                       fixed_effects["noise_censusblock2020_mean"] * mean_noise +
                       fixed_effects["raceWhite"] * mean_raceWhite

# Adjust for gender: Apply effect if the most common gender is Female (coded as 2)
if (most_common_gender == 2) {
    intercept_high_pacc <- intercept_high_pacc + fixed_effects["gender"]
}

# Adjust for "low PACC" group
intercept_low_pacc <- intercept_high_pacc + fixed_effects["PACC_class_sicklow PACC"]

# Print intercepts
cat(sprintf("Intercept for PACC = high: %.4f\n", intercept_high_pacc))
## Intercept for PACC = high: 11.6542
cat(sprintf("Intercept for PACC = low: %.4f\n", intercept_low_pacc))
## Intercept for PACC = low: 12.7441
# Create a new data frame for predictions (fix factor levels)
new_data <- expand.grid(
  time = seq(0, 12, by = 1.5),
  PACC_class_sick = factor(c("high PACC", "low PACC"), levels = c("high PACC", "low PACC"))  # Explicitly set levels
)

# Compute predictions (holding covariates constant)
new_data <- new_data %>%
  mutate(
    PACC_class_sick = as.character(PACC_class_sick),  # Convert to character
    predicted = case_when(
      PACC_class_sick == "high PACC" ~ intercept_high_pacc + fixed_effects["time"] * time,
      PACC_class_sick == "low PACC" ~ intercept_low_pacc + (fixed_effects["time"] + fixed_effects["time:PACC_class_sicklow PACC"]) * time
    )
  )

# Plot results
ggplot(new_data, aes(x = time, y = predicted, color = PACC_class_sick)) +
  geom_line(size = 1.2) +  
  geom_point(size = 2) +  
  theme_minimal() +
  labs(
    title = "Hearing Ability Decline Over Time (WIN Threshold Avg)",
    subtitle = "Comparing PACC = High vs. Low at Baseline",
    x = "Time (Years)",
    y = "Predicted WIN Threshold Avg (Higher = Worse Hearing)",
    color = "Baseline PACC Classification"
  ) +
  theme(legend.position = "top")

## # A tibble: 2 × 2
##   MOCA_CAT      n
##   <fct>     <int>
## 1 high moca   176
## 2 low moca    156

3 Model 2 MOCA classified

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: WIN_Threshold_Avg ~ time * MOCA_CAT + educ + gender + age + race +  
##     ADI_NATRANK + noise_censusblock2020_mean + (1 | ID)
##    Data: clean_data_MOCA_CAT
## 
##      AIC      BIC   logLik deviance df.resid 
##   4235.4   4292.9  -2105.7   4211.4      874 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9249 -0.5074 -0.0461  0.4275  4.5591 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  ID       (Intercept) 9.910    3.148   
##  Residual             3.019    1.737   
## Number of obs: 886, groups:  ID, 332
## 
## Fixed effects:
##                             Estimate Std. Error t value
## (Intercept)                 0.691516   4.990949   0.139
## time                        0.433812   0.041411  10.476
## MOCA_CATlow moca            1.303714   0.398694   3.270
## educ                       -0.142609   0.077644  -1.837
## gender                     -2.632626   0.383697  -6.861
## age                         0.183372   0.038650   4.744
## raceWhite                   2.148517   0.554354   3.876
## ADI_NATRANK                 0.006228   0.008753   0.711
## noise_censusblock2020_mean  0.004776   0.073672   0.065
## time:MOCA_CATlow moca       0.085755   0.073194   1.172
## 
## Correlation of Fixed Effects:
##             (Intr) time   MOCA_m educ   gender age    racWht ADI_NA n_2020
## time        -0.020                                                        
## MOCA_CATlwm  0.090  0.168                                                 
## educ        -0.317  0.000  0.111                                          
## gender      -0.224  0.010  0.127  0.133                                   
## age         -0.510 -0.016 -0.198  0.007  0.082                            
## raceWhite   -0.118  0.018 -0.053  0.006  0.030 -0.004                     
## ADI_NATRANK -0.323 -0.001 -0.086  0.243 -0.070  0.148  0.355              
## ns_cns2020_ -0.740  0.016 -0.060  0.025  0.032 -0.084 -0.010  0.092       
## t:MOCA_CATm  0.004 -0.565 -0.222 -0.005 -0.004  0.007  0.018  0.007 -0.001

3.1 Model 2 table and estimate plot

sjPlot::plot_model(model_MOCA_classification, 
                   title="Effect of Baseline MoCA on WIN threshold")

sjPlot::tab_model(model_MOCA_classification, 
                   title = "Effect of Baseline MoCA on WIN threshold",
                   file = NULL)
Effect of Baseline MoCA on WIN threshold
  WIN_Threshold_Avg
Predictors Estimates CI p
(Intercept) 0.69 -9.10 – 10.49 0.890
time 0.43 0.35 – 0.52 <0.001
MOCA CAT [low moca] 1.30 0.52 – 2.09 0.001
educ -0.14 -0.30 – 0.01 0.067
gender -2.63 -3.39 – -1.88 <0.001
age 0.18 0.11 – 0.26 <0.001
race [White] 2.15 1.06 – 3.24 <0.001
ADI NATRANK 0.01 -0.01 – 0.02 0.477
noise censusblock2020
mean
0.00 -0.14 – 0.15 0.948
time × MOCA CAT [low
moca]
0.09 -0.06 – 0.23 0.242
Random Effects
σ2 3.02
τ00 ID 9.91
ICC 0.77
N ID 332
Observations 886
Marginal R2 / Conditional R2 0.285 / 0.833

3.2 Model 2 data vis

3.2.1 Model 2 with no adjustment

## Intercept for High MOCA: 0.6915
## Intercept for Low MOCA: 1.9952

3.2.2 Model 2 with manual adjustment at the intercept

Adjusting at the Intercept Level (Group-Level Adjustment) When we adjust at the intercept, we assume that all participants share the same average values for covariates.

The intercept is pre-adjusted using the mean values of educ, age, ADI_NATRANK, etc. The same intercept applies to everyone, so predictions are for an “average” participant in each MoCA group.

# Extract fixed effects from the model
fixed_effects <- fixef(model_MOCA_classification)

# Compute mean values for continuous covariates
mean_educ <- mean(clean_data_MOCA_CAT$educ, na.rm = TRUE)
mean_age <- mean(clean_data_MOCA_CAT$age, na.rm = TRUE)
mean_ADI_NATRANK <- mean(clean_data_MOCA_CAT$ADI_NATRANK, na.rm = TRUE)
mean_noise <- mean(clean_data_MOCA_CAT$noise_censusblock2020_mean, na.rm = TRUE)

# Compute proportion of White participants
mean_raceWhite <- mean(clean_data_MOCA_CAT$race == "White", na.rm = TRUE)  # Binary 0/1 = 0.82

# Find most common gender
most_common_gender <- as.numeric(names(sort(table(clean_data_MOCA_CAT$gender), decreasing = TRUE)[1]))

# Base intercept for "high moca" group (reference group)
intercept_high_moca <- fixed_effects["(Intercept)"] +
                       fixed_effects["educ"] * mean_educ +
                       fixed_effects["age"] * mean_age +
                       fixed_effects["ADI_NATRANK"] * mean_ADI_NATRANK +
                       fixed_effects["noise_censusblock2020_mean"] * mean_noise +
                       fixed_effects["raceWhite"] * mean_raceWhite

# Adjust for gender: Apply effect if the most common gender is Female (coded as 2)
if (most_common_gender == 2) {
    intercept_high_moca <- intercept_high_moca + fixed_effects["gender"]
}

# Adjust intercept for "low moca" group
intercept_low_moca <- intercept_high_moca + fixed_effects["MOCA_CATlow moca"]

# Print intercepts
cat(sprintf("Intercept for MOCA = high: %.4f\n", intercept_high_moca))
## Intercept for MOCA = high: 11.2863
cat(sprintf("Intercept for MOCA = low: %.4f\n", intercept_low_moca))
## Intercept for MOCA = low: 12.5901
# Create a new data frame for predictions
new_data <- expand.grid(
  time = seq(0, 12, by = 1.5),  # Time from 0 to 12
  MOCA_CAT = factor(c("high moca", "low moca"), levels = levels(clean_data_MOCA_CAT$MOCA_CAT))  # Match factor levels
)

# Compute predictions manually (holding covariates constant)
new_data <- new_data %>%
  mutate(
      predicted = case_when(
      MOCA_CAT == "high moca" ~ intercept_high_moca + fixed_effects["time"] * time,
      MOCA_CAT == "low moca" ~ intercept_low_moca + (fixed_effects["time"] + fixed_effects["time:MOCA_CATlow moca"]) * time
    )
  )

# Plot results
ggplot(new_data, aes(x = time, y = predicted, color = MOCA_CAT)) +
  geom_line(size = 1.2) +  
  geom_point(size = 2) +  
  theme_minimal() +
  labs(
    title = "Hearing Ability Decline Over Time (WIN Threshold Avg)",
    subtitle = "Comparing MOCA = High vs. Low at Baseline",
    x = "Time (Years)",
    y = "Predicted WIN Threshold Avg (Higher = Worse Hearing)",
    color = "Baseline MOCA Classification"
  ) +
  theme(legend.position = "top")

### Model 2 with plot_model

Plotting using a package: By default, plot_model() holds all covariates at their mean or median values and covariate coefficient estimate for categorical variables is based on the most prevalent category.

plot_model(model_MOCA_classification, 
           type = "pred", 
           terms = c("time", "MOCA_CAT"),
           mdrt.values = "meansd",  # Ensure covariates are set at their means
           show.data = TRUE)  # Overlay raw data points

## # A tibble: 2 × 2
##   CDR_status     n
##   <fct>      <int>
## 1 cdr = 0      251
## 2 cdr > 0      122

4 Model 3 CDR_status

model_CDR_status <- lmer(
  WIN_Threshold_Avg ~ time * CDR_status + educ + gender + age + race + ADI_NATRANK + noise_censusblock2020_mean + (1 | ID),
  data = clean_data_CDR,
  REML = FALSE
)
# View the summary of the model
summary(model_CDR_status)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: WIN_Threshold_Avg ~ time * CDR_status + educ + gender + age +  
##     race + ADI_NATRANK + noise_censusblock2020_mean + (1 | ID)
##    Data: clean_data_CDR
## 
##      AIC      BIC   logLik deviance df.resid 
##   4949.7   5009.2  -2462.9   4925.7     1032 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.0405 -0.5038 -0.0716  0.4400  4.5622 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  ID       (Intercept) 10.413   3.227   
##  Residual              2.911   1.706   
## Number of obs: 1044, groups:  ID, 373
## 
## Fixed effects:
##                             Estimate Std. Error t value
## (Intercept)                -5.811478   4.709096  -1.234
## time                        0.476691   0.029873  15.957
## CDR_statuscdr > 0           1.225474   0.398670   3.074
## educ                       -0.170846   0.077561  -2.203
## gender2                    -2.378407   0.368601  -6.453
## age                         0.196800   0.037083   5.307
## raceWhite                   1.632221   0.539349   3.026
## ADI_NATRANK                 0.009407   0.008178   1.150
## noise_censusblock2020_mean  0.072777   0.070630   1.030
## time:CDR_statuscdr > 0     -0.065040   0.118498  -0.549
## 
## Correlation of Fixed Effects:
##             (Intr) time   CDR_>0 educ   gendr2 age    racWht ADI_NA n_2020
## time        -0.026                                                        
## CDR_sttsc>0 -0.005  0.137                                                 
## educ        -0.333 -0.004  0.114                                          
## gender2     -0.154  0.012  0.035  0.166                                   
## age         -0.499  0.003 -0.084  0.001  0.080                            
## raceWhite   -0.113  0.010 -0.096  0.008  0.070  0.020                     
## ADI_NATRANK -0.319  0.015 -0.004  0.261 -0.064  0.115  0.351              
## ns_cns2020_ -0.740  0.013  0.005  0.040  0.025 -0.106 -0.034  0.087       
## tm:CDR_st>0 -0.008 -0.252 -0.187  0.021 -0.004 -0.024  0.017  0.002  0.022

4.1 Table for Model 3

sjPlot::plot_model(model_CDR_status, 
                   title="Effect of Baseline CDR status on WIN threshold")

sjPlot::tab_model(model_CDR_status, 
                   title = "Effect of Baseline CDR status on WIN threshold",
                   file = NULL)
Effect of Baseline CDR status on WIN threshold
  WIN_Threshold_Avg
Predictors Estimates CI p
(Intercept) -5.81 -15.05 – 3.43 0.217
time 0.48 0.42 – 0.54 <0.001
CDR status [cdr > 0] 1.23 0.44 – 2.01 0.002
educ -0.17 -0.32 – -0.02 0.028
gender [2] -2.38 -3.10 – -1.66 <0.001
age 0.20 0.12 – 0.27 <0.001
race [White] 1.63 0.57 – 2.69 0.003
ADI NATRANK 0.01 -0.01 – 0.03 0.250
noise censusblock2020
mean
0.07 -0.07 – 0.21 0.303
time × CDR status [cdr >
0]
-0.07 -0.30 – 0.17 0.583
Random Effects
σ2 2.91
τ00 ID 10.41
ICC 0.78
N ID 373
Observations 1044
Marginal R2 / Conditional R2 0.230 / 0.832

4.2 Data vis for Model 3

4.2.1 Model 3 with no adjustment at the intercept

# Extract fixed effects from the model
fixed_effects <- fixef(model_CDR_status)

# Compute intercepts for MOCA classification groups
intercept_CDR_0 <- fixed_effects["(Intercept)"]   # -5.811478 (Reference group)
intercept_CDR_GT0 <- intercept_CDR_0 + fixed_effects["CDR_statuscdr > 0"]  # -5.811478 + 1.225474

# Print intercepts with titles
cat(sprintf("Intercept for CDR_0: %.4f\n", intercept_CDR_0))
## Intercept for CDR_0: -5.8115
cat(sprintf("Intercept for CDR_GT0: %.4f\n", intercept_CDR_GT0))
## Intercept for CDR_GT0: -4.5860
# Create a new data frame for predictions
new_data <- expand.grid(
  time = seq(0, 12, by = 1.5),  # Time from 0 to 12
  CDR_status = c("CDR = 0", "CDR > 0")
)

# Compute predictions manually
new_data <- new_data %>%
  mutate(
    predicted = case_when(
      CDR_status == "CDR = 0" ~ intercept_CDR_0 + fixed_effects["time"] * time,
      CDR_status == "CDR > 0" ~ intercept_CDR_GT0 + (fixed_effects["time"] + fixed_effects["time:CDR_statuscdr > 0"]) * time
    )
  )

# Plot using ggplot2
ggplot(new_data, aes(x = time, y = predicted, color = CDR_status)) +
  geom_line(size = 1) +
  geom_point(size = 2) +
  theme_minimal() +
  labs(title = "Predicted values of WIN_Threshold_Avg by CDR_status",
       x = "Time",
       y = "WIN_Threshold_Avg",
       color = "CDR_status") +
  theme(legend.position = "top")

4.2.2 Model 3 with adjustment at the intercept

Adjusting at the Intercept Level (Group-Level Adjustment) When we adjust at the intercept, we assume that all participants share the same average values for covariates.

The intercept is pre-adjusted using the mean values of educ, age, ADI_NATRANK, etc. The same intercept applies to everyone, so predictions are for an “average” participant in each CDR_status group.

## Intercept for CDR = 0 (holding covariates constant): 8.8340
## Intercept for CDR > 0 (holding covariates constant): 10.0595

4.2.3 Model 3 with plot_model

Plotting using a package: By default, plot_model() holds all covariates at their mean or median values and covariate coefficient estimate for categorical variables is based on the most prevalent category.

plot_model(model_CDR_status, 
           type = "pred", 
           terms = c("time", "CDR_status"),
           mdrt.values = "meansd",  # Ensure covariates are set at their means
           show.data = TRUE)  # Overlay raw data points

4.2.4 Model 3 with manual adjustment at the intercept

# Compute mean values for continuous variables
mean_educ <- mean(clean_data_CDR$educ, na.rm = TRUE)
mean_age <- mean(clean_data_CDR$age, na.rm = TRUE)
mean_ADI_NATRANK <- mean(clean_data_CDR$ADI_NATRANK, na.rm = TRUE)
mean_noise <- mean(clean_data_CDR$noise_censusblock2020_mean, na.rm = TRUE)

# Compute proportion of White participants
mean_raceWhite <- mean(clean_data_CDR$race == "White", na.rm = TRUE)  # Binary 0/1 → proportion

# Find most common gender
most_common_gender <- as.numeric(names(sort(table(clean_data_CDR$gender), decreasing = TRUE)[1]))

# Extract fixed effects from the model
fixed_effects <- fixef(model_CDR_status)

# Compute intercepts for MOCA classification groups
intercept_CDR_0 <- fixed_effects["(Intercept)"]   # -5.811478 (Reference group)
intercept_CDR_GT0 <- intercept_CDR_0 + fixed_effects["CDR_statuscdr > 0"]  # -5.811478 + 1.225474

# Print intercepts with titles
cat(sprintf("Intercept for CDR_0: %.4f\n", intercept_CDR_0))
## Intercept for CDR_0: -5.8115
cat(sprintf("Intercept for CDR_GT0: %.4f\n", intercept_CDR_GT0))
## Intercept for CDR_GT0: -4.5860
# Create a new data frame for predictions
new_data <- expand.grid(
  time = seq(0, 12, by = 1.5),  # Time from 0 to 12
  CDR_status = factor(c("cdr = 0", "cdr > 0"), levels = levels(clean_data_CDR$CDR_status))  # Ensures factor levels match the dataset
)

# Compute predictions in mutate() (apply all covariate adjustments here)
new_data <- new_data %>%
  mutate(
    predicted = case_when(
      CDR_status == "cdr = 0" ~ intercept_CDR_0 +
                                fixed_effects["time"] * time +
                                fixed_effects["educ"] * mean_educ +
                                fixed_effects["gender2"] * (most_common_gender == 2) +
                                fixed_effects["age"] * mean_age +
                                fixed_effects["raceWhite"] * mean_raceWhite +
                                fixed_effects["ADI_NATRANK"] * mean_ADI_NATRANK +
                                fixed_effects["noise_censusblock2020_mean"] * mean_noise,
                                
      CDR_status == "cdr > 0" ~ intercept_CDR_GT0 +
                                (fixed_effects["time"] + fixed_effects["time:CDR_statuscdr > 0"]) * time +
                                fixed_effects["educ"] * mean_educ +
                                fixed_effects["gender2"] * (most_common_gender == 2) +
                                fixed_effects["age"] * mean_age +
                                fixed_effects["raceWhite"] * mean_raceWhite +
                                fixed_effects["ADI_NATRANK"] * mean_ADI_NATRANK +
                                fixed_effects["noise_censusblock2020_mean"] * mean_noise
    )
  )

# Plot using ggplot2
ggplot(new_data, aes(x = time, y = predicted, color = CDR_status)) +
  geom_line(size = 1.2) +  # Trend lines
  geom_point(size = 2) +   # Points for each time step
  theme_minimal() +
  labs(
    title = "Hearing Ability Decline Over Time (WIN Threshold Avg)",
    subtitle = "Comparing CDR = 0 vs. CDR > 0 at Baseline",
    x = "Time (Years)",
    y = "Predicted WIN Threshold Avg (Higher = Worse Hearing)",
    color = "Baseline CDR Status"
  ) +
  theme(legend.position = "top")