NSAIDs and Dementia: Reassessing Their Role in Alzheimer’s Disease

The Potential of NSAIDs: What Have We Overlooked in Alzheimer’s Research?

NSAIDs and Alzheimer’s Risk: A Fresh Look at the Evidence

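In this file, I have attempted to summarize the code and reproduce it. Below, you will find my discussion questions, my technical questions about the R code, and the code itself with its output.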

NACCALZD (dementia due to AD) ≠ DEMENTED (dementia regardless of etiology)

? I looked everywhere, but I couldn't find the criteria for AD. Have you?

Discussion Questions & Considerations

Q1) How was the education-corrected MOCA score of 25 (19, 27) calculated?

Q2) Fig 1

  • When looking at Figure 1, I find it odd that we include patients with a MOCA score below a certain threshold as “normal cognition”, even if their low score is due to impairments unrelated to dementia. On the other hand, it also seems strange to include patients with a MOCA score above 26 (up to 30) in both the demented and NACCALZD groups.

What do you think?

Q3) Why was the following hierarchy applied in assigning NSAID_TYPE, and why is Etodolac left out?

# Assign NSAID_TYPE = 1 where DICLOFENAC is used (Diclofenac supersedes Naproxen)
collapsed_data2$NSAID_TYPE[collapsed_data2$DICLOFENAC == 1] <- 1

# Assign NSAID_TYPE = 0 where only Naproxen is used (and not Diclofenac)
collapsed_data2$NSAID_TYPE[is.na(collapsed_data2$NSAID_TYPE) & collapsed_data2$NAPROXEN == 1] <- 0

Q4) Fig 2 only includes demented patients.

  • What about NACCALZD?
  • What about Etodolac? It was also associated with lower risk, though the evidence was weaker.

  • What about a comparison with non-NSAID users?

Q5) Fig 4: Shall we also include a box for demented non-NSAID users and one for NACCALZD?

Q6) NSAIDs, Arthritis, and Dementia Risk

  • Off-topic, but the data also suggests a significant decrease in dementia risk among those with comorbid arthritis. Would it be too much to compare this cohort based on NSAID use?
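One simple way to start on Q6 would be to stratify: compare the proportion with dementia across the four arthritis-by-NSAID cells before fitting anything more elaborate. The sketch below uses a made-up data frame; the column `NSAID` (any NSAID use, 0/1) is a hypothetical name, and the values are for illustration only.

```r
# Toy data, NOT from the NACC file; NSAID is a hypothetical any-use indicator
toy <- data.frame(
  ARTH     = c(1, 1, 1, 1, 0, 0, 0, 0),
  NSAID    = c(1, 1, 0, 0, 1, 1, 0, 0),
  DEMENTED = c(0, 1, 1, 1, 0, 0, 1, 0)
)

# Proportion with DEMENTED == 1 in each ARTH x NSAID cell
rate <- tapply(toy$DEMENTED, list(ARTH = toy$ARTH, NSAID = toy$NSAID), mean)
print(rate)

# Number of patients behind each proportion
print(table(ARTH = toy$ARTH, NSAID = toy$NSAID))
```

If the NSAID contrast looks different in the two arthritis strata, an ARTH-by-NSAID interaction term in the logistic model would be the natural next step.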

Q7) Changes in MOCA scores over time: A comparison of Naproxen, Diclofenac, and non-NSAID groups of NACCALZD

  • How does the time interval between assessments affect the calculations? The impact may differ if visits are spaced one week apart versus five years apart.
  • Analysis includes AD patients with multiple MOCA assessments.

  • Evaluating the change (Δ) in MOCA scores over time.
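On the time-interval point in Q7: one way to make changes comparable across patients is to express them per year, for example as the slope of MOCA on time since the first visit. The sketch below assumes a hypothetical column `years` (years since each patient's first visit), which does not appear in the code in this file and would have to be derived from the visit dates. The data are made up.

```r
# Toy longitudinal data, NOT from the NACC file
visits <- data.frame(
  NACCID   = c("A", "A", "A", "B", "B"),
  years    = c(0, 1, 3, 0, 0.5),   # hypothetical: years since first visit
  NACCMOCA = c(26, 25, 22, 20, 19)
)

# Per-patient slope of MOCA on time (points per year);
# NA when a patient has fewer than two distinct time points
annual_change <- sapply(split(visits, visits$NACCID), function(d) {
  if (nrow(d) < 2 || diff(range(d$years)) == 0) return(NA_real_)
  unname(coef(lm(NACCMOCA ~ years, data = d))["years"])
})
print(annual_change)
```

Patient B loses one point in six months, which becomes -2 points per year; a raw last-minus-first Δ would treat that the same as a one-point loss over five years.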

Technical Questions: R Code Issues

I have the following R code where I’ve marked the parts I struggled with by writing "ERROR" in the header. Some of these errors are consequential.

Would you like me to send the R script file instead of pasting the code here? Or should I just provide the code inline for you to review?

1) Dataframes

I am unable to reproduce the model_data dataframe or the match_obj object. As a result, every later step that depends on them fails.

2) What do these warnings mean, and why do they fill a whole page?

Warning message:
“glm.fit: fitted probabilities numerically 0 or 1 occurred”
Warning message:
“Removed 25522 rows containing non-finite outside the scale range
(`stat_smooth()`).”
`geom_smooth()` using formula = 'y ~ x'
Warning message:
“Removed 25522 rows containing non-finite outside the scale range
(`stat_smooth()`).”
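Both messages are warnings rather than errors, so the code still ran. The `glm.fit` warning usually points to (quasi-)complete separation: some predictor, or combination of predictors, predicts the outcome almost perfectly, so fitted probabilities are pushed to 0 or 1. The `stat_smooth()` warning says that ggplot2 dropped rows whose x or y value was NA or infinite before smoothing. The output is probably so long because each model fit and each plot layer emits its own copy. A minimal base R sketch on made-up data (the vectors `x`, `y`, and `moca` are hypothetical, not from the NACC file) reproduces the first warning and counts the kind of rows the second one refers to:

```r
# Hypothetical, perfectly separated data: y is 0 whenever x <= 3, else 1
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)

# Collect the warnings instead of printing them
caught <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial(link = "logit")),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(caught)

# Rows a smoother would drop: values that are NA, NaN, or infinite
moca <- c(25, NA, 30, NA, 18)
n_dropped <- sum(!is.finite(moca))
print(n_dropped)
```

For the real models, very large coefficients or standard errors in `summary()` are a typical sign of which covariate causes the separation.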

Code

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: lattice
## 
## 
## Attaching package: 'caret'
## 
## 
## The following object is masked from 'package:purrr':
## 
##     lift
## 
## 
## Loading required package: carData
## 
## 
## Attaching package: 'car'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## 
## The following object is masked from 'package:purrr':
## 
##     some
## 
## 
## Loading required package: zoo
## 
## 
## Attaching package: 'zoo'
## 
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## 
## 
## Attaching package: 'gridExtra'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     combine

Data collapsed

Data collapsed 1

# Define the columns to search for drugs (columns that start with "DRUG")
drug_columns <- grep("^DRUG", names(data), value = TRUE)

# Convert all drug columns to lowercase for case-insensitive comparison
data[drug_columns] <- lapply(data[drug_columns], tolower)

# Efficiently check for the presence of each drug across the relevant columns

# Create DICLOFENAC column
data$DICLOFENAC <- ifelse(rowSums(sapply(data[drug_columns], function(x) grepl("diclofenac", x))) > 0, 1, 0)

# Create NAPROXEN column
data$NAPROXEN <- ifelse(rowSums(sapply(data[drug_columns], function(x) grepl("naproxen", x))) > 0, 1, 0)

# Create ETODOLAC column
data$ETODOLAC <- ifelse(rowSums(sapply(data[drug_columns], function(x) grepl("etodolac", x))) > 0, 1, 0)

grep("NACCTBI",names(data))
## [1] 147
# Define the columns to search for diagnostic conditions
diagnosis_cols <- c("NACCTBI", "HXHYPER", "HXSTROKE", "DEP", "BIPOLDX", "SCHIZOP", 
                    "ANXIET", "PTSDDX", "OTHPSY", "ALCABUSE", "CANCER", "DIABETES", 
                    "MYOINF", "CONGHRT", "AFIBRILL", "HYPERT", "HYPCHOL", "VB12DEF", 
                    "THYDIS", "ARTH", "SLEEPAP", "OTHCOND")

# Step 2: Create binary indicators for each diagnosis column indicating if the patient has the condition at any visit
data <- data %>%
  mutate(across(all_of(diagnosis_cols), ~ ifelse(. %in% c(1, 2), 1, 0)))

# Now, aggregate these indicators by NACCID to get the total number of unique conditions per patient
diagnosis_summary <- data %>%
  group_by(NACCID) %>%
  summarise(across(all_of(diagnosis_cols), ~ max(.x, na.rm = TRUE))) %>%
  mutate(DIAGNOSIS = rowSums(across(all_of(diagnosis_cols))))

# Merge the DIAGNOSIS summary back to the original data without collapsing to one row per NACCID
data <- data %>%
  left_join(diagnosis_summary %>% select(NACCID, DIAGNOSIS), by = "NACCID")

save(data, diagnosis_summary, diagnosis_cols, drug_columns, file = "data2Oct2024.RData")


# Preliminary: Collapse DICLOFENAC, ETODOLAC, NAPROXEN, and CSFTAU columns across all visits for each unique patient (NACCID)
data <- data %>%
  group_by(NACCID) %>%
  mutate(DICLOFENAC = ifelse(any(DICLOFENAC == 1, na.rm = TRUE), 1, 0),
         NAPROXEN = ifelse(any(NAPROXEN == 1, na.rm = TRUE), 1, 0),
         ETODOLAC = ifelse(any(ETODOLAC == 1, na.rm = TRUE), 1, 0),
         CSFTAU = ifelse(any(CSFTAU == 1, na.rm = TRUE), 1, 0))%>%
  ungroup()

# Keep one row per patient. Note: the filter below always retains the row with
# max(NACCVNUM), so slice_max() returns each patient's most recent visit even
# when its NACCMOCA is not a valid 0-30 score.
most_recent_data <- data %>%
  group_by(NACCID) %>%
  filter((NACCMOCA >= 0 & NACCMOCA <= 30) | is.na(NACCMOCA) | NACCVNUM == max(NACCVNUM)) %>%
  slice_max(NACCVNUM, n = 1) %>%
  ungroup()
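If the intent was "the most recent visit with a valid MOCA score, otherwise simply the most recent visit", the selection could be written as in the following sketch. It uses base R and a made-up `visits` data frame; only the column names NACCID, NACCVNUM, and NACCMOCA are taken from the code above, and the values (including the codes 88 and -4) are for illustration.

```r
# Toy visit-level data, NOT from the NACC file
visits <- data.frame(
  NACCID   = c("A", "A", "A", "B", "B", "C"),
  NACCVNUM = c(1, 2, 3, 1, 2, 1),
  NACCMOCA = c(25, 24, 88, -4, -4, 27)
)

# A score counts as valid when it lies in the 0-30 range
valid <- !is.na(visits$NACCMOCA) & visits$NACCMOCA >= 0 & visits$NACCMOCA <= 30

# Within each patient: valid rows first, then latest visit first
ord <- order(visits$NACCID, !valid, -visits$NACCVNUM)
sorted <- visits[ord, ]

# First row per patient = latest valid visit, or latest visit if none is valid
picked <- sorted[!duplicated(sorted$NACCID), ]
print(picked)
```

The trade-off is that diagnoses and covariates then come from the same, possibly older, visit as the MOCA score rather than from the final visit.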

# c. Combine diagnosis data with the most recent visit data
collapsed_data <- most_recent_data %>%
  select(NACCID, NACCVNUM, NACCAGE, SEX, NACCNIHR, EDUC, NACCALZD, NACCALZP,
         NACCMOCA, CDRGLOB, NACCMMSE, NORMCOG, DEMENTED, NACCUDSD, 
         CSFTAU, NACCTBI, HXHYPER, HXSTROKE, DEP, BIPOLDX, SCHIZOP, 
         ANXIET, PTSDDX, OTHPSY, ALCABUSE, CANCER, DIABETES, MYOINF, 
         CONGHRT, AFIBRILL, HYPERT, HYPCHOL, VB12DEF, THYDIS, ARTH, 
         SLEEPAP, OTHCOND, DICLOFENAC, NAPROXEN, ETODOLAC, DIAGNOSIS,
         MOCATRAI, MOCACUBE, MOCACLOC, MOCACLON, MOCACLOH, MOCANAMI, 
         MOCAFLUE, MOCAREPE, MOCAREGI, MOCARECN, MOCAABST, MOCADIGI, 
         MOCALETT, MOCASER7, MOCAORDT, MOCAORMO, MOCAORYR, MOCAORDY, 
         MOCAORPL, MOCAORCT)


# List of columns to convert
columns_to_convert <- c("MOCATRAI", "MOCACUBE", "MOCACLOC", "MOCACLON", "MOCACLOH", "MOCANAMI", 
                        "MOCAFLUE", "MOCAREPE", "MOCAREGI", "MOCARECN", "MOCAABST", "MOCADIGI", 
                        "MOCALETT", "MOCASER7", "MOCAORDT", "MOCAORMO", "MOCAORYR", "MOCAORDY", 
                        "MOCAORPL", "MOCAORCT")

# Convert the columns to numeric
collapsed_data[columns_to_convert] <- lapply(collapsed_data[columns_to_convert], as.numeric)

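# Note: this overwrites the file written by the earlier save() call, which also
# held diagnosis_summary, diagnosis_cols, and drug_columns
save(data, collapsed_data, file="data2Oct2024.RData")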

names(collapsed_data)
##  [1] "NACCID"     "NACCVNUM"   "NACCAGE"    "SEX"        "NACCNIHR"  
##  [6] "EDUC"       "NACCALZD"   "NACCALZP"   "NACCMOCA"   "CDRGLOB"   
## [11] "NACCMMSE"   "NORMCOG"    "DEMENTED"   "NACCUDSD"   "CSFTAU"    
## [16] "NACCTBI"    "HXHYPER"    "HXSTROKE"   "DEP"        "BIPOLDX"   
## [21] "SCHIZOP"    "ANXIET"     "PTSDDX"     "OTHPSY"     "ALCABUSE"  
## [26] "CANCER"     "DIABETES"   "MYOINF"     "CONGHRT"    "AFIBRILL"  
## [31] "HYPERT"     "HYPCHOL"    "VB12DEF"    "THYDIS"     "ARTH"      
## [36] "SLEEPAP"    "OTHCOND"    "DICLOFENAC" "NAPROXEN"   "ETODOLAC"  
## [41] "DIAGNOSIS"  "MOCATRAI"   "MOCACUBE"   "MOCACLOC"   "MOCACLON"  
## [46] "MOCACLOH"   "MOCANAMI"   "MOCAFLUE"   "MOCAREPE"   "MOCAREGI"  
## [51] "MOCARECN"   "MOCAABST"   "MOCADIGI"   "MOCALETT"   "MOCASER7"  
## [56] "MOCAORDT"   "MOCAORMO"   "MOCAORYR"   "MOCAORDY"   "MOCAORPL"  
## [61] "MOCAORCT"
dim(collapsed_data)
## [1] 47165    61
n_unique_NACCID <- length(unique(data$NACCID))


# Recode the missing-value codes to NA in every numeric column except NACCAGE
missing_codes <- c(-4, 88, 95, 96, 97, 98, 99)
collapsed_data <- collapsed_data %>%
  mutate(across(
    .cols = where(is.numeric) & !all_of("NACCAGE"),
    .fns = ~ replace(.x, .x %in% missing_codes, NA)))

# Identify the variables that need to be summed
columns_of_interestVS <- c("MOCATRAI", "MOCACUBE", "MOCACLOC", "MOCACLON", "MOCACLOH")
columns_of_interestLAN <- c("MOCANAMI", "MOCAFLUE", "MOCAREPE")
columns_of_interestMEM1 <- c("MOCAREGI")
columns_of_interestMEM2 <- c("MOCARECN")
columns_of_interestABS <- c("MOCAABST")
columns_of_interestATTN <- c("MOCADIGI", "MOCALETT", "MOCASER7")
columns_of_interestORI <- c("MOCAORDT", "MOCAORMO", "MOCAORYR", "MOCAORDY", "MOCAORPL", "MOCAORCT")

# Create new columns based on the sum of the variables
collapsed_data <- collapsed_data %>%
  mutate(
    VISUOSPATIAL = rowSums(select(., all_of(columns_of_interestVS)), na.rm = TRUE),
    LANGUAGE = rowSums(select(., all_of(columns_of_interestLAN)), na.rm = TRUE),
    MEMORY1 = rowSums(select(., all_of(columns_of_interestMEM1)), na.rm = TRUE),
    MEMORY2 = rowSums(select(., all_of(columns_of_interestMEM2)), na.rm = TRUE),
    ABSTRACTION = rowSums(select(., all_of(columns_of_interestABS)), na.rm = TRUE),
    ATTENTION = rowSums(select(., all_of(columns_of_interestATTN)), na.rm = TRUE),
    ORIENTATION = rowSums(select(., all_of(columns_of_interestORI)), na.rm = TRUE)
  )
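One side effect of `rowSums(..., na.rm = TRUE)` is that a patient with no MOCA item scores at all gets a domain score of 0 instead of NA, which is why the domain columns in the summary output show medians of 0 and no NA's. A minimal sketch of an NA-preserving alternative follows; the helper name `row_sum_or_na` is hypothetical and the item values are made up.

```r
# Hypothetical helper: row sum that stays NA when every item in the row is NA
row_sum_or_na <- function(df) {
  s <- rowSums(df, na.rm = TRUE)
  s[rowSums(!is.na(df)) == 0] <- NA
  s
}

# Toy item scores, NOT from the NACC file
items <- data.frame(
  MOCATRAI = c(1, NA, 0),
  MOCACUBE = c(1, NA, NA)
)

domain_score <- unname(row_sum_or_na(items))
print(domain_score)
```

Rows with only some items missing are still summed over the available items; whether those should also become NA is a separate decision.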

summary(collapsed_data)
##     NACCID             NACCVNUM         NACCAGE            SEX      
##  Length:47165       Min.   : 1.000   Min.   : 18.00   Min.   :1.00  
##  Class :character   1st Qu.: 1.000   1st Qu.: 68.00   1st Qu.:1.00  
##  Mode  :character   Median : 3.000   Median : 75.00   Median :2.00  
##                     Mean   : 3.704   Mean   : 74.62   Mean   :1.57  
##                     3rd Qu.: 5.000   3rd Qu.: 82.00   3rd Qu.:2.00  
##                     Max.   :18.000   Max.   :110.00   Max.   :2.00  
##                                                                     
##     NACCNIHR          EDUC          NACCALZD        NACCALZP    
##  Min.   :1.000   Min.   : 0.00   Min.   :0.000   Min.   :1.000  
##  1st Qu.:1.000   1st Qu.:12.00   1st Qu.:1.000   1st Qu.:1.000  
##  Median :1.000   Median :16.00   Median :1.000   Median :7.000  
##  Mean   :1.411   Mean   :15.17   Mean   :3.292   Mean   :4.757  
##  3rd Qu.:1.000   3rd Qu.:18.00   3rd Qu.:8.000   3rd Qu.:8.000  
##  Max.   :6.000   Max.   :31.00   Max.   :8.000   Max.   :8.000  
##  NA's   :824     NA's   :363                                    
##     NACCMOCA        CDRGLOB          NACCMMSE        NORMCOG      
##  Min.   : 0.00   Min.   :0.0000   Min.   : 0.00   Min.   :0.0000  
##  1st Qu.:19.00   1st Qu.:0.0000   1st Qu.:19.00   1st Qu.:0.0000  
##  Median :24.00   Median :0.5000   Median :26.00   Median :0.0000  
##  Mean   :22.05   Mean   :0.8002   Mean   :22.76   Mean   :0.3567  
##  3rd Qu.:27.00   3rd Qu.:1.0000   3rd Qu.:29.00   3rd Qu.:1.0000  
##  Max.   :30.00   Max.   :3.0000   Max.   :30.00   Max.   :1.0000  
##  NA's   :32487                    NA's   :26564                   
##     DEMENTED         NACCUDSD         CSFTAU           NACCTBI       
##  Min.   :0.0000   Min.   :1.000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:1.000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.0000   Median :3.000   Median :0.00000   Median :0.00000  
##  Mean   :0.4252   Mean   :2.672   Mean   :0.01662   Mean   :0.07887  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :4.000   Max.   :1.00000   Max.   :1.00000  
##                                                                      
##     HXHYPER         HXSTROKE            DEP            BIPOLDX       
##  Min.   :0.000   Min.   :0.00000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.000   Median :0.00000   Median :0.0000   Median :0.00000  
##  Mean   :0.255   Mean   :0.03515   Mean   :0.1763   Mean   :0.00299  
##  3rd Qu.:1.000   3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.00000  
##  Max.   :1.000   Max.   :1.00000   Max.   :1.0000   Max.   :1.00000  
##                                                                      
##     SCHIZOP              ANXIET            PTSDDX             OTHPSY       
##  Min.   :0.0000000   Min.   :0.00000   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.0000000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.00000  
##  Median :0.0000000   Median :0.00000   Median :0.000000   Median :0.00000  
##  Mean   :0.0008057   Mean   :0.02837   Mean   :0.002353   Mean   :0.01427  
##  3rd Qu.:0.0000000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:0.00000  
##  Max.   :1.0000000   Max.   :1.00000   Max.   :1.000000   Max.   :1.00000  
##                                                                            
##     ALCABUSE            CANCER           DIABETES           MYOINF       
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.000000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.001399   Mean   :0.09348   Mean   :0.09492   Mean   :0.01904  
##  3rd Qu.:0.000000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.000000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##                                                                          
##     CONGHRT           AFIBRILL          HYPERT          HYPCHOL      
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.01446   Mean   :0.0444   Mean   :0.2561   Mean   :0.2704  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##                                                                      
##     VB12DEF            THYDIS             ARTH           SLEEPAP       
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.0000   Median :0.00000  
##  Mean   :0.03785   Mean   :0.09844   Mean   :0.2793   Mean   :0.09354  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.0000   Max.   :1.00000  
##                                                                        
##     OTHCOND         DICLOFENAC        NAPROXEN          ETODOLAC       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.000000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.000000  
##  Mean   :0.1595   Mean   :0.0219   Mean   :0.06098   Mean   :0.003922  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.000000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.000000  
##                                                                        
##    DIAGNOSIS         MOCATRAI        MOCACUBE        MOCACLOC    
##  Min.   : 0.000   Min.   :0.00    Min.   :0.0     Min.   :0.00   
##  1st Qu.: 1.000   1st Qu.:0.00    1st Qu.:0.0     1st Qu.:1.00   
##  Median : 2.000   Median :1.00    Median :1.0     Median :1.00   
##  Mean   : 2.692   Mean   :0.66    Mean   :0.5     Mean   :0.94   
##  3rd Qu.: 4.000   3rd Qu.:1.00    3rd Qu.:1.0     3rd Qu.:1.00   
##  Max.   :15.000   Max.   :1.00    Max.   :1.0     Max.   :1.00   
##                   NA's   :32208   NA's   :32188   NA's   :32192  
##     MOCACLON        MOCACLOH        MOCANAMI        MOCAFLUE    
##  Min.   :0.00    Min.   :0.00    Min.   :0.00    Min.   :0.000  
##  1st Qu.:1.00    1st Qu.:0.00    1st Qu.:2.00    1st Qu.:0.000  
##  Median :1.00    Median :1.00    Median :3.00    Median :1.000  
##  Mean   :0.77    Mean   :0.58    Mean   :2.64    Mean   :0.674  
##  3rd Qu.:1.00    3rd Qu.:1.00    3rd Qu.:3.00    3rd Qu.:1.000  
##  Max.   :1.00    Max.   :1.00    Max.   :3.00    Max.   :1.000  
##  NA's   :32193   NA's   :32194   NA's   :32125   NA's   :29182  
##     MOCAREPE        MOCAREGI        MOCARECN        MOCAABST    
##  Min.   :0.000   Min.   : 0.00   Min.   :0.00    Min.   :0.000  
##  1st Qu.:1.000   1st Qu.: 8.00   1st Qu.:0.00    1st Qu.:1.000  
##  Median :2.000   Median : 9.00   Median :3.00    Median :2.000  
##  Mean   :1.348   Mean   : 8.33   Mean   :2.35    Mean   :1.526  
##  3rd Qu.:2.000   3rd Qu.:10.00   3rd Qu.:4.00    3rd Qu.:2.000  
##  Max.   :2.000   Max.   :10.00   Max.   :5.00    Max.   :2.000  
##  NA's   :29160   NA's   :32144   NA's   :29205   NA's   :29171  
##     MOCADIGI        MOCALETT        MOCASER7        MOCAORDT    
##  Min.   :0.00    Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:2.00    1st Qu.:1.000   1st Qu.:2.000   1st Qu.:1.000  
##  Median :2.00    Median :1.000   Median :3.000   Median :1.000  
##  Mean   :1.72    Mean   :0.843   Mean   :2.365   Mean   :0.764  
##  3rd Qu.:2.00    3rd Qu.:1.000   3rd Qu.:3.000   3rd Qu.:1.000  
##  Max.   :2.00    Max.   :1.000   Max.   :3.000   Max.   :1.000  
##  NA's   :29134   NA's   :29158   NA's   :29263   NA's   :29152  
##     MOCAORMO        MOCAORYR        MOCAORDY        MOCAORPL    
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.888   Mean   :0.876   Mean   :0.855   Mean   :0.849  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##  NA's   :29152   NA's   :29152   NA's   :29150   NA's   :29152  
##     MOCAORCT      VISUOSPATIAL      LANGUAGE        MEMORY1      
##  Min.   :0.00    Min.   :0.000   Min.   :0.000   Min.   : 0.000  
##  1st Qu.:1.00    1st Qu.:0.000   1st Qu.:0.000   1st Qu.: 0.000  
##  Median :1.00    Median :0.000   Median :0.000   Median : 0.000  
##  Mean   :0.94    Mean   :1.098   Mean   :1.614   Mean   : 2.654  
##  3rd Qu.:1.00    3rd Qu.:2.000   3rd Qu.:4.000   3rd Qu.: 7.000  
##  Max.   :1.00    Max.   :5.000   Max.   :6.000   Max.   :10.000  
##  NA's   :29148                                                   
##     MEMORY2       ABSTRACTION       ATTENTION      ORIENTATION   
##  Min.   :0.000   Min.   :0.0000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.000  
##  Median :0.000   Median :0.0000   Median :0.000   Median :0.000  
##  Mean   :0.895   Mean   :0.5821   Mean   :1.877   Mean   :1.976  
##  3rd Qu.:1.000   3rd Qu.:2.0000   3rd Qu.:5.000   3rd Qu.:6.000  
##  Max.   :5.000   Max.   :2.0000   Max.   :6.000   Max.   :6.000  
## 
collapsed_data <- collapsed_data %>%
  mutate(SUM_SCORE = VISUOSPATIAL + LANGUAGE + MEMORY2 + ABSTRACTION + ATTENTION + ORIENTATION)
ggplot(collapsed_data, aes(x = SUM_SCORE, y = NACCMOCA)) +
  geom_point(color = "blue", alpha = 0.7) +
  labs(title = "Scatterplot of Sum of Cognitive Scores vs NACCMOCA",
       x = "Sum of VISUOSPATIAL, LANGUAGE, MEMORY2, ABSTRACTION, ATTENTION, ORIENTATION",
       y = "NACCMOCA") +
  theme_minimal()

# Convert categorical variables to factors
collapsed_data <- collapsed_data %>%
  mutate(
    SEX = as.factor(SEX),
    NACCNIHR = as.factor(NACCNIHR),
    NACCTBI = as.factor(NACCTBI),
    DEP = as.factor(DEP),
    BIPOLDX = as.factor(BIPOLDX),
    SCHIZOP = as.factor(SCHIZOP),
    ANXIET = as.factor(ANXIET),
    PTSDDX = as.factor(PTSDDX),
    CANCER = as.factor(CANCER),
    DIABETES = as.factor(DIABETES),
    CONGHRT = as.factor(CONGHRT),
    HYPERT = as.factor(HYPERT),
    HYPCHOL = as.factor(HYPCHOL),
    VB12DEF = as.factor(VB12DEF),
    THYDIS = as.factor(THYDIS),
    ARTH = as.factor(ARTH),
    SLEEPAP = as.factor(SLEEPAP),
    DICLOFENAC = as.factor(DICLOFENAC),
    NAPROXEN = as.factor(NAPROXEN),
    ETODOLAC = as.factor(ETODOLAC),
    DIAGNOSIS = as.numeric(DIAGNOSIS)  # DIAGNOSIS is already numeric (sum of conditions)
  )

names(collapsed_data)
##  [1] "NACCID"       "NACCVNUM"     "NACCAGE"      "SEX"          "NACCNIHR"    
##  [6] "EDUC"         "NACCALZD"     "NACCALZP"     "NACCMOCA"     "CDRGLOB"     
## [11] "NACCMMSE"     "NORMCOG"      "DEMENTED"     "NACCUDSD"     "CSFTAU"      
## [16] "NACCTBI"      "HXHYPER"      "HXSTROKE"     "DEP"          "BIPOLDX"     
## [21] "SCHIZOP"      "ANXIET"       "PTSDDX"       "OTHPSY"       "ALCABUSE"    
## [26] "CANCER"       "DIABETES"     "MYOINF"       "CONGHRT"      "AFIBRILL"    
## [31] "HYPERT"       "HYPCHOL"      "VB12DEF"      "THYDIS"       "ARTH"        
## [36] "SLEEPAP"      "OTHCOND"      "DICLOFENAC"   "NAPROXEN"     "ETODOLAC"    
## [41] "DIAGNOSIS"    "MOCATRAI"     "MOCACUBE"     "MOCACLOC"     "MOCACLON"    
## [46] "MOCACLOH"     "MOCANAMI"     "MOCAFLUE"     "MOCAREPE"     "MOCAREGI"    
## [51] "MOCARECN"     "MOCAABST"     "MOCADIGI"     "MOCALETT"     "MOCASER7"    
## [56] "MOCAORDT"     "MOCAORMO"     "MOCAORYR"     "MOCAORDY"     "MOCAORPL"    
## [61] "MOCAORCT"     "VISUOSPATIAL" "LANGUAGE"     "MEMORY1"      "MEMORY2"     
## [66] "ABSTRACTION"  "ATTENTION"    "ORIENTATION"  "SUM_SCORE"

Data collapsed 2

# Step 1: Drop patients with NACCALZD == 0 (cognitively impaired, but not attributed to AD)
filtered_data <- collapsed_data %>%
  filter(NACCALZD != 0)

# Step 2: Recode NACCALZD where 8 (normal cognition) is changed to 0
filtered_data <- filtered_data %>%
  mutate(NACCALZD = case_when(
    NACCALZD == 8 ~ 0,  # Recode 8 to 0  (8 ~ 0 no cognitive impairment)
    NACCALZD == 1 ~ 1   # Keep 1 as 1 (Alzheimer's disease)
  ))

collapsed_data2<-filtered_data

# Initialize NSAID_TYPE as NA
collapsed_data2$NSAID_TYPE <- NA

# Assign NSAID_TYPE = 1 where DICLOFENAC is used (Diclofenac supersedes Naproxen)
collapsed_data2$NSAID_TYPE[collapsed_data2$DICLOFENAC == 1] <- 1

# Assign NSAID_TYPE = 0 where only Naproxen is used (and not Diclofenac)
collapsed_data2$NSAID_TYPE[is.na(collapsed_data2$NSAID_TYPE) & collapsed_data2$NAPROXEN == 1] <- 0

table(collapsed_data2$DICLOFENAC, useNA = "always")
## 
##     0     1  <NA> 
## 36625   844     0
table(collapsed_data2$NAPROXEN, useNA = "always")
## 
##     0     1  <NA> 
## 35084  2385     0
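The two tables count 844 Diclofenac and 2385 Naproxen users (3229 in total), while model_data has 3111 rows, which suggests that about 118 patients were recorded with both drugs and were assigned to Diclofenac by the hierarchy. A cross-tabulation makes that overlap visible, and an explicit grouping avoids having one drug supersede another. The sketch below uses a made-up data frame with numeric 0/1 indicators (in collapsed_data2 they are factors), and the column `NSAID_GROUP` is a hypothetical name.

```r
# Toy indicators, NOT from the NACC file
d <- data.frame(
  DICLOFENAC = c(1, 1, 0, 0, 0),
  NAPROXEN   = c(0, 1, 1, 0, 0),
  ETODOLAC   = c(0, 0, 0, 1, 0)
)

# How many patients were recorded with both Diclofenac and Naproxen?
print(table(DICLOFENAC = d$DICLOFENAC, NAPROXEN = d$NAPROXEN))
n_both <- sum(d$DICLOFENAC == 1 & d$NAPROXEN == 1)

# Explicit grouping: single-drug users, multi-drug users, and non-users
n_drugs <- d$DICLOFENAC + d$NAPROXEN + d$ETODOLAC
d$NSAID_GROUP <- ifelse(n_drugs == 0, "none",
                 ifelse(n_drugs > 1, "multiple",
                 ifelse(d$DICLOFENAC == 1, "diclofenac",
                 ifelse(d$NAPROXEN == 1, "naproxen", "etodolac"))))
print(d)
```

A grouping like this would also give the non-NSAID and Etodolac comparisons raised in Q3 and Q4 a place in the same variable.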

Data collapsed 3 & model_data, Match_obj matched_data

I am unable to reproduce the model_data dataframe or the match_obj object. As a result, every later step that depends on them fails.

# model_data
model_data <- collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]
dim(model_data)
## [1] 3111   70
#collapsed_data
collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]
dim(collapsed_data3)
## [1] 3111   70
#match_data
match_obj <- matchit(NSAID_TYPE ~ NACCAGE + SEX + NACCNIHR + EDUC + DIAGNOSIS, 
                     data = model_data, 
                     method = "full", 
                     distance = "glm", 
                     link = "probit")
## Error in `matchit()`:
## ! Missing and non-finite values are not allowed in the covariates.
## Covariates with missingness or non-finite values: NACCNIHR, EDUC
matched_data <- match.data(match_obj)
## Error: object 'match_obj' not found
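The `matchit()` error names its own cause: NACCNIHR and EDUC contain missing values (the earlier summary output shows 824 and 363 NA's in the full data), and matching does not accept them. The second error only follows from the first, because match_obj was never created. One option, sketched here on a toy data frame with made-up values, is to drop rows that are incomplete on the matching covariates before calling `matchit()`; imputing the missing values would be an alternative design choice. The names `toy` and `toy_complete` are hypothetical.

```r
# Toy stand-in for model_data, NOT from the NACC file
toy <- data.frame(
  NSAID_TYPE = c(1, 0, 0, 1, 0, 1),
  NACCAGE    = c(70, 65, 80, 75, 68, 72),
  NACCNIHR   = factor(c(1, NA, 2, 1, 1, 2)),
  EDUC       = c(16, 12, NA, 18, 14, 12)
)

# Keep only rows that are complete on the covariates used for matching
match_covariates <- c("NACCAGE", "NACCNIHR", "EDUC")
toy_complete <- toy[complete.cases(toy[, match_covariates]), ]
print(nrow(toy_complete))

# With the real data the same idea would read (not run here):
# covs <- c("NACCAGE", "SEX", "NACCNIHR", "EDUC", "DIAGNOSIS")
# model_complete <- model_data[complete.cases(model_data[, covs]), ]
# and model_complete would then be passed as `data` to the matchit() call above
```

Comparing the row counts before and after shows how many patients the matching step loses to missing covariates.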

Inconsistencies MOCA Score / Cogn. Score to Demented / NACCALZD

When looking at Figure 1, I find it odd that we include patients with a MOCA score below a certain threshold as “normal cognition”, even if their low score is due to impairments unrelated to dementia. On the other hand, it also seems strange to include patients with a MOCA score above 26 (up to 30) in both the demented and NACCALZD groups.

# Create new dataframe ad_moca with selected columns
ad_moca <- collapsed_data2[, c("NACCID", "NACCMOCA", "DEMENTED", "NACCALZD")]

# Filter for cases where DEMENTED or NACCALZD status does not match the MOCA score
inconsistent_cases <- collapsed_data2 %>%
  filter(NACCMOCA > 25 & DEMENTED == 1)
dim(inconsistent_cases)
## [1] 21 70
inconsistent_cases1 <- collapsed_data2 %>%
  filter(NACCMOCA < 20 & DEMENTED == 0)
dim(inconsistent_cases1)
## [1] 643  70
 ## about 1.7% of the 37,469 rows in collapsed_data2 (643 / 37469)

inconsistent_cases2 <- collapsed_data2 %>%
  filter(NACCMOCA > 26 & NACCALZD == 1)
dim(inconsistent_cases2)
## [1] 190  70
 ## about 0.5% of the 37,469 rows in collapsed_data2 (190 / 37469)

inconsistent_cases3 <- collapsed_data2 %>%
  filter(NACCMOCA < 20 & NACCALZD == 0)
dim(inconsistent_cases3)
## [1] 172  70
# Display the results
print(inconsistent_cases)
## # A tibble: 21 × 70
##    NACCID     NACCVNUM NACCAGE SEX   NACCNIHR  EDUC NACCALZD NACCALZP NACCMOCA
##    <chr>         <int>   <int> <fct> <fct>    <int>    <dbl>    <int>    <int>
##  1 NACC097130        5      73 1     1           19        1        1       26
##  2 NACC108978        2      78 1     1           20        1        1       26
##  3 NACC162469        1      77 1     1           12        1        1       28
##  4 NACC175064        1      63 1     1           16        1        2       30
##  5 NACC179619        2      66 1     1           18        1        2       27
##  6 NACC194582        1      63 2     2           14        1        1       26
##  7 NACC253699        2      65 1     1           18        1        1       28
##  8 NACC308810        1      73 2     1           18        1        1       30
##  9 NACC344354        4      73 2     1           16        1        1       26
## 10 NACC406578        1      64 2     1           13        1        1       28
## # ℹ 11 more rows
## # ℹ 61 more variables: CDRGLOB <dbl>, NACCMMSE <int>, NORMCOG <int>,
## #   DEMENTED <int>, NACCUDSD <int>, CSFTAU <dbl>, NACCTBI <fct>, HXHYPER <dbl>,
## #   HXSTROKE <dbl>, DEP <fct>, BIPOLDX <fct>, SCHIZOP <fct>, ANXIET <fct>,
## #   PTSDDX <fct>, OTHPSY <dbl>, ALCABUSE <dbl>, CANCER <fct>, DIABETES <fct>,
## #   MYOINF <dbl>, CONGHRT <fct>, AFIBRILL <dbl>, HYPERT <fct>, HYPCHOL <fct>,
## #   VB12DEF <fct>, THYDIS <fct>, ARTH <fct>, SLEEPAP <fct>, OTHCOND <dbl>, …
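The four filters above use slightly different cut-offs (> 25 in one, > 26 in another). A single cross-tabulation of MOCA bands against diagnosis shows all combinations at once and keeps the thresholds in one place. The sketch below runs on made-up vectors; the band limits are an assumption chosen to mirror the `< 20` and `> 25` filters.

```r
# Toy values, NOT from the NACC file
moca     <- c(28, 15, 26, 19, 30, NA)
demented <- c(1, 0, 1, 0, 0, 1)

# Bands: below 20, 20 to 25, above 25 (intervals are closed on the right)
band <- cut(moca, breaks = c(-Inf, 19, 25, 30),
            labels = c("<20", "20-25", ">25"))

# Counts per band and diagnosis; missing MOCA scores get their own row
tab <- table(band, demented, useNA = "ifany")
print(tab)

# Share of each cell among all rows
print(round(prop.table(tab), 2))
```

The off-diagonal cells (">25" with demented = 1, "<20" with demented = 0) correspond to the inconsistent cases counted above.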

Logistic regression models Demented/NACCALZD

# Logistic regression model 1: outcome DEMENTED, data not filtered on NACCALZD (collapsed_data)
set.seed(123)
logistic_model1 <- glm(
  DEMENTED ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + DEP + BIPOLDX + SCHIZOP + ANXIET + 
              PTSDDX + CANCER + DIABETES + CONGHRT + HYPERT + HYPCHOL + VB12DEF + 
              THYDIS + ARTH + SLEEPAP + DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS,
  data = collapsed_data,
  family = binomial(link = "logit")
)

# View the summary of the model
summary(logistic_model1)
## 
## Call:
## glm(formula = DEMENTED ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + 
##     DEP + BIPOLDX + SCHIZOP + ANXIET + PTSDDX + CANCER + DIABETES + 
##     CONGHRT + HYPERT + HYPCHOL + VB12DEF + THYDIS + ARTH + SLEEPAP + 
##     DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS, family = binomial(link = "logit"), 
##     data = collapsed_data)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.2288157  0.0894181  -2.559 0.010499 *  
## NACCAGE      0.0244402  0.0009579  25.514  < 2e-16 ***
## SEX2        -0.4082967  0.0210270 -19.418  < 2e-16 ***
## NACCNIHR2   -0.6084231  0.0326240 -18.650  < 2e-16 ***
## NACCNIHR3   -0.4569022  0.1317157  -3.469 0.000523 ***
## NACCNIHR4    0.3826622  0.3513750   1.089 0.276135    
## NACCNIHR5   -0.3097220  0.0628109  -4.931 8.18e-07 ***
## NACCNIHR6   -0.4540993  0.0598494  -7.587 3.26e-14 ***
## EDUC        -0.0834979  0.0031702 -26.339  < 2e-16 ***
## NACCTBI1    -0.1484236  0.0377231  -3.935 8.33e-05 ***
## DEP1         0.5397809  0.0279950  19.281  < 2e-16 ***
## BIPOLDX1     0.0429715  0.1916503   0.224 0.822588    
## SCHIZOP1     0.0629820  0.3724545   0.169 0.865718    
## ANXIET1      0.1670566  0.0635791   2.628 0.008600 ** 
## PTSDDX1     -0.8818708  0.2618662  -3.368 0.000758 ***
## CANCER1     -0.2720226  0.0397499  -6.843 7.74e-12 ***
## DIABETES1   -0.1495084  0.0362394  -4.126 3.70e-05 ***
## CONGHRT1     0.2238859  0.0884959   2.530 0.011409 *  
## HYPERT1     -0.1031722  0.0326610  -3.159 0.001584 ** 
## HYPCHOL1    -0.1200694  0.0309884  -3.875 0.000107 ***
## VB12DEF1     0.3626941  0.0544457   6.662 2.71e-11 ***
## THYDIS1     -0.0113423  0.0386633  -0.293 0.769246    
## ARTH1       -0.7911988  0.0303579 -26.062  < 2e-16 ***
## SLEEPAP1    -0.0505020  0.0402077  -1.256 0.209106    
## DICLOFENAC1 -0.3210074  0.0756263  -4.245 2.19e-05 ***
## NAPROXEN1   -0.2572108  0.0440678  -5.837 5.32e-09 ***
## ETODOLAC1   -0.3605477  0.1664991  -2.165 0.030352 *  
## DIAGNOSIS   -0.0348220  0.0092516  -3.764 0.000167 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 62680  on 46010  degrees of freedom
## Residual deviance: 57643  on 45983  degrees of freedom
##   (1154 observations deleted due to missingness)
## AIC: 57699
## 
## Number of Fisher Scoring iterations: 4
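The estimates in this table are on the log-odds scale, so they are easier to read once exponentiated into odds ratios. The sketch below copies the three NSAID estimates and standard errors from the output above and adds Wald 95% intervals; it is a reading aid under the usual normal approximation, not a re-fit of the model.

```r
# Estimates and standard errors copied from the summary(logistic_model1) output
est <- c(DICLOFENAC1 = -0.3210074, NAPROXEN1 = -0.2572108, ETODOLAC1 = -0.3605477)
se  <- c(DICLOFENAC1 =  0.0756263, NAPROXEN1 =  0.0440678, ETODOLAC1 =  0.1664991)

# Odds ratios with Wald 95% confidence intervals
z <- qnorm(0.975)
or_table <- data.frame(
  odds_ratio = exp(est),
  ci_lower   = exp(est - z * se),
  ci_upper   = exp(est + z * se)
)
print(round(or_table, 3))

# With the fitted object available, exp(coef(logistic_model1)) gives the same odds ratios
```

An odds ratio below 1 means lower odds of DEMENTED = 1 among users of that drug, holding the other covariates in the model fixed.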
# Logistic regression model 2: outcome DEMENTED, data filtered on NACCALZD (collapsed_data2)
set.seed(123)
logistic_model2 <- glm(
  DEMENTED ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + DEP + BIPOLDX + SCHIZOP + ANXIET + 
              PTSDDX + CANCER + DIABETES + CONGHRT + HYPERT + HYPCHOL + VB12DEF + 
              THYDIS + ARTH + SLEEPAP + DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS,
  data = collapsed_data2,
  family = binomial(link = "logit")
)
# View the summary of the model
summary(logistic_model2)
## 
## Call:
## glm(formula = DEMENTED ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + 
##     DEP + BIPOLDX + SCHIZOP + ANXIET + PTSDDX + CANCER + DIABETES + 
##     CONGHRT + HYPERT + HYPCHOL + VB12DEF + THYDIS + ARTH + SLEEPAP + 
##     DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS, family = binomial(link = "logit"), 
##     data = collapsed_data2)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.863713   0.105864  -8.159 3.39e-16 ***
## NACCAGE      0.036396   0.001125  32.352  < 2e-16 ***
## SEX2        -0.402324   0.024249 -16.591  < 2e-16 ***
## NACCNIHR2   -0.449112   0.036489 -12.308  < 2e-16 ***
## NACCNIHR3   -0.231145   0.152121  -1.519 0.128643    
## NACCNIHR4    0.179366   0.449004   0.399 0.689542    
## NACCNIHR5   -0.298822   0.072385  -4.128 3.66e-05 ***
## NACCNIHR6   -0.310564   0.067980  -4.568 4.91e-06 ***
## EDUC        -0.104785   0.003689 -28.404  < 2e-16 ***
## NACCTBI1    -0.097578   0.045352  -2.152 0.031433 *  
## DEP1         0.798841   0.033535  23.821  < 2e-16 ***
## BIPOLDX1     0.221831   0.257139   0.863 0.388307    
## SCHIZOP1     0.162300   0.516691   0.314 0.753434    
## ANXIET1      0.355259   0.075861   4.683 2.83e-06 ***
## PTSDDX1     -0.960394   0.346725  -2.770 0.005607 ** 
## CANCER1     -0.287111   0.044382  -6.469 9.86e-11 ***
## DIABETES1   -0.072395   0.042244  -1.714 0.086576 .  
## CONGHRT1     0.216954   0.099803   2.174 0.029719 *  
## HYPERT1     -0.164898   0.036819  -4.479 7.51e-06 ***
## HYPCHOL1    -0.127070   0.034762  -3.655 0.000257 ***
## VB12DEF1     0.388951   0.061902   6.283 3.31e-10 ***
## THYDIS1     -0.002559   0.043131  -0.059 0.952688    
## ARTH1       -0.820028   0.034007 -24.113  < 2e-16 ***
## SLEEPAP1    -0.046994   0.046432  -1.012 0.311487    
## DICLOFENAC1 -0.316232   0.085022  -3.719 0.000200 ***
## NAPROXEN1   -0.319987   0.049793  -6.426 1.31e-10 ***
## ETODOLAC1   -0.348451   0.183136  -1.903 0.057081 .  
## DIAGNOSIS   -0.032523   0.010481  -3.103 0.001914 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 49815  on 36622  degrees of freedom
## Residual deviance: 44463  on 36595  degrees of freedom
##   (846 observations deleted due to missingness)
## AIC: 44519
## 
## Number of Fisher Scoring iterations: 4
# Logistic regression model 3: data filtered on NACCALZD, logistic regression on NACCALZD
set.seed(123)
logistic_model3 <- glm(
  NACCALZD ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + DEP + BIPOLDX + SCHIZOP + ANXIET + 
              PTSDDX + CANCER + DIABETES + CONGHRT + HYPERT + HYPCHOL + VB12DEF + 
              THYDIS + ARTH + SLEEPAP + DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS,
  data = collapsed_data2,
  family = binomial(link = "logit")
)

# View the summary of the model
summary(logistic_model3)
## 
## Call:
## glm(formula = NACCALZD ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + 
##     DEP + BIPOLDX + SCHIZOP + ANXIET + PTSDDX + CANCER + DIABETES + 
##     CONGHRT + HYPERT + HYPCHOL + VB12DEF + THYDIS + ARTH + SLEEPAP + 
##     DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS, family = binomial(link = "logit"), 
##     data = collapsed_data2)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.389924   0.107053 -12.984  < 2e-16 ***
## NACCAGE      0.049491   0.001153  42.919  < 2e-16 ***
## SEX2        -0.509775   0.024509 -20.800  < 2e-16 ***
## NACCNIHR2   -0.319050   0.035193  -9.066  < 2e-16 ***
## NACCNIHR3   -0.146400   0.151231  -0.968  0.33302    
## NACCNIHR4    0.461190   0.464436   0.993  0.32070    
## NACCNIHR5   -0.218825   0.069955  -3.128  0.00176 ** 
## NACCNIHR6   -0.202888   0.066383  -3.056  0.00224 ** 
## EDUC        -0.098621   0.003796 -25.979  < 2e-16 ***
## NACCTBI1     0.119274   0.046636   2.558  0.01054 *  
## DEP1         1.129675   0.036812  30.688  < 2e-16 ***
## BIPOLDX1     0.447559   0.255568   1.751  0.07991 .  
## SCHIZOP1     0.771827   0.550836   1.401  0.16116    
## ANXIET1      0.498332   0.077439   6.435 1.23e-10 ***
## PTSDDX1     -0.775745   0.288313  -2.691  0.00713 ** 
## CANCER1     -0.170735   0.041267  -4.137 3.51e-05 ***
## DIABETES1    0.082835   0.043415   1.908  0.05640 .  
## CONGHRT1     0.167094   0.097655   1.711  0.08707 .  
## HYPERT1      0.022381   0.035189   0.636  0.52476    
## HYPCHOL1    -0.019456   0.033258  -0.585  0.55854    
## VB12DEF1     0.445637   0.061317   7.268 3.65e-13 ***
## THYDIS1      0.026937   0.040842   0.660  0.50955    
## ARTH1       -0.658537   0.032294 -20.392  < 2e-16 ***
## SLEEPAP1     0.057278   0.043415   1.319  0.18707    
## DICLOFENAC1 -0.247914   0.078063  -3.176  0.00149 ** 
## NAPROXEN1   -0.281654   0.047079  -5.983 2.20e-09 ***
## ETODOLAC1   -0.415376   0.176151  -2.358  0.01837 *  
## DIAGNOSIS   -0.084324   0.010216  -8.254  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 50427  on 36622  degrees of freedom
## Residual deviance: 44424  on 36595  degrees of freedom
##   (846 observations deleted due to missingness)
## AIC: 44480
## 
## Number of Fisher Scoring iterations: 4
# Pseudo R-squared of each model: 1 - residual deviance / null deviance
with(summary(logistic_model1), 1 - deviance/null.deviance)
## [1] 0.08036398
with(summary(logistic_model2), 1 - deviance/null.deviance)
## [1] 0.1074406
with(summary(logistic_model3), 1 - deviance/null.deviance)
## [1] 0.1190576
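
The three ratios above all follow the same pattern, so they could be wrapped in one small function. The following is a minimal sketch, not part of the original script: `pseudo_r2` and `toy_model` are hypothetical names, and the built-in `mtcars` data stands in for `collapsed_data2`. For an ungrouped binary outcome, 1 - deviance/null.deviance should coincide with McFadden's pseudo-R-squared.

```r
# Hypothetical helper (assumption, not in the original analysis):
# pseudo R-squared as 1 - residual deviance / null deviance
pseudo_r2 <- function(model) {
  stopifnot(inherits(model, "glm"))
  1 - model$deviance / model$null.deviance
}

# Toy logistic model on a built-in dataset, only to show the call
toy_model <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
toy_r2 <- pseudo_r2(toy_model)
print(toy_r2)
```

With the fitted models from this file, the call would simply be `sapply(list(logistic_model1, logistic_model2, logistic_model3), pseudo_r2)`.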
#tidysummary_model <- tidy(logistic_model2)
#write_csv(tidysummary_model, file = "tableS1.csv")
#logistic_coefficients<-exp(cbind(OR = coef(logistic_model2), confint(logistic_model2)))
#logistic_coefficients <- as.data.frame(logistic_coefficients)
#write_csv(logistic_coefficients, file = "tableS1_coef.csv")
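
The coefficients in the summaries are on the log-odds scale, so exponentiating them gives odds ratios; for example, the NAPROXEN1 estimate of -0.319987 in model 2 corresponds to an odds ratio of roughly 0.73. The sketch below shows one way to build such a table, under the assumption that Wald intervals are acceptable: it uses `confint.default()` instead of `confint()`, because profile-likelihood intervals can be slow on tens of thousands of rows. `toy_model` and `or_table` are hypothetical names and `mtcars` is only a stand-in dataset.

```r
# Toy model on a built-in dataset (mtcars), standing in for logistic_model2
toy_model <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

# Odds ratios with Wald 95% confidence intervals.
# confint.default() uses estimate +/- 1.96 * SE on the log-odds scale;
# confint() would return profile-likelihood intervals instead.
or_table <- as.data.frame(
  exp(cbind(OR = coef(toy_model), confint.default(toy_model)))
)
or_table$term <- rownames(or_table)
print(or_table)
```

The Wald and profile-likelihood intervals are usually close for large samples, but they can differ noticeably for sparse predictors such as ETODOLAC.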

# Perform a t-test to compare NACCAGE between DEMENTED = 0 and DEMENTED = 1
t_test_result <- t.test(NACCAGE ~ DEMENTED, data = collapsed_data)

# Print the result of the t-test
print(t_test_result)
## 
##  Welch Two Sample t-test
## 
## data:  NACCAGE by DEMENTED
## t = -21.108, df = 43815, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -2.345538 -1.946947
## sample estimates:
## mean in group 0 mean in group 1 
##        73.70854        75.85478

Visual checks Error

Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, …) : plot.new has not been called yet

Visual checks

# Visual checks
par(mfrow = c(1, 2))  # Set up 2 plots side by side
hist(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 0], main = "DEMENTED = 0", xlab = "NACCAGE", breaks = 20)
hist(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 1], main = "DEMENTED = 1", xlab = "NACCAGE", breaks = 20)

# Q-Q plot to check normality
qqnorm(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 0], main = "Q-Q Plot DEMENTED = 0")
qqline(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 0])
qqnorm(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 1], main = "Q-Q Plot DEMENTED = 1")
qqline(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 1])

Mann-Whitney U test (Wilcoxon rank-sum test) AGE ~ DEMENTED

# Mann-Whitney U test (Wilcoxon rank-sum test)

wilcox.test(NACCAGE ~ DEMENTED, data = collapsed_data)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  NACCAGE by DEMENTED
## W = 240365721, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
# Mann-Whitney U test (Wilcoxon rank-sum test)
wilcox.test(NACCAGE ~ NACCALZD, data = filtered_data)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  NACCAGE by NACCALZD
## W = 130858134, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0

Problem with propensity score matching

ERROR
Call: matchit(formula = NSAID_TYPE ~ NACCAGE + SEX + NACCNIHR + EDUC + DIAGNOSIS, data = model_data, method = "full", distance = "glm", link = "probit")
Summary of Balance for All Data:
Summary of Balance for Matched Data:
Sample Sizes:

# Remove rows with missing values in the treatment and in every matching covariate
model_data <- collapsed_data2 %>%
  filter(!is.na(NSAID_TYPE), !is.na(NACCAGE), !is.na(SEX), !is.na(NACCNIHR), !is.na(EDUC), !is.na(DIAGNOSIS))

# Convert NSAID_TYPE to a factor
model_data$NSAID_TYPE <- as.factor(model_data$NSAID_TYPE)

# Perform propensity score matching
set.seed(123)
match_obj <- matchit(NSAID_TYPE ~ NACCAGE + SEX + NACCNIHR + EDUC + DIAGNOSIS, 
                     data = model_data, 
                     method = "full", 
                     distance = "glm", 
                     link = "probit")

# Summarize the matching results (covariate balance and sample sizes)
summary(match_obj)

plot(match_obj, type = "jitter", interactive = FALSE)
plot(summary(match_obj), abs = FALSE)

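# PDF export is disabled for the Rmd render. Every line of the pdf() call
# has to be commented out; leaving the width/height lines active is what
# triggers the "unexpected ','" parse error.
# pdf(file = "FigS_matching.pdf",  # output file name
#     width = 4,   # width of the plot in inches
#     height = 4)  # height of the plot in inches
# plot(summary(match_obj), abs = FALSE)
# dev.off()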
## Error in parse(text = input): <text>:27:14: unexpected ','
## 26: #pdf(file = "FigS_matching.pdf",   # The directory you want to save the file in
## 27:     width = 4,
##                  ^
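
The parse error comes from commenting out only the first line of a multi-line `pdf()` call, which leaves `width = 4,` as a stray statement and stops the whole chunk, so `match_obj` is never created. A minimal sketch of a complete open/plot/close sequence is shown below; the output path and the toy plot are assumptions for illustration only. In an Rmd file, the knitr chunk options `dev = "pdf"`, `fig.width` and `fig.height` may be a simpler alternative to calling `pdf()` by hand.

```r
# Hypothetical output path in a temporary directory (assumption for the example)
out_file <- file.path(tempdir(), "FigS_example.pdf")

# Open the PDF device, draw, and always close the device again
pdf(file = out_file,
    width = 4,   # width of the plot in inches
    height = 4)  # height of the plot in inches
plot(mtcars$wt, mtcars$mpg, xlab = "wt", ylab = "mpg")  # stand-in for plot(summary(match_obj))
dev.off()

file.exists(out_file)
```

If the block needs to be disabled again, all three lines of the `pdf()` call and the `dev.off()` line have to be commented together.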

Perform logistic regression on the matched data NACCALZD/DEMENTED ~ NSAID_TYPE (consequential error)

# Extract matched data

matched_data <- match.data(match_obj)
## Error: object 'match_obj' not found
# Perform logistic regression on the matched data
set.seed(123)
res1 <- glm(DEMENTED ~ NSAID_TYPE, data = matched_data, family = binomial)
## Error in eval(mf, parent.frame()): object 'matched_data' not found
# Perform logistic regression on the matched data
set.seed(123)
res2 <- glm(NACCALZD ~ NSAID_TYPE, data = matched_data, family = binomial)
## Error in eval(mf, parent.frame()): object 'matched_data' not found
summary(res)
## Error: object 'res' not found
# Summarize the logistic regression results
summary(res1)
## Error: object 'res1' not found
summary(res2)
## Error: object 'res2' not found
# Commented out for the Rmd file
#tidysummary_model1 <- tidy(res1)
#write_csv(tidysummary_model1, file = "tableS1_match.csv")
#logistic_coefficients1<-exp(cbind(OR = coef(res1), confint(res1)))
#logistic_coefficients1 <- as.data.frame(logistic_coefficients1)
#write_csv(logistic_coefficients1, file = "tableS1_coef_match.csv")

#tidysummary_model2 <- tidy(res2)
#write_csv(tidysummary_model2, file = "tableS2_match.csv")
#logistic_coefficients2<-exp(cbind(OR = coef(res2), confint(res2)))
#logistic_coefficients2 <- as.data.frame(logistic_coefficients2)
#write_csv(logistic_coefficients2, file = "tableS2_coef_match.csv")

Create a contingency table of NSAID_TYPE and DEMENTED matched (error)

# Create a contingency table of NSAID_TYPE and DEMENTED
contingency_table <- table(matched_data$NSAID_TYPE, matched_data$DEMENTED)
## Error: object 'matched_data' not found
# Perform Fisher's exact test
fisher_test_result <- fisher.test(contingency_table)
## Error: object 'contingency_table' not found
# Print the result of the Fisher's exact test
print(fisher_test_result)
## Error: object 'fisher_test_result' not found
print(contingency_table)
## Error: object 'contingency_table' not found
foo<-contingency_table
## Error: object 'contingency_table' not found
write.csv(foo, "Dementia_match_contingency.csv")
## Error in eval(expr, p): object 'foo' not found

MoCA subscores -> weird output of boxplots

# List of cognitive variables
cognitive_variables <- c("VISUOSPATIAL", "LANGUAGE", "MEMORY2", "ABSTRACTION", "ATTENTION", "ORIENTATION")

# PDF export is disabled for the Rmd render (all lines of the pdf() call commented out)
# pdf(file = "Fig2.pdf",  # output file name
#     width = 8,   # width of the plot in inches
#     height = 4)  # height of the plot in inches

# Keep only rows with a known NSAID type
collapsed_data3 <- collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE), ]

# pdf(file = "Fig_trial.pdf",  # output file name
#     width = 12,  # width of the plot in inches
#     height = 12) # height of the plot in inches
# Create individual plots for each cognitive variable by NSAID_TYPE
plot_list <- lapply(cognitive_variables, function(var) {
  ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = .data[[var]], fill = as.factor(NSAID_TYPE))) +
    geom_boxplot() +
    labs(title = paste("Distribution of", var, "by NSAID_TYPE"), x = "NSAID Type", y = "Score") +
    scale_fill_manual(values = c("0" = "blue", "1" = "red"), labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
    theme_minimal()
})

# Arrange plots in a grid
do.call(grid.arrange, c(plot_list, ncol = 3))
## Error in parse(text = input): <text>:5:14: unexpected ','
## 4: #doenst work in RMD pdf(file = "Fig2.pdf",   # The directory you want to save the file in
## 5:     width = 8,
##                 ^

CSF

# Logistic regression model CSF
set.seed(123)
logistic_model_CSF <- glm(
  CSFTAU ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + DEP + BIPOLDX + SCHIZOP + ANXIET + 
              PTSDDX + CANCER + DIABETES + CONGHRT + HYPERT + HYPCHOL + VB12DEF + 
              THYDIS + ARTH + SLEEPAP + DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS,
  data = collapsed_data2,
  family = binomial(link = "logit")
)

# View the summary of the model
summary(logistic_model_CSF)
## 
## Call:
## glm(formula = CSFTAU ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + 
##     DEP + BIPOLDX + SCHIZOP + ANXIET + PTSDDX + CANCER + DIABETES + 
##     CONGHRT + HYPERT + HYPCHOL + VB12DEF + THYDIS + ARTH + SLEEPAP + 
##     DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS, family = binomial(link = "logit"), 
##     data = collapsed_data2)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -1.064752   0.319324  -3.334 0.000855 ***
## NACCAGE      -0.049836   0.003345 -14.897  < 2e-16 ***
## SEX2         -0.369395   0.081605  -4.527 5.99e-06 ***
## NACCNIHR2    -1.069886   0.190576  -5.614 1.98e-08 ***
## NACCNIHR3    -0.896474   0.716740  -1.251 0.211020    
## NACCNIHR4   -12.382824 278.421472  -0.044 0.964526    
## NACCNIHR5    -1.170738   0.383010  -3.057 0.002238 ** 
## NACCNIHR6    -0.485020   0.265932  -1.824 0.068174 .  
## EDUC          0.038777   0.013691   2.832 0.004623 ** 
## NACCTBI1     -0.493659   0.186419  -2.648 0.008094 ** 
## DEP1         -0.165008   0.116818  -1.413 0.157796    
## BIPOLDX1      0.488456   0.484024   1.009 0.312899    
## SCHIZOP1    -11.277195 317.258656  -0.036 0.971645    
## ANXIET1       0.634077   0.164231   3.861 0.000113 ***
## PTSDDX1      -0.273404   0.734351  -0.372 0.709665    
## CANCER1       0.314663   0.113336   2.776 0.005497 ** 
## DIABETES1    -1.322862   0.294439  -4.493 7.03e-06 ***
## CONGHRT1     -0.997635   0.460654  -2.166 0.030335 *  
## HYPERT1      -0.116910   0.106604  -1.097 0.272784    
## HYPCHOL1      0.490275   0.100841   4.862 1.16e-06 ***
## VB12DEF1      0.377673   0.151139   2.499 0.012460 *  
## THYDIS1       0.092134   0.118005   0.781 0.434944    
## ARTH1         0.049171   0.099298   0.495 0.620465    
## SLEEPAP1      0.186360   0.114383   1.629 0.103256    
## DICLOFENAC1   0.001704   0.252642   0.007 0.994618    
## NAPROXEN1    -0.225072   0.163996  -1.372 0.169932    
## ETODOLAC1    -0.279308   0.716635  -0.390 0.696722    
## DIAGNOSIS     0.099229   0.030804   3.221 0.001276 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 6911.1  on 36622  degrees of freedom
## Residual deviance: 6354.7  on 36595  degrees of freedom
##   (846 observations deleted due to missingness)
## AIC: 6410.7
## 
## Number of Fisher Scoring iterations: 14
with(summary(logistic_model_CSF), 1 - deviance/null.deviance)
## [1] 0.08050304
# in # for rmd file
#tidysummary_model_CSF <- tidy(logistic_model_CSF)
 #write_csv(tidysummary_model_CSF, file = "tableS3_tau.csv")
#logistic_coefficients_CSF<-exp(cbind(OR = coef(logistic_model_CSF), confint(logistic_model_CSF)))
#logistic_coefficients_CSF <- as.data.frame(logistic_coefficients_CSF)
#write_csv(logistic_coefficients_CSF, file = "tableS3_coef_tau.csv")


# Perform logistic regression on the matched data
set.seed(123)
res_CSF <- glm(CSFTAU ~ NSAID_TYPE, data = matched_data, family = binomial)
## Error in eval(mf, parent.frame()): object 'matched_data' not found
summary(res_CSF)
## Error: object 'res_CSF' not found
# Commented out for the Rmd file
#tidysummary_model_CSF <- tidy(res_CSF)
#write_csv(tidysummary_model_CSF, file = "tableS3_tau_match.csv")
#logistic_coefficients_CSF<-exp(cbind(OR = coef(res_CSF), confint(res_CSF)))
#logistic_coefficients_CSF <- as.data.frame(logistic_coefficients_CSF)
#write_csv(logistic_coefficients_CSF, file = "tableS3_coef_tau_match.csv")


# Create a contingency table of NSAID_TYPE and CSFTAU
contingency_table_CSF <- table(matched_data$NSAID_TYPE, matched_data$CSFTAU)
## Error: object 'matched_data' not found
# Perform Fisher's exact test
fisher_test_result_CSF <- fisher.test(contingency_table_CSF)
## Error: object 'contingency_table_CSF' not found
# Print the result of the Fisher's exact test
print(fisher_test_result_CSF)
## Error: object 'fisher_test_result_CSF' not found
print(contingency_table_CSF)
## Error: object 'contingency_table_CSF' not found
foo<-contingency_table_CSF
## Error: object 'contingency_table_CSF' not found
write.csv(foo, "Csftau_match_contingency.csv")
## Error in eval(expr, p): object 'foo' not found

Figures

Fig 1v2

pdf(file = "Fig1_v2.pdf",   # The directory you want to save the file in
    width = 12, # The width of the plot in inches
    height = 8) # The height of the plot in inches

p1 <- ggscatter(collapsed_data2, x = "NACCAGE", y = "NACCMOCA",
          add = "reg.line",                                 # Add regression line
          conf.int = TRUE,                                  # Add confidence interval
          add.params = list(color = "blue",
                            fill = "lightgray")
          )+
  stat_cor(method = "pearson", label.x = 0, label.y = 20, aes(label = after_stat(r.label))) +
  stat_cor(method = "pearson", label.x = 0, label.y = 18.5, aes(label = after_stat(p.label)))
p1<-ggpar(p1, xlim = c(0, 100),ylim = c(0, 30),title = "A",ylab="MOCA Score", xlab="Age (years)")

p2 <- ggscatter(matched_data, x = "NACCAGE", y = "NACCMOCA",
          add = "reg.line",                                 # Add regression line
          conf.int = TRUE,                                  # Add confidence interval
          add.params = list(color = "blue",
                            fill = "lightgray")
          )+
  stat_cor(method = "pearson", label.x = 3, label.y = 15)
## Error: object 'matched_data' not found
p2<-ggpar(p2, xlim = c(0, 100),ylim = c(0, 30),title = "D",ylab="MOCA Score", xlab="Age (years)")
## Error: object 'p2' not found
p3 <- ggboxplot(collapsed_data2, x = "DEMENTED", y = "NACCMOCA")
my_comparisons <- list( c("0", "1") )
p3<- p3+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Not Demented", "1" = "Demented"))
p3<-ggpar(p3,xlab="Demented Status", ylab="MOCA Score",title = "B")

p4 <- ggboxplot(matched_data, x = "DEMENTED", y = "NACCMOCA")
## Error: object 'matched_data' not found
my_comparisons <- list( c("0", "1") )
p4<- p4+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Not Demented", "1" = "Demented"))
## Error: object 'p4' not found
p4<-ggpar(p4,xlab="Demented Status", ylab="MOCA Score",title = "E")
## Error: object 'p4' not found
p5 <- ggboxplot(collapsed_data2, x = "NACCALZD", y = "NACCMOCA")
my_comparisons <- list( c("0", "1") )
p5<- p5+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Normal cognition", "1" = "AD"))
p5<-ggpar(p5,xlab="AD Status", ylab="MOCA Score",title = "C")

p6 <- ggboxplot(matched_data, x = "NACCALZD", y = "NACCMOCA")
## Error: object 'matched_data' not found
my_comparisons <- list( c("0", "1") )
p6<- p6+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Normal cognition", "1" = "AD"))
## Error: object 'p6' not found
p6<-ggpar(p6,xlab="AD Status", ylab="MOCA Score",title = "F")
## Error: object 'p6' not found
grid.arrange(p1, p3, p5, p2,p4,p6, ncol = 3)
## Error: object 'p2' not found

Fig 1 v1

# PDF export is disabled for the Rmd render (all lines of the pdf() call commented out)
# pdf(file = "Fig1.pdf",  # output file name
#     width = 12,  # width of the plot in inches
#     height = 8)  # height of the plot in inches

# Filter out NACCMOCA values of 88 and 99 for both datasets
collapsed_data_filtered <- filtered_data %>%
  filter(NACCMOCA != 88 & NACCMOCA != 99)



# A. Relationship of NACCMOCA vs NACCAGE scatter plot in collapsed_data with trend line and R-squared
p1 <- ggplot(collapsed_data_filtered, aes(x = NACCAGE, y = NACCMOCA)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +  # Add trend line
  theme_minimal() +
  labs(title = "A. NACCMOCA vs NACCAGE (collapsed_data)", 
       x = "Age", 
       y = "MOCA Score") +
  stat_cor(method = "pearson", label.x = 60, label.y = 30, aes(label = after_stat(rr.label)))  # Updated notation

# B. Relationship of NACCMOCA vs NACCAGE scatter plot in matched_data with trend line and R-squared
p2 <- ggplot(matched_data, aes(x = NACCAGE, y = NACCMOCA)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +  # Add trend line
  theme_minimal() +
  labs(title = "B. NACCMOCA vs NACCAGE (matched_data)", 
       x = "Age", 
       y = "MOCA Score") +
  stat_cor(method = "pearson", label.x = 60, label.y = 30, aes(label = after_stat(rr.label)))  # Updated notation

# C. Boxplot distribution of NACCMOCA among DEMENTED = 0 and 1 (collapsed_data)
p3 <- ggplot(collapsed_data_filtered, aes(x = as.factor(DEMENTED), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "C. NACCMOCA by DEMENTED (collapsed_data)", 
       x = "Demented Status", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Not Demented", "1" = "Demented"))

# D. Boxplot distribution of NACCMOCA among DEMENTED = 0 and 1 (matched_data)
p4 <- ggplot(matched_data, aes(x = as.factor(DEMENTED), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "D. NACCMOCA by DEMENTED (matched_data)", 
       x = "Demented Status", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Not Demented", "1" = "Demented"))

# E. Boxplot distribution of NACCMOCA among NACCALZD = 0 and 1 (collapsed_data)
p5 <- ggplot(collapsed_data_filtered, aes(x = as.factor(NACCALZD), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "E. NACCMOCA by NACCALZD (collapsed_data)", 
       x = "AD Status", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Normal cognition", "1" = "AD"))

# F. Boxplot distribution of NACCMOCA among NACCALZD = 0 and 1 (matched_data)
p6 <- ggplot(matched_data, aes(x = as.factor(NACCALZD), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "F. NACCMOCA by NACCALZD (matched_data)", 
       x = "AD Status", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Normal cognition", "1" = "AD"))

# Combine all plots into a 2x3 grid
grid.arrange(p1, p3, p5, p2, p4, p6, ncol = 3)
## Error in parse(text = input): <text>:2:15: unexpected ','
## 1: # doesnt work in RMD : pdf(file = "Fig1.pdf",   # The directory you want to save the file in
## 2:     width = 12,
##                  ^

Fig 2 v 2

# PDF export is disabled for the Rmd render (all lines of the pdf() call commented out)
# pdf(file = "Fig2_v2.pdf",  # output file name
#     width = 8,   # width of the plot in inches
#     height = 4)  # height of the plot in inches

collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]

p1 <- ggboxplot(collapsed_data3, x = "NSAID_TYPE", y = "NACCMOCA")
my_comparisons <- list( c("0", "1") )
p1<- p1+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))
p1<-ggpar(p1,xlab="NSAID Type", ylab="MOCA Score",title = "A")

p2 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "NACCMOCA")
my_comparisons <- list( c("0", "1") )
p2<- p2+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))
p2<-ggpar(p2,xlab="NSAID Type", ylab="MOCA Score",title = "B")

grid.arrange(p1, p2, ncol = 2)
## Error in parse(text = input): <text>:2:14: unexpected ','
## 1: #doesnt work in rmd # pdf(file = "Fig2_v2.pdf",   # The directory you want to save the file in
## 2:     width = 8,
##                 ^

Fig 2 v1

# PDF export is disabled for the Rmd render (all lines of the pdf() call commented out)
# pdf(file = "Fig2.pdf",  # output file name
#     width = 8,   # width of the plot in inches
#     height = 4)  # height of the plot in inches

collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]

# A. Boxplot distribution of NACCMOCA by NSAID_TYPE (collapsed_data3)
p1 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "A. NACCMOCA by NSAID TYPE (collapsed_data)", 
       x = "NSAID TYPE", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))
my_comparisons <- list( c("0", "1") )
p1<-p1 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# B. Boxplot distribution of NACCMOCA by NSAID_TYPE (matched_data)
p2 <- ggplot(matched_data, aes(x = as.factor(NSAID_TYPE), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "B. NACCMOCA by NSAID TYPE (matched_data)", 
       x = "NSAID TYPE", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))
my_comparisons <- list( c("0", "1") )
p2<-p2 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# Arrange both plots side by side
grid.arrange(p1, p2, ncol = 2)
## Error in parse(text = input): <text>:2:14: unexpected ','
## 1: # doesnt work in rmd pdf(file = "Fig2.pdf",   # The directory you want to save the file in
## 2:     width = 8,
##                 ^

Fig 3

# PDF export is disabled for the Rmd render (all lines of the pdf() call commented out)
# pdf(file = "Fig3.pdf",  # output file name
#     width = 8,   # width of the plot in inches
#     height = 4)  # height of the plot in inches

collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]

# A. Boxplot distribution of NACCMOCA by CSFTAU status (collapsed_data3)
p1 <- ggplot(collapsed_data3, aes(x = as.factor(CSFTAU), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "A. NACCMOCA by CSF TAU positivity (collapsed_data)", 
       x = "CSF TAU positivity", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "CSF Tau Negative (n=3054)", "1" = "CSF Tau Positive (n=57)"))
my_comparisons <- list( c("0", "1") )
p1<-p1 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# B. Boxplot distribution of NACCMOCA by CSFTAU status (matched_data)
p2 <- ggplot(matched_data, aes(x = as.factor(CSFTAU), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "B. NACCMOCA by CSF TAU positivity (matched_data)", 
       x = "CSF TAU positivity", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "CSF Tau Negative (n=2999)", "1" = "CSF Tau Positive (n=57)"))
my_comparisons <- list( c("0", "1") )
p2<-p2 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# Arrange both plots side by side
grid.arrange(p1, p2, ncol = 2)
## Error in parse(text = input): <text>:2:14: unexpected ','
## 1: #doesnt work in rmd pdf(file = "Fig3.pdf",   # The directory you want to save the file in
## 2:     width = 8,
##                 ^

Fig 4 Boxplot distribution of MOCA subscores by NSAID_TYPE

# PDF export is disabled for the Rmd render (all lines of the pdf() call commented out)
# pdf(file = "Fig4.pdf",  # output file name
#     width = 12,  # width of the plot in inches
#     height = 8)  # height of the plot in inches

# A. Boxplot distribution of VISUOSPATIAL by NSAID_TYPE (collapsed_data3)
p1 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = VISUOSPATIAL)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "A. VISUOSPATIAL by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p1<-p1 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# B. Boxplot distribution of LANGUAGE by NSAID_TYPE (collapsed_data3)
p2 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = LANGUAGE)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "B. LANGUAGE by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p2<-p2 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# C. Boxplot distribution of MEMORY2 by NSAID_TYPE (collapsed_data3)
p3 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = MEMORY2)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "C. MEMORY2 by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p3<-p3 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# D. Boxplot distribution of ABSTRACTION by NSAID_TYPE (collapsed_data3)
p4 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = ABSTRACTION)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "D. ABSTRACTION by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p4<-p4 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# E. Boxplot distribution of ATTENTION by NSAID_TYPE (collapsed_data3)
p5 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = ATTENTION)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "E. ATTENTION by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p5<-p5 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# F. Boxplot distribution of ORIENTATION by NSAID_TYPE (collapsed_data3)
p6 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = ORIENTATION)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "F. ORIENTATION by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p6<-p6 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# Combine all plots into a 2x3 grid
grid.arrange(p1, p2, p3, p4,p5,p6, ncol = 3)
## Error in parse(text = input): <text>:2:15: unexpected ','
## 1: #doesnt work in rmd pdf(file = "Fig4.pdf",   # The directory you want to save the file in
## 2:     width = 12,
##                  ^

Fig 4v3 Boxplot distribution of MOCA subscores by NSAID_TYPE (matched_data)

# PDF export is disabled for the Rmd render (all lines of the pdf() call commented out)
# pdf(file = "Fig4_v3.pdf",  # output file name
#     width = 12,  # width of the plot in inches
#     height = 8)  # height of the plot in inches

# The same comparison (Naproxen = "0" vs Diclofenac = "1") is used for every panel
my_comparisons <- list(c("0", "1"))

# A. Visuospatial
p1 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "VISUOSPATIAL")
p1 <- p1 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "red", fill = "red")
p1 <- ggpar(p1, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "A. Visuospatial")

# B. Language
p2 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "LANGUAGE")
p2 <- p2 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "red", fill = "red")
p2 <- ggpar(p2, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "B. Language")

# C. Memory
p3 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "MEMORY2")
p3 <- p3 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "red", fill = "red")
p3 <- ggpar(p3, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "C. Memory")

# D. Abstraction
p4 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "ABSTRACTION")
p4 <- p4 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "red", fill = "red")
p4 <- ggpar(p4, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "D. Abstraction")

# E. Attention
p5 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "ATTENTION")
p5 <- p5 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "red", fill = "red")
p5 <- ggpar(p5, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "E. Attention")

# F. Orientation
p6 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "ORIENTATION")
p6 <- p6 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "red", fill = "red")
p6 <- ggpar(p6, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "F. Orientation")


summary(collapsed_data3[collapsed_data3$NSAID_TYPE==0,]$ORIENTATION)
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==0,])
summary(collapsed_data3[collapsed_data3$NSAID_TYPE==1,]$ORIENTATION)
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==1,])

summary(collapsed_data3[collapsed_data3$NSAID_TYPE==0,]$ATTENTION)
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==0,])
summary(collapsed_data3[collapsed_data3$NSAID_TYPE==1,]$ATTENTION)
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==1,])
# Combine all six plots into a 2x3 grid (two rows, three columns)
grid.arrange(p1, p2, p3, p4, p5, p6, ncol = 3)
## Error in parse(text = input): <text>:2:15: unexpected ','
## 1: # doesnt work in rmd pdf(file = "Fig4_v3.pdf",   # The directory you want to save the file in
## 2:     width = 12,
##                  ^

#pdf2

summary(collapsed_data3[collapsed_data3$NSAID_TYPE==0,]$ORIENTATION)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00    0.00    2.52    6.00    6.00
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==0,])
## [1] 2267   70
summary(collapsed_data3[collapsed_data3$NSAID_TYPE==1,]$ORIENTATION)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   4.000   3.082   6.000   6.000
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==1,])
## [1] 844  70
summary(collapsed_data3[collapsed_data3$NSAID_TYPE==0,]$ATTENTION)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   0.000   2.421   6.000   6.000
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==0,])
## [1] 2267   70
summary(collapsed_data3[collapsed_data3$NSAID_TYPE==1,]$ATTENTION)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   3.000   2.902   6.000   6.000
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==1,])
## [1] 844  70
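
The repeated `summary()` / `dim()` calls above can also be collapsed into one grouped summary. This is only a sketch in base R on a made-up data frame (`toy` and its values are hypothetical, not taken from `collapsed_data3`); it reports the group sizes and the per-group medians of both subscores in one pass.

```r
# Toy data standing in for collapsed_data3 (values are invented for illustration only).
toy <- data.frame(
  NSAID_TYPE  = c(0, 0, 0, 1, 1),  # 0 = Naproxen, 1 = Diclofenac
  ORIENTATION = c(0, 0, 6, 4, 6),
  ATTENTION   = c(0, 3, 6, 3, 6)
)

# Number of rows per NSAID type (the information the repeated dim() calls give).
n_by_type <- table(toy$NSAID_TYPE)

# Median of each subscore per NSAID type.
# Caveat: the formula interface of aggregate() drops any row with an NA in one of
# the listed columns, whereas summary() on a single column only ignores NAs in
# that column, so the two can differ when data are missing.
med_by_type <- aggregate(cbind(ORIENTATION, ATTENTION) ~ NSAID_TYPE,
                         data = toy, FUN = median)

print(n_by_type)
print(med_by_type)
```

Replacing `median` with `mean`, or with a small function that returns several quantiles, would get closer to the full `summary()` output per group.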