NSAIDs and Dementia: Reassessing Their Role in Alzheimer’s Disease

The Potential of NSAIDs: What Have We Overlooked in Alzheimer’s Research?

NSAIDs and Alzheimer’s Risk: A Fresh Look at the Evidence

In this file, I have attempted to summarize the code and reproduce it. Below, you will find:

A list of questions and considerations for discussion.

Some technical issues encountered in RStudio.

The full R code, in case @Victor wants to check specific parts.

NACCALZD (dementia attributed to AD) ≠ DEMENTED (dementia regardless of etiology)

? I looked everywhere, but I couldn't find the criteria used for the AD diagnosis. Have you?

Discussion Questions & Considerations

Q1) How was the education-corrected MOCA score of 25 (19, 27) calculated?

Q2) Fig 1

When looking at Figure 1, I find it odd that we include patients with a MOCA score below a certain threshold as “normal cognition”, even if their low score is due to impairments unrelated to dementia. On the other hand, it also seems strange to include patients with a MOCA score above 26 (up to 30) in both the demented and NACCALZD groups.

What do you think?

Q3) Why was the following hierarchy applied in assigning NSAID_TYPE? And why is Etodolac left out?

# Assign NSAID_TYPE = 1 where DICLOFENAC is used (Diclofenac supersedes Naproxen)
collapsed_data2$NSAID_TYPE[collapsed_data2$DICLOFENAC == 1] <- 1

# Assign NSAID_TYPE = 0 where only Naproxen is used (and not Diclofenac)
collapsed_data2$NSAID_TYPE[is.na(collapsed_data2$NSAID_TYPE) & collapsed_data2$NAPROXEN == 1] <- 0
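One way to make the hierarchy explicit, and to keep Etodolac and non-users as their own levels, would be a single factor. The following is only a sketch on toy data: the column NSAID_GROUP, its level names, and the precedence order (Diclofenac, then Naproxen, then Etodolac) are assumptions for illustration, not the coding used in the original analysis.

```r
# Toy data standing in for collapsed_data2 (values are made up)
toy <- data.frame(
  DICLOFENAC = c(1, 0, 0, 0, 1),
  NAPROXEN   = c(0, 1, 0, 0, 1),
  ETODOLAC   = c(0, 0, 1, 0, 0)
)

# Assumed precedence: Diclofenac > Naproxen > Etodolac > no NSAID.
# Later assignments overwrite earlier ones, so the highest-priority drug wins.
toy$NSAID_GROUP <- "none"
toy$NSAID_GROUP[toy$ETODOLAC == 1]   <- "etodolac"
toy$NSAID_GROUP[toy$NAPROXEN == 1]   <- "naproxen"
toy$NSAID_GROUP[toy$DICLOFENAC == 1] <- "diclofenac"
toy$NSAID_GROUP <- factor(toy$NSAID_GROUP,
                          levels = c("none", "naproxen", "diclofenac", "etodolac"))

# Cross-tabulating the raw flags shows how many patients take more than one drug
table(toy$DICLOFENAC, toy$NAPROXEN, dnn = c("DICLOFENAC", "NAPROXEN"))
table(toy$NSAID_GROUP)
```

The cross-tabulation is the useful part for the question: if very few patients use both drugs, the choice of which drug supersedes the other matters little.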

Q4) Fig 2 only includes demented patients.

What about NACCALZD?

What about Etodolac? (It also showed a protective, though weak, association with the risk.)
What about a comparison with non-NSAID users?

Q5) Fig 4: Shall we also include a box for demented non-NSAID and NACCALZD?

Q6) NSAIDs, Arthritis, and Dementia Risk

Off-topic, but the data also suggests a significant decrease in dementia risk among those with comorbid arthritis. Would it be too much to compare this cohort based on NSAID use?

Q7) Changes in MOCA scores over time: A comparison of Naproxen, Diclofenac, and non-NSAID groups of NACCALZD

How does the time interval between assessments affect the calculations? The impact may differ if visits are spaced one week apart versus five years apart.

Analysis includes AD patients with multiple MOCA assessments.
Evaluating the change (Δ) in MOCA scores over time.
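To make the time-interval concern concrete, the change could be expressed per year of follow-up instead of as a raw difference. This is a minimal sketch on made-up data; VISIT_DATE is a hypothetical column (the real visit date would have to be built from the dataset's own visit-date variables), and using only the first and last assessment is an assumption.

```r
# Toy long-format data: one row per visit (values and VISIT_DATE are made up)
visits <- data.frame(
  NACCID     = c("A", "A", "A", "B", "B", "C"),
  VISIT_DATE = as.Date(c("2018-01-10", "2019-01-10", "2021-01-10",
                         "2020-03-01", "2020-03-08", "2019-06-01")),
  NACCMOCA   = c(24, 22, 18, 20, 21, 25)
)

# For each patient: change between first and last assessment, and years between them
per_patient <- do.call(rbind, lapply(split(visits, visits$NACCID), function(d) {
  d <- d[order(d$VISIT_DATE), ]
  years <- as.numeric(d$VISIT_DATE[nrow(d)] - d$VISIT_DATE[1]) / 365.25
  delta <- d$NACCMOCA[nrow(d)] - d$NACCMOCA[1]
  data.frame(NACCID = d$NACCID[1],
             n_visits = nrow(d),
             years = years,
             delta = delta,
             # Annualized change is undefined with a single visit (years == 0)
             delta_per_year = if (years > 0) delta / years else NA_real_)
}))
per_patient
```

Patient B, with two visits one week apart, shows the problem: a one-point difference becomes a very large annualized change, so a minimum follow-up interval (or a model with time as a covariate) would probably be needed.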

Technical Questions: R Code Issues

I have the following R code where I’ve marked the parts I struggled with by writing "ERROR" in the header. Some of these errors are consequential.

Would you like me to send the R script file instead of pasting the code here? Or should I just provide the code inline for you to review?

1) Dataframes

I am unable to reproduce the model_data dataframe or the match_obj object. As a result, all subsequent formulas are incorrect.

2) What do these warnings mean, and why is the output a whole page long?

Warning message:
“glm.fit: fitted probabilities numerically 0 or 1 occurred”
Warning message:
“Removed 25522 rows containing non-finite outside the scale range
(`stat_smooth()`).”

`geom_smooth()` using formula = 'y ~ x'
Warning message:
“Removed 25522 rows containing non-finite outside the scale range
(`stat_smooth()`).”
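The messages above are warnings rather than errors, so the code still ran. As far as I understand them, the glm.fit warning usually appears when some combination of predictors separates the outcome almost perfectly, so that fitted probabilities reach 0 or 1. The stat_smooth() warning reports rows that ggplot2 dropped because x or y was missing, non-finite, or outside the axis limits. Each fit or plot layer prints its own copy, which is probably why the output gets so long. The sketch below uses made-up data to reproduce the first warning and to count the rows a smoother would drop for the second.

```r
# Made-up data in which x separates the two outcomes perfectly
sep <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c(0, 0, 0, 1, 1, 1))

# Capture warnings instead of printing them
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, data = sep, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
msgs  # expected to include "glm.fit: fitted probabilities numerically 0 or 1 occurred"

# For the ggplot2 warning: count rows with a missing or non-finite x or y
plot_df <- data.frame(SUM_SCORE = c(10, 20, 25, 30),
                      NACCMOCA  = c(12, NA, 24, NA))
n_dropped <- sum(!is.finite(plot_df$SUM_SCORE) | !is.finite(plot_df$NACCMOCA))
n_dropped
```

Running the same count on the real plotting data would show whether the 25522 removed rows are simply patients without a MOCA score.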

Code

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: lattice
## 
## 
## Attaching package: 'caret'
## 
## 
## The following object is masked from 'package:purrr':
## 
##     lift
## 
## 
## Loading required package: carData
## 
## 
## Attaching package: 'car'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## 
## The following object is masked from 'package:purrr':
## 
##     some
## 
## 
## Loading required package: zoo
## 
## 
## Attaching package: 'zoo'
## 
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## 
## 
## Attaching package: 'gridExtra'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     combine

Data collapsed

Data collapsed 1

# Define the columns to search for drugs (columns that start with "DRUG")
drug_columns <- grep("^DRUG", names(data), value = TRUE)

# Convert all drug columns to lowercase for case-insensitive comparison
data[drug_columns] <- lapply(data[drug_columns], tolower)

# Efficiently check for the presence of each drug across the relevant columns

# Create DICLOFENAC column
data$DICLOFENAC <- ifelse(rowSums(sapply(data[drug_columns], function(x) grepl("diclofenac", x))) > 0, 1, 0)

# Create NAPROXEN column
data$NAPROXEN <- ifelse(rowSums(sapply(data[drug_columns], function(x) grepl("naproxen", x))) > 0, 1, 0)

# Create ETODOLAC column
data$ETODOLAC <- ifelse(rowSums(sapply(data[drug_columns], function(x) grepl("etodolac", x))) > 0, 1, 0)

grep("NACCTBI",names(data))

## [1] 147

# Define the columns to search for diagnostic conditions
diagnosis_cols <- c("NACCTBI", "HXHYPER", "HXSTROKE", "DEP", "BIPOLDX", "SCHIZOP", 
                    "ANXIET", "PTSDDX", "OTHPSY", "ALCABUSE", "CANCER", "DIABETES", 
                    "MYOINF", "CONGHRT", "AFIBRILL", "HYPERT", "HYPCHOL", "VB12DEF", 
                    "THYDIS", "ARTH", "SLEEPAP", "OTHCOND")

# Step 2: Create binary indicators for each diagnosis column indicating if the patient has the condition at any visit
data <- data %>%
  mutate(across(all_of(diagnosis_cols), ~ ifelse(. %in% c(1, 2), 1, 0)))

# Now, aggregate these indicators by NACCID to get the total number of unique conditions per patient
diagnosis_summary <- data %>%
  group_by(NACCID) %>%
  summarise(across(all_of(diagnosis_cols), ~ max(.x, na.rm = TRUE))) %>%
  mutate(DIAGNOSIS = rowSums(across(all_of(diagnosis_cols))))

# Merge the DIAGNOSIS summary back to the original data without collapsing to one row per NACCID
data <- data %>%
  left_join(diagnosis_summary %>% select(NACCID, DIAGNOSIS), by = "NACCID")

save(data, diagnosis_summary, diagnosis_cols, drug_columns, file = "data2Oct2024.RData")


# Preliminary: Collapse DICLOFENAC, ETODOLAC, NAPROXEN, and CSFTAU columns across all visits for each unique patient (NACCID)
data <- data %>%
  group_by(NACCID) %>%
  mutate(DICLOFENAC = ifelse(any(DICLOFENAC == 1, na.rm = TRUE), 1, 0),
         NAPROXEN = ifelse(any(NAPROXEN == 1, na.rm = TRUE), 1, 0),
         ETODOLAC = ifelse(any(ETODOLAC == 1, na.rm = TRUE), 1, 0),
         CSFTAU = ifelse(any(CSFTAU == 1, na.rm = TRUE), 1, 0))%>%
  ungroup()

# Select one row per patient. Note: the filter always keeps the row with NACCVNUM == max(NACCVNUM),
# so slice_max() returns the most recent visit whether or not NACCMOCA is between 0 and 30
most_recent_data <- data %>%
  group_by(NACCID) %>%
  filter((NACCMOCA >= 0 & NACCMOCA <= 30) | is.na(NACCMOCA) | NACCVNUM == max(NACCVNUM)) %>%
  slice_max(NACCVNUM, n = 1) %>%
  ungroup()
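If the intent was to take the most recent visit that has a valid MOCA score and to fall back to the latest visit only when no valid score exists, the selection could look roughly like this. It is a sketch on made-up data; pick_visit is a hypothetical helper, and whether this is the selection rule the original analysis intended is an assumption.

```r
# Toy visit-level data (values are made up); -4 and 88 stand in for missing codes
visits <- data.frame(
  NACCID   = c("A", "A", "A", "B", "B", "C"),
  NACCVNUM = c(1, 2, 3, 1, 2, 1),
  NACCMOCA = c(25, 23, -4, 88, -4, 28)
)

# Hypothetical helper: choose one visit per patient
pick_visit <- function(d) {
  valid <- !is.na(d$NACCMOCA) & d$NACCMOCA >= 0 & d$NACCMOCA <= 30
  # Prefer visits with a valid score; fall back to all visits if there are none
  candidates <- if (any(valid)) d[valid, ] else d
  candidates[which.max(candidates$NACCVNUM), ]
}

most_recent_valid <- do.call(rbind, lapply(split(visits, visits$NACCID), pick_visit))
most_recent_valid
```

The trade-off is that age, diagnosis, and the other variables would then come from the visit with the MOCA score, which may be earlier than the patient's last visit.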

# Keep the analysis variables from the most recent visit (DIAGNOSIS was joined in above)
collapsed_data <- most_recent_data %>%
  select(NACCID, NACCVNUM, NACCAGE, SEX, NACCNIHR, EDUC, NACCALZD, NACCALZP,
         NACCMOCA, CDRGLOB, NACCMMSE, NORMCOG, DEMENTED, NACCUDSD, 
         CSFTAU, NACCTBI, HXHYPER, HXSTROKE, DEP, BIPOLDX, SCHIZOP, 
         ANXIET, PTSDDX, OTHPSY, ALCABUSE, CANCER, DIABETES, MYOINF, 
         CONGHRT, AFIBRILL, HYPERT, HYPCHOL, VB12DEF, THYDIS, ARTH, 
         SLEEPAP, OTHCOND, DICLOFENAC, NAPROXEN, ETODOLAC, DIAGNOSIS,
         MOCATRAI, MOCACUBE, MOCACLOC, MOCACLON, MOCACLOH, MOCANAMI, 
         MOCAFLUE, MOCAREPE, MOCAREGI, MOCARECN, MOCAABST, MOCADIGI, 
         MOCALETT, MOCASER7, MOCAORDT, MOCAORMO, MOCAORYR, MOCAORDY, 
         MOCAORPL, MOCAORCT)


# List of columns to convert
columns_to_convert <- c("MOCATRAI", "MOCACUBE", "MOCACLOC", "MOCACLON", "MOCACLOH", "MOCANAMI", 
                        "MOCAFLUE", "MOCAREPE", "MOCAREGI", "MOCARECN", "MOCAABST", "MOCADIGI", 
                        "MOCALETT", "MOCASER7", "MOCAORDT", "MOCAORMO", "MOCAORYR", "MOCAORDY", 
                        "MOCAORPL", "MOCAORCT")

# Convert the columns to numeric
collapsed_data[columns_to_convert] <- lapply(collapsed_data[columns_to_convert], as.numeric)

save(data, collapsed_data, file="data2Oct2024.RData")

names(collapsed_data)

##  [1] "NACCID"     "NACCVNUM"   "NACCAGE"    "SEX"        "NACCNIHR"  
##  [6] "EDUC"       "NACCALZD"   "NACCALZP"   "NACCMOCA"   "CDRGLOB"   
## [11] "NACCMMSE"   "NORMCOG"    "DEMENTED"   "NACCUDSD"   "CSFTAU"    
## [16] "NACCTBI"    "HXHYPER"    "HXSTROKE"   "DEP"        "BIPOLDX"   
## [21] "SCHIZOP"    "ANXIET"     "PTSDDX"     "OTHPSY"     "ALCABUSE"  
## [26] "CANCER"     "DIABETES"   "MYOINF"     "CONGHRT"    "AFIBRILL"  
## [31] "HYPERT"     "HYPCHOL"    "VB12DEF"    "THYDIS"     "ARTH"      
## [36] "SLEEPAP"    "OTHCOND"    "DICLOFENAC" "NAPROXEN"   "ETODOLAC"  
## [41] "DIAGNOSIS"  "MOCATRAI"   "MOCACUBE"   "MOCACLOC"   "MOCACLON"  
## [46] "MOCACLOH"   "MOCANAMI"   "MOCAFLUE"   "MOCAREPE"   "MOCAREGI"  
## [51] "MOCARECN"   "MOCAABST"   "MOCADIGI"   "MOCALETT"   "MOCASER7"  
## [56] "MOCAORDT"   "MOCAORMO"   "MOCAORYR"   "MOCAORDY"   "MOCAORPL"  
## [61] "MOCAORCT"

dim(collapsed_data)

## [1] 47165    61

n_unique_NACCID <- length(unique(data$NACCID))


# collapsed_data: recode the missing-value codes to NA in all numeric columns except NACCAGE
missing_codes <- c(-4, 88, 95, 96, 97, 98, 99)
collapsed_data <- collapsed_data %>%
  mutate(across(
    .cols = where(is.numeric) & !all_of("NACCAGE"),
    .fns = ~ replace(.x, .x %in% missing_codes, NA)))

# Identify the variables that need to be summed
columns_of_interestVS <- c("MOCATRAI", "MOCACUBE", "MOCACLOC", "MOCACLON", "MOCACLOH")
columns_of_interestLAN <- c("MOCANAMI", "MOCAFLUE", "MOCAREPE")
columns_of_interestMEM1 <- c("MOCAREGI")
columns_of_interestMEM2 <- c("MOCARECN")
columns_of_interestABS <- c("MOCAABST")
columns_of_interestATTN <- c("MOCADIGI", "MOCALETT", "MOCASER7")
columns_of_interestORI <- c("MOCAORDT", "MOCAORMO", "MOCAORYR", "MOCAORDY", "MOCAORPL", "MOCAORCT")

# Create new columns based on the sum of the variables
# (with na.rm = TRUE, a patient with no recorded items gets a sum of 0, not NA)
collapsed_data <- collapsed_data %>%
  mutate(
    VISUOSPATIAL = rowSums(select(., all_of(columns_of_interestVS)), na.rm = TRUE),
    LANGUAGE = rowSums(select(., all_of(columns_of_interestLAN)), na.rm = TRUE),
    MEMORY1 = rowSums(select(., all_of(columns_of_interestMEM1)), na.rm = TRUE),
    MEMORY2 = rowSums(select(., all_of(columns_of_interestMEM2)), na.rm = TRUE),
    ABSTRACTION = rowSums(select(., all_of(columns_of_interestABS)), na.rm = TRUE),
    ATTENTION = rowSums(select(., all_of(columns_of_interestATTN)), na.rm = TRUE),
    ORIENTATION = rowSums(select(., all_of(columns_of_interestORI)), na.rm = TRUE)
  )
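Because rowSums(..., na.rm = TRUE) returns 0 when every item is missing, patients without any MOCA items end up with domain scores of 0, which would explain the medians of 0 in the summary further down. A possible alternative is sketched here on made-up item scores; row_sum_or_na is a hypothetical helper, not part of the original code.

```r
# Toy item scores (made up): the second patient has no MOCA items recorded
items <- data.frame(
  MOCATRAI = c(1, NA, 1),
  MOCACUBE = c(0, NA, NA),
  MOCACLOC = c(1, NA, 1)
)

# rowSums(..., na.rm = TRUE) turns an all-missing row into 0
sum_na_rm <- rowSums(items, na.rm = TRUE)

# Hypothetical helper: NA if every item is missing, otherwise the sum of observed items
row_sum_or_na <- function(df) {
  all_missing <- rowSums(!is.na(df)) == 0
  out <- rowSums(df, na.rm = TRUE)
  out[all_missing] <- NA
  out
}

data.frame(sum_na_rm, sum_or_na = row_sum_or_na(items))
```

A partially missing row (the third patient) is still summed over the observed items here; a stricter rule would call rowSums() without na.rm so that any missing item yields NA.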

summary(collapsed_data)

##     NACCID             NACCVNUM         NACCAGE            SEX      
##  Length:47165       Min.   : 1.000   Min.   : 18.00   Min.   :1.00  
##  Class :character   1st Qu.: 1.000   1st Qu.: 68.00   1st Qu.:1.00  
##  Mode  :character   Median : 3.000   Median : 75.00   Median :2.00  
##                     Mean   : 3.704   Mean   : 74.62   Mean   :1.57  
##                     3rd Qu.: 5.000   3rd Qu.: 82.00   3rd Qu.:2.00  
##                     Max.   :18.000   Max.   :110.00   Max.   :2.00  
##                                                                     
##     NACCNIHR          EDUC          NACCALZD        NACCALZP    
##  Min.   :1.000   Min.   : 0.00   Min.   :0.000   Min.   :1.000  
##  1st Qu.:1.000   1st Qu.:12.00   1st Qu.:1.000   1st Qu.:1.000  
##  Median :1.000   Median :16.00   Median :1.000   Median :7.000  
##  Mean   :1.411   Mean   :15.17   Mean   :3.292   Mean   :4.757  
##  3rd Qu.:1.000   3rd Qu.:18.00   3rd Qu.:8.000   3rd Qu.:8.000  
##  Max.   :6.000   Max.   :31.00   Max.   :8.000   Max.   :8.000  
##  NA's   :824     NA's   :363                                    
##     NACCMOCA        CDRGLOB          NACCMMSE        NORMCOG      
##  Min.   : 0.00   Min.   :0.0000   Min.   : 0.00   Min.   :0.0000  
##  1st Qu.:19.00   1st Qu.:0.0000   1st Qu.:19.00   1st Qu.:0.0000  
##  Median :24.00   Median :0.5000   Median :26.00   Median :0.0000  
##  Mean   :22.05   Mean   :0.8002   Mean   :22.76   Mean   :0.3567  
##  3rd Qu.:27.00   3rd Qu.:1.0000   3rd Qu.:29.00   3rd Qu.:1.0000  
##  Max.   :30.00   Max.   :3.0000   Max.   :30.00   Max.   :1.0000  
##  NA's   :32487                    NA's   :26564                   
##     DEMENTED         NACCUDSD         CSFTAU           NACCTBI       
##  Min.   :0.0000   Min.   :1.000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:1.000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.0000   Median :3.000   Median :0.00000   Median :0.00000  
##  Mean   :0.4252   Mean   :2.672   Mean   :0.01662   Mean   :0.07887  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :4.000   Max.   :1.00000   Max.   :1.00000  
##                                                                      
##     HXHYPER         HXSTROKE            DEP            BIPOLDX       
##  Min.   :0.000   Min.   :0.00000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.000   Median :0.00000   Median :0.0000   Median :0.00000  
##  Mean   :0.255   Mean   :0.03515   Mean   :0.1763   Mean   :0.00299  
##  3rd Qu.:1.000   3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.00000  
##  Max.   :1.000   Max.   :1.00000   Max.   :1.0000   Max.   :1.00000  
##                                                                      
##     SCHIZOP              ANXIET            PTSDDX             OTHPSY       
##  Min.   :0.0000000   Min.   :0.00000   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.0000000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.00000  
##  Median :0.0000000   Median :0.00000   Median :0.000000   Median :0.00000  
##  Mean   :0.0008057   Mean   :0.02837   Mean   :0.002353   Mean   :0.01427  
##  3rd Qu.:0.0000000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:0.00000  
##  Max.   :1.0000000   Max.   :1.00000   Max.   :1.000000   Max.   :1.00000  
##                                                                            
##     ALCABUSE            CANCER           DIABETES           MYOINF       
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.000000   Median :0.00000   Median :0.00000   Median :0.00000  
##  Mean   :0.001399   Mean   :0.09348   Mean   :0.09492   Mean   :0.01904  
##  3rd Qu.:0.000000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.000000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##                                                                          
##     CONGHRT           AFIBRILL          HYPERT          HYPCHOL      
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000  
##  Mean   :0.01446   Mean   :0.0444   Mean   :0.2561   Mean   :0.2704  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##                                                                      
##     VB12DEF            THYDIS             ARTH           SLEEPAP       
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.00000   Median :0.0000   Median :0.00000  
##  Mean   :0.03785   Mean   :0.09844   Mean   :0.2793   Mean   :0.09354  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.0000   Max.   :1.00000  
##                                                                        
##     OTHCOND         DICLOFENAC        NAPROXEN          ETODOLAC       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.000000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.000000  
##  Mean   :0.1595   Mean   :0.0219   Mean   :0.06098   Mean   :0.003922  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.000000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.000000  
##                                                                        
##    DIAGNOSIS         MOCATRAI        MOCACUBE        MOCACLOC    
##  Min.   : 0.000   Min.   :0.00    Min.   :0.0     Min.   :0.00   
##  1st Qu.: 1.000   1st Qu.:0.00    1st Qu.:0.0     1st Qu.:1.00   
##  Median : 2.000   Median :1.00    Median :1.0     Median :1.00   
##  Mean   : 2.692   Mean   :0.66    Mean   :0.5     Mean   :0.94   
##  3rd Qu.: 4.000   3rd Qu.:1.00    3rd Qu.:1.0     3rd Qu.:1.00   
##  Max.   :15.000   Max.   :1.00    Max.   :1.0     Max.   :1.00   
##                   NA's   :32208   NA's   :32188   NA's   :32192  
##     MOCACLON        MOCACLOH        MOCANAMI        MOCAFLUE    
##  Min.   :0.00    Min.   :0.00    Min.   :0.00    Min.   :0.000  
##  1st Qu.:1.00    1st Qu.:0.00    1st Qu.:2.00    1st Qu.:0.000  
##  Median :1.00    Median :1.00    Median :3.00    Median :1.000  
##  Mean   :0.77    Mean   :0.58    Mean   :2.64    Mean   :0.674  
##  3rd Qu.:1.00    3rd Qu.:1.00    3rd Qu.:3.00    3rd Qu.:1.000  
##  Max.   :1.00    Max.   :1.00    Max.   :3.00    Max.   :1.000  
##  NA's   :32193   NA's   :32194   NA's   :32125   NA's   :29182  
##     MOCAREPE        MOCAREGI        MOCARECN        MOCAABST    
##  Min.   :0.000   Min.   : 0.00   Min.   :0.00    Min.   :0.000  
##  1st Qu.:1.000   1st Qu.: 8.00   1st Qu.:0.00    1st Qu.:1.000  
##  Median :2.000   Median : 9.00   Median :3.00    Median :2.000  
##  Mean   :1.348   Mean   : 8.33   Mean   :2.35    Mean   :1.526  
##  3rd Qu.:2.000   3rd Qu.:10.00   3rd Qu.:4.00    3rd Qu.:2.000  
##  Max.   :2.000   Max.   :10.00   Max.   :5.00    Max.   :2.000  
##  NA's   :29160   NA's   :32144   NA's   :29205   NA's   :29171  
##     MOCADIGI        MOCALETT        MOCASER7        MOCAORDT    
##  Min.   :0.00    Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:2.00    1st Qu.:1.000   1st Qu.:2.000   1st Qu.:1.000  
##  Median :2.00    Median :1.000   Median :3.000   Median :1.000  
##  Mean   :1.72    Mean   :0.843   Mean   :2.365   Mean   :0.764  
##  3rd Qu.:2.00    3rd Qu.:1.000   3rd Qu.:3.000   3rd Qu.:1.000  
##  Max.   :2.00    Max.   :1.000   Max.   :3.000   Max.   :1.000  
##  NA's   :29134   NA's   :29158   NA's   :29263   NA's   :29152  
##     MOCAORMO        MOCAORYR        MOCAORDY        MOCAORPL    
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000  
##  Median :1.000   Median :1.000   Median :1.000   Median :1.000  
##  Mean   :0.888   Mean   :0.876   Mean   :0.855   Mean   :0.849  
##  3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
##  NA's   :29152   NA's   :29152   NA's   :29150   NA's   :29152  
##     MOCAORCT      VISUOSPATIAL      LANGUAGE        MEMORY1      
##  Min.   :0.00    Min.   :0.000   Min.   :0.000   Min.   : 0.000  
##  1st Qu.:1.00    1st Qu.:0.000   1st Qu.:0.000   1st Qu.: 0.000  
##  Median :1.00    Median :0.000   Median :0.000   Median : 0.000  
##  Mean   :0.94    Mean   :1.098   Mean   :1.614   Mean   : 2.654  
##  3rd Qu.:1.00    3rd Qu.:2.000   3rd Qu.:4.000   3rd Qu.: 7.000  
##  Max.   :1.00    Max.   :5.000   Max.   :6.000   Max.   :10.000  
##  NA's   :29148                                                   
##     MEMORY2       ABSTRACTION       ATTENTION      ORIENTATION   
##  Min.   :0.000   Min.   :0.0000   Min.   :0.000   Min.   :0.000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.000  
##  Median :0.000   Median :0.0000   Median :0.000   Median :0.000  
##  Mean   :0.895   Mean   :0.5821   Mean   :1.877   Mean   :1.976  
##  3rd Qu.:1.000   3rd Qu.:2.0000   3rd Qu.:5.000   3rd Qu.:6.000  
##  Max.   :5.000   Max.   :2.0000   Max.   :6.000   Max.   :6.000  
##

collapsed_data <- collapsed_data %>%
  mutate(SUM_SCORE = VISUOSPATIAL + LANGUAGE + MEMORY2 + ABSTRACTION + ATTENTION + ORIENTATION)
ggplot(collapsed_data, aes(x = SUM_SCORE, y = NACCMOCA)) +
  geom_point(color = "blue", alpha = 0.7) +
  labs(title = "Scatterplot of Sum of Cognitive Scores vs NACCMOCA",
       x = "Sum of VISUOSPATIAL, LANGUAGE, MEMORY2, ABSTRACTION, ATTENTION, ORIENTATION",
       y = "NACCMOCA") +
  theme_minimal()

# Convert categorical variables to factors
collapsed_data <- collapsed_data %>%
  mutate(
    SEX = as.factor(SEX),
    NACCNIHR = as.factor(NACCNIHR),
    NACCTBI = as.factor(NACCTBI),
    DEP = as.factor(DEP),
    BIPOLDX = as.factor(BIPOLDX),
    SCHIZOP = as.factor(SCHIZOP),
    ANXIET = as.factor(ANXIET),
    PTSDDX = as.factor(PTSDDX),
    CANCER = as.factor(CANCER),
    DIABETES = as.factor(DIABETES),
    CONGHRT = as.factor(CONGHRT),
    HYPERT = as.factor(HYPERT),
    HYPCHOL = as.factor(HYPCHOL),
    VB12DEF = as.factor(VB12DEF),
    THYDIS = as.factor(THYDIS),
    ARTH = as.factor(ARTH),
    SLEEPAP = as.factor(SLEEPAP),
    DICLOFENAC = as.factor(DICLOFENAC),
    NAPROXEN = as.factor(NAPROXEN),
    ETODOLAC = as.factor(ETODOLAC),
    DIAGNOSIS = as.numeric(DIAGNOSIS)  # DIAGNOSIS is already numeric (sum of conditions)
  )

names(collapsed_data)

##  [1] "NACCID"       "NACCVNUM"     "NACCAGE"      "SEX"          "NACCNIHR"    
##  [6] "EDUC"         "NACCALZD"     "NACCALZP"     "NACCMOCA"     "CDRGLOB"     
## [11] "NACCMMSE"     "NORMCOG"      "DEMENTED"     "NACCUDSD"     "CSFTAU"      
## [16] "NACCTBI"      "HXHYPER"      "HXSTROKE"     "DEP"          "BIPOLDX"     
## [21] "SCHIZOP"      "ANXIET"       "PTSDDX"       "OTHPSY"       "ALCABUSE"    
## [26] "CANCER"       "DIABETES"     "MYOINF"       "CONGHRT"      "AFIBRILL"    
## [31] "HYPERT"       "HYPCHOL"      "VB12DEF"      "THYDIS"       "ARTH"        
## [36] "SLEEPAP"      "OTHCOND"      "DICLOFENAC"   "NAPROXEN"     "ETODOLAC"    
## [41] "DIAGNOSIS"    "MOCATRAI"     "MOCACUBE"     "MOCACLOC"     "MOCACLON"    
## [46] "MOCACLOH"     "MOCANAMI"     "MOCAFLUE"     "MOCAREPE"     "MOCAREGI"    
## [51] "MOCARECN"     "MOCAABST"     "MOCADIGI"     "MOCALETT"     "MOCASER7"    
## [56] "MOCAORDT"     "MOCAORMO"     "MOCAORYR"     "MOCAORDY"     "MOCAORPL"    
## [61] "MOCAORCT"     "VISUOSPATIAL" "LANGUAGE"     "MEMORY1"      "MEMORY2"     
## [66] "ABSTRACTION"  "ATTENTION"    "ORIENTATION"  "SUM_SCORE"

Data collapsed 2

# Step 1: Filter out patients with NACCALZD == 0
filtered_data <- collapsed_data %>%
  filter(NACCALZD != 0)

# Step 2: Recode NACCALZD where 8 (normal cognition) is changed to 0
filtered_data <- filtered_data %>%
  mutate(NACCALZD = case_when(
    NACCALZD == 8 ~ 0,  # Recode 8 to 0  (8 ~ 0 no cognitive impairment)
    NACCALZD == 1 ~ 1   # Keep 1 as 1 (Alzheimer's disease)
  ))

collapsed_data2 <- filtered_data

# Initialize NSAID_TYPE as NA
collapsed_data2$NSAID_TYPE <- NA

# Assign NSAID_TYPE = 1 where DICLOFENAC is used (Diclofenac supersedes Naproxen)
collapsed_data2$NSAID_TYPE[collapsed_data2$DICLOFENAC == 1] <- 1

# Assign NSAID_TYPE = 0 where only Naproxen is used (and not Diclofenac)
collapsed_data2$NSAID_TYPE[is.na(collapsed_data2$NSAID_TYPE) & collapsed_data2$NAPROXEN == 1] <- 0

table(collapsed_data2$DICLOFENAC, useNA = "always")

## 
##     0     1  <NA> 
## 36625   844     0

table(collapsed_data2$NAPROXEN, useNA = "always")

## 
##     0     1  <NA> 
## 35084  2385     0

Data collapsed 3 & model_data, match_obj, matched_data

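I am unable to reproduce the model_data dataframe or the match_obj object: matchit() stops because NACCNIHR and EDUC contain missing values (see the error message further down). As a result, all subsequent formulas are incorrect.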

# model_data
model_data <- collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]
dim(model_data)

## [1] 3111   70

#collapsed_data
collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]
dim(collapsed_data3)

## [1] 3111   70

#match_data
match_obj <- matchit(NSAID_TYPE ~ NACCAGE + SEX + NACCNIHR + EDUC + DIAGNOSIS, 
                     data = model_data, 
                     method = "full", 
                     distance = "glm", 
                     link = "probit")

## Error in `matchit()`:
## ! Missing and non-finite values are not allowed in the covariates.
## Covariates with missingness or non-finite values: NACCNIHR, EDUC

matched_data <- match.data(match_obj)

## Error: object 'match_obj' not found
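The matchit() error names NACCNIHR and EDUC as covariates with missing values, so one possible way forward is to restrict the data to complete cases on the matching covariates before matching. This is a sketch on made-up data: toy_model_data and model_data_cc are hypothetical names, and dropping rows (rather than imputing) is an assumption that may not match what was done originally.

```r
# Toy stand-in for model_data (values are made up)
toy_model_data <- data.frame(
  NSAID_TYPE = c(1, 0, 0, 1, 0, 1),
  NACCAGE    = c(70, 82, 65, 77, 90, 68),
  SEX        = factor(c(1, 2, 2, 1, 2, 1)),
  NACCNIHR   = factor(c(1, NA, 1, 2, 1, 1)),
  EDUC       = c(16, 12, NA, 18, 14, 20),
  DIAGNOSIS  = c(2, 4, 1, 0, 3, 2)
)

covariates <- c("NACCAGE", "SEX", "NACCNIHR", "EDUC", "DIAGNOSIS")

# Which covariates contain missing values, and how many?
colSums(is.na(toy_model_data[covariates]))

# Keep only rows with complete covariates
model_data_cc <- toy_model_data[complete.cases(toy_model_data[covariates]), ]
nrow(model_data_cc)

# With the real data, the matching call could then be rerun on the complete cases, e.g.:
# match_obj <- MatchIt::matchit(NSAID_TYPE ~ NACCAGE + SEX + NACCNIHR + EDUC + DIAGNOSIS,
#                               data = model_data_cc, method = "full",
#                               distance = "glm", link = "probit")
# matched_data <- MatchIt::match.data(match_obj)
```

If the original matched_data was built from a different number of rows, comparing nrow() before and after this step should show whether missing covariates account for the difference.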

Inconsistencies between MOCA score and DEMENTED / NACCALZD status

When looking at Figure 1, I find it odd that we include patients with a MOCA score below a certain threshold as “normal cognition”, even if their low score is due to impairments unrelated to dementia. On the other hand, it also seems strange to include patients with a MOCA score above 26 (up to 30) in both the demented and NACCALZD groups.

# Create new dataframe ad_moca with selected columns
ad_moca <- collapsed_data2[, c("NACCID", "NACCMOCA", "DEMENTED", "NACCALZD")]

# Filter for cases where DEMENTED or NACCALZD status doesn't match the MOCA score
inconsistent_cases <- collapsed_data2 %>%
  filter(NACCMOCA > 25 & DEMENTED == 1)
dim(inconsistent_cases)

## [1] 21 70

inconsistent_cases1 <- collapsed_data2 %>%
  filter(NACCMOCA < 20 & DEMENTED == 0)
dim(inconsistent_cases1)

## [1] 643  70

 ## about 1.7% of the 37,469 rows in collapsed_data2

inconsistent_cases2 <- collapsed_data2 %>%
  filter(NACCMOCA > 26 & NACCALZD == 1)
dim(inconsistent_cases2)

## [1] 190  70

 ## about 0.5% of the 37,469 rows in collapsed_data2

inconsistent_cases3 <- collapsed_data2 %>%
  filter(NACCMOCA < 20 & NACCALZD == 0)
dim(inconsistent_cases3)

## [1] 172  70

# Display the results
print(inconsistent_cases)

## # A tibble: 21 × 70
##    NACCID     NACCVNUM NACCAGE SEX   NACCNIHR  EDUC NACCALZD NACCALZP NACCMOCA
##    <chr>         <int>   <int> <fct> <fct>    <int>    <dbl>    <int>    <int>
##  1 NACC097130        5      73 1     1           19        1        1       26
##  2 NACC108978        2      78 1     1           20        1        1       26
##  3 NACC162469        1      77 1     1           12        1        1       28
##  4 NACC175064        1      63 1     1           16        1        2       30
##  5 NACC179619        2      66 1     1           18        1        2       27
##  6 NACC194582        1      63 2     2           14        1        1       26
##  7 NACC253699        2      65 1     1           18        1        1       28
##  8 NACC308810        1      73 2     1           18        1        1       30
##  9 NACC344354        4      73 2     1           16        1        1       26
## 10 NACC406578        1      64 2     1           13        1        1       28
## # ℹ 11 more rows
## # ℹ 61 more variables: CDRGLOB <dbl>, NACCMMSE <int>, NORMCOG <int>,
## #   DEMENTED <int>, NACCUDSD <int>, CSFTAU <dbl>, NACCTBI <fct>, HXHYPER <dbl>,
## #   HXSTROKE <dbl>, DEP <fct>, BIPOLDX <fct>, SCHIZOP <fct>, ANXIET <fct>,
## #   PTSDDX <fct>, OTHPSY <dbl>, ALCABUSE <dbl>, CANCER <fct>, DIABETES <fct>,
## #   MYOINF <dbl>, CONGHRT <fct>, AFIBRILL <dbl>, HYPERT <fct>, HYPCHOL <fct>,
## #   VB12DEF <fct>, THYDIS <fct>, ARTH <fct>, SLEEPAP <fct>, OTHCOND <dbl>, …

Logistic regression models Demented/NACCALZD

# Logistic regression model 1: no filtering on NACCALZD, outcome DEMENTED
set.seed(123)
logistic_model1 <- glm(
  DEMENTED ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + DEP + BIPOLDX + SCHIZOP + ANXIET + 
              PTSDDX + CANCER + DIABETES + CONGHRT + HYPERT + HYPCHOL + VB12DEF + 
              THYDIS + ARTH + SLEEPAP + DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS,
  data = collapsed_data,
  family = binomial(link = "logit")
)

# View the summary of the model
summary(logistic_model1)

## 
## Call:
## glm(formula = DEMENTED ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + 
##     DEP + BIPOLDX + SCHIZOP + ANXIET + PTSDDX + CANCER + DIABETES + 
##     CONGHRT + HYPERT + HYPCHOL + VB12DEF + THYDIS + ARTH + SLEEPAP + 
##     DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS, family = binomial(link = "logit"), 
##     data = collapsed_data)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.2288157  0.0894181  -2.559 0.010499 *  
## NACCAGE      0.0244402  0.0009579  25.514  < 2e-16 ***
## SEX2        -0.4082967  0.0210270 -19.418  < 2e-16 ***
## NACCNIHR2   -0.6084231  0.0326240 -18.650  < 2e-16 ***
## NACCNIHR3   -0.4569022  0.1317157  -3.469 0.000523 ***
## NACCNIHR4    0.3826622  0.3513750   1.089 0.276135    
## NACCNIHR5   -0.3097220  0.0628109  -4.931 8.18e-07 ***
## NACCNIHR6   -0.4540993  0.0598494  -7.587 3.26e-14 ***
## EDUC        -0.0834979  0.0031702 -26.339  < 2e-16 ***
## NACCTBI1    -0.1484236  0.0377231  -3.935 8.33e-05 ***
## DEP1         0.5397809  0.0279950  19.281  < 2e-16 ***
## BIPOLDX1     0.0429715  0.1916503   0.224 0.822588    
## SCHIZOP1     0.0629820  0.3724545   0.169 0.865718    
## ANXIET1      0.1670566  0.0635791   2.628 0.008600 ** 
## PTSDDX1     -0.8818708  0.2618662  -3.368 0.000758 ***
## CANCER1     -0.2720226  0.0397499  -6.843 7.74e-12 ***
## DIABETES1   -0.1495084  0.0362394  -4.126 3.70e-05 ***
## CONGHRT1     0.2238859  0.0884959   2.530 0.011409 *  
## HYPERT1     -0.1031722  0.0326610  -3.159 0.001584 ** 
## HYPCHOL1    -0.1200694  0.0309884  -3.875 0.000107 ***
## VB12DEF1     0.3626941  0.0544457   6.662 2.71e-11 ***
## THYDIS1     -0.0113423  0.0386633  -0.293 0.769246    
## ARTH1       -0.7911988  0.0303579 -26.062  < 2e-16 ***
## SLEEPAP1    -0.0505020  0.0402077  -1.256 0.209106    
## DICLOFENAC1 -0.3210074  0.0756263  -4.245 2.19e-05 ***
## NAPROXEN1   -0.2572108  0.0440678  -5.837 5.32e-09 ***
## ETODOLAC1   -0.3605477  0.1664991  -2.165 0.030352 *  
## DIAGNOSIS   -0.0348220  0.0092516  -3.764 0.000167 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 62680  on 46010  degrees of freedom
## Residual deviance: 57643  on 45983  degrees of freedom
##   (1154 observations deleted due to missingness)
## AIC: 57699
## 
## Number of Fisher Scoring iterations: 4
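The coefficients above are on the log-odds scale, so they may be easier to discuss as odds ratios. The sketch below converts the NSAID and arthritis estimates, copied from the summary(logistic_model1) output above, into odds ratios with Wald 95% confidence intervals. The data frame nsaid_coefs is a hypothetical helper object, and these remain associations from an observational model, not causal effects.

```r
# Estimates and standard errors copied from the summary(logistic_model1) output above
nsaid_coefs <- data.frame(
  term     = c("DICLOFENAC1", "NAPROXEN1", "ETODOLAC1", "ARTH1"),
  estimate = c(-0.3210074, -0.2572108, -0.3605477, -0.7911988),
  se       = c(0.0756263, 0.0440678, 0.1664991, 0.0303579)
)

# Odds ratio and Wald 95% confidence interval: exp(estimate +/- z * se)
z <- qnorm(0.975)
nsaid_coefs$odds_ratio <- exp(nsaid_coefs$estimate)
nsaid_coefs$ci_lower   <- exp(nsaid_coefs$estimate - z * nsaid_coefs$se)
nsaid_coefs$ci_upper   <- exp(nsaid_coefs$estimate + z * nsaid_coefs$se)
nsaid_coefs

# On the fitted model object itself, the equivalent would be:
# exp(cbind(OR = coef(logistic_model1), confint.default(logistic_model1)))
```

An odds ratio below 1 corresponds to lower odds of DEMENTED == 1 for that group, holding the other covariates fixed.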

# Logistic regression model 2: filtered on NACCALZD (collapsed_data2), outcome DEMENTED
set.seed(123)
logistic_model2 <- glm(
  DEMENTED ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + DEP + BIPOLDX + SCHIZOP + ANXIET + 
              PTSDDX + CANCER + DIABETES + CONGHRT + HYPERT + HYPCHOL + VB12DEF + 
              THYDIS + ARTH + SLEEPAP + DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS,
  data = collapsed_data2,
  family = binomial(link = "logit")
)
# View the summary of the model
summary(logistic_model2)

## 
## Call:
## glm(formula = DEMENTED ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + 
##     DEP + BIPOLDX + SCHIZOP + ANXIET + PTSDDX + CANCER + DIABETES + 
##     CONGHRT + HYPERT + HYPCHOL + VB12DEF + THYDIS + ARTH + SLEEPAP + 
##     DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS, family = binomial(link = "logit"), 
##     data = collapsed_data2)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.863713   0.105864  -8.159 3.39e-16 ***
## NACCAGE      0.036396   0.001125  32.352  < 2e-16 ***
## SEX2        -0.402324   0.024249 -16.591  < 2e-16 ***
## NACCNIHR2   -0.449112   0.036489 -12.308  < 2e-16 ***
## NACCNIHR3   -0.231145   0.152121  -1.519 0.128643    
## NACCNIHR4    0.179366   0.449004   0.399 0.689542    
## NACCNIHR5   -0.298822   0.072385  -4.128 3.66e-05 ***
## NACCNIHR6   -0.310564   0.067980  -4.568 4.91e-06 ***
## EDUC        -0.104785   0.003689 -28.404  < 2e-16 ***
## NACCTBI1    -0.097578   0.045352  -2.152 0.031433 *  
## DEP1         0.798841   0.033535  23.821  < 2e-16 ***
## BIPOLDX1     0.221831   0.257139   0.863 0.388307    
## SCHIZOP1     0.162300   0.516691   0.314 0.753434    
## ANXIET1      0.355259   0.075861   4.683 2.83e-06 ***
## PTSDDX1     -0.960394   0.346725  -2.770 0.005607 ** 
## CANCER1     -0.287111   0.044382  -6.469 9.86e-11 ***
## DIABETES1   -0.072395   0.042244  -1.714 0.086576 .  
## CONGHRT1     0.216954   0.099803   2.174 0.029719 *  
## HYPERT1     -0.164898   0.036819  -4.479 7.51e-06 ***
## HYPCHOL1    -0.127070   0.034762  -3.655 0.000257 ***
## VB12DEF1     0.388951   0.061902   6.283 3.31e-10 ***
## THYDIS1     -0.002559   0.043131  -0.059 0.952688    
## ARTH1       -0.820028   0.034007 -24.113  < 2e-16 ***
## SLEEPAP1    -0.046994   0.046432  -1.012 0.311487    
## DICLOFENAC1 -0.316232   0.085022  -3.719 0.000200 ***
## NAPROXEN1   -0.319987   0.049793  -6.426 1.31e-10 ***
## ETODOLAC1   -0.348451   0.183136  -1.903 0.057081 .  
## DIAGNOSIS   -0.032523   0.010481  -3.103 0.001914 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 49815  on 36622  degrees of freedom
## Residual deviance: 44463  on 36595  degrees of freedom
##   (846 observations deleted due to missingness)
## AIC: 44519
## 
## Number of Fisher Scoring iterations: 4

# Logistic regression model 3 filtering NACCALZD, logistic regression on NACCALZD
set.seed(123)
logistic_model3 <- glm(
  NACCALZD ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + DEP + BIPOLDX + SCHIZOP + ANXIET + 
              PTSDDX + CANCER + DIABETES + CONGHRT + HYPERT + HYPCHOL + VB12DEF + 
              THYDIS + ARTH + SLEEPAP + DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS,
  data = collapsed_data2,
  family = binomial(link = "logit")
)

# View the summary of the model
summary(logistic_model3)

## 
## Call:
## glm(formula = NACCALZD ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + 
##     DEP + BIPOLDX + SCHIZOP + ANXIET + PTSDDX + CANCER + DIABETES + 
##     CONGHRT + HYPERT + HYPCHOL + VB12DEF + THYDIS + ARTH + SLEEPAP + 
##     DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS, family = binomial(link = "logit"), 
##     data = collapsed_data2)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.389924   0.107053 -12.984  < 2e-16 ***
## NACCAGE      0.049491   0.001153  42.919  < 2e-16 ***
## SEX2        -0.509775   0.024509 -20.800  < 2e-16 ***
## NACCNIHR2   -0.319050   0.035193  -9.066  < 2e-16 ***
## NACCNIHR3   -0.146400   0.151231  -0.968  0.33302    
## NACCNIHR4    0.461190   0.464436   0.993  0.32070    
## NACCNIHR5   -0.218825   0.069955  -3.128  0.00176 ** 
## NACCNIHR6   -0.202888   0.066383  -3.056  0.00224 ** 
## EDUC        -0.098621   0.003796 -25.979  < 2e-16 ***
## NACCTBI1     0.119274   0.046636   2.558  0.01054 *  
## DEP1         1.129675   0.036812  30.688  < 2e-16 ***
## BIPOLDX1     0.447559   0.255568   1.751  0.07991 .  
## SCHIZOP1     0.771827   0.550836   1.401  0.16116    
## ANXIET1      0.498332   0.077439   6.435 1.23e-10 ***
## PTSDDX1     -0.775745   0.288313  -2.691  0.00713 ** 
## CANCER1     -0.170735   0.041267  -4.137 3.51e-05 ***
## DIABETES1    0.082835   0.043415   1.908  0.05640 .  
## CONGHRT1     0.167094   0.097655   1.711  0.08707 .  
## HYPERT1      0.022381   0.035189   0.636  0.52476    
## HYPCHOL1    -0.019456   0.033258  -0.585  0.55854    
## VB12DEF1     0.445637   0.061317   7.268 3.65e-13 ***
## THYDIS1      0.026937   0.040842   0.660  0.50955    
## ARTH1       -0.658537   0.032294 -20.392  < 2e-16 ***
## SLEEPAP1     0.057278   0.043415   1.319  0.18707    
## DICLOFENAC1 -0.247914   0.078063  -3.176  0.00149 ** 
## NAPROXEN1   -0.281654   0.047079  -5.983 2.20e-09 ***
## ETODOLAC1   -0.415376   0.176151  -2.358  0.01837 *  
## DIAGNOSIS   -0.084324   0.010216  -8.254  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 50427  on 36622  degrees of freedom
## Residual deviance: 44424  on 36595  degrees of freedom
##   (846 observations deleted due to missingness)
## AIC: 44480
## 
## Number of Fisher Scoring iterations: 4

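# Leftover call: `logistic_model` is not defined in this file (the fitted models
# are logistic_model1 to logistic_model3), so it fails with the error below
summary(logistic_model)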

## Error: object 'logistic_model' not found

with(summary(logistic_model1), 1 - deviance/null.deviance)

## [1] 0.08036398

with(summary(logistic_model2), 1 - deviance/null.deviance)

## [1] 0.1074406

with(summary(logistic_model3), 1 - deviance/null.deviance)

## [1] 0.1190576
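
The three values above are deviance-based pseudo-R² values: the share of the null deviance that the predictors remove. Below is a minimal sketch on simulated data (the variables `age` and `outcome` are made up for illustration and are not NACC columns) showing the same calculation, and that for ungrouped 0/1 outcomes it matches McFadden's R² computed from the log-likelihoods.

```r
# Simulated data; variable names are hypothetical, not from the NACC dataset
set.seed(1)
n <- 500
age <- rnorm(n, mean = 75, sd = 8)
outcome <- rbinom(n, size = 1, prob = plogis(-6 + 0.08 * age))

fit <- glm(outcome ~ age, family = binomial(link = "logit"))

# Deviance-based pseudo-R2: share of the null deviance removed by the predictors
pseudo_r2 <- 1 - fit$deviance / fit$null.deviance

# For ungrouped 0/1 outcomes the saturated log-likelihood is 0, so the same
# number is obtained from the log-likelihoods (McFadden's R2)
fit_null <- glm(outcome ~ 1, family = binomial(link = "logit"))
mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(fit_null))

print(c(pseudo_r2 = pseudo_r2, mcfadden = mcfadden))
```

Values around 0.08 to 0.12, as reported above, mean the covariates remove roughly 8 to 12 % of the null deviance; this is not directly comparable to an R² from linear regression.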

#tidysummary_model <- tidy(logistic_model2)
#write_csv(tidysummary_model, file = "tableS1.csv")
#logistic_coefficients<-exp(cbind(OR = coef(logistic_model2), confint(logistic_model2)))
#logistic_coefficients <- as.data.frame(logistic_coefficients)
#write_csv(logistic_coefficients, file = "tableS1_coef.csv")
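
The commented-out export above turns the log-odds coefficients into odds ratios with confidence intervals. For orientation, the DICLOFENAC1 estimate of -0.316 in model 2 corresponds to an odds ratio of about exp(-0.316) ≈ 0.73. The sketch below shows the conversion on simulated data; the variable names are hypothetical, and it uses Wald intervals from `confint.default()` rather than the profile-likelihood intervals that `confint()` gives in the original code, purely to keep the example fast and dependency-free.

```r
# Simulated data; `nsaid`, `age`, `outcome` are hypothetical stand-ins
set.seed(2)
n <- 400
nsaid <- rbinom(n, 1, 0.3)
age <- rnorm(n, 75, 8)
outcome <- rbinom(n, 1, plogis(-5 + 0.07 * age - 0.3 * nsaid))

fit <- glm(outcome ~ age + nsaid, family = binomial)

# Wald intervals on the log-odds scale, then exponentiated to odds ratios
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_table, 3))
```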

# Perform a t-test to compare NACCAGE between DEMENTED = 0 and DEMENTED = 1
t_test_result <- t.test(NACCAGE ~ DEMENTED, data = collapsed_data)

# Print the result of the t-test
print(t_test_result)

## 
##  Welch Two Sample t-test
## 
## data:  NACCAGE by DEMENTED
## t = -21.108, df = 43815, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -2.345538 -1.946947
## sample estimates:
## mean in group 0 mean in group 1 
##        73.70854        75.85478

Visual checks Error

Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, …) : plot.new has not been called yet

#Visual checks 

# Visual checks
par(mfrow = c(1, 2))  # Set up 2 plots side by side
# Index with collapsed_data$DEMENTED so the subset comes from the same data frame
hist(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 0], main = "DEMENTED = 0", xlab = "NACCAGE", breaks = 20)
hist(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 1], main = "DEMENTED = 1", xlab = "NACCAGE", breaks = 20)

# Q-Q plot to check normality
qqnorm(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 0], main = "Q-Q Plot DEMENTED = 0")
qqline(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 0])
qqnorm(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 1], main = "Q-Q Plot DEMENTED = 1")
qqline(collapsed_data$NACCAGE[collapsed_data$DEMENTED == 1])
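
One plausible cause of the `plot.new has not been called yet` error noted above: `qqline()` only adds a line to an existing plot, so it fails if it is run on its own (for example line by line) without `qqnorm()` having drawn a plot on the same device first. This is an assumption about how the chunk was run, not something the log confirms. A minimal sketch on simulated values:

```r
set.seed(3)
x <- rnorm(200, mean = 75, sd = 8)   # simulated ages, not NACC data

pdf(NULL)                                  # throwaway device so the sketch runs anywhere
qqnorm(x, main = "Q-Q plot (simulated)")   # high-level call: starts the plot
qqline(x)                                  # low-level call: draws onto the plot above
invisible(dev.off())
```

Running the whole chunk at once, rather than individual lines, keeps both calls on the same plot.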

Mann-Whitney U test (Wilcoxon rank-sum test) AGE ~ DEMENTED

# Mann-Whitney U test (Wilcoxon rank-sum test)

wilcox.test(NACCAGE ~ DEMENTED, data = collapsed_data)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  NACCAGE by DEMENTED
## W = 240365721, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0

# Mann-Whitney U test (Wilcoxon rank-sum test)
wilcox.test(NACCAGE ~ NACCALZD, data = filtered_data)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  NACCAGE by NACCALZD
## W = 130858134, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0

Problem with propensity score matching

ERROR

Call: matchit(formula = NSAID_TYPE ~ NACCAGE + SEX + NACCNIHR + EDUC + DIAGNOSIS, data = model_data, method = "full", distance = "glm", link = "probit")
Summary of Balance for All Data:
Summary of Balance for Matched Data:
Sample Sizes:

# Remove rows with missing values in the key covariates
model_data <- collapsed_data2 %>%
  filter(!is.na(NSAID_TYPE), !is.na(NACCAGE), !is.na(NACCNIHR), !is.na(EDUC), !is.na(DIAGNOSIS))

# Convert NSAID_TYPE to a factor
model_data$NSAID_TYPE <- as.factor(model_data$NSAID_TYPE)

# Perform propensity score matching
set.seed(123)
match_obj <- matchit(NSAID_TYPE ~ NACCAGE + SEX + NACCNIHR + EDUC + DIAGNOSIS, 
                     data = model_data, 
                     method = "full", 
                     distance = "glm", 
                     link = "probit")

# Check whether the matching worked and summarize the balance
summary(match_obj)

plot(match_obj, type = "jitter", interactive = FALSE)
plot(summary(match_obj), abs = FALSE)

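# Saving to PDF is disabled in the Rmd. Every line of the multi-line pdf() call
# has to be commented out; leaving `width`/`height` uncommented makes the whole
# chunk fail to parse (error shown below), so match_obj is never created.
#pdf(file = "FigS_matching.pdf",   # The directory you want to save the file in
#    width = 4, # The width of the plot in inches
#    height = 4) # The height of the plot in inches

#plot(summary(match_obj), abs = FALSE)
#dev.off()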

## Error in parse(text = input): <text>:27:14: unexpected ','
## 26: #pdf(file = "FigS_matching.pdf",   # The directory you want to save the file in
## 27:     width = 4,
##                  ^
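
The parse error above comes from commenting out only the first line of a multi-line `pdf()` call: R then tries to read `width = 4,` as code. Because the chunk fails at parse time, nothing in it runs, which is why `match_obj` is missing further down. Below is a minimal sketch of the complete open, plot, close pattern; the output path and the placeholder plot are hypothetical.

```r
out_file <- file.path(tempdir(), "FigS_example.pdf")  # hypothetical output path

pdf(file = out_file,
    width = 4,    # width of the plot in inches
    height = 4)   # height of the plot in inches
plot(1:10, (1:10)^2, main = "Placeholder plot")
invisible(dev.off())   # close the device so the file is actually written

print(file.exists(out_file))
```

To disable the export, all three lines of the `pdf()` call and the matching `dev.off()` need a leading `#`.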

Perform logistic regression on the matched data NACCALZD/DEMENTED ~ NSAID_TYPE (consequential error)

# Extract matched data

matched_data <- match.data(match_obj)

## Error: object 'match_obj' not found

# Perform logistic regression on the matched data
set.seed(123)
res1 <- glm(DEMENTED ~ NSAID_TYPE, data = matched_data, family = binomial)

## Error in eval(mf, parent.frame()): object 'matched_data' not found

# Perform logistic regression on the matched data
set.seed(123)
res2 <- glm(NACCALZD ~ NSAID_TYPE, data = matched_data, family = binomial)

## Error in eval(mf, parent.frame()): object 'matched_data' not found

summary(res)

## Error: object 'res' not found

# Summarize the logistic regression results
summary(res1)

## Error: object 'res1' not found

summary(res2)

## Error: object 'res2' not found

# Commented out for the Rmd file
#tidysummary_model1 <- tidy(res1)
#write_csv(tidysummary_model1, file = "tableS1_match.csv")
#logistic_coefficients1<-exp(cbind(OR = coef(res1), confint(res1)))
#logistic_coefficients1 <- as.data.frame(logistic_coefficients1)
#write_csv(logistic_coefficients1, file = "tableS1_coef_match.csv")

#tidysummary_model2 <- tidy(res2)
#write_csv(tidysummary_model2, file = "tableS2_match.csv")
#logistic_coefficients2<-exp(cbind(OR = coef(res2), confint(res2)))
#logistic_coefficients2 <- as.data.frame(logistic_coefficients2)
#write_csv(logistic_coefficients2, file = "tableS2_coef_match.csv")

Create a contingency table of NSAID_TYPE and DEMENTED matched (error)

# Create a contingency table of NSAID_TYPE and DEMENTED
contingency_table <- table(matched_data$NSAID_TYPE, matched_data$DEMENTED)

## Error: object 'matched_data' not found

# Perform Fisher's exact test
fisher_test_result <- fisher.test(contingency_table)

## Error: object 'contingency_table' not found

# Print the result of the Fisher's exact test
print(fisher_test_result)

## Error: object 'fisher_test_result' not found

print(contingency_table)

## Error: object 'contingency_table' not found

foo<-contingency_table

## Error: object 'contingency_table' not found

write.csv(foo, "Dementia_match_contingency.csv")

## Error in eval(expr, p): object 'foo' not found
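
Since the contingency-table step could not run here, the sketch below shows what it is meant to produce once `matched_data` exists. The data are simulated, with exposure and outcome generated independently, so the result says nothing about the real cohort.

```r
# Simulated stand-ins for matched_data$NSAID_TYPE and matched_data$DEMENTED
set.seed(4)
nsaid_type <- rbinom(300, 1, 0.4)   # 0 = Naproxen, 1 = Diclofenac
demented <- rbinom(300, 1, 0.35)    # generated independently of nsaid_type

tab <- table(NSAID_TYPE = nsaid_type, DEMENTED = demented)
print(tab)

ft <- fisher.test(tab)
print(ft$estimate)   # conditional maximum-likelihood odds ratio
print(ft$p.value)

# One row per cell, convenient for exporting the counts
cell_counts <- as.data.frame(tab)
print(cell_counts)
```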

MoCA subscores -> weird output of boxplots

# List of cognitive variables
cognitive_variables <- c("VISUOSPATIAL", "LANGUAGE", "MEMORY2", "ABSTRACTION", "ATTENTION", "ORIENTATION")

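# Doesn't work in Rmd: every line of the pdf() call has to be commented out,
# otherwise the uncommented `width`/`height` lines cause the parse error shown below
#pdf(file = "Fig2.pdf",   # The directory you want to save the file in
#    width = 8, # The width of the plot in inches
#    height = 4) # The height of the plot in inches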

collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]

# Doesn't work in Rmd, so the whole pdf() call is commented out
#pdf(file = "Fig_trial.pdf",   # The directory you want to save the file in
#    width = 12, # The width of the plot in inches
#    height = 12) # The height of the plot in inches
# Create individual plots for each cognitive variable by NSAID_TYPE
plot_list <- lapply(cognitive_variables, function(var) {
  ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = .data[[var]], fill = as.factor(NSAID_TYPE))) +
    geom_boxplot() +
    labs(title = paste("Distribution of", var, "by NSAID_TYPE"), x = "NSAID Type", y = "Score") +
    scale_fill_manual(values = c("0" = "blue", "1" = "red"), labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
    theme_minimal()
})

# Arrange plots in a grid
do.call(grid.arrange, c(plot_list, ncol = 3))

## Error in parse(text = input): <text>:5:14: unexpected ','
## 4: #doenst work in RMD pdf(file = "Fig2.pdf",   # The directory you want to save the file in
## 5:     width = 8,
##                 ^

CSF

# Logistic regression model CSF
set.seed(123)
logistic_model_CSF <- glm(
  CSFTAU ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + DEP + BIPOLDX + SCHIZOP + ANXIET + 
              PTSDDX + CANCER + DIABETES + CONGHRT + HYPERT + HYPCHOL + VB12DEF + 
              THYDIS + ARTH + SLEEPAP + DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS,
  data = collapsed_data2,
  family = binomial(link = "logit")
)

# View the summary of the model
summary(logistic_model_CSF)

## 
## Call:
## glm(formula = CSFTAU ~ NACCAGE + SEX + NACCNIHR + EDUC + NACCTBI + 
##     DEP + BIPOLDX + SCHIZOP + ANXIET + PTSDDX + CANCER + DIABETES + 
##     CONGHRT + HYPERT + HYPCHOL + VB12DEF + THYDIS + ARTH + SLEEPAP + 
##     DICLOFENAC + NAPROXEN + ETODOLAC + DIAGNOSIS, family = binomial(link = "logit"), 
##     data = collapsed_data2)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -1.064752   0.319324  -3.334 0.000855 ***
## NACCAGE      -0.049836   0.003345 -14.897  < 2e-16 ***
## SEX2         -0.369395   0.081605  -4.527 5.99e-06 ***
## NACCNIHR2    -1.069886   0.190576  -5.614 1.98e-08 ***
## NACCNIHR3    -0.896474   0.716740  -1.251 0.211020    
## NACCNIHR4   -12.382824 278.421472  -0.044 0.964526    
## NACCNIHR5    -1.170738   0.383010  -3.057 0.002238 ** 
## NACCNIHR6    -0.485020   0.265932  -1.824 0.068174 .  
## EDUC          0.038777   0.013691   2.832 0.004623 ** 
## NACCTBI1     -0.493659   0.186419  -2.648 0.008094 ** 
## DEP1         -0.165008   0.116818  -1.413 0.157796    
## BIPOLDX1      0.488456   0.484024   1.009 0.312899    
## SCHIZOP1    -11.277195 317.258656  -0.036 0.971645    
## ANXIET1       0.634077   0.164231   3.861 0.000113 ***
## PTSDDX1      -0.273404   0.734351  -0.372 0.709665    
## CANCER1       0.314663   0.113336   2.776 0.005497 ** 
## DIABETES1    -1.322862   0.294439  -4.493 7.03e-06 ***
## CONGHRT1     -0.997635   0.460654  -2.166 0.030335 *  
## HYPERT1      -0.116910   0.106604  -1.097 0.272784    
## HYPCHOL1      0.490275   0.100841   4.862 1.16e-06 ***
## VB12DEF1      0.377673   0.151139   2.499 0.012460 *  
## THYDIS1       0.092134   0.118005   0.781 0.434944    
## ARTH1         0.049171   0.099298   0.495 0.620465    
## SLEEPAP1      0.186360   0.114383   1.629 0.103256    
## DICLOFENAC1   0.001704   0.252642   0.007 0.994618    
## NAPROXEN1    -0.225072   0.163996  -1.372 0.169932    
## ETODOLAC1    -0.279308   0.716635  -0.390 0.696722    
## DIAGNOSIS     0.099229   0.030804   3.221 0.001276 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 6911.1  on 36622  degrees of freedom
## Residual deviance: 6354.7  on 36595  degrees of freedom
##   (846 observations deleted due to missingness)
## AIC: 6410.7
## 
## Number of Fisher Scoring iterations: 14

with(summary(logistic_model_CSF), 1 - deviance/null.deviance)

## [1] 0.08050304
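
In the CSF model above, NACCNIHR4 and SCHIZOP1 have estimates near -12 with standard errors in the hundreds, and the fit needed 14 Fisher scoring iterations. That pattern usually points to a predictor level with no (or almost no) positive outcomes, i.e. quasi-complete separation; whether that is the case here would need a cross-tabulation of CSFTAU against those variables. A minimal sketch on simulated data (the factor `group` and outcome `y` are hypothetical):

```r
set.seed(5)
n <- 600
group <- factor(sample(c("a", "b", "c"), n, replace = TRUE,
                       prob = c(0.6, 0.35, 0.05)))
y <- rbinom(n, 1, 0.2)
y[group == "c"] <- 0   # force an empty cell: no events in level "c"

# Cross-tabulate outcome by predictor; a zero cell flags the problem
print(table(group, y))

fit <- suppressWarnings(glm(y ~ group, family = binomial))
# The estimate for level "c" drifts toward -Inf and its standard error explodes
print(coef(summary(fit))[, c("Estimate", "Std. Error")])
```

If the real data show the same empty cells, those coefficients and their p-values are not interpretable as they stand.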

# Commented out for the Rmd file
#tidysummary_model_CSF <- tidy(logistic_model_CSF)
#write_csv(tidysummary_model_CSF, file = "tableS3_tau.csv")
#logistic_coefficients_CSF<-exp(cbind(OR = coef(logistic_model_CSF), confint(logistic_model_CSF)))
#logistic_coefficients_CSF <- as.data.frame(logistic_coefficients_CSF)
#write_csv(logistic_coefficients_CSF, file = "tableS3_coef_tau.csv")


# Perform logistic regression on the matched data
set.seed(123)
res_CSF <- glm(CSFTAU ~ NSAID_TYPE, data = matched_data, family = binomial)

## Error in eval(mf, parent.frame()): object 'matched_data' not found

summary(res_CSF)

## Error: object 'res_CSF' not found

# Commented out for the Rmd file
#tidysummary_model_CSF <- tidy(res_CSF)
#write_csv(tidysummary_model_CSF, file = "tableS3_tau_match.csv")
#logistic_coefficients_CSF<-exp(cbind(OR = coef(res_CSF), confint(res_CSF)))
#logistic_coefficients_CSF <- as.data.frame(logistic_coefficients_CSF)
#write_csv(logistic_coefficients_CSF, file = "tableS3_coef_tau_match.csv")


# Create a contingency table of NSAID_TYPE and CSFTAU
contingency_table_CSF <- table(matched_data$NSAID_TYPE, matched_data$CSFTAU)

## Error: object 'matched_data' not found

# Perform Fisher's exact test
fisher_test_result_CSF <- fisher.test(contingency_table_CSF)

## Error: object 'contingency_table_CSF' not found

# Print the result of the Fisher's exact test
print(fisher_test_result_CSF)

## Error: object 'fisher_test_result_CSF' not found

print(contingency_table_CSF)

## Error: object 'contingency_table_CSF' not found

foo<-contingency_table_CSF

## Error: object 'contingency_table_CSF' not found

write.csv(foo, "Csftau_match_contingency.csv")

## Error in eval(expr, p): object 'foo' not found

Figures

Fig 1v2

pdf(file = "Fig1_v2.pdf",   # The directory you want to save the file in
    width = 12, # The width of the plot in inches
    height = 8) # The height of the plot in inches

p1 <- ggscatter(collapsed_data2, x = "NACCAGE", y = "NACCMOCA",
          add = "reg.line",                                 # Add regression line
          conf.int = TRUE,                                  # Add confidence interval
          add.params = list(color = "blue",
                            fill = "lightgray")
          )+
  stat_cor(method = "pearson", label.x = 0, label.y = 20,aes(label = paste0(..r.label.., sep = " ")))+
stat_cor(method = "pearson", label.x = 0, label.y = 18.5,aes(label = paste0(..p.label.., sep = " ")))
p1<-ggpar(p1, xlim = c(0, 100),ylim = c(0, 30),title = "A",ylab="MOCA Score", xlab="Age (years)")

p2 <- ggscatter(matched_data, x = "NACCAGE", y = "NACCMOCA",
          add = "reg.line",                                 # Add regression line
          conf.int = TRUE,                                  # Add confidence interval
          add.params = list(color = "blue",
                            fill = "lightgray")
          )+
  stat_cor(method = "pearson", label.x = 3, label.y = 15)

## Error: object 'matched_data' not found

p2<-ggpar(p2, xlim = c(0, 100),ylim = c(0, 30),title = "D",ylab="MOCA Score", xlab="Age (years)")

## Error: object 'p2' not found

p3 <- ggboxplot(collapsed_data2, x = "DEMENTED", y = "NACCMOCA")
my_comparisons <- list( c("0", "1") )
p3<- p3+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Not Demented", "1" = "Demented"))
p3<-ggpar(p3,xlab="Demented Status", ylab="MOCA Score",title = "B")

p4 <- ggboxplot(matched_data, x = "DEMENTED", y = "NACCMOCA")

## Error: object 'matched_data' not found

my_comparisons <- list( c("0", "1") )
p4<- p4+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Not Demented", "1" = "Demented"))

## Error: object 'p4' not found

p4<-ggpar(p4,xlab="Demented Status", ylab="MOCA Score",title = "E")

## Error: object 'p4' not found

p5 <- ggboxplot(collapsed_data2, x = "NACCALZD", y = "NACCMOCA")
my_comparisons <- list( c("0", "1") )
p5<- p5+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Normal cognition", "1" = "AD"))
p5<-ggpar(p5,xlab="AD Status", ylab="MOCA Score",title = "C")

p6 <- ggboxplot(matched_data, x = "NACCALZD", y = "NACCMOCA")

## Error: object 'matched_data' not found

my_comparisons <- list( c("0", "1") )
p6<- p6+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Normal cognition", "1" = "AD"))

## Error: object 'p6' not found

p6<-ggpar(p6,xlab="AD Status", ylab="MOCA Score",title = "F")

## Error: object 'p6' not found

grid.arrange(p1, p3, p5, p2,p4,p6, ncol = 3)

## Error: object 'p2' not found

Fig 1 v1

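# Doesn't work in Rmd: every line of the pdf() call has to be commented out,
# otherwise the uncommented `width`/`height` lines cause the parse error shown below
#pdf(file = "Fig1.pdf",   # The directory you want to save the file in
#    width = 12, # The width of the plot in inches
#    height = 8) # The height of the plot in inches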

# Filter out the NACCMOCA codes 88 and 99 (note: the plots below still use
# filtered_data and matched_data, not collapsed_data_filtered)
collapsed_data_filtered <- filtered_data %>%
  filter(NACCMOCA != 88 & NACCMOCA != 99)



# A. Relationship of NACCMOCA vs NACCAGE scatter plot in collapsed_data with trend line and R-squared
p1 <- ggplot(filtered_data, aes(x = NACCAGE, y = NACCMOCA)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +  # Add trend line
  theme_minimal() +
  labs(title = "A. NACCMOCA vs NACCAGE (collapsed_data)", 
       x = "Age", 
       y = "MOCA Score") +
  stat_cor(method = "pearson", label.x = 60, label.y = 30, aes(label = after_stat(rr.label)))  # Updated notation

# B. Relationship of NACCMOCA vs NACCAGE scatter plot in matched_data with trend line and R-squared
p2 <- ggplot(matched_data, aes(x = NACCAGE, y = NACCMOCA)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +  # Add trend line
  theme_minimal() +
  labs(title = "B. NACCMOCA vs NACCAGE (matched_data)", 
       x = "Age", 
       y = "MOCA Score") +
  stat_cor(method = "pearson", label.x = 60, label.y = 30, aes(label = after_stat(rr.label)))  # Updated notation

# C. Boxplot distribution of NACCMOCA among DEMENTED = 0 and 1 (collapsed_data)
p3 <- ggplot(filtered_data, aes(x = as.factor(DEMENTED), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "C. NACCMOCA by DEMENTED (collapsed_data)", 
       x = "Demented Status", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Not Demented", "1" = "Demented"))

# D. Boxplot distribution of NACCMOCA among DEMENTED = 0 and 1 (matched_data)
p4 <- ggplot(matched_data, aes(x = as.factor(DEMENTED), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "D. NACCMOCA by DEMENTED (matched_data)", 
       x = "Demented Status", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Not Demented", "1" = "Demented"))

# E. Boxplot distribution of NACCMOCA among NACCALZD = 0 and 1 (collapsed_data)
p5 <- ggplot(filtered_data, aes(x = as.factor(NACCALZD), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "E. NACCMOCA by NACCALZD (collapsed_data)", 
       x = "AD Status", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Normal cognition", "1" = "AD"))

# F. Boxplot distribution of NACCMOCA among NACCALZD = 0 and 1 (matched_data)
p6 <- ggplot(matched_data, aes(x = as.factor(NACCALZD), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "F. NACCMOCA by NACCALZD (matched_data)", 
       x = "AD Status", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Normal cognition", "1" = "AD"))

# Combine all six plots into a 2x3 grid
grid.arrange(p1, p3, p5, p2, p4, p6, ncol = 3)

## Error in parse(text = input): <text>:2:15: unexpected ','
## 1: # doesnt work in RMD : pdf(file = "Fig1.pdf",   # The directory you want to save the file in
## 2:     width = 12,
##                  ^

Fig 2 v 2

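# Doesn't work in Rmd: every line of the pdf() call has to be commented out,
# otherwise the uncommented `width`/`height` lines cause the parse error shown below
#pdf(file = "Fig2_v2.pdf",   # The directory you want to save the file in
#    width = 8, # The width of the plot in inches
#    height = 4) # The height of the plot in inches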

collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]

p1 <- ggboxplot(collapsed_data3, x = "NSAID_TYPE", y = "NACCMOCA")
my_comparisons <- list( c("0", "1") )
p1<- p1+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))
p1<-ggpar(p1,xlab="NSAID Type", ylab="MOCA Score",title = "A")

p2 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "NACCMOCA")
my_comparisons <- list( c("0", "1") )
p2<- p2+stat_compare_means(comparisons = my_comparisons,method = "wilcox.test")+scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))
p2<-ggpar(p2,xlab="NSAID Type", ylab="MOCA Score",title = "B")

grid.arrange(p1, p2, ncol = 2)

## Error in parse(text = input): <text>:2:14: unexpected ','
## 1: #doesnt work in rmd # pdf(file = "Fig2_v2.pdf",   # The directory you want to save the file in
## 2:     width = 8,
##                 ^

Fig 2 v1

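# Doesn't work in Rmd: every line of the pdf() call has to be commented out,
# otherwise the uncommented `width`/`height` lines cause the parse error shown below
#pdf(file = "Fig2.pdf",   # The directory you want to save the file in
#    width = 8, # The width of the plot in inches
#    height = 4) # The height of the plot in inches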

collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]

# A. Boxplot distribution of NACCMOCA by NSAID_TYPE (collapsed_data3)
p1 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "A. NACCMOCA by NSAID TYPE (collapsed_data)", 
       x = "NSAID TYPE", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))
my_comparisons <- list( c("0", "1") )
p1<-p1 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# B. Boxplot distribution of NACCMOCA by NSAID_TYPE (matched_data)
p2 <- ggplot(matched_data, aes(x = as.factor(NSAID_TYPE), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "B. NACCMOCA by NSAID TYPE (matched_data)", 
       x = "NSAID TYPE", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))
my_comparisons <- list( c("0", "1") )
p2<-p2 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# Combine both plots side by side
grid.arrange(p1, p2, ncol = 2)

## Error in parse(text = input): <text>:2:14: unexpected ','
## 1: # doesnt work in rmd pdf(file = "Fig2.pdf",   # The directory you want to save the file in
## 2:     width = 8,
##                 ^

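Fig 3

# Doesn't work in Rmd: every line of the pdf() call has to be commented out,
# otherwise the uncommented `width`/`height` lines cause the parse error shown below
#pdf(file = "Fig3.pdf",   # The directory you want to save the file in
#    width = 8, # The width of the plot in inches
#    height = 4) # The height of the plot in inches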

collapsed_data3<-collapsed_data2[!is.na(collapsed_data2$NSAID_TYPE),]

# A. Boxplot distribution of NACCMOCA by CSFTAU (collapsed_data3)
p1 <- ggplot(collapsed_data3, aes(x = as.factor(CSFTAU), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "A. NACCMOCA by CSF TAU positivity (collapsed_data)", 
       x = "CSF TAU positivity", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "CSF Tau Negative (n=3054)", "1" = "CSF Tau Positive (n=57)"))
my_comparisons <- list( c("0", "1") )
p1<-p1 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# B. Boxplot distribution of NACCMOCA by CSFTAU (matched_data)
p2 <- ggplot(matched_data, aes(x = as.factor(CSFTAU), y = NACCMOCA)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "B. NACCMOCA by CSF TAU positivity (matched_data)", 
       x = "CSF TAU positivity", 
       y = "MOCA Score") +
  scale_x_discrete(labels = c("0" = "CSF Tau Negative (n=2999)", "1" = "CSF Tau Positive (n=57)"))
my_comparisons <- list( c("0", "1") )
p2<-p2 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# Combine both plots side by side
grid.arrange(p1, p2, ncol = 2)

## Error in parse(text = input): <text>:2:14: unexpected ','
## 1: #doesnt work in rmd pdf(file = "Fig3.pdf",   # The directory you want to save the file in
## 2:     width = 8,
##                 ^

Fig 4 Boxplot distribution of MoCA subscores by NSAID_TYPE

# Doesn't work in Rmd: every line of the pdf() call has to be commented out,
# otherwise the uncommented `width`/`height` lines cause the parse error shown below
#pdf(file = "Fig4.pdf",   # The directory you want to save the file in
#    width = 12, # The width of the plot in inches
#    height = 8) # The height of the plot in inches

# A. Boxplot distribution of VISUOSPATIAL by NSAID_TYPE (collapsed_data3)
p1 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = VISUOSPATIAL)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "A. VISUOSPATIAL by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p1<-p1 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# B. Boxplot distribution of LANGUAGE by NSAID_TYPE (collapsed_data3)
p2 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = LANGUAGE)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "B. LANGUAGE by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p2<-p2 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# C. Boxplot distribution of MEMORY2 by NSAID_TYPE (collapsed_data3)
p3 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = MEMORY2)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "C. MEMORY2 by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p3<-p3 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# D. Boxplot distribution of ABSTRACTION by NSAID_TYPE (collapsed_data3)
p4 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = ABSTRACTION)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "D. ABSTRACTION by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p4<-p4 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# E. Boxplot distribution of ATTENTION by NSAID_TYPE (collapsed_data3)
p5 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = ATTENTION)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "E. ATTENTION by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p5<-p5 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# F. Boxplot distribution of ORIENTATION by NSAID_TYPE (collapsed_data3)
p6 <- ggplot(collapsed_data3, aes(x = as.factor(NSAID_TYPE), y = ORIENTATION)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "F. ORIENTATION by NSAID TYPE", 
       x = "NSAID Type", 
       y = "Score") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac"))+
stat_summary(fun=mean, geom="point", shape=18, size=3, color="red", fill="red")
my_comparisons <- list( c("0", "1") )
p6<-p6 + stat_compare_means(comparisons = my_comparisons, method = "wilcox.test")

# Combine all six plots into a 2x3 grid
grid.arrange(p1, p2, p3, p4,p5,p6, ncol = 3)

## Error in parse(text = input): <text>:2:15: unexpected ','
## 1: #doesnt work in rmd pdf(file = "Fig4.pdf",   # The directory you want to save the file in
## 2:     width = 12,
##                  ^

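Fig 4v3 Boxplot distribution of MoCA subscores by NSAID_TYPE (matched data)

# Doesn't work in Rmd: every line of the pdf() call has to be commented out,
# otherwise the uncommented `width`/`height` lines cause a parse error
#pdf(file = "Fig4_v3.pdf",   # The directory you want to save the file in
#    width = 12, # The width of the plot in inches
#    height = 8) # The height of the plot in inches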

# Naproxen (0) vs. Diclofenac (1); the same comparison is reused by all six panels
my_comparisons <- list(c("0", "1"))

p1 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "VISUOSPATIAL") +
  stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3,
               color = "red", fill = "red")
p1 <- ggpar(p1, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "A. Visuospatial")

p2 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "LANGUAGE") +
  stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3,
               color = "red", fill = "red")
p2 <- ggpar(p2, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "B. Language")

p3 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "MEMORY2") +
  stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3,
               color = "red", fill = "red")
p3 <- ggpar(p3, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "C. Memory")

p4 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "ABSTRACTION") +
  stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3,
               color = "red", fill = "red")
p4 <- ggpar(p4, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "D. Abstraction")

p5 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "ATTENTION") +
  stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3,
               color = "red", fill = "red")
p5 <- ggpar(p5, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "E. Attention")

p6 <- ggboxplot(matched_data, x = "NSAID_TYPE", y = "ORIENTATION") +
  stat_compare_means(comparisons = my_comparisons, method = "wilcox.test") +
  scale_x_discrete(labels = c("0" = "Naproxen", "1" = "Diclofenac")) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3,
               color = "red", fill = "red")
p6 <- ggpar(p6, xlab = "NSAID Type", ylab = "MOCA Subscore", title = "F. Orientation")


summary(collapsed_data3[collapsed_data3$NSAID_TYPE==0,]$ORIENTATION)
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==0,])
summary(collapsed_data3[collapsed_data3$NSAID_TYPE==1,]$ORIENTATION)
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==1,])

summary(collapsed_data3[collapsed_data3$NSAID_TYPE==0,]$ATTENTION)
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==0,])
summary(collapsed_data3[collapsed_data3$NSAID_TYPE==1,]$ATTENTION)
dim(collapsed_data3[collapsed_data3$NSAID_TYPE==1,])

# Combine all six plots into a grid with 2 rows and 3 columns
grid.arrange(p1, p2, p3, p4,p5,p6, ncol = 3)

## Error in parse(text = input): <text>:2:15: unexpected ','
## 1: # doesnt work in rmd pdf(file = "Fig4_v3.pdf",   # The directory you want to save the file in
## 2:     width = 12,
##                  ^
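
The parse error above appears to come from commenting out only the first line of the multi-line pdf( call, which leaves `width = 12,` dangling on its own line. The sketch below reproduces this in base R by parsing both variants as text. The helper name `parses` is hypothetical and is used only for this illustration.

```r
# Variant 1: only the first line of the call is commented out,
# so the continuation lines are parsed as code.
broken <- '# pdf(file = "Fig4_v3.pdf",
    width = 12,
    height = 8)'

# Variant 2: every line of the call is commented out.
fixed <- '# pdf(file = "Fig4_v3.pdf",
#     width = 12,
#     height = 8)'

# Hypothetical helper: TRUE if the text parses as R code, FALSE otherwise.
parses <- function(code) {
  !inherits(try(parse(text = code), silent = TRUE), "try-error")
}

print(parses(broken))
print(parses(fixed))
```

In the first variant the parser reads `width = 12` as a complete expression and then stops at the trailing comma, which matches the position (line 2, column 15) reported in the error message.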

#pdf2

summary(collapsed_data3[collapsed_data3$NSAID_TYPE==0,]$ORIENTATION)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00    0.00    2.52    6.00    6.00

dim(collapsed_data3[collapsed_data3$NSAID_TYPE==0,])

## [1] 2267   70

summary(collapsed_data3[collapsed_data3$NSAID_TYPE==1,]$ORIENTATION)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   4.000   3.082   6.000   6.000

dim(collapsed_data3[collapsed_data3$NSAID_TYPE==1,])

## [1] 844  70

summary(collapsed_data3[collapsed_data3$NSAID_TYPE==0,]$ATTENTION)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   0.000   2.421   6.000   6.000

dim(collapsed_data3[collapsed_data3$NSAID_TYPE==0,])

## [1] 2267   70

summary(collapsed_data3[collapsed_data3$NSAID_TYPE==1,]$ATTENTION)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   3.000   2.902   6.000   6.000

dim(collapsed_data3[collapsed_data3$NSAID_TYPE==1,])

## [1] 844  70
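
The repeated summary() and dim() calls above could probably be condensed with base R's `table()` and `tapply()`. The following is a minimal sketch under that assumption; the data frame `toy` and its values are made up for illustration and are not study data.

```r
# Toy stand-in for collapsed_data3 (hypothetical values, not study data)
toy <- data.frame(
  NSAID_TYPE  = c(0, 0, 0, 1, 1),
  ORIENTATION = c(0, 0, 6, 4, 6),
  ATTENTION   = c(0, 3, 6, 2, 6)
)

# Number of rows per NSAID type (stands in for the repeated dim() calls)
group_n <- table(toy$NSAID_TYPE)

# summary() of each subscore, computed separately per NSAID type
orientation_by_type <- tapply(toy$ORIENTATION, toy$NSAID_TYPE, summary)
attention_by_type   <- tapply(toy$ATTENTION, toy$NSAID_TYPE, summary)

print(group_n)
print(orientation_by_type)
print(attention_by_type)
```

With the real data, `toy` would be replaced by `collapsed_data3`. Note that `table()` and `tapply()` drop rows where the grouping variable is NA by default, whereas subsetting with `==` keeps them as all-NA rows, so the counts can differ if NSAID_TYPE contains missing values.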

Statistic

2025-03-05