ADNI + FreeSurfer Analysis Update 2

Merging these DM2 cases with the FreeSurfer data reduces our count to 28 observations of patients who have both FreeSurfer data and a DM2 diagnosis.

We also take a random sample of 150 patients with MCI who do not have DM2.

library(readxl) #read_excel
library(readr) #read_csv

set.seed(999) #fixes the random control sample drawn below

all_diabetes <- read_excel("dm2_r.xlsx") #every record with a note about diabetes
dm2 <- all_diabetes[!complete.cases(all_diabetes), ] #keeps rows with at least one missing (NA) field
dm2 <- dm2[!grepl("pre-diabetes", dm2$note),] #ADNI records with only full onset diabetes type 2

mci_pts <- read_excel("mci_pts_no_dm.xlsx") #every ADNI record with MCI

surf <- read_csv("freesurfer_data.csv") #all freesurfer data
surf_mci <- merge(mci_pts,surf,by="RID") #all freesurfer data with MCI pts
surf_mci <- surf_mci[ !(surf_mci$RID %in% all_diabetes$RID), ] #remove all diabetics

mci_ctrl_150 <- surf_mci[sample(nrow(surf_mci), 150), ] #grab sample of 150 mci freesurfer data, no dm2


Below, we subset the data to include only the relevant columns. In the DM2 dataframe, control cases are dropped so that only MCI patients remain, leaving 21 patients. A binary variable is also added to indicate whether each observation comes from a patient with diabetes.

Now, we have 3 dataframes: MCI+DM2 patients (N=21), MCI-only (N=150), and the combined dataframe with both groups, categorized by a binary DM2 indicator (N=171).

regions <- read_excel("aseg_names.xlsx", col_names = c("Code", "Region"))
codes <- unname(unlist(regions[,1]))
names <- unname(unlist(regions[,2]))

keep <- c("RID","VISCODE.y","COLPROT.x","COLPROT.y","DX_bl","AGE","PTGENDER","APOE4","MMSE")

df0 <- subset(merge(dm2,surf,by="RID"), select=c(keep,codes)) #all dm2 pts with freesurfer data
df <- df0[grepl("MCI", df0$DX_bl),] #dm2 with MCI, drops controls
df$dm2 <- "Yes DM2"

df2 <- subset(mci_ctrl_150, select = c(keep,codes)) #mci no dm2
df2$dm2 <- "No DM2"

dfx <- rbind(df,df2)
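
A quick sanity check on the combined dataframe can confirm that the group sizes match the two source dataframes and that no record appears in both groups. The sketch below uses small made-up stand-ins (toy_dm2, toy_ctrl, toy_all are hypothetical names, not objects from this analysis); the same checks could be run on df, df2, and dfx.

```r
# Toy stand-ins for df (MCI + DM2) and df2 (MCI only); all values are made up.
toy_dm2  <- data.frame(RID = 1:3, AGE = c(64, 70, 62), dm2 = "Yes DM2")
toy_ctrl <- data.frame(RID = 4:8, AGE = c(71, 66, 76, 73, 69), dm2 = "No DM2")
toy_all  <- rbind(toy_dm2, toy_ctrl)

# Group sizes in the combined data should match the two source dataframes.
counts <- table(toy_all$dm2)
stopifnot(counts[["Yes DM2"]] == nrow(toy_dm2),
          counts[["No DM2"]] == nrow(toy_ctrl))

# No record ID should appear in both groups.
stopifnot(!any(toy_dm2$RID %in% toy_ctrl$RID))

counts
```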


Data Description

Here, we give a brief overview of the two groups. The DM2 group is significantly younger (median age 64 vs. 71, p = 0.002), although with only 21 DM2 patients this difference should be read with caution. Gender, APOE4 status, and MMSE scores do not differ significantly between the groups (p = 0.9, 0.6, and 0.4, respectively).

SUMMARY OF PT DATA BY DIABETES STATUS

Characteristic  No DM2, N = 150 [1]   Yes DM2, N = 21 [1]   p-value [2]
AGE             71 (66, 76)           64 (62, 70)           0.002
  Unknown       1                     0
PTGENDER                                                    0.9
  Female        57 (38%)              7 (33%)
  Male          93 (62%)              14 (67%)
APOE4                                                       0.6
  0             78 (58%)              15 (71%)
  1             44 (33%)              5 (24%)
  2             13 (9.6%)             1 (4.8%)
  Unknown       15                    0
MMSE                                                        0.4
  21            1 (0.7%)              0 (0%)
  23            1 (0.7%)              0 (0%)
  24            6 (4.0%)              0 (0%)
  25            7 (4.7%)              0 (0%)
  26            13 (8.7%)             3 (14%)
  27            10 (6.7%)             3 (14%)
  28            32 (21%)              1 (4.8%)
  29            46 (31%)              8 (38%)
  30            34 (23%)              6 (29%)

[1] Statistics presented: median (IQR); n (%)

[2] Statistical tests performed: Wilcoxon rank-sum test; chi-square test of independence; Fisher's exact test
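
The code that produced this summary table is not shown. As a rough illustration of the three tests named in the footnote, the sketch below runs each one in base R on simulated data. The toy dataframe and its values are assumptions made for the example, and the pairing of each test with a variable is a guess at how the table was built, not a record of it.

```r
# Toy data only: values are simulated, not ADNI records.
set.seed(1)
toy <- data.frame(
  dm2    = rep(c("No DM2", "Yes DM2"), times = c(150, 21)),
  AGE    = c(rnorm(150, mean = 72, sd = 7), rnorm(21, mean = 65, sd = 6)),
  gender = sample(c("Female", "Male"), 171, replace = TRUE),
  APOE4  = sample(0:2, 171, replace = TRUE, prob = c(0.6, 0.3, 0.1))
)

# Continuous variable: Wilcoxon rank-sum test across the two groups.
p_age <- wilcox.test(AGE ~ dm2, data = toy)$p.value

# Categorical variable with adequate cell counts: chi-square test of independence.
p_gender <- chisq.test(table(toy$gender, toy$dm2))$p.value

# Categorical variable with sparse cells: Fisher's exact test.
p_apoe <- fisher.test(table(toy$APOE4, toy$dm2))$p.value

p_values <- round(c(AGE = p_age, PTGENDER = p_gender, APOE4 = p_apoe), 3)
p_values
```

Fisher's exact test is the usual choice when some expected cell counts are small, which is likely for the rarer APOE4 and MMSE categories in a group of 21.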





Analysis & Findings

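Welch two-sample t-tests (the default for t.test) are applied between the MCI+DM2 and MCI-only groups for each of the 128 brain regions of interest. Before multiple-comparison correction, over 30 regions are significant at the 0.05 level.

False Discovery Rate (FDR) correction is then applied separately within each metric (surface area, cortical thickness, and volume). The correction raised about 22 p-values above the 0.05 threshold, so those regions are no longer significant. The regions that remain significant are shown below. Cortical thickness yielded no significant results.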

all_results <- data.frame()
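for(i in 1:128) { #region columns follow the 9 columns listed in keep
  test <- t.test(dfx[,9+i]~dm2, data=dfx) #Welch t-test, MCI+DM2 vs MCI-only
  p_val <- test$p.value; code <- colnames(dfx)[9+i]
  all_results <- rbind.data.frame(all_results,c(code, p_val)) #p-value is stored as text here
}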
all_results$Region <- names; all_results <- all_results[,c(1,3,2)]
colnames(all_results) <- c("Code","Region","P-Value")

Area_results <- all_results[grepl("Surface Area", all_results$Region),]
Thick_results <- all_results[grepl("Thickness", all_results$Region),]
WM_GM_results <- all_results[grepl("Volume", all_results$Region),]

Area_results <- FDR.correction(Area_results);Area_results$metric <- "Surface Area"
Thick_results <- FDR.correction(Thick_results);Thick_results$metric <- "Cortical Thickness"
WM_GM_results <- FDR.correction(WM_GM_results);WM_GM_results$metric <- "Cortical"
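
FDR.correction is a helper that is not defined in this section. A minimal sketch of what such a helper might look like is given below, assuming it applies the Benjamini-Hochberg adjustment through p.adjust and returns the FDR_corrected and FDR_sig columns seen in the output table. The function name fdr_correction_sketch, the alpha argument, and the toy input are all hypothetical.

```r
# Hypothetical stand-in for FDR.correction(); the real helper is not shown.
fdr_correction_sketch <- function(results, alpha = 0.05) {
  # P-values may arrive as character strings, so coerce before adjusting.
  p <- as.numeric(results[["P-Value"]])
  # Benjamini-Hochberg adjustment (an assumption about the method used).
  results$FDR_corrected <- p.adjust(p, method = "BH")
  results$FDR_sig <- ifelse(results$FDR_corrected < alpha, "YES", "NO")
  results[, c("Code", "Region", "FDR_corrected", "FDR_sig")]
}

# Toy input: made-up codes, region names, and p-values.
toy <- data.frame(
  Code      = c("A1", "A2", "A3", "A4"),
  Region    = c("Region 1", "Region 2", "Region 3", "Region 4"),
  `P-Value` = c("0.001", "0.02", "0.03", "0.8"),
  check.names = FALSE
)

out <- fdr_correction_sketch(toy)
out
```

Because the adjustment depends on how many p-values are passed in, correcting each metric separately (as above) can give different results from correcting all 128 regions in one call.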




   Code    Region                                                   FDR_corrected  FDR_sig  metric
11 ST107SA Surface Area (aparc.stats) of RightPericalcarine         0.0352645      YES      Surface Area
21 ST114SA Surface Area (aparc.stats) of RightRostralMiddleFrontal  0.0380866      YES      Surface Area
23 ST115SA Surface Area (aparc.stats) of RightSuperiorFrontal       0.0442650      YES      Surface Area
27 ST117SA Surface Area (aparc.stats) of RightSuperiorTemporal      0.0369374      YES      Surface Area
39 ST23SA  Surface Area (aparc.stats) of LeftCuneus                 0.0352645      YES      Surface Area
47 ST28SA  Surface Area (aparc.stats) of LeftHemisphereWM           0.0380866      YES      Surface Area


Regions Significantly Different Between Patients with MCI + Type 2 Diabetes and Patients with MCI Only

Surface Area of RightPericalcarine

Surface Area of RightRostralMiddleFrontal

Surface Area of RightSuperiorFrontal

Surface Area of RightSuperiorTemporal

Surface Area of LeftCuneus

Surface Area of LeftHemisphereWM

Surface Area of LeftRostralMiddleFrontal

Surface Area of LeftSuperiorTemporal

Surface Area of LeftTransverseTemporal

Subcortical Volume of CorticalWM

Subcortical Volume of TotalGM



library(ggplot2)

#boxplot of right pericalcarine surface area (ST107SA) by diabetes status
ggplot(dfx, aes(x=dm2, y=ST107SA)) + 
  geom_boxplot(outlier.colour="red", outlier.shape=8,
                outlier.size=4)