I used the preliminary NFL data from Suzanne. I only kept the most recent NFL reading for each individual so that I could use fixed effects models. I log-transformed the NFL data because apparently that’s what you’re supposed to do. (Thanks, Jeremy)
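A minimal sketch of that preprocessing step on made-up data, in base R. The column names `ID`, `nfl_date`, and `NFL` are assumptions for illustration, not necessarily the names in the real file.

```r
# Toy data with hypothetical column names (not the real dataset).
nfl <- data.frame(
  ID       = c(1, 1, 2, 3, 3, 3),
  nfl_date = as.Date(c("2015-01-10", "2017-03-02", "2016-06-15",
                       "2014-09-01", "2016-09-01", "2018-09-01")),
  NFL      = c(12.1, 15.4, 9.8, 20.3, 24.9, 31.0)
)

# Sort by participant and date, then keep the last (most recent) row per ID.
nfl <- nfl[order(nfl$ID, nfl$nfl_date), ]
latest <- nfl[!duplicated(nfl$ID, fromLast = TRUE), ]

# Natural-log transform of the NFL values.
latest$log_NFL <- log(latest$NFL)
```

With one row per person, an ordinary `lm()` can be used without having to model repeated measures.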
Quickly visualizing, we can see that people’s NFL levels increase as they age, and that people with higher NFL levels are more likely to convert.
Comparing individuals on the basis of amyloid positivity (I used PIB > 1.42 as my cutoff; 222 participants had both PIB and NFL) and CDR status (25 people with CDR > 0, 197 with CDR = 0), there are some differences:
When we correct for age, education, and sex, those differences remain strong.
## [1] "Amyloid Status"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 218 | 31.93362 | NA | NA | NA | NA |
| 216 | 30.67847 | 2 | 1.255148 | 4.418603 | 0.0131596 |
## [1] "Interaction of Amyloid Status * age"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 217 | 31.71484 | NA | NA | NA | NA |
| 216 | 30.67847 | 1 | 1.036362 | 7.296782 | 0.007456 |
## [1] "CDR Status"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 218 | 31.93362 | NA | NA | NA | NA |
| 216 | 30.21585 | 2 | 1.71777 | 6.139797 | 0.0025501 |
## [1] "Interaction of CDR Status * age"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 217 | 31.15225 | NA | NA | NA | NA |
| 216 | 30.21585 | 1 | 0.9364013 | 6.693926 | 0.0103299 |
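Tables like the four above are what `anova()` prints when it compares two nested `lm()` fits. Here is a sketch on simulated data. The model formula is my assumption (age, education, and sex as covariates, plus group and a group-by-age interaction); it does reproduce the residual degrees of freedom shown above (218, 217, 216 with 222 participants), but the variable names and data are made up.

```r
set.seed(42)
n <- 222
toy <- data.frame(
  age     = runif(n, 45, 90),
  educ    = sample(12:20, n, replace = TRUE),
  sex     = factor(sample(c("F", "M"), n, replace = TRUE)),
  amyloid = factor(sample(c("neg", "pos"), n, replace = TRUE))
)
# Simulated outcome; the effect sizes are arbitrary.
toy$log_NFL <- 1 + 0.02 * toy$age + 0.1 * (toy$amyloid == "pos") +
  rnorm(n, sd = 0.4)

# Full model, and two reduced models nested inside it.
full     <- lm(log_NFL ~ age * amyloid + educ + sex, data = toy)
no_inter <- lm(log_NFL ~ age + amyloid + educ + sex, data = toy)
no_amy   <- lm(log_NFL ~ age + educ + sex, data = toy)

# Dropping amyloid entirely removes 2 terms (main effect + interaction).
amy_test <- anova(no_amy, full)
# Dropping only the interaction removes 1 term.
inter_test <- anova(no_inter, full)

print(amy_test)
print(inter_test)
```

The `Df = 2` rows test the group effect and its interaction with age together, while the `Df = 1` rows test the interaction alone.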
In order to get the rate of neuropsych changes, I used Andy Aschenbrenner’s “PACC” score. To get the PACC, I:
- Dropped all MMSE scores > 30. I have no idea why 5 or 6 participants had scores in the 90’s.
- Kept the following four neuropsych scores: srtfree, digsym, MEMUNITS, and MMSE.
- Z scored each of those four tests.
- Calculated the mean of these 4 z scores to get the PACC score. If an individual was missing a test, I just averaged the ones they had.
- Kept all visits for which an individual had at least 2 of the 4 PACC components.
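The PACC steps above can be sketched in base R roughly like this. The data are made up, and two details are assumptions: out-of-range MMSE values are set to missing rather than dropping the whole visit, and each test is z scored against all visits in the sample.

```r
tests <- c("srtfree", "digsym", "MEMUNITS", "MMSE")

# Toy visits; values are invented for illustration.
psych <- data.frame(
  ID       = c(1, 1, 2, 2, 3),
  srtfree  = c(30, 28, NA, 25, NA),
  digsym   = c(55, 50, 40, NA, NA),
  MEMUNITS = c(14, 12, 9, 8, NA),
  MMSE     = c(29, 28, 95, 27, 30)
)

# MMSE is scored out of 30, so anything above that is treated as missing.
psych$MMSE[which(psych$MMSE > 30)] <- NA

# Column-wise z scores; scale() ignores NAs when computing mean and SD.
z <- scale(psych[, tests])

# PACC = mean of whatever components are available at that visit.
n_present  <- rowSums(!is.na(z))
psych$PACC <- rowMeans(z, na.rm = TRUE)

# Keep only visits with at least 2 of the 4 components.
pacc <- psych[n_present >= 2, ]
```

In this toy example the last visit has only an MMSE score, so it is dropped.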
I wrote a function to run a linear regression of each individual’s PACC score over time. Then I extracted the slope of this regression and used it to describe their rate of neuropsych change. This is what that looks like:
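library(dplyr)

# Fit PACC ~ visit date for one participant; return intercept and slope.
# If psy_date is a Date, the slope is in PACC units per day.
coef.fcn = function(DATA) {
  coeffs = coef(lm(PACC ~ psy_date, data = DATA))
  return(data.frame(Intercept = coeffs[1], Slope = coeffs[2]))
}

# One row of coefficients per participant
lm_coefs = clin %>%
  group_by(ID) %>%
  do(coef.fcn(.))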
The result of this regression is that it looks like there’s a pretty good correlation between slope and NFL, whether I split by CDR status or by amyloid status.
When I test that by comparing nested linear models with F-tests, after correcting for age and sex, there’s a significant relationship between slope and NFL, CDR, and the NFL x CDR interaction. In the amyloid models, NFL is still significant, but Amyloid Status (p = 0.43) and the NFL x Amyloid Status interaction (p = 0.51) are not.
## [1] "Relationship between PACC rate of change and NFL by CDR Groups"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 212 | 1.64e-05 | NA | NA | NA | NA |
| 210 | 1.44e-05 | 2 | 2.1e-06 | 15.16657 | 7e-07 |
## [1] "Relationship between PACC rate of change and CDR by CDR Groups"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 212 | 1.76e-05 | NA | NA | NA | NA |
| 210 | 1.44e-05 | 2 | 3.2e-06 | 23.30717 | 0 |
## [1] "Relationship between PACC rate of change and NFL:CDR by CDR groups"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 211 | 1.59e-05 | NA | NA | NA | NA |
| 210 | 1.44e-05 | 1 | 1.6e-06 | 22.68576 | 3.6e-06 |
## [1] "Relationship between PACC rate of change and NFL by Amyloid Status"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 212 | 1.83e-05 | NA | NA | NA | NA |
| 210 | 1.74e-05 | 2 | 8e-07 | 5.041035 | 0.0072717 |
## [1] "Relationship between PACC rate of change and Amyloid Level by Amyloid Status"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 212 | 1.76e-05 | NA | NA | NA | NA |
| 210 | 1.74e-05 | 2 | 1e-07 | 0.8358669 | 0.4349355 |
## [1] "Relationship between PACC rate of change and NFL:Amyloid by Amyloid Status"
| Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|
| 211 | 1.75e-05 | NA | NA | NA | NA |
| 210 | 1.74e-05 | 1 | 0 | 0.4399838 | 0.5078578 |
I also looked at resting state and NFL. I converted the Pearson correlation coefficients from the composite data to Z scores, and then I corrected everyone’s FC for age using a linear regression. I also took the log of the NFL data as before.
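A small sketch of those two FC steps on simulated data. I am assuming "Z scores" here means Fisher's r-to-z transform (`atanh`), and that the age correction keeps the residuals from a per-network regression on age. `SM_x_VAN` is one of the real composite names; the `_r`, `_z`, and `_adj` suffixes are made up for the example.

```r
set.seed(7)
n <- 100
fc <- data.frame(age = runif(n, 45, 90))

# Toy Pearson r values in (-1, 1) that drift with age.
fc$SM_x_VAN_r <- tanh(0.3 - 0.004 * fc$age + rnorm(n, sd = 0.1))

# Fisher r-to-z: z = atanh(r) = 0.5 * log((1 + r) / (1 - r)).
fc$SM_x_VAN_z <- atanh(fc$SM_x_VAN_r)

# Age correction: regress z on age and keep the residuals.
fc$SM_x_VAN_adj <- resid(lm(SM_x_VAN_z ~ age, data = fc))
```

By construction the residuals are uncorrelated with age, which is the point of the correction.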
Then I did best subset selection. I used the R leaps package to search for the best possible combination of up to 6 variables that would predict the preliminary NFL data. The inputs that I considered were the network composite scores only.
After performing 10-fold cross-validation, I found that the optimal linear model was:
fit <- lm(Prelim_NFL ~ SM_x_SM_lat + SM_x_VAN + SM_lat_x_CEREB +
            SM_lat_x_CO + SM_lat_x_DMN + SM_lat_x_SAL,
          data = df.temp2)
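For reference, here is a rough sketch of how a leaps search with 10-fold cross-validation can be set up. This is not the exact code behind the model above: the data are simulated, the predictor names `net1` to `net10` are placeholders for the network composites, and choosing the subset size by mean squared prediction error is my assumption. `regsubsets()` has no `predict()` method, so predictions are built by hand from the stored coefficients.

```r
library(leaps)

set.seed(1)
n <- 200
p <- 10
X <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("net", 1:p)))
toy <- data.frame(X)
# Simulated outcome that truly depends on two of the predictors.
toy$log_nfl <- 0.5 * toy$net1 - 0.4 * toy$net2 + rnorm(n, sd = 0.5)

k <- 10         # number of folds
max_vars <- 6   # largest subset size considered
folds <- sample(rep(1:k, length.out = n))
cv_err <- matrix(NA, nrow = k, ncol = max_vars)

for (f in 1:k) {
  train <- toy[folds != f, ]
  test  <- toy[folds == f, ]
  # Best subset of each size, fit on the training folds only.
  best <- regsubsets(log_nfl ~ ., data = train, nvmax = max_vars)
  test_mat <- model.matrix(log_nfl ~ ., data = test)
  for (size in 1:max_vars) {
    b <- coef(best, id = size)
    pred <- test_mat[, names(b), drop = FALSE] %*% b
    cv_err[f, size] <- mean((test$log_nfl - pred)^2)
  }
}

# Average held-out error per subset size; pick the smallest.
mean_cv <- colMeans(cv_err)
best_size <- which.min(mean_cv)
```

Once a size is chosen, the search is usually rerun on the full data to get the final variable set, and that set is then passed to `lm()`.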
So I think the takeaway message is that there’s something going on in the somatomotor network with NFL. Here is a visualization of each of these key components overlaid on our actual data.
And here is a visualization of the correlations between each of these parameters.
Because of the high level of correlation, I need to check for collinearity. Four of the six overall diagnostics below don’t detect it (the Farrar chi-square and Theil’s method do), so it looks like we’re mostly ok.
##
## Call:
## omcdiag(x = df.temp2[, c("SM_x_SM_lat", "SM_x_VAN", "SM_lat_x_CEREB",
## "SM_lat_x_CO", "SM_lat_x_DMN", "SM_lat_x_SAL")], y = df.temp2$Prelim_NFL)
##
##
## Overall Multicollinearity Diagnostics
##
##                        MC Results detection
## Determinant |X'X|:         0.2499         0
## Farrar Chi-Square:       276.1616         1
## Red Indicator:             0.2923         0
## Sum of Lambda Inverse:     9.4284         0
## Theil's Method:            0.7197         1
## Condition Number:         12.5295         0
##
## 1 --> COLLINEARITY is detected by the test
## 0 --> COLLINEARITY is not detected by the test
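As a cross-check that doesn't depend on the mctest version, a couple of these quantities can be sketched in base R from the predictor correlation matrix. This is on simulated predictors, and it is my assumption that the "Determinant" and "Sum of Lambda Inverse" rows above are computed from that correlation matrix. If so, the sum of inverse eigenvalues equals the sum of the variance inflation factors (VIFs), so 9.43 across 6 predictors would mean an average VIF of about 1.6.

```r
set.seed(3)
n <- 222
shared <- rnorm(n)
# Six toy predictors that share a common component (so they correlate).
X <- sapply(1:6, function(i) 0.5 * shared + rnorm(n))
colnames(X) <- paste0("net", 1:6)

R <- cor(X)

# Determinant of the correlation matrix: 1 = orthogonal, near 0 = collinear.
det_R <- det(R)

# VIFs are the diagonal of the inverse correlation matrix.
vif <- diag(solve(R))

# Sum of inverse eigenvalues equals trace(R^-1), i.e. the sum of the VIFs.
sum_lambda_inv <- sum(1 / eigen(R)$values)

print(round(c(det = det_R, sum_lambda_inv = sum_lambda_inv), 4))
print(round(vif, 3))
```

A common rule of thumb treats VIFs above 5 or 10 as a problem, so per-predictor VIFs are a useful tiebreaker when the overall tests disagree.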