imaging data pulled: 2021-01-21
clinical data pulled: 2020-11-16
code written: 2020-12-22
last ran: 2021-04-13
website: http://rpubs.com/navona/NM_prediction
code: https://github.com/navonacalarco/NM-MRI/blob/master/analyses/06_prediction.Rmd
related analysis: CCA | Hierarchical clustering


Description

This report contains a series of analyses that relate the CCA results (namely, participants’ CV1 scores on the \(X\) and \(Y\) sets) to a variety of unseen variables of interest. The legend in all coloured plots is as follows:




Symptom severity

We tested whether participants’ CV1 scores on the \(X\) (brain) set and/or \(Y\) (behaviour/cognition) set predicted depression symptom severity, as measured by the PHQ-9 and MADRS total scores. The PHQ-9 was administered at four study timepoints (screening, T0, T1, and T2), and the MADRS was administered at three (T0, T1, and T2). However, not all of the n=48 participants in our dataset with NM scans returned for follow-up assessments. Frequency counts of the PHQ-9 and MADRS at each timepoint are as follows:

Scale Timepoint Participant count
PHQ-9 screening 48
PHQ-9 T0 10
PHQ-9 T1 22
PHQ-9 T2 7
MADRS screening 0
MADRS T0 48
MADRS T1 22
MADRS T2 6

Here, we review depression severity scores taken at two timepoints: ‘closest-to-scan’ and ‘longitudinal’. The ‘closest-to-scan’ timepoint (coloured in green) is ‘screening’ for the PHQ-9 (n=48) and ‘T0’ for the MADRS (n=48). The ‘longitudinal’ timepoint (coloured in yellow) is ‘T1’ for both scales, which provides PHQ-9 data from n=22 participants (14 LLD) and MADRS data from n=22 participants (14 LLD). Note that this longitudinal estimate of depression severity is therefore based on roughly 46% (22 of 48) of the participants included in our dataset.

For both the ‘closest-to-scan’ and ‘longitudinal’ depression data, and for the separate LLD and HC groups as well as the combined LLD-HC group, we review two methods of prediction: (i) simple linear regression, with \(X\) and \(Y\) as separate predictors, and (ii) multiple linear regression, with \(X\) and \(Y\) as combined predictors. Neither the simple nor the multiple linear regression models find a consistent association between participants’ \(X\) and/or \(Y\) scores and (continuous) depression symptom severity, in any diagnostic grouping, at either the ‘closest-to-scan’ or ‘longitudinal’ timepoint. The one nominally significant multiple regression is the HC-only model for the ‘longitudinal’ PHQ-9 (adjusted R-squared = 0.712, model p = 0.019), which rests on only the 8 HC participants with T1 data. Overall, participants’ CCA CV1 scores do not predict depression symptom severity.
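The two modelling approaches described above can be sketched in R as follows. This is a minimal illustration on simulated data, not the analysis code itself: the data frame `dat`, its column names (`cv1_x`, `cv1_y`, `phq9_total`, `Diagnosis`), and the helper `model_summary()` are hypothetical stand-ins.

```r
# Simulated stand-in data; all names and values are assumptions for illustration
set.seed(1)
dat <- data.frame(
  Diagnosis  = rep(c("HC", "LLD"), times = c(23, 25)),
  cv1_x      = rnorm(48),            # CV1 score on the X (brain) set
  cv1_y      = rnorm(48),            # CV1 score on the Y (behaviour/cognition) set
  phq9_total = rpois(48, lambda = 6) # depression severity total score
)

# (i) Simple linear regression: X and Y as separate predictors
fit_x <- lm(phq9_total ~ cv1_x, data = dat)
fit_y <- lm(phq9_total ~ cv1_y, data = dat)

# (ii) Multiple linear regression: X and Y as combined predictors
fit_xy <- lm(phq9_total ~ cv1_x + cv1_y, data = dat)

# Adjusted R-squared and overall model p-value (from the F statistic)
model_summary <- function(fit) {
  s <- summary(fit)
  f <- s$fstatistic
  c(adj_r2 = s$adj.r.squared,
    p      = unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)))
}

model_summary(fit_xy)                                                  # combined HC-LLD group
model_summary(update(fit_xy, data = subset(dat, Diagnosis == "HC")))   # HC only
model_summary(update(fit_xy, data = subset(dat, Diagnosis == "LLD")))  # LLD only
```

The adjusted R-squared and model p reported for each multiple regression below correspond to the two quantities returned by the hypothetical `model_summary()` helper.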

PHQ-9: ‘closest-to-scan’

Simple linear regression

Multiple linear regression

Combined HC-LLD group: Adjusted R-squared = -0.016, model p = 0.539.
HC only: Adjusted R-squared = 0.048, model p = 0.236.
LLD only: Adjusted R-squared = -0.062, model p = 0.742.

PHQ-9: ‘longitudinal’

Simple linear regression

Multiple linear regression

Combined HC-LLD group: Adjusted R-squared = 0.03, model p = 0.289.
HC only: Adjusted R-squared = 0.712, model p = 0.019.
LLD only: Adjusted R-squared = -0.029, model p = 0.467.

MADRS: ‘closest-to-scan’

Simple linear regression

Multiple linear regression

Combined HC-LLD group: Adjusted R-squared = -0.019, model p = 0.579.
HC only: Adjusted R-squared = -0.063, model p = 0.712.
LLD only: Adjusted R-squared = -0.064, model p = 0.76.

MADRS: ‘longitudinal’

Simple linear regression

Multiple linear regression

Combined HC-LLD group: Adjusted R-squared = -0.003, model p = 0.399.
HC only: Adjusted R-squared = -0.01, model p = 0.442.
LLD only: Adjusted R-squared = -0.151, model p = 0.865.




SASP Index

We also reviewed associations with the SASP (Senescence-Associated Secretory Phenotype) Index. The SASP Index is a composite, integrated measure that reflects the dysregulation of distinct senescence-related pathways. We reviewed associations between the SASP Index and a number of variables, including participants’ CV1 scores on the X (brain) set and/or Y (behaviour/cognition) set. Note that SASP data are missing for 2 participants. We find no predictive association between participants’ CCA CV1 scores and the SASP Index.




CIRS-G

We also examined whether participants’ CV1 scores on the X (brain) set and/or Y (behaviour/cognition) set correlated with the Cumulative Illness Rating Scale-Geriatric (CIRS-G). We find no predictive association between participants’ CCA CV1 scores and the CIRS-G.




Health, etc.

For post-hoc exploration, we reviewed associations between participants’ CV1 scores on the X (brain) set and/or Y (behaviour/cognition) set, and a number of total scores from the ‘health’ variables in the SENDEP dataset. The variables are:

Variable Scale name Rating scale
health_ecog_total Everyday Cognition higher=worse
health_fas_total Fatigue Assessment Scale higher=worse
health_frail_total FRAIL scale higher=worse
health_gad7_total Generalized Anxiety Disorder Scale higher=worse
health_moca_total Montreal Cognitive Assessment higher=better
health_pss_total Perceived Stress Scale higher=worse
health_ucla3_total UCLA Loneliness Scale higher=worse
health_whodasv2_total WHO Disability Assessment Schedule higher=worse
health_wrat_total Word Reading Subtest higher=better

Correlations and significance tests with the CCA CV1 scores, by diagnosis and across diagnosis, are shown below. We find no predictive association between participants’ CCA CV1 scores and the health measures.

X - correlations

X - significance (diagnosis)

X - significance (group)

Y - correlations

Y - significance (diagnosis)

Y - significance (group)
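
The correlation and significance tables listed above could be produced along the following lines. This is a sketch under stated assumptions: the data are simulated, the column `cv1_x` and the helper `cor_with_cv1()` are hypothetical, and Pearson correlation is assumed because the report does not state which coefficient was used.

```r
# Simulated stand-in data; values are assumptions for illustration
set.seed(2)
dat <- data.frame(
  Diagnosis         = rep(c("HC", "LLD"), times = c(23, 25)),
  cv1_x             = rnorm(48),                # CV1 score on the X (brain) set
  health_gad7_total = rpois(48, lambda = 4),    # higher = worse
  health_moca_total = round(rnorm(48, 26, 2))   # higher = better
)
health_vars <- c("health_gad7_total", "health_moca_total")

# Correlate each health total with a CV1 score and collect r, p, and n
cor_with_cv1 <- function(d, cv1 = "cv1_x") {
  do.call(rbind, lapply(health_vars, function(v) {
    ct <- cor.test(d[[cv1]], d[[v]])  # Pearson by default; incomplete pairs are dropped
    data.frame(
      variable = v,
      r        = unname(ct$estimate),
      p        = ct$p.value,
      n        = sum(complete.cases(d[[cv1]], d[[v]]))
    )
  }))
}

across <- cor_with_cv1(dat)                               # across diagnosis
by_dx  <- lapply(split(dat, dat$Diagnosis), cor_with_cv1) # within HC and within LLD

across
by_dx
```

The same helper can be called with `cv1 = "cv1_y"` (also a hypothetical name) once a Y-set score column is present.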




Cognition

Lastly, we opted to determine if participants’ CV1 scores on the \(X\) (brain) set and/or \(Y\) (behaviour/cognition) set predicted longitudinal cognition, as measured by the same scales included in the \(Y\) set (namely RBANS and D-KEFS), at followup. Both scales were administered at three study timepoints (T0, T1, and T2). However, not all of the 48 participants in our dataset with NM scans returned for followup assessments. Frequency counts of the RBANS and D-KEFS at each timepoint are as follows:

Scale Timepoint Participant count
RBANS screening 0
RBANS T0 48
RBANS T1 22
RBANS T2 5
D-KEFS screening 0
D-KEFS T0 48
D-KEFS T1 22
D-KEFS T2 6

Based on participant counts, we review longitudinal cognition scores at the T1 timepoint, which is the same ‘longitudinal’ timepoint reviewed for the depression severity scores, above. We also calculated a delta score, representing the difference between the T0 and T1 timepoints. The RBANS has complete data from n=22 participants (14 LLD), and the D-KEFS has data from n=22 participants (14 LLD), coloured in green. Note that this longitudinal evaluation of cognition at follow-up is therefore based on roughly 46% (22 of 48) of the participants included in our dataset.


First, we wanted to assess if participants’ cognition scores were relatively stable over time. The plots below show cognition scores across the two timepoints, coloured by diagnostic group.


Delta

The following visualization shows the delta (change) in cognition score between the two timepoints, coloured by group. The t-test reports if there is a difference in delta between the HC and LLD groups. The table reports the mean and SD per group, per cognitive task.
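
As a rough sketch of the delta calculation and group comparison described above, the following R code uses simulated data. The column names are hypothetical, the sign convention (T1 minus T0) is an assumption, and the Welch test shown is simply the default of `t.test()`; the report does not state which variant was used.

```r
# Simulated stand-in data for the 22 participants with T0 and T1 scores
set.seed(3)
n <- 22
wide <- data.frame(
  ID                 = sprintf("sub-%02d", 1:n),
  Diagnosis          = rep(c("HC", "LLD"), times = c(8, 14)),
  rbans_attention_t0 = rnorm(n, mean = 100, sd = 15),
  rbans_attention_t1 = rnorm(n, mean = 100, sd = 15)
)

# Delta (change) score; T1 minus T0 is an assumed sign convention
wide$delta <- wide$rbans_attention_t1 - wide$rbans_attention_t0

# Does the delta differ between HC and LLD? (Welch two-sample t-test)
tt <- t.test(delta ~ Diagnosis, data = wide)

# n, mean, and SD of the delta per group
agg <- aggregate(
  delta ~ Diagnosis, data = wide,
  FUN = function(x) c(n = length(x), mean = mean(x), sd = sd(x))
)

tt$p.value
agg
```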

Boxplots

Table
Diagnosis timepoint var n mean sd
HC t0 dkefs_cwi_1_time 23 34.217 6.987
HC t0 dkefs_cwi_2_time 23 25.522 5.526
HC t0 dkefs_cwi_3_time 23 62.391 16.784
HC t0 dkefs_cwi_4_time 23 64.435 11.965
HC t0 dkefs_trails4_time 23 105.435 47.958
HC t0 dkefs_trails5_time 23 41.565 28.428
HC t0 rbans_attention_index 23 107.174 16.892
HC t0 rbans_delmem_index 23 97.000 14.894
HC t0 rbans_immmemory_index 23 96.435 12.427
HC t0 rbans_language_index 23 98.870 12.524
HC t0 rbans_visuo_index 23 97.174 15.799
HC t1 dkefs_cwi_1_time 8 35.500 8.053
HC t1 dkefs_cwi_2_time 8 27.125 4.794
HC t1 dkefs_cwi_3_time 8 61.125 10.934
HC t1 dkefs_cwi_4_time 8 64.500 13.617
HC t1 dkefs_trails4_time 8 106.875 29.705
HC t1 dkefs_trails5_time 8 43.000 12.750
HC t1 rbans_attention_index 8 104.500 18.868
HC t1 rbans_delmem_index 8 109.375 12.153
HC t1 rbans_immmemory_index 8 106.750 12.903
HC t1 rbans_language_index 8 104.500 14.531
HC t1 rbans_visuo_index 8 100.750 15.746
LLD t0 dkefs_cwi_1_time 25 34.160 10.351
LLD t0 dkefs_cwi_2_time 25 24.280 6.361
LLD t0 dkefs_cwi_3_time 25 68.120 23.446
LLD t0 dkefs_cwi_4_time 25 68.680 22.026
LLD t0 dkefs_trails4_time 25 122.680 58.996
LLD t0 dkefs_trails5_time 25 41.240 14.301
LLD t0 rbans_attention_index 25 101.240 16.435
LLD t0 rbans_delmem_index 25 100.960 11.077
LLD t0 rbans_immmemory_index 25 98.480 13.292
LLD t0 rbans_language_index 25 99.640 9.331
LLD t0 rbans_visuo_index 25 92.680 17.303
LLD t1 dkefs_cwi_1_time 14 33.714 7.087
LLD t1 dkefs_cwi_2_time 14 24.286 4.548
LLD t1 dkefs_cwi_3_time 14 67.357 27.692
LLD t1 dkefs_cwi_4_time 13 74.538 31.384
LLD t1 dkefs_trails4_time 12 119.583 63.572
LLD t1 dkefs_trails5_time 14 40.643 17.046
LLD t1 rbans_attention_index 14 93.643 16.118
LLD t1 rbans_delmem_index 14 102.643 11.126
LLD t1 rbans_immmemory_index 14 101.000 14.486
LLD t1 rbans_language_index 14 102.571 7.439
LLD t1 rbans_visuo_index 14 89.857 17.168


ANOVA: cognition over time

Next, we performed repeated-measures ANOVAs for each of the cognition variables. First, we consider just one within-subject factor, timepoint, to evaluate whether cognition differs across the two timepoints. The code takes the form of aov(score ~ timepoint + Error(ID/timepoint), data). The predictor is timepoint, and the outcome is score. Error(ID/timepoint) divides the error variance into two strata (between participants, and within participants across timepoints), which takes the repeated measures into account. To examine the effect of timepoint, check the Error: ID:timepoint section of the output. A significant p value there indicates a significant difference between the two timepoints. Most of the RBANS components show a difference over time (attention does not); of the DKEFS variables, only CWI 3 does. Note that this analysis does not take diagnosis into consideration.
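
The model form quoted above can be tried on simulated data as follows. This is a minimal sketch: the data frame `long` and its values are invented for illustration, and it includes only participants with both timepoints, so its Error: ID stratum has no timepoint row (unlike the unbalanced output below, where 48 participants contribute T0 data and 22 contribute T1 data).

```r
# Simulated long-format data: 22 participants, each measured at t0 and t1
set.seed(4)
n_id <- 22
subject_effect <- rnorm(n_id, mean = 0, sd = 10)  # stable between-participant differences
long <- data.frame(
  ID        = factor(rep(sprintf("sub-%02d", 1:n_id), times = 2)),
  timepoint = factor(rep(c("t0", "t1"), each = n_id)),
  score     = 100 + rep(subject_effect, times = 2) + rnorm(2 * n_id, sd = 5)
)

# One within-subject factor (timepoint); ID and timepoint must be factors
fit <- aov(score ~ timepoint + Error(ID / timepoint), data = long)
smry <- summary(fit)

# The within-participant test of timepoint lives in the ID:timepoint stratum
within <- smry[["Error: ID:timepoint"]][[1]]
within
```

With two timepoints and complete pairs, the F test in this stratum is equivalent to a paired t-test on the same scores.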

RBANS immediate memory
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1     96   96.37   0.428  0.516
## Residuals 46  10358  225.16               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint  1  378.2   378.2   5.717 0.0262 *
## Residuals 21 1389.3    66.2                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RBANS visual
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    533   533.5   1.423  0.239
## Residuals 46  17246   374.9               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint  1  349.5   349.5   7.515 0.0122 *
## Residuals 21  976.5    46.5                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RBANS language
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1      0    0.03       0  0.988
## Residuals 46   6394  139.01               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint  1  327.3   327.3   5.354 0.0309 *
## Residuals 21 1283.7    61.1                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RBANS attention
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint  1   1393  1393.1   3.703 0.0605 .
## Residuals 46  17307   376.2                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1     46   46.02   0.627  0.437
## Residuals 21   1540   73.36
RBANS delayed memory
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    258   257.6   1.246   0.27
## Residuals 46   9514   206.8               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint  1  311.1   311.1   4.831 0.0393 *
## Residuals 21 1352.4    64.4                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
DKEFS CWI 1
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1     13   13.20   0.132  0.718
## Residuals 46   4588   99.73               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1   9.09   9.091   1.201  0.285
## Residuals 21 158.91   7.567
DKEFS CWI 2
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    2.7    2.69    0.06  0.807
## Residuals 46 2050.3   44.57               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1   9.09   9.091   2.618  0.121
## Residuals 21  72.91   3.472
DKEFS CWI 3
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    932   932.2   1.541  0.221
## Residuals 46  27827   604.9               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint  1    396   396.0   5.092 0.0348 *
## Residuals 21   1633    77.8                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
DKEFS CWI 4
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    548   547.6   0.931   0.34
## Residuals 46  27066   588.4               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1   13.7   13.71   0.221  0.643
## Residuals 20 1239.3   61.96
DKEFS TRAILS A
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    284     284   0.077  0.783
## Residuals 46 169234    3679               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    109   108.9   0.106  0.748
## Residuals 19  19476  1025.1
DKEFS TRAILS B
## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    770   769.5       2  0.164
## Residuals 46  17700   384.8               
## 
## Error: ID:timepoint
##           Df Sum Sq Mean Sq F value Pr(>F)
## timepoint  1    270   270.0   0.638  0.434
## Residuals 21   8893   423.5


ANOVA: cognition over time by diagnosis

Second, we consider one within-subject factor, timepoint, and one between-subject factor, diagnosis. With the two factors, we can also test the interaction between timepoint and diagnosis. The code takes the form of aov(score ~ timepoint * Diagnosis + Error(ID/(timepoint * Diagnosis)), data). As above, most of the RBANS components show a difference over time; few of the DKEFS do. There is no significant effect of diagnosis or interaction with diagnosis.

RBANS immediate memory
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint            1     96    96.4   0.438 0.5116  
## Diagnosis            1      1     0.7   0.003 0.9541  
## timepoint:Diagnosis  1    673   673.2   3.059 0.0873 .
## Residuals           44   9684   220.1                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint            1  378.2   378.2   5.455   0.03 *
## timepoint:Diagnosis  1    2.5     2.5   0.037   0.85  
## Residuals           20 1386.8    69.3                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RBANS visual
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint            1    533   533.5   1.499 0.2274  
## Diagnosis            1   1038  1037.9   2.915 0.0948 .
## timepoint:Diagnosis  1    544   544.4   1.529 0.2228  
## Residuals           44  15664   356.0                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint            1  349.5   349.5   7.344 0.0135 *
## timepoint:Diagnosis  1   24.9    24.9   0.522 0.4782  
## Residuals           20  951.7    47.6                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RBANS language
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1      0    0.03   0.000  0.988
## Diagnosis            1      2    2.35   0.016  0.899
## timepoint:Diagnosis  1     21   21.22   0.147  0.704
## Residuals           44   6371  144.79               
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint            1  327.3   327.3   5.181  0.034 *
## timepoint:Diagnosis  1   20.3    20.3   0.321  0.577  
## Residuals           20 1263.5    63.2                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RBANS attention
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint            1   1393  1393.1   3.700 0.0609 .
## Diagnosis            1    673   673.3   1.788 0.1880  
## timepoint:Diagnosis  1     67    67.1   0.178 0.6750  
## Residuals           44  16567   376.5                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1   46.0   46.02   0.632  0.436
## timepoint:Diagnosis  1   84.7   84.68   1.163  0.294
## Residuals           20 1455.8   72.79
RBANS delayed memory
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1    258  257.61   1.222  0.275
## Diagnosis            1      6    5.85   0.028  0.868
## timepoint:Diagnosis  1    232  232.49   1.103  0.299
## Residuals           44   9275  210.81               
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint            1  311.1  311.11   5.372 0.0312 *
## timepoint:Diagnosis  1  194.1  194.09   3.351 0.0821 .
## Residuals           20 1158.3   57.91                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
DKEFS CWI 1
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1     13   13.20   0.128  0.722
## Diagnosis            1      2    2.43   0.024  0.879
## timepoint:Diagnosis  1     54   54.15   0.526  0.472
## Residuals           44   4531  102.98               
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1   9.09   9.091   1.144  0.298
## timepoint:Diagnosis  1   0.01   0.007   0.001  0.976
## Residuals           20 158.90   7.945
DKEFS CWI 2
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1    2.7    2.69   0.062  0.804
## Diagnosis            1   44.6   44.58   1.033  0.315
## timepoint:Diagnosis  1  107.1  107.08   2.482  0.122
## Residuals           44 1898.6   43.15               
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1   9.09   9.091   2.747  0.113
## timepoint:Diagnosis  1   6.72   6.722   2.031  0.170
## Residuals           20  66.19   3.309
DKEFS CWI 3
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1    932   932.2   1.493  0.228
## Diagnosis            1    327   327.1   0.524  0.473
## timepoint:Diagnosis  1     32    31.8   0.051  0.822
## Residuals           44  27468   624.3               
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)  
## timepoint            1  396.0   396.0   4.864 0.0393 *
## timepoint:Diagnosis  1    4.8     4.8   0.059 0.8104  
## Residuals           20 1628.2    81.4                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
DKEFS CWI 4
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1    548   547.6   0.918  0.343
## Diagnosis            1    487   487.1   0.816  0.371
## timepoint:Diagnosis  1    318   317.8   0.533  0.469
## Residuals           44  26262   596.9               
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1   13.7   13.71   0.212  0.650
## timepoint:Diagnosis  1   10.4   10.39   0.161  0.693
## Residuals           19 1228.9   64.68
DKEFS TRAILS A
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1    284     284   0.076  0.784
## Diagnosis            1   4031    4031   1.077  0.305
## timepoint:Diagnosis  1    516     516   0.138  0.712
## Residuals           44 164687    3743               
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1    109   108.9   0.104  0.751
## timepoint:Diagnosis  1    561   561.2   0.534  0.474
## Residuals           18  18915  1050.8
DKEFS TRAILS B
## 
## Error: ID
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1    770   769.5   2.005  0.164
## Diagnosis            1     95    95.0   0.248  0.621
## timepoint:Diagnosis  1    717   717.1   1.868  0.179
## Residuals           44  16888   383.8               
## 
## Error: ID:timepoint
##                     Df Sum Sq Mean Sq F value Pr(>F)
## timepoint            1    270   270.0   0.627  0.438
## timepoint:Diagnosis  1    280   279.7   0.649  0.430
## Residuals           20   8614   430.7


CV1 and longitudinal

We also reviewed the correlation between the longitudinal (T1) cognition scores and the CV1 scores. There are a small number of significant associations.

RBANS T1 & X set
RBANS T1 & Y set
D-KEFS T1 & X set
D-KEFS T1 & Y set


CV1 and delta

Here, we review the correlation between the cognition delta scores and the CV1 scores. There are few, if any, significant associations.

RBANS delta & X set

RBANS delta & Y set

D-KEFS delta & X set

D-KEFS delta & Y set


CV1 and residuals

Lastly, we review the correlation between cognition scores at the follow-up visit, residualized by cognition at baseline, and the CV1 scores. There are no significant associations.

RBANS residuals & X set

RBANS residuals & Y set

D-KEFS residuals & X set

D-KEFS residuals & Y set
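
One common way to residualize a follow-up score by its baseline, which may or may not match the analysis code exactly, is to regress T1 on T0 and keep the residuals. The sketch below uses simulated data; the column names `cv1_x`, `rbans_t0`, and `rbans_t1` are hypothetical.

```r
# Simulated stand-in data for the 22 participants with follow-up scores
set.seed(5)
n <- 22
wide <- data.frame(
  cv1_x    = rnorm(n),                       # CV1 score on the X (brain) set
  rbans_t0 = rnorm(n, mean = 100, sd = 15)   # baseline cognition
)
wide$rbans_t1 <- 0.7 * wide$rbans_t0 + rnorm(n, mean = 30, sd = 8)  # follow-up cognition

# Residualize follow-up on baseline: what remains is the part of the T1 score
# not linearly explained by the T0 score.
# With missing values, na.action = na.exclude keeps residuals aligned to rows.
wide$rbans_t1_resid <- residuals(lm(rbans_t1 ~ rbans_t0, data = wide))

# Correlate the residualized follow-up score with the CV1 score
ct <- cor.test(wide$cv1_x, wide$rbans_t1_resid)
ct$estimate
ct$p.value
```

By construction the residuals are uncorrelated with the baseline score, so any association with CV1 reflects follow-up variation beyond baseline.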