SPINS CCA analysis

Introduction

Neurocognitive and social cognitive deficits are pervasive in schizophrenia spectrum disorders (SSD). Both deficits are often evident early in the illness course, are highly predictive of functional outcome, and are largely unalleviated by existing pharmacological interventions. A great deal of research has described the unique and shared variance of neurocognitive and social cognitive deficits, as well as their respective, and mutual, relationships to other hallmark features of SSD. More recently, interest in biologically driven detection and targeted intervention has motivated a smaller body of work investigating the neurobiology underlying these deficits. To date, most imaging studies have tested the association of a single construct (i.e., either neurocognition or social cognition, often considered as unitary rather than dimensional) with whole-brain derived features. As a result, it is unknown whether neurocognitive and social cognitive deficits arise from overlapping or distinct brain pathology.

Objective. In the present analysis, we apply a multivariate statistical technique, canonical correlation analysis (CCA), to examine the relationship between white matter fractional anisotropy (FA) and both neurocognition and social cognition, across a spectrum of participants with SSD and healthy controls (HCs). In past work, we have found FA values in the right arcuate fasciculus (AF), left uncinate fasciculus (UF), and right inferior longitudinal fasciculus (ILF) to be highly related to social cognition (Voineskos, 2013; Wheeler, 2015; Behdinan, 2015). Here, we seek to replicate this finding, confirm its hemispheric sensitivity, and investigate whether these tracts are also implicated in neurocognition. We expect that reduced white matter integrity in these three tracts (and others implicated in social cognition) will also predict neurocognitive impairment, though perhaps only in some facets, and likely to a lesser degree.

Methods

Data were extracted from the collaborative multi-centre study, ‘Social Processes Initiative in Neurobiology of the Schizophrenia(s)’ (SPINS), designed to identify neural circuitry related to social cognitive impairment, in nearly 500 adults who either have SSD or no psychiatric diagnosis (HC). (See Figure 1 for a visualization of available data.) The study has an extensive multi-modal imaging protocol, as well as ‘deep phenotyping’ of neurocognitive and social cognitive ability, from multiple days of participant assessment.

DWI acquisition and preprocessing. The present study analyzed diffusion weighted imaging (DWI) scans, which are sensitive to white matter microstructure. Despite prospective efforts to harmonize DWI acquisitions across sites, analyses revealed that derived metrics showed a large, non-linear scanner effect (i.e., different scanner models produced signals that varied nonlinearly as a function of brain tissue), and we have therefore elected to analyze only data collected from three matched 3T Siemens PRISMA MRIs with 64-channel head-coils. All DWI scans were a 60-direction EPI dual spin echo sequence with the following identical parameters: b=1000, 5 b0s, TR/TE=8800/85ms, FOV=256mm, 128x128 matrix, and 2mm isotropic voxels. DWI data were denoised, skull-stripped, aligned, and motion- and distortion-corrected using FSL and MRtrix software. Data underwent automated and manual quality control after every preprocessing step.

White matter tract estimation. We used the Slicer whitematteranalysis computational pipeline to fit a tensor and perform deterministic whole brain fiber tracking. Streamlines were bundled via registration to the ORG atlas (Zhang 2018), and parcellated into 74 white matter tracts (41 unique; see Supplementary Table 1). For each of these tracts, we estimated fractional anisotropy (FA), which indicates the coherence with which water molecules diffuse along tissue, and is held to be a proxy of white matter integrity (values closer to 1 are indicative of healthy tissue; those closer to 0 indicate probable pathology).
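As a reminder of what the FA value summarizes, the sketch below computes FA from the three eigenvalues of a diffusion tensor using the standard textbook definition. It is illustrative only: the tract-level FA values in this analysis came from the Slicer whitematteranalysis pipeline, and the function name `fractional_anisotropy` is a hypothetical helper, not part of that software.

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """FA of a diffusion tensor, given its three eigenvalues."""
    mean = (l1 + l2 + l3) / 3.0
    # Spread of the eigenvalues around their mean (numerator) ...
    num = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    # ... relative to their overall magnitude (denominator).
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0:
        return 0.0
    return math.sqrt(1.5 * num / den)

# Equal eigenvalues (isotropic diffusion) give FA = 0.
print(fractional_anisotropy(1.0, 1.0, 1.0))
# Diffusion along a single axis gives FA = 1, up to floating-point rounding.
print(fractional_anisotropy(1.0, 0.0, 0.0))
```

The two calls illustrate the endpoints of the 0-to-1 scale described above.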

Neurocognitive and social cognitive assessment. Neurocognitive status was assessed via the MATRICS (Measurement and Treatment Research to Improve Cognition in Schizophrenia) Cognitive Consensus Battery (MCCB), which provides an evaluation of key cognitive domains relevant to SSD. In this analysis, we included factor scores for Processing speed, Attention & vigilance, Working memory, Verbal learning, Visual learning, and Problem solving. Social cognitive ability was captured by total scores for the Reading the Mind in the Eyes Test (RMET), Relationships Across Domains (RAD), the Penn Emotion Recognition Test (ER-40), the Interpersonal Reactivity Index (IRI), and the Empathic Accuracy task (EA), and factor scores for the Awareness of Social Inference Test-Revised (TASIT). Importantly, all neurocognitive and social cognitive assessments were administered experimentally, and produce objective performance-based outcomes. Moreover, though all assessments are sensitive to impairment in SSD, a wide distribution of performance is evident across the SSD-HC continuum (no floor or ceiling performance).

Canonical Correlation Analysis. CCA reveals multivariate patterns of linked dimensions between two sets of variables, often denoted as an \(X\) (predictor) set and a \(Y\) (criterion) set. Our \(X\) set comprised 72 reliably-segmented white matter tracts. The \(Y\) set comprised 14 behavioural variables: 6 neurocognition variables and 8 social cognition variables, described above.
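For readers less familiar with the method, the first canonical function can be written out as follows. The weight vectors \(a\) and \(b\), and the set sizes \(p\) and \(q\), are notation introduced here for illustration only.

\[
(a_1, b_1) = \arg\max_{a,\,b} \operatorname{corr}(Xa,\; Yb), \qquad R_{c,1} = \operatorname{corr}(Xa_1,\; Yb_1)
\]

Each subsequent pair \((a_k, b_k)\) maximizes the same correlation, subject to its variates being uncorrelated with all earlier variates. The number of canonical functions is therefore \(\min(p, q)\); with \(p = 72\) tracts and \(q = 14\) behavioural variables, this gives \(\min(72, 14) = 14\).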

Results

Participants. Participant characteristics are presented in Table 1. Data from 162 participants (78 HC, 84 SSD) were available for analysis after excluding participants for failed eligibility, poor DWI data quality, and excessive missing observations (defined as ≥15/74 FA values (n=17 participants), ≥2/6 neurocognition scores (n=2), or ≥2/8 social cognition scores (n=0)). Missing data from participants with below-threshold missingness in the \(X\) (n=1503 tracts) and \(Y\) (n=7 neurocognition, n=27 social cognition) sets were imputed via multivariate imputation by chained equations (van Buuren, 2019). Note that, with 86 variables across 162 participants (roughly 1.9 observations per variable), the sample fell short of the recommended 10:1 observation-to-variable ratio for CCA, though not so far short that dimensionality reduction techniques (e.g., PCA) were thought warranted to avoid over-fitting. In general, our sample was young adult and well educated, though it included more male than female participants (63 F, 99 M), and the SSD participants were clinically stable.
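The exclusion rule for excessive missingness can be sketched as a simple per-participant check. This is a minimal illustration, assuming missing values are coded as `None`; the function and argument names are hypothetical and are not taken from the analysis code.

```python
def exceeds_missingness(fa, neuro, social,
                        fa_cutoff=15, neuro_cutoff=2, social_cutoff=2):
    """Return True if a participant should be excluded for missing data.

    Each argument is a list of scores in which None marks a missing value.
    Cutoffs follow the text: >=15 of 74 FA values, >=2 of 6 neurocognition
    scores, or >=2 of 8 social cognition scores.
    """
    def n_missing(values):
        return sum(v is None for v in values)

    return (n_missing(fa) >= fa_cutoff
            or n_missing(neuro) >= neuro_cutoff
            or n_missing(social) >= social_cutoff)

# Hypothetical participant: 14 missing FA values and 1 missing neurocognition
# score, so every count is below its cutoff and the participant is retained.
fa = [None] * 14 + [0.4] * 60
neuro = [None] + [50.0] * 5
social = [25.0] * 8
print(exceeds_missingness(fa, neuro, social))  # False
```

Participants who pass this check but still have some missing values are the ones whose data were subsequently imputed.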

Table 1. Participant characteristics (means and standard deviations).

	Combined, n=162	HC, n=78	SSD, n=84	p
Demographics
Sex (F:M)	63:99	39:39	24:60	.008
Age (years)	32.19 (10.03)	33.17 (10.37)	31.29 (9.68)	.234
Handedness (0=L, 1=R)	0.67 (0.45)	0.62 (0.50)	0.72 (0.39)	.176
Education (years)	15.05 (2.35)	16.15 (2.08)	14.02 (2.11)	<.001
BMI	27.60 (5.55)	26.69 (5.72)	28.44 (5.29)	.045
WTAR	109.36 (13.19)	113.03 (11.13)	105.95 (14.07)	.001
Neurocognition (MATRICS MCCB)
Processing speed	45.93 (14.50)	54.27 (9.82)	38.19 (13.88)	<.001
Attention and vigilance	44.13 (14.00)	47.96 (14.46)	40.57 (12.64)	.001
Working memory	44.37 (11.59)	46.64 (12.16)	42.26 (10.68)	.016
Verbal learning	45.34 (11.35)	50.99 (10.59)	40.10 (9.39)	<.001
Visual learning	42.62 (13.25)	48.13 (11.45)	37.51 (12.82)	<.001
Reasoning and problem solving	46.42 (11.65)	49.55 (10.09)	43.51 (12.29)	.001
Social cognition
ER-40	3461.12 (546.80)	3695.38 (368.08)	3243.59 (595.67)	<.001
RMET	25.68 (5.15)	27.05 (4.16)	24.40 (5.65)	.001
EA	0.47 (0.18)	0.53 (0.16)	0.41 (0.19)	<.001
RAD	55.90 (8.59)	60.36 (5.59)	51.76 (8.84)	<.001
IRI	66.51 (12.38)	65.76 (11.69)	67.21 (13.01)	.455
TASIT 1	23.40 (3.17)	24.78 (2.27)	22.12 (3.35)	<.001
TASIT 2	50.75 (7.72)	54.51 (4.40)	47.26 (8.50)	<.001
TASIT 3	50.94 (8.23)	54.31 (5.16)	47.81 (9.28)	<.001
MRI
Absolute motion	0.49 (0.41)	0.50 (0.43)	0.48 (0.40)	.755
Composite image quality score	-0.93 (1.39)	-1.12 (1.43)	-0.75 (1.34)	.095
Whole brain fractional anisotropy (FA)	0.37 (0.01)	0.38 (0.01)	0.37 (0.01)	.002
Clinical, functional, outcome
SPQ-B	5.11 (5.56)	1.47 (2.52)	8.49 (5.48)	<.001
CIRS-G	2.81 (3.00)	1.59 (2.00)	3.95 (3.31)	<.001
BPRS	–	–	24.88 (11.67)	–
SAS	–	–	556.35 (450.68)	–
QLS	–	–	31.18 (7.82)	–
SANS	–	–	2.21 (2.6)	–
Medication
CPZE equivalents (mg / day)	–	–	69.96 (20.23)	–
Note:
The BPRS, SAS, QLS, and SANS were only administered to participants with SSD; all neurocognition scores are MATRICS MCCB factor scores and have been normed for age and sex (Nuechterlein et al., 2008); HC = healthy control, SSD = schizophrenia spectrum disorder, BMI = body mass index, WTAR = Wechsler Test of Adult Reading, SPQ-B = Schizotypal Personality Questionnaire-Brief, CIRS-G = Cumulative Illness Rating Scale - Geriatric, BPRS = Brief Psychiatric Rating Scale, SAS = Simpson-Angus Scale, QLS = Quality of Life Scale, SANS = Scale for the Assessment of Negative Symptoms, CPZE = chlorpromazine equivalent

Statistical preprocessing. All statistical tests were run with R version 3.5.3. We regressed age, sex, and a DWI scan quality composite score out of the \(X\) set, and age and sex out of the social cognition scores in the \(Y\) set, and used neurocognition scores normed for age and sex. For ease of interpretation, all variables in the \(X\) and \(Y\) sets were standardized via z-scoring. Though the variable sets were not found to be univariate or multivariate normal, their observed distributions, in conjunction with the sample size, were sufficient to assume that the properties of the central limit theorem were in effect and thus that the assumptions of CCA were satisfied. We also examined raw correlations within and between the \(X\) and \(Y\) sets, to affirm our conceptual grouping. As expected, we found small-to-moderate positive correlations between brain variables (\(R_{xx}\) matrix mean r=.21, range=-.40 to .79) and moderate-to-strong correlations between neurocognitive and social cognitive variables (\(R_{yy}\) matrix mean r=.38, range=-.04 to .69; see Supplementary Figure 1). We also found that several variables in the \(R_{xy}\) matrix showed some multicollinearity, as visualized by an adjusted \(R_{xy\omega}\) matrix (see Supplementary Figure 2).
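The covariate adjustment and standardization steps can be sketched as below. The analysis itself was run in R; this Python/NumPy version is an illustrative equivalent on simulated data, and the helper names `residualize` and `zscore` are assumptions rather than functions from the analysis code.

```python
import numpy as np

def residualize(data, covariates):
    """Remove the linear effect of the covariates from each column of data."""
    n = data.shape[0]
    design = np.column_stack([np.ones(n), covariates])  # intercept + covariates
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta

def zscore(data):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# Simulated stand-ins for the real data (values are arbitrary).
rng = np.random.default_rng(0)
n = 162
covs = np.column_stack([rng.normal(32, 10, n),      # age
                        rng.integers(0, 2, n),      # sex
                        rng.normal(-0.9, 1.4, n)])  # scan quality composite
fa = 0.4 + 0.001 * covs[:, [0]] + rng.normal(0, 0.02, (n, 72))  # 72 toy tracts

x = zscore(residualize(fa, covs))
```

After this step each column of `x` has mean 0 and standard deviation 1, and is linearly uncorrelated with each covariate.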

Selection of canonical variates. The CCA produced 14 canonical functions (equal to the number of variables in the smaller \(Y\) set). First, we employed a permutation test (500 permutations) to evaluate the null hypothesis of no correlation between the \(X\) and \(Y\) sets. Wilks’ lambda gave no evidence of a correlation between the sets (p=.35). Similar values were seen when employing the Hotelling-Lawley Trace (p=.232), the Pillai-Bartlett Trace (p=.476), and Roy’s Largest Root (p=.052), with all canonical correlations included in all models. We proceeded to review the functions stepwise for significance, magnitude, and redundancy (Hair, Anderson, and Tatham, 1998). Using Wilks’ lambda, we found that no functions were significant; for the first function, which by definition falls closest to significance, \(\lambda\)<.01, F(1008, 1112.5)=1.02, p=.360. The Stewart-Love redundancy index (i.e., variance in one set that can be explained by the other set) was 1% for the \(X\) set and 0.5% for the \(Y\) set. Though they should not be interpreted due to lack of significance, Table 2 shows standardized canonical function coefficients (\(\beta\)), structure coefficients (\(r_s\)), and squared structure coefficients \(r_s^2\) (here analogous with communalities) for the first derived function. Canonical weights (\(\beta\)) are included for illustrative purposes, and should not be interpreted even if the model were significant, as they are unstable in the face of multicollinearity, which we have already noted in the \(R_{xy}\) matrix (Supplementary Figure 2). Interestingly, there was a very strong correlation, \(R_c\)=.866, between participants’ CCA scores in the \(X\) and \(Y\) sets for the first function, albeit non-significant (see Figure 2 for a visualization of the association).
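The Wilks' lambda statistic and its F approximation can be sketched as follows, assuming Rao's approximation is the one underlying the reported F test. With 72 \(X\) variables, 14 \(Y\) variables, and 162 participants, the formula gives the degrees of freedom reported above, F(1008, 1112.5). The function names are hypothetical, and the lambda value passed in the example is a placeholder, not the study's estimate.

```python
import math

def wilks_lambda(canonical_correlations):
    """Wilks' lambda for the full model: the product of (1 - R_c^2)."""
    lam = 1.0
    for r in canonical_correlations:
        lam *= 1.0 - r ** 2
    return lam

def rao_f(lam, p, q, n):
    """Rao's F approximation for Wilks' lambda.

    p, q = number of variables in the X and Y sets; n = sample size.
    Assumes p**2 + q**2 != 5. Returns (F, df1, df2).
    """
    w = n - 1 - (p + q + 1) / 2.0
    t = math.sqrt((p ** 2 * q ** 2 - 4) / (p ** 2 + q ** 2 - 5))
    df1 = p * q
    df2 = w * t - p * q / 2.0 + 1
    root = lam ** (1.0 / t)
    f = (1 - root) / root * df2 / df1
    return f, df1, df2

# Degrees of freedom depend only on p, q, and n, so a placeholder lambda is used.
_, df1, df2 = rao_f(0.5, p=72, q=14, n=162)
print(df1, round(df2, 1))  # 1008 1112.5
```

Because lambda is a product over all 14 functions, it can be very small even when no individual function is significant, which is why the F approximation rather than lambda itself carries the inference.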

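Table 2. Standardized canonical function coefficients, structure coefficients, and squared structure coefficients for the first canonical function.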

	Function 1
Variable	\(\beta\)	\(r_s\)	\(r_s^2\) (\(h^2\))
X
AF (R)	-0.144	0.02	0.000
AF (L)	0.137	-0.073	0.005
CB (R)	0.090	-0.22	0.049
CB (L)	-0.169	-0.165	0.027
CC1 (C)	-0.086	-0.116	0.014
CC2 (C)	0.089	-0.14	0.020
CC3 (C)	-0.191	-0.102	0.010
CC4 (C)	-0.086	-0.029	0.001
CC5 (C)	0.372	0.148	0.022
CC6 (C)	-0.008	-0.033	0.001
CC7 (C)	0.079	-0.007	0.000
CPC (L)	0.302	0.084	0.007
CPC (R)	-0.259	-0.316	0.100
CR.F (L)	0.309	0.075	0.006
CR.F (R)	0.154	0.006	0.000
CR.P (R)	0.048	-0.028	0.001
CR.P (L)	-0.045	0.018	0.000
CST (L)	0.298	0.009	0.000
CST (R)	-0.065	-0.054	0.003
EC (R)	0.005	-0.081	0.007
EC (L)	0.015	-0.067	0.005
EmC (L)	-0.020	0.006	0.000
EmC (R)	0.219	0.037	0.001
ICP (L)	-0.274	-0.312	0.097
ICP (R)	0.020	-0.145	0.021
ILF (R)	0.136	0.127	0.016
ILF (L)	-0.165	-0.026	0.001
Intra.CBLM.I.P (R)	-0.318	-0.118	0.014
Intra.CBLM.I.P (L)	0.070	0.029	0.001
Intra.CBLM.PaT (L)	0.127	0.028	0.001
Intra.CBLM.PaT (R)	-0.284	-0.156	0.024
IOFF (L)	-0.144	-0.073	0.005
IOFF (R)	0.093	0.185	0.034
MCP (C)	0.022	-0.009	0.000
MdLF (R)	-0.352	0.028	0.001
MdLF (L)	0.125	-0.061	0.004
PLIC (R)	0.328	-0.029	0.001
PLIC (L)	-0.192	-0.211	0.044
SF (R)	0.186	-0.087	0.008
SF (L)	-0.202	-0.199	0.040
SLF.II (L)	0.318	0.184	0.034
SLF.II (R)	-0.137	-0.064	0.004
SLF.III (R)	-0.249	-0.193	0.037
SLF.III (L)	-0.114	-0.1	0.010
SO (L)	-0.211	-0.071	0.005
SO (R)	0.021	0.085	0.007
SP (R)	-0.148	-0.13	0.017
SP (L)	-0.279	-0.19	0.036
Sup.F (R)	-0.170	-0.09	0.008
Sup.F (L)	0.254	-0.148	0.022
Sup.FP (L)	-0.196	-0.112	0.012
Sup.FP (R)	-0.061	-0.187	0.035
Sup.O (L)	-0.039	-0.039	0.002
Sup.O (R)	-0.098	0.074	0.006
Sup.OT (R)	0.205	0.237	0.056
Sup.OT (L)	0.017	-0.109	0.012
Sup.P (L)	-0.001	-0.05	0.003
Sup.P (R)	0.382	0.023	0.001
Sup.PO (L)	-0.321	-0.257	0.066
Sup.PO (R)	0.072	0.057	0.003
Sup.PT (R)	0.187	0.123	0.015
Sup.PT (L)	-0.010	0.059	0.003
Sup.T (R)	0.094	0.08	0.006
Sup.T (L)	0.048	0.052	0.003
TF (L)	-0.462	-0.073	0.005
TF (R)	0.014	-0.106	0.011
TO (L)	0.174	0.053	0.003
TO (R)	-0.021	-0.106	0.011
TP (L)	0.051	0.022	0.000
TP (R)	-0.022	0.001	0.000
UF (L)	0.339	0.069	0.005
UF (R)	-0.327	-0.209	0.044
Y
Processing speed	-0.475	-0.461	0.212
Attention & vigilance	-0.459	-0.482	0.233
Working memory	-0.297	-0.286	0.082
Verbal learning	-0.181	-0.352	0.124
Visual learning	-0.181	-0.423	0.179
Problem solving	0.359	-0.152	0.023
RMET	0.217	-0.013	0.000
RAD	0.025	-0.143	0.020
ER-40	0.470	0.086	0.007
TASIT 1	-0.267	-0.164	0.027
TASIT 2	-0.563	-0.265	0.070
TASIT 3	0.698	0.089	0.008
IRI	0.055	0.201	0.040
EA	0.464	0.192	0.037
Note:
Structure coefficients with absolute value \(|r_s|\) > .450 are emphasized, following convention; \(\beta\) = standardized canonical function coefficient; \(r_s\) = structure coefficient; \(r_s^2\) = squared structure coefficient, here also communality \(h^2\)
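To make the distinction between the columns of Table 2 concrete, the sketch below shows how structure coefficients are obtained once canonical weights are available: each observed variable is correlated with the canonical variate formed from its own set. The data and weights here are simulated stand-ins, not the values in Table 2, and `structure_coefficients` is a hypothetical helper name.

```python
import numpy as np

def structure_coefficients(data, weights):
    """Correlate each observed variable with the canonical variate of its set."""
    variate = data @ weights  # synthetic (canonical) variable, one score per row
    r_s = np.array([np.corrcoef(data[:, j], variate)[0, 1]
                    for j in range(data.shape[1])])
    return r_s, r_s ** 2

rng = np.random.default_rng(1)
y = rng.normal(size=(162, 14))   # simulated Y set
y[:, 1] += y[:, 0]               # make two variables collinear
beta = rng.normal(size=14)       # stand-in weights, for illustration only
r_s, r_s2 = structure_coefficients(y, beta)
```

Unlike the weights, each structure coefficient is a simple bivariate correlation, so it does not depend on how the other variables in the set share variance with that variable.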

Validation. Lastly, we validated our model via iterative feature removal, i.e., we removed one feature at a time from the combined p=86 features of the \(X\) and \(Y\) sets, re-ran the CCA, and compared the derived canonical correlation coefficients. This procedure leverages the CCA property that each variable relates to all other variables in both sets: if the model is stable, canonical correlation estimates will remain similar; if unstable, estimates will vary widely. Values were similar across all iterations, suggesting that our model is stable (Supplementary Figure 3).
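The iterative feature removal procedure can be sketched as a loop that drops one column at a time and recomputes the canonical correlations. This is a minimal NumPy illustration on simulated data with fewer variables than the real sets; it computes canonical correlations through a QR decomposition, which may differ from the routine used in the R analysis, and all function names are hypothetical.

```python
import numpy as np

def canonical_correlations(x, y):
    """Canonical correlations via QR decomposition of the centred sets."""
    qx, _ = np.linalg.qr(x - x.mean(axis=0))
    qy, _ = np.linalg.qr(y - y.mean(axis=0))
    return np.linalg.svd(qx.T @ qy, compute_uv=False)

def leave_one_feature_out(x, y):
    """Re-run the CCA once per feature, each time dropping that feature."""
    results = []
    for j in range(x.shape[1]):   # drop each X variable in turn
        results.append(canonical_correlations(np.delete(x, j, axis=1), y))
    for j in range(y.shape[1]):   # then drop each Y variable in turn
        results.append(canonical_correlations(x, np.delete(y, j, axis=1)))
    return results

rng = np.random.default_rng(2)
x = rng.normal(size=(162, 10))   # simulated, smaller than the real X set
y = rng.normal(size=(162, 4))    # simulated, smaller than the real Y set
full = canonical_correlations(x, y)
loo = leave_one_feature_out(x, y)
```

Comparing each entry of `loo` against `full` gives the kind of stability check described above; note that dropping a variable from the smaller set also reduces the number of canonical functions by one.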

Exploratory analysis. Though increasing evidence suggests that neurocognition and social cognition are dimensional constructs on which HC and SSD fall along a continuum, as opposed to discrete classes, we sought to verify that HC and SSD participants exhibited a comparable pattern of correlation within and between the \(X\) and \(Y\) sets. We observed that the respective \(R_{xx}\), \(R_{yy}\), and \(R_{xy}\) matrices were very similar across the HC and SSD groups.

Discussion

Summary. We performed a CCA on a large \(X\) set of white matter tracts and a small \(Y\) set of neurocognitive and social cognitive variables, across 162 participants along the HC-SSD spectrum. The CCA revealed no significant relationship between these sets, though CCA scores on the first canonical function were highly correlated, \(R_c\)=.866. Thus, it appears that the structural integrity of white matter tracts is generally predictive of neurocognitive and social cognitive ability. However, evaluating the strength of correlations between the \(X\) and \(Y\) sets suggests that no particular variables within each set contributed uniquely to the correlation. This may be an upshot of other features of the data distributions (e.g., non-normality) and/or derivation (e.g., pre-processing).

The null results of the present analysis are likely also a consequence of the number of features included in the \(X\) set. By way of contrast, we ran an otherwise comparable analysis with just six brain features (bilateral AF, UF, and ILF) and found one significant component, \(\lambda\)=.397, F(48, 382.93)=1.65, p=.006, with a large magnitude, \(R_c\)=.571. In that analysis, examination of structure coefficients (\(r_s\)) (which reveal the correlation between a given variable and the synthetic set to which it belongs, irrespective of the contribution of other variables) for the \(X\) set revealed that FA values in the right ILF contributed most highly (\(r_s\)=.832), followed by the left ILF (\(r_s\)=.509) and right AF (\(r_s\)=.464). In contrast, the left AF and right UF made modest contributions, and the left UF a negligible contribution, suggesting these tracts may not be very strongly related to the synthetic variable combining neurocognition and social cognition. In that analysis, examination of the \(Y\) set also showed some very high structure coefficients. Specifically, the social cognition variables (in that analysis, represented by Simulation (\(r_s\)=.911) and Mentalizing (\(r_s\)=.943) factor scores; Oliver, 2019) made substantive contributions. Additionally, four of six neurocognition variables made large contributions: Processing speed (\(r_s\)=.593), Working memory (\(r_s\)=.512), Verbal learning (\(r_s\)=.547), and Visual learning (\(r_s\)=.489). All of the \(Y\) set structure coefficients had the same sign, indicating that all variables are positively related. The Attention & vigilance and Problem solving factors did not show high structure coefficients, suggesting that those factors were not strongly related to white matter integrity in the included tracts, and providing evidence that neurocognition has a multidimensional representation in the brain, as well as in behaviour.

Conclusion

We used the multivariate technique CCA to uncover links between integrity of white matter microstructure (operationalized as FA) and domains of neurocognition and social cognition. To the best of our knowledge, this is the first study to examine how tracts implicated in social cognition may also subserve select facets of neurocognition, in a large sample of SSD and HCs.

At least one aspect of CCA makes it particularly attractive to the task of elucidating brain and behaviour links in psychiatry: it holistically integrates observations from transdiagnostic groups (here, HC and SSD), and thus avoids the increasingly noted pitfalls of case-control design. Accumulating evidence suggests that the neurobiology underlying particular deficits in SSD is also evident in HC. Our results suggest that this relationship holds for the association between white matter microstructure and neurocognitive and social cognitive deficits in both HC and SSD. Future research should further test this association transdiagnostically, for example, by including data from other disorders with noted neurocognitive and social cognitive deficits.

Several limitations should be noted, both of the present study design and the CCA statistical method. In relation to the former, perhaps the greatest limitation is that our current sample size (limited by use of only data from the PRISMA scanners, until we implement harmonization) is insufficient to be confident in our conclusions. Relatedly, obtaining an independent replication sample is essential for cross-validation, i.e., the ability to assess the model’s ability to generalize to unseen, unrelated samples. Pertaining to methods, it must be remembered that as a linear model, CCA assumes additive covariation patterns, which may prove a simplification of brain-behaviour relationships. Finally, CCA is ‘merely’ correlational; investigating the cause(s) of the reported correlations remains a challenging issue for future experimental work.

Nonetheless, multivariate analyses that illuminate the shared and distinct neural basis of disabling deficits, such as neurocognitive and social cognitive impairment, are sure to accelerate the design of novel, biologically-targeted treatments for SSD, and psychiatry more broadly.

Code availability. All analysis code is available on GitHub (private repo until code review): github.com/navonacalarco/thesis/tree/master/SPINS/analyses

Relevant sub-analyses.

(1) Summary of psychotropic medications
(2) Calculation of CPZE
(3) Automated DWI QC procedure – description
(4) Automated DWI QC procedure – results
(5) Review of data distribution – missing values – \(X\) set
(6) Review of data distribution – univariate outliers – \(Y\) set
(7) Review of multivariate and univariate normality – \(X\) set
(8) Review of multivariate and univariate normality – \(Y\) set
(9) Multivariate interpolation of missing data – \(X\) set
(10) Multivariate interpolation of missing data – \(Y\) set
(11) Review of site effect in prospectively harmonized data
(12) RISH harmonization: participant matching
(13) RISH harmonization: participant characteristics
(14) RISH harmonization: problem with brain extraction, requiring new preprocessing
(15) Comparable analysis with smaller p x n for statistics class

References

Behdinan, T., Foussias, G., Wheeler, A.L., Stefanik, L., Felsky, D., Remington, G., Rajji, T.K., Chakravarty, M.M., Voineskos, A.N. (2015). Neuroimaging predictors of functional outcomes in schizophrenia at baseline and 6-month follow up. Schizophrenia Research, 169(1), 69-75.

Hair, J.F. Jr., Anderson, R.E., Tatham, R.L., Black, W.C. (1998). Chapter 8. Multivariate Data Analysis, 5th edition. Prentice Hall.

Oliver, L.D., Haltigan, J.D., Gold, J.M., Foussias, G., DeRosse, P., Buchanan, R.W., Malhotra, A.K., Voineskos, A.N. (2019). Lower- and higher-level social cognitive factors across individuals with Schizophrenia Spectrum Disorders and healthy controls: Relationship with neurocognition and functional outcome. Schizophrenia Bulletin, 45(3), 629-638.

van Buuren, S. (2019). mice: Multivariate Imputation by Chained Equations. R package version 3.7.0. https://cran.r-project.org/web/packages/mice/mice.pdf

Voineskos, A.N., Foussias, G., Lerch, J., Felsky, D., Remington, G., Rajji, T.K., Lobaugh, N., Pollock, B.G., Mulsant, B.H. (2013). Neuroimaging evidence for the deficit subtype of schizophrenia. JAMA Psychiatry, 70(5), 472-480.

Wheeler, A.L., Wessa, M., Szeszko, P.R., Foussias, G., Chakravarty, M.M., Lerch, J.P., DeRosse, P., Remington, G., Mulsant, B.H., Linke, J., Malhotra, A.K., Voineskos, A.N. (2015). Further Neuroimaging Evidence for the Deficit Subtype of Schizophrenia: A Cortical Connectomics Analysis. JAMA Psychiatry, 72(5), 446-455.

Zhang, F., Wu, Y., Norton, I., Rathi, Y., Makris, N., O’Donnell, L.J. (2018). An anatomically curated fiber clustering white matter atlas for consistent white matter tract parcellation across the lifespan. NeuroImage, 179, 429-447.

Supplementary Tables

Supplementary Table 1. List of the 41 unique white matter tracts (74 when left and right hemisphere tracts are counted separately) parcellated using the ORG atlas in Slicer.

	Abbreviation	Full name	Hemisphere
Association tracts
1	AF	arcuate fasciculus	LR
2	CB	cingulum bundle	LR
3	EC	external capsule	LR
4	EmC	extreme capsule	LR
5	ILF	inferior longitudinal fasciculus	LR
6	IoFF	inferior occipito-frontal fasciculus	LR
7	MdLF	middle longitudinal fasciculus	LR
8	PLIC	posterior limb of internal capsule	LR
9	SLF I	superior longitudinal fasciculus I	LR
10	SLF II	superior longitudinal fasciculus II	LR
11	SLF III	superior longitudinal fasciculus III	LR
12	UF	uncinate fasciculus	LR
Cerebellar tracts
13	CPC	cortico-ponto-cerebellar	LR
14	ICP	inferior cerebellar peduncle	LR
15	Intra-CBLM-I&P	intracerebellar input and Purkinje tract	LR
16	Intra-CBLM-PaT	intracerebellar parallel tract	LR
17	MCP	middle cerebellar peduncle	C
Commissural tracts
18	CC 1	corpus callosum 1	C
19	CC 2	corpus callosum 2	C
20	CC 3	corpus callosum 3	C
21	CC 4	corpus callosum 4	C
22	CC 5	corpus callosum 5	C
23	CC 6	corpus callosum 6	C
24	CC 7	corpus callosum 7	C
Projection tracts
25	CST	corticospinal tract	LR
26	CR-F	corona-radiata-frontal	LR
27	CR-P	corona-radiata-parietal	LR
28	SF	striato-frontal	LR
29	SO	striato-occipital	LR
30	SP	striato-parietal	LR
31	TF	thalamo-frontal	LR
32	TO	thalamo-occipital	LR
33	TP	thalamo-parietal	LR
Superficial tracts
34	Sup-F	superficial-frontal	LR
35	Sup-FP	superficial-frontal-parietal	LR
36	Sup-O	superficial-occipital	LR
37	Sup-OT	superficial-occipital-temporal	LR
38	Sup-P	superficial-parietal	LR
39	Sup-PO	superficial-parietal-occipital	LR
40	Sup-PT	superficial-parietal-temporal	LR
41	Sup-T	superficial-temporal	LR
Note:
LR = left and right, C = commissural

Supplementary Figures

Supplementary Figure 1. Correlation matrices for the \(X\) and \(Y\) sets. A. The \(R_{xx}\) matrix (white matter FA) generally shows small-to-moderate positive correlations between tracts. The order of tracts is alphabetical along the axes, as captured in Supplementary Table 1. B. The \(R_{yy}\) matrix (neurocognition and social cognition variables) shows moderate-to-strong positive correlations between all variables.

Supplementary Figure 2. Cross-correlation, omega, and difference matrices. A. The \(R_{xy}\) matrix shows cross-correlations between variables in the \(X\) and \(Y\) sets. We see some evidence of multicollinearity. B. The omega matrix shows the \(R_{xy}\) matrix adjusted for redundancy. The omega matrix is calculated as the product of the inverse of the Cholesky factorization of the \(R_{xx}\) matrix, the \(R_{xy}\) matrix, and the inverse of the Cholesky factorization of the \(R_{yy}\) matrix. C. The difference matrix shows the magnitude of difference between the original \(R_{xy}\) and omega matrices.
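The omega matrix calculation can be sketched as below, on the assumption that it is the cross-correlation matrix whitened on both sides by Cholesky factors of the within-set correlation matrices. The example uses simulated data and a hypothetical function name, `omega_matrix`; NumPy returns lower-triangular Cholesky factors, so the transposes are placed accordingly.

```python
import numpy as np

def omega_matrix(x, y):
    """Whitened cross-correlation matrix: L_x^{-1} R_xy L_y^{-T}."""
    p = x.shape[1]
    r = np.corrcoef(np.hstack([x, y]), rowvar=False)
    r_xx, r_xy, r_yy = r[:p, :p], r[:p, p:], r[p:, p:]
    l_x = np.linalg.cholesky(r_xx)          # r_xx = l_x @ l_x.T
    l_y = np.linalg.cholesky(r_yy)          # r_yy = l_y @ l_y.T
    left = np.linalg.solve(l_x, r_xy)       # l_x^{-1} @ r_xy
    return np.linalg.solve(l_y, left.T).T   # left @ l_y^{-T}

rng = np.random.default_rng(3)
x = rng.normal(size=(162, 6))                     # simulated X set
y = 0.5 * x[:, :3] + rng.normal(size=(162, 3))    # simulated, related Y set
omega = omega_matrix(x, y)
r_c = np.linalg.svd(omega, compute_uv=False)      # canonical correlations
```

A useful property of this construction is that the singular values of the omega matrix are the canonical correlations, which ties the adjusted matrix directly to the CCA solution.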

Supplementary Figure 3. Model validation. We validated our model using iterative feature removal, i.e., we removed one of the 86 features from the combined \(X\) and \(Y\) sets, ran the CCA, and compared the derived canonical correlation coefficients. The plot shows the 86 comparisons (black dots), as well as the original model (red dots), across all 14 canonical functions. We see that, though iteratively removing a variable does alter the canonical correlation coefficients, the change is not drastic, nor statistically significant.