Date of PAC: 2021-04-12
Date last code update: 2021-04-05
Date last run: 2021-04-11



Analysis Overview

Test and replication samples. We have finalized a ‘test’ and a ‘replication’ sample from the SPINS dataset. The test sample comprises n=135 participants scanned on the CMH GE scanner (89 SSD, 87 men). The replication sample comprises n=173 participants scanned on the CMP, MRP, and ZHP PRISMA scanners combined (91 SSD, 103 men).

Participant characteristics. Participant characteristics are similar across the two samples. In both samples, SSD participants have lower education and WTAR scores than HC participants. SSD participants also show lower performance on all nonsocial and social cognition tests (with the exception of TASIT 1 in the GE sample, and TASIT 2 sincere in both the GE and PRISMA samples). Group differences in white matter tracts are less consistent: in the PRISMA sample, all tracts except the left and right UF distinguish SSD and HC, whereas in the GE sample only the right TF, CC3, CC4, and CC6 do.

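GE (test sample, n=135). Values are mean (SD) unless otherwise noted; p values printed as .000 denote p < .001.
Variable Combined HC SSD p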
Demographics
Sex (F:M) 48:87 20:26 28:61 .233
Age (years) 27.51 (7.68) 27.17 (8.34) 27.69 (7.36) .715
Handedness (0=L, 1=R) 0.67 (0.45) 0.70 (0.41) 0.65 (0.47) .538
Education (years) 14.19 (2.13) 15.48 (1.82) 13.52 (1.97) .000
Parental education 15.73 (2.96) 16.26 (2.49) 15.44 (3.15) .130
WTAR 113.04 (11.78) 116.00 (9.07) 111.52 (12.73) .036
Neurocognition (MATRICS MCCB)
Processing speed 45.38 (12.52) 52.22 (9.87) 41.84 (12.31) .000
Attention and vigilance 41.41 (11.26) 47.80 (9.29) 38.10 (10.80) .000
Working memory 45.22 (11.97) 52.17 (10.21) 41.63 (11.25) .000
Verbal learning 44.35 (9.58) 49.15 (8.78) 41.87 (9.06) .000
Visual learning 45.01 (11.28) 50.04 (7.70) 42.42 (11.97) .000
Reasoning and problem solving 44.64 (9.61) 47.48 (8.85) 43.17 (9.71) .013
Social cognition
ER-40 -2038.66 (573.77) -1756.65 (275.21) -2184.42 (632.07) .000
RMET 26.51 (4.48) 28.33 (3.35) 25.57 (4.71) .001
EA 0.84 (0.23) 0.93 (0.13) 0.79 (0.26) .001
RAD 56.27 (9.43) 61.26 (5.85) 53.69 (9.91) .000
TASIT 1 24.19 (2.78) 24.72 (2.06) 23.91 (3.06) .110
TASIT 2 sincere 17.18 (2.81) 17.33 (2.99) 17.10 (2.73) .662
TASIT 2 simple sarcasm 16.28 (4.13) 18.13 (2.04) 15.33 (4.60) .000
TASIT 2 paradoxical sarcasm 17.22 (3.30) 18.67 (2.07) 16.47 (3.56) .000
TASIT 3 lies 26.12 (4.20) 28.22 (2.67) 25.03 (4.44) .000
TASIT 3 sarcasm 25.47 (4.58) 27.33 (3.82) 24.52 (4.66) .001
White matter
AF left 0.60 (0.02) 0.61 (0.02) 0.60 (0.03) .730
AF right 0.57 (0.02) 0.57 (0.03) 0.57 (0.02) .875
CB left 0.50 (0.03) 0.50 (0.02) 0.50 (0.03) .467
CB right 0.49 (0.03) 0.49 (0.03) 0.48 (0.03) .072
ILF left 0.51 (0.02) 0.52 (0.02) 0.51 (0.02) .114
ILF right 0.51 (0.02) 0.52 (0.02) 0.51 (0.02) .651
IOFF left 0.63 (0.03) 0.63 (0.03) 0.63 (0.02) .993
IOFF right 0.63 (0.02) 0.63 (0.02) 0.62 (0.02) .464
UF left 0.48 (0.03) 0.48 (0.03) 0.49 (0.03) .536
UF right 0.46 (0.03) 0.46 (0.03) 0.46 (0.03) .917
TF left 0.49 (0.01) 0.50 (0.01) 0.49 (0.01) .122
TF right 0.48 (0.01) 0.49 (0.01) 0.48 (0.01) .049
CC1 0.51 (0.03) 0.51 (0.02) 0.51 (0.03) .635
CC2 0.57 (0.02) 0.57 (0.02) 0.57 (0.02) .350
CC3 0.59 (0.02) 0.60 (0.02) 0.59 (0.02) .043
CC4 0.60 (0.02) 0.60 (0.01) 0.60 (0.02) .047
CC5 0.60 (0.02) 0.60 (0.02) 0.60 (0.02) .819
CC6 0.61 (0.01) 0.62 (0.01) 0.61 (0.01) .026
CC7 0.61 (0.02) 0.62 (0.02) 0.61 (0.02) .210
Functional outcome, health, clinical
BSFS 151.09 (27.02) 171.17 (21.29) 140.71 (23.66) .000
CIRS-G 2.54 (2.27) 1.91 (1.35) 2.87 (2.57) .020
BPRS – – 29.96 (6.65) –
SAS – – 22.58 (12.09) –
QLS – – 79.77 (20.03) –
SANS – – 2.57 (2.23) –
Medication
CPZE equivalents (mg / day) – – 321.96 (290.86) –
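PRISMA (replication sample, n=173). Values are mean (SD) unless otherwise noted; p values printed as .000 denote p < .001.
Variable Combined HC SSD p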
Demographics
Sex (F:M) 70:103 41:41 29:62 .023
Age (years) 32.65 (10.56) 33.06 (10.48) 32.27 (10.67) .626
Handedness (0=L, 1=R) 0.67 (0.44) 0.61 (0.50) 0.72 (0.38) .090
Education (years) 14.99 (2.38) 16.13 (2.10) 13.96 (2.14) .000
Parental education 15.42 (3.07) 15.69 (2.76) 15.17 (3.34) .272
WTAR 109.28 (12.28) 113.48 (11.11) 105.32 (12.10) .001
Neurocognition (MATRICS MCCB)
Processing speed 45.86 (14.52) 54.26 (10.09) 38.30 (13.75) .000
Attention and vigilance 43.88 (14.22) 47.55 (15.29) 40.58 (12.36) .001
Working memory 44.65 (11.52) 46.95 (12.08) 42.57 (10.64) .012
Verbal learning 45.14 (11.13) 50.93 (10.33) 39.93 (9.11) .000
Visual learning 42.36 (12.84) 48.02 (11.02) 37.26 (12.26) .000
Reasoning and problem solving 46.30 (11.61) 49.57 (10.04) 43.35 (12.19) .000
Social cognition
ER-40 -2108.40 (608.06) -1848.78 (374.13) -2342.35 (681.08) .000
RMET 25.82 (4.92) 27.13 (4.07) 24.63 (5.32) .001
EA 0.73 (0.30) 0.83 (0.26) 0.63 (0.30) .000
RAD 55.83 (8.45) 60.18 (5.61) 51.91 (8.68) .000
TASIT 1 23.41 (3.14) 24.79 (2.25) 22.16 (3.31) .000
TASIT 2 sincere 17.13 (3.07) 17.49 (2.53) 16.81 (3.46) .149
TASIT 2 simple sarcasm 16.83 (3.99) 18.62 (1.97) 15.22 (4.63) .000
TASIT 2 paradoxical sarcasm 16.77 (3.58) 18.29 (2.26) 15.40 (3.99) .000
TASIT 3 lies 25.40 (4.66) 26.38 (4.14) 24.53 (4.94) .009
TASIT 3 sarcasm 25.36 (5.03) 27.63 (3.10) 23.32 (5.54) .000
White matter
AF left 0.53 (0.04) 0.53 (0.04) 0.52 (0.04) .015
AF right 0.52 (0.04) 0.53 (0.03) 0.51 (0.04) .002
CB left 0.44 (0.03) 0.44 (0.03) 0.43 (0.03) .005
CB right 0.43 (0.03) 0.43 (0.03) 0.42 (0.03) .002
ILF left 0.47 (0.03) 0.48 (0.03) 0.46 (0.03) .000
ILF right 0.47 (0.03) 0.48 (0.03) 0.46 (0.03) .000
IOFF left 0.54 (0.04) 0.55 (0.04) 0.53 (0.04) .001
IOFF right 0.55 (0.04) 0.55 (0.03) 0.54 (0.04) .001
UF left 0.43 (0.03) 0.44 (0.03) 0.43 (0.03) .686
UF right 0.44 (0.03) 0.44 (0.03) 0.44 (0.03) .690
TF left 0.46 (0.02) 0.46 (0.02) 0.45 (0.02) .009
TF right 0.46 (0.02) 0.46 (0.02) 0.46 (0.02) .006
CC1 0.47 (0.03) 0.48 (0.02) 0.46 (0.03) .004
CC2 0.53 (0.03) 0.54 (0.02) 0.53 (0.03) .000
CC3 0.55 (0.03) 0.56 (0.02) 0.55 (0.03) .001
CC4 0.57 (0.03) 0.57 (0.02) 0.56 (0.03) .003
CC5 0.57 (0.03) 0.58 (0.02) 0.57 (0.03) .022
CC6 0.57 (0.02) 0.58 (0.02) 0.57 (0.02) .002
CC7 0.58 (0.02) 0.59 (0.02) 0.58 (0.02) .002
Functional outcome, health, clinical
BSFS 154.06 (29.94) 175.79 (19.10) 134.47 (23.80) .000
CIRS-G 2.89 (3.09) 1.55 (1.97) 4.10 (3.42) .000
BPRS – – 31.16 (7.88) –
SAS – – 25.96 (12.07) –
QLS – – 69.03 (20.57) –
SANS – – 2.32 (2.63) –
Medication
CPZE equivalents (mg / day) – – 523.11 (444.68) –

CCA. We conducted separate CCAs on the test and replication samples, each including 35 variables. In both samples, the \(X\) set comprises FA estimates in 19 white matter tracts, representing association, projection, and commissural fibers (pictured below, bilateral association and projection), and the \(Y\) set comprises 16 nonsocial and social cognition variables, from the MATRICS MCCB and SPINS battery, respectively. Thus, the test sample has an ~3.86:1 observation:feature ratio, and the replication sample has an ~4.94:1 observation:feature ratio.
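As a quick check on the ratios above, the arithmetic can be sketched as follows. The variable names are illustrative only; the sample sizes and variable counts are the ones stated in this section.

```python
# Observation:feature ratios for the two CCAs, using the counts stated above.
n_x_vars = 19  # FA estimates in white matter tracts (X set)
n_y_vars = 16  # nonsocial and social cognition variables (Y set)
n_features = n_x_vars + n_y_vars

sample_sizes = {"GE (test)": 135, "PRISMA (replication)": 173}

# Participants per variable entered into each CCA.
ratios = {name: n / n_features for name, n in sample_sizes.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}:1")
```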


CCA results

[1] Canonical correlations and redundancies

Both samples show similar canonical correlation \((R_c)\) values across the derived variates. Specifically, the first variate shows \(R_{cTEST}\) = 0.71 for GE (shared variance between variates, \(R_c^2\) = 0.50) and \(R_{cREPLICATION}\) = 0.72 for PRISMA (\(R_c^2\) = 0.52). Permutation testing found canonical correlation values to be stable across all variates in both samples. These high \(R_c\) values reflect the strong linear relationship between participant-wise \(X\) and \(Y\) scores (derived by multiplying the standardized values of each participant’s original variables by the standardized canonical coefficients, and summing the products). The visual on the right shows participant scores on the first variate.
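The score derivation described above (standardize, weight, sum) can be sketched as below. This is a minimal illustration under stated assumptions: the data are a hypothetical toy set, and the weights are placeholders rather than fitted canonical coefficients, so the resulting correlation is not a true \(R_c\). The function names are ours, not those of the analysis code.

```python
from math import sqrt
from statistics import mean, stdev


def standardize(values):
    """Z-score one variable using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def variate_scores(columns, coefficients):
    """Weighted sum of standardized variables: one score per participant."""
    z_columns = [standardize(col) for col in columns]
    n = len(columns[0])
    return [
        sum(b * z[i] for b, z in zip(coefficients, z_columns))
        for i in range(n)
    ]


def pearson(a, b):
    """Pearson correlation between two equally long lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical toy data: 2 X variables and 2 Y variables, 6 participants.
x_columns = [[0.50, 0.52, 0.47, 0.55, 0.49, 0.53],
             [0.60, 0.63, 0.58, 0.64, 0.59, 0.61]]
y_columns = [[45, 50, 38, 55, 41, 52],
             [24, 27, 22, 28, 23, 25]]

# Placeholder weights, NOT fitted canonical coefficients.
x_coefs = [0.6, 0.5]
y_coefs = [0.7, 0.4]

x_scores = variate_scores(x_columns, x_coefs)
y_scores = variate_scores(y_columns, y_coefs)

r = pearson(x_scores, y_scores)
print(f"correlation of X and Y scores: {r:.2f}; shared variance: {r ** 2:.2f}")
```

In a fitted CCA, the weights are chosen so that this correlation is as large as possible for the first variate; squaring it gives the shared variance reported above.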

In addition to canonical correlations, it is recommended to calculate redundancy \((R_d)\). Redundancy estimates the extent to which the variance of one set of variables can be accounted for by the other. Note that redundancies are set-specific, i.e., they are not symmetrical like the \(R_c\) values. We calculate the total canonical redundancy for the variables in the \(Y\) set, i.e., the total percent of \(Y\) set variance accounted for by the \(X\) set variables, through all canonical variates: \(R_{dTEST}\) = 21.99%, and \(R_{dREPLICATION}\) = 22.96%.
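To make the redundancy calculation concrete, the sketch below computes the first-variate contribution to the \(Y\) set redundancy. It assumes the usual definition (the mean squared structure coefficient of the set, multiplied by \(R_c^2\)); we have not confirmed that the analysis code uses exactly this formula. The inputs are the \(r_s^2\) values from the variate structure tables and the first-variate \(R_c\) values reported in this document. Total redundancy sums this quantity over all variates, which cannot be reproduced here because only the first variate is tabled.

```python
# Squared structure coefficients (r_s^2) of the 16 Y-set variables on
# variate 1, copied from the variate structure tables in this report.
Y_RS2 = {
    "GE": [0.398, 0.045, 0.315, 0.221, 0.442, 0.432, 0.171, 0.243,
           0.181, 0.067, 0.001, 0.089, 0.150, 0.012, 0.051, 0.211],
    "PRISMA": [0.301, 0.114, 0.650, 0.389, 0.239, 0.090, 0.297, 0.149,
               0.487, 0.139, 0.310, 0.370, 0.290, 0.004, 0.038, 0.339],
}

# First-variate canonical correlations reported above.
R_C = {"GE": 0.71, "PRISMA": 0.72}


def first_variate_redundancy(rs2, r_c):
    """Assumed definition: mean squared structure coefficient times R_c^2."""
    variance_extracted = sum(rs2) / len(rs2)
    return variance_extracted * r_c ** 2


redundancy = {s: first_variate_redundancy(Y_RS2[s], R_C[s]) for s in Y_RS2}

for sample, value in redundancy.items():
    print(f"{sample}: first-variate Y redundancy = {100 * value:.1f}%")
```

Under these assumptions the result is about 9.5% for GE and 13.6% for PRISMA, in line with the rounded first-variate redundancies of 10% and 14% given in the statistical testing section.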

GE

PRISMA


[2] Statistical testing

We performed parametric and nonparametric (permutation) significance tests, reporting the four primary test statistics for multivariate models. The parametric tests are not reliably significant in the GE sample (only the Hotelling and Roy statistics reach p < .05), but all parametric tests are significant in the PRISMA sample. The permutation tests are all significant in both samples, with the exception of the Pillai’s trace statistic in the GE sample (the Wilks statistic sits at the threshold there, permuted p = .05). In all cases of model significance, only the first variate (CV1) is significant. Moreover, in both samples, the first variate’s redundancy warrants interpretation (\(R_{dTEST}\) = 10%, \(R_{dREPLICATION}\) = 14%).
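For reference, all four test statistics are functions of the canonical correlations. The sketch below uses the standard definitions, with Roy's statistic taken as the largest squared canonical correlation; this is an assumption that fits the tabled Roy values, which are close to the reported \(R_c^2\), though some software reports the largest eigenvalue \(R_c^2 / (1 - R_c^2)\) instead. The full set of 16 canonical correlations is not listed in this document, so the input values here are hypothetical.

```python
from math import prod


def multivariate_statistics(canonical_correlations):
    """Four omnibus statistics computed from a list of canonical correlations."""
    r2 = [r ** 2 for r in canonical_correlations]
    return {
        "wilks": prod(1 - v for v in r2),           # smaller = stronger association
        "hotelling": sum(v / (1 - v) for v in r2),  # Hotelling-Lawley trace
        "pillai": sum(r2),                          # Pillai's trace
        "roy": max(r2),                             # largest squared correlation
    }


# Hypothetical canonical correlations; only the first value (0.71)
# echoes the GE result, the rest are made up for illustration.
example_rc = [0.71, 0.55, 0.40, 0.20]
stats = multivariate_statistics(example_rc)

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

A permutation test would recompute the chosen statistic after shuffling the rows of one variable set many times and compare the observed value with that null distribution; for Wilks' lambda, smaller values are the more extreme ones.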

GE
variates | test statistic | statistic | critical | df1 | df2 | parametric p | permuted p
1-16 | Wilks | 0.052 | 1.143 | 304 | 1274.432 | 0.065 | 0.05
1-16 | Hotelling | 3.655 | 1.180 | 304 | 1570.000 | 0.027 | 0.036
1-16 | Pillai | 2.471 | 1.105 | 304 | 1840.000 | 0.119 | 0.108
1-16 | Roy | 0.502 | 7.427 | 16 | 118.000 | 0 | 0.016

Below, we visualize the permuted Hotelling-Lawley trace statistic, which is significant in both the parametric and nonparametric significance tests, in both the GE and PRISMA samples.

PRISMA
variates | test statistic | statistic | critical | df1 | df2 | parametric p | permuted p
1-16 | Wilks | 0.067 | 1.411 | 304 | 1741.384 | 0 | 0
1-16 | Hotelling | 3.335 | 1.493 | 304 | 2178.000 | 0 | 0
1-16 | Pillai | 2.264 | 1.327 | 304 | 2448.000 | 0 | 0
1-16 | Roy | 0.516 | 10.411 | 16 | 156.000 | 0 | 0

Below, we visualize the permuted Hotelling-Lawley trace statistic, which is significant in both the parametric and nonparametric significance tests, in both the GE and PRISMA samples.


[3] Variate structure

Structure coefficients (\(r_s\)) are the correlations between each observed variable and the canonical variate of that variable’s set, and indicate the variables that were important to forming the variate. They are interpreted like factor loadings; it is the standardized canonical function coefficients (\(\beta\)) that are interpreted like standardized regression weights. By convention, \(|r_s|\) values ≥ ~.45 are interpreted.

Though permutation testing found structure coefficients to be consistent within the GE and PRISMA samples, the variate structure is somewhat inconsistent between the two samples (see also the visualization in the comparison table tab). In general, the PRISMA sample has more highly contributing variables. Only three variables contribute in the GE sample but not in the PRISMA sample (left ILF in the \(X\) set, and working memory and ER-40 in the \(Y\) set), whereas eight variables contribute in the PRISMA sample but not in the GE sample (right AF, right TF, and left UF in the \(X\) set, and EA, RAD, TASIT 1, TASIT 2 paradoxical sarcasm, and TASIT 2 simple sarcasm in the \(Y\) set). Seven variables contribute in both samples (CC3 and right UF in the \(X\) set, and attention and vigilance, processing speed, verbal learning, visual learning, and TASIT 3 sarcasm in the \(Y\) set). We interpret this pattern of results across both the GE and PRISMA samples as follows: the significant association between the \(X\) and \(Y\) variates is driven by the CC3 and right UF, several neurocognitive domains, and higher-order social cognition.

GE
Variate 1
Variable \(\beta\) \(r_s\) \(r_s^2\) (\(h^2\))
X
AF_left -0.284 0.152 0.023
AF_right 0.283 -0.216 0.046
CB_left 0.060 -0.026 0.001
CB_right -0.285 -0.318 0.101
CC1_commissural -0.247 -0.207 0.043
CC2_commissural 0.511 -0.05 0.002
CC3_commissural -0.405 -0.501 0.251
CC4_commissural -0.261 -0.428 0.183
CC5_commissural 0.108 -0.066 0.004
CC6_commissural -0.055 -0.045 0.002
CC7_commissural -0.028 -0.074 0.006
ILF_left 0.313 0.588 0.345
ILF_right 0.270 -0.102 0.010
IOFF_left -0.403 -0.082 0.007
IOFF_right 0.334 0.094 0.009
TF_left 0.619 0.032 0.001
TF_right -0.752 -0.445 0.198
UF_left -0.016 -0.222 0.049
UF_right -0.165 -0.592 0.350
Y
attn_vig -0.147 -0.631 0.398
problem_solving 0.195 -0.212 0.045
process_speed -0.101 -0.561 0.315
verbal_learning 0.129 -0.47 0.221
visual_learning -0.501 -0.665 0.442
work_memory -0.378 -0.657 0.432
EA -0.269 -0.414 0.171
ER_40 -0.400 -0.493 0.243
RAD 0.242 -0.425 0.181
RMET 0.099 -0.259 0.067
TASIT_1 0.379 -0.036 0.001
TASIT_2_paradoxicalSarcasm 0.220 -0.299 0.089
TASIT_2_simpleSarcasm -0.260 -0.387 0.150
TASIT_2_sincere 0.144 0.11 0.012
TASIT_3_lies -0.015 -0.227 0.051
TASIT_3_sarcasm -0.326 -0.46 0.211
Note:
\(|r_s|\) values ≥ .45 are emphasized, following convention;
\(\beta\) = standardized canonical function coefficient;
\(r_s\) = structure coefficient;
\(r_s^2\) = squared structure coefficient, here also communality \(h^2\)

PRISMA
Variate 1
Variable \(\beta\) \(r_s\) \(r_s^2\) (\(h^2\))
X
AF_left 0.270 0.085 0.007
AF_right -0.048 -0.529 0.280
CB_left -0.227 -0.279 0.078
CB_right 0.051 -0.447 0.200
CC1_commissural 0.187 -0.12 0.014
CC2_commissural 0.215 -0.39 0.152
CC3_commissural -0.664 -0.635 0.404
CC4_commissural 0.228 -0.449 0.201
CC5_commissural -0.079 -0.237 0.056
CC6_commissural 0.316 -0.084 0.007
CC7_commissural 0.151 -0.295 0.087
ILF_left -0.071 0.268 0.072
ILF_right -0.222 -0.41 0.168
IOFF_left 0.096 -0.288 0.083
IOFF_right -0.262 -0.327 0.107
TF_left 0.049 -0.195 0.038
TF_right -0.076 -0.559 0.312
UF_left 0.211 -0.508 0.258
UF_right -0.833 -0.834 0.696
Y
attn_vig -0.144 -0.548 0.301
problem_solving 0.235 -0.337 0.114
process_speed -0.758 -0.806 0.650
verbal_learning -0.085 -0.624 0.389
visual_learning 0.241 -0.489 0.239
work_memory 0.269 -0.301 0.090
EA -0.205 -0.545 0.297
ER_40 -0.070 -0.386 0.149
RAD -0.266 -0.698 0.487
RMET 0.111 -0.372 0.139
TASIT_1 -0.067 -0.557 0.310
TASIT_2_paradoxicalSarcasm -0.149 -0.608 0.370
TASIT_2_simpleSarcasm -0.041 -0.538 0.290
TASIT_2_sincere 0.030 -0.062 0.004
TASIT_3_lies 0.095 -0.195 0.038
TASIT_3_sarcasm -0.211 -0.582 0.339
Note:
\(|r_s|\) values ≥ .45 are emphasized, following convention;
\(\beta\) = standardized canonical function coefficient;
\(r_s\) = structure coefficient;
\(r_s^2\) = squared structure coefficient, here also communality \(h^2\)

comparison table

This table indicates if a given variable was found to be important (\(|r_s|\) ≥ .45) to its variate in neither sample, the GE sample only, the PRISMA sample only, or both samples.
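The classification behind this comparison can be reproduced from the structure coefficients in the GE and PRISMA variate tables. The sketch below is illustrative rather than the code used to build the table; it applies the .45 cutoff to the absolute value of \(r_s\), since the emphasized coefficients are mostly negative.

```python
# Variate 1 structure coefficients as (GE r_s, PRISMA r_s), copied from
# the variate structure tables in this report.
RS = {
    "AF_left": (0.152, 0.085), "AF_right": (-0.216, -0.529),
    "CB_left": (-0.026, -0.279), "CB_right": (-0.318, -0.447),
    "CC1_commissural": (-0.207, -0.12), "CC2_commissural": (-0.05, -0.39),
    "CC3_commissural": (-0.501, -0.635), "CC4_commissural": (-0.428, -0.449),
    "CC5_commissural": (-0.066, -0.237), "CC6_commissural": (-0.045, -0.084),
    "CC7_commissural": (-0.074, -0.295), "ILF_left": (0.588, 0.268),
    "ILF_right": (-0.102, -0.41), "IOFF_left": (-0.082, -0.288),
    "IOFF_right": (0.094, -0.327), "TF_left": (0.032, -0.195),
    "TF_right": (-0.445, -0.559), "UF_left": (-0.222, -0.508),
    "UF_right": (-0.592, -0.834),
    "attn_vig": (-0.631, -0.548), "problem_solving": (-0.212, -0.337),
    "process_speed": (-0.561, -0.806), "verbal_learning": (-0.47, -0.624),
    "visual_learning": (-0.665, -0.489), "work_memory": (-0.657, -0.301),
    "EA": (-0.414, -0.545), "ER_40": (-0.493, -0.386),
    "RAD": (-0.425, -0.698), "RMET": (-0.259, -0.372),
    "TASIT_1": (-0.036, -0.557),
    "TASIT_2_paradoxicalSarcasm": (-0.299, -0.608),
    "TASIT_2_simpleSarcasm": (-0.387, -0.538),
    "TASIT_2_sincere": (0.11, -0.062), "TASIT_3_lies": (-0.227, -0.195),
    "TASIT_3_sarcasm": (-0.46, -0.582),
}

THRESHOLD = 0.45  # conventional cutoff, applied to |r_s|


def classify(ge_rs, prisma_rs, threshold=THRESHOLD):
    """Label a variable by the sample(s) in which |r_s| meets the cutoff."""
    in_ge, in_prisma = abs(ge_rs) >= threshold, abs(prisma_rs) >= threshold
    if in_ge and in_prisma:
        return "both"
    if in_ge:
        return "GE only"
    if in_prisma:
        return "PRISMA only"
    return "neither"


groups = {"both": [], "GE only": [], "PRISMA only": [], "neither": []}
for variable, (ge_rs, prisma_rs) in RS.items():
    groups[classify(ge_rs, prisma_rs)].append(variable)

for label, variables in groups.items():
    print(f"{label} ({len(variables)}): {', '.join(variables)}")
```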


Prediction

Participant scores on significant canonical variates can serve as new variables in subsequent analyses. We tested whether participant scores predicted BSFS (social functioning) scores, given the noted contributions of white matter, neurocognition, and social cognition to social functioning. Both simple and multiple linear regression models predicting BSFS scores were highly significant in both samples, with moderate-to-large correlation coefficients (\(R\)) and coefficients of determination (\(R^2\)).
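As an illustration of the quantities reported here, the sketch below fits a simple linear regression of BSFS on a single set of variate scores and computes \(R^2\) and adjusted \(R^2\). The data are a hypothetical toy set and the function names are ours; the models in this report were fit on the full samples, and the predictors entered in the multiple regressions are not restated in this section.

```python
from statistics import mean


def simple_regression(x, y):
    """Ordinary least squares with one predictor: slope, intercept, R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


def adjusted_r2(r2, n, n_predictors):
    """Adjust R^2 for sample size and number of predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical toy data: CV1 scores and BSFS totals for 7 participants.
cv1_scores = [-1.4, -0.9, -0.3, 0.0, 0.4, 0.8, 1.4]
bsfs = [175, 160, 168, 150, 155, 138, 130]

slope, intercept, r2 = simple_regression(cv1_scores, bsfs)
adj_r2 = adjusted_r2(r2, n=len(bsfs), n_predictors=1)
print(f"slope = {slope:.2f}, R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

Adjusted \(R^2\) shrinks \(R^2\) more strongly when the sample is small relative to the number of predictors, which is why it is the value reported for the multiple regressions.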

GE

Multiple regression: Adjusted \(R^2\) = 0.155, model p = 0.

PRISMA

Multiple regression: Adjusted \(R^2\) = 0.332, model p = 0.


Classification

Lastly, we performed k-means clustering on participant scores, fixing the number of clusters at two, to determine whether the two clusters would be differentiated by categorical diagnosis, i.e., HC vs. SSD. We found high but imperfect overlap between diagnostic label and cluster assignment based on \(X\) and \(Y\) CV1 scores. The adjusted Rand index (ARI) was higher for the PRISMA sample than for the GE sample. We take this as support for the RDoC approach.
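The two steps described above, clustering the CV1 scores into two groups and scoring agreement with diagnosis, can be sketched as follows. This is a minimal illustration on hypothetical toy scores, with a deliberately simple two-cluster k-means (deterministic start) and a direct implementation of the ARI formula; the analysis itself presumably used library routines, and initialization details may differ.

```python
from collections import Counter
from math import comb


def two_means(points, n_iter=100):
    """Minimal Lloyd's algorithm with k=2 and a deterministic start."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Start from the first point and the point farthest from it.
    centroids = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min((0, 1), key=lambda k: dist2(p, centroids[k])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items."""
    n = len(labels_a)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical toy data: (X CV1 score, Y CV1 score) per participant.
scores = [(-1.2, -1.0), (-0.9, -1.3), (-1.1, -0.8), (-0.7, -1.1), (-1.0, -0.9),
          (1.0, 0.9), (1.2, 1.1), (0.8, 1.3), (0.9, 0.7), (0.6, 0.8)]
diagnosis = ["SSD", "SSD", "SSD", "SSD", "SSD", "HC", "HC", "HC", "HC", "SSD"]

clusters = two_means(scores)
ari = adjusted_rand_index(clusters, diagnosis)
print(f"cluster labels: {clusters}")
print(f"ARI vs. diagnosis: {ari:.3f}")
```

An ARI of 1 means the clusters reproduce the diagnostic labels exactly (up to relabelling), and values near 0 mean agreement no better than chance; in this toy set, one SSD participant falls in the cluster otherwise made up of HC participants, so the ARI is well below 1.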

GE

PRISMA