CCA

Date written: 2020-07-16
Last ran: 2020-07-24

Overview

Here, we perform canonical correlation analysis (CCA) on the final sample of eligible participants in the SPINS and PNS studies, scanned on the CAMH G3 scanner. The combined sample at present includes a total n=174 participants, comprised of n=128 SSD and 46 HC; n=117 men and n=57 women; and 83 from the SPINS study and 91 from the PNS study.

We have elected to include 35 variables in the CCA analysis. Our \(X\) set is comprised of FA estimates in 23 white matter tracts, representing association, projection, and commissural fibers. Our \(Y\) set is comprised of 12 neurocognition and social cognition variables.

Data structure

[1] Review correlation matrices. Below, we summarise correlations within the X and Y sets, and the cross correlation matrix. We see that variables within sets are more highly correlated than between sets.

#run correlations (note: identical to pairwise complete, `cor(X, use='p')`)
corr <- matcor(X, Y)

#also run correlations for separate HC and SSD
corr_HC <- matcor(X_HC, Y_HC)
corr_SSD <- matcor(X_SSD, Y_SSD)

#function to describe correlations -- excluding the 1s from the diagonal
tab_corr_fn <- function(corr){
  data.frame(as.list(summary(c(corr[upper.tri(corr)], corr [lower.tri(corr)])))) 
}

#function to plot correlation matrices
plotCorr_fn <- function(df){
  ggcorrplot(df) + 
  theme_classic(base_size = 6) +
  theme(legend.position = 'top',
        axis.text.x = element_text(angle = 90, vjust=.5, hjust=1),
        axis.title= element_blank())
}

X correlation matrix

Min.	X1st.Qu.	Median	Mean	X3rd.Qu.	Max.
0.006	0.245	0.333	0.346	0.425	0.807

Y correlation matrix

Min.	X1st.Qu.	Median	Mean	X3rd.Qu.	Max.
0.06	0.233	0.325	0.331	0.422	0.619

XY correlation matrix

Min.	X1st.Qu.	Median	Mean	X3rd.Qu.	Max.
-0.219	0.005	0.161	0.181	0.342	0.807

[2] Test of multicollinearity. Collinearity refers to two or more variables being correlated with each other, and multicollinearity indicates collinearity between three or more variables even if no pair has a particularly high correlation. This latter case of multicollinearity indicates a redundancy between variables, rendering any solution unstable and difficult to interpret. If the pair-wise correlation coefficient between two variables is high (say >0.8) then it is possible that multicollinearity exists; however, this it is not sufficient nor necessary condition for the detection of multicollinearity, as a linear relation in CCA involves many variables. One common approach to identify collinearity among explanatory variables is the use of variance inflation factors (VIF).. The VIF is an indicator of how much the variance of the coefficient is inflated compared to the non-significant correlations with any other variables in the model. The VIF is calculated as 1 divided by tolerance, where tolerence is defined as 1 minus R^2. The higher the VIF value, the higher the collinearity. Typically, VIF = 1 indicates no correlation; > 1 < 5 indicates moderate correlation, >5 high correlation. Here, I use the usdm package, which allows for the calculation of VIF within a variable set, instead of a model. We see no evidence of multicollinearity in our variable sets.

X set

usdm::vif(X)

##          Variables      VIF
## 1          AF_left 2.749468
## 2          CB_left 3.551350
## 3         ILF_left 2.968001
## 4        IOFF_left 3.922892
## 5          UF_left 2.096306
## 6         AF_right 1.695905
## 7         CB_right 3.538748
## 8        ILF_right 3.432844
## 9       IOFF_right 3.785303
## 10        UF_right 2.956842
## 11       CR_F_left 2.786564
## 12       CR_P_left 1.612477
## 13         TF_left 2.552575
## 14      CR_F_right 3.176728
## 15      CR_P_right 1.422856
## 16        TF_right 2.471910
## 17 CC1_commissural 1.793382
## 18 CC2_commissural 3.394907
## 19 CC3_commissural 3.346369
## 20 CC4_commissural 4.305549
## 21 CC5_commissural 2.222846
## 22 CC6_commissural 1.819252
## 23 CC7_commissural 1.569340

Y set

usdm::vif(Y)

##              Variables      VIF
## 1     Processing_speed 2.132778
## 2  Attention_vigilance 1.809310
## 3       Working_memory 2.052315
## 4      Verbal_learning 1.935346
## 5      Visual_learning 1.447797
## 6      Problem_solving 1.415076
## 7              TASIT_1 1.309239
## 8              TASIT_2 1.862862
## 9              TASIT_3 2.223463
## 10                RMET 1.282908
## 11                 RAD 1.933451
## 12               ER_40 1.462921

[3] Test of homogeneity between groups (SSD and HC). We can review the covariance matrices, which tell us how much two random variables vary together (as opposed to correlation, which tells how a change in one variable is correspondance to a change in another). In our case, the covariance matrices are useful to compare between HC and SSD groups, as there is a statistical test, Box M, to test homogeneity. Visually, the matrices appear similar, with some differences:

#make covariance matrices
cov_HC <- cov(df_HC)
cov_SSD <- cov(df_SSD)

#plot
cov_HC_plot <- plotCorr_fn(cov_HC)
cov_SSD_plot <-plotCorr_fn(cov_SSD)

#arrange side by side
grid.arrange(cov_HC_plot, cov_SSD_plot, ncol=2)

We can use Box's M-test to test the null hypothesis that the covariance matrices are equal; i.e., that the observed covariance matrices of the variable sets are equal across groups. Box's M-test is the multivariate version of the univariate assumption of homogeneity of variance (Levene's test) and the bivariate assumption of homoscedasticity. We find that that the M test is significant for X (brain) variables, but not Y (behaviour) variables, meaning that patients and controls do not have equal covariance of brain variables. One option might be to exclude the n=46 participants from the analysis. However, this may not be necessary, for at least two reasons. First, inhomogeneity of variance here is not unexpected. Second, Box's M-test is prone to errors when the sample sizes are small or when the data do not meet model assumptions, especially the assumption of multivariate normality. For large samples, Box’s M test may be too strict, indicating heterogeneity when the covariance matrices are not very different.

X set

rstatix::box_m(X, df_demo$group)

## # A tibble: 1 x 4
##   statistic p.value parameter method                                            
##       <dbl>   <dbl>     <dbl> <chr>                                             
## 1      332.  0.0111       276 Box's M-test for Homogeneity of Covariance Matric…

Y set

rstatix::box_m(Y, df_demo$group)

## # A tibble: 1 x 4
##   statistic p.value parameter method                                            
##       <dbl>   <dbl>     <dbl> <chr>                                             
## 1      92.0   0.133        78 Box's M-test for Homogeneity of Covariance Matric…

CCA

[1] Canonical correlations. We have quite high canonical correlations. As seen in the plot below, the first variate has a correlation values of 0.584854. However, this does not indicate significance.

#manipulate canonical correlation values
correlations$variate <- seq(1:ncol(Y)) #
names(correlations)[names(correlations) == 'unlist(correlations)'] <- 'canonical correlation'
correlations$`canonical correlation` <- round(correlations$`canonical correlation`, 3)

#bar plot/scree plot
ggplot(correlations, aes(x=variate, y=`canonical correlation`)) + 
  geom_bar(stat='identity') +
  scale_x_continuous(breaks=seq(1, ncol(Y), 1)) +
  theme_minimal() +
  theme(axis.text.x=element_blank()) +
  geom_text(aes(label=`canonical correlation`), hjust = 1.2, color= 'white', size=3.5) +
  coord_flip()

[2] Visualization. We can visualize the structure of canonical correlations. The 'target' plot below shows a representation on the plane defined by the first two canonical variates. The rings are called the "correlation cricle", with two circumferences corresponding to radius 0.5 and 1, which reveals the most salient patterns in the ring. Variables with a strong relation are projected in the same direction from the origin. The greater the distance from the origin, the stronger the relation.

#visualize 'target' plot and 'user' plot
plt.cc(cca_cca, var.label = T)

Significance testing

#the null hypothesis is that the canonical correlations in all rows are 0

#define n, p, q
n <- nrow(df)
p <- ncol(X)
q <- ncol(Y)

#function to calculate p-values using the F-approximations of different statistical tests
p_fn <- function(tstat){
  df_p <- as.data.frame(p.asym(cca_cca$cor, n, p, q, tstat = tstat)) #test
  df_p <- setDT(df_p, keep.rownames = 'variate')[] #add row names
  df_p$variate <- paste(df_p$variate, '-', ncol(Y), sep='')
  return(df_p)
}

[1] Statistical testing, F distribution. Next, we formally test for significance. There are 4 primary tests of significance in multivariate models, which all draw from the F distribution. We see that the first 3 tests, that similarly evaluate the entire CCA model, have similar p values, none of which reach significance. However, the final test, Roy's Largest Root, is highly significant, p<0.001. Unlike the other 3 tests, Roy's Largest Root considers only the largest eigenvector, i.e., it estimates significance on the basis of only the first canonical variate (hence the "largest root"). Note, however, that such a small value for Roy's largest root can be a reflection of dimensionality, but also the consequence of non-normality or heterscedasticity, to which this estimate is especially sensitive.

Wilks' Lambda

p_wilks <- p_fn('Wilks')

## Wilks' Lambda, using F-approximation (Rao's F):
##                 stat     approx df1       df2   p.value
## 1 to 12:   0.1660943 1.00700075 276 1518.1644 0.4614505
## 2 to 12:   0.2524437 0.87192831 242 1411.2870 0.9109514
## 3 to 12:   0.3467475 0.76789153 210 1301.8821 0.9919337
## 4 to 12:   0.4349081 0.70190072 180 1189.7155 0.9984944
## 5 to 12:   0.5253799 0.64092399 152 1074.5108 0.9996631
## 6 to 12:   0.6103847 0.59228108 126  955.9434 0.9998519
## 7 to 12:   0.7042099 0.51852534 102  833.6360 0.9999715
## 8 to 12:   0.7924908 0.43754474  80  707.1590 0.9999946
## 9 to 12:   0.8519829 0.40218213  60  576.0420 0.9999836
## 10 to 12:  0.9070377 0.35014885  42  439.8043 0.9999569
## 11 to 12:  0.9584471 0.24581782  26  298.0000 0.9999591
## 12 to 12:  0.9937181 0.07901956  12  150.0000 0.9999876

Hotelling-Lawley Trace

p_hotelling <- p_fn('Hotelling')

##  Hotelling-Lawley Trace, using F-approximation:
##                   stat     approx df1  df2   p.value
## 1 to 12:   2.036083673 1.01189424 276 1646 0.4401712
## 2 to 12:   1.516201726 0.87192041 242 1670 0.9131396
## 3 to 12:   1.142638222 0.76810680 210 1694 0.9925800
## 4 to 12:   0.888387871 0.70659739 180 1718 0.9984647
## 5 to 12:   0.680362770 0.64977629 152 1742 0.9996117
## 6 to 12:   0.518565966 0.60567956 126 1766 0.9998013
## 7 to 12:   0.364851113 0.53356495 102 1790 0.9999593
## 8 to 12:   0.239489469 0.45253531  80 1814 0.9999922
## 9 to 12:   0.164419679 0.41972690  60 1838 0.9999749
## 10 to 12:  0.099800138 0.36870607  42 1862 0.9999342
## 11 to 12:  0.043121733 0.26066535  26 1886 0.9999422
## 12 to 12:  0.006321565 0.08384854  12 1910 0.9999851

Pillai-Bartlett Trace

p_pillai <- p_fn('Pillai')

##  Pillai-Bartlett Trace, using F-approximation:
##                   stat     approx df1  df2   p.value
## 1 to 12:   1.598769478 1.00245422 276 1800 0.4803851
## 2 to 12:   1.256715316 0.88167655 242 1824 0.8953725
## 3 to 12:   0.984748637 0.78670815 210 1848 0.9870732
## 4 to 12:   0.782037631 0.72501503 180 1872 0.9970845
## 5 to 12:   0.609835000 0.66784715 152 1896 0.9992240
## 6 to 12:   0.470570712 0.62193897 126 1920 0.9996265
## 7 to 12:   0.337336012 0.55126578 102 1944 0.9999137
## 8 to 12:   0.225939249 0.47206360  80 1968 0.9999807
## 9 to 12:   0.156111419 0.43760114  60 1992 0.9999483
## 10 to 12:  0.095414111 0.38471538  42 2016 0.9998839
## 11 to 12:  0.041775837 0.27410395  26 2040 0.9999058
## 12 to 12:  0.006281854 0.09008706  12 2064 0.9999779

Roy's Largest Root

plt.asym(p.asym(rho, n, p, q, tstat = "Roy"), rhostart=1)

##  Roy's Largest Root, using F-approximation:
##               stat   approx df1 df2            p.value
## 1 to 1:  0.3420542 6.975083  12 161 0.0000000004137379
## 
##  F statistic for Roy's Greatest Root is an upper bound.

[2] Permuted statistical test. However, becayse randomization (cf. random sampling) is the norm in psychiatry, we should perform permutation testing / randomization tests: doing so is preferred to assuming the F-distribution, which is not appropriate if the sample is non-random. Like bootstrapping, a permutation test builds - rather than assumes - sampling distribution (the “permutation distribution”) by resampling the observed data. Unfortunately, permutation testing (999 permutations) indicates that Roy's largest root is not significant (nor are any of the other 3 multivariate significance tests):

#for some reason, this doesn't work in rstudio (must be packages loaded - run in R instead - takes a bit of time)
#CCP::p.perm(X, Y, nboot=500, rhostart=1, type='Roy')

Model exploration

#calculate correlations
rxx <- cor(X, use = "p") #pairwise complete observations
ryy <- cor(Y, use = "p")
rxy <- cor(X,Y, use = "p")

#calculate omega
omega = t(solve(chol(rxx))) %*% rxy %*% solve(chol(ryy)) 

#decompose matrix
usv <- svd(omega)

#standardized weights (beta)
x.weights.std <- solve(chol(rxx)) %*% usv$u
y.weights.std <- solve(chol(ryy)) %*% usv$v

#structure coefficients
x.structures <- rxx %*% x.weights.std
y.structures <- ryy %*% y.weights.std

#canonical cross-loadings/redundancy
x.scores <- as.matrix(X) %*% x.weights.std
y.scores <- as.matrix(Y) %*% y.weights.std

#redundancy (?)
red.x.yscores <- cor(X, y.scores) 
red.y.xscores <- cor(Y, x.scores)

[1] Participant scores (CV1). Here, we show the association between scores in the X set and Y set, for each participant, for the first canonical variate, Rc=0.585.

#bind together data
scores.plot <- as.data.frame(cbind(x.scores[, 1], y.scores[, 1], df_demo$group))

#make sure numeric
scores.plot[1:2] <- sapply(scores.plot[1:2],as.numeric)

#change names for clarity
names(scores.plot) <- c('Xset', 'Yset', 'group')

#make a scatter plot
ggplot(scores.plot, aes(x=Xset, y=Yset)) +
  geom_point(size = 4, aes(shape=group), alpha=.8) +
  geom_smooth(method=lm, color="black", alpha=.4) +
  theme_bw() +
  theme(axis.ticks=element_blank(), 
        panel.grid.minor = element_blank(), 
        panel.grid.major.x = element_blank(),
        legend.position='top')

[2] Descriptive table (CV1). Here, we show standardized canonical function coefficients, structure coefficients, and squared structure coefficients, for the first canonical variate. The structure coefficients, \(rs\), are of most interest here. We can interpret the structure coefficients like regression coefficients, i.e., a one unit increase in the given variable would result in an increase/decrease of the \(rs\) value of the first canonical variate for the given variable's set, when the other variables are held constant. We do not observe a distinctive pattern of variable contribution to the first canonical variate.

#for table, take just first column
x.weights.std <- x.weights.std[, 1]
x.weights.std <- t(data.frame(lapply(x.weights.std, type.convert), stringsAsFactors = F)) #get into df

y.weights.std <- y.weights.std[, 1]
y.weights.std <- t(data.frame(lapply(y.weights.std, type.convert), stringsAsFactors = F)) #get into df

#bind together standardized weights
weights <- rbind(x.weights.std, y.weights.std)

#bind together structure coefficients
structures <- rbind(x.structures, y.structures)

#make table
tbl <- as.data.frame(cbind(weights, structures))

#add rownames as a variable
tbl <- tibble::rownames_to_column(tbl, "variable")

#redefine row names with full names
tbl$variable <- c(names(X), names(Y))

#set up dynamic names for table
label <- paste("$R_c$=", round(cca_cca$cor[1], 5))
myHeader <- c(" " = 1, label = 2) # create vector with colspan
names(myHeader) <- c(" ", label) #set names

#make into pretty table
tbl_plot <- tbl

#take only first 3 columns -- adjust if want more variates
tbl_plot <- tbl_plot[, 1:3]

#add a column for the communality coefficient
tbl_plot$communalities <- tbl_plot$V2 ^ 2

#round
tbl_plot[,2:4] <- round(tbl_plot[,2:4], 3) #shorten here, because formatting makes characters (?!)

tbl_plot %>%
  mutate(V2 = ifelse((V2 > .45 | V2 < -.45),
          cell_spec(V2, color='black', underline = T, bold = T), #no green
          cell_spec(V2, color='black'))) %>%
  kable(escape=F,
        align = c('l', 'c', 'c', 'c'),
        col.names = c('_Variable_', '$\\beta$', '$r_s$', '$r_s^2$ ($h^2$)')) %>%
kable_styling(bootstrap_options=c('striped', 'hover', 'condensed')) %>%
add_header_above(c(" " = 1, "_Function 1_" = 3)) %>%
pack_rows('X', 1, 23) %>%
pack_rows('Y', 24, 35) %>%
footnote(general = "$r_s$ values > .450 are emphasized, following convention;
         $\\beta$ = standardized canonical function coefficient;
         $r_s$ = structure coefficient;
         $r_s^2$ = squared structure coefficient, here also communality $h^2$")

	Function 1
Variable	\(\beta\)	\(r_s\)	\(r_s^2\) (\(h^2\))
X
AF_left	0.356	0.106	0.011
CB_left	-0.309	0.017	0.000
ILF_left	0.134	0.286	0.082
IOFF_left	-0.753	-0.247	0.061
UF_left	0.392	0.043	0.002
AF_right	-0.057	-0.065	0.004
CB_right	0.318	-0.013	0.000
ILF_right	0.491	0.35	0.123
IOFF_right	0.093	-0.124	0.015
UF_right	-0.380	-0.249	0.062
CR_F_left	0.345	0.265	0.070
CR_P_left	0.343	0.298	0.089
TF_left	-0.347	-0.081	0.007
CR_F_right	-0.015	0.107	0.011
CR_P_right	-0.295	-0.203	0.041
TF_right	0.189	0.009	0.000
CC1_commissural	-0.347	-0.134	0.018
CC2_commissural	0.156	0.111	0.012
CC3_commissural	-0.009	0.011	0.000
CC4_commissural	-0.099	0.003	0.000
CC5_commissural	-0.177	-0.009	0.000
CC6_commissural	0.260	0.385	0.148
CC7_commissural	-0.173	-0.147	0.022
Y
Processing_speed	0.040	0.107	0.011
Attention_vigilance	0.189	0.078	0.006
Working_memory	0.194	0.231	0.053
Verbal_learning	0.004	0.117	0.014
Visual_learning	-0.728	-0.28	0.078
Problem_solving	0.288	0.354	0.125
TASIT_1	-0.324	-0.088	0.008
TASIT_2	0.769	0.579	0.335
TASIT_3	-0.047	0.354	0.125
RMET	-0.062	0.205	0.042
RAD	0.229	0.287	0.082
ER_40	-0.511	-0.235	0.055
Note:
\(r_s\) values > .450 are emphasized, following convention; \(\beta\) = standardized canonical function coefficient; \(r_s\) = structure coefficient; \(r_s^2\) = squared structure coefficient, here also communality \(h^2\)

[3] Structure coefficients (CV1). This is the same data as in the table above, but perhaps more clearly presented in this visualization. As above, we do not observe a distinctive pattern of variable contribution to the first canonical variate.

#make a new dataframe
tbl_sc <- tbl_plot

#keep only the structure coefficients variable
tbl_sc <- subset(tbl_sc, select = c(variable, V2))

#make a logical variable for colour of bars
tbl_sc$colour <- as.factor(ifelse(abs(tbl_sc$V2) >= .45, 1, 0))

#Lock in factor-level order
tbl_sc$variable <- factor(tbl_sc$variable, levels=tbl_sc$variable)

#make plot
ggplot(tbl_sc, aes(variable, V2)) + 
  geom_bar(stat='identity', aes(fill = colour)) + 
  theme_bw() +
  theme(axis.ticks=element_blank(), 
        panel.grid.minor = element_blank(), 
        panel.grid.major.x = element_blank(),
        axis.text.x = element_text(angle = 90, hjust = 1, size=5),
        legend.position = "none") + 
  xlab('') + 
  ylab('structure coefficient') + 
  geom_hline(yintercept = .45, linetype="dashed", color = "darkgreen", size=1.5) +
  #annotate(geom='text', x=.87, y = 10, color='green', label = "rs = abs(.45)") +
  geom_hline(yintercept = -.45, linetype="dashed", color = "darkgreen", size=1.5) +
  scale_fill_manual(values=c('grey', 'darkgreen'))

[4] Model redundancy. Here, we calculate the Stewart-Love redundancy index for each variate (i.e., variance in one set that can be explained by the other set, and vice versa). I believe that the result that the X set explains ~15% of the variance in Y is actually quite high for similar brain-behaviour relationships.

#proportion of variance in X explained by Y, and vice versa
redundancy(cca_candisc)

## 
## Redundancies for the X variables & total X canonical redundancy
## 
##     Xcan1     Xcan2     Xcan3     Xcan4     Xcan5     Xcan6     Xcan7     Xcan8 
## 0.0115892 0.0136879 0.0062526 0.0056001 0.0078876 0.0038908 0.0038935 0.0019508 
##     Xcan9    Xcan10    Xcan11    Xcan12 total X|Y 
## 0.0027743 0.0034809 0.0007542 0.0001787 0.0619406 
## 
## Redundancies for the Y variables & total Y canonical redundancy
## 
##     Ycan1     Ycan2     Ycan3     Ycan4     Ycan5     Ycan6     Ycan7     Ycan8 
## 0.0266714 0.0573773 0.0187295 0.0116229 0.0062293 0.0154838 0.0063050 0.0041995 
##     Ycan9    Ycan10    Ycan11    Ycan12 total Y|X 
## 0.0029889 0.0035124 0.0025914 0.0005386 0.1562499

[5] Model stability / validation. Lastly, we validated our model via iterative feature removal, i.e., we removed one of features from the combined p=86 of the \(X\) and \(Y\) sets, ran the CCA, and compared the derived canonical correlation coefficients. This procedure leveraged the CCA property that each variable relates to all other variables in both sets: if the model is stable, canonical correlation estimates will remain similar; if unstable, estimate will vary widely. Values were similar across all iterations, suggesting that our model is stable. Note, however, that this is not a statistical test of stability.

#run the original correlation 
original <- cc(X, Y)$cor

#write a nested function to leave out a column, and calculate correlation
cc_df_one_out <- function(X, Y){ #takes X and Y dataframe or matrix
  f <- function(x) combn(ncol(x), ncol(x) - 1) #base combn function leave out a column ...
  
  X_inx <- f(X) #from X
  Y_inx <- f(Y) #from Y
  
  ccX <- t(apply(X_inx, 2, function(i) cc(X[, i], Y)$cor)) #run the correlation between dropped X and Y
  ccY <- t(apply(Y_inx, 2, function(i) cc(X, Y[, i])$cor)) #run the correlation between dropped Y and X
  
  list(XVALS = as.data.frame(ccX), YVALS = as.data.frame(ccY)) #add the values to a dataframe
}

#run the function
values <- cc_df_one_out(X, Y)

#extract from list and combine the dataframes 
values <- data.frame(Reduce(rbind.fill, values))

#add the original observations as the last row
values <- rbind(values, original)

#add a column indicating that the last row is original -- used to colour the plot
values$original <- c(rep(FALSE, ncol(X)), rep(FALSE, ncol(Y)), TRUE)

#melt for ggplot
values <- reshape2::melt(values)

#boxplot
ggplot(values, aes(x=variable, y=value)) +
  geom_boxplot(outlier.shape = NA) +
  geom_violin(fill='grey', alpha=.5) +
  geom_jitter(aes(colour = original, size= original)) +
  theme_bw() +
  xlab('') +
  theme(axis.ticks = element_blank()) +
  scale_colour_manual(values = c("FALSE" = "black", "TRUE" = "red")) +
  scale_size_manual(values = c("FALSE" = .5, "TRUE" = 5)) +
  guides(color = FALSE, size = FALSE)

SSD subset

Though increasing evidence suggests that neurocognition and social cognition are dimensional constructs on which HC and SSD fall along a continuum, as opposed to discrete classes, we were curious to verify that including only SSD participants would result in similar results.

Correlations

corr_SSD <- matcor(X_SSD, Y_SSD)

plotX <- plotCorr_fn(corr_SSD$Xcor)
plotY <- plotCorr_fn(corr_SSD$Ycor)
plotXY <-  plotCorr_fn(corr_SSD$XYcor)

#arrange side by side
grid.arrange(plotX, plotY, plotXY, ncol=3)

Canonical correlations

Significance testing (F)

n <- nrow(df_SSD)
p_wilks <- p_fn('Wilks')

## Wilks' Lambda, using F-approximation (Rao's F):
##                  stat    approx df1       df2   p.value
## 1 to 12:   0.06355148 1.0955769 276 1026.9543 0.1639705
## 2 to 12:   0.11985696 0.9471032 242  956.8406 0.6952147
## 3 to 12:   0.18969348 0.8473504 210  884.6526 0.9301136
## 4 to 12:   0.28192708 0.7466442 180  810.2257 0.9918733
## 5 to 12:   0.38085824 0.6707212 152  733.3656 0.9986562
## 6 to 12:   0.50257888 0.5731244 126  653.8440 0.9999155
## 7 to 12:   0.59755531 0.5295239 102  571.3956 0.9999390
## 8 to 12:   0.69300388 0.4805914  80  485.7183 0.9999491
## 9 to 12:   0.78074003 0.4325708  60  396.4812 0.9999292
## 10 to 12:  0.84632565 0.4178763  42  303.3463 0.9995060
## 11 to 12:  0.90080416 0.4248508  26  206.0000 0.9941701
## 12 to 12:  0.95532624 0.4052779  12  104.0000 0.9587017

p_hotelling <- p_fn('Hotelling')

##  Hotelling-Lawley Trace, using F-approximation:
##                  stat    approx df1  df2   p.value
## 1 to 12:   3.35635472 1.1086510 276 1094 0.1330061
## 2 to 12:   2.47037258 0.9510594 242 1118 0.6829566
## 3 to 12:   1.88770705 0.8554609 210 1142 0.9220218
## 4 to 12:   1.40148263 0.7565411 180 1166 0.9906571
## 5 to 12:   1.05057216 0.6854062 152 1190 0.9982542
## 6 to 12:   0.73097651 0.5869084 126 1214 0.9998967
## 7 to 12:   0.54199834 0.5481977 102 1238 0.9999127
## 8 to 12:   0.38226657 0.5025213  80 1262 0.9999203
## 9 to 12:   0.25566389 0.4566441  60 1286 0.9998854
## 10 to 12:  0.17165946 0.4461784  42 1310 0.9991995
## 11 to 12:  0.10728884 0.4587286  26 1334 0.9912819
## 12 to 12:  0.04676283 0.4409995  12 1358 0.9471573

p_pillai <- p_fn('Pillai')

##  Pillai-Bartlett Trace, using F-approximation:
##                  stat    approx df1  df2   p.value
## 1 to 12:   2.31579812 1.0812904 276 1248 0.1956201
## 2 to 12:   1.84602583 0.9555941 242 1272 0.6674599
## 3 to 12:   1.47787127 0.8667996 210 1296 0.9048446
## 4 to 12:   1.15071717 0.7778019 180 1320 0.9836842
## 5 to 12:   0.89095868 0.7091476 152 1344 0.9963205
## 6 to 12:   0.64876657 0.6205274 126 1368 0.9996059
## 7 to 12:   0.48982491 0.5807617 102 1392 0.9996935
## 8 to 12:   0.35209325 0.5350361  80 1416 0.9997349
## 9 to 12:   0.23971763 0.4892079  60 1440 0.9996549
## 10 to 12:  0.16222308 0.4776770  42 1464 0.9982182
## 11 to 12:  0.10174545 0.4893970  26 1488 0.9859109
## 12 to 12:  0.04467376 0.4708273  12 1512 0.9323305

plt.asym(p.asym(rho, n, p, q, tstat = "Roy"), rhostart=1)

##  Roy's Largest Root, using F-approximation:
##               stat   approx df1 df2        p.value
## 1 to 1:  0.3420542 4.982202  12 115 0.000001318744
## 
##  F statistic for Roy's Greatest Root is an upper bound.

Significance testing (perm)

Scores

#calculate correlations
rxx <- cor(X_SSD, use = "p") #pairwise complete observations
ryy <- cor(Y_SSD, use = "p")
rxy <- cor(X_SSD,Y_SSD, use = "p")

#calculate omega
omega = t(solve(chol(rxx))) %*% rxy %*% solve(chol(ryy)) 

#decompose matrix
usv <- svd(omega)

#standardized weights (beta)
x.weights.std <- solve(chol(rxx)) %*% usv$u
y.weights.std <- solve(chol(ryy)) %*% usv$v

#structure coefficients
x.structures <- rxx %*% x.weights.std
y.structures <- ryy %*% y.weights.std

#canonical cross-loadings/redundancy
x.scores <- as.matrix(X_SSD) %*% x.weights.std
y.scores <- as.matrix(Y_SSD) %*% y.weights.std

#redundancy (?)
red.x.yscores <- cor(X_SSD, y.scores) 
red.y.xscores <- cor(Y_SSD, x.scores)

#bind together data
scores.plot <- as.data.frame(cbind(x.scores[, 1], y.scores[, 1]))

#make sure numeric
scores.plot[1:2] <- sapply(scores.plot[1:2],as.numeric)

#change names for clarity
names(scores.plot) <- c('Xset', 'Yset')

#make a scatter plot
ggplot(scores.plot, aes(x=Xset, y=Yset)) +
  geom_point(size = 4, alpha=.8) +
  geom_smooth(method=lm, color="black", alpha=.4) +
  theme_bw() +
  theme(axis.ticks=element_blank(), 
        panel.grid.minor = element_blank(), 
        panel.grid.major.x = element_blank(),
        legend.position='top')

Descriptives

        #for table, take just first column
        x.weights.std <- x.weights.std[, 1]
        x.weights.std <- t(data.frame(lapply(x.weights.std, type.convert), stringsAsFactors = F)) #get into df

        y.weights.std <- y.weights.std[, 1]
        y.weights.std <- t(data.frame(lapply(y.weights.std, type.convert), stringsAsFactors = F)) #get into df

        #bind together standardized weights
        weights <- rbind(x.weights.std, y.weights.std)

        #bind together structure coefficients
        structures <- rbind(x.structures, y.structures)

        #make table
        tbl <- as.data.frame(cbind(weights, structures))

        #add rownames as a variable
        tbl <- tibble::rownames_to_column(tbl, "variable")

        #redefine row names with full names
        tbl$variable <- c(names(X_SSD), names(Y_SSD))

        #set up dynamic names for table
        label <- paste("$R_c$=", round(cca_cca$cor[1], 5))
        myHeader <- c(" " = 1, label = 2) # create vector with colspan
        names(myHeader) <- c(" ", label) #set names

        #make into pretty table
        tbl_plot <- tbl

        #take only first 3 columns -- adjust if want more variates
        tbl_plot <- tbl_plot[, 1:3]

        #add a column for the communality coefficient
        tbl_plot$communalities <- tbl_plot$V2 ^ 2

        #round
        tbl_plot[,2:4] <- round(tbl_plot[,2:4], 3) #shorten here, because formatting makes characters (?!)

        tbl_plot %>%
          mutate(V2 = ifelse((V2 > .45 | V2 < -.45),
                  cell_spec(V2, color='black', underline = T, bold = T), #no green
                  cell_spec(V2, color='black'))) %>%
          kable(escape=F,
                align = c('l', 'c', 'c', 'c'),
                col.names = c('_Variable_', '$\\beta$', '$r_s$', '$r_s^2$ ($h^2$)')) %>%
        kable_styling(bootstrap_options=c('striped', 'hover', 'condensed')) %>%
        add_header_above(c(" " = 1, "_Function 1_" = 3)) %>%
        pack_rows('X', 1, 23) %>%
        pack_rows('Y', 24, 35) %>%
        footnote(general = "$r_s$ values > .450 are emphasized, following convention;
                 $\\beta$ = standardized canonical function coefficient;
                 $r_s$ = structure coefficient;
                 $r_s^2$ = squared structure coefficient, here also communality $h^2$")

	Function 1
Variable	\(\beta\)	\(r_s\)	\(r_s^2\) (\(h^2\))
X
AF_left	0.219	0.082	0.007
CB_left	0.014	0.083	0.007
ILF_left	-0.236	0.199	0.040
IOFF_left	-0.423	-0.201	0.041
UF_left	0.040	-0.165	0.027
AF_right	0.036	-0.029	0.001
CB_right	0.107	0.088	0.008
ILF_right	0.870	0.448	0.201
IOFF_right	0.194	-0.048	0.002
UF_right	-0.430	-0.315	0.099
CR_F_left	0.230	0.356	0.126
CR_P_left	0.291	0.197	0.039
TF_left	-0.381	-0.155	0.024
CR_F_right	0.294	0.226	0.051
CR_P_right	-0.128	-0.144	0.021
TF_right	0.175	-0.08	0.006
CC1_commissural	0.091	0.019	0.000
CC2_commissural	-0.153	0.021	0.000
CC3_commissural	-0.189	-0.193	0.037
CC4_commissural	-0.339	-0.173	0.030
CC5_commissural	0.078	0.005	0.000
CC6_commissural	0.174	0.221	0.049
CC7_commissural	-0.302	-0.078	0.006
Y
Processing_speed	-0.172	-0.2	0.040
Attention_vigilance	-0.049	-0.228	0.052
Working_memory	-0.348	-0.235	0.055
Verbal_learning	0.243	0.021	0.000
Visual_learning	-0.475	-0.327	0.107
Problem_solving	0.365	0.266	0.071
TASIT_1	-0.239	-0.146	0.021
TASIT_2	0.783	0.564	0.318
TASIT_3	0.067	0.221	0.049
RMET	-0.267	-0.073	0.005
RAD	0.105	0.128	0.016
ER_40	-0.310	-0.295	0.087
Note:
\(r_s\) values > .450 are emphasized, following convention; \(\beta\) = standardized canonical function coefficient; \(r_s\) = structure coefficient; \(r_s^2\) = squared structure coefficient, here also communality \(h^2\)

Redundancy

cca_candisc <- cancor(X_SSD, Y_SSD)
redundancy(cca_candisc)

## 
## Redundancies for the X variables & total X canonical redundancy
## 
##     Xcan1     Xcan2     Xcan3     Xcan4     Xcan5     Xcan6     Xcan7     Xcan8 
##  0.016803  0.014522  0.013196  0.006606  0.009214  0.004356  0.004425  0.013557 
##     Xcan9    Xcan10    Xcan11    Xcan12 total X|Y 
##  0.003283  0.002182  0.002411  0.002129  0.092683 
## 
## Redundancies for the Y variables & total Y canonical redundancy
## 
##     Ycan1     Ycan2     Ycan3     Ycan4     Ycan5     Ycan6     Ycan7     Ycan8 
##  0.032155  0.027607  0.066995  0.017319  0.014782  0.013747  0.012381  0.007500 
##     Ycan9    Ycan10    Ycan11    Ycan12 total Y|X 
##  0.004819  0.004706  0.003280  0.003730  0.209021