Startup

Packages

library(pacman)
p_load(psych, EGAnet, lavaan, qgraph, semPlot, qpcR, dplyr, tidyr, knitr, ggplot2)

Functions

#From Gunn, Grimm & Edwards (2019)

SDI2.UDI2 = function(p, loading1, loading2, intercept1, intercept2, stdev2, fmean2, fsd2, nodewidth = 0.01, lowerLV = -5, upperLV = 5) {
   LV = seq(lowerLV,upperLV,nodewidth)
   DiffExpItem12 = matrix(NA,length(LV),p)
   pdfLV2 = matrix(NA,length(LV),1)
   SDI2numerator = matrix(NA,length(LV),p)
   UDI2numerator = matrix(NA,length(LV),p)
   SDI2 = matrix(NA,p,1)
   UDI2 = matrix(NA,p,1)
   for(j in 1:p){
    for(k in 1:length(LV)){
      DiffExpItem12[k,j] <- (intercept1[j]-intercept2[j]) + (loading1[j]-loading2[j])*LV[k]
      pdfLV2[k] = dnorm(LV[k], mean=fmean2, sd=fsd2)
      SDI2numerator[k,j] = DiffExpItem12[k,j]*pdfLV2[k]*nodewidth
      UDI2numerator[k,j] = abs(SDI2numerator[k,j])
    }
    SDI2[j] <- sum(SDI2numerator[,j])/stdev2[j]
    UDI2[j] <- sum(UDI2numerator[,j])/stdev2[j]
   }
  colnames(SDI2) = "SDI2"
  colnames(UDI2) = "UDI2"
  return(list(SDI2=round(SDI2,4), UDI2=round(UDI2,4)))}

Purpose

Sample and Procedure

Cockroft, Alloway, Copello & Milligan (2015) utilized multi-group confirmatory factor analysis (MGCFA) in order to investigate measurement invariance (MI) on the Wechsler Adult Intelligence Scale Third Edition (WAIS-III) for British and South African samples.

The British sample comprised 349 monolingual English-speaking undergraduate students of middle-class extraction (33% male; \(\mu_{age}\) = 27.18 years, \(\sigma_{age}\) = 8.74, 18-58; 71% white, 1% black, 18% other, including Asian, Indian, no race indicated, and various international students).

The South African sample was composed of 107 entirely black and multilingual undergraduate students attending a university which taught in English (40% male; \(\mu_{age}\) = 20.79 years, \(\sigma_{age}\) = 1.56, 18-25). 82% of the sample attended primary and secondary schools administered in English, but in all cases, an indigenous African language was the first language and for 65% of the sample it was their second language, with 55% of the sample learning English before the age of six and the entirety learning it before the age of thirteen. The socioeconomic status of the South African sample was markedly different from that of the British middle class, with 82% of the sample derived from a rural area, 98% lacking common household appliances like microwaves and washing machines, fewer than 1% owning an automobile or computer, only 67% attending preschool in their youth, 52% coming from single-parent homes, and 28% never knowing their fathers. 58% of mothers completed high school and 35% enjoyed post-secondary school qualifications.

These samples differed markedly in their life experiences culturally and socioeconomically, making them an excellent sample to assess whether a cognitive test developed in the Anglosphere could still be used without bias in third world conditions markedly different from those it was constructed in. The tests were essentially the same in content with minor alterations to localize the South African version, for example, changing “pounds” to “rands”. The analysis by Claasen et al. (2001) indicated that localizations were unlikely to affect results.

MGCFA is a step-wise modeling effort in which a factor model found to fit well in one group is applied in another and constraints on its parameters are subsequently added in a specific manner at each step. For more information, see the studies by Meredith (1993), Lubke, Dolan, Kelderman & Mellenbergh (2003a, b), Meredith & Teresi (2006), and van de Schoot, Lugtig & Hox (2012). Some psychometricians reverse the steps, constraining residual variances before intercepts (primarily researchers who believe MI is not achieved without constraints on the residual variances), while others consider it important to constrain factor and residual covariances to assess whether equal reliabilities are tenable, between other more commonly recognized stages. In the table below, I present a rendition of the table from an earlier study I authored to lay out these steps; note that where latent means are given by \(\mu\) for reference (r) and focal (f) groups (chosen arbitrarily with no effect on results), \(\mu_r\) - \(\mu_f\) = \(\delta\) and, assuming residual variances (item and factor covariances as well) are identical, the proportion of the group difference due to a particular factor, \(\eta\), can be calculated by computing \(\frac{\lambda_{\eta}\delta}{\Sigma\lambda\delta}\) (see Dolan, 2000).

Measurement Invariance Steps

Model Nesting Common Name(s) Constraints Free Parameters Statement Implication
A1 None Configural, Pattern, or Form Invariance Latent (factor) means and variances The rest \(\lambda_{r} \neq \lambda_{f}, \Psi_{r} \neq \Psi_{f},\Phi_{r} = \Phi_{f}, \Theta_{r} \neq \Theta_{f}, \tau_{r} \neq \tau_{f}, \delta = 0\) Same number of latent variable, indicators, and pattern of constrained and estimated parameters
A2 A1 Metric or Weak Invariance Latent means and factor loadings Intercepts, (all parts of) residuals, latent variances \(\lambda_{r} = \lambda_{f}, \Phi_{r} \neq \Phi_{f}\) Same underlying construct meanings, comparable latent variances and covariances
A3 A2 Scalar or Strong Invariance Intercepts Residuals, latent means and variances \(\tau_{r} = \tau_{f}, \delta \neq 0\) Comparable latent means
A4 A3 Strict (Factorial), Full Uniqueness, or Omnibus Invariance Residuals Latent variances \(\Theta_{r} = \Theta_{f}\) Identical latent variables and reliability
A5 A4 Homogeneity of Latent Variances Latent Variances Latent means \(\Phi_{r} = \Phi_{f}\) Groups use equivalent ranges of the latent construct
A6 A5 Homogeneity of Factor Means Latent Means - \(\delta = 0\) No difference between the groups in the level of the latent construct

A lack of bias is given by Mellenbergh (1989) as

\[f(x|\eta) = f(x|\eta,v)\]

where x is an observed score and v is a grouping variable. Based on this theoretical definition, if members of two different groups possess the same level of \(\eta\), they will receive the same score for the observed variable, x; on the other hand, if bias is found and MI is not tenable, \(\eta\) will not explain scores on x to an equal extent, or in severe cases, at all, in focal group members. This definition of bias - which undergirds MGCFA tests for MI and IRT-based tests for differential item functioning (DIF; conceptually equivalent concepts at different levels of inspection) - is important to understand, as it suggests a mean difference in observed scores is not more than prima facie evidence of bias when it is also assumed that groups have the same underlying level of ability.

Problems with the Cockroft et al. (2015) analysis

Despite utilizing MGCFA in the same fashion and for the same purpose as the psychometricians - who embrace the above-mentioned definition of bias - they cited, Cockroft et al. (2015) described their results with an alternative definition by which bias is a mean difference and a failure to achieve MI is evidence that the bias is against the lower-scoring group and “Eurocentric” in nature. This reasoning is circular. To avoid improper attributions, I will simply quote them; all emphasis is mine.

The UK group significantly outperformed the SA group on the knowledge-based verbal, and some non-verbal subtests, while the SA group performed significantly better on measures of Processing Speed (PS). The groups did not differ significantly on the Matrix Reasoning subtest and on those working memory subtests with minimal reliance on language, which appear to be the least culturally biased.

An integrated interpretation of the statistical results pointed to three findings of interest. (1) There was no evidence of cultural biases in the Matrix Reasoning subtest or in the WM subtests that had minimal reliance on language. (2) All of the verbal and most non-verbal subtests, as well as the PS subtests, showed evidence of cross-cultural differences. (3) The SA and UK samples’ scores revealed different factor structures.

The finding of equivalent performance between the SA and UK samples on the subtests of the WMI suggests that these may be the fairest measures of intellectual ability across culturally, linguistically, and socio-economically diverse groups.

WM tests are designed to be equally unfamiliar to all participants or draw on well-learned material, such as numbers and letters, they are less likely to confer obvious advantages or disadvantages to individuals from vastly differing backgrounds and are relatively uninfluenced by environmental factors such type and quality of education, SES and linguistic background…. This is particularly likely in the case of the LNS and Digit Span subtests of the WMI, which assess non-semantic, auditory aspects of WM. The LNS subtest of the WMI has been found to be the least vulnerable to socio-cultural effects in black Africans, with both advantaged and disadvantaged secondary and primary school education.

[The arithmetic subtest’s] verbal content may mean that this subtest also holds some cultural biases and this may account for the differences in factor loadings between UK and SA samples on the WMI found in Step 1 of the MGCFA.

A similar explanation to that given for the WMI subtests would account for the absence of cross-cultural differences on the Matrix Reasoning subtest, which was added to the WAIS-III to ‘enhance the assessment of non-verbal, fluid reasoning’…. The construct validity of the Matrix Reasoning subtest, however, has not been well-studied and the findings from the current study provide further evidence of its potential as a culture-fair measure.

Despite attempts to broaden the measurement of unbiased, fluid reasoning processes in the WAIS-III…, the majority of its subtests appear to show cross-cultural effects. Given previous evidence, it was unsurprising that the verbal subtests showed the highest degree of cultural difference. Cultural biases were also evident in two of the PO subtests (Block Design and Picture Completion), as well as Picture Arrangement…. This supports the view that non-verbal tests are not necessarily culturally fairer than verbal ones

Most differences between the UK and SA samples in the current study were found on verbal and non-verbal measures that draw on acquired knowledge and learned problem solving strategies.

The significant difference between the SA and UK samples in terms of PS, in favor of the SA sample was surprising given a body of research indicating that PS is an area where individuals of African descent underperform relative to Caucasians and Hispanics from Westernized, urbanized countries…. The reasons for the contrary finding in the current study are most likely a result of multiple interacting factors, which may include sampling and socio-cultural changes in SA over the last 20 years, including the Flynn effect.

Another issue that may have influenced the results on the PSI are socio-cultural ones resulting from a changed SA society since the ending of apartheid 20 years ago. As a consequence of these changes, the ‘born free’ generation (children born after the end of apartheid, as is the case in the current SA sample) are less different to Western populations than previous generations of black South Africans. [The remainder of this paragraphed described SES and alleged self-esteem changes.]

Several of the socio-cultural factors mentioned above are associated with the Flynn effect, which are reported to be strongest on fluid measures (e.g., Matrix Reasoning). Since WM and PS are strongly related to fluid ability, concomitant influences could also be expected in these areas.

Interestingly, a four factor structure could not be replicated with the UK sample.

In conclusion, the findings from this study add to existing evidence that the majority of the subtests in the WAIS-III hold cross-cultural biases. These are most evident in tasks which tap crystallized, long-term learning, irrespective of whether the format is verbal or non-verbal. This challenges the view that visuo-spatial and non-verbal tests tend to be culturally fairer than verbal ones…. Subtests tapping PS also appear to hold biases. Differences on the latter subtests may reflect cultural and/or experiential background. Of value for theory and practice, those subtests tapping WM and fluid reasoning (Matrix Reasoning) showed no cross-cultural effects. Consequently, one of the most effective ways of tapping universal cognitive functions may be via WM and this warrants further investigation. Identification of tests that do not favor individuals from Eurocentric and favorable SES circumstances with advantaged educational backgrounds is valuable in providing direction for the development of culture fair tests…. [M]easures of WM offer promise in the search for fairer assessment practices with individuals from diverse backgrounds.

In addition to these quotes, there are quite a few points in which the authors are semantically confused despite, perhaps, knowing in general what they wanted to do with their study. For instance, they stated that MGCFA is used in “order to detect potential measurement invariance,” when this is not the case: it is used to detect violations of MI; if MI is tenable in its strictest form, it should be statistically undetectable at each of the nesting levels (given some level of \(\Delta\)goodness-of-fit criteria). Potentially more problematic for their analysis was the extreme neglect of prior research on ability differences on their chosen measurement instrument and tests like it. For example, they stated that the smaller working memory differences were evidence of those tests being unbiased, but minor working memory differences in favor of blacks are observed with MI in the U.S.; moreover, verbal advantages have been found to largely dissipate and spatial differences to remain at a large size without bias, net of general intelligence, g (e.g., Jensen, 1998; Frisby & Beaujean, 2015). There was no attempt to incorporate these empirical regularities in their “integrated interpretation” of their results.

They also confuse exploratory differences with confirmatory ones without proper testing. As an example, they claimed different factor structures on the basis of principal components analyses without a proper test to follow. It can be argued that the application of the structure from the Wechsler theory in MGCFA is evidence that they were correct, but this an error without further analysis since it was never ascertained that the differing parameters were those which differed as indicated by PCA. In fact, no partial invariance model was assessed at all. (Note: the South African sample’s factor structure is more dubious since they were the smaller sample and thus should produce less reliable loadings. On the other hand, the much greater age range in the UK sample could have altered their factor structure in a biasing way, but the authors did not assess MI with respect to age using any of the common methods - like moderated linear and nonlinear factor analysis or local structural equation modeling - to do so, making this impossible to assess on the basis of the data they provided.)

Partial invariance is the state in which a subset of the parameters - those which are biased - are freed in order to achieve a less complete, but often sufficient, level of MI in the sample. This allows researchers additional flexibility in analyses where some of the model parameters may not be equivalent but there are still enough parameters to accurately assess latent mean differences (see Steenkamp & Baumgartner, 1998). In recent years, Bayesian procedures for assessing partial MI (‘approximate MI’) have been developed, but for the present purposes, I will stick to partial MI, since numerous effect size indices have been developed for it (there are CFA-based methods to detect the effect of bias in loadings and intercepts as well and they can be applied to approximate solutions, which would improve fit and reduce the extent of estimated bias). These effect size indices allow researchers to estimate the effect of biased loadings and intercepts (and in the near future, residual variances) on observed score differences using the differences in the released parameters, helping to move MI research away from dichotomous thinking about bias (i.e., ‘bias detected = test unusable’ to ‘bias detected, measure it, and if it’s severe, the test is unusable, but if not, it may be salavageable’).

Somewhat less importantly than moving the thinking of the field, though, is that they’ve allowed a more granular, nuanced view of how bias actually appears in the real world and what it does to group mean and validity comparisons. One of the most surprising findings that has become common is that it does next to nothing in most cases. For example, Roznowski & Reith (1999), analyzing the SAT, found that DIF did not lead to systematic predictive or mean bias. Likewise, Brody (1992, p. 295-296), commenting on the results of two DIF studies which found that removing items biased against blacks and against whites led to practically no effect of bias on mean differences, stated that “Bias in items on standard tests exists if one searches for their presence using large samples and the most sophisticated psychometric techniques. The degree of bias is slight and adjustments for the presence of bias have negligible effects on the magnitude of black-white differences in performance on tests of achievement or tests of ability.” Using the unsigned version of the bias effect size index \(d_{MACS}\), Asil & Brown (2015) found an imperfect but strong relationship between the size of this metric and the size of observed bias in terms of CFI. In Millsap & Kwok’s (2004) assessment of the effects of partial invariance on selection, they found that the effect of half the items or, even, fewer than 10% of the items being invariant, on selection was not that great (cf. Culpepper et al., 2019). The point of all of these is that there is no basis for an a priori expected direction or magnitude of effect for bias, even when there are notable mean differences (outside of select scenarios such as total illiteracy or severe mental retardation). In order to determine anything about bias, it must first be assessed where possible, and if it’s found that some bias, though statistically detectable, does not affect mean or validity estimates, then hypotheses of the sort that it’s against any particular group due to differences in culture, socioeconomic differences, or what-say-you, must be rejected; the explanation lies in the sources of the differences in performance, but the differences in performance are real. As a simple rule-of-thumb for researchers, if the validity of a test is not destroyed within a group, you should not assume the direction of bias in a group comparison.

It is unfortunate for their own study that Cockroft et al. (2015) made so many conclusions without assessing whether they were correct when this option was readily available to them. Among their quotes, Cockroft et al. (2015) suggested - outright or implicitly - numerous testable claims. For example:

  1. Because the bias is alleged to favor “Eurocentric” samples with “favorable SES”, the bias should be in favor of the British sample.
  2. Bias should be smallest for the working memory and processing speed tests.
  3. Bias should be largest for the verbal and spatial tests.
  4. Because the factor structure is supposed to differ as indicated by the PCA results, using either the African or British factor structures as the basis for MGCFA should lead to qualitatively different results when partial MI is sought.
    • The British-derived cross-loading for the comprehension subtest should not be significant in the South African sample (and it should lead to a very large parameter change in the partial MI model).
    • Loadings and latent variances should be biased for the extra factor’s indicators if the number of factors actually differs.
    • The cross-loading is not an artefact of the method used to determine the number and configuration of factors.
  5. Though it’s unavailable for testing because they did not provide raw data, if the problem is socioeconomic in origin, using (an appropriate measure of) SES as the moderator in an LSEM/MNLFA should fail MI in both samples.
    • Conditioning on SES as a background variable in an MGCFA should ameliorate MI violations.
    • The combined groups should not be distinguishable, when aggregated by SES, in an LSEM if the moderator (of ‘cultural’ differences as well) is SES.

It is sad that the full raw data were not made available since matrix inputs with low degrees of precision - especially when the groups have quite different sample sizes and thus, just as well, different precisions - can artefactually affect results. Nonetheless, I sought to assess some of these claims. In particular, I assessed 1, 2, 3, and 4 and because of the lack of raw data, the required measures, or appropriate age- or SES-stratified matrices, I could not assess 5.

In order to match their analysis, I used the same goodness-of-fit exclusions as the authors (and if anyone wants different metrics, they can use their own), opting away from using the \(\chi^2\) exact fit test, and clearly not using AIC since they calculated it incorrectly. I used typical guidelines for CFI and RMSEA, where \(\geq\) 0.01 \(\Delta\)CFI coupled with \(\geq\) 0.015 \(\Delta\)RMSEA signifies the presence of an MI violation. This was also the criteria used for assessing the partial MI models, where the freed parameters were those ones which produced the largest improvements in fit until the model fit according to the goodness-of-fit guidelines (there is unfortunately little discussion of using information theoretic metrics for partial MI model selection). I used the South African and British PCA-derived factor structures for the test in order to test proposition 4, but also because the theoretically specified WAIS-III factor configuration they used was unavailable to me (and, curiously, the authors did not plot it or otherwise provide it). EFA is preferable to PCA but the method of determining the number of factors used for both is insufficient on their own, so I applied exploratory graph analysis (EGA) to the data as well. For the confirmatory models, for the sake of making a valid comparison and testing their implied conjectures, I used the correlated group factor (or measurement model) structures provided based on the PCA by Cockroft et al. (2015); after these, I also assessed models with a higher-order and bifactor g and a network model. Since the number of loadings will differ for the models based on the British and South African exploratory results, I will compare them with Schwarz weights (see Wagenmakers & Farrell, 2004; I prefer these to AIC because, while obviously correlated, BIC is more apt for selecting parsimonious models and differentiating among known rather than all possible models); their formula is

\[\frac{w_{A2}(BIC)}{w_{A1}(BIC)+w_{A2}(BIC)}\]

Analysis

Data Input

#Correlation matrices

lowerUK = '
1.00                                                
0.71    1.00                                            
0.32    0.37    1.00                                        
0.10    0.14    0.38    1.00                                    
0.65    0.60    0.35    0.14    1.00                                
0.61    0.65    0.51    0.45    0.61    1.00                            
0.17    0.26    0.46    0.61    0.22    0.38    1.00                        
0.23    0.27    0.34    0.17    0.27    0.25    0.24    1.00                    
0.05    0.05    0.18    0.26    0.05    0.08    0.24    0.26    1.00                
0.32    0.36    0.46    0.32    0.29    0.41    0.36    0.44    0.33    1.00            
0.31    0.39    0.48    0.25    0.35    0.38    0.35    0.49    0.30    0.55    1.00        
0.28    0.29    0.42    0.33    0.31    0.38    0.39    0.65    0.50    0.66    0.69    1.00    
0.15    0.12    0.28    0.36    0.12    0.23    0.31    0.33    0.77    0.43    0.39    0.54    1.00'

lowerSA = '
1.00                                                
0.52    1.00                                            
0.43    0.41    1.00                                        
0.21    0.18    0.32    1.00                                    
0.62    0.56    0.52    0.15    1.00                                
0.60    0.53    0.41    0.31    0.56    1.00                            
0.23    0.20    0.39    0.45    0.18    0.19    1.00                        
0.19    0.25    0.20    0.09    0.25    0.19    0.19    1.00                    
0.09    0.16    0.13    0.16    0.03    -0.04   0.20    0.08    1.00                
0.20    0.28    0.34    0.36    0.33    0.21    0.24    0.42    0.28    1.00            
0.06    0.07    0.21    0.19    0.09    0.17    0.17    0.39    0.07    0.40    1.00        
0.24    0.14    0.31    0.14    0.35    0.26    0.10    0.30    0.12    0.26    0.21    1.00    
0.09    0.12    0.18    0.22    0.06    0.06    0.25    0.24    0.45    0.39    0.25    0.19    1.00'

#Variable names

Names = list("V", "S", "A", "DS", "I", "C", "LNS", "PC", "DSC", "BD", "MR", "PA", "SS")

#Convert to variance-covariance matrices

WAISIIIW.cor = getCov(lowerUK, names = Names)
WAISIIIB.cor = getCov(lowerSA, names = Names)
UKSDs <- c(2.66, 3.04, 3.14, 2.94, 2.72, 2.59, 2.72, 3.13, 2.68, 2.87, 2.71, 2.26, 2.62)
SASDs <- c(2.02, 1.4, 2.53, 2.26, 3.21, 2.5, 2.42, 2.11, 2.51, 2.61, 2.37, 2.82, 2.67)
WAISIIIW.cov = lavaan::cor2cov(R = WAISIIIW.cor, sds = UKSDs)
WAISIIIB.cov = lavaan::cor2cov(R = WAISIIIB.cor, sds = SASDs)

#Means

Wmeans = c(14.56, 14.22, 9.58, 9.5, 11.75, 12.48, 8.88, 10.57, 7.72, 10.66, 10.59, 10.08, 8.25)
Bmeans = c(8.92, 8.12, 8.66, 9.35, 10.19, 8.99, 9.17, 8.05, 9.75, 8.67, 10.68, 8.33, 9.71)

#Group inputs

GCovs <- list(WAISIIIW.cov, WAISIIIB.cov)
GMeans <- list(Wmeans, Bmeans)
GNs <- list(349, 107)

#Fit measures

FITM <- c("chisq", "df", "nPar", "cfi", "rmsea", "rmsea.ci.lower", "rmsea.ci.upper")

Exploratory Models: EFA, PCA, and EGA

fa.parallel(WAISIIIW.cor, n.obs = 349)

## Parallel analysis suggests that the number of factors =  4  and the number of components =  3
fa.parallel(WAISIIIB.cor, n.obs = 107)
## Warning in fa.stats(r = r, f = f, phi = phi, n.obs = n.obs, np.obs = np.obs, :
## The estimated weights for the factor scores are probably incorrect. Try a
## different factor score estimation method.

## Parallel analysis suggests that the number of factors =  2  and the number of components =  2

Parallel analysis on matrices like this is likely to underestimate the true number of components or factors, so it’s probably the case that \(\lambda\) and clustering more reliably indicate the number of factors.

#Check their PCA results

PCAW1 <- pca(WAISIIIW.cor, n.obs = 349)
PCAW2 <- pca(WAISIIIW.cor, n.obs = 349, nfactors = 2)
PCAW3 <- pca(WAISIIIW.cor, n.obs = 349, nfactors = 3)
PCAW4 <- pca(WAISIIIW.cor, n.obs = 349, nfactors = 4)

PCAB1 <- pca(WAISIIIB.cor, n.obs = 107)
PCAB2 <- pca(WAISIIIB.cor, n.obs = 107, nfactors = 2)
PCAB3 <- pca(WAISIIIB.cor, n.obs = 107, nfactors = 3)
PCAB4 <- pca(WAISIIIB.cor, n.obs = 107, nfactors = 4)

print.psych(PCAW1, cut = 0.4, sort = T, digits = 3)
## Principal Components Analysis
## Call: principal(r = r, nfactors = nfactors, residuals = residuals, 
##     rotate = rotate, n.obs = n.obs, covar = covar, scores = scores, 
##     missing = missing, impute = impute, oblique.scores = oblique.scores, 
##     method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
## Standardized loadings (pattern matrix) based upon correlation matrix
##      V   PC1    h2    u2 com
## PA  12 0.789 0.622 0.378   1
## MR  11 0.734 0.539 0.461   1
## BD  10 0.731 0.534 0.466   1
## C    6 0.725 0.525 0.475   1
## A    3 0.678 0.460 0.540   1
## S    2 0.636 0.405 0.595   1
## PC   8 0.603 0.363 0.637   1
## I    5 0.603 0.363 0.637   1
## V    1 0.595 0.354 0.646   1
## LNS  7 0.591 0.350 0.650   1
## SS  13 0.586 0.343 0.657   1
## DS   4 0.526 0.277 0.723   1
## DSC  9 0.463 0.214 0.786   1
## 
##                  PC1
## SS loadings    5.350
## Proportion Var 0.412
## 
## Mean item complexity =  1
## Test of the hypothesis that 1 component is sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.154 
##  with the empirical chi square  1298.374  with prob <  1.02e-228 
## 
## Fit based upon off diagonal values = 0.845
print.psych(PCAW2, cut = 0.4, sort = T, digits = 3)
## Principal Components Analysis
## Call: principal(r = r, nfactors = nfactors, residuals = residuals, 
##     rotate = rotate, n.obs = n.obs, covar = covar, scores = scores, 
##     missing = missing, impute = impute, oblique.scores = oblique.scores, 
##     method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item   RC1    RC2    h2    u2  com
## PA    12 0.815        0.729 0.271 1.19
## SS    13 0.808        0.656 0.344 1.01
## DSC    9 0.755        0.604 0.396 1.12
## BD    10 0.664        0.559 0.441 1.50
## MR    11 0.639        0.552 0.448 1.63
## PC     8 0.578        0.395 0.605 1.36
## LNS    7 0.553        0.373 0.627 1.42
## DS     4 0.547        0.326 0.674 1.18
## A      3 0.488  0.474 0.463 0.537 2.00
## S      2        0.844 0.729 0.271 1.04
## V      1        0.834 0.703 0.297 1.02
## I      5        0.805 0.662 0.338 1.04
## C      6        0.793 0.709 0.291 1.25
## 
##                         RC1   RC2
## SS loadings           4.029 3.431
## Proportion Var        0.310 0.264
## Cumulative Var        0.310 0.574
## Proportion Explained  0.540 0.460
## Cumulative Proportion 0.540 1.000
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 2 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.092 
##  with the empirical chi square  463.616  with prob <  6.2e-67 
## 
## Fit based upon off diagonal values = 0.945
print.psych(PCAW3, cut = 0.4, sort = T, digits = 3)
## Principal Components Analysis
## Call: principal(r = r, nfactors = nfactors, residuals = residuals, 
##     rotate = rotate, n.obs = n.obs, covar = covar, scores = scores, 
##     missing = missing, impute = impute, oblique.scores = oblique.scores, 
##     method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item   RC1    RC2    RC3    h2    u2  com
## PA    12 0.834               0.796 0.204 1.30
## SS    13 0.762               0.664 0.336 1.28
## DSC    9 0.749               0.628 0.372 1.24
## PC     8 0.673               0.533 0.467 1.34
## MR    11 0.655               0.602 0.398 1.75
## BD    10 0.634               0.575 0.425 1.86
## V      1        0.847        0.730 0.270 1.04
## S      2        0.844        0.738 0.262 1.07
## I      5        0.809        0.675 0.325 1.06
## C      6        0.739  0.448 0.761 0.239 1.71
## DS     4               0.876 0.797 0.203 1.08
## LNS    7               0.803 0.715 0.285 1.22
## A      3        0.416  0.504 0.521 0.479 2.63
## 
##                         RC1   RC2   RC3
## SS loadings           3.349 3.275 2.109
## Proportion Var        0.258 0.252 0.162
## Cumulative Var        0.258 0.510 0.672
## Proportion Explained  0.383 0.375 0.241
## Cumulative Proportion 0.383 0.759 1.000
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 3 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.072 
##  with the empirical chi square  281.413  with prob <  3.45e-37 
## 
## Fit based upon off diagonal values = 0.966
print.psych(PCAW4, cut = 0.4, sort = T, digits = 3)
## Principal Components Analysis
## Call: principal(r = r, nfactors = nfactors, residuals = residuals, 
##     rotate = rotate, n.obs = n.obs, covar = covar, scores = scores, 
##     missing = missing, impute = impute, oblique.scores = oblique.scores, 
##     method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    RC2    RC1    RC3    RC4    h2    u2  com
## V      1  0.884                      0.804 0.196 1.06
## S      2  0.840                      0.759 0.241 1.15
## I      5  0.812                      0.701 0.299 1.12
## C      6  0.741         0.444        0.776 0.224 1.77
## PC     8         0.811               0.679 0.321 1.07
## PA    12         0.793               0.829 0.171 1.65
## MR    11         0.756               0.686 0.314 1.42
## BD    10         0.656               0.608 0.392 1.89
## DS     4                0.862        0.798 0.202 1.15
## LNS    7                0.811        0.728 0.272 1.22
## A      3         0.450  0.535        0.583 0.417 2.58
## DSC    9                       0.914 0.887 0.113 1.13
## SS    13                       0.866 0.869 0.131 1.33
## 
##                         RC2   RC1   RC3   RC4
## SS loadings           2.965 2.780 2.090 1.870
## Proportion Var        0.228 0.214 0.161 0.144
## Cumulative Var        0.228 0.442 0.603 0.747
## Proportion Explained  0.305 0.286 0.215 0.193
## Cumulative Proportion 0.305 0.592 0.807 1.000
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 4 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.052 
##  with the empirical chi square  149.439  with prob <  4.27e-17 
## 
## Fit based upon off diagonal values = 0.982
print.psych(PCAB1, cut = 0.4, sort = T, digits = 3)
## Principal Components Analysis
## Call: principal(r = r, nfactors = nfactors, residuals = residuals, 
##     rotate = rotate, n.obs = n.obs, covar = covar, scores = scores, 
##     missing = missing, impute = impute, oblique.scores = oblique.scores, 
##     method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
## Standardized loadings (pattern matrix) based upon correlation matrix
##      V   PC1     h2    u2 com
## I    5 0.720 0.5190 0.481   1
## A    3 0.706 0.4985 0.502   1
## V    1 0.679 0.4614 0.539   1
## C    6 0.677 0.4577 0.542   1
## S    2 0.661 0.4371 0.563   1
## BD  10 0.626 0.3919 0.608   1
## DS   4 0.501 0.2506 0.749   1
## PC   8 0.497 0.2474 0.753   1
## LNS  7 0.497 0.2471 0.753   1
## PA  12 0.485 0.2354 0.765   1
## SS  13 0.403 0.1628 0.837   1
## MR  11 0.400 0.1602 0.840   1
## DSC  9       0.0843 0.916   1
## 
##                  PC1
## SS loadings    4.153
## Proportion Var 0.319
## 
## Mean item complexity =  1
## Test of the hypothesis that 1 component is sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.127 
##  with the empirical chi square  270.502  with prob <  6.92e-27 
## 
## Fit based upon off diagonal values = 0.806
print.psych(PCAB2, cut = 0.4, sort = T, digits = 3)
## Principal Components Analysis
## Call: principal(r = r, nfactors = nfactors, residuals = residuals, 
##     rotate = rotate, n.obs = n.obs, covar = covar, scores = scores, 
##     missing = missing, impute = impute, oblique.scores = oblique.scores, 
##     method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    RC1    RC2    h2    u2  com
## I      5  0.838        0.710 0.290 1.02
## V      1  0.809        0.658 0.342 1.01
## C      6  0.804        0.650 0.350 1.01
## S      2  0.731        0.551 0.449 1.07
## A      3  0.628        0.510 0.490 1.54
## PA    12               0.239 0.761 2.00
## SS    13         0.729 0.534 0.466 1.01
## BD    10         0.709 0.563 0.437 1.24
## MR    11         0.580 0.341 0.659 1.02
## DSC    9         0.573 0.333 0.667 1.03
## PC     8         0.511 0.317 0.683 1.41
## LNS    7         0.493 0.305 0.695 1.48
## DS     4         0.483 0.301 0.699 1.54
## 
##                         RC1   RC2
## SS loadings           3.310 2.704
## Proportion Var        0.255 0.208
## Cumulative Var        0.255 0.463
## Proportion Explained  0.550 0.450
## Cumulative Proportion 0.550 1.000
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 2 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.091 
##  with the empirical chi square  136.74  with prob <  2.5e-09 
## 
## Fit based upon off diagonal values = 0.902
print.psych(PCAB3, cut = 0.4, sort = T, digits = 3)
## Principal Components Analysis
## Call: principal(r = r, nfactors = nfactors, residuals = residuals, 
##     rotate = rotate, n.obs = n.obs, covar = covar, scores = scores, 
##     missing = missing, impute = impute, oblique.scores = oblique.scores, 
##     method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    RC1    RC2    RC3    h2    u2  com
## I      5  0.821               0.720 0.280 1.14
## V      1  0.810               0.666 0.334 1.03
## C      6  0.797               0.650 0.350 1.05
## S      2  0.727               0.558 0.442 1.11
## A      3  0.618               0.528 0.472 1.74
## LNS    7         0.668        0.514 0.486 1.30
## DSC    9         0.667        0.461 0.539 1.07
## DS     4         0.644        0.489 0.511 1.35
## SS    13         0.631        0.549 0.451 1.68
## PC     8                0.765 0.613 0.387 1.09
## MR    11                0.723 0.543 0.457 1.08
## BD    10         0.424  0.610 0.589 0.411 2.02
## PA    12                0.534 0.375 0.625 1.57
## 
##                         RC1   RC2   RC3
## SS loadings           3.182 2.044 2.030
## Proportion Var        0.245 0.157 0.156
## Cumulative Var        0.245 0.402 0.558
## Proportion Explained  0.439 0.282 0.280
## Cumulative Proportion 0.439 0.720 1.000
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 3 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.086 
##  with the empirical chi square  124.382  with prob <  4.4e-10 
## 
## Fit based upon off diagonal values = 0.911
print.psych(PCAB4, cut = 0.4, sort = T, digits = 3)
## Principal Components Analysis
## Call: principal(r = r, nfactors = nfactors, residuals = residuals, 
##     rotate = rotate, n.obs = n.obs, covar = covar, scores = scores, 
##     missing = missing, impute = impute, oblique.scores = oblique.scores, 
##     method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    RC1    RC3    RC4    RC2    h2    u2  com
## I      5  0.839                      0.742 0.258 1.11
## V      1  0.818                      0.685 0.315 1.05
## S      2  0.762                      0.616 0.384 1.12
## C      6  0.755                      0.660 0.340 1.33
## A      3  0.571         0.406        0.541 0.459 2.16
## MR    11         0.771               0.657 0.343 1.21
## PC     8         0.759               0.614 0.386 1.13
## BD    10         0.595               0.589 0.411 2.36
## PA    12         0.507               0.389 0.611 1.93
## DS     4                0.821        0.709 0.291 1.11
## LNS    7                0.775        0.654 0.346 1.18
## DSC    9                       0.887 0.796 0.204 1.02
## SS    13                       0.731 0.659 0.341 1.47
## 
##                         RC1   RC3   RC4   RC2
## SS loadings           3.080 1.995 1.680 1.557
## Proportion Var        0.237 0.153 0.129 0.120
## Cumulative Var        0.237 0.390 0.520 0.639
## Proportion Explained  0.371 0.240 0.202 0.187
## Cumulative Proportion 0.371 0.611 0.813 1.000
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 4 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.075 
##  with the empirical chi square  93.875  with prob <  5.41e-08 
## 
## Fit based upon off diagonal values = 0.933

The PCA loadings are very marginally and clearly insignificantly different; the discrepancy is so small that it probably doesn’t matter and reflects only their use of raw data versus my having to use matrix inputs. The error thrown by the parallel analysis did not seem to show up. If I use a threshold of 0.3 for the loadings, the results differ more markedly; I could not tell why they only used “factor loadings of 0.40 or greater” when 0.3 is more typical, but I stuck to their choices in order to assess the model like they did.

#Compare PCA to EFA

FAW1 <- fa(WAISIIIW.cor, n.obs = 349)
FAW2 <- fa(WAISIIIW.cor, n.obs = 349, nfactors = 2)
## Loading required namespace: GPArotation
FAW3 <- fa(WAISIIIW.cor, n.obs = 349, nfactors = 3)
FAW4 <- fa(WAISIIIW.cor, n.obs = 349, nfactors = 4)

FAB1 <- fa(WAISIIIB.cor, n.obs = 107)
FAB2 <- fa(WAISIIIB.cor, n.obs = 107, nfactors = 2)
FAB3 <- fa(WAISIIIB.cor, n.obs = 107, nfactors = 3)
FAB4 <- fa(WAISIIIB.cor, n.obs = 107, nfactors = 4)

print.psych(FAW1, cut = 0.4, sort = T, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = WAISIIIW.cor, n.obs = 349)
## Standardized loadings (pattern matrix) based upon correlation matrix
##      V   MR1    h2    u2 com
## PA  12 0.775 0.600 0.400   1
## MR  11 0.708 0.501 0.499   1
## BD  10 0.704 0.495 0.505   1
## C    6 0.693 0.480 0.520   1
## A    3 0.640 0.410 0.590   1
## S    2 0.593 0.352 0.648   1
## PC   8 0.560 0.314 0.686   1
## I    5 0.557 0.311 0.689   1
## V    1 0.549 0.302 0.698   1
## LNS  7 0.544 0.296 0.704   1
## SS  13 0.538 0.290 0.710   1
## DS   4 0.477 0.228 0.772   1
## DSC  9 0.416 0.173 0.827   1
## 
##                  MR1
## SS loadings    4.751
## Proportion Var 0.365
## 
## Mean item complexity =  1
## Test of the hypothesis that 1 factor is sufficient.
## 
## The degrees of freedom for the null model are  78  and the objective function was  7.092 with Chi Square of  2431.439
## The degrees of freedom for the model are 65  and the objective function was  3.162 
## 
## The root mean square of the residuals (RMSR) is  0.147 
## The df corrected root mean square of the residuals is  0.161 
## 
## The harmonic number of observations is  349 with the empirical chi square  1171.742  with prob <  1.27e-202 
## The total number of observations was  349  with Likelihood Chi Square =  1082.072  with prob <  3.07e-184 
## 
## Tucker Lewis Index of factoring reliability =  0.4804
## RMSEA index =  0.2117  and the 90 % confidence intervals are  0.201 0.2233
## BIC =  701.492
## Fit based upon off diagonal values = 0.86
## Measures of factor score adequacy             
##                                                     MR1
## Correlation of (regression) scores with factors   0.946
## Multiple R square of scores with factors          0.894
## Minimum correlation of possible factor scores     0.788
print.psych(FAW2, cut = 0.4, sort = T, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = WAISIIIW.cor, nfactors = 2, n.obs = 349)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    MR1    MR2    h2    u2  com
## PA    12  0.805        0.728 0.272 1.04
## SS    13  0.795        0.570 0.430 1.07
## DSC    9  0.737        0.476 0.524 1.23
## BD    10  0.593        0.507 0.493 1.31
## MR    11  0.566        0.501 0.499 1.43
## PC     8  0.502        0.332 0.668 1.20
## LNS    7  0.455        0.300 0.700 1.32
## DS     4  0.449        0.248 0.752 1.12
## A      3               0.396 0.604 2.00
## S      2         0.824 0.669 0.331 1.00
## V      1         0.804 0.619 0.381 1.01
## I      5         0.751 0.562 0.438 1.00
## C      6         0.750 0.661 0.339 1.08
## 
##                         MR1   MR2
## SS loadings           3.504 3.064
## Proportion Var        0.270 0.236
## Cumulative Var        0.270 0.505
## Proportion Explained  0.533 0.467
## Cumulative Proportion 0.533 1.000
## 
##  With factor correlations of 
##       MR1   MR2
## MR1 1.000 0.352
## MR2 0.352 1.000
## 
## Mean item complexity =  1.2
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  78  and the objective function was  7.092 with Chi Square of  2431.439
## The degrees of freedom for the model are 53  and the objective function was  1.542 
## 
## The root mean square of the residuals (RMSR) is  0.077 
## The df corrected root mean square of the residuals is  0.094 
## 
## The harmonic number of observations is  349 with the empirical chi square  326.612  with prob <  4.85e-41 
## The total number of observations was  349  with Likelihood Chi Square =  526.608  with prob <  3.3e-79 
## 
## Tucker Lewis Index of factoring reliability =  0.7026
## RMSEA index =  0.16  and the 90 % confidence intervals are  0.148 0.1728
## BIC =  216.289
## Fit based upon off diagonal values = 0.961
## Measures of factor score adequacy             
##                                                     MR1   MR2
## Correlation of (regression) scores with factors   0.943 0.940
## Multiple R square of scores with factors          0.889 0.884
## Minimum correlation of possible factor scores     0.777 0.768
print.psych(FAW3, cut = 0.4, sort = T, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = WAISIIIW.cor, nfactors = 3, n.obs = 349)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    MR1    MR2    MR3    h2    u2  com
## PA    12  0.885               0.818 0.182 1.02
## SS    13  0.709               0.548 0.452 1.22
## DSC    9  0.694               0.467 0.533 1.32
## MR    11  0.611               0.533 0.467 1.33
## PC     8  0.603               0.397 0.603 1.18
## BD    10  0.574               0.509 0.491 1.31
## S      2         0.815        0.677 0.323 1.00
## V      1         0.811        0.650 0.350 1.03
## I      5         0.743        0.569 0.431 1.01
## C      6         0.715        0.725 0.275 1.44
## A      3                      0.417 0.583 2.83
## DS     4                0.880 0.769 0.231 1.00
## LNS    7                0.604 0.501 0.499 1.16
## 
##                         MR1   MR2   MR3
## SS loadings           3.118 2.896 1.567
## Proportion Var        0.240 0.223 0.121
## Cumulative Var        0.240 0.463 0.583
## Proportion Explained  0.411 0.382 0.207
## Cumulative Proportion 0.411 0.793 1.000
## 
##  With factor correlations of 
##       MR1   MR2   MR3
## MR1 1.000 0.339 0.419
## MR2 0.339 1.000 0.240
## MR3 0.419 0.240 1.000
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  78  and the objective function was  7.092 with Chi Square of  2431.439
## The degrees of freedom for the model are 42  and the objective function was  0.906 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.068 
## 
## The harmonic number of observations is  349 with the empirical chi square  133.609  with prob <  1.77e-11 
## The total number of observations was  349  with Likelihood Chi Square =  308.789  with prob <  2.47e-42 
## 
## Tucker Lewis Index of factoring reliability =  0.7882
## RMSEA index =  0.1349  and the 90 % confidence intervals are  0.1212 0.1494
## BIC =  62.876
## Fit based upon off diagonal values = 0.984
## Measures of factor score adequacy             
##                                                     MR1   MR2   MR3
## Correlation of (regression) scores with factors   0.951 0.942 0.913
## Multiple R square of scores with factors          0.905 0.887 0.833
## Minimum correlation of possible factor scores     0.810 0.774 0.666
print.psych(FAW4, cut = 0.4, sort = T, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = WAISIIIW.cor, nfactors = 4, n.obs = 349)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    MR2    MR1    MR4    MR3    h2    u2  com
## V      1  0.899                      0.735 0.265 1.05
## S      2  0.801                      0.680 0.320 1.02
## I      5  0.728                      0.571 0.429 1.02
## C      6  0.682                      0.725 0.275 1.50
## PA    12         0.901               0.868 0.132 1.04
## MR    11         0.734               0.590 0.410 1.04
## PC     8         0.726               0.458 0.542 1.03
## BD    10         0.589               0.521 0.479 1.15
## A      3                             0.444 0.556 2.68
## DSC    9                0.905        0.800 0.200 1.00
## SS    13                0.801        0.765 0.235 1.04
## DS     4                       0.847 0.728 0.272 1.03
## LNS    7                       0.636 0.527 0.473 1.15
## 
##                         MR2   MR1   MR4   MR3
## SS loadings           2.655 2.626 1.562 1.569
## Proportion Var        0.204 0.202 0.120 0.121
## Cumulative Var        0.204 0.406 0.526 0.647
## Proportion Explained  0.316 0.312 0.186 0.187
## Cumulative Proportion 0.316 0.628 0.813 1.000
## 
##  With factor correlations of 
##       MR2   MR1   MR4   MR3
## MR2 1.000 0.446 0.076 0.291
## MR1 0.446 1.000 0.516 0.399
## MR4 0.076 0.516 1.000 0.300
## MR3 0.291 0.399 0.300 1.000
## 
## Mean item complexity =  1.2
## Test of the hypothesis that 4 factors are sufficient.
## 
## The degrees of freedom for the null model are  78  and the objective function was  7.092 with Chi Square of  2431.439
## The degrees of freedom for the model are 32  and the objective function was  0.253 
## 
## The root mean square of the residuals (RMSR) is  0.018 
## The df corrected root mean square of the residuals is  0.029 
## 
## The harmonic number of observations is  349 with the empirical chi square  18.244  with prob <  0.976 
## The total number of observations was  349  with Likelihood Chi Square =  85.911  with prob <  7.99e-07 
## 
## Tucker Lewis Index of factoring reliability =  0.9437
## RMSEA index =  0.0694  and the 90 % confidence intervals are  0.052 0.0875
## BIC =  -101.451
## Fit based upon off diagonal values = 0.998
## Measures of factor score adequacy             
##                                                     MR2   MR1   MR4   MR3
## Correlation of (regression) scores with factors   0.946 0.957 0.939 0.904
## Multiple R square of scores with factors          0.894 0.916 0.882 0.818
## Minimum correlation of possible factor scores     0.789 0.831 0.763 0.635
print.psych(FAB1, cut = 0.4, sort = T, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = WAISIIIB.cor, n.obs = 107)
## Standardized loadings (pattern matrix) based upon correlation matrix
##      V   MR1     h2    u2 com
## I    5 0.699 0.4891 0.511   1
## A    3 0.670 0.4484 0.552   1
## V    1 0.647 0.4191 0.581   1
## C    6 0.643 0.4133 0.587   1
## S    2 0.623 0.3882 0.612   1
## BD  10 0.560 0.3134 0.687   1
## DS   4 0.435 0.1889 0.811   1
## LNS  7 0.431 0.1858 0.814   1
## PC   8 0.430 0.1853 0.815   1
## PA  12 0.424 0.1795 0.820   1
## MR  11       0.1123 0.888   1
## SS  13       0.1118 0.888   1
## DSC  9       0.0559 0.944   1
## 
##                  MR1
## SS loadings    3.491
## Proportion Var 0.269
## 
## Mean item complexity =  1
## Test of the hypothesis that 1 factor is sufficient.
## 
## The degrees of freedom for the null model are  78  and the objective function was  4.022 with Chi Square of  405.588
## The degrees of freedom for the model are 65  and the objective function was  1.541 
## 
## The root mean square of the residuals (RMSR) is  0.115 
## The df corrected root mean square of the residuals is  0.126 
## 
## The harmonic number of observations is  107 with the empirical chi square  221.541  with prob <  5.88e-19 
## The total number of observations was  107  with Likelihood Chi Square =  154.337  with prob <  3.14e-09 
## 
## Tucker Lewis Index of factoring reliability =  0.67
## RMSEA index =  0.1129  and the 90 % confidence intervals are  0.0908 0.1371
## BIC =  -149.397
## Fit based upon off diagonal values = 0.841
## Measures of factor score adequacy             
##                                                     MR1
## Correlation of (regression) scores with factors   0.919
## Multiple R square of scores with factors          0.845
## Minimum correlation of possible factor scores     0.689
print.psych(FAB2, cut = 0.4, sort = T, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = WAISIIIB.cor, nfactors = 2, n.obs = 107)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    MR1    MR2    h2    u2  com
## I      5  0.817        0.658 0.342 1.00
## V      1  0.776        0.582 0.418 1.01
## C      6  0.758        0.560 0.440 1.00
## S      2  0.648        0.451 0.549 1.02
## A      3  0.511        0.432 0.568 1.51
## BD    10         0.673 0.512 0.488 1.04
## SS    13         0.665 0.394 0.606 1.10
## MR    11         0.485 0.230 0.770 1.00
## DSC    9         0.464 0.191 0.809 1.11
## PC     8         0.422 0.237 0.763 1.21
## LNS    7               0.224 0.776 1.30
## DS     4               0.225 0.775 1.33
## PA    12               0.180 0.820 1.99
## 
##                         MR1   MR2
## SS loadings           2.796 2.081
## Proportion Var        0.215 0.160
## Cumulative Var        0.215 0.375
## Proportion Explained  0.573 0.427
## Cumulative Proportion 0.573 1.000
## 
##  With factor correlations of 
##       MR1   MR2
## MR1 1.000 0.358
## MR2 0.358 1.000
## 
## Mean item complexity =  1.2
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  78  and the objective function was  4.022 with Chi Square of  405.588
## The degrees of freedom for the model are 53  and the objective function was  0.721 
## 
## The root mean square of the residuals (RMSR) is  0.064 
## The df corrected root mean square of the residuals is  0.078 
## 
## The harmonic number of observations is  107 with the empirical chi square  69.425  with prob <  0.0645 
## The total number of observations was  107  with Likelihood Chi Square =  71.787  with prob <  0.0438 
## 
## Tucker Lewis Index of factoring reliability =  0.9142
## RMSEA index =  0.0568  and the 90 % confidence intervals are  0.0103 0.0895
## BIC =  -175.873
## Fit based upon off diagonal values = 0.95
## Measures of factor score adequacy             
##                                                     MR1   MR2
## Correlation of (regression) scores with factors   0.930 0.875
## Multiple R square of scores with factors          0.864 0.765
## Minimum correlation of possible factor scores     0.728 0.531
print.psych(FAB3, cut = 0.4, sort = T, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = WAISIIIB.cor, nfactors = 3, n.obs = 107)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    MR1    MR2    MR3    h2    u2  com
## I      5  0.817               0.690 0.310 1.07
## V      1  0.770               0.587 0.413 1.03
## C      6  0.745               0.554 0.446 1.00
## S      2  0.637               0.449 0.551 1.03
## A      3  0.498               0.447 0.553 1.62
## PC     8         0.669        0.443 0.557 1.12
## BD    10         0.572        0.524 0.476 1.36
## MR    11         0.558        0.310 0.690 1.03
## PA    12                      0.220 0.780 1.76
## LNS    7                0.567 0.373 0.627 1.11
## DS     4                0.557 0.367 0.633 1.13
## SS    13                0.414 0.377 0.623 2.28
## DSC    9                0.410 0.217 0.783 1.48
## 
##                         MR1   MR2   MR3
## SS loadings           2.718 1.543 1.296
## Proportion Var        0.209 0.119 0.100
## Cumulative Var        0.209 0.328 0.427
## Proportion Explained  0.489 0.278 0.233
## Cumulative Proportion 0.489 0.767 1.000
## 
##  With factor correlations of 
##       MR1   MR2   MR3
## MR1 1.000 0.330 0.267
## MR2 0.330 1.000 0.379
## MR3 0.267 0.379 1.000
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  78  and the objective function was  4.022 with Chi Square of  405.588
## The degrees of freedom for the model are 42  and the objective function was  0.499 
## 
## The root mean square of the residuals (RMSR) is  0.048 
## The df corrected root mean square of the residuals is  0.065 
## 
## The harmonic number of observations is  107 with the empirical chi square  37.861  with prob <  0.653 
## The total number of observations was  107  with Likelihood Chi Square =  49.318  with prob <  0.204 
## 
## Tucker Lewis Index of factoring reliability =  0.9575
## RMSEA index =  0.0392  and the 90 % confidence intervals are  0 0.0807
## BIC =  -146.941
## Fit based upon off diagonal values = 0.973
## Measures of factor score adequacy             
##                                                     MR1   MR2   MR3
## Correlation of (regression) scores with factors   0.931 0.853 0.819
## Multiple R square of scores with factors          0.866 0.727 0.671
## Minimum correlation of possible factor scores     0.732 0.454 0.343
print.psych(FAB4, cut = 0.4, sort = T, digits = 3)
## Factor Analysis using method =  minres
## Call: fa(r = WAISIIIB.cor, nfactors = 4, n.obs = 107)
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item    MR1    MR3    MR4    MR2    h2    u2  com
## I      5  0.837                      0.702 0.298 1.05
## V      1  0.782                      0.598 0.402 1.04
## C      6  0.688                      0.564 0.436 1.20
## S      2  0.681                      0.483 0.517 1.06
## A      3  0.469                      0.438 0.562 1.69
## PC     8         0.657               0.441 0.559 1.12
## MR    11         0.635               0.394 0.606 1.14
## BD    10         0.539               0.504 0.496 1.43
## PA    12                             0.215 0.785 1.93
## DS     4                0.790        0.627 0.373 1.00
## LNS    7                0.502        0.348 0.652 1.17
## DSC    9                       0.851 0.713 0.287 1.00
## SS    13                       0.449 0.402 0.598 1.98
## 
##                         MR1   MR3   MR4   MR2
## SS loadings           2.689 1.498 1.160 1.083
## Proportion Var        0.207 0.115 0.089 0.083
## Cumulative Var        0.207 0.322 0.411 0.495
## Proportion Explained  0.418 0.233 0.180 0.168
## Cumulative Proportion 0.418 0.651 0.832 1.000
## 
##  With factor correlations of 
##       MR1   MR3   MR4   MR2
## MR1 1.000 0.326 0.316 0.089
## MR3 0.326 1.000 0.332 0.244
## MR4 0.316 0.332 1.000 0.245
## MR2 0.089 0.244 0.245 1.000
## 
## Mean item complexity =  1.3
## Test of the hypothesis that 4 factors are sufficient.
## 
## The degrees of freedom for the null model are  78  and the objective function was  4.022 with Chi Square of  405.588
## The degrees of freedom for the model are 32  and the objective function was  0.269 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.046 
## 
## The harmonic number of observations is  107 with the empirical chi square  14.683  with prob <  0.996 
## The total number of observations was  107  with Likelihood Chi Square =  26.385  with prob <  0.746 
## 
## Tucker Lewis Index of factoring reliability =  1.0432
## RMSEA index =  0  and the 90 % confidence intervals are  0 0.0531
## BIC =  -123.145
## Fit based upon off diagonal values = 0.989
## Measures of factor score adequacy             
##                                                     MR1   MR3   MR4   MR2
## Correlation of (regression) scores with factors   0.932 0.851 0.850 0.871
## Multiple R square of scores with factors          0.869 0.724 0.722 0.758
## Minimum correlation of possible factor scores     0.738 0.449 0.444 0.517

The EFA differs from the PCA results somewhat, notably for cross-loadings assumed to be meaningful (for arithmetic and comprehension, which fall below 0.4 for their secondary factors using EFA). Since PCA uses the total variance rather than the common variance, this can lead to inflated loadings, which probably explains their results. Nonetheless, for consistency with what data they had available to them, I utilize the PCA structure instead of the EFA one. The EGA, below, is only given for illustrative purposes, like the EFA.

#Check dimensionality with EGA

EGAW <- EGAnet::EGA(WAISIIIW.cor, n = 349)
## Network estimated with gamma = 0.5

EGAB <- EGAnet::EGA(WAISIIIB.cor, n = 107)
## Network estimated with gamma = 0.5

The number of dimensions is similar, but there are still differences.

Confirmatory Factor Models

British Model

The model is first fit on one group.

#British PCA structure - initial fit

WIIIUK.model <- '
POI =~ PA + SS + DSC + PC + MR + BD
VCI =~ V + S + I + C + A
WMI =~ C + DS + LNS + A'

WIIIUK.fit <- cfa(WIIIUK.model, sample.cov = WAISIIIW.cov, sample.nobs = 349, std.lv = T, orthogonal = F)
summary(WIIIUK.fit, stand = T, fit = T)
## lavaan 0.6-5 ended normally after 24 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         31
##                                                       
##   Number of observations                           349
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                               403.439
##   Degrees of freedom                                60
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              2475.174
##   Degrees of freedom                                78
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.857
##   Tucker-Lewis Index (TLI)                       0.814
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -10009.960
##   Loglikelihood unrestricted model (H1)      -9808.241
##                                                       
##   Akaike (AIC)                               20081.921
##   Bayesian (BIC)                             20201.428
##   Sample-size adjusted Bayesian (BIC)        20103.085
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.128
##   90 Percent confidence interval - lower         0.116
##   90 Percent confidence interval - upper         0.140
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.080
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   POI =~                                                                
##     PA                2.091    0.096   21.831    0.000    2.091    0.927
##     SS                1.593    0.132   12.107    0.000    1.593    0.609
##     DSC               1.445    0.138   10.469    0.000    1.445    0.540
##     PC                2.074    0.154   13.492    0.000    2.074    0.664
##     MR                1.996    0.129   15.527    0.000    1.996    0.737
##     BD                2.055    0.138   14.941    0.000    2.055    0.717
##   VCI =~                                                                
##     V                 2.227    0.121   18.414    0.000    2.227    0.839
##     S                 2.519    0.139   18.134    0.000    2.519    0.830
##     I                 2.071    0.129   16.048    0.000    2.071    0.762
##     C                 1.733    0.114   15.210    0.000    1.733    0.670
##     A                 0.973    0.160    6.092    0.000    0.973    0.310
##   WMI =~                                                                
##     C                 0.932    0.107    8.704    0.000    0.932    0.360
##     DS                2.328    0.152   15.285    0.000    2.328    0.793
##     LNS               2.054    0.142   14.481    0.000    2.054    0.756
##     A                 1.518    0.168    9.024    0.000    1.518    0.484
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   POI ~~                                                                
##     VCI               0.409    0.051    7.975    0.000    0.409    0.409
##     WMI               0.533    0.049   10.906    0.000    0.533    0.533
##   VCI ~~                                                                
##     WMI               0.260    0.063    4.103    0.000    0.260    0.260
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .PA                0.721    0.133    5.409    0.000    0.721    0.141
##    .SS                4.306    0.346   12.432    0.000    4.306    0.629
##    .DSC               5.074    0.400   12.676    0.000    5.074    0.708
##    .PC                5.467    0.450   12.152    0.000    5.467    0.560
##    .MR                3.340    0.289   11.543    0.000    3.340    0.456
##    .BD                3.992    0.340   11.751    0.000    3.992    0.486
##    .V                 2.094    0.235    8.901    0.000    2.094    0.297
##    .S                 2.869    0.312    9.181    0.000    2.869    0.311
##    .I                 3.090    0.288   10.725    0.000    3.090    0.419
##    .C                 1.979    0.202    9.824    0.000    1.979    0.296
##    .A                 5.811    0.491   11.832    0.000    5.811    0.591
##    .DS                3.202    0.441    7.264    0.000    3.202    0.371
##    .LNS               3.159    0.377    8.387    0.000    3.159    0.428
##     POI               1.000                               1.000    1.000
##     VCI               1.000                               1.000    1.000
##     WMI               1.000                               1.000    1.000

The model fits badly but based on the EGA, the reason looks like a massive residual covariance between digit-symbol coding and symbol search. These are usually not modeled together because they’re so collinear, so adding this residual covariance probably won’t count as overfitting, and this relationship is found in both samples, so I doubt it would hurt to constrain this between models as well.

#Adding a large, plausible residual covariance

WIIIUKR.model <- '
POI =~ PA + SS + DSC + PC + MR + BD
VCI =~ V + S + I + C + A
WMI =~ C + DS + LNS + A

DSC ~~ SS'

WIIIUKR.fit <- cfa(WIIIUKR.model, sample.cov = WAISIIIW.cov, sample.nobs = 349, std.lv = T, orthogonal = F)
summary(WIIIUKR.fit, stand = T, fit = T)
## lavaan 0.6-5 ended normally after 28 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         32
##                                                       
##   Number of observations                           349
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                               192.576
##   Degrees of freedom                                59
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              2475.174
##   Degrees of freedom                                78
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.944
##   Tucker-Lewis Index (TLI)                       0.926
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -9904.529
##   Loglikelihood unrestricted model (H1)      -9808.241
##                                                       
##   Akaike (AIC)                               19873.058
##   Bayesian (BIC)                             19996.420
##   Sample-size adjusted Bayesian (BIC)        19894.905
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.081
##   90 Percent confidence interval - lower         0.068
##   90 Percent confidence interval - upper         0.093
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.062
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   POI =~                                                                
##     PA                2.098    0.096   21.818    0.000    2.098    0.930
##     SS                1.493    0.134   11.167    0.000    1.493    0.571
##     DSC               1.319    0.140    9.404    0.000    1.319    0.493
##     PC                2.101    0.153   13.699    0.000    2.101    0.672
##     MR                2.017    0.128   15.725    0.000    2.017    0.745
##     BD                2.062    0.138   14.989    0.000    2.062    0.720
##   VCI =~                                                                
##     V                 2.227    0.121   18.408    0.000    2.227    0.838
##     S                 2.519    0.139   18.134    0.000    2.519    0.830
##     I                 2.071    0.129   16.055    0.000    2.071    0.762
##     C                 1.732    0.114   15.207    0.000    1.732    0.670
##     A                 0.974    0.160    6.095    0.000    0.974    0.311
##   WMI =~                                                                
##     C                 0.933    0.107    8.709    0.000    0.933    0.361
##     DS                2.324    0.153   15.239    0.000    2.324    0.792
##     LNS               2.055    0.142   14.474    0.000    2.055    0.757
##     A                 1.520    0.168    9.029    0.000    1.520    0.485
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .SS ~~                                                                 
##    .DSC               3.422    0.340   10.059    0.000    3.422    0.684
##   POI ~~                                                                
##     VCI               0.421    0.051    8.295    0.000    0.421    0.421
##     WMI               0.528    0.049   10.725    0.000    0.528    0.528
##   VCI ~~                                                                
##     WMI               0.260    0.063    4.106    0.000    0.260    0.260
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .PA                0.689    0.141    4.906    0.000    0.689    0.135
##    .SS                4.617    0.367   12.565    0.000    4.617    0.675
##    .DSC               5.422    0.424   12.782    0.000    5.422    0.757
##    .PC                5.355    0.444   12.062    0.000    5.355    0.548
##    .MR                3.254    0.286   11.371    0.000    3.254    0.444
##    .BD                3.960    0.339   11.667    0.000    3.960    0.482
##    .V                 2.097    0.235    8.913    0.000    2.097    0.297
##    .S                 2.869    0.312    9.187    0.000    2.869    0.311
##    .I                 3.088    0.288   10.725    0.000    3.088    0.419
##    .C                 1.977    0.201    9.822    0.000    1.977    0.296
##    .A                 5.804    0.491   11.823    0.000    5.804    0.590
##    .DS                3.218    0.442    7.278    0.000    3.218    0.373
##    .LNS               3.155    0.378    8.353    0.000    3.155    0.428
##     POI               1.000                               1.000    1.000
##     VCI               1.000                               1.000    1.000
##     WMI               1.000                               1.000    1.000

This is acceptable as-is, but if we wanted to bring it up above the more typical CFI of 0.95 starting value, residual covariances between digit span and comprehension and digit-symbol coding and picture arrangement are also plausible and viable, raising CFI to 0.962 and reducing RMSEA to 0.067.

#Plot the model

BRLAT <- list(
  POI = c("PA", "SS", "DSC", "PC", "MR", "BD"),
  VCI = c("V", "S", "I", "C", "A"),
  WMI = c("C", "A", "DS", "LNS"))

semPaths(WIIIUKR.fit, "model", "std", title = F, residuals = F, groups = "BRLAT", pastel = T, mar = c(2, 1, 3, 1))
## Warning in if (w <= 0) w <- 1e-07: the condition has length > 1 and only the
## first element will be used

## Warning in if (w <= 0) w <- 1e-07: the condition has length > 1 and only the
## first element will be used

#Configural model

WIIIBWC.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F)

parameterEstimates(WIIIBWC.fit, stand = T)
fitMeasures(WIIIBWC.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        249.398        118.000         90.000          0.952          0.070 
## rmsea.ci.lower rmsea.ci.upper 
##          0.058          0.082

The configural model fits somewhat better than the UK-only model, but apparently a loading is insignificant in the South African group! It would be significant with the British group and - who knows! - it might be significant when forced to equality with the British loading, but only testing can tell.

#Metric model

WIIIBWM.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = "loadings")

parameterEstimates(WIIIBWM.fit, stand = T)
fitMeasures(WIIIBWM.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        315.892        130.000         78.000          0.932          0.079 
## rmsea.ci.lower rmsea.ci.upper 
##          0.068          0.090
pchisq((fitMeasures(WIIIBWM.fit, "chisq") - fitMeasures(WIIIBWC.fit, "chisq")), (fitMeasures(WIIIBWM.fit, "df") - fitMeasures(WIIIBWC.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWM.fit, "cfi") - fitMeasures(WIIIBWC.fit, "cfi")
##   cfi 
## -0.02
fitMeasures(WIIIBWM.fit, "rmsea") - fitMeasures(WIIIBWC.fit, "rmsea")
## rmsea 
## 0.009

The shift in \(\chi^2\) was very significant and the \(\Delta\)CFI was very large, though \(\Delta\)RMSEA was below typical thresholds. Nonetheless, because the CFI shifted so much, a partial model should be sought. As described above, the parameter producing the largest shift in my desired goodness-of-fit metrics was used; this was estimated by referring to the modification indices.

#Partial metric model

WIIIBWMP.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = "loadings", group.partial = "VCI=~S")

parameterEstimates(WIIIBWMP.fit, stand = T)
fitMeasures(WIIIBWMP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        276.594        129.000         79.000          0.946          0.071 
## rmsea.ci.lower rmsea.ci.upper 
##          0.059          0.082
pchisq((fitMeasures(WIIIBWMP.fit, "chisq") - fitMeasures(WIIIBWC.fit, "chisq")), (fitMeasures(WIIIBWMP.fit, "df") - fitMeasures(WIIIBWC.fit, "df")), lower.tail = F)
## chisq 
## 0.004
fitMeasures(WIIIBWMP.fit, "cfi") - fitMeasures(WIIIBWC.fit, "cfi")
##    cfi 
## -0.006
fitMeasures(WIIIBWMP.fit, "rmsea") - fitMeasures(WIIIBWC.fit, "rmsea")
## rmsea 
## 0.001

The change in \(\chi^2\) became nominally significant (I typically use a scale p-value because using a fixed significance threshold leads to issues with Lindley’s paradox, which would be extremely acute in SEMs given how their SEs are calculated) and both the \(\Delta\)CFI and RMSEA became marginal. It is noteworthy that the only loading which differed was for the similarities subtest. The loading difference does favor the British group. Otherwise, though, the meanings of the subtests are basically alike.

#Scalar model

WIIIBWS.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts"), group.partial = "VCI=~S")

parameterEstimates(WIIIBWS.fit, stand = T)
fitMeasures(WIIIBWS.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        580.428        139.000         69.000          0.839          0.118 
## rmsea.ci.lower rmsea.ci.upper 
##          0.108          0.128
pchisq((fitMeasures(WIIIBWS.fit, "chisq") - fitMeasures(WIIIBWMP.fit, "chisq")), (fitMeasures(WIIIBWS.fit, "df") - fitMeasures(WIIIBWMP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWS.fit, "cfi") - fitMeasures(WIIIBWMP.fit, "cfi")
##    cfi 
## -0.107
fitMeasures(WIIIBWS.fit, "rmsea") - fitMeasures(WIIIBWMP.fit, "rmsea")
## rmsea 
## 0.047

Scalar invariance was clearly untenable, which is to say, the levels of the items could not be interpreted in common.

#Partial scalar model

WIIIBWSP.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1"))

parameterEstimates(WIIIBWSP.fit, stand = T)
fitMeasures(WIIIBWSP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        305.710        134.000         74.000          0.938          0.075 
## rmsea.ci.lower rmsea.ci.upper 
##          0.064          0.086
pchisq((fitMeasures(WIIIBWSP.fit, "chisq") - fitMeasures(WIIIBWMP.fit, "chisq")), (fitMeasures(WIIIBWSP.fit, "df") - fitMeasures(WIIIBWMP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWSP.fit, "cfi") - fitMeasures(WIIIBWMP.fit, "cfi")
##    cfi 
## -0.009
fitMeasures(WIIIBWSP.fit, "rmsea") - fitMeasures(WIIIBWMP.fit, "rmsea")
## rmsea 
## 0.004

The partial model was significantly different but the \(\Delta\)CFI and RMSEA were acceptable. Remarkably, the released intercepts were uniformly higher for the South African group. Two of the released subtest intercepts were for verbal - information and similarities - and the other three were for nonverbal tests - matrix reasoning, digit-symbol coding, and symbol search. To remind, the Cohen’s d (which was curious for the original authors to provide given the differences in sample size; Hedge’s g would have been more appropriate, although the difference would be minor) for these were, in order, 0.52, 2.58, 0.04, 0.78, and 0.55. Tentatively, the size of the mean difference in some subtest does not provide information as to how biased it may be, as both the largest and smallest mean difference subtests were biased, but many at varying levels of mean difference were not.

#Strict model

WIIIBWR.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1"))

parameterEstimates(WIIIBWR.fit, stand = T)
fitMeasures(WIIIBWR.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        482.353        147.000         61.000          0.878          0.100 
## rmsea.ci.lower rmsea.ci.upper 
##          0.090          0.110
pchisq((fitMeasures(WIIIBWR.fit, "chisq") - fitMeasures(WIIIBWSP.fit, "chisq")), (fitMeasures(WIIIBWR.fit, "df") - fitMeasures(WIIIBWSP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWR.fit, "cfi") - fitMeasures(WIIIBWSP.fit, "cfi")
##   cfi 
## -0.06
fitMeasures(WIIIBWR.fit, "rmsea") - fitMeasures(WIIIBWSP.fit, "rmsea")
## rmsea 
## 0.025

There are clearly different causes involved here.

#Partial strict model

WIIIBWRP.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

parameterEstimates(WIIIBWRP.fit, stand = T)
fitMeasures(WIIIBWRP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        340.945        145.000         63.000          0.929          0.077 
## rmsea.ci.lower rmsea.ci.upper 
##          0.066          0.088
pchisq((fitMeasures(WIIIBWRP.fit, "chisq") - fitMeasures(WIIIBWSP.fit, "chisq")), (fitMeasures(WIIIBWRP.fit, "df") - fitMeasures(WIIIBWSP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWRP.fit, "cfi") - fitMeasures(WIIIBWSP.fit, "cfi")
##    cfi 
## -0.009
fitMeasures(WIIIBWRP.fit, "rmsea") - fitMeasures(WIIIBWSP.fit, "rmsea")
## rmsea 
## 0.002

Only two residuals had to be released to find an acceptably well-fitting strict model. What different influences are needed to explain the biased residual variances for picture arrangement and similarities? It is possible to invoke culture for both of these, but whether that is really appropriate is uncertain. Indices for detecting the effect of biased residual variances are in development by at least two teams of researchers based in the U.S. For the present purposes, I will only assess the effects of biased loadings and intercepts because that’s all Cockroft et al. (2015) assessed, but the effect of differences in the residual variances on latent means is easily observed by comparing the full to the partial models. In this case, the standardized POI mean (which is adjusted for sample size, like Hedge’s g) changed from -1.301 (negative values indicate a British advantage) to -1.452, the VCI mean changed from -2.748 to -2.765, and the WMI mean changed from 0.174 to 0.171, indicating that in all three cases, the residual variance favored the South African sample.

#Homogeneity of latent variances

WIIIBWL.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

parameterEstimates(WIIIBWL.fit, stand = T)
fitMeasures(WIIIBWL.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        340.945        145.000         63.000          0.929          0.077 
## rmsea.ci.lower rmsea.ci.upper 
##          0.066          0.088
pchisq((fitMeasures(WIIIBWL.fit, "chisq") - fitMeasures(WIIIBWRP.fit, "chisq")), (fitMeasures(WIIIBWL.fit, "df") - fitMeasures(WIIIBWRP.fit, "df")), lower.tail = F)
## chisq 
##     1
fitMeasures(WIIIBWL.fit, "cfi") - fitMeasures(WIIIBWRP.fit, "cfi")
## cfi 
##   0
fitMeasures(WIIIBWL.fit, "rmsea") - fitMeasures(WIIIBWRP.fit, "rmsea")
## rmsea 
##     0

Note that the df did not change because fixing the latent variances changes the identifying constraint from a fixed loading. Without differences in the latent variances which are considerable, this shouldn’t do anything.

#Homogeneity of latent means

WIIIBWM.fit <- cfa(WIIIUKR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals", "means"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

parameterEstimates(WIIIBWM.fit, stand = T)
fitMeasures(WIIIBWM.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        559.610        148.000         60.000          0.850          0.110 
## rmsea.ci.lower rmsea.ci.upper 
##          0.101          0.120
pchisq((fitMeasures(WIIIBWM.fit, "chisq") - fitMeasures(WIIIBWL.fit, "chisq")), (fitMeasures(WIIIBWM.fit, "df") - fitMeasures(WIIIBWL.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWM.fit, "cfi") - fitMeasures(WIIIBWL.fit, "cfi")
##    cfi 
## -0.078
fitMeasures(WIIIBWM.fit, "rmsea") - fitMeasures(WIIIBWL.fit, "rmsea")
## rmsea 
## 0.033

Constraining the latent means led to very large shifts in goodness-of-fit so a model constraining only as many means as possible was defined. This was the WMI mean alone; constraining any other mean demolished model fit.

#Partial homogeneity of latent means

WIIIBWMP.fit <- cfa(WIIIUKRMP.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

WIIIBWMPR.fit <- cfa(WIIIUKRMP.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals", "residual.covariances"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

parameterEstimates(WIIIBWMP.fit, stand = T)
fitMeasures(WIIIBWMP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        342.101        146.000         62.000          0.929          0.077 
## rmsea.ci.lower rmsea.ci.upper 
##          0.066          0.087
pchisq((fitMeasures(WIIIBWMP.fit, "chisq") - fitMeasures(WIIIBWL.fit, "chisq")), (fitMeasures(WIIIBWMP.fit, "df") - fitMeasures(WIIIBWL.fit, "df")), lower.tail = F)
## chisq 
## 0.282
fitMeasures(WIIIBWMP.fit, "cfi") - fitMeasures(WIIIBWL.fit, "cfi")
## cfi 
##   0
fitMeasures(WIIIBWMP.fit, "rmsea") - fitMeasures(WIIIBWL.fit, "rmsea")
## rmsea 
##     0
parameterEstimates(WIIIBWMPR.fit, stand = T)
fitMeasures(WIIIBWMPR.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        372.763        147.000         61.000          0.918          0.082 
## rmsea.ci.lower rmsea.ci.upper 
##          0.072          0.092
pchisq((fitMeasures(WIIIBWMPR.fit, "chisq") - fitMeasures(WIIIBWL.fit, "chisq")), (fitMeasures(WIIIBWMPR.fit, "df") - fitMeasures(WIIIBWL.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWMPR.fit, "cfi") - fitMeasures(WIIIBWL.fit, "cfi")
##    cfi 
## -0.011
fitMeasures(WIIIBWMPR.fit, "rmsea") - fitMeasures(WIIIBWL.fit, "rmsea")
## rmsea 
## 0.005

The working memory mean - as in other samples, like Frisby & Beaujean’s (2015) - could be constrained to no difference without any decrement in model fit. The other means could not. Although it is widely agreed that MI does not concern itself with residual covariances or factor covariances, for personal interest, I also fit a model in which the residual covariances were constrained and this fit significantly worse and was marginally over the line for \(\Delta\)CFI but not \(\Delta\)RMSEA. This seemed to have basically no effect on the corresponding latent mean, but the direction of effect was such that constraining this reduced the gap favoring the British sample for POI and lowered the British symbol search intercept while raising the digit-symbol coding intercept and pushing up both of the intercepts for the South African sample; all resulting changes were negligible.

South African Model

At this point, in order to test Cockroft et al.’s (2015) implied propositions, it’s necessary to refit the same series of models based on the South African sample’s structure.

#South African PCA structure - initial fit

WIIISA.model <- '
VCI =~ I + V + S + C + A
POI =~ MR + PC + BD + PA
WMI =~ A + DS + LNS
PSI =~ DSC + SS'

WIIISA.fit <- cfa(WIIISA.model, sample.cov = WAISIIIB.cov, sample.nobs = 107, std.lv = T, orthogonal = F)
summary(WIIISA.fit, stand = T, fit = T)
## lavaan 0.6-5 ended normally after 32 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         33
##                                                       
##   Number of observations                           107
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                53.773
##   Degrees of freedom                                58
##   P-value (Chi-square)                           0.633
## 
## Model Test Baseline Model:
## 
##   Test statistic                               430.393
##   Degrees of freedom                                78
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.016
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -2983.457
##   Loglikelihood unrestricted model (H1)      -2956.570
##                                                       
##   Akaike (AIC)                                6032.913
##   Bayesian (BIC)                              6121.117
##   Sample-size adjusted Bayesian (BIC)         6016.852
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.052
##   P-value RMSEA <= 0.05                          0.942
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.053
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   VCI =~                                                                
##     I                 2.583    0.273    9.460    0.000    2.583    0.808
##     V                 1.537    0.176    8.753    0.000    1.537    0.765
##     S                 0.968    0.126    7.691    0.000    0.968    0.694
##     C                 1.826    0.221    8.275    0.000    1.826    0.734
##     A                 1.142    0.255    4.484    0.000    1.142    0.453
##   POI =~                                                                
##     MR                1.246    0.245    5.086    0.000    1.246    0.528
##     PC                1.228    0.215    5.706    0.000    1.228    0.585
##     BD                1.938    0.260    7.455    0.000    1.938    0.746
##     PA                1.202    0.298    4.035    0.000    1.202    0.428
##   WMI =~                                                                
##     A                 0.916    0.274    3.345    0.001    0.916    0.364
##     DS                1.433    0.250    5.728    0.000    1.433    0.637
##     LNS               1.679    0.272    6.161    0.000    1.679    0.697
##   PSI =~                                                                
##     DSC               1.319    0.296    4.462    0.000    1.319    0.528
##     SS                2.266    0.389    5.818    0.000    2.266    0.853
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   VCI ~~                                                                
##     POI               0.471    0.105    4.483    0.000    0.471    0.471
##     WMI               0.396    0.121    3.270    0.001    0.396    0.396
##     PSI               0.125    0.120    1.036    0.300    0.125    0.125
##   POI ~~                                                                
##     WMI               0.527    0.121    4.349    0.000    0.527    0.527
##     PSI               0.561    0.122    4.577    0.000    0.561    0.561
##   WMI ~~                                                                
##     PSI               0.428    0.132    3.255    0.001    0.428    0.428
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .I                 3.536    0.714    4.954    0.000    3.536    0.346
##    .V                 1.680    0.302    5.553    0.000    1.680    0.416
##    .S                 1.005    0.163    6.163    0.000    1.005    0.518
##    .C                 2.859    0.488    5.863    0.000    2.859    0.462
##    .A                 3.371    0.555    6.069    0.000    3.371    0.532
##    .MR                4.013    0.629    6.380    0.000    4.013    0.721
##    .PC                2.902    0.480    6.043    0.000    2.902    0.658
##    .BD                2.991    0.707    4.233    0.000    2.991    0.443
##    .PA                6.433    0.949    6.780    0.000    6.433    0.817
##    .DS                3.006    0.611    4.917    0.000    3.006    0.594
##    .LNS               2.984    0.735    4.058    0.000    2.984    0.514
##    .DSC               4.502    0.798    5.641    0.000    4.502    0.721
##    .SS                1.929    1.523    1.266    0.205    1.929    0.273
##     VCI               1.000                               1.000    1.000
##     POI               1.000                               1.000    1.000
##     WMI               1.000                               1.000    1.000
##     PSI               1.000                               1.000    1.000

The model fit is so strong for the South African sample that it’s probably overfitted. This may make sense of the errors thrown based on the parallel analysis, or it could be nothing. It is worth noting that the residual covariance required to fit well in the British-derived model has become its own factor in this one. If Cockroft et al. (2015) had used a more typical loading cutoff (i.e., >0.3 instead of >0.4; with luck, in conjunction with EFA instead of PCA), they would not have noticed as much of a discrepancy between the models. Because of the similarity of the models, taken appropriately, it is unlikely that there’s going to be much of any difference starting from one or the other. Ignoring that residual covariances are generally frowned upon due to their connection to overfitting, the British model is more parsimonious even with it (\(\Delta\)df = 1)

#Plot the model

SALAT <- list(
  VCI = c("I", "V", "S", "C", "A"),
  POI = c("MR", "PC", "BD", "PA"),
  WMI = c("A", "DS", "LNS"),
  PSI = c("DSC", "SS"))

#There will be overlap, but this is unimportant until this goes to pub.

semPaths(WIIISA.fit, "model", "std", title = F, residuals = F, groups = "SALAT", pastel = T, mar = c(2, 1, 3, 1))
## Warning in if (w <= 0) w <- 1e-07: the condition has length > 1 and only the
## first element will be used

#Configural model 

WIIIWBC.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F)

parameterEstimates(WIIIWBC.fit, stand = T)
fitMeasures(WIIIWBC.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        297.069        116.000         92.000          0.934          0.083 
## rmsea.ci.lower rmsea.ci.upper 
##          0.071          0.094

This model fits somewhat worse than the British-derived configural model, but it is nonetheless acceptable by typical standards. The fact that this model has two fewer df and fits worse is a strike against it, however.

#Metric model

WIIIWBM.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = "loadings")

parameterEstimates(WIIIWBM.fit, stand = T)
fitMeasures(WIIIWBM.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        365.493        126.000         82.000          0.913          0.091 
## rmsea.ci.lower rmsea.ci.upper 
##          0.080          0.102
pchisq((fitMeasures(WIIIWBM.fit, "chisq") - fitMeasures(WIIIWBC.fit, "chisq")), (fitMeasures(WIIIWBM.fit, "df") - fitMeasures(WIIIWBC.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIWBM.fit, "cfi") - fitMeasures(WIIIWBC.fit, "cfi")
##    cfi 
## -0.021
fitMeasures(WIIIWBM.fit, "rmsea") - fitMeasures(WIIIWBC.fit, "rmsea")
## rmsea 
## 0.009

The parameter changes in this model compared to the metric stage of the British-derived model are virtually identical, but negligibly worse.

#Partial metric model

WIIIWBMP.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = "loadings", group.partial = "VCI=~S")

parameterEstimates(WIIIWBMP.fit, stand = T)
fitMeasures(WIIIWBMP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        319.608        125.000         83.000          0.929          0.083 
## rmsea.ci.lower rmsea.ci.upper 
##          0.071          0.094
pchisq((fitMeasures(WIIIWBMP.fit, "chisq") - fitMeasures(WIIIWBC.fit, "chisq")), (fitMeasures(WIIIWBMP.fit, "df") - fitMeasures(WIIIWBC.fit, "df")), lower.tail = F)
## chisq 
## 0.007
fitMeasures(WIIIWBMP.fit, "cfi") - fitMeasures(WIIIWBC.fit, "cfi")
##    cfi 
## -0.005
fitMeasures(WIIIWBMP.fit, "rmsea") - fitMeasures(WIIIWBC.fit, "rmsea")
## rmsea 
##     0

As expected based on the similarity of their factor structures, the same loading had to be freed despite differing initial models.

#Scalar model

WIIIWBS.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts"), group.partial = "VCI=~S")

parameterEstimates(WIIIWBS.fit, stand = T)
fitMeasures(WIIIWBS.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        553.680        134.000         74.000          0.847          0.117 
## rmsea.ci.lower rmsea.ci.upper 
##          0.107          0.127
pchisq((fitMeasures(WIIIWBS.fit, "chisq") - fitMeasures(WIIIWBMP.fit, "chisq")), (fitMeasures(WIIIWBS.fit, "df") - fitMeasures(WIIIWBMP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIWBS.fit, "cfi") - fitMeasures(WIIIWBMP.fit, "cfi")
##    cfi 
## -0.082
fitMeasures(WIIIWBS.fit, "rmsea") - fitMeasures(WIIIWBMP.fit, "rmsea")
## rmsea 
## 0.035

Just as before, the level of the indicators is not interpretable in common.

#Partial scalar model

WIIIWBSP.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts"), group.partial = c("VCI=~S", "I~1", "MR~1", "C~1", "A~1", "S~1"))

WIIIWBSPII.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts"), group.partial = c("VCI=~S", "I~1", "MR~1", "C~1", "S~1"))

parameterEstimates(WIIIWBSP.fit, stand = T)
fitMeasures(WIIIWBSP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        334.654        129.000         79.000          0.925          0.084 
## rmsea.ci.lower rmsea.ci.upper 
##          0.073          0.095
pchisq((fitMeasures(WIIIWBSP.fit, "chisq") - fitMeasures(WIIIWBMP.fit, "chisq")), (fitMeasures(WIIIWBSP.fit, "df") - fitMeasures(WIIIWBMP.fit, "df")), lower.tail = F)
## chisq 
## 0.005
fitMeasures(WIIIWBSP.fit, "cfi") - fitMeasures(WIIIWBMP.fit, "cfi")
##    cfi 
## -0.004
fitMeasures(WIIIWBSP.fit, "rmsea") - fitMeasures(WIIIWBMP.fit, "rmsea")
## rmsea 
## 0.001
parameterEstimates(WIIIWBSPII.fit, stand = T)
fitMeasures(WIIIWBSPII.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        351.456        130.000         78.000          0.919          0.086 
## rmsea.ci.lower rmsea.ci.upper 
##          0.076          0.097
pchisq((fitMeasures(WIIIWBSPII.fit, "chisq") - fitMeasures(WIIIWBMP.fit, "chisq")), (fitMeasures(WIIIWBSPII.fit, "df") - fitMeasures(WIIIWBMP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIWBSPII.fit, "cfi") - fitMeasures(WIIIWBMP.fit, "cfi")
##   cfi 
## -0.01
fitMeasures(WIIIWBSPII.fit, "rmsea") - fitMeasures(WIIIWBMP.fit, "rmsea")
## rmsea 
## 0.004

In the British-derived structure, information, matrix reasoning, digit-symbol coding, symbol search, and similarities had to be released, but using the South African structure, digit-symbol coding and symbol search are replaced by comprehension and arithmetic. This might be considered theoretical corroboration for Cockroft et al. (2015) until the pattern of intercepts is inspected and it’s noticed that, again, the intercepts are uniformly higher for the South African sample, making these more verbal tests favorable towards rather than biased against the South African sample. Also worth noting is that a change in CFI of -0.01 is attained if arithmetic is not freed, but this is right on the line so debatable whether it should be freed in the first place. Nonetheless, to evade any doubts about the validity of this analysis, I freed it.

#Strict model

WIIIWBR.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "C~1", "A~1", "S~1"))


parameterEstimates(WIIIWBR.fit, stand = T)
fitMeasures(WIIIWBR.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        548.467        142.000         66.000          0.852          0.112 
## rmsea.ci.lower rmsea.ci.upper 
##          0.102          0.122
pchisq((fitMeasures(WIIIWBR.fit, "chisq") - fitMeasures(WIIIWBSP.fit, "chisq")), (fitMeasures(WIIIWBR.fit, "df") - fitMeasures(WIIIWBSP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIWBR.fit, "cfi") - fitMeasures(WIIIWBSP.fit, "cfi")
##    cfi 
## -0.073
fitMeasures(WIIIWBR.fit, "rmsea") - fitMeasures(WIIIWBSP.fit, "rmsea")
## rmsea 
## 0.028

Just as before, there are considerable differences in the residual variances.

#Partial strict model

WIIIWBRP.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "C~1", "A~1", "S~1", "PA~~PA", "SS~~SS", "S~~S", "PC~~PC"))

parameterEstimates(WIIIWBRP.fit, stand = T)
fitMeasures(WIIIWBRP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        365.203        138.000         70.000          0.917          0.085 
## rmsea.ci.lower rmsea.ci.upper 
##          0.074          0.096
pchisq((fitMeasures(WIIIWBRP.fit, "chisq") - fitMeasures(WIIIWBSP.fit, "chisq")), (fitMeasures(WIIIWBRP.fit, "df") - fitMeasures(WIIIWBSP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIWBRP.fit, "cfi") - fitMeasures(WIIIWBSP.fit, "cfi")
##    cfi 
## -0.008
fitMeasures(WIIIWBRP.fit, "rmsea") - fitMeasures(WIIIWBSP.fit, "rmsea")
## rmsea 
## 0.001

Rather than two residual variances, as with the British-derived structure, four had to be released in order to allow for partial MI at this level with the South African model. Their release increased the British advantage in VCI and POI and increased the South African advantage in PSI.

#Homogeneity of latent variances

WIIIWBL.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "C~1", "A~1", "S~1", "PA~~PA", "SS~~SS", "S~~S", "PC~~PC"))

parameterEstimates(WIIIWBL.fit, stand = T)
fitMeasures(WIIIWBL.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        365.203        138.000         70.000          0.917          0.085 
## rmsea.ci.lower rmsea.ci.upper 
##          0.074          0.096
pchisq((fitMeasures(WIIIWBL.fit, "chisq") - fitMeasures(WIIIWBRP.fit, "chisq")), (fitMeasures(WIIIWBL.fit, "df") - fitMeasures(WIIIWBRP.fit, "df")), lower.tail = F)
## chisq 
##     1
fitMeasures(WIIIWBL.fit, "cfi") - fitMeasures(WIIIWBRP.fit, "cfi")
## cfi 
##   0
fitMeasures(WIIIWBL.fit, "rmsea") - fitMeasures(WIIIWBRP.fit, "rmsea")
## rmsea 
##     0
#Homogeneity of latent means

WIIIWBM.fit <- cfa(WIIISA.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals", "means"), group.partial = c("VCI=~S", "I~1", "MR~1", "C~1", "A~1", "S~1", "PA~~PA", "SS~~SS", "S~~S", "PC~~PC"))

parameterEstimates(WIIIWBM.fit, stand = T)
fitMeasures(WIIIWBM.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        604.653        142.000         66.000          0.832          0.120 
## rmsea.ci.lower rmsea.ci.upper 
##          0.110          0.129
pchisq((fitMeasures(WIIIWBM.fit, "chisq") - fitMeasures(WIIIWBL.fit, "chisq")), (fitMeasures(WIIIWBM.fit, "df") - fitMeasures(WIIIWBL.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIWBM.fit, "cfi") - fitMeasures(WIIIWBL.fit, "cfi")
##    cfi 
## -0.086
fitMeasures(WIIIWBM.fit, "rmsea") - fitMeasures(WIIIWBL.fit, "rmsea")
## rmsea 
## 0.035
#Partial homogeneity of latent means

WIIIWBMP.fit <- cfa(WIIISAMP.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "C~1", "A~1", "S~1", "PA~~PA", "SS~~SS", "S~~S", "PC~~PC"))

parameterEstimates(WIIIWBMP.fit, stand = T)
fitMeasures(WIIIWBMP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        365.411        139.000         69.000          0.918          0.085 
## rmsea.ci.lower rmsea.ci.upper 
##          0.074          0.095
pchisq((fitMeasures(WIIIWBMP.fit, "chisq") - fitMeasures(WIIIWBL.fit, "chisq")), (fitMeasures(WIIIWBMP.fit, "df") - fitMeasures(WIIIWBL.fit, "df")), lower.tail = F)
## chisq 
## 0.648
fitMeasures(WIIIWBMP.fit, "cfi") - fitMeasures(WIIIWBL.fit, "cfi")
## cfi 
##   0
fitMeasures(WIIIWBMP.fit, "rmsea") - fitMeasures(WIIIWBL.fit, "rmsea")
## rmsea 
##     0

Just as in the British model, only WMI can be constrained to equality. Any latent mean constrained in addition to this leads to too large a decrement in fit.

Starting Model Comparison

#Vector of BIC values, followed by weights from qpcR

BICs <- c(fitMeasures(WIIIBWMP.fit, "bic"), fitMeasures(WIIIWBMP.fit, "bic"))
BICweights <- akaike.weights(BICs)$weights

#Model name frame

Mods <- data.frame(Model = factor(rep(c("British", "South African"))),
                   CRIT = factor(rep('BIC', each = 2)),
                   values = BICweights)

#Plot and probability output

ggplot(Mods, aes(Model, values, fill = Model)) + geom_bar(stat = "identity", position = "dodge", show.legend = F) + coord_cartesian(ylim = c(0,1)) + xlab('Schwarz Weights') + ylab('Normalized Probability') + theme_bw()

Mods

In terms of information theoretic criterion, the model derived from the British sample is clearly better despite seven more df and despite a four factor model - which the South African sample apparently more closely matched - being the “specified Wechsler structure” (to quote Cockroft et al., 2015). There is actually another shocking fact here: Cockroft et al. (2015) did not attempt to compare CFA models based on the British and South African structures for the British or South African samples. For the British sample, the South African structure clearly fits worse (although it can “be replicated” in the sense that the same degree of partial MI is tenable with it), but for the South African sample, the British structure is actually preferred (information theoretic criteria are more appropriate here because other fit measures are saturated; it is possible there’s another four-factor model which is preferred but they did not offer any clear hints as to what it might look like).

#British PCA structure - South African sample

WIIIUKRSA.fit <- cfa(WIIIUKR.model, sample.cov = WAISIIIB.cov, sample.nobs = 107, std.lv = T, orthogonal = F)

# SASA = SA with SA structure; SAUK = SA with UK structure

round(cbind(SASA = fitMeasures(WIIISA.fit, FITM),
            SAUK = fitMeasures(WIIIUKRSA.fit, FITM)),3)
##                  SASA   SAUK
## chisq          53.773 56.822
## df             58.000 59.000
## npar           33.000 32.000
## cfi             1.000  1.000
## rmsea           0.000  0.000
## rmsea.ci.lower  0.000  0.000
## rmsea.ci.upper  0.052  0.055
BICSAC <- c(fitMeasures(WIIIUKRSA.fit, "bic"), fitMeasures(WIIISA.fit, "bic"))
BICweightsSAC <- akaike.weights(BICSAC)$weights
ModsSAC <- data.frame(Model = factor(rep(c("British", "South African"))),
                   CRITSAC = factor(rep('BIC', each = 2)),
                   values = BICweightsSAC)

ggplot(ModsSAC, aes(Model, values, fill = Model)) + geom_bar(stat = "identity", position = "dodge", show.legend = F) + coord_cartesian(ylim = c(0,1)) + xlab('Schwarz Weights') + ylab('Normalized Probability') + theme_bw()

ModsSAC

Assessing the Magnitude of Bias

At this point, it’s appropriate to assess the \(SDI_2\) in order to test the claim that the WAIS-III showed a “Eurocentric and favorable SES” bias. If this is the case, then the aggregate effect of the differences in the released parameters will be to decrease the expected British observed scores (negative sign) if they use the South African sample’s loadings and intercepts, since I am using the British sample as group 2 in the following function (this is arbitrary and does not affect the results). Parameters come from the model with British structure since this was preferred and from the homogeneity of latent variances model since it’s likely that there was insufficient power to adequately detect a difference in working memory (type-II error).

parameterEstimates(WIIIBWL.fit, stand = T) %>% 
  filter(op == "=~") %>% 
  select('Latent Factor' = lhs, Indicator = rhs, B = est, SE = se, Z = z, 'p-value' = pvalue, Beta = std.all) %>% 
  kable(digits = 3, format = "pandoc", caption = "Factor Loadings")
Factor Loadings
Latent Factor Indicator B SE Z p-value Beta
POI PA 2.074 0.095 21.839 0 0.924
POI SS 1.514 0.131 11.521 0 0.567
POI DSC 1.299 0.136 9.568 0 0.482
POI PC 2.154 0.138 15.601 0 0.702
POI MR 2.015 0.127 15.909 0 0.740
POI BD 2.109 0.129 16.302 0 0.729
VCI V 2.245 0.112 20.014 0 0.853
VCI S 2.523 0.139 18.152 0 0.831
VCI I 2.202 0.131 16.817 0 0.765
VCI C 1.558 0.091 17.200 0 0.608
VCI A 0.626 0.092 6.779 0 0.206
WMI C 0.977 0.104 9.358 0 0.381
WMI DS 2.228 0.145 15.366 0 0.767
WMI LNS 2.069 0.138 14.992 0 0.752
WMI A 1.735 0.154 11.299 0 0.570
POI PA 2.074 0.095 21.839 0 0.489
POI SS 1.514 0.131 11.521 0 0.424
POI DSC 1.299 0.136 9.568 0 0.351
POI PC 2.154 0.138 15.601 0 0.557
POI MR 2.015 0.127 15.909 0 0.599
POI BD 2.109 0.129 16.302 0 0.588
VCI V 2.245 0.112 20.014 0 0.822
VCI S 1.104 0.146 7.562 0 0.700
VCI I 2.202 0.131 16.817 0 0.724
VCI C 1.558 0.091 17.200 0 0.589
VCI A 0.626 0.092 6.779 0 0.202
WMI C 0.977 0.104 9.358 0 0.277
WMI DS 2.228 0.145 15.366 0 0.621
WMI LNS 2.069 0.138 14.992 0 0.602
WMI A 1.735 0.154 11.299 0 0.421
parameterEstimates(WIIIBWL.fit, stand = T) %>% 
  filter(op == "~1") %>% 
  select(Indicator = lhs, "Unstandardized Intercept" = est, SE = se, Z = z, 'p-value' = pvalue, "Standardized Intercept" = std.all) %>% 
  kable(digits = 3, format = "pandoc", caption = "Intercepts")
Intercepts
Indicator Unstandardized Intercept SE Z p-value Standardized Intercept
PA 10.090 0.120 84.122 0.000 4.496
SS 8.250 0.143 57.651 0.000 3.086
DSC 7.720 0.144 53.474 0.000 2.862
PC 10.478 0.159 66.003 0.000 3.414
MR 10.590 0.146 72.623 0.000 3.887
BD 10.682 0.150 71.053 0.000 3.695
V 14.523 0.140 103.423 0.000 5.519
S 14.220 0.162 87.511 0.000 4.684
I 11.750 0.154 76.229 0.000 4.080
C 12.528 0.136 92.222 0.000 4.886
A 9.677 0.160 60.656 0.000 3.178
DS 9.406 0.151 62.083 0.000 3.239
LNS 8.893 0.143 62.164 0.000 3.231
POI 0.000 0.000 NA NA 0.000
VCI 0.000 0.000 NA NA 0.000
WMI 0.000 0.000 NA NA 0.000
PA 10.090 0.120 84.122 0.000 3.495
SS 11.208 0.277 40.507 0.000 4.607
DSC 11.035 0.284 38.836 0.000 4.373
PC 10.478 0.159 66.003 0.000 3.979
MR 12.672 0.269 47.059 0.000 5.533
BD 10.682 0.150 71.053 0.000 4.370
V 14.523 0.140 103.423 0.000 6.023
S 10.816 0.364 29.728 0.000 7.762
I 15.568 0.385 40.465 0.000 5.792
C 12.528 0.136 92.222 0.000 5.367
A 9.677 0.160 60.656 0.000 3.544
DS 9.406 0.151 62.083 0.000 3.960
LNS 8.893 0.143 62.164 0.000 3.911
POI -0.989 0.116 -8.531 0.000 -1.452
VCI -2.442 0.167 -14.656 0.000 -2.765
WMI 0.113 0.105 1.077 0.281 0.171
print(ints <- list((SS = 3.086 - 4.607), (DSC = 2.862 - 4.373), (MR = 3.887 - 5.533), (S = 4.684 - 7.762), (I = 4.08 - 5.792)))
## [[1]]
## [1] -1.521
## 
## [[2]]
## [1] -1.511
## 
## [[3]]
## [1] -1.646
## 
## [[4]]
## [1] -3.078
## 
## [[5]]
## [1] -1.712
print(ints <- list((SS = 8.250 - 11.208), (DSC = 7.720 - 11.035), (MR = 10.59 - 12.672), (S = 14.220 - 10.816), (I = 11.750 - 15.568)))
## [[1]]
## [1] -2.958
## 
## [[2]]
## [1] -3.315
## 
## [[3]]
## [1] -2.082
## 
## [[4]]
## [1] 3.404
## 
## [[5]]
## [1] -3.818
#Parameters for POI

Pp = 6; Pl1 = c(0, 0, 0, 0, 0, 0); Pl2 = c(0, 0, 0, 0, 0, 0); Pi1 = c(0, 0, 0, 0, 0, 0); Pi2 = c(0, -1.521, -1.511, 0, -1.646, 0); Psd = c(2.26, 2.62, 2.68, 3.13, 2.71, 2.87); Pfm = 1.452; Pfsd = 1

#Parameters for VCI

Vp = 5; Vl1 = c(0, 0.700, 0, 0, 0, 0); Vl2 = c(0, 0.831, 0, 0, 0, 0); Vi1 = c(0, 0, 0, 0, 0); Vi2 = c(0, -3.078, -1.712, 0, 0); Vsd = c(2.66, 3.04, 2.72, 2.59, 3.14); Vfm = 2.765; Vfsd = 1

#Parameters for WMI

Wp = 4; Wl1 = c(0, 0, 0, 0); Wl2 = c(0, 0, 0, 0); Wi1 = c(0, 0, 0, 0); Wi2 = c(0, 0, 0, 0); Wsd = c(2.59, 2.94, 2.72, 3.14); Wfm = -0.171; Wfsd = 1
#Parameter inputs for POI and VCI

PES <- SDI2.UDI2(Pp, Pl1, Pl2, Pi1, Pi2, Psd, Pfm, Pfsd)
VES <- SDI2.UDI2(Vp, Vl1, Vl2, Vi1, Vi2, Vsd, Vfm, Vfsd)

#POI and VCI bias

PES$SDI2
##        SDI2
## [1,] 0.0000
## [2,] 0.5804
## [3,] 0.5637
## [4,] 0.0000
## [5,] 0.6073
## [6,] 0.0000
VES$SDI2
##        SDI2
## [1,] 0.0000
## [2,] 0.8835
## [3,] 0.6215
## [4,] 0.0000
## [5,] 0.0000
#Corrected mean differences: uncorrected observed mean differences + bias effect

print(corrds <- list((SS = 0.55 + PES$SDI2[2]), (DSC = 0.78 + PES$SDI2[3]), (MR = 0.04 + PES$SDI2[5]), (S = 2.58 + VES$SDI2[2]), (I = 0.52 + VES$SDI2[3])))
## [[1]]
## [1] 1.1304
## 
## [[2]]
## [1] 1.3437
## 
## [[3]]
## [1] 0.6473
## 
## [[4]]
## [1] 3.4635
## 
## [[5]]
## [1] 1.1415

Using the unstandardized values somewhat changes the result.

#Parameters for POI

Pp = 6; Pl1 = c(0, 0, 0, 0, 0, 0); Pl2 = c(0, 0, 0, 0, 0, 0); Pi1 = c(0, 0, 0, 0, 0, 0); Pi2 = c(0, -2.958, -3.315, 0, -2.082, 0); Psd = c(2.26, 2.62, 2.68, 3.13, 2.71, 2.87); Pfm = 1.452; Pfsd = 1

#Parameters for VCI

Vp = 5; Vl1 = c(0, 0.700, 0, 0, 0, 0); Vl2 = c(0, 0.831, 0, 0, 0, 0); Vi1 = c(0, 0, 0, 0, 0); Vi2 = c(0, 3.404, -3.818, 0, 0); Vsd = c(2.66, 3.04, 2.72, 2.59, 3.14); Vfm = 2.765; Vfsd = 1

#Parameters for WMI

Wp = 4; Wl1 = c(0, 0, 0, 0); Wl2 = c(0, 0, 0, 0); Wi1 = c(0, 0, 0, 0); Wi2 = c(0, 0, 0, 0); Wsd = c(2.59, 2.94, 2.72, 3.14); Wfm = -0.171; Wfsd = 1
#Parameter inputs for POI and VCI

PES <- SDI2.UDI2(Pp, Pl1, Pl2, Pi1, Pi2, Psd, Pfm, Pfsd)
VES <- SDI2.UDI2(Vp, Vl1, Vl2, Vi1, Vi2, Vsd, Vfm, Vfsd)

#POI and VCI bias

PES$SDI2
##        SDI2
## [1,] 0.0000
## [2,] 1.1288
## [3,] 1.2367
## [4,] 0.0000
## [5,] 0.7681
## [6,] 0.0000
VES$SDI2
##         SDI2
## [1,]  0.0000
## [2,] -1.2219
## [3,]  1.3861
## [4,]  0.0000
## [5,]  0.0000
#Corrected mean differences: uncorrected observed mean differences + bias effect

print(corrds <- list((SS = 0.55 + PES$SDI2[2]), (DSC = 0.78 + PES$SDI2[3]), (MR = 0.04 + PES$SDI2[5]), (S = 2.58 + VES$SDI2[2]), (I = 0.52 + VES$SDI2[3])))
## [[1]]
## [1] 1.6788
## 
## [[2]]
## [1] 2.0167
## 
## [[3]]
## [1] 0.8081
## 
## [[4]]
## [1] 1.3581
## 
## [[5]]
## [1] 1.9061

If unstandardized conclusions are correct then the bias is still largely in favor of the South African group, but the bias in similarities is reversed.

Conclusion

It is important to use consistent definitions of bias, especially when the model one wants to use is based on a specific definition. Importantly, had the bias gone in the direction Cockroft et al. (2015) expected, this would not have vindicated a role for culture or SES: the only determinate result in this instance is a negative one. Finding bias in the expected direction would have lended credence to their idea, but it would not have proved it.

Regarding the predictions at the top of this post, prediction 1 was not supported, as the bias went in the opposite of the direction implied by that prediction. Prediction 2 was likewise not supported since symbol search, digit-symbol coding, and matrix reasoning were biased; there is partial support since the bias on these was smaller than the similarities and information biases, but bias was larger than the remainder of the items, which included verbal and nonverbal subtests, basically without bias. Because most subtests were not biased, prediction 3 is not supported, and moreover, there is a qualitative lack of support for prediction 3 because the bias went in the opposite direction. Prediction 4 and its addendums were simply not supported, and in fact, the comprehension cross-loading did not need to be freed to reach acceptable fit. The probable explanation for their mentioning this cross-loading when it was supported was that they used an atypical cutoff for their PCA and the loading fell slightly below that level. Although it cannot be tested because the raw data are unavailable, prediction 5, on the basis of these results, the original authors would need to argue, is incorrect, since they suggested SES as an explanation for the group differences as they are and they went in the opposite of the expected direction. A more reasonable conclusion is that there are different explanations for the observed differences and biases.

Using the datasets analyzed by Dolan, Roorda & Wicherts (2004), I found that the bias was more (but not completely) consistently against the lower-scoring groups in those examples in which a partial MI model could be fit. I may supply the code for this result here in the future. Nonetheless, bias should still not be claimed to be present or to favor any particular group without a proper analysis; it’s liable, at least on occasion, to go in different directions in unexpected ways as in this analysis. Future analyses should assess the predictability of bias, meta-analytically assessing it and then, perhaps, surveying experts (psychologists, psychometricians, test developers, sociologists, etc.) to ascertain whether it can, in fact, be predicted, from context or criteria like the mean differences, SDs, and test content or type.

It is regrettable, but Frontiers has indicated that they do not want anything to do with this reanalysis of the Cockroft et al. (2015) results, so this will stay stagnant until a future editorial. Frontiers was unable to explain the initial reason for their disdain but after persistent contact, they claimed they wanted nothing to do with it for another reason (that it used novel data) which was not tenable, so they changed their reasoning back to simply failing to explain why they wanted nothing to do with it (despite claiming their review process is open and clear). This took more than a month and ten emails to find out.

General Intelligence

Confirmatory Models of g

Higher-order Model

Statistical Preference over The Measurement Model

The higher-order model is an empirically well-supported, parsimonious, and theoretically coherent explanation of the universal positive manifold among cognitive tests (Carroll, 1993; Jensen, 1998; Hood, 2010). It is generally theoretically preferred to the bifactor model and it has the added benefit beyond PCA and orthogonal models that all residual factors and components contribute positively to the matrix hierarchy (Jensen, 1987). When there are >3 factors, the model has fewer df, but generally equal or better fit, compared to the correlated group factors measurement model; with three, it cannot be distinguished in most case (and is liable to generate certain types of mostly inconsequential errors). I illustrate this empirical regularity using the four-factor South African structure below.

#South African PCA structure - higher-order

WIIISAHOF.model <- '
VCI =~ I + V + S + C + A
POI =~ MR + PC + BD + PA
WMI =~ A + DS + LNS
PSI =~ DSC + SS

g =~ POI + VCI + WMI + PSI'

WIIISAHOF.fit <- cfa(WIIISAHOF.model, sample.cov = WAISIIIB.cov, sample.nobs = 107, std.lv = T, orthogonal = T)

print(list(fitMeasures(WIIISA.fit, c("df", "bic")), fitMeasures(WIIISAHOF.fit, c("df", "bic"))))
## [[1]]
##       df      bic 
##   58.000 6121.117 
## 
## [[2]]
##      df     bic 
##   60.00 6116.72
#Compare to correlated group factor model

BICHOF <- c(fitMeasures(WIIISA.fit, "bic"), fitMeasures(WIIISAHOF.fit, "bic"))
BICweightsHOF <- akaike.weights(BICHOF)$weights
ModsHOF <- data.frame(Model = factor(rep(c("Measurement", "Higher-Order"))),
                   CRITHOF = factor(rep('BIC', each = 2)),
                   values = BICweightsHOF)

ggplot(ModsHOF, aes(Model, values, fill = Model)) + geom_bar(stat = "identity", position = "dodge", show.legend = F) + coord_cartesian(ylim = c(0,1)) + xlab('Schwarz Weights') + ylab('Normalized Probability') + theme_bw()

ModsHOF

Higher-order Measurement Invariance and Spearman’s Hypothesis Testing Procedures

In order to assess MI in a higher-order model, it’s required that MI with respect to a measurement model is tested first since invariance with respect to group factors cannot be simultaneously tested alongside invariance with respect to the factor above them, g. The bifactor model, due to being a model in which all factors - g and group factors - are orthogonal, has no such requirement. The initial table explaining MI is continued here in order to explain MI in the higher-order model.

Model Nesting Common Name(s) Statement
B1 A3/4/5 Second-order invariance \(\lambda_{r} = \lambda_{f}, \Psi_{r}^* = \Psi_{f}^*,\Phi_{r}^* \neq \Phi_{f}^*, \Theta_{r} = \Theta_{f}, \Gamma_{r} = \Gamma_{f}, \tau_{r} \neq \tau_{f}, \delta = 0, \tau_g = 0\)
B2 B1, B3-B5 Strong Spearman’s hypothesis \(\tau_{r} = \tau_{f}, \tau_g \neq 0\)
B3 B1 Weak Spearman’s hypothesis 1 \(\Psi_{r}^* = \Psi_{f}^*, \delta \neq 0\)
B4 B1 Weak Spearman’s hypothesis n \(\Psi_{r}^* = \Psi_{f}^*, \delta \neq 0\)
B5 B3-B4 Contra-Spearman’s hypothesis \(\tau_g = 0\)

The syntax here can be more confusing because the hypothesis is more variable than standard MI testing. In the first higher-order (or second-order/third-order, then higher-order models in rare cases) model, a g factor replaces the covariances among the group factors from the most invariant measurement model and that is all. In the strong model of Spearman’s hypothesis, all group factor means are constrained to zero and only g is allowed to vary. In the weak models of Spearman’s hypothesis, a single group factor is constrained at a time and g is allowed to vary, and then, another group factor is constrained until no more can be. The contra model is then fitted atop the contra model with the most constrained group factors. At this point, the fits of the unconstrained (if a measurement model could not employ a group factor constraint), strong, (most constrained weak), and contra models are compared. If the strong model fits best, group differences are attributable to g alone; if the weak model fits best, g and group factors explain group differences, and if the contra model fits best, g does not contribute to group differences. It can be difficult to differentiate these models thanks to power, insufficient variability, sampling variance, a narrow breadth of content, and so on. The Spearman’s hypothesis testing procedure is likewise for the bifactor model.

Bifactor Model

This model does not suffer from proportionality constraints, reduces tetrad violations, and allows factors to be assessed independently; unfortunately, it also models noise, and can be accepted instead of the correct model in simulations (there are many papers on each of these). There are theoretical reasons the model is unacceptable as well (see Hood, 2010). If may be valid, but that remains to be seen; behavior genetic testing would be useful to assess if it can be regarded as valid and explain empirical phenomena like the Jensen effect (see van der Maas et al., 2006; Kan, van der Maas & Kievit, 2016; Franic et al., 2013). Because of the reduced confounding among factors, it has been suggested as a more powerful option for testing Spearman’s hypothesis (Frisby & Beaujean, 2015). Given that bias is always with respect to a given factor structure, it may be useful to test MI in this model in complete in addition to the measurement model, but it is - to repeat - unknown whether it models away bias.

Interpretability Indices

There are more substantial ways to assess the presence of a general factor, g, which can be quite useful for higher-order models with two to three factors. For example, coefficient \(\omega_{hierarchical}\) which indexes the general factor saturation of a battery, or \(PUC\), which is the proportion of correlations in a given correlation matrix which inform directly on the general factor rather than the group factors. These are given by

\[\omega_h = \frac{\Sigma \lambda_g^2}{(\Sigma \lambda_g^2) + (\Sigma \lambda_{\eta_1}^2) + (\Sigma \lambda_{\eta_x}^2) + (\Sigma 1-h^2)}\]

\[PUC = \frac{\frac{k_{items}\times(k_{items}-1)}{2} - k_{factors} \times \frac{k_{items per factor}\times{(k_{items per factor}-1)}}2}{\frac{k_{items}\times(k_{items}-1)}{2}}\]

Metrics like the explained common variance (ECV) and H (Hancock & Mueller, 2001) are useful for evaluating the robustness of general and group factors (and \(\omega_h\) can be applied as \(\omega_{h subtest}\) as well). These are given as

\[ECV = \frac{\Sigma \lambda_g^2}{(\Sigma \lambda_g^2) + (\Sigma \lambda_{\eta_1}^2) + (\Sigma \lambda_{\eta_x}^2)}\]

\[H = 1/\left[1+\frac{1}{\sum_{i=1}^{k}\frac{\lambda_i^2}{1-\lambda_i^2}}\right]\]

There are many more examples of similar metrics. Their results are generally highly correlated. For more information on these, see Reise, Scheiners, Widman & Haviland (2012), Rodriguez, Reise & Haviland (2015, 2016), and Bonifay, Reise, Scheiners & Meijer (2015). To obtain the loadings required for this usually requires performing a Schmid-Leiman transformation (or using similar methods) or changing the model to a bifactor structure (Mansolf & Reise, 2012, 2017). These indices can be applied to the factors from the measurement model, but in doing so, the reliability will be inflated due to g variance being conflated with the group factor variance. This can lead to improper assessments of the dimensionality of a battery, whereby a researcher believes it to be multidimensional when it is largely unidimensional; sometimes this even results in seeing a group factor as reliable when it is plainly not. As dimensionality assessments, many of these metrics are capable of enhancing the case for a psychometric g - alongside, as well, exploratory analyses which, for example, uniformly showcase a dominant dimension in intelligence research (through many methods often derived similarly) - rather than a nearly statistically equivalent measurement model or similar. Even if a network model is accepted, the dimensionality as assessed by many of these metrics remains as it is (at least with respect to factors which maintain representation in networks, like g since it is the most general level of the hierarchy of relationships in a given matrix; group factors are more likely to be broken up and clustered differently, in part due to harder identification constraints on factor models).

Analysis of Spearman’s Hypothesis

All analyses in this section proceed from the British measurement model. The higher-order model, since it is based on three group factors, is simply modeled initially with the same freed parameters as the partial measurement models at their equivalent steps. Interpretability indices are computed from the bifactor model for ease and presented in a table below it.

Higher-order Invariance

It is not possible in this instance to differentiate the higher-order model from the measurement model with goodness of fit measures because, as described above, the df is equivalent with three or fewer group factors. Nonetheless, it is believed to be there due to results with more expansive batteries and testing, and it is generally supported by the interpretability indices and other facts and measures. The parallel analysis used earlier did support it.

# Higher-order model

WIIIUKRHOF.model <- '
POI =~ PA + SS + DSC + PC + MR + BD
VCI =~ V + S + I + C + A
WMI =~ C + DS + LNS + A

DSC ~~ SS

g =~ POI + VCI + WMI'

WIIIUKRHOF.fit <- cfa(WIIIUKRHOF.model, sample.cov = WAISIIIW.cov, sample.nobs = 349, std.lv = T, orthogonal = T)
summary(WIIIUKRHOF.fit, stand = T, fit = T)
## lavaan 0.6-5 ended normally after 48 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         32
##                                                       
##   Number of observations                           349
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                               192.576
##   Degrees of freedom                                59
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              2475.174
##   Degrees of freedom                                78
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.944
##   Tucker-Lewis Index (TLI)                       0.926
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -9904.529
##   Loglikelihood unrestricted model (H1)      -9808.241
##                                                       
##   Akaike (AIC)                               19873.058
##   Bayesian (BIC)                             19996.420
##   Sample-size adjusted Bayesian (BIC)        19894.905
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.081
##   90 Percent confidence interval - lower         0.068
##   90 Percent confidence interval - upper         0.093
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.062
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   POI =~                                                                
##     PA                0.800    0.505    1.586    0.113    2.098    0.930
##     SS                0.569    0.361    1.577    0.115    1.493    0.571
##     DSC               0.503    0.320    1.570    0.116    1.319    0.493
##     PC                0.801    0.506    1.582    0.114    2.101    0.672
##     MR                0.769    0.485    1.585    0.113    2.017    0.745
##     BD                0.787    0.497    1.584    0.113    2.062    0.720
##   VCI =~                                                                
##     V                 1.982    0.124   15.957    0.000    2.227    0.838
##     S                 2.243    0.142   15.785    0.000    2.519    0.830
##     I                 1.844    0.128   14.374    0.000    2.071    0.762
##     C                 1.542    0.115   13.426    0.000    1.732    0.670
##     A                 0.867    0.148    5.859    0.000    0.974    0.311
##   WMI =~                                                                
##     C                 0.766    0.101    7.616    0.000    0.933    0.361
##     DS                1.908    0.171   11.189    0.000    2.324    0.792
##     LNS               1.687    0.153   11.000    0.000    2.055    0.757
##     A                 1.248    0.154    8.092    0.000    1.520    0.485
##   g =~                                                                  
##     POI               2.424    1.787    1.356    0.175    0.924    0.924
##     VCI               0.511    0.096    5.301    0.000    0.455    0.455
##     WMI               0.695    0.139    5.016    0.000    0.571    0.571
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .SS ~~                                                                 
##    .DSC               3.422    0.340   10.059    0.000    3.422    0.684
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .PA                0.689    0.141    4.906    0.000    0.689    0.135
##    .SS                4.617    0.367   12.565    0.000    4.617    0.675
##    .DSC               5.422    0.424   12.782    0.000    5.422    0.757
##    .PC                5.355    0.444   12.062    0.000    5.355    0.548
##    .MR                3.254    0.286   11.371    0.000    3.254    0.444
##    .BD                3.960    0.339   11.667    0.000    3.960    0.482
##    .V                 2.097    0.235    8.913    0.000    2.097    0.297
##    .S                 2.869    0.312    9.187    0.000    2.869    0.311
##    .I                 3.088    0.288   10.725    0.000    3.088    0.419
##    .C                 1.977    0.201    9.822    0.000    1.977    0.296
##    .A                 5.804    0.491   11.823    0.000    5.804    0.590
##    .DS                3.218    0.442    7.278    0.000    3.218    0.373
##    .LNS               3.155    0.378    8.353    0.000    3.155    0.428
##    .POI               1.000                               0.145    0.145
##    .VCI               1.000                               0.793    0.793
##    .WMI               1.000                               0.674    0.674
##     g                 1.000                               1.000    1.000
#Plot the model - regular hierarchical and circular, in sky blue for fun

#semPaths(WIIIUKRHOF.fit, "model", "std", title = F, residuals = F, groups = "BRLAT", pastel = T, mar = c(2, 1, 3, 1))

semPaths(WIIIUKRHOF.fit, "std", title = F, residuals = F, mar = c(2, 1, 3, 1), posCol = c("skyblue4", "red"), sizeMan = 7, edge.label.cex = 0.8)

#semPaths(WIIIUKRHOF.fit, "model", "std", title = F, residuals = F, groups = "BRLAT", pastel = T, mar = c(2, 1, 3, 1), layout = "circle2", bifactor = "g")

semPaths(WIIIUKRHOF.fit, "std", title = F, residuals = F, mar = c(2, 1, 3, 1), layout = "circle2", bifactor = "g", posCol = c("skyblue4", "red"), sizeMan = 7, edge.label.cex = 1.2)

#These are not actually part of the higher-order invariance testing process, they were merely given illustratively.

#Configural model

WIIIBWCHOF.fit <- cfa(WIIIUKRHOF.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F)

parameterEstimates(WIIIBWCHOF.fit, stand = T)
fitMeasures(WIIIBWCHOF.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        249.398        118.000         90.000          0.952          0.070 
## rmsea.ci.lower rmsea.ci.upper 
##          0.058          0.082
#Metric model

WIIIBWMHOF.fit <- cfa(WIIIUKRHOF.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = "loadings", group.partial = "VCI=~S")

parameterEstimates(WIIIBWMHOF.fit, stand = T)
fitMeasures(WIIIBWMHOF.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        278.347        131.000         77.000          0.946          0.070 
## rmsea.ci.lower rmsea.ci.upper 
##          0.059          0.082
pchisq((fitMeasures(WIIIBWMHOF.fit, "chisq") - fitMeasures(WIIIBWCHOF.fit, "chisq")), (fitMeasures(WIIIBWMHOF.fit, "df") - fitMeasures(WIIIBWCHOF.fit, "df")), lower.tail = F)
## chisq 
## 0.007
fitMeasures(WIIIBWMHOF.fit, "cfi") - fitMeasures(WIIIBWCHOF.fit, "cfi")
##    cfi 
## -0.006
fitMeasures(WIIIBWMHOF.fit, "rmsea") - fitMeasures(WIIIBWCHOF.fit, "rmsea")
## rmsea 
##     0
#Scalar model

WIIIBWSHOF.fit <- cfa(WIIIUKRHOF.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 8.055404e-16) is close to zero. This may be a symptom that the
##     model is not identified.
parameterEstimates(WIIIBWSHOF.fit, stand = T)
fitMeasures(WIIIBWSHOF.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        308.749        135.000         73.000          0.937          0.075 
## rmsea.ci.lower rmsea.ci.upper 
##          0.064          0.086
pchisq((fitMeasures(WIIIBWSHOF.fit, "chisq") - fitMeasures(WIIIBWMHOF.fit, "chisq")), (fitMeasures(WIIIBWSHOF.fit, "df") - fitMeasures(WIIIBWMHOF.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWSHOF.fit, "cfi") - fitMeasures(WIIIBWMHOF.fit, "cfi")
##   cfi 
## -0.01
fitMeasures(WIIIBWSHOF.fit, "rmsea") - fitMeasures(WIIIBWMHOF.fit, "rmsea")
## rmsea 
## 0.005
#Strict model

WIIIBWRHOF.fit <- cfa(WIIIUKRHOF.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 6.114806e-15) is close to zero. This may be a symptom that the
##     model is not identified.
parameterEstimates(WIIIBWRHOF.fit, stand = T)
fitMeasures(WIIIBWRHOF.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        343.674        146.000         62.000          0.928          0.077 
## rmsea.ci.lower rmsea.ci.upper 
##          0.067          0.088
pchisq((fitMeasures(WIIIBWRHOF.fit, "chisq") - fitMeasures(WIIIBWSHOF.fit, "chisq")), (fitMeasures(WIIIBWRHOF.fit, "df") - fitMeasures(WIIIBWSHOF.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWRHOF.fit, "cfi") - fitMeasures(WIIIBWSHOF.fit, "cfi")
##    cfi 
## -0.009
fitMeasures(WIIIBWRHOF.fit, "rmsea") - fitMeasures(WIIIBWSHOF.fit, "rmsea")
## rmsea 
## 0.002
#Homogeneity of latent variances

WIIIBWLHOF.fit <- cfa(WIIIUKRHOF.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 2.206087e-14) is close to zero. This may be a symptom that the
##     model is not identified.
parameterEstimates(WIIIBWLHOF.fit, stand = T)
fitMeasures(WIIIBWLHOF.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        343.674        146.000         62.000          0.928          0.077 
## rmsea.ci.lower rmsea.ci.upper 
##          0.067          0.088
pchisq((fitMeasures(WIIIBWLHOF.fit, "chisq") - fitMeasures(WIIIBWRHOF.fit, "chisq")), (fitMeasures(WIIIBWLHOF.fit, "df") - fitMeasures(WIIIBWRHOF.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWLHOF.fit, "cfi") - fitMeasures(WIIIBWRHOF.fit, "cfi")
## cfi 
##   0
fitMeasures(WIIIBWLHOF.fit, "rmsea") - fitMeasures(WIIIBWRHOF.fit, "rmsea")
## rmsea 
##     0

Higher-order Spearman’s Hypothesis

#Strong, weak-POI, weak-vCI, weak-WMI

WIIIBWLHOFS.fit <- cfa(WIIIUKRHOFS.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"), control=list(rel.tol=1e-4), check.gradient = F)

WIIIBWLHOFWP.fit <- cfa(WIIIUKRHOFWP.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

WIIIBWLHOFWV.fit <- cfa(WIIIUKRHOFWV.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

WIIIBWLHOFWW.fit <- cfa(WIIIUKRHOFWW.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

print(list(fitMeasures(WIIIBWLHOFS.fit, FITM), fitMeasures(WIIIBWLHOFWP.fit, FITM), fitMeasures(WIIIBWLHOFWV.fit, FITM), fitMeasures(WIIIBWLHOFWW.fit, FITM)))
## [[1]]
##          chisq             df           npar            cfi          rmsea 
##        442.055        149.000         59.000          0.893          0.093 
## rmsea.ci.lower rmsea.ci.upper 
##          0.083          0.103 
## 
## [[2]]
##          chisq             df           npar            cfi          rmsea 
##        343.674        147.000         61.000          0.928          0.077 
## rmsea.ci.lower rmsea.ci.upper 
##          0.066          0.087 
## 
## [[3]]
##          chisq             df           npar            cfi          rmsea 
##        343.674        147.000         61.000          0.928          0.077 
## rmsea.ci.lower rmsea.ci.upper 
##          0.066          0.087 
## 
## [[4]]
##          chisq             df           npar            cfi          rmsea 
##        343.674        147.000         61.000          0.928          0.077 
## rmsea.ci.lower rmsea.ci.upper 
##          0.066          0.087

Constraining all of the group factors (strong Spearman’s hypothesis) is not viable, but none of the weak models can be distinguished. Therefore, the following models are weak models with even more group factors constrained.

#Weak-POI-VCI, weak-POI-WMI, weak-VCI-WMI

WIIIBWLHOFWPV.fit <- cfa(WIIIUKRHOFWPV.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"), control=list(rel.tol=1e-4), check.gradient = F)

WIIIBWLHOFWPW.fit <- cfa(WIIIUKRHOFWPW.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"), control=list(rel.tol=1e-4), check.gradient = F)
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
WIIIBWLHOFWVW.fit <- cfa(WIIIUKRHOFWVW.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"), control=list(rel.tol=1e-4), check.gradient = F)

print(list(fitMeasures(WIIIBWLHOFWPV.fit, FITM), fitMeasures(WIIIBWLHOFWPW.fit, FITM), fitMeasures(WIIIBWLHOFWVW.fit, FITM)))
## [[1]]
##          chisq             df           npar            cfi          rmsea 
##        398.754        148.000         60.000          0.909          0.086 
## rmsea.ci.lower rmsea.ci.upper 
##          0.076          0.096 
## 
## [[2]]
##          chisq             df           npar            cfi          rmsea 
##        384.575        148.000         60.000          0.914          0.084 
## rmsea.ci.lower rmsea.ci.upper 
##          0.074          0.094 
## 
## [[3]]
##          chisq             df           npar            cfi          rmsea 
##        441.682        148.000         60.000          0.893          0.093 
## rmsea.ci.lower rmsea.ci.upper 
##          0.083          0.103

None of these models is a very good description of the mean differences, but the best involves constraining POI and WMI. Because of that, I test contra models of each of the individual group factor constraints and that model.

#Contra POI, VCI, WMI, and POI-WMI

WIIIBWLHOFCP.fit <- cfa(WIIIUKRHOFCP.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"), control=list(rel.tol=1e-4), check.gradient = F)

WIIIBWLHOFCW.fit <- cfa(WIIIUKRHOFCW.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

WIIIBWLHOFCV.fit <- cfa(WIIIUKRHOFCV.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

WIIIBWLHOFCPW.fit <- cfa(WIIIUKRHOFCPW.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "DSC~1", "SS~1", "S~1", "PA~~PA", "S~~S"))

print(list(fitMeasures(WIIIBWLHOFCP.fit, FITM), fitMeasures(WIIIBWLHOFCW.fit, FITM), fitMeasures(WIIIBWLHOFCV.fit, FITM), fitMeasures(WIIIBWLHOFCPW.fit, FITM)))
## [[1]]
##          chisq             df           npar            cfi          rmsea 
##        415.577        148.000         60.000          0.903          0.089 
## rmsea.ci.lower rmsea.ci.upper 
##          0.079          0.099 
## 
## [[2]]
##          chisq             df           npar            cfi          rmsea 
##        344.855        148.000         60.000          0.928          0.076 
## rmsea.ci.lower rmsea.ci.upper 
##          0.066          0.087 
## 
## [[3]]
##          chisq             df           npar            cfi          rmsea 
##        550.365        148.000         60.000          0.854          0.109 
## rmsea.ci.lower rmsea.ci.upper 
##          0.100          0.119 
## 
## [[4]]
##          chisq             df           npar            cfi          rmsea 
##        434.634        149.000         59.000          0.896          0.092 
## rmsea.ci.lower rmsea.ci.upper 
##          0.082          0.102

Because of parsimony, the best-fitting model among the weak and contra models is the contra-WMI model, but there’s no power to detect differences as in Dolan (2000), Dolan & Hamaker (2001) and some of the information theoretic comparisons here, perhaps.

#Mean comparisons between the unrestricted and the contra-WMI model

parameterEstimates(WIIIBWLHOF.fit, stand = T) %>% 
  filter(op == "~1") %>% 
  select(Indicator = lhs, "Unstandardized Intercept" = est, SE = se, Z = z, 'p-value' = pvalue, "Standardized Intercept" = std.all) %>% 
  kable(digits = 3, format = "pandoc", caption = "Intercepts")
Intercepts
Indicator Unstandardized Intercept SE Z p-value Standardized Intercept
PA 10.090 0.120 84.365 0.000 4.509
SS 8.250 0.143 57.757 0.000 3.092
DSC 7.720 0.144 53.549 0.000 2.866
PC 10.478 0.158 66.186 0.000 3.422
MR 10.590 0.145 72.844 0.000 3.899
BD 10.682 0.150 71.259 0.000 3.705
V 14.521 0.141 102.934 0.000 5.493
S 14.220 0.163 87.161 0.000 4.666
I 11.750 0.155 76.009 0.000 4.069
C 12.530 0.137 91.321 0.000 4.839
A 9.680 0.160 60.422 0.000 3.165
DS 9.405 0.152 62.024 0.000 3.237
LNS 8.892 0.143 62.072 0.000 3.228
POI 0.000 0.000 NA NA 0.000
VCI 0.000 0.000 NA NA 0.000
WMI 0.000 0.000 NA NA 0.000
g 0.000 0.000 NA NA 0.000
PA 10.090 0.120 84.365 0.000 3.472
SS 11.206 0.276 40.527 0.000 4.587
DSC 11.034 0.284 38.855 0.000 4.360
PC 10.478 0.158 66.186 0.000 3.947
MR 12.667 0.269 47.081 0.000 5.480
BD 10.682 0.150 71.259 0.000 4.333
V 14.521 0.141 102.934 0.000 6.080
S 10.800 0.370 29.164 0.000 7.827
I 15.567 0.384 40.491 0.000 5.842
C 12.530 0.137 91.321 0.000 5.559
A 9.680 0.160 60.422 0.000 3.598
DS 9.405 0.152 62.024 0.000 3.962
LNS 8.892 0.143 62.072 0.000 3.915
POI -0.335 0.112 -2.983 0.003 -0.258
VCI -2.246 0.173 -12.988 0.000 -2.242
WMI 0.988 0.176 5.605 0.000 1.127
g -0.965 0.101 -9.591 0.000 -1.326
parameterEstimates(WIIIBWLHOFCW.fit, stand = T) %>% 
  filter(op == "~1") %>% 
  select(Indicator = lhs, "Unstandardized Intercept" = est, SE = se, Z = z, 'p-value' = pvalue, "Standardized Intercept" = std.all) %>% 
  kable(digits = 3, format = "pandoc", caption = "Intercepts")
Intercepts
Indicator Unstandardized Intercept SE Z p-value Standardized Intercept
WMI 0.000 0.000 NA NA 0.000
g 0.000 0.000 NA NA 0.000
PA 10.124 0.115 87.670 0 4.524
SS 8.275 0.141 58.673 0 3.101
DSC 7.741 0.143 54.191 0 2.874
PC 10.512 0.155 67.788 0 3.434
MR 10.623 0.142 74.690 0 3.911
BD 10.716 0.147 73.095 0 3.717
V 14.542 0.140 103.915 0 5.495
S 14.245 0.162 88.195 0 4.674
I 11.772 0.153 76.793 0 4.076
C 12.576 0.130 96.527 0 4.862
A 9.744 0.149 65.250 0 3.189
DS 9.493 0.128 74.101 0 3.264
LNS 8.974 0.122 73.695 0 3.256
POI 0.000 0.000 NA NA 0.000
VCI 0.000 0.000 NA NA 0.000
WMI 0.000 0.000 NA NA 0.000
g 0.000 0.000 NA NA 0.000
PA 10.124 0.115 87.670 0 3.483
SS 11.230 0.278 40.388 0 4.596
DSC 11.055 0.286 38.656 0 4.367
PC 10.512 0.155 67.788 0 3.959
MR 12.700 0.270 47.119 0 5.492
BD 10.716 0.147 73.095 0 4.345
V 14.542 0.140 103.915 0 6.087
S 10.805 0.373 28.949 0 7.830
I 15.572 0.386 40.356 0 5.846
C 12.576 0.130 96.527 0 5.584
A 9.744 0.149 65.250 0 3.619
DS 9.493 0.128 74.101 0 3.997
LNS 8.974 0.122 73.695 0 3.942
POI -1.910 0.423 -4.516 0 -1.477
VCI -2.835 0.213 -13.330 0 -2.831

In the initial higher-order model, the means for POI, VCI, WMI, and g are -0.258, -2.242, 1.127, and -1.328, compared to -1.477 and -2.831 for POI and VCI in the contra model. Though it goes without stating, the mean differences were redistributed to other factors in proportion to their loadings.

Bifactor Invariance

As mentioned above, Frisby & Beaujean (2015) argued that the bifactor model was more powerful for testing Spearman’s hypothesis because the factors are all orthogonal by design and thus unconfounded with one another and unaffected by proportionality constraints (which are a class of tetrad constraint violation). Just as well, bias is always with respect to a given model and the bifactor model has no differences from the measurement model in terms of MI testing procedures, unlike the higher-order model. For that reason, this section will assess invariance with respect to a bifactor model in full, and not only for illustrative purposes. The model will begin with the British PCA structure for the group factors.

#Bifactor model

WIIIUKBF.model <- '
POI =~ PA + SS + DSC + PC + MR + BD
VCI =~ V + S + I + C + A
WMI =~ C + DS + LNS + A

g =~ PA + SS + DSC + PC + MR + BD + V + S + I + C + A + DS + LNS'

WIIIUKBF.fit <- cfa(WIIIUKBF.model, sample.cov = WAISIIIW.cov, sample.nobs = 349, std.lv = T, orthogonal = T)
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated ov
## variances are negative
summary(WIIIUKBF.fit, stand = T, fit = T)
## lavaan 0.6-5 ended normally after 48 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         41
##                                                       
##   Number of observations                           349
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                               137.854
##   Degrees of freedom                                50
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              2475.174
##   Degrees of freedom                                78
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.963
##   Tucker-Lewis Index (TLI)                       0.943
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -9877.168
##   Loglikelihood unrestricted model (H1)      -9808.241
##                                                       
##   Akaike (AIC)                               19836.336
##   Bayesian (BIC)                             19994.394
##   Sample-size adjusted Bayesian (BIC)        19864.328
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.071
##   90 Percent confidence interval - lower         0.057
##   90 Percent confidence interval - upper         0.085
##   P-value RMSEA <= 0.05                          0.008
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.047
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   POI =~                                                                
##     PA                0.428    0.143    2.994    0.003    0.428    0.190
##     SS                1.477    0.245    6.025    0.000    1.477    0.565
##     DSC               2.865    0.358    8.001    0.000    2.865    1.071
##     PC                0.087    0.155    0.562    0.574    0.087    0.028
##     MR                0.117    0.139    0.846    0.398    0.117    0.043
##     BD                0.204    0.149    1.375    0.169    0.204    0.071
##   VCI =~                                                                
##     V                 2.012    0.120   16.778    0.000    2.012    0.757
##     S                 2.200    0.136   16.148    0.000    2.200    0.725
##     I                 1.771    0.127   13.903    0.000    1.771    0.652
##     C                 1.595    0.107   14.953    0.000    1.595    0.616
##     A                 0.662    0.149    4.434    0.000    0.662    0.211
##   WMI =~                                                                
##     C                 0.886    0.105    8.419    0.000    0.886    0.342
##     DS                2.430    0.184   13.200    0.000    2.430    0.828
##     LNS               1.472    0.150    9.805    0.000    1.472    0.542
##     A                 0.833    0.156    5.352    0.000    0.833    0.266
##   g =~                                                                  
##     PA                1.997    0.104   19.257    0.000    1.997    0.885
##     SS                1.290    0.169    7.611    0.000    1.290    0.493
##     DSC               0.900    0.210    4.285    0.000    0.900    0.336
##     PC                2.116    0.154   13.715    0.000    2.116    0.677
##     MR                2.081    0.128   16.296    0.000    2.081    0.769
##     BD                2.093    0.138   15.154    0.000    2.093    0.730
##     V                 0.967    0.145    6.659    0.000    0.967    0.364
##     S                 1.227    0.164    7.464    0.000    1.227    0.404
##     I                 1.064    0.147    7.214    0.000    1.064    0.392
##     C                 1.230    0.138    8.930    0.000    1.230    0.475
##     A                 1.678    0.164   10.238    0.000    1.678    0.535
##     DS                1.042    0.161    6.480    0.000    1.042    0.355
##     LNS               1.203    0.146    8.254    0.000    1.203    0.443
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   POI ~~                                                                
##     VCI               0.000                               0.000    0.000
##     WMI               0.000                               0.000    0.000
##     g                 0.000                               0.000    0.000
##   VCI ~~                                                                
##     WMI               0.000                               0.000    0.000
##     g                 0.000                               0.000    0.000
##   WMI ~~                                                                
##     g                 0.000                               0.000    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .PA                0.920    0.136    6.772    0.000    0.920    0.181
##    .SS                2.999    0.512    5.855    0.000    2.999    0.438
##    .DSC              -1.858    2.027   -0.916    0.359   -1.858   -0.259
##    .PC                5.283    0.447   11.816    0.000    5.283    0.541
##    .MR                2.977    0.278   10.722    0.000    2.977    0.407
##    .BD                3.790    0.335   11.331    0.000    3.790    0.461
##    .V                 2.075    0.241    8.625    0.000    2.075    0.294
##    .S                 2.870    0.310    9.250    0.000    2.870    0.311
##    .I                 3.109    0.285   10.889    0.000    3.109    0.421
##    .C                 1.857    0.205    9.036    0.000    1.857    0.277
##    .A                 5.887    0.470   12.516    0.000    5.887    0.599
##    .DS                1.627    0.700    2.325    0.020    1.627    0.189
##    .LNS               3.763    0.384    9.804    0.000    3.763    0.510
##     POI               1.000                               1.000    1.000
##     VCI               1.000                               1.000    1.000
##     WMI               1.000                               1.000    1.000
##     g                 1.000                               1.000    1.000

The PC, MR, and BD loadings on POI became insignificant and the variance for DSC became slightly negative. This can be alleviated by making PC, MR, and BD load solely on g and the DSC variance can be set to the mean of the remaining indicators for POI (\(\frac{0.92+2.999}{2}\) = 1.96) It may be more optimal to set it differently (i.e., 0.50 for no decrement in fit), but this is consistent, making it appropriate.

#Bifactor model

WIIIUKBFR.model <- '
POI =~ PA + SS + DSC 
VCI =~ V + S + I + C + A
WMI =~ C + DS + LNS + A

DSC ~~ 1.96*DSC

g =~ PA + SS + DSC + PC + MR + BD + V + S + I + C + A + DS + LNS'

WIIIUKBFR.fit <- cfa(WIIIUKBFR.model, sample.cov = WAISIIIW.cov, sample.nobs = 349, std.lv = T, orthogonal = T)
summary(WIIIUKBFR.fit, stand = T, fit = T)
## lavaan 0.6-5 ended normally after 33 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         37
##                                                       
##   Number of observations                           349
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                               152.249
##   Degrees of freedom                                54
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              2475.174
##   Degrees of freedom                                78
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.959
##   Tucker-Lewis Index (TLI)                       0.941
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -9884.365
##   Loglikelihood unrestricted model (H1)      -9808.241
##                                                       
##   Akaike (AIC)                               19842.731
##   Bayesian (BIC)                             19985.369
##   Sample-size adjusted Bayesian (BIC)        19867.992
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.072
##   90 Percent confidence interval - lower         0.059
##   90 Percent confidence interval - upper         0.086
##   P-value RMSEA <= 0.05                          0.004
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.049
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   POI =~                                                                
##     PA                0.312    0.087    3.587    0.000    0.312    0.138
##     SS                1.925    0.118   16.293    0.000    1.925    0.736
##     DSC               2.007    0.119   16.921    0.000    2.007    0.745
##   VCI =~                                                                
##     V                 2.023    0.120   16.855    0.000    2.023    0.762
##     S                 2.207    0.136   16.199    0.000    2.207    0.727
##     I                 1.783    0.127   13.986    0.000    1.783    0.656
##     C                 1.620    0.107   15.133    0.000    1.620    0.626
##     A                 0.676    0.149    4.539    0.000    0.676    0.216
##   WMI =~                                                                
##     C                 0.902    0.106    8.527    0.000    0.902    0.349
##     DS                2.419    0.183   13.211    0.000    2.419    0.824
##     LNS               1.465    0.149    9.806    0.000    1.465    0.539
##     A                 0.832    0.156    5.341    0.000    0.832    0.265
##   g =~                                                                  
##     PA                2.012    0.100   20.175    0.000    2.012    0.892
##     SS                1.334    0.143    9.312    0.000    1.334    0.510
##     DSC               1.123    0.151    7.426    0.000    1.123    0.417
##     PC                2.116    0.155   13.685    0.000    2.116    0.677
##     MR                2.093    0.128   16.388    0.000    2.093    0.774
##     BD                2.104    0.138   15.243    0.000    2.104    0.734
##     V                 0.940    0.146    6.444    0.000    0.940    0.354
##     S                 1.215    0.165    7.365    0.000    1.215    0.400
##     I                 1.046    0.148    7.058    0.000    1.046    0.385
##     C                 1.188    0.139    8.570    0.000    1.188    0.459
##     A                 1.679    0.164   10.231    0.000    1.679    0.536
##     DS                1.051    0.161    6.523    0.000    1.051    0.358
##     LNS               1.221    0.146    8.376    0.000    1.221    0.450
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   POI ~~                                                                
##     VCI               0.000                               0.000    0.000
##     WMI               0.000                               0.000    0.000
##     g                 0.000                               0.000    0.000
##   VCI ~~                                                                
##     WMI               0.000                               0.000    0.000
##     g                 0.000                               0.000    0.000
##   WMI ~~                                                                
##     g                 0.000                               0.000    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .DSC               1.960                               1.960    0.270
##    .PA                0.946    0.137    6.907    0.000    0.946    0.186
##    .SS                1.361    0.234    5.819    0.000    1.361    0.199
##    .V                 2.077    0.240    8.646    0.000    2.077    0.294
##    .S                 2.869    0.309    9.287    0.000    2.869    0.311
##    .I                 3.106    0.285   10.891    0.000    3.106    0.421
##    .C                 1.845    0.207    8.918    0.000    1.845    0.276
##    .A                 5.864    0.469   12.495    0.000    5.864    0.596
##    .DS                1.663    0.689    2.413    0.016    1.663    0.193
##    .LNS               3.739    0.380    9.837    0.000    3.739    0.507
##    .PC                5.290    0.450   11.761    0.000    5.290    0.542
##    .MR                2.942    0.279   10.547    0.000    2.942    0.402
##    .BD                3.788    0.339   11.168    0.000    3.788    0.461
##     POI               1.000                               1.000    1.000
##     VCI               1.000                               1.000    1.000
##     WMI               1.000                               1.000    1.000
##     g                 1.000                               1.000    1.000
#Plot the model - regular hierarchical and circular

#semPaths(WIIIUKBFR.fit, "model", "std", title = F, residuals = F, groups = "BRLAT", pastel = T, mar = c(2, 1, 3, 1))

semPaths(WIIIUKBFR.fit, "std", title = F, residuals = F, mar = c(2, 1, 3, 1), posCol = c("skyblue4", "red"), sizeMan = 7, edge.label.cex = 0.8, layout = "tree2", bifactor = "g")

#semPaths(WIIIUKBFR.fit, "model", "std", title = F, residuals = F, groups = "BRLAT", pastel = T, mar = c(2, 1, 3, 1), layout = "circle2", bifactor = "g")

semPaths(WIIIUKBFR.fit, "std", title = F, residuals = F, mar = c(2, 1, 3, 1), layout = "circle2", bifactor = "g", posCol = c("skyblue4", "red"), sizeMan = 7, edge.label.cex = 1.2)

#Configural model

WIIIBWBFC.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T)

parameterEstimates(WIIIBWBFC.fit, stand = T)
fitMeasures(WIIIBWBFC.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        207.847        108.000        100.000          0.964          0.064 
## rmsea.ci.lower rmsea.ci.upper 
##          0.051          0.077
#Metric model

WIIIBWBFM.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = "loadings")

parameterEstimates(WIIIBWBFM.fit, stand = T)
fitMeasures(WIIIBWBFM.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        284.902        129.000         79.000          0.943          0.073 
## rmsea.ci.lower rmsea.ci.upper 
##          0.061          0.084
pchisq((fitMeasures(WIIIBWBFM.fit, "chisq") - fitMeasures(WIIIBWBFC.fit, "chisq")), (fitMeasures(WIIIBWBFM.fit, "df") - fitMeasures(WIIIBWBFC.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWBFM.fit, "cfi") - fitMeasures(WIIIBWBFC.fit, "cfi")
##   cfi 
## -0.02
fitMeasures(WIIIBWBFM.fit, "rmsea") - fitMeasures(WIIIBWBFC.fit, "rmsea")
## rmsea 
## 0.009
#Partial metric model

WIIIBWBFMP.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = "loadings", group.partial = "VCI=~S")

parameterEstimates(WIIIBWBFMP.fit, stand = T)
fitMeasures(WIIIBWBFMP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        248.108        128.000         80.000          0.956          0.064 
## rmsea.ci.lower rmsea.ci.upper 
##          0.052          0.076
pchisq((fitMeasures(WIIIBWBFMP.fit, "chisq") - fitMeasures(WIIIBWBFC.fit, "chisq")), (fitMeasures(WIIIBWBFMP.fit, "df") - fitMeasures(WIIIBWBFC.fit, "df")), lower.tail = F)
## chisq 
## 0.005
fitMeasures(WIIIBWBFMP.fit, "cfi") - fitMeasures(WIIIBWBFC.fit, "cfi")
##    cfi 
## -0.007
fitMeasures(WIIIBWBFMP.fit, "rmsea") - fitMeasures(WIIIBWBFC.fit, "rmsea")
## rmsea 
##     0
#Scalar model

WIIIBWBFS.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts"), group.partial = "VCI=~S")

parameterEstimates(WIIIBWBFS.fit, stand = T)
fitMeasures(WIIIBWBFS.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        459.853        137.000         71.000          0.883          0.102 
## rmsea.ci.lower rmsea.ci.upper 
##          0.092          0.112
pchisq((fitMeasures(WIIIBWBFS.fit, "chisq") - fitMeasures(WIIIBWBFMP.fit, "chisq")), (fitMeasures(WIIIBWBFS.fit, "df") - fitMeasures(WIIIBWBFMP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWBFS.fit, "cfi") - fitMeasures(WIIIBWBFMP.fit, "cfi")
##    cfi 
## -0.074
fitMeasures(WIIIBWBFS.fit, "rmsea") - fitMeasures(WIIIBWBFMP.fit, "rmsea")
## rmsea 
## 0.038
#Partial scalar model

WIIIBWBFSP.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1"))

parameterEstimates(WIIIBWBFSP.fit, stand = T)
fitMeasures(WIIIBWBFSP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        266.707        133.000         75.000          0.951          0.066 
## rmsea.ci.lower rmsea.ci.upper 
##          0.055          0.078
pchisq((fitMeasures(WIIIBWBFSP.fit, "chisq") - fitMeasures(WIIIBWBFMP.fit, "chisq")), (fitMeasures(WIIIBWBFSP.fit, "df") - fitMeasures(WIIIBWBFMP.fit, "df")), lower.tail = F)
## chisq 
## 0.002
fitMeasures(WIIIBWBFSP.fit, "cfi") - fitMeasures(WIIIBWBFMP.fit, "cfi")
##    cfi 
## -0.005
fitMeasures(WIIIBWBFSP.fit, "rmsea") - fitMeasures(WIIIBWBFMP.fit, "rmsea")
## rmsea 
## 0.002
#Strict model

WIIIBWBFR.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1"))

parameterEstimates(WIIIBWBFR.fit, stand = T)
fitMeasures(WIIIBWBFR.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        469.775        145.000         63.000          0.882          0.099 
## rmsea.ci.lower rmsea.ci.upper 
##          0.089          0.109
pchisq((fitMeasures(WIIIBWBFR.fit, "chisq") - fitMeasures(WIIIBWBFSP.fit, "chisq")), (fitMeasures(WIIIBWBFR.fit, "df") - fitMeasures(WIIIBWBFSP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWBFR.fit, "cfi") - fitMeasures(WIIIBWBFSP.fit, "cfi")
##    cfi 
## -0.069
fitMeasures(WIIIBWBFR.fit, "rmsea") - fitMeasures(WIIIBWBFSP.fit, "rmsea")
## rmsea 
## 0.033
#Partial strict model

WIIIBWBFRP.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1", "PA~~PA", "S~~S", "SS~~SS", "PC~~PC"))

parameterEstimates(WIIIBWBFRP.fit, stand = T)
fitMeasures(WIIIBWBFRP.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        289.779        141.000         67.000          0.946          0.068 
## rmsea.ci.lower rmsea.ci.upper 
##          0.057          0.079
pchisq((fitMeasures(WIIIBWBFRP.fit, "chisq") - fitMeasures(WIIIBWBFSP.fit, "chisq")), (fitMeasures(WIIIBWBFRP.fit, "df") - fitMeasures(WIIIBWBFSP.fit, "df")), lower.tail = F)
## chisq 
## 0.003
fitMeasures(WIIIBWBFRP.fit, "cfi") - fitMeasures(WIIIBWBFSP.fit, "cfi")
##    cfi 
## -0.005
fitMeasures(WIIIBWBFRP.fit, "rmsea") - fitMeasures(WIIIBWBFSP.fit, "rmsea")
## rmsea 
## 0.002

A partial model with \(\Delta\)CFI = 0.01 is possible without the freed PC residual variance, but I opted to ensure \(\Delta\)CFI was <0.01, even though it had no substantive effect.

#Homogeneous latent variances

WIIIBWBFL.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1", "PA~~PA", "S~~S", "SS~~SS", "PC~~PC"))

parameterEstimates(WIIIBWBFL.fit, stand = T)
fitMeasures(WIIIBWBFL.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        289.779        141.000         67.000          0.946          0.068 
## rmsea.ci.lower rmsea.ci.upper 
##          0.057          0.079
pchisq((fitMeasures(WIIIBWBFL.fit, "chisq") - fitMeasures(WIIIBWBFRP.fit, "chisq")), (fitMeasures(WIIIBWBFL.fit, "df") - fitMeasures(WIIIBWBFRP.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWBFL.fit, "cfi") - fitMeasures(WIIIBWBFRP.fit, "cfi")
## cfi 
##   0
fitMeasures(WIIIBWBFL.fit, "rmsea") - fitMeasures(WIIIBWBFRP.fit, "rmsea")
## rmsea 
##     0
#Homogeneous latent means

WIIIBWBFMO.fit <- cfa(WIIIUKBFR.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals", "means"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1", "PA~~PA", "S~~S", "SS~~SS", "PC~~PC"))

parameterEstimates(WIIIBWBFMO.fit, stand = T)
fitMeasures(WIIIBWBFMO.fit, FITM)
##          chisq             df           npar            cfi          rmsea 
##        683.991        145.000         63.000          0.804          0.128 
## rmsea.ci.lower rmsea.ci.upper 
##          0.118          0.137
pchisq((fitMeasures(WIIIBWBFMO.fit, "chisq") - fitMeasures(WIIIBWBFL.fit, "chisq")), (fitMeasures(WIIIBWBFMO.fit, "df") - fitMeasures(WIIIBWBFL.fit, "df")), lower.tail = F)
## chisq 
##     0
fitMeasures(WIIIBWBFMO.fit, "cfi") - fitMeasures(WIIIBWBFL.fit, "cfi")
##    cfi 
## -0.142
fitMeasures(WIIIBWBFMO.fit, "rmsea") - fitMeasures(WIIIBWBFL.fit, "rmsea")
## rmsea 
##  0.06

Bifactor Spearman’s Hypothesis

#Strong, weak-POI, weak-VCI, and weak-WMI

WIIIBWBFS.fit <- cfa(WIIIUKRBFS.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1", "PA~~PA", "S~~S", "SS~~SS", "PC~~PC"))

WIIIBWBFWP.fit <- cfa(WIIIUKRBFWP.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1", "PA~~PA", "S~~S", "SS~~SS", "PC~~PC"))

WIIIBWBFWV.fit <- cfa(WIIIUKRBFWV.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1", "PA~~PA", "S~~S", "SS~~SS", "PC~~PC"))

WIIIBWBFWW.fit <- cfa(WIIIUKRBFWW.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1", "PA~~PA", "S~~S", "SS~~SS", "PC~~PC"))

print(list(fitMeasures(WIIIBWBFS.fit, FITM), fitMeasures(WIIIBWBFWP.fit, FITM), fitMeasures(WIIIBWBFWV.fit, FITM), fitMeasures(WIIIBWBFWW.fit, FITM)))
## [[1]]
##          chisq             df           npar            cfi          rmsea 
##        608.677        144.000         64.000          0.831          0.119 
## rmsea.ci.lower rmsea.ci.upper 
##          0.109          0.129 
## 
## [[2]]
##          chisq             df           npar            cfi          rmsea 
##        390.926        142.000         66.000          0.909          0.088 
## rmsea.ci.lower rmsea.ci.upper 
##          0.077          0.098 
## 
## [[3]]
##          chisq             df           npar            cfi          rmsea 
##        457.965        142.000         66.000          0.885          0.099 
## rmsea.ci.lower rmsea.ci.upper 
##          0.089          0.109 
## 
## [[4]]
##          chisq             df           npar            cfi          rmsea 
##        345.416        142.000         66.000          0.926          0.079 
## rmsea.ci.lower rmsea.ci.upper 
##          0.069          0.090

No mean could be constrained in terms of both \(\Delta\)CFI and RMSEA, but WMI could be constrained in terms of RMSEA.

WIIIUKRBFCW.model <- '
POI =~ PA + SS + DSC 
VCI =~ V + S + I + C + A
WMI =~ C + DS + LNS + A

DSC ~~ 1.96*DSC

g =~ PA + SS + DSC + PC + MR + BD + V + S + I + C + A + DS + LNS
WMI ~ c(0,0)*1
g ~ c(0,0)*1'
#Contra-WMI

WIIIBWBFCW.fit <- cfa(WIIIUKRBFCW.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("VCI=~S", "I~1", "MR~1", "S~1", "DS~1", "PA~~PA", "S~~S", "SS~~SS", "PC~~PC"))

print(list(fitMeasures(WIIIBWBFWW.fit, FITM), fitMeasures(WIIIBWBFCW.fit, FITM)))
## [[1]]
##          chisq             df           npar            cfi          rmsea 
##        345.416        142.000         66.000          0.926          0.079 
## rmsea.ci.lower rmsea.ci.upper 
##          0.069          0.090 
## 
## [[2]]
##          chisq             df           npar            cfi          rmsea 
##        415.873        143.000         65.000          0.901          0.091 
## rmsea.ci.lower rmsea.ci.upper 
##          0.081          0.102

The contra model did not fit better in the bifactor case. This model only beats the higher-order Spearman’s hypothesis models when pitted against them on equal terms, which is to say, without the fit-damaging variance constraint on DSC and dropping indicators from POI. In that model, VCI was constrained and the weak model weas clearly better, but a weak model constraining WMI Was also viable and in that case, fit marginally better than the contra model. Moreoevr, only the similarities loading, block design and picture completion intercepts, and picture arrangement, similarities, and arithmetic residuals were biased, in consistent directions and magnitudes. The strong Spearman’s hypothesis model was not tenable in whatever case.

#Mean comparisons between the unrestricted and the weak-WMI model

parameterEstimates(WIIIBWBFL.fit, stand = T) %>% 
  filter(op == "~1") %>% 
  select(Indicator = lhs, "Unstandardized Intercept" = est, SE = se, Z = z, 'p-value' = pvalue, "Standardized Intercept" = std.all) %>% 
  kable(digits = 3, format = "pandoc", caption = "Intercepts")
Intercepts
Indicator Unstandardized Intercept SE Z p-value Standardized Intercept
PA 10.079 0.120 83.753 0 4.474
SS 8.249 0.139 59.454 0 3.170
DSC 7.721 0.145 53.331 0 2.839
V 14.540 0.138 105.658 0 5.639
S 14.220 0.159 89.170 0 4.773
I 11.750 0.153 76.719 0 4.107
C 12.494 0.138 90.649 0 4.825
A 9.681 0.160 60.683 0 3.185
DS 9.500 0.158 60.305 0 3.228
LNS 8.808 0.144 61.003 0 3.231
PC 10.492 0.162 64.873 0 3.328
MR 10.590 0.146 72.425 0 3.877
BD 10.719 0.150 71.676 0 3.732
POI 0.000 0.000 NA NA 0.000
VCI 0.000 0.000 NA NA 0.000
WMI 0.000 0.000 NA NA 0.000
g 0.000 0.000 NA NA 0.000
PA 10.079 0.120 83.753 0 3.490
SS 8.249 0.139 59.454 0 2.982
DSC 7.721 0.145 53.331 0 3.243
V 14.540 0.138 105.658 0 6.133
S 11.204 0.360 31.096 0 7.713
I 15.722 0.394 39.874 0 5.937
C 12.494 0.138 90.649 0 5.502
A 9.681 0.160 60.683 0 3.589
DS 7.072 0.578 12.239 0 3.168
LNS 8.808 0.144 61.003 0 3.772
PC 10.492 0.162 64.873 0 4.814
MR 12.883 0.274 47.026 0 5.698
BD 10.719 0.150 71.676 0 4.443
POI 1.577 0.167 9.439 0 1.817
VCI -2.320 0.185 -12.547 0 -2.521
WMI 1.451 0.232 6.245 0 2.151
g -1.069 0.117 -9.140 0 -1.598
parameterEstimates(WIIIBWBFWW.fit, stand = T) %>% 
  filter(op == "~1") %>% 
  select(Indicator = lhs, "Unstandardized Intercept" = est, SE = se, Z = z, 'p-value' = pvalue, "Standardized Intercept" = std.all) %>% 
  kable(digits = 3, format = "pandoc", caption = "Intercepts")
Intercepts
Indicator Unstandardized Intercept SE Z p-value Standardized Intercept
WMI 0.000 0.000 NA NA 0.000
PA 10.072 0.121 83.511 0 4.461
SS 8.247 0.139 59.522 0 3.172
DSC 7.725 0.145 53.352 0 2.839
V 14.513 0.141 103.225 0 5.508
S 14.220 0.160 89.126 0 4.771
I 11.750 0.154 76.502 0 4.095
C 12.664 0.130 97.661 0 5.098
A 9.888 0.151 65.321 0 3.343
DS 9.783 0.148 65.997 0 3.410
LNS 9.213 0.130 71.105 0 3.431
PC 10.414 0.163 63.698 0 3.260
MR 10.590 0.146 72.449 0 3.878
BD 10.667 0.151 70.646 0 3.672
POI 0.000 0.000 NA NA 0.000
VCI 0.000 0.000 NA NA 0.000
g 0.000 0.000 NA NA 0.000
WMI 0.000 0.000 NA NA 0.000
PA 10.072 0.121 83.511 0 3.482
SS 8.247 0.139 59.522 0 2.976
DSC 7.725 0.145 53.352 0 3.232
V 14.513 0.141 103.225 0 6.119
S 11.019 0.351 31.383 0 7.590
I 15.367 0.374 41.096 0 5.857
C 12.664 0.130 97.661 0 5.759
A 9.888 0.151 65.321 0 3.684
DS 9.579 0.241 39.716 0 4.054
LNS 9.213 0.130 71.105 0 3.893
PC 10.414 0.163 63.698 0 4.714
MR 12.618 0.263 47.962 0 5.572
BD 10.667 0.151 70.646 0 4.389
POI 1.495 0.161 9.292 0 1.714
VCI -2.201 0.173 -12.731 0 -2.463
g -0.943 0.112 -8.408 0 -1.407

The biased parameters were mostly but not entirely consistent between the bifactor and measurement models. The indicator with metric bias was the same in both models and the magnitude of the metric bias in favor of the British group was similar (MM: 0.131; BF: 0.204). The biased intercepts in the measurement model were in symbol search, digit-symbol coding, matrix reasoning, similarities, and information whereas in the bifactor case these were similarities, information, digit span, and matrix reasoning. As shown above, the bias in the measurement model was in favor of the South African sample with a magnitude between medium in large by Cohen’s effect size criteria. In this model, the intercept differences between the British and South African sample for the subtests in the same order were 2.940, 1.830, -0.060, and 1.821, which correspond to one large and two moderate effect sizes in favor of the South African sample, and one negligible effect in favor of the British sample (unstandardized effects are similar to the measurement model as well: information and matrix reasoning are biased against the British group whereas similarities and digit span are biased against the South African group, the latter only slightly in the weak model, and more in the latent variances model).

Model Interpretability Indices

These indices are taken from the British group in the bifactor homogeneous latent variances model - which fit better than the higher-order homogeneous latent variances model - with the freed loading’s result for the South African group given in quotations. For calculating PUC, cross-loaded items were counted as half items, adding up to a full item.

Factor H PUC \(\omega_t\) \(\omega_h\) ECV Total Variance
g 0.91 0.78 0.93 0.72 0.49 0.31
POI 0.70 - 0.90 0.39 0.13 0.08
VCI 0.82 (0.79) - 0.89 (0.86) 0.71 (0.66) 0.26 (0.23) 0.16 (0.14)
WMI 0.70 - 0.78 0.48 0.13 0.08

Unusually for this sort of assessment, it appears that the group factors were actually there, for the most part. This battery was far more multidimensional than is usually the case, which is a pleasant surprise. In fact, all of the group factors had acceptable (though in two cases only minimally) values of H, acceptable and nearly-acceptable values of \(\omega_h\), decent ECV (which was somewhat too low in the case of g), and surprisingly large total variance explained by group factors compared to what’s typical.

To-Do

  • Interpretability indices for group factors
    • Table of common guidelines for interpretation with sources
  • Proportion of differences due to g, original d, and bias-corrected d (for bifactor too), and h^2 in same table
    • Assessment of whether measurement model or bifactor model explains more
  • Network model
    • Mutualism model
  • Spearman’s hypothesis (in all applicable models, including the network/mutualism model with loading conversion and estimated variance sources)
  • Other studies (see Dolan, Roorda & Wicherts, 2004, among others)
  • Written out formulas for bifactor dimensionality/interpretability indices
    • Scripts for computing from lavaan object
  • Redo all Carroll models on RPubs, including those covered by Benson et al. (2018), with HOF, BF, CHC-HOF, CHC-BF, and network/mutualism models, with appropriate interpretability indices

References

Gunn, H. J., Grimm, K. J., & Edwards, M. C. (2019). Evaluation of Six Effect Size Measures of Measurement Non-Invariance for Continuous Outcomes. Structural Equation Modeling: A Multidisciplinary Journal, 0(0), 1-12. https://doi.org/10.1080/10705511.2019.1689507

Cockcroft, K., Alloway, T., Copello, E., & Milligan, R. (2015). A cross-cultural comparison between South African and British students on the Wechsler Adult Intelligence Scales Third Edition (WAIS-III). Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.00297

Claassen, N. C. W., Krynauw, A. H., Paterson, H., and Wa ga Mathe, M. (2001). A Standardisation of the WAIS-III for English-Speaking South Africans. Pretoria: Human Sciences Research Council.

Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525-543. https://doi.org/10.1007/BF02294825

Lubke, G. H., Dolan, C. V., Kelderman, H., & Mellenbergh, G. J. (2003). On the relationship between sources of within- and between-group differences and measurement invariance in the common factor model. Intelligence, 31(6), 543-566. https://doi.org/10.1016/S0160-2896(03)00051-5

Lubke, G. H., Dolan, C. V., Kelderman, H., & Mellenbergh, G. J. (2003). Weak measurement invariance with respect to unmeasured variables: An implication of strict factorial invariance. The British Journal of Mathematical and Statistical Psychology, 56(Pt 2), 231-248. https://doi.org/10.1348/000711003770480020

Meredith, W., & Teresi, J. A. (2006). An essay on measurement and factorial invariance. Medical Care, 44(11 Suppl 3), S69-77. https://doi.org/10.1097/01.mlr.0000245438.73837.89

van de Schoot, R., Lugtig, P., & Hox, J. (2012). A checklist for testing measurement invariance. European Journal of Developmental Psychology, 9(4), 486-492. https://doi.org/10.1080/17405629.2012.686740

Dolan, C. V. (2000). Investigating Spearman’s Hypothesis by Means of Multi-Group Confirmatory Factor Analysis. Multivariate Behavioral Research, 35(1), 21-50. https://doi.org/10.1207/S15327906MBR3501_2

Mellenbergh, G. J. (1989). Item bias and item response theory. International Journal of Educational Research, 13(2), 127-143. https://doi.org/10.1016/0883-0355(89)90002-5

Jensen, A. R. (1998). The g Factor: The Science of Mental Ability. Praeger Publishers/Greenwood Publishing Group.

Frisby, C. L., & Beaujean, A. A. (2015). Testing Spearman’s hypotheses using a bi-factor model with WAIS-IV/WMS-IV standardization data. Intelligence, 51, 79-97. https://doi.org/10.1016/j.intell.2015.04.007

Steenkamp, J. E. M., & Baumgartner, H. (1998). Assessing Measurement Invariance in Cross-National Consumer Research. Journal of Consumer Research, 25(1), 78-107. https://doi.org/10.1086/209528

Roznowski, M., & Reith, J. (1999). Examining the Measurement Quality of Tests Containing Differentially Functioning Items: Do Biased Items Result in Poor Measurement? Educational and Psychological Measurement, 59(2), 248-269. https://doi.org/10.1177/00131649921969839

Brody, N. (1992). Intelligence (2nd ed). Academic Press.

Asil, M., & Brown, G. T. L. (2016). Comparing OECD PISA Reading in English to Other Languages: Identifying Potential Sources of Non-Invariance. International Journal of Testing, 16(1), 71-93. https://doi.org/10.1080/15305058.2015.1064431

Millsap, R. E., & Kwok, O.-M. (2004). Evaluating the impact of partial factorial invariance on selection in two populations. Psychological Methods, 9(1), 93-115. https://doi.org/10.1037/1082-989X.9.1.93

Culpepper, S. A., Aguinis, H., Kern, J. L., & Millsap, R. (2019). High-Stakes Testing Case Study: A Latent Variable Approach for Assessing Measurement and Prediction Invariance. Psychometrika, 84(1), 285-309. https://doi.org/10.1007/s11336-018-9649-2

Wagenmakers, E.-J., & Farrell, S. (2004). AIC model selection using Akaike weights. Psychonomic Bulletin & Review, 11(1), 192-196. https://doi.org/10.3758/BF03206482

Dolan, C. V., Roorda, W., & Wicherts, J. M. (2004). Two failures of Spearman’s hypothesis: The GATB in Holland and the JAT in South Africa. Intelligence, 32(2), 155-173. https://doi.org/10.1016/j.intell.2003.09.001

Carroll, J. (1993). Human Cognitive Abilities: A Survey of Factor-Analytic Studies. Cambridge University Press.

Hood, S. B. (2010). Latent Variable Realism in Psychometrics. https://scholarworks.iu.edu/dspace/handle/2022/8120

Jensen, A. R. (1987). The “g” beyond factor analysis. In The influence of cognitive psychology on testing (pp. 87-142). Lawrence Erlbaum Associates, Inc.

Hancock, G. R., & Mueller, R. O. (Eds.). (2001). Rethinking construct reliability within latent variable systems. In Structural equation modeling: Present and future, a festschrift in honor of Karl Joereskog (pp. 195-216). Scientific Software International.

Reise, S. P., Scheines, R., Widaman, K. F., & Haviland, M. G. (2012). Multidimensionality and Structural Coefficient Bias in Structural Equation Modeling: A Bifactor Perspective. Educational and Psychological Measurement. https://doi.org/10.1177/0013164412449831

Rodriguez, A., Reise, S. P., & Haviland, M. G. (2015). Applying Bifactor Statistical Indices in the Evaluation of Psychological Measures. Journal of Personality Assessment. https://www.tandfonline.com/doi/abs/10.1080/00223891.2015.1089249

Rodriguez, A., Reise, S. P., & Haviland, M. G. (2016). Evaluating bifactor models: Calculating and interpreting statistical indices. Psychological Methods, 21(2), 137-150. https://doi.org/10.1037/met0000045

Mansolf, M., & Reise, S. P. (2016). Exploratory Bifactor Analysis: The Schmid-Leiman Orthogonalization and Jennrich-Bentler Analytic Rotations. Multivariate Behavioral Research, 51(5), 698-717. https://doi.org/10.1080/00273171.2016.1215898

Mansolf, M., & Reise, S. P. (2017). When and why the second-order and bifactor models are distinguishable. Intelligence, 61, 120-129. https://doi.org/10.1016/j.intell.2017.01.012

Van Der Maas, H. L. J., Dolan, C. V., Grasman, R. P. P. P., Wicherts, J. M., Huizenga, H. M., & Raijmakers, M. E. J. (2006). A dynamical model of general intelligence: The positive manifold of intelligence by mutualism. Psychological Review, 113(4), 842-861. https://doi.org/10.1037/0033-295X.113.4.842

Kan, K.-J., van der Maas, H. L. J., & Kievit, R. A. (2016). Process Overlap Theory: Strengths, Limitations, and Challenges. Psychological Inquiry, 27(3), 220-228. https://doi.org/10.1080/1047840X.2016.1182000

Franic, S., Dolan, C. V., Borsboom, D., Hudziak, J. J., van Beijsterveldt, C. E. M., & Boomsma, D. I. (2013). Can genetics help psychometrics? Improving dimensionality assessment through genetic factor modeling. Psychological Methods, 18(3), 406-433. https://doi.org/10.1037/a0032755

Dolan, C. V., & Hamaker, E. L. (2001). Investigating Black-White differences in psychometric IQ: Multi-group confirmatory factor analyses of the WISC-R and K-ABC and a critique of the method of correlated vectors. In Advances in psychology research, Vol. 6. (pp. 31-59). Nova Science Publishers.

Benson, N. F., Beaujean, A. A., McGill, R. J., & Dombrowski, S. C. (2018). Revisiting Carroll’s survey of factor-analytic studies: Implications for the clinical assessment of intelligence. Psychological Assessment, 30(8), 1028-1038. https://doi.org/10.1037/pas0000556