Setup

Libraries

library(pacman)
p_load(psych, lavaan, ggplot2)

Data

#Correlation matrices

lowerSpain = '
1                               
0.57    1                           
0.283   0.449   1                       
0.148   0.31    0.497   1                   
0.381   0.457   0.557   0.558   1               
0.158   0.145   0.16    0.136   0.17    1           
-0.032  -0.078  0.067   0.159   0.191   0.157   1       
0.188   0.316   0.265   0.191   0.334   0.217   -0.008  1   
0.171   0.298   0.411   0.421   0.391   0.136   0.174   0.11    1'

lowerLatin = '
1                               
0.586   1                           
0.472   0.515   1                       
0.35    0.408   0.454   1                   
0.416   0.492   0.468   0.497   1               
0.378   0.343   0.332   0.305   0.345   1           
0.288   0.206   0.294   0.358   0.276   0.33    1       
0.257   0.256   0.306   0.161   0.254   0.264   0.231   1   
0.345   0.369   0.387   0.413   0.368   0.28    0.234   0.213   1'

#Variable names

Names = list("SPM", "PISA", "BISPN3", "BISRN3", "BISRN1", "BISMF", "BISRF", "BISPF", "BISPN2")

#Convert to variance-covariance matrices

SPA.cor = getCov(lowerSpain, names = Names)
LAT.cor = getCov(lowerLatin, names = Names)
SPASDs <- c(6.1, 3.8, 2.1, 6.6, 2.9, 4.3, 4.1, 1.6, 1.7)
LATSDs <- c(7.9, 3.8, 1.9, 6.6, 3.2, 4.9, 4.6, 1.7, 1.6)
SPA.cov = lavaan::cor2cov(R = SPA.cor, sds = SPASDs)
LAT.cov = lavaan::cor2cov(R = LAT.cor, sds = LATSDs)

#Means

SPAmeans = c(48.7, 7.6, 3.9, 18.6, 4.5, 13.4, 19.3, 3.9, 4.4)
LATmeans = c(44.9, 6.6, 3.6, 16.6, 4.1, 12.8, 19.2, 3.4, 3.9)

#Group inputs

SLATCovs <- list(SPA.cov, LAT.cov)
SLATMeans <- list(SPAmeans, LATmeans)
SLATNs <- list(145, 1393)

#Fit Measures

FITM <- c("chisq", "df", "nPar", "cfi", "rmsea", "rmsea.ci.lower", "rmsea.ci.upper", "aic", "bic")

Rationale

This is a preliminary analysis of the Study of Latin American Intelligence (SLATINT) data. This dataset contains cognitive test data from seven Latin countries: Argentina, Brazil, Chile, Colombia, Mexico, Peru, and Spain. The tests included Raven’s Standard Progressive Matrices, a short version of the 2003 PISA with a focus on mathematics, and some subtests of the Berlin Intelligence Structural model. This amounted to two figural reasoning tests (SPM and BIS-PF), a figural short-term memory test (BIS-MF), two simple numerical mental speed tests (BIS-PN2, BIS-RN1) and a simple figural mental speed test (BIS-RF), in addition to two numerical reasoning tests (BIS-PN3, BIS-RN3), a figural creativity test (BIS-CF2), and a verbal creativity test (BIS-CV1). Socioeconomic status was assessed with questions about “available resources in the home, age, sex, hometown, birth order, native language, number of dependents, early childhood environment, parents’ education, and job occupation of the main family provider” (Flores-Mendoza et al., 2018) and information about the school the tests took place in was also provided.

I have the full data, but right now I’m just going to preliminarily assess if the aggregate Latin American data can be compared with the Spanish data using the matrices provided in the associated book. I used McNeish & Wolf’s (2020) https://www.dynamicfit.app/cfa/ to compute the cutoff indices based on a model of the data for the Spanish sample with a correlated group factor model and the Spanish sample size.

Analysis

EFA

fa.parallel(SPA.cor, n.obs = 145); fa.parallel(LAT.cor, n.obs = 1393)
## Warning in fa.stats(r = r, f = f, phi = phi, n.obs = n.obs, np.obs = np.obs, :
## The estimated weights for the factor scores are probably incorrect. Try a
## different factor score estimation method.
## Warning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : An
## ultra-Heywood case was detected. Examine the results carefully

## Parallel analysis suggests that the number of factors =  2  and the number of components =  1

## Parallel analysis suggests that the number of factors =  3  and the number of components =  1
SFA <- fa(SPA.cor, n.obs = 145, nfactors = 2)
## Loading required namespace: GPArotation
print(SFA$loadings, cutoff = 0.2)
## 
## Loadings:
##        MR1    MR2   
## SPM            0.653
## PISA           0.845
## BISPN3  0.595       
## BISRN3  0.776       
## BISRN1  0.684       
## BISMF   0.207       
## BISRF   0.390 -0.286
## BISPF          0.266
## BISPN2  0.575       
## 
##                  MR1   MR2
## SS loadings    1.985 1.374
## Proportion Var 0.221 0.153
## Cumulative Var 0.221 0.373

Spanish Model

FA2 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF

F1 ~~ F2'

GFA2 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF

g =~ F1 + F2

F2 ~~ 0.5*F2'

SPA2 <- cfa(FA2, sample.cov = SPA.cov, sample.nobs = 145, std.lv = T)
LAT2 <- cfa(FA2, sample.cov = LAT.cov, sample.nobs = 1393, std.lv = T) #Latin American fit for comparison
fitMeasures(SPA2, FITM); fitMeasures(LAT2, FITM)
##          chisq             df           npar            cfi          rmsea 
##         31.752         19.000         17.000          0.949          0.068 
## rmsea.ci.lower rmsea.ci.upper            aic            bic 
##          0.019          0.108       6024.306       6074.911
##          chisq             df           npar            cfi          rmsea 
##        196.864         19.000         17.000          0.941          0.082 
## rmsea.ci.lower rmsea.ci.upper            aic            bic 
##          0.072          0.093      58732.400      58821.466

There were no dynamic cutoffs available. Perhaps the method is in its early stages and with more development, other fit indices might be able to work. Though perhaps not accurate, I will use the typical Hu & Bentler (1999) cutoffs and plot the model BICs.

Multigroup Model

CON2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T)
MET2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = "loadings")
SCA2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = c("loadings", "intercepts"))
SFI2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = c("loadings", "intercepts", "residuals"))
MEA2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals", "means"), control=list(rel.tol=1e-4)) #Standardizing latent variances does not affect fit
round(cbind(CONFIGURAL = fitMeasures(CON2, FITM),
            METRIC = fitMeasures(MET2, FITM),
            SCALAR = fitMeasures(SCA2, FITM),
            STRICT = fitMeasures(SFI2, FITM),
            MEANS = fitMeasures(MEA2, FITM)),3)
##                CONFIGURAL    METRIC    SCALAR    STRICT     MEANS
## chisq             228.616   264.146   283.215   303.779   330.437
## df                 38.000    44.000    50.000    58.000    60.000
## npar               50.000    44.000    38.000    30.000    28.000
## cfi                 0.941     0.932     0.928     0.925     0.917
## rmsea               0.081     0.081     0.078     0.074     0.077
## rmsea.ci.lower      0.071     0.071     0.069     0.066     0.069
## rmsea.ci.upper      0.091     0.090     0.087     0.083     0.085
## aic             64788.706 64812.237 64819.305 64823.869 64846.527
## bic             65055.618 65047.119 65022.158 64984.016 64995.998
CON2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     Could not compute standard errors! The information matrix could
##     not be inverted. This may be a symptom that the model is not
##     identified.
MET2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = "loadings")
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
SCA2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = c("loadings", "intercepts"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 1.909779e-14) is close to zero. This may be a symptom that the
##     model is not identified.
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
SFI2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = c("loadings", "intercepts", "residuals"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 7.459434e-13) is close to zero. This may be a symptom that the
##     model is not identified.

## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING: some estimated lv variances are negative
MEA2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals", "means"), control=list(rel.tol=1e-4))
round(cbind(CONFIGURAL = fitMeasures(CON2, FITM),
            METRIC = fitMeasures(MET2, FITM),
            SCALAR = fitMeasures(SCA2, FITM),
            STRICT = fitMeasures(SFI2, FITM),
            MEANS = fitMeasures(MEA2, FITM)),3)
##                CONFIGURAL    METRIC    SCALAR    STRICT     MEANS
## chisq             228.616   267.216   286.266   305.579   331.492
## df                 36.000    45.000    50.000    58.000    60.000
## npar               52.000    43.000    38.000    30.000    28.000
## cfi                 0.941     0.932     0.927     0.924     0.917
## rmsea               0.083     0.080     0.078     0.075     0.077
## rmsea.ci.lower      0.073     0.071     0.070     0.066     0.069
## rmsea.ci.upper      0.094     0.090     0.087     0.083     0.085
## aic             64792.706 64813.306 64822.356 64825.670 64847.582
## bic             65070.294 65042.850 65025.209 64985.817 64997.053

The last model does not appear to suffer from multigroup sampling variance issues and the factor means are F1 = -0.119 (p = 0.011), F2 = 0.065 (p = 0.402), g = -0.370 (p = <0.001) with lower values favoring Spaniards.

ggplot(df, aes(x = Model, y = BIC, group = LM, shape = LM, color = LM)) + geom_line() + geom_point() + xlab('Level of Invariance') + ylab('Schwarz Information Criterion') + theme_bw() + theme(text = element_text(size = 12, family = "serif"), legend.position = "none")

Spearman’s Hypothesis?

GFA2S <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF

g =~ F1 + F2

F2 ~~ 0.5*F2

F1 ~ c(0, 0)*1
F2 ~ c(0, 0)*1'

GFA2W1 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF

g =~ F1 + F2

F2 ~~ 0.5*F2

F1 ~ c(0, 0)*1'

GFA2W2 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF

g =~ F1 + F2

F2 ~~ 0.5*F2

F2 ~ c(0, 0)*1'

GFA2C1 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF

g =~ F1 + F2

F2 ~~ 0.5*F2

F1 ~ c(0, 0)*1
g ~ c(0, 0)*1'

GFA2C2 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF

g =~ F1 + F2

F2 ~~ 0.5*F2

F2 ~ c(0, 0)*1
g ~ c(0, 0)*1'

SFIL <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"))
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
STRONG <- cfa(GFA2S, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4))
WEAK1 <- cfa(GFA2W1, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4))
WEAK2 <- cfa(GFA2W2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4))
CONTRA1 <- cfa(GFA2C1, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4))
CONTRA2 <- cfa(GFA2C2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"))
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
round(cbind(SFI = fitMeasures(SFIL, FITM),
            MEANS = fitMeasures(MEA2, FITM),
            STRONG = fitMeasures(STRONG, FITM),
            WEAK1 = fitMeasures(WEAK1, FITM),
            WEAK2 = fitMeasures(WEAK2, FITM),
            CONTRA1 = fitMeasures(CONTRA1, FITM),
            CONTRA2 = fitMeasures(CONTRA2, FITM)),3)
##                      SFI     MEANS    STRONG     WEAK1     WEAK2   CONTRA1
## chisq            304.534   331.492   311.208   306.446   306.574   331.490
## df                57.000    60.000    59.000    58.000    58.000    59.000
## npar              31.000    28.000    29.000    30.000    30.000    29.000
## cfi                0.924     0.917     0.923     0.924     0.924     0.916
## rmsea              0.075     0.077     0.075     0.075     0.075     0.077
## rmsea.ci.lower     0.067     0.069     0.067     0.067     0.067     0.069
## rmsea.ci.upper     0.084     0.085     0.083     0.083     0.083     0.086
## aic            64826.624 64847.582 64829.298 64826.536 64826.664 64849.581
## bic            64992.110 64997.053 64984.107 64986.683 64986.811 65004.389
##                  CONTRA2
## chisq            312.131
## df                59.000
## npar              29.000
## cfi                0.922
## rmsea              0.075
## rmsea.ci.lower     0.067
## rmsea.ci.upper     0.083
## aic            64830.221
## bic            64985.030

It seems as though the first weak model fits best. Note that the homogeneous means, strong, weak, and one of the contra models had trouble converging. This may be a demerit against them that will be investigated later.

Discussion

Preliminarily, it seems Latin American countries - at least at this level of socioeconomic status - can be compared to European countries, or at the very least, Spain, which is comparable to other western European countries. This page will be expanded with analyses of the raw data at the item and parcel level and with countries separated. This quick first-look should provide some hope that valid comparisons are possible.

References

Flores-Mendoza, C., Ardila, R., Rosas, R., Lucio, M. E., Gallegos, M., & Colareta, N. R. (2018). Intelligence Measurement and School Performance in Latin America: A Report of the Study of Latin American Intelligence Project. Springer International Publishing. //www.springer.com/us/book/9783319899749

McNeish, D., & Wolf, M. G. (2020). Dynamic Fit Index Cutoffs for Confirmatory Factor Analysis Models. PsyArXiv. https://doi.org/10.31234/osf.io/v8yru