library(pacman)
p_load(psych, lavaan, ggplot2)
#Correlation matrices
lowerSpain = '
1
0.57 1
0.283 0.449 1
0.148 0.31 0.497 1
0.381 0.457 0.557 0.558 1
0.158 0.145 0.16 0.136 0.17 1
-0.032 -0.078 0.067 0.159 0.191 0.157 1
0.188 0.316 0.265 0.191 0.334 0.217 -0.008 1
0.171 0.298 0.411 0.421 0.391 0.136 0.174 0.11 1'
lowerLatin = '
1
0.586 1
0.472 0.515 1
0.35 0.408 0.454 1
0.416 0.492 0.468 0.497 1
0.378 0.343 0.332 0.305 0.345 1
0.288 0.206 0.294 0.358 0.276 0.33 1
0.257 0.256 0.306 0.161 0.254 0.264 0.231 1
0.345 0.369 0.387 0.413 0.368 0.28 0.234 0.213 1'
#Variable names
Names = list("SPM", "PISA", "BISPN3", "BISRN3", "BISRN1", "BISMF", "BISRF", "BISPF", "BISPN2")
#Convert to variance-covariance matrices
SPA.cor = getCov(lowerSpain, names = Names)
LAT.cor = getCov(lowerLatin, names = Names)
SPASDs <- c(6.1, 3.8, 2.1, 6.6, 2.9, 4.3, 4.1, 1.6, 1.7)
LATSDs <- c(7.9, 3.8, 1.9, 6.6, 3.2, 4.9, 4.6, 1.7, 1.6)
SPA.cov = lavaan::cor2cov(R = SPA.cor, sds = SPASDs)
LAT.cov = lavaan::cor2cov(R = LAT.cor, sds = LATSDs)
#Means
SPAmeans = c(48.7, 7.6, 3.9, 18.6, 4.5, 13.4, 19.3, 3.9, 4.4)
LATmeans = c(44.9, 6.6, 3.6, 16.6, 4.1, 12.8, 19.2, 3.4, 3.9)
#Group inputs
SLATCovs <- list(SPA.cov, LAT.cov)
SLATMeans <- list(SPAmeans, LATmeans)
SLATNs <- list(145, 1393)
#Fit Measures
FITM <- c("chisq", "df", "nPar", "cfi", "rmsea", "rmsea.ci.lower", "rmsea.ci.upper", "aic", "bic")
This is a preliminary analysis of the Study of Latin American Intelligence (SLATINT) data. This dataset contains cognitive test data from seven Latin countries: Argentina, Brazil, Chile, Colombia, Mexico, Peru, and Spain. The tests included Raven’s Standard Progressive Matrices, a short version of the 2003 PISA with a focus on mathematics, and some subtests of the Berlin Intelligence Structural model. This amounted to two figural reasoning tests (SPM and BIS-PF), a figural short-term memory test (BIS-MF), two simple numerical mental speed tests (BIS-PN2, BIS-RN1) and a simple figural mental speed test (BIS-RF), in addition to two numerical reasoning tests (BIS-PN3, BIS-RN3), a figural creativity test (BIS-CF2), and a verbal creativity test (BIS-CV1). Socioeconomic status was assessed with questions about “available resources in the home, age, sex, hometown, birth order, native language, number of dependents, early childhood environment, parents’ education, and job occupation of the main family provider” (Flores-Mendoza et al., 2018) and information about the school the tests took place in was also provided.
I have the full data, but right now I’m just going to preliminarily assess if the aggregate Latin American data can be compared with the Spanish data using the matrices provided in the associated book. I used McNeish & Wolf’s (2020) https://www.dynamicfit.app/cfa/ to compute the cutoff indices based on a model of the data for the Spanish sample with a correlated group factor model and the Spanish sample size.
fa.parallel(SPA.cor, n.obs = 145); fa.parallel(LAT.cor, n.obs = 1393)
## Warning in fa.stats(r = r, f = f, phi = phi, n.obs = n.obs, np.obs = np.obs, :
## The estimated weights for the factor scores are probably incorrect. Try a
## different factor score estimation method.
## Warning in fac(r = r, nfactors = nfactors, n.obs = n.obs, rotate = rotate, : An
## ultra-Heywood case was detected. Examine the results carefully
## Parallel analysis suggests that the number of factors = 2 and the number of components = 1
## Parallel analysis suggests that the number of factors = 3 and the number of components = 1
SFA <- fa(SPA.cor, n.obs = 145, nfactors = 2)
## Loading required namespace: GPArotation
print(SFA$loadings, cutoff = 0.2)
##
## Loadings:
## MR1 MR2
## SPM 0.653
## PISA 0.845
## BISPN3 0.595
## BISRN3 0.776
## BISRN1 0.684
## BISMF 0.207
## BISRF 0.390 -0.286
## BISPF 0.266
## BISPN2 0.575
##
## MR1 MR2
## SS loadings 1.985 1.374
## Proportion Var 0.221 0.153
## Cumulative Var 0.221 0.373
FA2 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF
F1 ~~ F2'
GFA2 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF
g =~ F1 + F2
F2 ~~ 0.5*F2'
SPA2 <- cfa(FA2, sample.cov = SPA.cov, sample.nobs = 145, std.lv = T)
LAT2 <- cfa(FA2, sample.cov = LAT.cov, sample.nobs = 1393, std.lv = T) #Latin American fit for comparison
fitMeasures(SPA2, FITM); fitMeasures(LAT2, FITM)
## chisq df npar cfi rmsea
## 31.752 19.000 17.000 0.949 0.068
## rmsea.ci.lower rmsea.ci.upper aic bic
## 0.019 0.108 6024.306 6074.911
## chisq df npar cfi rmsea
## 196.864 19.000 17.000 0.941 0.082
## rmsea.ci.lower rmsea.ci.upper aic bic
## 0.072 0.093 58732.400 58821.466
There were no dynamic cutoffs available. Perhaps the method is in its early stages and with more development, other fit indices might be able to work. Though perhaps not accurate, I will use the typical Hu & Bentler (1999) cutoffs and plot the model BICs.
CON2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T)
MET2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = "loadings")
SCA2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = c("loadings", "intercepts"))
SFI2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = c("loadings", "intercepts", "residuals"))
MEA2 <- cfa(FA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals", "means"), control=list(rel.tol=1e-4)) #Standardizing latent variances does not affect fit
round(cbind(CONFIGURAL = fitMeasures(CON2, FITM),
METRIC = fitMeasures(MET2, FITM),
SCALAR = fitMeasures(SCA2, FITM),
STRICT = fitMeasures(SFI2, FITM),
MEANS = fitMeasures(MEA2, FITM)),3)
## CONFIGURAL METRIC SCALAR STRICT MEANS
## chisq 228.616 264.146 283.215 303.779 330.437
## df 38.000 44.000 50.000 58.000 60.000
## npar 50.000 44.000 38.000 30.000 28.000
## cfi 0.941 0.932 0.928 0.925 0.917
## rmsea 0.081 0.081 0.078 0.074 0.077
## rmsea.ci.lower 0.071 0.071 0.069 0.066 0.069
## rmsea.ci.upper 0.091 0.090 0.087 0.083 0.085
## aic 64788.706 64812.237 64819.305 64823.869 64846.527
## bic 65055.618 65047.119 65022.158 64984.016 64995.998
CON2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
## Could not compute standard errors! The information matrix could
## not be inverted. This may be a symptom that the model is not
## identified.
MET2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = "loadings")
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
SCA2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = c("loadings", "intercepts"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
## The variance-covariance matrix of the estimated parameters (vcov)
## does not appear to be positive definite! The smallest eigenvalue
## (= 1.909779e-14) is close to zero. This may be a symptom that the
## model is not identified.
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
SFI2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = F, group.equal = c("loadings", "intercepts", "residuals"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
## The variance-covariance matrix of the estimated parameters (vcov)
## does not appear to be positive definite! The smallest eigenvalue
## (= 7.459434e-13) is close to zero. This may be a symptom that the
## model is not identified.
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING: some estimated lv variances are negative
MEA2 <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals", "means"), control=list(rel.tol=1e-4))
round(cbind(CONFIGURAL = fitMeasures(CON2, FITM),
METRIC = fitMeasures(MET2, FITM),
SCALAR = fitMeasures(SCA2, FITM),
STRICT = fitMeasures(SFI2, FITM),
MEANS = fitMeasures(MEA2, FITM)),3)
## CONFIGURAL METRIC SCALAR STRICT MEANS
## chisq 228.616 267.216 286.266 305.579 331.492
## df 36.000 45.000 50.000 58.000 60.000
## npar 52.000 43.000 38.000 30.000 28.000
## cfi 0.941 0.932 0.927 0.924 0.917
## rmsea 0.083 0.080 0.078 0.075 0.077
## rmsea.ci.lower 0.073 0.071 0.070 0.066 0.069
## rmsea.ci.upper 0.094 0.090 0.087 0.083 0.085
## aic 64792.706 64813.306 64822.356 64825.670 64847.582
## bic 65070.294 65042.850 65025.209 64985.817 64997.053
The last model does not appear to suffer from multigroup sampling variance issues and the factor means are F1 = -0.119 (p = 0.011), F2 = 0.065 (p = 0.402), g = -0.370 (p = <0.001) with lower values favoring Spaniards.
ggplot(df, aes(x = Model, y = BIC, group = LM, shape = LM, color = LM)) + geom_line() + geom_point() + xlab('Level of Invariance') + ylab('Schwarz Information Criterion') + theme_bw() + theme(text = element_text(size = 12, family = "serif"), legend.position = "none")
GFA2S <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF
g =~ F1 + F2
F2 ~~ 0.5*F2
F1 ~ c(0, 0)*1
F2 ~ c(0, 0)*1'
GFA2W1 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF
g =~ F1 + F2
F2 ~~ 0.5*F2
F1 ~ c(0, 0)*1'
GFA2W2 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF
g =~ F1 + F2
F2 ~~ 0.5*F2
F2 ~ c(0, 0)*1'
GFA2C1 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF
g =~ F1 + F2
F2 ~~ 0.5*F2
F1 ~ c(0, 0)*1
g ~ c(0, 0)*1'
GFA2C2 <- '
F1 =~ SPM + PISA + BISPF
F2 =~ BISPN3 + BISRN3 + BISRN1 + BISMF + BISRF
g =~ F1 + F2
F2 ~~ 0.5*F2
F2 ~ c(0, 0)*1
g ~ c(0, 0)*1'
SFIL <- cfa(GFA2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"))
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
STRONG <- cfa(GFA2S, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4))
WEAK1 <- cfa(GFA2W1, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4))
WEAK2 <- cfa(GFA2W2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4))
CONTRA1 <- cfa(GFA2C1, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4))
CONTRA2 <- cfa(GFA2C2, sample.cov = SLATCovs, sample.mean = SLATMeans, sample.nobs = SLATNs, std.lv = T, group.equal = c("loadings", "intercepts", "residuals"))
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
round(cbind(SFI = fitMeasures(SFIL, FITM),
MEANS = fitMeasures(MEA2, FITM),
STRONG = fitMeasures(STRONG, FITM),
WEAK1 = fitMeasures(WEAK1, FITM),
WEAK2 = fitMeasures(WEAK2, FITM),
CONTRA1 = fitMeasures(CONTRA1, FITM),
CONTRA2 = fitMeasures(CONTRA2, FITM)),3)
## SFI MEANS STRONG WEAK1 WEAK2 CONTRA1
## chisq 304.534 331.492 311.208 306.446 306.574 331.490
## df 57.000 60.000 59.000 58.000 58.000 59.000
## npar 31.000 28.000 29.000 30.000 30.000 29.000
## cfi 0.924 0.917 0.923 0.924 0.924 0.916
## rmsea 0.075 0.077 0.075 0.075 0.075 0.077
## rmsea.ci.lower 0.067 0.069 0.067 0.067 0.067 0.069
## rmsea.ci.upper 0.084 0.085 0.083 0.083 0.083 0.086
## aic 64826.624 64847.582 64829.298 64826.536 64826.664 64849.581
## bic 64992.110 64997.053 64984.107 64986.683 64986.811 65004.389
## CONTRA2
## chisq 312.131
## df 59.000
## npar 29.000
## cfi 0.922
## rmsea 0.075
## rmsea.ci.lower 0.067
## rmsea.ci.upper 0.083
## aic 64830.221
## bic 64985.030
It seems as though the first weak model fits best. Note that the homogeneous means, strong, weak, and one of the contra models had trouble converging. This may be a demerit against them that will be investigated later.
Preliminarily, it seems Latin American countries - at least at this level of socioeconomic status - can be compared to European countries, or at the very least, Spain, which is comparable to other western European countries. This page will be expanded with analyses of the raw data at the item and parcel level and with countries separated. This quick first-look should provide some hope that valid comparisons are possible.
Flores-Mendoza, C., Ardila, R., Rosas, R., Lucio, M. E., Gallegos, M., & Colareta, N. R. (2018). Intelligence Measurement and School Performance in Latin America: A Report of the Study of Latin American Intelligence Project. Springer International Publishing. //www.springer.com/us/book/9783319899749
McNeish, D., & Wolf, M. G. (2020). Dynamic Fit Index Cutoffs for Confirmatory Factor Analysis Models. PsyArXiv. https://doi.org/10.31234/osf.io/v8yru