Startup

Functions

#Congruence coefficient formula because Psych's fa.congruence function is broken

CONGO <- function(F1, F2) {
  PHI = sum(F1*F2) / sqrt(sum(F1^2)*sum(F2^2))
  return(PHI)}

Packages

library(pacman)
p_load(lavaan, semPlot, psych, qpcR, ggplot2)

Data Input

#Correlation matrices

lowerNAT = '
1                       
0.58    1                   
0.3 0.35    1               
0.32    0.28    0.38    1           
0.88    0.9 0.36    0.33    1       
0.37    0.38    0.83    0.83    0.42    1   
0.75    0.77    0.7 0.68    0.86    0.83    1'

lowerMIG = '
1                       
0.65    1                   
0.37    0.43    1               
0.31    0.32    0.57    1           
0.93    0.88    0.43    0.34    1       
0.38    0.42    0.89    0.88    0.44    1   
0.77    0.77    0.78    0.72    0.85    0.85    1'

#Variable names

Names = list("GWH", "VRZ", "OIO", "WHM", "VIX", "NIX", "CIX")

#Convert to variance-covariance matrices

RIASN.cor = getCov(lowerNAT, names = Names)
RIASM.cor = getCov(lowerMIG, names = Names)
NATSDs <- c(8.48, 9.61, 8.85, 8.75, 13.52, 12.46, 12.43)
MIGSDs <- c(11.28, 8.96, 10.66, 10.28, 15.39, 15.68, 14.92)
RIASN.cov = lavaan::cor2cov(R = RIASN.cor, sds = NATSDs)
RIASM.cov = lavaan::cor2cov(R = RIASM.cor, sds = MIGSDs)

#Means

NATmeans = c(53.11, 53.39, 52.21, 51.6, 105.85, 103.5, 105.34)
MIGmeans = c(44.29, 45.58, 47.99, 47.61, 91.86, 96.54, 93.44)

#Group inputs

GCovs <- list(RIASN.cov, RIASM.cov)
GMeans <- list(NATmeans, MIGmeans)
GNs <- list(316, 316)

#Fit measures

FITM <- c("chisq", "df", "nPar", "cfi", "rmsea", "rmsea.ci.lower", "rmsea.ci.upper", "aic", "bic")

LATS <- list(
  VERBAL = c("GWH", "VRZ"),
  NONVERBAL = c("OIO", "WHM"))

Analysis

Rationale

Gygi, Fux & Grob (2016) assessed whether the Reynolds Intellectual Assessment Scales (RIAS) were biased comparing migrants to natives in Germany. They found that measurement invariance (a lack of bias) was supported for a model with two correlated group factors but not for a model with a g factor alone. They were unable to perform well-identified assessments of higher-order and bifactor models, but nonetheless, the parameters can be generated for those models in order to assess g-related hypotheses.

Initial Fits

Four models are compared. The authors fit two of them - which I will check the fits of first in order to see how the matrix input results correspond to the results with the raw data - and I will fit the other two as well. The ones they fit were the measurement or correlated group factor model and the unified g model. In addition, I will fit the higher-order g and bifactor g models. When means of assessing more than metric measurement invariance in network models are developed, I will add a test of that model here.

#Measurement - In Paper

RIASM.model <- '
VERBAL =~ GWH + VRZ
NONVERBAL =~ OIO + WHM'

#Unified g - In Paper

RIASU.model <- '
g =~ GWH + VRZ + OIO + WHM

GWH ~~ VRZ
OIO ~~ WHM'

#Bifactor g - Novel

RIASB.model <- '
VERBAL =~ GWH + VRZ
NONVERBAL =~ OIO + WHM
g =~ GWH + VRZ + OIO + WHM'

#Higher-Order g - Novel

RIASH.model <- '
VERBAL =~ GWH + VRZ
NONVERBAL =~ OIO + WHM
g =~ VERBAL + NONVERBAL'

RIASM.fit <- cfa(RIASM.model, sample.cov = RIASN.cov, sample.nobs = 316, std.lv = T, orthogonal = F)
RIASU.fit <- cfa(RIASU.model, sample.cov = RIASN.cov, sample.nobs = 316, std.lv = T, orthogonal = F)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     Could not compute standard errors! The information matrix could
##     not be inverted. This may be a symptom that the model is not
##     identified.
RIASB.fit <- cfa(RIASB.model, sample.cov = RIASN.cov, sample.nobs = 316, std.lv = T, orthogonal = T)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     Could not compute standard errors! The information matrix could
##     not be inverted. This may be a symptom that the model is not
##     identified.
RIASH.fit <- cfa(RIASH.model, sample.cov = RIASN.cov, sample.nobs = 316, std.lv = T, orthogonal = T)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     Could not compute standard errors! The information matrix could
##     not be inverted. This may be a symptom that the model is not
##     identified.
round(cbind(MEASUREMENT = fitMeasures(RIASM.fit, FITM),
            UNIFIED = fitMeasures(RIASU.fit, FITM),
            BIFACTOR = fitMeasures(RIASB.fit, FITM),
            HIGHERORDER = fitMeasures(RIASH.fit, FITM)),3)
##                MEASUREMENT  UNIFIED BIFACTOR HIGHERORDER
## chisq                2.434    2.434       NA       2.434
## df                   1.000    0.000   -2.000       0.000
## npar                 9.000   10.000   12.000      10.000
## cfi                  0.994    0.990       NA       0.990
## rmsea                0.067    0.000       NA       0.000
## rmsea.ci.lower       0.000    0.000       NA       0.000
## rmsea.ci.upper       0.180    0.000       NA       0.000
## aic               8889.510 8891.510 8895.510    8891.510
## bic               8923.312 8929.068 8940.579    8929.068
#Model Plots

semPaths(RIASM.fit, "model", "std", title = F, residuals = F, groups = "LATS", pastel = T, mar = c(2, 1, 3, 1))

semPaths(RIASU.fit, "std", title = F, residuals = F, mar = c(2, 1, 3, 1), layout = "circle2", bifactor = "g", posCol = c("skyblue4", "red"), sizeMan = 7, edge.label.cex = 1.2)

semPaths(RIASB.fit, "model", "std", title = F, residuals = F, groups = "LATS", pastel = T, mar = c(2, 1, 3, 1), bifactor = "g", layout = "tree2", exoCov = F)
## Warning in if (w <= 0) w <- 1e-07: the condition has length > 1 and only the
## first element will be used

## Warning in if (w <= 0) w <- 1e-07: the condition has length > 1 and only the
## first element will be used

## Warning in if (w <= 0) w <- 1e-07: the condition has length > 1 and only the
## first element will be used

## Warning in if (w <= 0) w <- 1e-07: the condition has length > 1 and only the
## first element will be used

semPaths(RIASH.fit, "model", "std", title = F, residuals = F, groups = "LATS", pastel = T, mar = c(2, 1, 3, 1))

We can take a glance at what proportionality constraints do to the higher-order g model and the resulting loadings too.

#Congruence coefficient matrix, then correlation matrix, for vectors of g loadings from the unified, bifactor, higher-order, efa, and pca

congmat <- '
1                   
0.9953026   1               
0.9908895   0.9992733   1           
0.9980776   0.9993269   0.9972402   1       
0.9869383   0.9978053   0.9995052   0.9960175   1   
0.9895391   0.9778366   0.9706432   0.9839116   0.9661455   1'

cormat <- '
1                   
0.9960670   1               
0.9858495   0.9968207   1           
0.9998559   0.9958681   0.9855349   1       
0.9977094   0.9984901   0.9923225   0.9982435   1   
0.9200141   0.9011842   0.8768167   0.9259658   0.9235547   1'
cocnam = list("U", "B", "H", "EFA", "PCA", "d")

cong.cor = getCov(congmat, names = cocnam)
cor.cor = getCov(cormat, names = cocnam)
corcongmat <- lowerUpper(cong.cor, cor.cor, diff = F)
corcongmat
##             U         B         H       EFA       PCA         d
## U          NA 0.9960670 0.9858495 0.9998559 0.9977094 0.9200141
## B   0.9953026        NA 0.9968207 0.9958681 0.9984901 0.9011842
## H   0.9908895 0.9992733        NA 0.9855349 0.9923225 0.8768167
## EFA 0.9980776 0.9993269 0.9972402        NA 0.9982435 0.9259658
## PCA 0.9869383 0.9978053 0.9995052 0.9960175        NA 0.9235547
## d   0.9895391 0.9778366 0.9706432 0.9839116 0.9661455        NA
dl
ggplot(dl, aes(x = Indicator, y = Loading, fill = Model)) + geom_bar(stat = "identity", position = position_dodge(width = 0.8), width = 0.7, show.legend = T) + coord_cartesian(ylim = c(0, 1)) + xlab("Indicator") + ylab("g Loading") + scale_fill_hue(name = "Factor Model", breaks = c("U", "B", "H", "FA", "PCA"), labels = c("Unified", "Bifactor", "Higher-Order", "EFA", "PCA"), c = 40) + scale_x_discrete(labels = c("GWH" = "Guess What", "OIO" = "Odd Item Out", "VRZ" = "Verbal Reasoning", "WHM" = "What's Missing")) + theme_bw() + theme(legend.position = c(0.81, 0.81), legend.background = element_blank(), text = element_text(size = 12, family = "serif"))

MGCFA

#Measurement Model

RIASMC.fit <- cfa(RIASM.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F)
RIASMM.fit <- cfa(RIASM.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = "loadings")
RIASMS.fit <- cfa(RIASM.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts"))
RIASMF.fit <- cfa(RIASM.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"))
RIASMFP.fit <- cfa(RIASM.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"))
RIASMV.fit <- cfa(RIASM.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"))
RIASMME.fit <- cfa(RIASM.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals", "means"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"))

round(cbind(CONFIGURAL = fitMeasures(RIASMC.fit, FITM),
            METRIC = fitMeasures(RIASMM.fit, FITM),
            SCALAR = fitMeasures(RIASMS.fit, FITM),
            STRICT = fitMeasures(RIASMF.fit, FITM),
            P_STRICT = fitMeasures(RIASMFP.fit, FITM),
            LVARS = fitMeasures(RIASMV.fit, FITM),
            MEANS = fitMeasures(RIASMME.fit, FITM)),3)
##                CONFIGURAL    METRIC    SCALAR    STRICT  P_STRICT     LVARS
## chisq               3.287     6.401     7.975    38.620     8.918     8.918
## df                  2.000     4.000     6.000    10.000     8.000     8.000
## npar               26.000    24.000    22.000    18.000    20.000    20.000
## cfi                 0.998     0.996     0.997     0.953     0.998     0.998
## rmsea               0.045     0.044     0.032     0.095     0.019     0.019
## rmsea.ci.lower      0.000     0.000     0.000     0.065     0.000     0.000
## rmsea.ci.upper      0.129     0.103     0.085     0.128     0.070     0.070
## aic             18019.453 18018.566 18016.140 18038.785 18013.083 18013.083
## bic             18135.124 18125.339 18114.016 18118.865 18102.061 18102.061
##                    MEANS
## chisq            142.285
## df                10.000
## npar              18.000
## cfi                0.782
## rmsea              0.205
## rmsea.ci.lower     0.176
## rmsea.ci.upper     0.235
## aic            18142.450
## bic            18222.530

The verbal, but not the nonverbal, residuals were biased (checked, but not included). There’s a partial model available, but involves releasing the GWH and VRZ residuals. Fortunately, this still means that OIO and WHM can be interpreted in common. The latent means are not equal.

#Unified Model

RIASUC.fit <- cfa(RIASU.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     Could not compute standard errors! The information matrix could
##     not be inverted. This may be a symptom that the model is not
##     identified.
RIASUM.fit <- cfa(RIASU.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = "loadings")
RIASUS.fit <- cfa(RIASU.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts"))
RIASUF.fit <- cfa(RIASU.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"))
RIASUFP.fit <- cfa(RIASU.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = F, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"))
RIASUV.fit <- cfa(RIASU.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts"))
RIASUME.fit <- cfa(RIASU.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = F, group.equal = c("loadings", "intercepts", "means"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 6.280855e-13) is close to zero. This may be a symptom that the
##     model is not identified.
round(cbind(CONFIGURAL = fitMeasures(RIASUC.fit, FITM),
            METRIC = fitMeasures(RIASUM.fit, FITM),
            SCALAR = fitMeasures(RIASUS.fit, FITM),
            STRICT = fitMeasures(RIASUF.fit, FITM),
            P_STRICT = fitMeasures(RIASUFP.fit, FITM),
            LVARS = fitMeasures(RIASUV.fit, FITM),
            MEANS = fitMeasures(RIASUME.fit, FITM)),3)
##                CONFIGURAL    METRIC    SCALAR    STRICT  P_STRICT     LVARS
## chisq               3.287     6.401     7.975    51.306    21.065     7.975
## df                  0.000     3.000     6.000    10.000     8.000     6.000
## npar               28.000    25.000    22.000    18.000    20.000    22.000
## cfi                 0.995     0.994     0.997     0.932     0.978     0.997
## rmsea               0.000     0.060     0.032     0.114     0.072     0.032
## rmsea.ci.lower      0.000     0.000     0.000     0.085     0.035     0.000
## rmsea.ci.upper      0.000     0.125     0.085     0.146     0.110     0.085
## aic             18023.453 18020.566 18016.140 18051.471 18025.231 18016.140
## bic             18148.021 18131.788 18114.016 18131.551 18114.208 18114.016
##                    MEANS
## chisq            141.198
## df                 7.000
## npar              21.000
## cfi                0.778
## rmsea              0.246
## rmsea.ci.lower     0.212
## rmsea.ci.upper     0.283
## aic            18147.363
## bic            18240.790

Even releasing the verbal residuals in this model does not make the partial model fit; all of them are biased with respect to a single factor. The latent variance are the same for this factor though. I do not know why the original authors found bias.

#Bifactor Model

RIASBC.fit <- cfa(RIASB.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, control=list(rel.tol=1e-4), check.gradient = F)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     Could not compute standard errors! The information matrix could
##     not be inverted. This may be a symptom that the model is not
##     identified.
RIASBM.fit <- cfa(RIASB.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = "loadings", control=list(rel.tol=1e-4), check.gradient = F)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 8.064675e-15) is close to zero. This may be a symptom that the
##     model is not identified.
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated ov
## variances are negative
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
RIASBS.fit <- cfa(RIASB.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts"), control=list(rel.tol=1e-4), check.gradient = F)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 3.859252e-14) is close to zero. This may be a symptom that the
##     model is not identified.
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated ov
## variances are negative
RIASBF.fit <- cfa(RIASB.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), control=list(rel.tol=1e-4), check.gradient = F)
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
RIASBFP.fit <- cfa(RIASB.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"), control=list(rel.tol=1e-4), check.gradient = F)
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated ov
## variances are negative
RIASBV.fit <- cfa(RIASB.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"), control=list(rel.tol=1e-4), check.gradient = F)
RIASBME.fit <- cfa(RIASB.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals", "means"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"), control=list(rel.tol=1e-4), check.gradient = F)

round(cbind(CONFIGURAL = fitMeasures(RIASBC.fit, FITM),
            METRIC = fitMeasures(RIASBM.fit, FITM),
            SCALAR = fitMeasures(RIASBS.fit, FITM),
            STRICT = fitMeasures(RIASBF.fit, FITM),
            P_STRICT = fitMeasures(RIASBFP.fit, FITM),
            LVARS = fitMeasures(RIASBV.fit, FITM),
            MEANS = fitMeasures(RIASBME.fit, FITM)),3)
##                CONFIGURAL    METRIC    SCALAR    STRICT  P_STRICT     LVARS
## chisq                  NA     6.409     7.536    34.245     6.514     6.498
## df                  -4.00     1.000     2.000     6.000     4.000     4.000
## npar                32.00    27.000    26.000    22.000    24.000    24.000
## cfi                    NA     0.991     0.991     0.953     0.996     0.996
## rmsea                  NA     0.131     0.094     0.122     0.045     0.044
## rmsea.ci.lower         NA     0.050     0.030     0.084     0.000     0.000
## rmsea.ci.upper         NA     0.235     0.169     0.163     0.104     0.104
## aic              18031.45 18024.574 18023.701 18042.410 18018.680 18018.663
## bic              18173.82 18144.694 18139.372 18140.286 18125.453 18125.437
##                    MEANS
## chisq            141.807
## df                 7.000
## npar              21.000
## cfi                0.777
## rmsea              0.247
## rmsea.ci.lower     0.212
## rmsea.ci.upper     0.283
## aic            18147.972
## bic            18241.398

This model really can’t be identified, so the result is kind of pointless. There are different negative residual variances in the migrant group by model and fixing that leads to other errors, so testing a model with “pure” g and group factors unfortunately cannot happen reasonably for this test yet.

#Higher-Order Model

RIASHC.fit <- cfa(RIASH.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T)
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     Could not compute standard errors! The information matrix could
##     not be inverted. This may be a symptom that the model is not
##     identified.
RIASHM.fit <- cfa(RIASH.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = "loadings")
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 9.420794e-15) is close to zero. This may be a symptom that the
##     model is not identified.
RIASHS.fit <- cfa(RIASH.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 2.491497e-14) is close to zero. This may be a symptom that the
##     model is not identified.
RIASHF.fit <- cfa(RIASH.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"))
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated lv
## variances are negative
RIASHFP.fit <- cfa(RIASH.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = F, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"))
## Warning in lav_model_vcov(lavmodel = lavmodel2, lavsamplestats = lavsamplestats, : lavaan WARNING:
##     The variance-covariance matrix of the estimated parameters (vcov)
##     does not appear to be positive definite! The smallest eigenvalue
##     (= 5.423555e-13) is close to zero. This may be a symptom that the
##     model is not identified.
RIASHV.fit <- cfa(RIASH.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"))
RIASHME.fit <- cfa(RIASH.model, sample.cov = GCovs, sample.mean = GMeans, sample.nobs = GNs, std.lv = T, orthogonal = T, group.equal = c("loadings", "intercepts", "residuals", "means"), group.partial = c("GWH~~GWH", "VRZ~~VRZ"))

round(cbind(CONFIGURAL = fitMeasures(RIASHC.fit, FITM),
            METRIC = fitMeasures(RIASHM.fit, FITM),
            SCALAR = fitMeasures(RIASHS.fit, FITM),
            STRICT = fitMeasures(RIASHF.fit, FITM),
            P_STRICT = fitMeasures(RIASHFP.fit, FITM),
            LVARS = fitMeasures(RIASHV.fit, FITM),
            MEANS = fitMeasures(RIASHME.fit, FITM)),3)
##                CONFIGURAL    METRIC    SCALAR    STRICT  P_STRICT     LVARS
## chisq               3.287     6.401     7.975    38.620     8.918     8.918
## df                  0.000     3.000     4.000     8.000     6.000     6.000
## npar               28.000    25.000    24.000    20.000    22.000    22.000
## cfi                 0.995     0.994     0.993     0.949     0.995     0.995
## rmsea               0.000     0.060     0.056     0.110     0.039     0.039
## rmsea.ci.lower      0.000     0.000     0.000     0.077     0.000     0.000
## rmsea.ci.upper      0.000     0.125     0.113     0.146     0.089     0.089
## aic             18023.453 18020.566 18020.140 18042.785 18017.083 18017.083
## bic             18148.021 18131.788 18126.914 18131.763 18114.958 18114.958
##                    MEANS
## chisq            142.285
## df                 9.000
## npar              19.000
## cfi                0.780
## rmsea              0.216
## rmsea.ci.lower     0.186
## rmsea.ci.upper     0.249
## aic            18144.450
## bic            18228.979

These steps were actually unnecessary since invariance of higher-order models is tested in stages and the first stage (the measurement model) was invariant, which is then followed by testing only parts of the second stage being invariant. Unfortunately, with so few indicators and group factors, this model cannot be well-identified. The latent means in verbal and nonverbal are not equal, but the latent mean in nonverbal (net of g) is.

Overall MGCFA Results

Akaike and Schwarz weights are useful for model selection and visualizing the probability of one or another model. This is very true for measurement invariance model comparisons. I plot the normalized model probabilities based on these. Note that the unified model did not have invariant residual variances and that the model whose Schwarz weights are plotted is for the model above with invariant residuals, and for the three models where partial strict factorial invariance fit, that model.

BICMM <- c(fitMeasures(RIASMC.fit, "bic"), fitMeasures(RIASMM.fit, "bic"), fitMeasures(RIASMS.fit, "bic"), fitMeasures(RIASMV.fit, "bic"))
BICweightsMM <- akaike.weights(BICMM)$weights
ModsMM <- data.frame(Model = factor(rep(c("Configural", "Metric", "Scalar", "Strict + LV"))),
                   CRITMM = factor(rep('BIC', each = 4)),
                   values = BICweightsMM)

BICU <- c(fitMeasures(RIASUC.fit, "bic"), fitMeasures(RIASUM.fit, "bic"), fitMeasures(RIASUS.fit, "bic"), fitMeasures(RIASUV.fit, "bic"))
BICweightsU <- akaike.weights(BICU)$weights
ModsU <- data.frame(Model = factor(rep(c("Configural", "Metric", "Scalar", "Strict + LV"))),
                   CRITU = factor(rep('BIC', each = 4)),
                   values = BICweightsU)

BICB <- c(fitMeasures(RIASBC.fit, "bic"), fitMeasures(RIASBM.fit, "bic"), fitMeasures(RIASBS.fit, "bic"), fitMeasures(RIASBV.fit, "bic"))
BICweightsB <- akaike.weights(BICB)$weights
ModsB <- data.frame(Model = factor(rep(c("Configural", "Metric", "Scalar", "Strict + LV"))),
                   CRITB = factor(rep('BIC', each = 4)),
                   values = BICweightsB)

BICH <- c(fitMeasures(RIASHC.fit, "bic"), fitMeasures(RIASHM.fit, "bic"), fitMeasures(RIASHS.fit, "bic"), fitMeasures(RIASHV.fit, "bic"))
BICweightsH <- akaike.weights(BICH)$weights
ModsH <- data.frame(Model = factor(rep(c("Configural", "Metric", "Scalar", "Strict + LV"))),
                   CRITH = factor(rep('BIC', each = 4)),
                   values = BICweightsH)
df
ggplot(df, aes(x = Model, y = VALS, fill = LM)) + geom_bar(stat = "identity", position = position_dodge(width = 0.8), width = 0.7, show.legend = T) + coord_cartesian(ylim = c(0, 1)) + xlab('Level of Invariance') + ylab('Normalized Probability') + scale_fill_hue(name = "Factor Model", breaks = c("MM", "U", "B", "H"), labels = c("Measurement", "Unified", "Bifactor", "Higher-Order"), c = 40) + theme_bw() + theme(text = element_text(size = 12, family = "serif"))

ggplot(df, aes(x = Model, y = BIC, group = LM, shape = LM, color = LM)) + geom_line() + geom_point() + xlab('Level of Invariance') + ylab('Bayesian Information Criterion') + scale_colour_discrete(name = "Factor Model", breaks = c("MM", "U", "B", "H"), labels = c("Measurement", "Unified", "Bifactor", "Higher-Order"), c = 60) + scale_shape_discrete(name = "Factor Model", breaks = c("MM", "U", "B", "H"), labels = c("Measurement", "Unified", "Bifactor", "Higher-Order")) + theme_bw() + theme(text = element_text(size = 12, family = "serif"))

Spearman’s Hypothesis

We can also test Spearman’s hypothesis in these data, although it may be biased towards a model rejecting the contra hypothesis because there are too few indicators. The model cannot be tested with the bifactor model for identification reasons, again due to the number of indicators, but also the diversity of their content, since more group factors are needed; an alternative method would be to expand the number of indicators and check the different hypotheses with two group factors provided the loadings are homogeneous. In the strong model, the verbal latent variance had to be constrained, so I plot their probabilities with and without that model. An atheoretical contra model is also plotted, but since this is not theoretically admissible, it shouldn’t be considered and the second plot only includes the theoretically admissible models which did not induce errors.

round(cbind(LVARS = fitMeasures(RIASHV.fit, FITM),
            STRONG = fitMeasures(RIASHST.fit, FITM),
            WEAKV = fitMeasures(RIASHW1.fit, FITM),
            WEAKN = fitMeasures(RIASHW2.fit, FITM),
            CONTRA = fitMeasures(RIASHCO.fit, FITM),
            CONTRAV = fitMeasures(RIASHC1.fit, FITM),
            CONTRAN = fitMeasures(RIASHC2.fit, FITM)),3)
##                    LVARS    STRONG     WEAKV     WEAKN    CONTRA   CONTRAV
## chisq              8.918     9.726     8.918     8.918     8.918   141.331
## df                 6.000     9.000     7.000     7.000     7.000     8.000
## npar              22.000    19.000    21.000    21.000    21.000    20.000
## cfi                0.995     0.999     0.997     0.997     0.997     0.780
## rmsea              0.039     0.016     0.029     0.029     0.029     0.230
## rmsea.ci.lower     0.000     0.000     0.000     0.000     0.000     0.197
## rmsea.ci.upper     0.089     0.067     0.079     0.079     0.079     0.264
## aic            18017.083 18011.891 18015.083 18015.083 18015.083 18145.496
## bic            18114.958 18096.420 18108.510 18108.510 18108.510 18234.474
##                  CONTRAN
## chisq             45.793
## df                 8.000
## npar              20.000
## cfi                0.938
## rmsea              0.122
## rmsea.ci.lower     0.089
## rmsea.ci.upper     0.158
## aic            18049.958
## bic            18138.936
BICSP <- c(fitMeasures(RIASHV.fit, "bic"), fitMeasures(RIASHST.fit, "bic"), fitMeasures(RIASHW1.fit, "bic"), fitMeasures(RIASHW2.fit, "bic"), fitMeasures(RIASHCO.fit, "bic"), fitMeasures(RIASHC1.fit, "bic"), fitMeasures(RIASHC2.fit, "bic"))
BICweightsSP <- akaike.weights(BICSP)$weights
ModsSP <- data.frame(Model = factor(rep(c("Latent Variances", "Strong", "Weak Verbal", "Weak Nonverbal", "Contra", "Contra Verbal", "Contra Nonverbal"))),
                   CRITSP = factor(rep('BIC', each = 7)),
                   values = BICweightsSP)
ModsSP
ggplot(ModsSP, aes(Model, values, fill = Model)) + geom_bar(stat = "identity", position = "dodge", show.legend = F) + coord_cartesian(ylim = c(0,1)) + xlab('Spearman\'s Hypothesis Model') + ylab('Normalized Probability') + theme_bw() + theme(text = element_text(size = 12, family = "serif"))

BICSPNS <- c(fitMeasures(RIASHV.fit, "bic"), fitMeasures(RIASHW1.fit, "bic"), fitMeasures(RIASHW2.fit, "bic"), fitMeasures(RIASHC1.fit, "bic"), fitMeasures(RIASHC2.fit, "bic"))
BICweightsSPNS <- akaike.weights(BICSPNS)$weights
ModsSPNS <- data.frame(Model = factor(rep(c("Latent Variances", "Weak Verbal", "Weak Nonverbal", "Contra Verbal", "Contra Nonverbal"))),
                   CRITSPNS = factor(rep('BIC', each = 5)),
                   values = BICweightsSPNS)
ModsSPNS
ggplot(ModsSPNS, aes(Model, values, fill = Model)) + geom_bar(stat = "identity", position = "dodge", show.legend = F) + coord_cartesian(ylim = c(0,1)) + xlab('Spearman\'s Hypothesis Model') + ylab('Normalized Probability') + theme_bw() + theme(text = element_text(size = 12, family = "serif"))

Since the optimal model is unclear, it may be worthwhile to compare the proportion of the group differences which would be explained by g in the latent variances models for the higher-order and bifactor models. The formula to do this for the bifactor model is very simple (Brown, 2006, pp. 334-337)

\[\frac{\Sigma \lambda_g}{\Sigma (\lambda_g + \lambda_{non_g})}\]

whereas for the higher-order model, the formula is

\[\frac{\Sigma (\Gamma_g\times\lambda_{non-g})}{\Sigma ((\Gamma_g\times\lambda_{non-g}) + ((1-\Gamma_g^2)\times\lambda_{non-g}))}\]

because the g loading for each item is just

\[\lambda_g = \Gamma_g\times\lambda_{non-g}\]

and the residualized non-g loading is

\[\epsilon \lambda_{non-g} = (1-\Gamma_g^2)\times\lambda_{non-g}\]

where \(1-\Gamma_g^2\) is just the residual variance or uniqueness, \(\epsilon\). So, rewritten, the formula for the proportion of the differences due to g is

\[\frac{\Sigma \lambda_g}{\Sigma (\lambda_g + \epsilon \lambda_{non-g})}\]

So, using the values from the latent variance models and translating them to a table

Indicator Bifactor g Loading Bifactor non-g Loading Higher-Order g Loading Higher-Order non-g Loading Higher-Order g Item Loading Uniqueness Residualized non-g Loading Bifactor g Proportion Higher-Order g Proportion
GWH 0.705 0.405 0.783 0.821 0.643 0.387 0.318 0.635 0.669
VRZ 0.642 0.306 0.783 0.696 0.545 0.387 0.269 0.677 0.669
OIO 0.497 0.406 0.815 0.668 0.544 0.336 0.224 0.550 0.708
WHM 0.411 0.440 0.815 0.579 0.472 0.336 0.194 0.583 0.708

So the percentage of the observed group differences accounted for by g is 59.2% in the bifactor model and 68.7% in the higher-order model. Note that this does assume the item residuals are not contributing to the group difference, and as we have seen, for the verbal subtests, they are, to a degree which I might bother figuring out later, but which is of course less than the amount explain by the modeled factors. The factor means for the higher-order and bifactor models respectively were as follows where negative values are higher scores for natives, but note, that the bifactor means are off due to the identification issues and a slight negative variance, which, when corrected, still left the verbal mean unidentified

Factor Bifactor (d/g) Bifactor (Residuals Constrained) Higher-Order Higher-Order (Residuals Constrained)
Verbal -1.974 NA -0.399 -0.351
Nonverbal -0.396 -0.098 0.023 0.029
g -0.334 -0.799 -0.759 -0.783

The means of the various higher-order Spearman’s hypothesis models may also be interesting. The strong model had a mean in g of -1.014; the weak verbal model had means of -1.199 in g and 0.307 in nonverbal; the weak nonverbal model had means of -0.730 in g and -0.419 in verbal; the contra model had means of -1.028 in verbal and -0.517 in nonverbal; the contra verbal model had a mean of -0.080 in nonverbal, and; the contra nonverbal model had a mean of -0.796 in verbal. The mean differences were redistributed in the models which still fit, the strong, both weak, and the theoretically inadmissible contra models.

Discussion

Unlike the initial authors, I found that the unified g factor model was unbiased up to the scalar stage; maybe I made a mistake somewhere and someone can point it out. In a measurement model, the verbal residual variances were biased but the latent variances were not. In the unified g model, all of the residual variances were biased. In the higher-order g model, the results of course matched the measurement model, and in the bifactor g model, nothing can be stated firmly. Bias might contribute a bit to verbal, but not nonverbal scores, it seems. This instrument can probably be used validly for both of these groups. The observed group differences - at least at this level - seem to be a result of differences in g rather than specific skills but there are too few indicators to conclusively tell.

References

Gygi, J. T., Fux, E., Grob, A., & Arx, P. H. (2016). Measurement Invariance and Latent Mean Differences in the Reynolds Intellectual Assessment Scales (RIAS): Does the German Version of the RIAS Allow a Valid Assessment of Individuals with a Migration Background? PLOS ONE, 11(11), e0166533. https://doi.org/10.1371/journal.pone.0166533

Brown, T. A. (2006). Confirmatory Factor Analysis for Applied Research (First edition). The Guilford Press.