Setup

Functions

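# Share of the group difference in indicator variance due to residual variances.
# T1, T2: residual variances; S1, S2: indicator SDs for groups 1 and 2.
MOES <- function(T1, T2, S1, S2) {
  ES <- (T1 - T2) / (S1^2 - S2^2)
  return(ES)
}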

Equations

\[\sigma_{jk}^2 = \lambda_j'\Phi_k\lambda_j+\theta_{jk}\]

\[\delta_{\sigma_{jk}^2} = \frac{\theta_{j1}-\theta_{j2}}{\sigma_{j1}^2-\sigma_{j2}^2}\]

Rationale

There are multiple effect sizes available for quantifying the effects of differential item functioning for items and of metric and scalar noninvariance for indicators in multi-group confirmatory factor analysis (MGCFA) models. As of today, there are no effect sizes for measuring the effect of noninvariant residual variances on indicator variances in MGCFA. Part of the reason is the difficulty of interpreting variance differences. Millsap & Olivera-Aguilar (2012) proposed a stand-in where practitioners would “evaluate what proportion of the group differences in variance [are] due to the difference in unique factor variances” (p. 385).

The formula is

\[\frac{\theta_{j1}-\theta_{j2}}{\sigma_{j1}^2-\sigma_{j2}^2}\]

and it assumes that “the signs of the numerator and denominator are consistent, and consistent with the difference:”

\[\lambda_j'(\Phi_1-\Phi_2)\lambda_j\]

And thus, in “the single factor case, [that equation] simplifies to”

\[\lambda_j^2(\phi_1-\phi_2)\]
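A short derivation may help show where these expressions come from. It is a sketch that assumes the loadings \(\lambda_j\) are invariant across groups, as the group-free \(\lambda_j\) in the variance equation implies. Differencing the indicator variances across groups gives

\[\sigma_{j1}^2-\sigma_{j2}^2 = \lambda_j'(\Phi_1-\Phi_2)\lambda_j+(\theta_{j1}-\theta_{j2})\]

and dividing both sides by the left-hand side gives

\[1 = \frac{\lambda_j'(\Phi_1-\Phi_2)\lambda_j}{\sigma_{j1}^2-\sigma_{j2}^2}+\frac{\theta_{j1}-\theta_{j2}}{\sigma_{j1}^2-\sigma_{j2}^2}\]

so the residual share and the common-factor share sum to 1. The residual share falls between 0 and 1 only when both components have the same sign as the total difference, which is what the sign-consistency assumption requires.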

Millsap & Olivera-Aguilar tested this with two items from a parent-child conflict scale administered to samples who speak either English or Spanish at home. They observed that 73% and 64% of the differences in the variances were attributable to “unique factor variance differences rather than to differences on the common factor”. Checking their result:

I4 <- MOES(.633, .269, .925, .597)
I6 <- MOES(.628, .366, 1.014, .785)

print(paste("Item 4 proportion = ", I4, "Item 6 =", I6))
## [1] "Item 4 proportion =  0.729143296689209 Item 6 = 0.635967094771234"
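As a quick cross-check of the figures above, the same calculation can be laid out as a small table. This is a sketch, not part of the original analysis: `moes_prop` is a hypothetical stand-in for MOES(), defined here only so the block runs on its own, and the inputs are the values passed to MOES() for Items 4 and 6.

```r
# Hypothetical helper mirroring MOES(): residual-variance difference divided by
# the difference in indicator variances (SDs are squared internally).
moes_prop <- function(theta1, theta2, sd1, sd2) {
  (theta1 - theta2) / (sd1^2 - sd2^2)
}

# Same inputs as the MOES() calls for Items 4 and 6.
items <- data.frame(
  item   = c("Item 4", "Item 6"),
  theta1 = c(0.633, 0.628),   # residual variances, group 1
  theta2 = c(0.269, 0.366),   # residual variances, group 2
  sd1    = c(0.925, 1.014),   # indicator SDs, group 1
  sd2    = c(0.597, 0.785)    # indicator SDs, group 2
)

# The helper is vectorized, so it can be applied to whole columns at once.
items$proportion <- with(items, moes_prop(theta1, theta2, sd1, sd2))
items$percent <- round(100 * items$proportion)

print(items[, c("item", "proportion", "percent")])
```

Rounded to whole percentages, the two proportions come out at 73 and 64, in line with the values Millsap & Olivera-Aguilar reported.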

Demonstration

Simulation

  • Do later

Empirical

I recently reanalyzed the Reynolds Intellectual Assessment Scales (RIAS) and found two variant residuals. I also analyzed the Vietnam Experience Study (VES) and found no variant residuals by conventional criteria, but because the results were near the common 0.01 \(\Delta\)CFI criterion, I freed three residuals, which produced practically no change in goodness-of-fit. To assess the effects of these residual differences, here I assess the proportion of the group differences in indicator variances that is due to them.

The subtests with variant residuals in the RIAS, GWH and VRZ, had SDs of 8.48 and 9.61 in the native group and 11.28 and 8.96 in the immigrant group; GPTL, CD, and ACVL in the VES had SDs of 13.9, 13.8, and 13.7 in the white group and 19.9, 19.3, and 15.9 in the black group. In the RIAS, the residual variances were 23.969 and 45.719 (standardized = 0.326 and 0.515) for natives and 55.711 and 19.786 (0.444 and 0.246) for immigrants; in the VES, they were 63.568, 110.25, and 36.065 (0.332, 0.586, and 0.187) for the white group and 163.64, 212.788, and 71.117 (0.428, 0.672, and 0.333) for the black group. All residual values came from measurement models.

#Proportions

GWH <- MOES(23.969, 55.711, 8.48, 11.28)
VRZ <- MOES(42.719, 19.786, 9.61, 8.96)
GPTL <- MOES(63.568, 163.64, 13.9, 19.9)
CD <- MOES(110.25, 212.788, 13.8, 19.3)
ACVL <- MOES(36.065, 71.117, 13.7, 15.9)

print(paste("GWH proportion =", GWH, "VRZ = ", VRZ, "GPTL = ", GPTL, "CD = ", CD, "ACVL = ", ACVL))
## [1] "GWH proportion = 0.573705899363794 VRZ =  1.89992129572098 GPTL =  0.4934516765286 CD =  0.563240867893436 ACVL =  0.538267813267813"

It’s not exactly clear what a value >1 represents. It does not signal that the model would fit better without freeing that residual, and it’s hard to imagine that, in this case, 190% of the difference in indicator variances is attributable to residual differences, although I suppose it’s possible. What’s interesting here is that the probable linguistic bias in the RIAS affected the two subtests in opposite directions. The residual bias in the VES consistently increased the variance of the black group. With seemingly equal latent variances, I’m not sure why the residuals don’t explain an even greater proportion of the observed differences. The cause of the observed differences may be something like guessing, but there’s no way to tell here; perhaps I’ll have the opportunity to assess the pseudo-guessing parameter for the relevant items at a later date. These values should be regarded as approximations.
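One way to probe a value above 1, offered as a sketch rather than a settled interpretation: if the loadings are invariant, the difference in indicator variances is the sum of a residual part and a common-factor part, so the two shares add up to 1. A residual share above 1 then implies a negative common-factor share, meaning the sign-consistency assumption behind the proportion does not hold for that indicator. The helper `moes_decompose` below is hypothetical, it treats the squared observed SDs as the model-implied variances, and it uses the same inputs that were passed to MOES() for VRZ and GWH.

```r
# Hypothetical helper: split a group difference in indicator variance into a
# residual part and an implied common-factor part.
# Assumes invariant loadings and that sd^2 equals the model-implied variance.
moes_decompose <- function(theta1, theta2, sd1, sd2) {
  total    <- sd1^2 - sd2^2      # difference in indicator variances
  residual <- theta1 - theta2    # difference in residual variances
  common   <- total - residual   # implied common-factor difference
  list(
    residual_share  = residual / total,
    common_share    = common / total,
    # TRUE only when both parts point the same way as the total difference
    sign_consistent = sign(residual) == sign(total) &&
      sign(common) == sign(total)
  )
}

# VRZ: the indicator whose proportion exceeded 1
vrz <- moes_decompose(42.719, 19.786, 9.61, 8.96)
str(vrz)

# GWH, for comparison
gwh <- moes_decompose(23.969, 55.711, 8.48, 11.28)
str(gwh)
```

Under these assumptions, VRZ is flagged as sign-inconsistent (the implied common-factor part runs opposite to the observed variance difference), while GWH is not. That would make the VRZ value a ratio of offsetting components rather than a proportion, though this reading depends entirely on the assumptions stated above.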

Discussion

The issue of quantifying the effects of biases in the residual variances needs further work. When residual noninvariance is detected in multi-group analyses, it can have very large effects; here, the effect sizes were similar to the ones found by Millsap & Olivera-Aguilar. As a final note, this sort of method can also be applied to other forms of bias and may, in that way, be an interesting addition to computing measurement noninvariance effect sizes for other aspects of these models, like their intercepts.

References

Millsap, R. E., & Olivera-Aguilar, M. (2012). Investigating measurement invariance using confirmatory factor analysis. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 380-392). New York, NY: Guilford Press.