Setup

Rationale

RMSEA is commonly used to discern misspecification and worsened fit in structural equation model selection. In testing latent variable models, RMSEA is the most commonly-reported model fit index (Ropovik, 2016, Table 1). But while RMSEA has obvious utility for detecting misspecification and does not go overboard in calling minor parameter differences misspecifications like a \(\chi^2\) test might be alleged to, it is also weak when it comes in the form of \(\Delta RMSEA\), or \(RMSEA_B - RMSEA_A\).

RMSEA suffers from a number of defects. As described by Savalei, Brace & Fouladi (2021):

When comparing nested models in terms of approximate fit, it is common for researchers to simply report fit indices with each model and to eyeball tehir differences; we call this the “naked eye” approach. It has also become common to take the difference of fit indices, especially in the context of MI assessment (Cheung & Rensvold, 2002). For the RMSEA, the diffference index is defined as: \(\Delta RMSEA = RMSEA_B - RMSEA_A\), (2) where \(RMSEA_A\) and \(RMSEA_B\) are model RMSEAs for Models A and B, respectively. This index has several disadvantages. First, it is not on the same metric as the model RMSEA values (in fact, it is not even guaranteed to be nonnegative), and conventional RMSEA cutoffs do not apply. Limited research exists that has proposed cutoffs specifically for this index. Chen (2007) proposed that \(\Delta RMSEA\) < .01 or .015 (depending on sample size) before constraints in Model B are accepted. The small size of these differences reflects the second disadvantage of \(\Delta RMSEA\), which is that it will often lack sensitivity to detect misfit due to the added constraints. Particularly for models with large initial degrees of freedom, \(RMSEA_A\) and \(RMSEA_B\) will often be quite similar to each other, because any changes in the fit function will be “diluted” by dividing by the overall df in their computation. This situation is exacerbated by the fact that an analytic CI for \(\Delta RMSEA\) is not yet easily available.

Instead of [\(\Delta RMSEA\)], we propose to compute [the] RMSEA associated with the difference test to compare nested models. This computation is given by Browne & Du Toit, 1992: \(RMSEA_D = \sqrt{\frac{D-df_D}{df_D(N-1)}}\), (3) which is a direct generalization of [the equation for RMSEA]. When \(D < df_D\), \(RMSEA_D\) is set to zero….

\(RMSEA_D\) has the more conventional interpretation as the amount of additional misfit introduced by the constraints in Model B. It is on the same metric as the model RMSEA (e.g., it cannot be negative). Because this index is a transformation of the chi-square difference test in the same way the overall model RMSEA is a transformation of the model chi-square test, a simply analytic CI for \(RMSEA_D\) can be obtained using existing methodology.

And

When Model A is true but Model B is false, \(\Delta RMSEA\) is equal to \(RMSEA_B\) in the population, so it is not a test of the specific constraints distinguishing Model B from Model A, and the misfit due to the additional constraints may not be noticeable in the overall value of \(\Delta RMSEA\). On the other hand, \(RMSEA_D\) undoes this dilution by multiplying by a factor of \(\sqrt{df_B/df_D}\), resulting in a more sensitive test of the constraints.

Other approximate fit indices relying on values of \(\chi^2\) like CFI would benefit similarly. For the RMSEA function I wish to give below, which takes one model and compares it to another, we need some terms.

\[D = \chi^2_B - \chi^2_A\]

\[df_D = df_B - df_A\]

And for single-group models,

\[RMSEA_D = \sqrt{\frac{D - df_D}{df_D(N-1)}}\]

And for multiple-group models,

\[RMSEA_{D,MG} = \sqrt{\frac{D-df_D}{df_D(N-N_g)/N_g}}\]

Function and Analyses

First, the function.

library(pacman); p_load(lavaan)

dRMSEA <- function(ModA, ModB){
  require(lavaan)
  ChiA = fitMeasures(ModA, "chisq"); ChiB = fitMeasures(ModB, "chisq")
  dfA = fitMeasures(ModA, "df"); dfB = fitMeasures(ModB, "df")
  D = ChiB - ChiA
  dfD = dfB - dfA
  N = sum(as.numeric(ModA@Data@nobs))
  NG = length(ModA@Data@nobs)
  RMSEAD = ifelse(D < dfD, 0, ifelse(NG > 1, sqrt((D-dfD)/(dfD*(N-1))), sqrt((D-dfD)/(dfD*((N-NG)/NG)))))
  return(cat("RMSEA_D is", as.numeric(RMSEAD), "\n"))}

Next, a test on some data known to be violate measurement invariance because it was experimentally manipulated.

GavaLife.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5'

GavaLifeC <- sem(GavaLife.model, data, std.lv = T, group = "gavagai")
GavaLifeM <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = "loadings")
GavaLifeS <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts"))
GavaLifeR <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals"))
GavaLifeLV <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances"))
GavaLifeLC <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"))
GavaLifeMN <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances", "means"))

round(cbind(Configural = fitMeasures(GavaLifeC, FITM),
            Metric = fitMeasures(GavaLifeM, FITM),
            Scalar = fitMeasures(GavaLifeS, FITM),
            Strict = fitMeasures(GavaLifeR, FITM),
            LV.Vars = fitMeasures(GavaLifeLV, FITM),
            LV.Covars = fitMeasures(GavaLifeLC, FITM),
            Means = fitMeasures(GavaLifeMN, FITM)), 3)

##                Configural    Metric    Scalar    Strict   LV.Vars LV.Covars
## chisq             272.483   302.135   359.628  1039.201  1063.359  1063.360
## df                 68.000    76.000    84.000    94.000    96.000    97.000
## npar               62.000    54.000    46.000    36.000    34.000    33.000
## cfi                 0.983     0.981     0.977     0.922     0.920     0.920
## rmsea               0.063     0.063     0.066     0.116     0.116     0.115
## rmsea.ci.lower      0.056     0.056     0.059     0.109     0.110     0.109
## rmsea.ci.upper      0.071     0.071     0.073     0.122     0.122     0.122
## aic             42887.227 42900.879 42942.372 43601.945 43622.103 43620.104
## bic             43216.647 43187.793 43186.780 43793.221 43802.753 43795.441
##                    Means
## chisq           1175.242
## df                99.000
## npar              31.000
## cfi                0.911
## rmsea              0.120
## rmsea.ci.lower     0.114
## rmsea.ci.upper     0.127
## aic            43727.986
## bic            43892.696

RMSEA clearly failed, except in the strict model. But how does \(RMSEA_D\) perform?

dRMSEA(GavaLifeC, GavaLifeM)

## RMSEA_D is 0.04249152

dRMSEA(GavaLifeM, GavaLifeS)

## RMSEA_D is 0.06424314

dRMSEA(GavaLifeS, GavaLifeR)

## RMSEA_D is 0.2113479

dRMSEA(GavaLifeR, GavaLifeLV)

## RMSEA_D is 0.08597146

dRMSEA(GavaLifeLV, GavaLifeLC)

## RMSEA_D is 0

dRMSEA(GavaLifeLC, GavaLifeMN)

## RMSEA_D is 0.1914462

\(RMSEA_D\) clearly picks up on minor misspecifications better than normal RMSEA. But how about for partially invariant models?

GavaLifeC <- sem(GavaLife.model, data, std.lv = T, group = "gavagai")
GavaLifeM <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = "loadings")
GavaLifeMP <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = "loadings", group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2"))
GavaLifeS <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2"))
GavaLifeSP <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1"))
GavaLifeR <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1"))
GavaLifeRP <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeLV <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeLC <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeMN <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances", "means"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))

GavaLifeMN1.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5

FWB ~ c(0,0)*1'

GavaLifeMN2.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5

MLQS ~ c(0,0)*1'

GavaLifeMNP1 <- sem(GavaLifeMN1.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeMNP2 <- sem(GavaLifeMN2.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))

round(cbind(Configural = fitMeasures(GavaLifeC, FITM),
            Metric = fitMeasures(GavaLifeM, FITM),
            P_Metric = fitMeasures(GavaLifeMP, FITM),
            Scalar = fitMeasures(GavaLifeS, FITM),
            P_Scalar = fitMeasures(GavaLifeSP, FITM),
            Strict = fitMeasures(GavaLifeR, FITM),
            P_Strict = fitMeasures(GavaLifeRP, FITM),
            LV.Vars = fitMeasures(GavaLifeLV, FITM),
            LV.Covars = fitMeasures(GavaLifeLC, FITM),
            Means = fitMeasures(GavaLifeMN, FITM),
            P_Means1 = fitMeasures(GavaLifeMNP1, FITM),
            P_Means2 = fitMeasures(GavaLifeMNP2, FITM)), 3)

##                Configural    Metric  P_Metric    Scalar  P_Scalar    Strict
## chisq             272.483   302.135   278.377   323.664   285.213   965.745
## df                 68.000    76.000    73.000    81.000    79.000    89.000
## npar               62.000    54.000    57.000    49.000    51.000    41.000
## cfi                 0.983     0.981     0.983     0.980     0.983     0.928
## rmsea               0.063     0.063     0.061     0.063     0.059     0.115
## rmsea.ci.lower      0.056     0.056     0.054     0.056     0.052     0.108
## rmsea.ci.upper      0.071     0.071     0.069     0.070     0.066     0.121
## aic             42887.227 42900.879 42883.121 42912.408 42877.957 43538.489
## bic             43216.647 43187.793 43185.975 43172.756 43148.931 43756.331
##                 P_Strict   LV.Vars LV.Covars     Means  P_Means1  P_Means2
## chisq            286.555   290.720   291.036   413.526   291.264   403.427
## df                84.000    86.000    87.000    89.000    88.000    88.000
## npar              46.000    44.000    43.000    41.000    42.000    42.000
## cfi                0.983     0.983     0.983     0.973     0.983     0.974
## rmsea              0.057     0.056     0.056     0.070     0.055     0.069
## rmsea.ci.lower     0.050     0.049     0.049     0.063     0.049     0.062
## rmsea.ci.upper     0.064     0.064     0.063     0.077     0.063     0.076
## aic            42869.299 42869.464 42867.780 42986.270 42866.008 42978.171
## bic            43113.707 43103.245 43096.249 43204.112 43089.163 43201.326

dRMSEA(GavaLifeC, GavaLifeM)

## RMSEA_D is 0.04249152

dRMSEA(GavaLifeC, GavaLifeMP)

## RMSEA_D is 0.01092204

dRMSEA(GavaLifeMP, GavaLifeS)

## RMSEA_D is 0.05576125

dRMSEA(GavaLifeMP, GavaLifeSP)

## RMSEA_D is 0.009640196

dRMSEA(GavaLifeSP, GavaLifeR)

## RMSEA_D is 0.2114993

dRMSEA(GavaLifeSP, GavaLifeRP)

## RMSEA_D is 0

dRMSEA(GavaLifeRP, GavaLifeLV)

## RMSEA_D is 0.02687176

dRMSEA(GavaLifeLV, GavaLifeLC)

## RMSEA_D is 0

dRMSEA(GavaLifeLC, GavaLifeMN)

## RMSEA_D is 0.2004746

dRMSEA(GavaLifeLC, GavaLifeMNP1)

## RMSEA_D is 0

dRMSEA(GavaLifeLC, GavaLifeMNP2)

## RMSEA_D is 0.2725989

So, good, good, good, good. If a proper version of RMSEA had been used, invariance violations in this data would have been detected, as opposed to the RMSEA used. Additionally, a violation with respect to the latent variances would have been noticed. But which latent variance was affected?

GavaLifeLVP1 <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "FWB ~~ FWB"))

GavaLifeLVP2 <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))

fitMeasures(GavaLifeLVP1, FITM); dRMSEA(GavaLifeRP, GavaLifeLVP1)

##          chisq             df           npar            cfi          rmsea 
##        290.714         85.000         45.000          0.983          0.057 
## rmsea.ci.lower rmsea.ci.upper            aic            bic 
##          0.050          0.064      42871.458      43110.552

## RMSEA_D is 0.0459034

fitMeasures(GavaLifeLVP2, FITM); dRMSEA(GavaLifeRP, GavaLifeLVP2)

##          chisq             df           npar            cfi          rmsea 
##        286.557         85.000         45.000          0.983          0.056 
## rmsea.ci.lower rmsea.ci.upper            aic            bic 
##          0.049          0.063      42867.301      43106.396

## RMSEA_D is 0

In this data, this was the theoretically-expected outcome: a difference in the latent variances such that a so-called “nonsense” factor had to have its variance released, but also importantly, removing the constraint on a non-nonsense factor worsened \(RMSEA_D\) whilst failing to even hint towards improving anything else. It likely could not be detected with a simple \(\chi^2\) test because it was too little of an effect, thanks to the generally-desirable fact of homogeneous and high loadings, perhaps combined with limited residual construct differentiation in the samples, but that remains to be assessed. Let’s finish the subsequent models.

GavaLifeLC <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))
GavaLifeMN <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances", "means"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))

GavaLifeMN1.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5

FWB ~ c(0,0)*1'

GavaLifeMN2.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5

MLQS ~ c(0,0)*1'

GavaLifeMNP1 <- sem(GavaLifeMN1.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))
GavaLifeMNP2 <- sem(GavaLifeMN2.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))

round(cbind(P_Strict = fitMeasures(GavaLifeRP, FITM),
            P_LV.V1 = fitMeasures(GavaLifeLVP1, FITM),
            P_LV.V2 = fitMeasures(GavaLifeLVP2, FITM),
            LV.Covars = fitMeasures(GavaLifeLC, FITM),
            Means = fitMeasures(GavaLifeMN, FITM),
            P_Means1 = fitMeasures(GavaLifeMNP1, FITM),
            P_Means2 = fitMeasures(GavaLifeMNP2, FITM)), 3)

##                 P_Strict   P_LV.V1   P_LV.V2 LV.Covars     Means  P_Means1
## chisq            286.555   290.714   286.557   288.129   409.937   288.361
## df                84.000    85.000    85.000    86.000    88.000    87.000
## npar              46.000    45.000    45.000    44.000    42.000    43.000
## cfi                0.983     0.983     0.983     0.983     0.973     0.983
## rmsea              0.057     0.057     0.056     0.056     0.070     0.056
## rmsea.ci.lower     0.050     0.050     0.049     0.049     0.063     0.049
## rmsea.ci.upper     0.064     0.064     0.063     0.063     0.077     0.063
## aic            42869.299 42871.458 42867.301 42866.873 42984.681 42865.105
## bic            43113.707 43110.552 43106.396 43100.655 43207.836 43093.573
##                 P_Means2
## chisq            399.986
## df                87.000
## npar              43.000
## cfi                0.974
## rmsea              0.069
## rmsea.ci.lower     0.062
## rmsea.ci.upper     0.076
## aic            42976.730
## bic            43205.199

dRMSEA(GavaLifeLVP2, GavaLifeLC)

## RMSEA_D is 0.01953851

dRMSEA(GavaLifeLC, GavaLifeMN)

## RMSEA_D is 0.1999069

dRMSEA(GavaLifeLC, GavaLifeMNP1)

## RMSEA_D is 0

dRMSEA(GavaLifeLC, GavaLifeMNP2)

## RMSEA_D is 0.2719452

It works, continuously, as a sensitive index of misfit, although it needs used better than the simple >0.015 = noninvariance that I’ve used here.

But what about CFI? CFI involves a restrictive null model and the formula for CFI is

\[\frac{(\chi^2-df)_{null}-(\chi^2-df)_{proposed}}{(\chi^2-df)_{null}}\]

But, as Savalei, Brace & Fouladi (2021) noted,

Unfortunately, a solution parallel to \(RMSEA_D\) to replace \(\Delta CFI\) does not help very much even when \(CFI_A\) and \(CFI_B\) are correctly computed so that the baseline model for each is nested within Models A and B. Suppose we make further modification so that the baseline model used for \(CFI_A\) and \(CFI_B\) is the same, and it is properly nested within Models A and B. More generally, when comparing a series of nested models, the researcher would pick one baseline model that is nested within all. In this case, \(\Delta CFI < 0\), and we can define \(CFI_D = 1-\frac{T_D - df_D}{T_N - df_N} = 1 + \Delta CFI\). This index is a trivial transformation of \(\Delta CFI\) and will not gain greater sensitivity (althoguh existing studies of \(\Delta CFI\) did not adjust the baseline model). However, it is now on the metric of the CFI, in that larger values indicate better fit, and it is bound to the [0,1] interval, so it does represent a small improvement over \(\Delta CFI\).

A better solution that is possible in some cases is to change the baseline model to result in a more sensitive relevant comparison. In the context where Model A is a measurement model (i.e., the structural part is saturated) and Model B imposes constraints on relationships among the latent variables, one can define a “structural null” model where the latent variables are unrealted to each other, and other comparative fit indices become possible. However, such a model is not easily generalizable to other nested model comparisons. [Given the original conceptualization of the CFI already involves a comparison of nested models (a proposed model and a very restrictive baseline model), a CFI-like index for a nested model comparison of Models A and B can be computed as \((T_A - df_A)/(T_B-df_B)\), which would be close to 1 when the additional constraints introduced by Model B do not introduce much more misfit relative to Model A. But this index is based on the same information as \(RMSEA_D\), capturing the change in fit on a percentage instead of a difference metric, and thus it departs in spirit somewhat from the overall model RMSEA and CFI, which are based on different information….] Overall, when comparing nested models, an index of comparative fit that compares each model to a baseline model may just not be the right tool in the evaluation of the specific constraints introduced by the more nested model. Because the existing approach of reporting \(\Delta CFI\) lacks sensitivity, whether researchers report it or not alongside \(RMSEA_D\) may not make a material difference.

References

Ropovik, I. (2015). A cautionary note on testing latent variable models. Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.01715

Savalei, V., Brace, J. C., & Fouladi, R. T. (2021). We need to change how we compute RMSEA for nested model comparisons in structural equation modeling. OSF. https://osf.io/2ycd6/

Postscript

We can do this like Savalei, Brace & Fouladi (2021), using their code (from https://osf.io/5ambq/).

#RMSEA CI and noncentrality functions 

RMSEA.CI<-function(T,df,N,G){
  
#functions taken from lavaan (lav_fit_measures.R)
lower.lambda <- function(lambda) {
  (pchisq(T, df=df, ncp=lambda) - 0.95)
}
upper.lambda <- function(lambda) {
  (pchisq(T, df=df, ncp=lambda) - 0.05)
}

#RMSEA CI
lambda.l <- try(uniroot(f=lower.lambda, lower=0, upper=T)$root,silent=TRUE) 
if(inherits(lambda.l, "try-error")) { lambda.l <- NA; RMSEA.CI.l<-NA 
} else { if(lambda.l<0){
  RMSEA.CI.l=0
} else {
  RMSEA.CI.l<-sqrt(lambda.l*G/((N-1)*df))
}
}

N.RMSEA <- max(N, T*4) 
lambda.u <- try(uniroot(f=upper.lambda, lower=0,upper=N.RMSEA)$root,silent=TRUE)
if(inherits(lambda.u, "try-error")) { lambda.u <- NA; RMSEA.CI.u<-NA 
} else { if(lambda.u<0){
  RMSEA.CI.u=0
} else {
  RMSEA.CI.u<-sqrt(lambda.u*G/((N-1)*df))
}
}
RMSEA.CI<-c(RMSEA.CI.l,RMSEA.CI.u)
return(RMSEA.CI)
}

#p-values associated with the critical values of the RMSEA for most common cutoffs
pvals<-function(T,df,N,G){
  RMSEA0<-c(0,.01,.05,.06,.08,.1)
  eps0<-df*RMSEA0^2/G 
  nonc<-eps0*(N-G) 
  pvals<-pchisq(T,df=df,ncp=nonc)
  names(pvals)<-c("RMSEA>0","RMSEA>.01","RMSEA>.05","RMSEA>.06","RMSEA>.08","RMSEA>.10")
  return(pvals)
}

N<-1500
G<-2 

(RMSEA1<-fitmeasures(GavaLifeC)["rmsea"])

##     rmsea 
## 0.0633204

(df1<-fitmeasures(GavaLifeC)["df"])

## df 
## 68

(T1<-fitmeasures(GavaLifeC)["chisq"])

##    chisq 
## 272.4831

(RMSEA2<-fitmeasures(GavaLifeM)["rmsea"])

##      rmsea 
## 0.06298636

(df2<-fitmeasures(GavaLifeM)["df"])

## df 
## 76

(T2<-fitmeasures(GavaLifeM)["chisq"])

##   chisq 
## 302.135

(RMSEA3<-fitmeasures(GavaLifeMP)["rmsea"])

##     rmsea 
## 0.0612469

(df3<-fitmeasures(GavaLifeMP)["df"])

## df 
## 73

(T3<-fitmeasures(GavaLifeMP)["chisq"])

##    chisq 
## 278.3772

(RMSEA4<-fitmeasures(GavaLifeS)["rmsea"])

##      rmsea 
## 0.06320183

(df4<-fitmeasures(GavaLifeS)["df"])

## df 
## 81

(T4<-fitmeasures(GavaLifeS)["chisq"])

##    chisq 
## 323.6642

(RMSEA5<-fitmeasures(GavaLifeSP)["rmsea"])

##      rmsea 
## 0.05899482

(df5<-fitmeasures(GavaLifeSP)["df"])

## df 
## 79

(T5<-fitmeasures(GavaLifeSP)["chisq"])

##    chisq 
## 285.2131

(RMSEA6<-fitmeasures(GavaLifeR)["rmsea"])

##    rmsea 
## 0.114607

(df6<-fitmeasures(GavaLifeR)["df"])

## df 
## 89

(T6<-fitmeasures(GavaLifeR)["chisq"])

##    chisq 
## 965.7449

(RMSEA7<-fitmeasures(GavaLifeRP)["rmsea"])

##      rmsea 
## 0.05670236

(df7<-fitmeasures(GavaLifeRP)["df"])

## df 
## 84

(T7<-fitmeasures(GavaLifeRP)["chisq"])

##    chisq 
## 286.5549

(RMSEA8<-fitmeasures(GavaLifeLV)["rmsea"])

##      rmsea 
## 0.05633782

(df8<-fitmeasures(GavaLifeLV)["df"])

## df 
## 86

(T8<-fitmeasures(GavaLifeLV)["chisq"])

##    chisq 
## 290.7198

(RMSEA9<-fitmeasures(GavaLifeLVP1)["rmsea"])

##      rmsea 
## 0.05680562

(df9<-fitmeasures(GavaLifeLVP1)["df"])

## df 
## 85

(T9<-fitmeasures(GavaLifeLVP1)["chisq"])

##    chisq 
## 290.7135

(RMSEA10<-fitmeasures(GavaLifeLVP2)["rmsea"])

##      rmsea 
## 0.05622875

(df10<-fitmeasures(GavaLifeLVP2)["df"])

## df 
## 85

(T10<-fitmeasures(GavaLifeLVP2)["chisq"])

##    chisq 
## 286.5566

(RMSEA11<-fitmeasures(GavaLifeLC)["rmsea"])

##      rmsea 
## 0.05598019

(df11<-fitmeasures(GavaLifeLC)["df"])

## df 
## 86

(T11<-fitmeasures(GavaLifeLC)["chisq"])

##    chisq 
## 288.1289

(RMSEA12<-fitmeasures(GavaLifeMN)["rmsea"])

##      rmsea 
## 0.06984151

(df12<-fitmeasures(GavaLifeMN)["df"])

## df 
## 88

(T12<-fitmeasures(GavaLifeMN)["chisq"])

##    chisq 
## 409.9372

(RMSEA13<-fitmeasures(GavaLifeMNP1)["rmsea"])

##     rmsea 
## 0.0555517

(df13<-fitmeasures(GavaLifeMNP1)["df"])

## df 
## 87

(T13<-fitmeasures(GavaLifeMNP1)["chisq"])

##    chisq 
## 288.3609

(RMSEA14<-fitmeasures(GavaLifeMNP2)["rmsea"])

##      rmsea 
## 0.06925838

(df14<-fitmeasures(GavaLifeMNP2)["df"])

## df 
## 87

(T14<-fitmeasures(GavaLifeMNP2)["chisq"])

##    chisq 
## 399.9862

ModelRMSEAs<-c(RMSEA1,RMSEA2,RMSEA3,RMSEA4,RMSEA5,RMSEA6,RMSEA7,RMSEA8,RMSEA9,RMSEA10,RMSEA11,RMSEA12,RMSEA13,RMSEA14)
RMSEA.CI(T=T1,df=df1,N=N,G=2)

##         df         df 
## 0.05557577 0.07130340

deltaRMSEA<-(c(ModelRMSEAs,0)-c(0,ModelRMSEAs))[2:14]

tests<-lavTestLRT(GavaLifeC,GavaLifeM,GavaLifeMP,GavaLifeS,GavaLifeSP,GavaLifeR,GavaLifeRP,GavaLifeLV,GavaLifeLVP1,GavaLifeLVP2,GavaLifeLC,GavaLifeMN,GavaLifeMNP1,GavaLifeMNP2)

## Warning in lavTestLRT(GavaLifeC, GavaLifeM, GavaLifeMP, GavaLifeS, GavaLifeSP, : lavaan WARNING: some restricted models fit better than less 
##   restricted models; either these models are not nested, or
##   the less restricted model failed to reach a global optimum.

## Warning in lavTestLRT(GavaLifeC, GavaLifeM, GavaLifeMP, GavaLifeS, GavaLifeSP, :
## lavaan WARNING: some models have the same degrees of freedom

Td=tests[2:14,5]
Td

##  [1]   5.8940852  23.7578201 -16.9219774  38.4510853 -37.1092195   4.1585766
##  [7]  -4.1568665   4.1631199  -2.5908718   0.2320148 111.6252959   9.9510453
## [13] 555.8076073

dfd=tests[2:14,6]
dfd

##  [1] 5 3 3 2 3 1 0 1 0 1 0 1 1

lambda<-Td-dfd
ld<-lambda/dfd

RMSEAD<- ifelse(ld > 0, sqrt((ld)*G/(N-G)), 0)

## Warning in sqrt((ld) * G/(N - G)): NaNs produced

RMSEAD

##  [1] 0.01545125 0.09611459 0.00000000 0.15599092 0.00000000 0.06493888
##  [7] 0.00000000 0.06498557 0.00000000 0.00000000        Inf 0.10931908
## [13] 0.86065738

cat("Configural \n"); RMSEA.CI(T=Td[1],df=dfd[1],N=N,G=2)

## Configural

## [1]         NA 0.05488264

cat("Metric \n"); RMSEA.CI(T=Td[2],df=dfd[2],N=N,G=2)

## Metric

## [1] 0.06255765 0.13367440

cat("Partial Metric \n"); RMSEA.CI(T=Td[3],df=dfd[3],N=N,G=2)

## Partial Metric

## [1] NA NA

cat("Scalar \n"); RMSEA.CI(T=Td[4],df=dfd[4],N=N,G=2)

## Scalar

## [1] 0.1152169 0.2007811

cat("Partial Scalar \n"); RMSEA.CI(T=Td[5],df=dfd[5],N=N,G=2)

## Partial Scalar

## [1] NA NA

cat("Strict \n"); RMSEA.CI(T=Td[6],df=dfd[6],N=N,G=2)

## Strict

## [1] 0.01056336 0.13456948

cat("Partial Strict \n"); RMSEA.CI(T=Td[7],df=dfd[7],N=N,G=2)

## Partial Strict

## [1] NA NA

cat("Latent Variances \n"); RMSEA.CI(T=Td[8],df=dfd[8],N=N,G=2)

## Latent Variances

## [1] 0.01064001 0.13461046

cat("FWB Latent Variance \n"); RMSEA.CI(T=Td[9],df=dfd[9],N=N,G=2)

## FWB Latent Variance

## [1] NA NA

cat("MLQS Latent Variance \n"); RMSEA.CI(T=Td[10],df=dfd[10],N=N,G=2)

## MLQS Latent Variance

## [1]         NA 0.07589186

cat("Latent Covariances \n"); RMSEA.CI(T=Td[11],df=dfd[11],N=N,G=2)

## Latent Covariances

## [1] Inf Inf

cat("Means \n"); RMSEA.CI(T=Td[12],df=dfd[12],N=N,G=2)

## Means

## [1] 0.05514353 0.17530719

cat("MLSQ Mean \n"); RMSEA.CI(T=Td[13],df=dfd[13],N=N,G=2)

## MLSQ Mean

## [1] 0.8010637 0.9212269

cat("FWB Mean \n"); RMSEA.CI(T=Td[14],df=dfd[14],N=N,G=2)

## FWB Mean

## [1] NA NA

round(pvals(T=Td[1],df=dfd[1],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##     0.683     0.640     0.084     0.027     0.001     0.000

round(pvals(T=Td[2],df=dfd[2],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##     1.000     1.000     0.987     0.961     0.802     0.470

round(pvals(T=Td[3],df=dfd[3],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##         0         0         0         0         0         0

round(pvals(T=Td[4],df=dfd[4],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##     1.000     1.000     1.000     1.000     0.999     0.987

round(pvals(T=Td[5],df=dfd[5],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##         0         0         0         0         0         0

round(pvals(T=Td[6],df=dfd[6],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##     0.959     0.951     0.749     0.654     0.440     0.243

round(pvals(T=Td[7],df=dfd[7],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##         0         0         0         0         0         0

round(pvals(T=Td[8],df=dfd[8],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##     0.959     0.951     0.749     0.655     0.441     0.243

round(pvals(T=Td[9],df=dfd[9],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##         0         0         0         0         0         0

round(pvals(T=Td[10],df=dfd[10],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##     0.370     0.357     0.155     0.106     0.040     0.011

round(pvals(T=Td[11],df=dfd[11],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##         1         1         1         1         1         1

round(pvals(T=Td[12],df=dfd[12],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##     0.998     0.998     0.963     0.935     0.833     0.662

round(pvals(T=Td[13],df=dfd[13],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##         1         1         1         1         1         1

round(pvals(T=Td[14],df=dfd[14],N=N,G=2),3)

##   RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10 
##        NA        NA        NA        NA        NA        NA

95% CIs correctly identify the same fit issues as before: metric, scalar, strict, latent variance, and mean invariance violations, and they are on the same parameters as before, such that the violations come from the factor that is known to have been experimentally manipulated into being wrong.

sessionInfo()

## R version 4.1.2 (2021-11-01)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] lavaan_0.6-9          pacman_0.5.1          kirkegaard_2022-04-12
##  [4] rlang_0.4.12          metafor_3.0-2         Matrix_1.3-4         
##  [7] psych_2.1.9           magrittr_2.0.1        assertthat_0.2.1     
## [10] weights_1.0.4         Hmisc_4.6-0           Formula_1.2-4        
## [13] survival_3.2-13       lattice_0.20-45       forcats_0.5.1        
## [16] stringr_1.4.0         dplyr_1.0.7           purrr_0.3.4          
## [19] readr_2.0.2           tidyr_1.1.4           tibble_3.1.5         
## [22] ggplot2_3.3.5         tidyverse_1.3.1      
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-153        fs_1.5.2            lubridate_1.8.0    
##  [4] RColorBrewer_1.1-2  httr_1.4.2          tools_4.1.2        
##  [7] backports_1.3.0     bslib_0.3.1         utf8_1.2.2         
## [10] R6_2.5.1            rpart_4.1-15        DBI_1.1.1          
## [13] colorspace_2.0-2    nnet_7.3-16         withr_2.4.3        
## [16] mnormt_2.0.2        tidyselect_1.1.1    gridExtra_2.3      
## [19] compiler_4.1.2      cli_3.1.0           rvest_1.0.2        
## [22] htmlTable_2.3.0     mice_3.14.0         xml2_1.3.3         
## [25] sass_0.4.0          scales_1.1.1        checkmate_2.0.0    
## [28] pbivnorm_0.6.0      digest_0.6.28       minqa_1.2.4        
## [31] foreign_0.8-81      rmarkdown_2.11      base64enc_0.1-3    
## [34] jpeg_0.1-9          pkgconfig_2.0.3     htmltools_0.5.2    
## [37] lme4_1.1-27.1       dbplyr_2.1.1        fastmap_1.1.0      
## [40] htmlwidgets_1.5.4   readxl_1.3.1        rstudioapi_0.13    
## [43] jquerylib_0.1.4     generics_0.1.1      jsonlite_1.7.2     
## [46] gtools_3.9.2        Rcpp_1.0.7          munsell_0.5.0      
## [49] fansi_0.5.0         lifecycle_1.0.1     stringi_1.7.5      
## [52] yaml_2.2.1          mathjaxr_1.4-0      MASS_7.3-54        
## [55] grid_4.1.2          parallel_4.1.2      gdata_2.18.0       
## [58] crayon_1.4.2        haven_2.4.3         splines_4.1.2      
## [61] hms_1.1.1           tmvnsim_1.0-2       knitr_1.36         
## [64] pillar_1.6.4        boot_1.3-28         stats4_4.1.2       
## [67] reprex_2.0.1        glue_1.4.2          evaluate_0.14      
## [70] latticeExtra_0.6-29 data.table_1.14.2   modelr_0.1.8       
## [73] nloptr_1.2.2.2      png_0.1-7           vctrs_0.3.8        
## [76] tzdb_0.2.0          cellranger_1.1.0    gtable_0.3.0       
## [79] xfun_0.27           broom_0.7.10        cluster_2.1.2      
## [82] ellipsis_0.3.2

Difference-\(RMSEA\) versus \(RMSEA_D\)

Setup

Rationale

Function and Analyses

References

Postscript