RMSEA is commonly used to discern misspecification and worsened fit in structural equation model selection. In testing latent variable models, RMSEA is the most commonly-reported model fit index (Ropovik, 2016, Table 1). But while RMSEA has obvious utility for detecting misspecification and does not go overboard in calling minor parameter differences misspecifications like a \(\chi^2\) test might be alleged to, it is also weak when it comes in the form of \(\Delta RMSEA\), or \(RMSEA_B - RMSEA_A\).
RMSEA suffers from a number of defects. As described by Savalei, Brace & Fouladi (2021):
When comparing nested models in terms of approximate fit, it is common for researchers to simply report fit indices with each model and to eyeball tehir differences; we call this the “naked eye” approach. It has also become common to take the difference of fit indices, especially in the context of MI assessment (Cheung & Rensvold, 2002). For the RMSEA, the diffference index is defined as: \(\Delta RMSEA = RMSEA_B - RMSEA_A\), (2) where \(RMSEA_A\) and \(RMSEA_B\) are model RMSEAs for Models A and B, respectively. This index has several disadvantages. First, it is not on the same metric as the model RMSEA values (in fact, it is not even guaranteed to be nonnegative), and conventional RMSEA cutoffs do not apply. Limited research exists that has proposed cutoffs specifically for this index. Chen (2007) proposed that \(\Delta RMSEA\) < .01 or .015 (depending on sample size) before constraints in Model B are accepted. The small size of these differences reflects the second disadvantage of \(\Delta RMSEA\), which is that it will often lack sensitivity to detect misfit due to the added constraints. Particularly for models with large initial degrees of freedom, \(RMSEA_A\) and \(RMSEA_B\) will often be quite similar to each other, because any changes in the fit function will be “diluted” by dividing by the overall df in their computation. This situation is exacerbated by the fact that an analytic CI for \(\Delta RMSEA\) is not yet easily available.
Instead of [\(\Delta RMSEA\)], we propose to compute [the] RMSEA associated with the difference test to compare nested models. This computation is given by Browne & Du Toit, 1992: \(RMSEA_D = \sqrt{\frac{D-df_D}{df_D(N-1)}}\), (3) which is a direct generalization of [the equation for RMSEA]. When \(D < df_D\), \(RMSEA_D\) is set to zero….
\(RMSEA_D\) has the more conventional interpretation as the amount of additional misfit introduced by the constraints in Model B. It is on the same metric as the model RMSEA (e.g., it cannot be negative). Because this index is a transformation of the chi-square difference test in the same way the overall model RMSEA is a transformation of the model chi-square test, a simply analytic CI for \(RMSEA_D\) can be obtained using existing methodology.
And
When Model A is true but Model B is false, \(\Delta RMSEA\) is equal to \(RMSEA_B\) in the population, so it is not a test of the specific constraints distinguishing Model B from Model A, and the misfit due to the additional constraints may not be noticeable in the overall value of \(\Delta RMSEA\). On the other hand, \(RMSEA_D\) undoes this dilution by multiplying by a factor of \(\sqrt{df_B/df_D}\), resulting in a more sensitive test of the constraints.
Other approximate fit indices relying on values of \(\chi^2\) like CFI would benefit similarly. For the RMSEA function I wish to give below, which takes one model and compares it to another, we need some terms.
\[D = \chi^2_B - \chi^2_A\]
\[df_D = df_B - df_A\]
And for single-group models,
\[RMSEA_D = \sqrt{\frac{D - df_D}{df_D(N-1)}}\]
And for multiple-group models,
\[RMSEA_{D,MG} = \sqrt{\frac{D-df_D}{df_D(N-N_g)/N_g}}\]
First, the function.
library(pacman); p_load(lavaan)
dRMSEA <- function(ModA, ModB){
require(lavaan)
ChiA = fitMeasures(ModA, "chisq"); ChiB = fitMeasures(ModB, "chisq")
dfA = fitMeasures(ModA, "df"); dfB = fitMeasures(ModB, "df")
D = ChiB - ChiA
dfD = dfB - dfA
N = sum(as.numeric(ModA@Data@nobs))
NG = length(ModA@Data@nobs)
RMSEAD = ifelse(D < dfD, 0, ifelse(NG > 1, sqrt((D-dfD)/(dfD*(N-1))), sqrt((D-dfD)/(dfD*((N-NG)/NG)))))
return(cat("RMSEA_D is", as.numeric(RMSEAD), "\n"))}
Next, a test on some data known to be violate measurement invariance because it was experimentally manipulated.
GavaLife.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5'
GavaLifeC <- sem(GavaLife.model, data, std.lv = T, group = "gavagai")
GavaLifeM <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = "loadings")
GavaLifeS <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts"))
GavaLifeR <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals"))
GavaLifeLV <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances"))
GavaLifeLC <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"))
GavaLifeMN <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances", "means"))
round(cbind(Configural = fitMeasures(GavaLifeC, FITM),
Metric = fitMeasures(GavaLifeM, FITM),
Scalar = fitMeasures(GavaLifeS, FITM),
Strict = fitMeasures(GavaLifeR, FITM),
LV.Vars = fitMeasures(GavaLifeLV, FITM),
LV.Covars = fitMeasures(GavaLifeLC, FITM),
Means = fitMeasures(GavaLifeMN, FITM)), 3)
## Configural Metric Scalar Strict LV.Vars LV.Covars
## chisq 272.483 302.135 359.628 1039.201 1063.359 1063.360
## df 68.000 76.000 84.000 94.000 96.000 97.000
## npar 62.000 54.000 46.000 36.000 34.000 33.000
## cfi 0.983 0.981 0.977 0.922 0.920 0.920
## rmsea 0.063 0.063 0.066 0.116 0.116 0.115
## rmsea.ci.lower 0.056 0.056 0.059 0.109 0.110 0.109
## rmsea.ci.upper 0.071 0.071 0.073 0.122 0.122 0.122
## aic 42887.227 42900.879 42942.372 43601.945 43622.103 43620.104
## bic 43216.647 43187.793 43186.780 43793.221 43802.753 43795.441
## Means
## chisq 1175.242
## df 99.000
## npar 31.000
## cfi 0.911
## rmsea 0.120
## rmsea.ci.lower 0.114
## rmsea.ci.upper 0.127
## aic 43727.986
## bic 43892.696
RMSEA clearly failed, except in the strict model. But how does \(RMSEA_D\) perform?
dRMSEA(GavaLifeC, GavaLifeM)
## RMSEA_D is 0.04249152
dRMSEA(GavaLifeM, GavaLifeS)
## RMSEA_D is 0.06424314
dRMSEA(GavaLifeS, GavaLifeR)
## RMSEA_D is 0.2113479
dRMSEA(GavaLifeR, GavaLifeLV)
## RMSEA_D is 0.08597146
dRMSEA(GavaLifeLV, GavaLifeLC)
## RMSEA_D is 0
dRMSEA(GavaLifeLC, GavaLifeMN)
## RMSEA_D is 0.1914462
\(RMSEA_D\) clearly picks up on minor misspecifications better than normal RMSEA. But how about for partially invariant models?
GavaLifeC <- sem(GavaLife.model, data, std.lv = T, group = "gavagai")
GavaLifeM <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = "loadings")
GavaLifeMP <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = "loadings", group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2"))
GavaLifeS <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2"))
GavaLifeSP <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1"))
GavaLifeR <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1"))
GavaLifeRP <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeLV <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeLC <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeMN <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances", "means"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeMN1.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5
FWB ~ c(0,0)*1'
GavaLifeMN2.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5
MLQS ~ c(0,0)*1'
GavaLifeMNP1 <- sem(GavaLifeMN1.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
GavaLifeMNP2 <- sem(GavaLifeMN2.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4"))
round(cbind(Configural = fitMeasures(GavaLifeC, FITM),
Metric = fitMeasures(GavaLifeM, FITM),
P_Metric = fitMeasures(GavaLifeMP, FITM),
Scalar = fitMeasures(GavaLifeS, FITM),
P_Scalar = fitMeasures(GavaLifeSP, FITM),
Strict = fitMeasures(GavaLifeR, FITM),
P_Strict = fitMeasures(GavaLifeRP, FITM),
LV.Vars = fitMeasures(GavaLifeLV, FITM),
LV.Covars = fitMeasures(GavaLifeLC, FITM),
Means = fitMeasures(GavaLifeMN, FITM),
P_Means1 = fitMeasures(GavaLifeMNP1, FITM),
P_Means2 = fitMeasures(GavaLifeMNP2, FITM)), 3)
## Configural Metric P_Metric Scalar P_Scalar Strict
## chisq 272.483 302.135 278.377 323.664 285.213 965.745
## df 68.000 76.000 73.000 81.000 79.000 89.000
## npar 62.000 54.000 57.000 49.000 51.000 41.000
## cfi 0.983 0.981 0.983 0.980 0.983 0.928
## rmsea 0.063 0.063 0.061 0.063 0.059 0.115
## rmsea.ci.lower 0.056 0.056 0.054 0.056 0.052 0.108
## rmsea.ci.upper 0.071 0.071 0.069 0.070 0.066 0.121
## aic 42887.227 42900.879 42883.121 42912.408 42877.957 43538.489
## bic 43216.647 43187.793 43185.975 43172.756 43148.931 43756.331
## P_Strict LV.Vars LV.Covars Means P_Means1 P_Means2
## chisq 286.555 290.720 291.036 413.526 291.264 403.427
## df 84.000 86.000 87.000 89.000 88.000 88.000
## npar 46.000 44.000 43.000 41.000 42.000 42.000
## cfi 0.983 0.983 0.983 0.973 0.983 0.974
## rmsea 0.057 0.056 0.056 0.070 0.055 0.069
## rmsea.ci.lower 0.050 0.049 0.049 0.063 0.049 0.062
## rmsea.ci.upper 0.064 0.064 0.063 0.077 0.063 0.076
## aic 42869.299 42869.464 42867.780 42986.270 42866.008 42978.171
## bic 43113.707 43103.245 43096.249 43204.112 43089.163 43201.326
dRMSEA(GavaLifeC, GavaLifeM)
## RMSEA_D is 0.04249152
dRMSEA(GavaLifeC, GavaLifeMP)
## RMSEA_D is 0.01092204
dRMSEA(GavaLifeMP, GavaLifeS)
## RMSEA_D is 0.05576125
dRMSEA(GavaLifeMP, GavaLifeSP)
## RMSEA_D is 0.009640196
dRMSEA(GavaLifeSP, GavaLifeR)
## RMSEA_D is 0.2114993
dRMSEA(GavaLifeSP, GavaLifeRP)
## RMSEA_D is 0
dRMSEA(GavaLifeRP, GavaLifeLV)
## RMSEA_D is 0.02687176
dRMSEA(GavaLifeLV, GavaLifeLC)
## RMSEA_D is 0
dRMSEA(GavaLifeLC, GavaLifeMN)
## RMSEA_D is 0.2004746
dRMSEA(GavaLifeLC, GavaLifeMNP1)
## RMSEA_D is 0
dRMSEA(GavaLifeLC, GavaLifeMNP2)
## RMSEA_D is 0.2725989
So, good, good, good, good. If a proper version of RMSEA had been used, invariance violations in this data would have been detected, as opposed to the RMSEA used. Additionally, a violation with respect to the latent variances would have been noticed. But which latent variance was affected?
GavaLifeLVP1 <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "FWB ~~ FWB"))
GavaLifeLVP2 <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))
fitMeasures(GavaLifeLVP1, FITM); dRMSEA(GavaLifeRP, GavaLifeLVP1)
## chisq df npar cfi rmsea
## 290.714 85.000 45.000 0.983 0.057
## rmsea.ci.lower rmsea.ci.upper aic bic
## 0.050 0.064 42871.458 43110.552
## RMSEA_D is 0.0459034
fitMeasures(GavaLifeLVP2, FITM); dRMSEA(GavaLifeRP, GavaLifeLVP2)
## chisq df npar cfi rmsea
## 286.557 85.000 45.000 0.983 0.056
## rmsea.ci.lower rmsea.ci.upper aic bic
## 0.049 0.063 42867.301 43106.396
## RMSEA_D is 0
In this data, this was the theoretically-expected outcome: a difference in the latent variances such that a so-called “nonsense” factor had to have its variance released, but also importantly, removing the constraint on a non-nonsense factor worsened \(RMSEA_D\) whilst failing to even hint towards improving anything else. It likely could not be detected with a simple \(\chi^2\) test because it was too little of an effect, thanks to the generally-desirable fact of homogeneous and high loadings, perhaps combined with limited residual construct differentiation in the samples, but that remains to be assessed. Let’s finish the subsequent models.
GavaLifeLC <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))
GavaLifeMN <- sem(GavaLife.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances", "means"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))
GavaLifeMN1.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5
FWB ~ c(0,0)*1'
GavaLifeMN2.model <- '
FWB =~ fwb1 + fwb2 + fwb3 + fwb4 + fwb5
MLQS =~ mlqs1 + mlqs2 + mlqs3 + mlqs4 + mlqs5
MLQS ~ c(0,0)*1'
GavaLifeMNP1 <- sem(GavaLifeMN1.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))
GavaLifeMNP2 <- sem(GavaLifeMN2.model, data, std.lv = F, group = "gavagai", group.equal = c("loadings", "intercepts", "residuals", "lv.variances", "lv.covariances"), group.partial = c("MLQS =~ mlqs4", "MLQS =~ mlqs5", "MLQS =~ mlqs2", "mlqs5 ~ 1", "mlqs1 ~ 1", "mlqs3 ~~ mlqs3", "mlqs1 ~~ mlqs1", "mlqs2 ~~ mlqs2", "mlqs5 ~~ mlqs5", "mlqs4 ~~ mlqs4", "MLQS ~~ MLQS"))
round(cbind(P_Strict = fitMeasures(GavaLifeRP, FITM),
P_LV.V1 = fitMeasures(GavaLifeLVP1, FITM),
P_LV.V2 = fitMeasures(GavaLifeLVP2, FITM),
LV.Covars = fitMeasures(GavaLifeLC, FITM),
Means = fitMeasures(GavaLifeMN, FITM),
P_Means1 = fitMeasures(GavaLifeMNP1, FITM),
P_Means2 = fitMeasures(GavaLifeMNP2, FITM)), 3)
## P_Strict P_LV.V1 P_LV.V2 LV.Covars Means P_Means1
## chisq 286.555 290.714 286.557 288.129 409.937 288.361
## df 84.000 85.000 85.000 86.000 88.000 87.000
## npar 46.000 45.000 45.000 44.000 42.000 43.000
## cfi 0.983 0.983 0.983 0.983 0.973 0.983
## rmsea 0.057 0.057 0.056 0.056 0.070 0.056
## rmsea.ci.lower 0.050 0.050 0.049 0.049 0.063 0.049
## rmsea.ci.upper 0.064 0.064 0.063 0.063 0.077 0.063
## aic 42869.299 42871.458 42867.301 42866.873 42984.681 42865.105
## bic 43113.707 43110.552 43106.396 43100.655 43207.836 43093.573
## P_Means2
## chisq 399.986
## df 87.000
## npar 43.000
## cfi 0.974
## rmsea 0.069
## rmsea.ci.lower 0.062
## rmsea.ci.upper 0.076
## aic 42976.730
## bic 43205.199
dRMSEA(GavaLifeLVP2, GavaLifeLC)
## RMSEA_D is 0.01953851
dRMSEA(GavaLifeLC, GavaLifeMN)
## RMSEA_D is 0.1999069
dRMSEA(GavaLifeLC, GavaLifeMNP1)
## RMSEA_D is 0
dRMSEA(GavaLifeLC, GavaLifeMNP2)
## RMSEA_D is 0.2719452
It works, continuously, as a sensitive index of misfit, although it needs used better than the simple >0.015 = noninvariance that I’ve used here.
But what about CFI? CFI involves a restrictive null model and the formula for CFI is
\[\frac{(\chi^2-df)_{null}-(\chi^2-df)_{proposed}}{(\chi^2-df)_{null}}\]
But, as Savalei, Brace & Fouladi (2021) noted,
Unfortunately, a solution parallel to \(RMSEA_D\) to replace \(\Delta CFI\) does not help very much even when \(CFI_A\) and \(CFI_B\) are correctly computed so that the baseline model for each is nested within Models A and B. Suppose we make further modification so that the baseline model used for \(CFI_A\) and \(CFI_B\) is the same, and it is properly nested within Models A and B. More generally, when comparing a series of nested models, the researcher would pick one baseline model that is nested within all. In this case, \(\Delta CFI < 0\), and we can define \(CFI_D = 1-\frac{T_D - df_D}{T_N - df_N} = 1 + \Delta CFI\). This index is a trivial transformation of \(\Delta CFI\) and will not gain greater sensitivity (althoguh existing studies of \(\Delta CFI\) did not adjust the baseline model). However, it is now on the metric of the CFI, in that larger values indicate better fit, and it is bound to the [0,1] interval, so it does represent a small improvement over \(\Delta CFI\).
A better solution that is possible in some cases is to change the baseline model to result in a more sensitive relevant comparison. In the context where Model A is a measurement model (i.e., the structural part is saturated) and Model B imposes constraints on relationships among the latent variables, one can define a “structural null” model where the latent variables are unrealted to each other, and other comparative fit indices become possible. However, such a model is not easily generalizable to other nested model comparisons. [Given the original conceptualization of the CFI already involves a comparison of nested models (a proposed model and a very restrictive baseline model), a CFI-like index for a nested model comparison of Models A and B can be computed as \((T_A - df_A)/(T_B-df_B)\), which would be close to 1 when the additional constraints introduced by Model B do not introduce much more misfit relative to Model A. But this index is based on the same information as \(RMSEA_D\), capturing the change in fit on a percentage instead of a difference metric, and thus it departs in spirit somewhat from the overall model RMSEA and CFI, which are based on different information….] Overall, when comparing nested models, an index of comparative fit that compares each model to a baseline model may just not be the right tool in the evaluation of the specific constraints introduced by the more nested model. Because the existing approach of reporting \(\Delta CFI\) lacks sensitivity, whether researchers report it or not alongside \(RMSEA_D\) may not make a material difference.
Ropovik, I. (2015). A cautionary note on testing latent variable models. Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.01715
Savalei, V., Brace, J. C., & Fouladi, R. T. (2021). We need to change how we compute RMSEA for nested model comparisons in structural equation modeling. OSF. https://osf.io/2ycd6/
We can do this like Savalei, Brace & Fouladi (2021), using their code (from https://osf.io/5ambq/).
#RMSEA CI and noncentrality functions
RMSEA.CI<-function(T,df,N,G){
#functions taken from lavaan (lav_fit_measures.R)
lower.lambda <- function(lambda) {
(pchisq(T, df=df, ncp=lambda) - 0.95)
}
upper.lambda <- function(lambda) {
(pchisq(T, df=df, ncp=lambda) - 0.05)
}
#RMSEA CI
lambda.l <- try(uniroot(f=lower.lambda, lower=0, upper=T)$root,silent=TRUE)
if(inherits(lambda.l, "try-error")) { lambda.l <- NA; RMSEA.CI.l<-NA
} else { if(lambda.l<0){
RMSEA.CI.l=0
} else {
RMSEA.CI.l<-sqrt(lambda.l*G/((N-1)*df))
}
}
N.RMSEA <- max(N, T*4)
lambda.u <- try(uniroot(f=upper.lambda, lower=0,upper=N.RMSEA)$root,silent=TRUE)
if(inherits(lambda.u, "try-error")) { lambda.u <- NA; RMSEA.CI.u<-NA
} else { if(lambda.u<0){
RMSEA.CI.u=0
} else {
RMSEA.CI.u<-sqrt(lambda.u*G/((N-1)*df))
}
}
RMSEA.CI<-c(RMSEA.CI.l,RMSEA.CI.u)
return(RMSEA.CI)
}
#p-values associated with the critical values of the RMSEA for most common cutoffs
pvals<-function(T,df,N,G){
RMSEA0<-c(0,.01,.05,.06,.08,.1)
eps0<-df*RMSEA0^2/G
nonc<-eps0*(N-G)
pvals<-pchisq(T,df=df,ncp=nonc)
names(pvals)<-c("RMSEA>0","RMSEA>.01","RMSEA>.05","RMSEA>.06","RMSEA>.08","RMSEA>.10")
return(pvals)
}
N<-1500
G<-2
(RMSEA1<-fitmeasures(GavaLifeC)["rmsea"])
## rmsea
## 0.0633204
(df1<-fitmeasures(GavaLifeC)["df"])
## df
## 68
(T1<-fitmeasures(GavaLifeC)["chisq"])
## chisq
## 272.4831
(RMSEA2<-fitmeasures(GavaLifeM)["rmsea"])
## rmsea
## 0.06298636
(df2<-fitmeasures(GavaLifeM)["df"])
## df
## 76
(T2<-fitmeasures(GavaLifeM)["chisq"])
## chisq
## 302.135
(RMSEA3<-fitmeasures(GavaLifeMP)["rmsea"])
## rmsea
## 0.0612469
(df3<-fitmeasures(GavaLifeMP)["df"])
## df
## 73
(T3<-fitmeasures(GavaLifeMP)["chisq"])
## chisq
## 278.3772
(RMSEA4<-fitmeasures(GavaLifeS)["rmsea"])
## rmsea
## 0.06320183
(df4<-fitmeasures(GavaLifeS)["df"])
## df
## 81
(T4<-fitmeasures(GavaLifeS)["chisq"])
## chisq
## 323.6642
(RMSEA5<-fitmeasures(GavaLifeSP)["rmsea"])
## rmsea
## 0.05899482
(df5<-fitmeasures(GavaLifeSP)["df"])
## df
## 79
(T5<-fitmeasures(GavaLifeSP)["chisq"])
## chisq
## 285.2131
(RMSEA6<-fitmeasures(GavaLifeR)["rmsea"])
## rmsea
## 0.114607
(df6<-fitmeasures(GavaLifeR)["df"])
## df
## 89
(T6<-fitmeasures(GavaLifeR)["chisq"])
## chisq
## 965.7449
(RMSEA7<-fitmeasures(GavaLifeRP)["rmsea"])
## rmsea
## 0.05670236
(df7<-fitmeasures(GavaLifeRP)["df"])
## df
## 84
(T7<-fitmeasures(GavaLifeRP)["chisq"])
## chisq
## 286.5549
(RMSEA8<-fitmeasures(GavaLifeLV)["rmsea"])
## rmsea
## 0.05633782
(df8<-fitmeasures(GavaLifeLV)["df"])
## df
## 86
(T8<-fitmeasures(GavaLifeLV)["chisq"])
## chisq
## 290.7198
(RMSEA9<-fitmeasures(GavaLifeLVP1)["rmsea"])
## rmsea
## 0.05680562
(df9<-fitmeasures(GavaLifeLVP1)["df"])
## df
## 85
(T9<-fitmeasures(GavaLifeLVP1)["chisq"])
## chisq
## 290.7135
(RMSEA10<-fitmeasures(GavaLifeLVP2)["rmsea"])
## rmsea
## 0.05622875
(df10<-fitmeasures(GavaLifeLVP2)["df"])
## df
## 85
(T10<-fitmeasures(GavaLifeLVP2)["chisq"])
## chisq
## 286.5566
(RMSEA11<-fitmeasures(GavaLifeLC)["rmsea"])
## rmsea
## 0.05598019
(df11<-fitmeasures(GavaLifeLC)["df"])
## df
## 86
(T11<-fitmeasures(GavaLifeLC)["chisq"])
## chisq
## 288.1289
(RMSEA12<-fitmeasures(GavaLifeMN)["rmsea"])
## rmsea
## 0.06984151
(df12<-fitmeasures(GavaLifeMN)["df"])
## df
## 88
(T12<-fitmeasures(GavaLifeMN)["chisq"])
## chisq
## 409.9372
(RMSEA13<-fitmeasures(GavaLifeMNP1)["rmsea"])
## rmsea
## 0.0555517
(df13<-fitmeasures(GavaLifeMNP1)["df"])
## df
## 87
(T13<-fitmeasures(GavaLifeMNP1)["chisq"])
## chisq
## 288.3609
(RMSEA14<-fitmeasures(GavaLifeMNP2)["rmsea"])
## rmsea
## 0.06925838
(df14<-fitmeasures(GavaLifeMNP2)["df"])
## df
## 87
(T14<-fitmeasures(GavaLifeMNP2)["chisq"])
## chisq
## 399.9862
ModelRMSEAs<-c(RMSEA1,RMSEA2,RMSEA3,RMSEA4,RMSEA5,RMSEA6,RMSEA7,RMSEA8,RMSEA9,RMSEA10,RMSEA11,RMSEA12,RMSEA13,RMSEA14)
RMSEA.CI(T=T1,df=df1,N=N,G=2)
## df df
## 0.05557577 0.07130340
deltaRMSEA<-(c(ModelRMSEAs,0)-c(0,ModelRMSEAs))[2:14]
tests<-lavTestLRT(GavaLifeC,GavaLifeM,GavaLifeMP,GavaLifeS,GavaLifeSP,GavaLifeR,GavaLifeRP,GavaLifeLV,GavaLifeLVP1,GavaLifeLVP2,GavaLifeLC,GavaLifeMN,GavaLifeMNP1,GavaLifeMNP2)
## Warning in lavTestLRT(GavaLifeC, GavaLifeM, GavaLifeMP, GavaLifeS, GavaLifeSP, : lavaan WARNING: some restricted models fit better than less
## restricted models; either these models are not nested, or
## the less restricted model failed to reach a global optimum.
## Warning in lavTestLRT(GavaLifeC, GavaLifeM, GavaLifeMP, GavaLifeS, GavaLifeSP, :
## lavaan WARNING: some models have the same degrees of freedom
Td=tests[2:14,5]
Td
## [1] 5.8940852 23.7578201 -16.9219774 38.4510853 -37.1092195 4.1585766
## [7] -4.1568665 4.1631199 -2.5908718 0.2320148 111.6252959 9.9510453
## [13] 555.8076073
dfd=tests[2:14,6]
dfd
## [1] 5 3 3 2 3 1 0 1 0 1 0 1 1
lambda<-Td-dfd
ld<-lambda/dfd
RMSEAD<- ifelse(ld > 0, sqrt((ld)*G/(N-G)), 0)
## Warning in sqrt((ld) * G/(N - G)): NaNs produced
RMSEAD
## [1] 0.01545125 0.09611459 0.00000000 0.15599092 0.00000000 0.06493888
## [7] 0.00000000 0.06498557 0.00000000 0.00000000 Inf 0.10931908
## [13] 0.86065738
cat("Configural \n"); RMSEA.CI(T=Td[1],df=dfd[1],N=N,G=2)
## Configural
## [1] NA 0.05488264
cat("Metric \n"); RMSEA.CI(T=Td[2],df=dfd[2],N=N,G=2)
## Metric
## [1] 0.06255765 0.13367440
cat("Partial Metric \n"); RMSEA.CI(T=Td[3],df=dfd[3],N=N,G=2)
## Partial Metric
## [1] NA NA
cat("Scalar \n"); RMSEA.CI(T=Td[4],df=dfd[4],N=N,G=2)
## Scalar
## [1] 0.1152169 0.2007811
cat("Partial Scalar \n"); RMSEA.CI(T=Td[5],df=dfd[5],N=N,G=2)
## Partial Scalar
## [1] NA NA
cat("Strict \n"); RMSEA.CI(T=Td[6],df=dfd[6],N=N,G=2)
## Strict
## [1] 0.01056336 0.13456948
cat("Partial Strict \n"); RMSEA.CI(T=Td[7],df=dfd[7],N=N,G=2)
## Partial Strict
## [1] NA NA
cat("Latent Variances \n"); RMSEA.CI(T=Td[8],df=dfd[8],N=N,G=2)
## Latent Variances
## [1] 0.01064001 0.13461046
cat("FWB Latent Variance \n"); RMSEA.CI(T=Td[9],df=dfd[9],N=N,G=2)
## FWB Latent Variance
## [1] NA NA
cat("MLQS Latent Variance \n"); RMSEA.CI(T=Td[10],df=dfd[10],N=N,G=2)
## MLQS Latent Variance
## [1] NA 0.07589186
cat("Latent Covariances \n"); RMSEA.CI(T=Td[11],df=dfd[11],N=N,G=2)
## Latent Covariances
## [1] Inf Inf
cat("Means \n"); RMSEA.CI(T=Td[12],df=dfd[12],N=N,G=2)
## Means
## [1] 0.05514353 0.17530719
cat("MLSQ Mean \n"); RMSEA.CI(T=Td[13],df=dfd[13],N=N,G=2)
## MLSQ Mean
## [1] 0.8010637 0.9212269
cat("FWB Mean \n"); RMSEA.CI(T=Td[14],df=dfd[14],N=N,G=2)
## FWB Mean
## [1] NA NA
round(pvals(T=Td[1],df=dfd[1],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0.683 0.640 0.084 0.027 0.001 0.000
round(pvals(T=Td[2],df=dfd[2],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 1.000 1.000 0.987 0.961 0.802 0.470
round(pvals(T=Td[3],df=dfd[3],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0 0 0 0 0 0
round(pvals(T=Td[4],df=dfd[4],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 1.000 1.000 1.000 1.000 0.999 0.987
round(pvals(T=Td[5],df=dfd[5],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0 0 0 0 0 0
round(pvals(T=Td[6],df=dfd[6],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0.959 0.951 0.749 0.654 0.440 0.243
round(pvals(T=Td[7],df=dfd[7],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0 0 0 0 0 0
round(pvals(T=Td[8],df=dfd[8],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0.959 0.951 0.749 0.655 0.441 0.243
round(pvals(T=Td[9],df=dfd[9],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0 0 0 0 0 0
round(pvals(T=Td[10],df=dfd[10],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0.370 0.357 0.155 0.106 0.040 0.011
round(pvals(T=Td[11],df=dfd[11],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 1 1 1 1 1 1
round(pvals(T=Td[12],df=dfd[12],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 0.998 0.998 0.963 0.935 0.833 0.662
round(pvals(T=Td[13],df=dfd[13],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## 1 1 1 1 1 1
round(pvals(T=Td[14],df=dfd[14],N=N,G=2),3)
## RMSEA>0 RMSEA>.01 RMSEA>.05 RMSEA>.06 RMSEA>.08 RMSEA>.10
## NA NA NA NA NA NA
95% CIs correctly identify the same fit issues as before: metric, scalar, strict, latent variance, and mean invariance violations, and they are on the same parameters as before, such that the violations come from the factor that is known to have been experimentally manipulated into being wrong.
sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] lavaan_0.6-9 pacman_0.5.1 kirkegaard_2022-04-12
## [4] rlang_0.4.12 metafor_3.0-2 Matrix_1.3-4
## [7] psych_2.1.9 magrittr_2.0.1 assertthat_0.2.1
## [10] weights_1.0.4 Hmisc_4.6-0 Formula_1.2-4
## [13] survival_3.2-13 lattice_0.20-45 forcats_0.5.1
## [16] stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4
## [19] readr_2.0.2 tidyr_1.1.4 tibble_3.1.5
## [22] ggplot2_3.3.5 tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-153 fs_1.5.2 lubridate_1.8.0
## [4] RColorBrewer_1.1-2 httr_1.4.2 tools_4.1.2
## [7] backports_1.3.0 bslib_0.3.1 utf8_1.2.2
## [10] R6_2.5.1 rpart_4.1-15 DBI_1.1.1
## [13] colorspace_2.0-2 nnet_7.3-16 withr_2.4.3
## [16] mnormt_2.0.2 tidyselect_1.1.1 gridExtra_2.3
## [19] compiler_4.1.2 cli_3.1.0 rvest_1.0.2
## [22] htmlTable_2.3.0 mice_3.14.0 xml2_1.3.3
## [25] sass_0.4.0 scales_1.1.1 checkmate_2.0.0
## [28] pbivnorm_0.6.0 digest_0.6.28 minqa_1.2.4
## [31] foreign_0.8-81 rmarkdown_2.11 base64enc_0.1-3
## [34] jpeg_0.1-9 pkgconfig_2.0.3 htmltools_0.5.2
## [37] lme4_1.1-27.1 dbplyr_2.1.1 fastmap_1.1.0
## [40] htmlwidgets_1.5.4 readxl_1.3.1 rstudioapi_0.13
## [43] jquerylib_0.1.4 generics_0.1.1 jsonlite_1.7.2
## [46] gtools_3.9.2 Rcpp_1.0.7 munsell_0.5.0
## [49] fansi_0.5.0 lifecycle_1.0.1 stringi_1.7.5
## [52] yaml_2.2.1 mathjaxr_1.4-0 MASS_7.3-54
## [55] grid_4.1.2 parallel_4.1.2 gdata_2.18.0
## [58] crayon_1.4.2 haven_2.4.3 splines_4.1.2
## [61] hms_1.1.1 tmvnsim_1.0-2 knitr_1.36
## [64] pillar_1.6.4 boot_1.3-28 stats4_4.1.2
## [67] reprex_2.0.1 glue_1.4.2 evaluate_0.14
## [70] latticeExtra_0.6-29 data.table_1.14.2 modelr_0.1.8
## [73] nloptr_1.2.2.2 png_0.1-7 vctrs_0.3.8
## [76] tzdb_0.2.0 cellranger_1.1.0 gtable_0.3.0
## [79] xfun_0.27 broom_0.7.10 cluster_2.1.2
## [82] ellipsis_0.3.2