Dynamic fit index (DFI) cutoffs are a recently proposed method to derive benchmarks for fit indices like RMSEA, CFI, and SRMR that are optimally sensitive to misspecifications for the user’s specific model. The original intent of DFI was to extend the idea of Hu & Bentler (1999) – whose cutoffs are known to be sensitive to model characteristics like number of items or factor loading magnitude – so that cutoffs with desirable properties could be derived for any CFA, not just those with properties similar to those studied by Hu & Bentler (1999).
DFI cutoffs have previously been derived using simulations that assume multivariate normality. This has been done for one-factor models (McNeish & Wolf, 2022) and multifactor models (McNeish & Wolf, 2021), which can be implemented in the dynamic package with the cfaOne and cfaHB functions, respectively. A vignette for implementing the multivariate normal version of DFI can be found here, and a conceptual overview and common questions can be found here.
However, many researchers using CFA solicit Likert-type responses (e.g., about 80% of studies based on a review by Flake et al., 2017), often on 5- or 7-point scales. Likert-type data are often treated as continuous in factor analyses (e.g., Jackson et al., 2009); however, fit indices are less sensitive to misfit with Likert responses compared to multivariate normal responses.
A disconnect therefore emerges where most researchers have Likert responses but traditional fit index cutoffs are based on continuous responses. Applying cutoffs intended for continuous data to Likert-type data results in overly optimistic conclusions about fit and validity.
The dynamic package now has additional functions that
are explicitly designed to accommodate Likert-type responses. This way,
researchers can derive cutoffs that are optimally sensitive to their
specific model characteristics and their specific data
characteristics (e.g., number of Likert scale points, response
distributions across scale points). More importantly, this allows
researchers to better tailor fit index cutoffs to the data and model
under consideration, hopefully leading to more accurate conclusions
about model fit.
There are two functions for treating Likert-type responses as
continuous in the dynamic package: likertOne
for one-factor models and likertHB for multi-factor models.
The method by which these functions test hypothetical misspecifications
and determine dynamic cutoffs is similar to that of other DFI functions (see here for an
overview of this process). The main difference is how the functions
generate data within the DFI simulations.
Likert-type responses can be treated in different ways, so the likert functions share features with other DFI functions. For instance, Likert-type responses are technically discrete, so there are many similarities to the categorical functions catOne and catHB (described here). We initially thought that Likert-type responses would be a special case of the non-normal continuous data covered by the nnorOne and nnorHB functions (described here).
However, deeper investigation showed that the nnor
functions worked well for non-normal continuous data but were sometimes
susceptible to poor performance when applied specifically to Likert-type
items.
Therefore, we do not generally recommend the nnor functions for Likert-type data and instead suggest the dedicated likert functions described in this vignette.
Likert-type DFI starts by generating multivariate normal data, which will eventually be discretized into Likert-type responses. Whereas categorical models have threshold parameters to guide this discretization, models that treat Likert-type responses as continuous have no such parameters.
Therefore, DFI creates pseudo-thresholds from the user’s original data. This is done by first calculating the number of response options for each item and the proportion of responses within each scale point. These proportions are then converted to pseudo-thresholds through the inverse of the standard normal cumulative distribution function. This process is handled internally by the software.
The result is simulated data that match the number of scale points and the scale-point distribution of the user’s data. Different numbers of scale points across items are permitted. Additionally, because the user’s data are used to create the pseudo-thresholds, the simulated data will contain the same proportion and pattern of missing data as the user’s original data.
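To illustrate the idea, here is a minimal conceptual sketch in R (not the package’s internal code) for a single hypothetical item:

x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 5)               # hypothetical 5-point Likert item
props <- prop.table(table(x))                      # proportion of responses per scale point
pseudo_thresholds <- qnorm(cumsum(props))          # inverse standard normal CDF of cumulative proportions
z <- rnorm(10000)                                  # simulated standard normal scores
sim <- cut(z, breaks = c(-Inf, head(pseudo_thresholds, -1), Inf), labels = FALSE)
round(prop.table(table(sim)), 2)                   # closely matches the original proportions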
To be clear, unlike the cfa and cat functions in dynamic, the likert functions require the original dataset to be included. This is necessary because the model output alone is not sufficient to generate Likert-type data with properties similar to the original data. In categorical models, this information can be taken from the threshold estimates; but in models that treat Likert-type responses as continuous, the only source of this information is the data itself.
As another important note, the likert functions work best when responses are coded with values between 1 and 9. Data coded as ‘0’ or with double digits sometimes adversely affect some of the functions called by the likert functions, in which case the generated data do not always closely reproduce the original data.
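For example, if hypothetical items were coded 0 to 5, a simple shift to 1 to 6 before running the likert functions avoids this issue (column names below are placeholders):

items <- c("item1", "item2", "item3")   # placeholder column names
dat[items] <- dat[items] + 1            # recode 0-5 responses to 1-6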
likert vs. nnor Data Generation

This section provides an example to show the difference in how data
are generated with different DFI functions and why the
likert functions are better suited for Likert-type
responses.
We use data from Hussey & Hughes (2019), which is openly available from the Open Science Framework.
The data feature 8 extraversion items from 6649 people, each measured on a 6-point Likert scale. We use 5 of the items to create a unidimensional scale. The histograms for responses to these 5 items are shown below.
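Assuming the five items are stored in a data frame dat under the variable names used later in this vignette, histograms like these can be produced with base R:

items <- c("bfi_e1", "bfi_e4", "bfi_e5", "bfi_e6", "bfi_e8")
op <- par(mfrow = c(2, 3))                                    # arrange the five panels
for (i in items) hist(dat[[i]], breaks = seq(0.5, 6.5, 1), main = i, xlab = "Response")
par(op)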
Using the nnorOne function in an attempt to simulate data
with similar properties results in this (original in teal, simulated in
pink):
The data generation mechanism in the nnor functions has some trouble mimicking features of discrete Likert-type scale points, especially reproducing the proportions at the extreme upper and lower ends of the scale. This results in generated data that do not closely resemble the original Likert-type responses.
However, the likertOne function is explicitly designed
to generate discrete Likert-type data, so the data it generates
completely overlap the original data and exactly recreate the original
data’s properties.
The likert functions create data that more closely
resemble the original data, so the DFI cutoffs produced by the
likert functions will be more accurate and lead to better
decisions about fit because they are more closely aligned to the model
and data being evaluated.
lavaan object (manual=FALSE)

For users who work within R, dynamic can directly
interface with a lavaan object to extract the pertinent
information required to derive cutoffs via custom simulations. This is
true for both the likertOne and likertHB
functions. The likertHB function is used as an example in
this section.
Using the same data as the previous section, we fit a one-factor model to the five items with lavaan’s cfa function and the WLSMV estimator used by Hussey & Hughes (2019), which treats the Likert-type responses as continuous as long as the code does not include ordered=T. The fit of this model is shown below.
# one-factor model for the five extraversion items
lavmod <- "E=~bfi_e1+bfi_e4+bfi_e5+bfi_e6+bfi_e8"
fit <- lavaan::cfa(model = lavmod, data = dat, estimator = "WLSMV")
lavaan::fitmeasures(fit, c("srmr", "rmsea.scaled", "cfi.scaled"))
#> srmr rmsea.scaled cfi.scaled
#> 0.026 0.079 0.975

Model fit looks decent compared to Hu and Bentler’s benchmarks, although RMSEA would be a little high. Nonetheless, Hu and Bentler (1999) considered models with 15 items, 3 factors, continuous responses, and maximum likelihood estimation. Therefore, it is unclear if these traditional benchmarks are sensitive to misspecification with these particular model characteristics (5 items instead of 15, 1 factor instead of 3), these particular data characteristics (Likert-type responses with somewhat negatively skewed response distributions), and this estimator (WLSMV instead of ML).
Therefore, we derive DFI cutoffs for the specific model and data
characteristics using likertOne. When fitting models in
lavaan, DFI cutoffs can be derived simply by using the
lavaan object and the dataset as the only required
arguments in a dynamic function (there are options that can
be added as well, such as requesting plots and specifying the
estimator):
# install the development version of dynamic from GitHub (requires devtools)
library(devtools)
devtools::install_github("melissagwolf/dynamic")

dynamic::likertOne(fit, data = dat, plot = T, estimator = "WLSMV")
#> Your DFI cutoffs:
#> SRMR RMSEA CFI
#> Level-0 0.01 0.022 0.998
#> Specificity 95% 95% 95%
#>
#> Level-1 0.03 0.09 0.968
#> Sensitivity 95% 95% 95%
#>
#> Level-2 0.041 0.126 0.939
#> Sensitivity 95% 95% 95%
#>
#> Empirical fit indices:
#> Chi-Square df p-value SRMR RMSEA CFI
#> 213.491 5 0 0.026 0.079 0.975
#>
#> Notes:
#> -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#> -Cutoffs with 95% sensitivity are reported when possible
#> -If sensitivity is <50%, cutoffs will be suppressed
#>
#> The distributions for each level are in the Plots tab
#> [[1]]
#>
#> [[2]]
The “Level-0” row corresponds to anticipated fit index values if the fitted model were indeed the underlying population model. In this example, we see that if the model were actually correct, 95% of the time we would see SRMR values below 0.01, RMSEA values below 0.022, and CFI values above 0.998.
“Level-1” corresponds to anticipated fit index values if the model were misspecified. In this case, the misspecification being tested is an omitted 0.30 residual correlation between one pair of items. The “Sensitivity” row is the percentage of replications with a misspecification of this magnitude that would be rejected at the printed cutoffs (e.g., the proportion of the red distribution that is to the right of the RMSEA cutoff in the plots). The default is 95%, but a smaller percentage may be reported when there is uncertainty due to sampling variability (e.g., small samples or low factor loadings). Sensitivity is always contingent on rejecting no more than 5% of correct models, represented by the blue distribution in the plot.
The plot=T option in the code creates plots to help
understand how these cutoffs were derived. The Level-1 plot shows the
simulated distribution for the Level-1 misspecification. The blue
distribution represents fit index values if the model were
hypothetically true. The red distribution represents the distribution of
fit index values if the model were hypothetically misspecified via an
omitted 0.30 residual correlation. The cutoff comes from the fit index
value that is sensitive to 95% of misspecified models.
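As a conceptual sketch (illustrative only, not dynamic’s internal code), for a “higher is worse” index like RMSEA the cutoff and its sensitivity could be computed from the two simulated distributions like this:

true_rmsea    <- rnorm(500, mean = 0.02, sd = 0.01)  # hypothetical values under the true model (blue)
misspec_rmsea <- rnorm(500, mean = 0.11, sd = 0.01)  # hypothetical values under misspecification (red)
cutoff <- quantile(true_rmsea, 0.95)                 # rejects no more than 5% of correct models
sensitivity <- mean(misspec_rmsea > cutoff)          # proportion of misspecified models flagged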
With the model and data characteristics in this example, cutoffs of .030 for SRMR, 0.09 for RMSEA, and 0.968 for CFI would each reject 95% of models with this misspecification. Because all of the fit indices for the model meet the DFI cutoffs, the conclusion would be that the cumulative misspecification in this model is less than one omitted 0.30 residual correlation.
The Level-2 DFI cutoffs repeat this process but use a larger misspecification (two omitted 0.30 residual correlations for this model). This provides a sensitivity analysis and DFI cutoffs that correspond to different magnitudes of hypothetical misfit.
In this case, we can see that the RMSEA value being above the traditional 0.06 threshold does not seem problematic because the equivalent cutoff for these model and data characteristics is 0.09. This model only has 5 df, so the DFI cutoffs can encode the small-df issues with RMSEA discussed by Kenny et al. (2015). That is, RMSEA becomes inflated when there are few degrees of freedom, so the DFI cutoff acknowledges this and increases to reflect the behavior of this fit index under these conditions. Similarly, the DFI cutoff for CFI is stricter than the traditional 0.95 cutoff because the WLSMV estimator tends to inflate the value of CFI relative to maximum likelihood (e.g., Xia & Yang, 2019).
Manual entry (manual=TRUE)

dynamic also allows manual model entry for users who do not use lavaan as their primary latent variable modeling software or for users who may be assessing results reported in published research without having to completely rerun the analysis.
For instance, if we fit the same model in Mplus and ask for the standardized estimates, the output would be:
These standardized estimates can be transferred to lavaan by manually placing the standardized estimate before each item name,

manmod <- "E=~.778*bfi_e1 + .656*bfi_e4 + .728*bfi_e5 + .550*bfi_e6 + .792*bfi_e8"

where E is an arbitrary label for the latent factor. The values before each item are the standardized loadings from the Mplus output.
The code to run DFI is then similar to the previous example, with two small changes. First, there is an extra option manual=T to tell the software that the model was manually entered and that the code should parse a manually added model statement rather than extracting information from a lavaan object. Second, users need to enter a sample size with the n= option. Otherwise, the code is the same:
dynamic::likertOne(manmod, data = dat, plot = T, manual = T, n = 6649, estimator = "WLSMV")
#> Your DFI cutoffs:
#> SRMR RMSEA CFI
#> Level-0 0.01 0.021 0.998
#> Specificity 95% 95% 95%
#>
#> Level-1 0.03 0.091 0.966
#> Sensitivity 95% 95% 95%
#>
#> Level-2 0.041 0.126 0.939
#> Sensitivity 95% 95% 95%
#>
#> Notes:
#> -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#> -Cutoffs with 95% sensitivity are reported when possible
#> -If sensitivity is <50%, cutoffs will be suppressed
#>
#> The distributions for each level are in the Plots tab
#> [[1]]
#>
#> [[2]]
The cutoffs from this model are within .002 of those in the previous section, and the small differences are attributable to rounding the standardized estimates in the manually entered model statement.
To cite the ideas behind dynamic fit index cutoffs:
McNeish, D., & Wolf, M. G. (2021). Dynamic fit index cutoffs for confirmatory factor analysis models. Psychological Methods.

McNeish, D., & Wolf, M. G. (2022). Dynamic fit cutoffs for one-factor models. Behavior Research Methods.
To cite the dynamic fit index R package discussed in this vignette: