Dynamic fit index (DFI) cutoffs are a recently proposed method to derive benchmarks for fit indices like RMSEA, CFI, and SRMR that are optimally sensitive to misspecifications for the user’s specific model. The original intent of DFI was to extend the idea of Hu & Bentler (1999) – whose cutoffs are known to be sensitive to model characteristics like number of items or factor loading magnitude – so that cutoffs with desirable properties could be derived for any CFA, not just those with properties similar to those studied by Hu & Bentler (1999).
DFI has previously been devised using simulations that assumed
multivariate normality. This has been done for one-factor models (McNeish & Wolf, 2022) and multifactor models
(McNeish & Wolf, 2021), which can be implemented
in the dynamic package with the cfaOne and
cfaHB functions, respectively. A vignette for implementing
the multivariate normal version of DFI can be found here
and a conceptual overview and common questions can be found here.
However, many researchers using CFA solicit item responses in binary (e.g., Yes/No, Right/Wrong) or ordinal (e.g., Likert scales) formats (e.g., Flake et al., 2017). Data in these formats cannot always be appropriately modeled assuming normality, especially if there are fewer than 5 categories or when categories are asymmetrically endorsed (e.g., Rhemtulla et al., 2012).
The dynamic package now has additional functions that
are explicitly designed to accommodate nuances of categorical data such
that researchers can derive cutoffs that are optimally sensitive to
their specific model characteristics and their specific data
characteristics (e.g., number of categories, response
distributions).
Categorical factor analyses that produce fit indices like RMSEA, CFI, and SRMR rely on limited-information estimators such as diagonally weighted least squares or unweighted least squares. In factor analysis, these estimators assume that categorical data are a coarse manifestation of an underlying normal distribution. That is, the observed information from an item response is discrete (e.g., Yes/No), but the process that resulted in that discrete response is normally distributed.
Consider the figure below as an example of this assumption with a 5-point Likert response. The observed information can take on integer values between 1 and 5 (shown in blue). However, the underlying assumption is that these discrete responses are a coarse version of an underlying normal process. In this example, any value on this underlying normal distribution below -1 gets binned into a Likert response of “1”. Any value between -1 and -0.25 is binned into a Likert response of “2” and so on. The data only contain the discrete values in blue, but the true construct of interest is assumed to be continuous (even if it is not necessarily being captured at that level of granularity).
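The binning in the figure can be sketched in a few lines of R. The first two thresholds (-1 and -0.25) match the figure; the remaining two values are arbitrary choices for illustration.
underlying <- rnorm(1000)               # unobserved, continuous normal process
thresholds <- c(-1, -0.25, 0.5, 1.25)   # 4 thresholds -> 5 response categories
likert <- cut(underlying,
              breaks = c(-Inf, thresholds, Inf),
              labels = 1:5)             # observed, discrete Likert responses
table(likert)                           # distribution of binned responses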
In this way, categorical factor analysis is just multivariate normal factor analysis on the underlying normal process of the item responses rather than on the observed item responses themselves. This is alternatively conceptualized as fitting a factor model to the polychoric correlation matrix (the correlations of the underlying normal distributions) rather than fitting a factor model to the Pearson correlation matrix (the correlations of the observed variables).
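To make this concrete, a quick sketch of the two correlation matrices (assuming dat is a data frame of ordinal item responses) using lavaan's lavCor function:
library(lavaan)
pearson    <- cor(dat)                            # Pearson correlations of the observed responses
polychoric <- lavCor(dat, ordered = names(dat))   # polychoric correlations of the underlying normals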
The path diagram of a categorical factor analysis envisioned this way is shown below for a hypothetical one-factor model with four items. The wavy lines link the observed responses to an underlying normal process (which can be conceptualized as a latent variable because the underlying process is unknown). Then, the factor model is fit to these underlying normal processes rather than to the observed data.
The basic algorithm by which DFI creates hypothetical misspecifications to include in its simulation is the same whether cutoffs are desired for multivariate normal or categorical data. For one-factor models, the hypothetical misspecifications being tested are based on residual correlations between items. For multifactor models, hypothetical misspecifications are based on cross-loadings to mimic the approach from Hu and Bentler (1999).
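As a hypothetical illustration of this logic (the model syntax and the 0.40 and 0.30 values below are illustrative, not dynamic's internal code), the fitted model omits a parameter that is present in the data-generating model used in the simulation:
# Fitted model: two factors with no cross-loadings
fitted_model <- "F1 =~ x1 + x2 + x3
                 F2 =~ x4 + x5 + x6"

# Level-1 data-generating model in a multifactor simulation: one
# cross-loading is added that the fitted model omits
population_model <- "F1 =~ x1 + x2 + x3 + 0.40*x4
                     F2 =~ x4 + x5 + x6"

# A one-factor simulation instead adds omitted residual correlations,
# e.g., x1 ~~ 0.30*x2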
The differences between the multivariate normal functions in
dynamic (cfaOne and cfaHB) and
their categorical counterparts (catOne and
catHB) lie in how data are simulated. Because categorical
factor analysis assumes an underlying normal distribution, data are
first generated from multivariate normality. However, an additional step
is then taken to discretize the data by binning ranges of values
together. This mirrors the assumptions of limited-information estimators
and polychoric correlations.
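Conceptually, this generate-then-discretize step looks something like the following sketch (simplified relative to dynamic's internal simulation; the correlation matrix, sample size, and thresholds are all illustrative):
library(MASS)
Sigma <- matrix(0.5, 4, 4)
diag(Sigma) <- 1
# Step 1: generate multivariate normal data from the model-implied correlations
continuous <- MASS::mvrnorm(n = 500, mu = rep(0, 4), Sigma = Sigma)

# Step 2: discretize each variable at that item's thresholds
# (items can have different numbers of thresholds; a continuous item is
# simply left un-binned)
thresholds  <- c(-1, 0, 1)   # hypothetical thresholds for one 4-category item
categorical <- cut(continuous[, 1],
                   breaks = c(-Inf, thresholds, Inf),
                   labels = FALSE)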
The number of categories in catOne and
catHB does not have to be constant across items and the
functions are able to accommodate categorical items that have different
numbers of response options. These functions can also accommodate a
mixture of continuous and categorical items because the continuous items
are just a special case where there are 0 thresholds.
To ensure that the number of categories and the proportion of
responses within each mirror the user’s data, this discretization is
based on the estimated thresholds from the user’s model. Doing so also
means that the user does not need to provide additional information
relative to the multivariate normal DFI functions – only a
lavaan object or a manual model statement (including
thresholds) is required and the software will generate data that mirror
the original data.
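For reference, the threshold estimates that dynamic extracts can be inspected directly in a fitted lavaan object (illustration only, assuming fit is a categorical CFA fitted with lavaan; dynamic gathers this information automatically):
est <- lavaan::parameterEstimates(fit)
est[est$op == "|", c("lhs", "rhs", "est")]   # rows with the "|" operator are thresholds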
Do note that catOne and catHB are designed
for data that are explicitly treated as categorical in the model (e.g.,
ordered = T in lavaan or with a
CATEGORICAL ARE statement in Mplus). If
discrete data like Likert responses are treated as continuous (possibly
with a robust estimator like MLR), that cannot be
accommodated by catOne or catHB but instead is
appropriate for the nnorOne or nnorHB
functions in dynamic (which accommodate non-normal
continuous data and missing data; these functions are still in beta
version, however).
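The mapping between modeling choice and dynamic function can be sketched as follows (model and dat are placeholders; the nnor functions remain in beta as noted above):
library(lavaan)
# Items declared categorical -> catOne/catHB
fit_cat <- cfa(model, data = dat, ordered = TRUE)        # DWLS-family estimation
# dynamic::catOne(fit_cat) or dynamic::catHB(fit_cat)

# Same items treated as continuous with a robust estimator -> nnorOne/nnorHB
fit_cont <- cfa(model, data = dat, estimator = "MLR")
# dynamic::nnorOne(fit_cont) or dynamic::nnorHB(fit_cont)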
lavaan object (manual=FALSE)

For users who work within R, dynamic can directly
interface with a lavaan object to extract the pertinent
information required to derive cutoffs via custom simulations. This is
true for both the catOne and catHB functions.
The catHB function is used as an example in this
section.
We use the HolzingerSwineford1939 data that is built
into the lavaan package for this example. The item responses in these data are continuous, but we round the responses to the
first six items to the nearest integer to make them categorical for the
purposes of demonstration.
library(lavaan)
dat <- lavaan::HolzingerSwineford1939
dat1 <- round(dat[, 7:12])

The response distributions of these items are shown below.
To these data, we then fit a two-factor categorical factor analysis
in lavaan to the first six items using the cfa
function with ordered = T and the default WLSMV
estimator (dynamic can accommodate any estimator in
lavaan and is not restricted to the default). With
categorical data, it is important to make sure that the correct fit
indices are reported because the plain rmsea or
cfi values are uncorrected (for estimators that end in
“MV”, the .scaled values are correct; for estimators that
end in “M”, the .robust values are correct).
lavmod <- "visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6"
fit <- lavaan::cfa(model=lavmod, data=dat1, ordered=T, estimator="WLSMV")
lavaan::fitmeasures(fit, c("srmr", "rmsea.scaled","cfi.scaled"))
#> srmr rmsea.scaled cfi.scaled
#> 0.040 0.070 0.992

The fit of the model looks quite good compared to Hu and Bentler’s benchmarks, but Hu and Bentler (1999) considered models with 15 items, 3 factors, continuous responses, and maximum likelihood estimation. Therefore, it is unclear whether these traditional benchmarks are sensitive to misspecification with these particular model characteristics (6 items instead of 15, 2 factors instead of 3), these particular data characteristics (number of categories, response distributions), and this estimator (WLSMV instead of ML).
Therefore, we derive DFI cutoffs for the specific model
characteristics, data characteristics, and estimator using
catHB. When fitting models in lavaan, DFI
cutoffs can be derived simply by using the lavaan object as
the only required argument in a dynamic function (there are
options that can be added as well):
library(devtools)
devtools::install_github("melissagwolf/dynamic")
dynamic::catHB(fit, plot=T)
#> Your DFI cutoffs:
#> SRMR RMSEA CFI Magnitude
#> Level-0 0.037 0.057 0.995 NONE
#> Specificity 95% 95% 95%
#>
#> Level-1 0.037 0.063 0.995 0.407
#> Sensitivity 91% 95% 95%
#>
#> Empirical fit indices:
#> Chi-Square df p-value SRMR RMSEA CFI
#> 11.546 8 0.173 0.04 0.07 0.992
#>
#> Notes:
#> -Number of levels is based on the number of factors in the model
#> -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#> -Cutoffs with 95% sensitivity are reported when possible
#> -If sensitivity is <50%, cutoffs will be supressed
#>
#> The distributions for each level are in the Plots tab
#> [[1]]

The catHB defaults are to use the WLSMV estimator and 250 replications in the DFI simulations. The plot option is not required, but will print fit index distributions that are useful for understanding the behavior of fit indices with these specific model and data characteristics. The number of levels depends on the size of the model; this model only has two factors, so there are only two rows in the output: Level-0 and Level-1.
The “Level-0” row corresponds to anticipated fit index values if the fitted model were indeed the underlying population model. In this example, we see that if the model were actually correct, 95% of the time, we would see SRMR values below 0.037, RMSEA values below 0.057, and CFI values above 0.995.
“Level-1” corresponds to anticipated fit index values if the model were misspecified. In this case, the misspecification being tested is an omitted standardized cross-loading equal to 0.407 (the value in the Magnitude column). The “Sensitivity” row is the percentage of replications with this magnitude of misspecification that would be rejected at the printed cutoffs (i.e., the proportion of the red distribution to the right of the cutoff in the plots). The default is 95%, but a smaller percentage may be reported when there is uncertainty due to sampling variability (e.g., small samples or low factor loadings). Sensitivity is always contingent on rejecting no more than 5% of correct models, represented by the blue distribution in the plot.
With the model and data characteristics in this example, an SRMR cutoff of 0.037 would reject 91% of models with this misspecification, an RMSEA cutoff of 0.063 would reject 95% of models with this misspecification, and a CFI cutoff of 0.995 would reject 95% of models with this misspecification. The sensitivity for SRMR is slightly lower because the distribution assuming the model was correct (in blue) and the distribution assuming the model was misspecified (in red) overlap slightly, so the SRMR cutoff is less able to distinguish correct from misspecified models.
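Conceptually, the cutoff selection and sensitivity calculation work like the sketch below (the two simulated distributions are made-up stand-ins for the blue and red distributions in the plots, not values from the actual simulation):
srmr_correct      <- rnorm(250, mean = 0.030, sd = 0.004)  # "blue": model correct
srmr_misspecified <- rnorm(250, mean = 0.050, sd = 0.006)  # "red": model misspecified

cutoff      <- quantile(srmr_correct, 0.95)       # rejects no more than 5% of correct models
sensitivity <- mean(srmr_misspecified > cutoff)   # proportion of misspecified models rejected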
Regarding interpretation, the conclusion would be that this model is more misspecified than the models Hu and Bentler (1999) considered to be meaningfully misspecified because SRMR and RMSEA are above the Level-1 cutoff and CFI is below the Level-1 cutoff. Although the model surpasses the traditional cutoffs (SRMR < .08, CFI > .95), those cutoffs are not necessarily applicable here because (a) the model characteristics are different, (b) the data characteristics are different, and (c) the estimator is different. In these specific conditions, the values printed in the Level-1 row represent the cutoffs Hu and Bentler (1999) would have found if they had based their simulation on this specific analysis. Do note that these data have a fairly low sample size for categorical factor analysis (N=301).
For users whose models have more than 2 factors, additional levels of misspecification will be included in the output to provide a sensitivity analysis for larger magnitudes of misspecification (e.g., with larger models, the tolerance for how misspecified is “too misspecified” may change).
For users not using the WLSMV estimator, catHB can include an estimator = option, as shown below. Note that
catOne and catHB cannot automatically detect
the estimator from a lavaan object, so it is important that
the estimator in the fitted model match the estimator in the
dynamic function.
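For example, refitting the earlier two-factor model with ULSMV and requesting matching cutoffs might look like this (a sketch based on the options documented in this vignette):
# refit the earlier two-factor model with ULSMV
fit_uls <- lavaan::cfa(model=lavmod, data=dat1, ordered=T, estimator="ULSMV")
dynamic::catHB(fit_uls, estimator="ULSMV")   # estimator must match the fitted model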
dynamic also allows manual model entry for users who do
not use lavaan as their primary latent variable modeling
software or for users who may be assessing results reported by published
research without having to completely rerun the analysis (or without
having access to the original data).
With categorical data, this involves writing out the factor structure
with standardized loadings and the estimated thresholds. The model is
written out in lavaan syntax where =~ is used
to define a latent variable (followed by factor loadings) and
| is used for thresholds (labeled t1, t2, etc.). This is vague without context,
so we provide an example using data from Hussey & Hughes (2019), which is openly
available from the Open
Science Framework.
The data feature 8 extraversion items from 6649 people, each measured on a 6-point Likert scale. The scale is intended to be unidimensional, with all items capturing the same latent construct. Each item has 5 thresholds, which discretize the underlying normal process behind each item into 6 discrete responses.
In lavaan syntax, this model would be written as,
manmod <- "
E =~ .76*bfi_e1 + .73*bfi_e2 + .59*bfi_e3 + .71*bfi_e4 + .84*bfi_e5 + .58*bfi_e6 + .71*bfi_e7 + .80*bfi_e8
bfi_e1 |-1.69*t1
bfi_e1 |-1.06*t2
bfi_e1 |-0.53* t3
bfi_e1 |0.06* t4
bfi_e1 |0.75* t5
bfi_e2 |-1.16*t1
bfi_e2 |-0.42*t2
bfi_e2 |0.28*t3
bfi_e2 |0.71*t4
bfi_e2 |1.34*t5
bfi_e3 |-1.99* t1
bfi_e3 |-1.27* t2
bfi_e3 | 0.61*t3
bfi_e3 | 0.09*t4
bfi_e3 | 0.97*t5
bfi_e4 |-2.05* t1
bfi_e4 | -1.36*t2
bfi_e4 | -0.74*t3
bfi_e4 | -.004*t4
bfi_e4 | 0.81*t5
bfi_e5 | -1.12*t1
bfi_e5 | -0.42*t2
bfi_e5 | 0.16*t3
bfi_e5 | 0.60*t4
bfi_e5 | 1.21*t5
bfi_e6 | -1.76*t1
bfi_e6 | -1.18*t2
bfi_e6 | -0.68*t3
bfi_e6 | -0.03*t4
bfi_e6 | 0.76*t5
bfi_e7 | -1.04*t1
bfi_e7 | -0.31*t2
bfi_e7 | 0.46*t3
bfi_e7 | 0.78*t4
bfi_e7 | 1.38*t5
bfi_e8 | -1.74*t1
bfi_e8 | -1.12*t2
bfi_e8 | -0.64*t3
bfi_e8 | -.002*t4
bfi_e8 | 0.77*t5"The first line defines a latent factor E, which loads on
8 items, bfi_e1 - bfi_e8. The values before
each item are the standardized loadings from the model. In the same set of
quotation marks, the thresholds also need to be provided. There are 5
thresholds per item, so each item is repeated 5 times followed by
|, the threshold estimate, the letter t, and
the threshold number. We realize that writing out the thresholds can be
tedious, but this information is required to simulate data that properly
recapture characteristics of the data (we are exploring options to make
this more user-friendly for those who do not primarily work in
lavaan).
This model has one factor, so DFI cutoffs treating the data as
categorical can be derived with the catOne function. When
the model is manually entered rather than coming from a
lavaan object, additional options are required so that the
function has all the information necessary.
First, manual = T is needed so that
dynamic knows it should read through quoted text rather
than a saved lavaan object. Second, n = is
needed to specify the sample size (which otherwise would be read from a
lavaan object). The code for this model (using the
ULSMV estimator instead of the default WLSMV
estimator, just to show what the estimator = option looks like) would be:
dynamic::catOne(manmod, n=6649, manual=T, estimator="ULSMV", plot=T)
#> Your DFI cutoffs:
#> SRMR RMSEA CFI
#> Level-0 0.012 0.017 0.999
#> Specificity 95% 95% 95%
#>
#> Level-1 0.026 0.055 0.99
#> Sensitivity 95% 95% 95%
#>
#> Level-2 0.033 0.073 0.984
#> Sensitivity 95% 95% 95%
#>
#> Level-3 0.043 0.095 0.973
#> Sensitivity 95% 95% 95%
#>
#> Notes:
#> -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#> -Cutoffs with 95% sensitivity are reported when possible
#> -If sensitivity is <50%, cutoffs will be supressed
#>
#> The distributions for each level are in the Plots tab
#> [[1]]
#>
#> [[2]]
#>
#> [[3]]
When there are 6 or more items, catOne will consider three levels of misspecification. As in catHB, the Level-0 row corresponds to the anticipated fit index values if the fitted model were the exact underlying population model. The Level-1 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between approximately 1/3 of item pairs. The Level-2 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between approximately 2/3 of item pairs. The Level-3 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between all item pairs.
When manual=T, the model fit indices are not reported at
the bottom of the output because dynamic does not have
access to the original data. Nonetheless, the process would remain the
same such that the fit indices output by the user’s model would be
compared to these cutoffs to evaluate the fit of the model under these
specific data and model characteristics.
If the model’s fit indices are below the Level-0 cutoffs, that would indicate that the model is fitting exactly, within sampling error. If the model’s fit indices are above the Level-0 cutoffs but below the Level-1 cutoffs, that would indicate that the model does not fit exactly but that the misspecification in the model is on par with or less than 1/3 of the items having an omitted 0.30 residual correlation (note that this does not necessarily mean that the model omitted this exact path, just that the overall misspecification is on par with this level of misspecification), and so on for Level-2 and Level-3.
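As an illustration, a small hypothetical helper (not part of dynamic) that classifies a reported RMSEA against the printed levels from the catOne output above:
classify_rmsea <- function(value, cutoffs) {
  # cutoffs: ascending RMSEA cutoffs for Level-0 through Level-k
  level <- sum(value > cutoffs)
  if (level == 0) "within sampling error of exact fit"
  else if (level < length(cutoffs)) paste0("misspecification no worse than Level-", level)
  else paste0("misspecification beyond Level-", length(cutoffs) - 1)
}
classify_rmsea(0.060, cutoffs = c(0.017, 0.055, 0.073, 0.095))
#> [1] "misspecification no worse than Level-2"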
To cite the ideas behind dynamic fit index cutoffs:
McNeish, D. & Wolf, M. G. (2021). Dynamic Fit Index Cutoffs for Confirmatory Factor Analysis Models. Psychological Methods.
McNeish, D. & Wolf, M. G. (2022). Dynamic fit cutoffs for one-factor models. Behavior Research Methods.
To cite information about dynamic fit index cutoffs with categorical data specifically:
McNeish, D. (2023). Dynamic Fit Index Cutoffs for Factor Analysis with Likert, Ordinal, or Binary Responses.
To cite the dynamic fit index R package discussed in this vignette: