Dynamic Fit Index (DFI) Cutoffs for CFA Models with Categorical Variables

Daniel McNeish & Melissa G Wolf

2023-04-06

Introduction

Dynamic fit index (DFI) cutoffs are a recently proposed method to derive benchmarks for fit indices like RMSEA, CFI, and SRMR that are optimally sensitive to misspecifications for the user’s specific model. The original intent of DFI was to extend the idea of Hu & Bentler (1999) – whose cutoffs are known to be sensitive to model characteristics like number of items or factor loading magnitude – so that cutoffs with desirable properties could be derived for any CFA, not just those with properties similar to those studied by Hu & Bentler (1999).

DFI has previously been devised using simulations that assumed multivariate normality. This has been done for one-factor models (McNeish & Wolf, 2022) and multifactor models (McNeish & Wolf, 2021), which can be implemented in the dynamic package with the cfaOne and cfaHB functions, respectively. A vignette for implementing the multivariate normal version of DFI can be found here and a conceptual overview and common questions can be found here.

However, many researchers using CFA solicit item responses in binary (e.g., Yes/No, Right/Wrong) or ordinal (e.g., Likert scales) formats (e.g., Flake et al., 2017). Data in these formats cannot always be appropriately modeled assuming normality, especially if there are fewer than 5 categories or when categories are asymmetrically endorsed (e.g., Rhemtulla et al., 2012).

The dynamic package now has additional functions that are explicitly designed to accommodate nuances of categorical data such that researchers can derive cutoffs that are optimally sensitive to their specific model characteristics and their specific data characteristics (e.g., number of categories, response distributions).

Overview of Categorical Factor Analysis

Categorical factor analyses that produce familiar fit indices like RMSEA, CFI, and SRMR rely on limited information estimators like diagonally weighted least squares or unweighted least squares. In factor analysis, these estimators assume that categorical data are a coarse manifestation of an underlying normal distribution. That is, the observed information from an item response is discrete (e.g., yes/no) but the process that resulted in that discrete response is normally distributed.

Consider the figure below as an example of this assumption with a 5-point Likert response. The observed information can take on integer values between 1 and 5 (shown in blue). However, the underlying assumption is that these discrete responses are a coarse version of an underlying normal process. In this example, any value on this underlying normal distribution below -1 gets binned into a Likert response of “1”. Any value between -1 and -0.25 is binned into a Likert response of “2” and so on. The data only contain the discrete values in blue, but the true construct of interest is assumed to be continuous (even if it is not necessarily being captured at that level of granularity).
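The binning described above can be sketched in a few lines of base R. Only the first two thresholds (-1 and -0.25) come from the text; the remaining two are hypothetical values chosen for illustration, and this is a conceptual sketch rather than code taken from dynamic.

```r
# Thresholds on the latent normal scale. Only -1 and -0.25 come from the text;
# 0.50 and 1.25 are hypothetical values added for illustration.
thresholds <- c(-1, -0.25, 0.50, 1.25)

set.seed(1)
latent <- rnorm(1000)                            # underlying continuous process
likert <- findInterval(latent, thresholds) + 1L  # bin into responses 1 to 5

# A latent value of -1.3 is below the first threshold, so it becomes a "1";
# a latent value of -0.5 lies between -1 and -0.25, so it becomes a "2".
example <- findInterval(c(-1.3, -0.5), thresholds) + 1L

table(likert)  # observed frequencies of each Likert response
```

Four thresholds yield five response categories, which is why a 5-point item is described by four cut points on the underlying distribution.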

In this way, categorical factor analysis is just multivariate normal factor analysis on the underlying normal process of the item responses rather than on the observed item responses themselves. This is alternatively conceptualized as fitting a factor model to the polychoric correlation matrix (the correlations of the underlying normal distributions) rather than fitting a factor model to the Pearson correlation matrix (the correlations of the observed variables).
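To illustrate why the distinction matters, the following base R sketch simulates two correlated normal variables, coarsens each into a binary response, and compares the correlation of the underlying variables with the Pearson correlation of the observed responses. The latent correlation of 0.6 and the split point of 0 are hypothetical choices made for this illustration only.

```r
set.seed(123)
n   <- 100000
rho <- 0.6   # hypothetical correlation between the underlying normal variables

# Two standard normal variables with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Coarsen each variable into a binary response by splitting at 0
x1 <- as.integer(z1 > 0)
x2 <- as.integer(z2 > 0)

r_latent  <- cor(z1, z2)  # correlation of the underlying normal variables
r_pearson <- cor(x1, x2)  # Pearson correlation of the observed binary items

c(latent = r_latent, pearson = r_pearson)
```

The Pearson correlation of the coarsened items is noticeably smaller than the correlation of the underlying variables. A polychoric correlation is intended to recover the latter from the observed categories.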

The path diagram of a categorical factor analysis envisioned this way is shown below for a hypothetical one-factor model with four items. The wavy lines link the observed responses to an underlying normal process (which can be conceptualized as a latent variable because the underlying process is unknown). Then, the factor model is fit to these underlying normal processes rather than to the observed data.

DFI with Categorical Data

The basic algorithm for how DFI creates hypothetical misspecifications to include in its simulation is the same regardless of whether cutoffs are desired for multivariate normal or categorical data. For one-factor models, the hypothetical misspecifications being tested are based on residual correlations between items. For multi-factor models, hypothetical misspecifications are based on cross-loadings to mimic the approach from Hu and Bentler (1999).

The differences between the multivariate normal functions in dynamic (cfaOne and cfaHB) and their categorical counterparts (catOne and catHB) lie in how data are simulated. Because categorical factor analysis assumes an underlying normal distribution, data are first generated from multivariate normality. However, an additional step is then taken to discretize the data by binning ranges of values together. This mirrors the assumptions of limited information estimators and polychoric correlations.

The number of categories in catOne and catHB does not have to be constant across items, and the functions are able to accommodate categorical items that have different numbers of response options. These functions can also accommodate a mixture of continuous and categorical items because the continuous items are just a special case where there are 0 thresholds.

To ensure that the number of categories and the proportion of responses within each mirror the user’s data, this discretization is based on the estimated thresholds from the user’s model. Doing so also means that the user does not need to provide additional information relative to the multivariate normality DFI functions – only a lavaan object or a manual model statement (including thresholds) is required and the software will generate data that mirror the original data.
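The link between thresholds and response proportions can be sketched in base R. The proportions below are hypothetical, and the code illustrates the general idea of threshold-based discretization rather than the internal implementation of catOne or catHB.

```r
# Hypothetical observed response proportions for a 4-category item
props <- c(0.10, 0.30, 0.40, 0.20)

# Thresholds are the standard normal quantiles of the cumulative proportions
# (the final cumulative proportion of 1 is dropped because it maps to +Inf)
thresholds <- qnorm(cumsum(props)[-length(props)])

# Going the other way, the thresholds imply the category proportions
implied <- diff(pnorm(c(-Inf, thresholds, Inf)))

# Discretizing normal draws at these thresholds reproduces the proportions
set.seed(42)
sim <- findInterval(rnorm(50000), thresholds) + 1L
round(prop.table(table(sim)), 2)
```

Because the thresholds carry both the number of categories and the share of responses in each, supplying them is enough for simulated data to resemble the original response distributions.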

Do note that catOne and catHB are designed for data that are explicitly treated as categorical in the model (e.g., ordered = T in lavaan or with a CATEGORICAL ARE statement in Mplus). If discrete data like Likert responses are treated as continuous (possibly with a robust estimator like MLR), that cannot be accommodated by catOne or catHB but instead is appropriate for the nnorOne or nnorHB functions in dynamic (which accommodate non-normal continuous data and missing data; these functions are still in beta version, however).

Example with lavaan object (manual=FALSE)

For users who work within R, dynamic can directly interface with a lavaan object to extract the pertinent information required to derive cutoffs via custom simulations. This is true for both the catOne and catHB functions. The catHB function is used as an example in this section.

We use the HolzingerSwineford1939 data that are built into the lavaan package for this example. The item responses in these data are continuous, but we round responses to the first six items to the nearest integer to make them categorical for the purposes of demonstration.

library(lavaan)
dat <- lavaan::HolzingerSwineford1939
# Columns 7 to 12 hold items x1 to x6; rounding makes the responses ordinal
dat1 <- round(dat[, 7:12])

The response distributions of these items are shown below.

To these data, we then fit a two-factor categorical factor analysis in lavaan to the first six items using the cfa function with ordered = T and the default WLSMV estimator (dynamic can accommodate any estimator in lavaan and is not restricted to the default). With categorical data, it is important to make sure that the correct fit indices are reported because the plain rmsea or cfi values are uncorrected (for estimators that end in “MV”, the .scaled values are correct; for estimators that end in “M”, the .robust values are correct).

lavmod <- "visual  =~ x1 + x2 + x3
           textual =~ x4 + x5 + x6"
fit <- lavaan::cfa(model=lavmod, data=dat1, ordered=T, estimator="WLSMV")
lavaan::fitmeasures(fit, c("srmr", "rmsea.scaled","cfi.scaled"))
#>         srmr rmsea.scaled   cfi.scaled 
#>        0.040        0.070        0.992

The fit of the model looks quite good compared to Hu and Bentler’s benchmarks, but Hu and Bentler (1999) considered models with 15 items, 3 factors, continuous responses, and maximum likelihood estimation. Therefore, it is unclear if these traditional benchmarks are sensitive to misspecification with these particular model characteristics (6 items instead of 15, 2 factors instead of 3), these particular data characteristics (number of categories, response distributions), and this estimator (WLSMV instead of ML).

Therefore, we derive DFI cutoffs for the specific model characteristics, data characteristics, and estimator using catHB. When fitting models in lavaan, DFI cutoffs can be derived simply by using the lavaan object as the only required argument in a dynamic function (there are options that can be added as well):

library(devtools)
devtools::install_github("melissagwolf/dynamic")
dynamic::catHB(fit, plot=T)
#> Your DFI cutoffs: 
#>             SRMR  RMSEA CFI   Magnitude
#> Level-0     0.037 0.057 0.995 NONE     
#> Specificity 95%   95%   95%            
#>                                        
#> Level-1     0.037 0.063 0.995 0.407    
#> Sensitivity 91%   95%   95%            
#> 
#> Empirical fit indices: 
#>  Chi-Square  df p-value   SRMR   RMSEA    CFI
#>      11.546   8   0.173   0.04    0.07  0.992
#> 
#>  Notes:
#>   -Number of levels is based on the number of factors in the model
#>   -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#>   -Cutoffs with 95% sensitivity are reported when possible
#>   -If sensitivity is <50%, cutoffs will be supressed 
#> 
#>  The distributions for each level are in the Plots tab 
#> [[1]]

The catHB defaults are to use the WLSMV estimator and 250 replications in the DFI simulations. The plot option is not required, but it will print fit index distributions that are useful for understanding the behavior of fit indices with these specific model and data characteristics. The number of levels is dependent on the size of the model; this model only has two factors, so there are only two rows in the output: Level-0 and Level-1.

The “Level-0” row corresponds to anticipated fit index values if the fitted model were indeed the underlying population model. In this example, we see that if the model were actually correct, 95% of the time, we would see SRMR values below 0.037, RMSEA values below 0.057, and CFI values above 0.995.

“Level-1” corresponds to anticipated fit index values if the model were misspecified. In this case, the misspecification being tested is an omitted standardized cross-loading equal to 0.407 (the value in the Magnitude column). The “Sensitivity” row is the percentage of replications with this magnitude of misspecification that would be rejected at the printed cutoffs (i.e., the proportion of the red distribution that is to the right of the cutoff in the plots, or to the left for CFI, where lower values indicate worse fit). The default is 95%, but a smaller number may be reported when there is uncertainty due to sampling variability (e.g., small samples or low factor loadings). Sensitivity is always contingent on rejecting no more than 5% of correct models represented by the blue distribution in the plot.
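The definitions of specificity and sensitivity can be made concrete with a small base R sketch. The two vectors below are hypothetical stand-ins for RMSEA values from DFI replications (drawn from arbitrary normal distributions), and the cutoff rule shown is a simplification for illustration, not the exact rule implemented in dynamic.

```r
set.seed(2023)
# Hypothetical stand-ins for RMSEA values from 250 replications each; in an
# actual DFI simulation these come from fitting the model to simulated data.
rmsea_correct <- abs(rnorm(250, mean = 0.02, sd = 0.015))  # "blue" distribution
rmsea_misspec <- abs(rnorm(250, mean = 0.10, sd = 0.020))  # "red" distribution

# Simplified rule: the cutoff that retains about 95% of correct models
cutoff <- unname(quantile(rmsea_correct, 0.95))

specificity <- mean(rmsea_correct <= cutoff)  # correct models not rejected
sensitivity <- mean(rmsea_misspec >  cutoff)  # misspecified models rejected

round(c(cutoff = cutoff, specificity = specificity, sensitivity = sensitivity), 3)
```

When the two distributions overlap more than in this hypothetical case, fewer misspecified replications fall beyond the cutoff and the reported sensitivity drops below 95%.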

With the model and data characteristics in this example, an SRMR cutoff of 0.037 would reject 91% of models with this misspecification, an RMSEA cutoff of 0.063 would reject 95% of models with this misspecification, and a CFI cutoff of 0.995 would reject 95% of models with this misspecification. The Sensitivity for SRMR is slightly lower because the distributions assuming the model was correct (in blue) and assuming the model was misspecified (in red) overlapped a little, so the cutoff had less precision to distinguish correct from misspecified models.

Regarding interpretation, the conclusion would be that this model is more misspecified than the models Hu and Bentler (1999) considered to be meaningfully misspecified because SRMR and RMSEA are above the Level-1 cutoff and CFI is below the Level-1 cutoff. Although the model surpasses the traditional cutoffs (SRMR < .08, CFI > .95), those cutoffs are not necessarily applicable here because (a) the model characteristics are different, (b) the data characteristics are different, and (c) the estimator is different. In these specific conditions, the values printed in the Level-1 row represent the cutoffs Hu and Bentler (1999) would have found if they had based their simulation on this specific analysis. Do note that these data have a fairly low sample size for categorical factor analysis (N=301).
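The comparison behind this conclusion can be written out in base R using the values printed in the output above. The object names are illustrative and not part of dynamic.

```r
# Empirical fit indices and Level-1 cutoffs copied from the catHB output
empirical <- c(SRMR = 0.040, RMSEA = 0.070, CFI = 0.992)
level1    <- c(SRMR = 0.037, RMSEA = 0.063, CFI = 0.995)

# SRMR and RMSEA indicate worse fit when larger; CFI when smaller
higher_is_worse <- c(SRMR = TRUE, RMSEA = TRUE, CFI = FALSE)

# TRUE means the empirical value is on the "worse fit" side of the cutoff
worse_than_level1 <- ifelse(higher_is_worse,
                            empirical > level1,
                            empirical < level1)
worse_than_level1
```

All three indices fall on the misspecified side of their Level-1 cutoffs, which is the basis for the interpretation given above.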

For users whose models have more than 2 factors, additional levels of misspecification will be included in the output to provide a sensitivity analysis for larger magnitudes of misspecification (e.g., with larger models, the tolerance for how misspecified is “too misspecified” may change).

For users not using the WLSMV estimator, catHB can include an estimator = option. Note that catOne and catHB cannot automatically detect the estimator from a lavaan object, so it is important that the estimator in the fitted model match the estimator in the dynamic function.

Example with Manual Model Entry (manual =TRUE)

dynamic also allows manual model entry for users who do not use lavaan as their primary latent variable modeling software or for users who may be assessing results reported by published research without having to completely rerun the analysis (or without having access to the original data).

With categorical data, this involves writing out the factor structure with standardized loadings and the estimated thresholds. The model is written out in lavaan syntax where =~ is used to define a latent variable (followed by factor loadings) and |t__ is used for thresholds. This is vague without context, so we provide an example using data from Hussey & Hughes (2019), which is openly available from the Open Science Framework.

The data feature 8 extraversion items from 6649 people, each measured on a 6-point Likert scale. The items are intended to be unidimensional and all items are intended to capture the same latent construct. Each item has 5 thresholds, which discretize the underlying normal process behind each item into 6 discrete responses.

In lavaan syntax, this model would be written as,

manmod <- "
E =~ .76*bfi_e1 + .73*bfi_e2 + .59*bfi_e3 + .71*bfi_e4 + .84*bfi_e5 + .58*bfi_e6 + .71*bfi_e7 + .80*bfi_e8

bfi_e1 |-1.69*t1
bfi_e1 |-1.06*t2
bfi_e1 |-0.53* t3
bfi_e1 |0.06* t4
bfi_e1 |0.75* t5

bfi_e2 |-1.16*t1
bfi_e2 |-0.42*t2
bfi_e2 |0.28*t3
bfi_e2 |0.71*t4
bfi_e2 |1.34*t5

bfi_e3 |-1.99* t1
bfi_e3 |-1.27* t2
bfi_e3 | -0.61*t3
bfi_e3 | 0.09*t4
bfi_e3 | 0.97*t5

bfi_e4 |-2.05* t1
bfi_e4 | -1.36*t2
bfi_e4 | -0.74*t3
bfi_e4 | -.004*t4
bfi_e4 | 0.81*t5

bfi_e5 | -1.12*t1
bfi_e5 | -0.42*t2
bfi_e5 | 0.16*t3
bfi_e5 | 0.60*t4
bfi_e5 | 1.21*t5

bfi_e6 | -1.76*t1
bfi_e6 | -1.18*t2
bfi_e6 | -0.68*t3
bfi_e6 | -0.03*t4
bfi_e6 | 0.76*t5

bfi_e7 | -1.04*t1
bfi_e7 | -0.31*t2
bfi_e7 | 0.46*t3
bfi_e7 | 0.78*t4
bfi_e7 | 1.38*t5

bfi_e8 | -1.74*t1
bfi_e8 | -1.12*t2
bfi_e8 | -0.64*t3
bfi_e8 | -.002*t4
bfi_e8 | 0.77*t5"

The first line defines a latent factor E, which is measured by 8 items, bfi_e1 - bfi_e8. The values before each item are the standardized loadings from the model. In the same set of quotation marks, the thresholds also need to be provided. There are 5 thresholds per item, so each item is repeated 5 times followed by |, the threshold estimate, the letter t, and the threshold number. We realize that writing out the thresholds can be tedious, but this information is required to simulate data that properly recapture characteristics of the data (we are exploring options to make this more user-friendly for those who do not primarily work in lavaan).
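For users who already have the loadings and thresholds stored in R, the model string could be assembled programmatically rather than typed by hand. The helper below, build_manual_model, is hypothetical and not part of dynamic; it is a minimal base R sketch that uses only the first two items from the example above to show how the string is constructed.

```r
# Hypothetical helper (not part of dynamic): build lavaan-style syntax from a
# named vector of standardized loadings and a named list of thresholds.
build_manual_model <- function(factor_name, loadings, thresholds) {
  # Loading line, e.g. "E =~ 0.76*bfi_e1 + 0.73*bfi_e2"
  loading_line <- paste0(
    factor_name, " =~ ",
    paste0(loadings, "*", names(loadings), collapse = " + ")
  )
  # One line per threshold, e.g. "bfi_e1 | -1.69*t1"
  threshold_lines <- unlist(lapply(names(thresholds), function(item) {
    thr <- thresholds[[item]]
    paste0(item, " | ", thr, "*t", seq_along(thr))
  }))
  paste(c(loading_line, threshold_lines), collapse = "\n")
}

# First two items from the example above (for illustration only)
loadings   <- c(bfi_e1 = 0.76, bfi_e2 = 0.73)
thresholds <- list(
  bfi_e1 = c(-1.69, -1.06, -0.53, 0.06, 0.75),
  bfi_e2 = c(-1.16, -0.42, 0.28, 0.71, 1.34)
)
manmod_sketch <- build_manual_model("E", loadings, thresholds)
cat(manmod_sketch)
```

The resulting string has one loading line followed by five threshold lines per item, matching the layout of the manually written model.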

This model has one factor, so DFI cutoffs treating the data as categorical can be derived with the catOne function. When the model is manually entered rather than coming from a lavaan object, additional options are required so that the function has all the information necessary.

First, manual = T is needed so that dynamic knows it should read through quoted text rather than a saved lavaan object. Second, n = is needed to specify the sample size (which otherwise would be read from a lavaan object). The code for this model (using the ULSMV estimator instead of the default WLSMV estimator, just to show what the estimator = option looks like) would be

dynamic::catOne(manmod,n=6649,manual=T, estimator="ULSMV", plot=T)
#> Your DFI cutoffs: 
#>             SRMR  RMSEA CFI  
#> Level-0     0.012 0.017 0.999
#> Specificity 95%   95%   95%  
#>                              
#> Level-1     0.026 0.055 0.99 
#> Sensitivity 95%   95%   95%  
#>                              
#> Level-2     0.033 0.073 0.984
#> Sensitivity 95%   95%   95%  
#>                              
#> Level-3     0.043 0.095 0.973
#> Sensitivity 95%   95%   95%  
#> 
#>  Notes:
#>   -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#>   -Cutoffs with 95% sensitivity are reported when possible
#>   -If sensitivity is <50%, cutoffs will be supressed 
#> 
#>  The distributions for each level are in the Plots tab 
#> [[1]]

#> 
#> [[2]]

#> 
#> [[3]]

When there are 6 or more items, catOne will consider three levels of misspecification. As in catHB, the Level-0 row corresponds to the anticipated fit index values if the fitted model were the exact underlying population model. The Level-1 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between approximately 1/3 of item pairs. The Level-2 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between approximately 2/3 of item pairs. The Level-3 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between all item pairs.

When manual = T, the model fit indices are not reported at the bottom of the output because dynamic does not have access to the original data. Nonetheless, the process would remain the same such that the fit indices output by the user’s model would be compared to these cutoffs to evaluate the fit of the model under these specific data and model characteristics.

If the model’s fit indices are below the Level-0 cutoffs (or above, for CFI), that would indicate that the model is within sampling error of fitting exactly. If the model’s fit indices are above the Level-0 cutoffs but below the Level-1 cutoffs, that would indicate that the model does not fit exactly but that the misspecification in the model is less than 1/3 of the items having an omitted 0.30 residual correlation (note that this does not necessarily mean that the model omitted this path, just that the overall misspecification is on par with this misspecification), and so on for Level-2 and Level-3.
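This comparison can be sketched in base R using the RMSEA cutoffs from the catOne output above. The RMSEA values being classified are hypothetical, as is the choice to treat a value exactly equal to a cutoff as not exceeding it.

```r
# RMSEA cutoffs from the catOne output above (Level-0 through Level-3)
rmsea_cutoffs <- c(0.017, 0.055, 0.073, 0.095)

labels <- c(
  "within sampling error of exact fit",
  "between Level-0 and Level-1",
  "between Level-1 and Level-2",
  "between Level-2 and Level-3",
  "beyond Level-3"
)

# Hypothetical RMSEA values a user might obtain from their own model
my_rmsea <- c(0.010, 0.060, 0.120)

# left.open = TRUE treats a value equal to a cutoff as not exceeding it
# (an assumption made here; the text does not say how ties are handled)
band <- labels[findInterval(my_rmsea, rmsea_cutoffs, left.open = TRUE) + 1L]
data.frame(rmsea = my_rmsea, interpretation = band)
```

The same logic applies to SRMR. For CFI the direction is reversed because larger values indicate better fit.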

Citation Recommendations

To cite the ideas behind dynamic fit index cutoffs:

To cite information about dynamic fit index cutoffs with categorical data specifically:

To cite the dynamic fit index R package discussed in this vignette:

This package relies on the following packages: