Dynamic Fit Index (DFI) Cutoffs for CFA Models that Treat Likert-type Responses as Continuous

Daniel McNeish

2023-07-13

Introduction

Dynamic fit index (DFI) cutoffs are a recently proposed method to derive benchmarks for fit indices like RMSEA, CFI, and SRMR that are optimally sensitive to misspecifications for the user’s specific model. The original intent of DFI was to extend the idea of Hu & Bentler (1999) – whose cutoffs are known to be sensitive to model characteristics like number of items or factor loading magnitude – so that cutoffs with desirable properties could be derived for any CFA, not just those with properties similar to those studied by Hu & Bentler (1999).

DFI cutoffs have previously been derived using simulations that assume multivariate normality. This has been done for one-factor models (McNeish & Wolf, 2022) and multifactor models (McNeish & Wolf, 2021), which are implemented in the dynamic package with the cfaOne and cfaHB functions, respectively. A vignette for implementing the multivariate normal version of DFI can be found here, and a conceptual overview and common questions can be found here.

However, many researchers using CFA solicit Likert-type responses (e.g., about 80% of studies based on a review by Flake et al., 2017), often on 5- or 7-point scales. Likert-type data are often treated as continuous in factor analyses (e.g., Jackson et al., 2009); however, fit indices are less sensitive to misfit with Likert responses compared to multivariate normal responses.

A disconnect therefore emerges where most researchers have Likert responses but traditional fit index cutoffs are based on continuous responses. Applying cutoffs intended for continuous data to Likert-type data results in overly optimistic conclusions about fit and validity.

The dynamic package now has additional functions that are explicitly designed to accommodate Likert-type responses. This way, researchers can derive cutoffs that are optimally sensitive to their specific model characteristics and their specific data characteristics (e.g., number of Likert scale points, response distributions across scale points). This allows researchers to better tailor fit index cutoffs to the data and model under consideration, hopefully leading to more accurate conclusions about model fit.

DFI for Likert-type Responses Treated as Continuous

There are two functions for treating Likert-type responses as continuous in the dynamic package: likertOne for one-factor models and likertHB for multi-factor models. The method by which these functions test hypothetical misspecifications and determine dynamic cutoffs is similar to other DFI functions (see here for an overview of this process). The main difference is how the functions generate data within the DFI simulations.

Likert-type responses can be treated in different ways, so the likert functions share features with other DFI functions. For instance, Likert-type responses are technically discrete, so there are many similarities to the categorical functions catOne and catHB (described here). We initially thought that Likert-type responses would be a special case of the non-normal continuous data covered by the nnorOne and nnorHB functions (described here). However, deeper investigation showed that the nnor functions worked well for non-normal continuous data but were sometimes susceptible to poor performance when applied specifically to Likert-type items.

Therefore, we do not generally recommend the nnor functions for Likert-type data and instead suggest the dedicated likert functions described in this vignette.

Likert-type DFI starts by generating multivariate normal data, which will eventually be discretized into Likert-type responses. Whereas categorical models have threshold parameters to guide this discretization, models that treat Likert-type responses as continuous have no such parameters.

Therefore, DFI creates pseudo-thresholds from the user’s original data. This is done by first calculating the number of response options for each item and the proportion of responses within each scale point. These proportions are then converted to pseudo-thresholds through the inverse of the standard normal cumulative distribution function. This process is handled internally by the software.
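As a conceptual illustration, below is a minimal sketch of the pseudo-threshold idea (not the package’s internal code): observed response proportions are converted to standard-normal thresholds with qnorm, and simulated normal scores are then discretized at those thresholds.

set.seed(123)
item <- sample(1:6, 500, replace=T, prob=c(.05, .10, .15, .25, .25, .20))

# cumulative proportions for all but the highest scale point
cum_props <- cumsum(prop.table(table(item)))
pseudo_thresholds <- qnorm(head(cum_props, -1))

# discretize simulated standard normal scores at the pseudo-thresholds
z <- rnorm(500)
sim_item <- cut(z, breaks=c(-Inf, pseudo_thresholds, Inf), labels=F)

# the simulated proportions closely match the original proportions
round(rbind(original=prop.table(table(item)), simulated=prop.table(table(sim_item))), 2)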

The result of this process is simulated data that match the number of scale points and the scale-point distribution of the user’s data. Different numbers of scale points per item are permitted. Additionally, because the user’s data are used to create the pseudo-thresholds, the simulated data will contain the same proportion and pattern of missing data as the user’s original data.

To be clear, unlike the cfa and cat functions in dynamic, the likert functions require the original dataset to be included. This is necessary because the model output alone is not sufficient to generate Likert-type data with properties similar to the original data. In categorical models, this information can be taken from the threshold estimates; but in models that treat Likert-type responses as continuous, the only source of this information is the data itself.

As another important note, the likert functions work best when responses are coded with values between 1 and 9. Data coded as ‘0’ or with double digits sometimes adversely affect some of the functions called by the likert functions, and the generated data do not always closely reproduce the original data.
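For example, a hypothetical pair of items coded 0–5 could be shifted to run from 1 to 6 before being passed to the likert functions:

# hypothetical 0-5 coded items shifted up by 1 so responses run from 1 to 6
dat0 <- data.frame(item1=sample(0:5, 100, replace=T),
                   item2=sample(0:5, 100, replace=T))
dat0[] <- lapply(dat0, function(x) x + 1)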

likert vs. nnor Data Generation

This section provides an example to show the difference in how data are generated with different DFI functions and why the likert functions are better suited for Likert-type responses.

We use data from Hussey & Hughes (2019), which is openly available from the Open Science Framework.

The data feature 8 extraversion items from 6649 people, each measured on a 6-point Likert scale. We use 5 of the items to create a unidimensional scale. The histograms for responses to these 5 items are shown below:
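For readers following along in R, a sketch that reproduces histograms like these (assuming the data are loaded as dat with the item names used later in this vignette):

items <- c("bfi_e1", "bfi_e4", "bfi_e5", "bfi_e6", "bfi_e8")
par(mfrow=c(2, 3))
for (i in items) {
  hist(dat[[i]], breaks=seq(0.5, 6.5, 1), main=i, xlab="Response (1-6)")
}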

Using the nnorOne function in an attempt to simulate data with similar properties results in this (original in teal, simulated in pink):

The data generation mechanism in the nnor functions has some trouble mimicking features of discrete Likert-type scale points, especially reproducing the proportions at the extreme upper and lower ends of the scale. This results in generated data that do not closely resemble the original Likert-type responses.

However, the likertOne function is explicitly designed to generate discrete Likert-type data, so the data it generates completely overlap the original data and exactly recreate the original data’s properties.

Because the likert functions create data that more closely resemble the original data, the DFI cutoffs they produce are more closely aligned with the model and data being evaluated, which should lead to more accurate conclusions and better decisions about fit.

Example with lavaan object (manual=FALSE)

For users who work within R, dynamic can directly interface with a lavaan object to extract the pertinent information required to derive cutoffs via custom simulations. This is true for both the likertOne and likertHB functions. The likertOne function is used as an example in this section.

Using the same data as the previous section, we fit a one-factor model to the five items with lavaan’s cfa function, using the WLSMV estimator employed by Hussey & Hughes (2019). This treats the Likert-type responses as continuous as long as the code does not include ordered=T. The fit of this model is shown below.

# one-factor model for the five extraversion items
lavmod<-"E=~bfi_e1+bfi_e4+bfi_e5+bfi_e6+bfi_e8"

# omitting the ordered= argument treats the responses as continuous
fit <- lavaan::cfa(model=lavmod, data=dat, estimator="WLSMV")
lavaan::fitmeasures(fit, c("srmr", "rmsea.scaled","cfi.scaled"))
#>         srmr rmsea.scaled   cfi.scaled 
#>        0.026        0.079        0.975

Model fit looks decent compared to Hu and Bentler’s benchmarks, although RMSEA would be a little high. Nonetheless, Hu and Bentler (1999) considered models with 15 items, 3 factors, continuous responses, and maximum likelihood estimation. Therefore, it is unclear whether these traditional benchmarks are sensitive to misspecification with these particular model characteristics (5 items instead of 15, 1 factor instead of 3), these particular data characteristics (Likert-type responses with somewhat negatively skewed response distributions), and this estimator (WLSMV instead of ML).

Therefore, we derive DFI cutoffs for the specific model and data characteristics using likertOne. When fitting models in lavaan, DFI cutoffs can be derived simply by using the lavaan object and the dataset as the only required arguments in a dynamic function (there are options that can be added as well, such as requesting plots and specifying the estimator):

library(devtools)
devtools::install_github("melissagwolf/dynamic")
dynamic::likertOne(fit, data=dat, plot=T, estimator="WLSMV")
#> Your DFI cutoffs: 
#>             SRMR  RMSEA CFI  
#> Level-0     0.01  0.022 0.998
#> Specificity 95%   95%   95%  
#>                              
#> Level-1     0.03  0.09  0.968
#> Sensitivity 95%   95%   95%  
#>                              
#> Level-2     0.041 0.126 0.939
#> Sensitivity 95%   95%   95%  
#> 
#> Empirical fit indices: 
#>  Chi-Square  df p-value   SRMR   RMSEA    CFI
#>     213.491   5       0  0.026   0.079  0.975
#> 
#>  Notes:
#>   -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#>   -Cutoffs with 95% sensitivity are reported when possible
#>   -If sensitivity is <50%, cutoffs will be suppressed 
#> 
#>  The distributions for each level are in the Plots tab 
#> [[1]]

#> 
#> [[2]]

The “Level-0” row corresponds to anticipated fit index values if the fitted model were indeed the underlying population model. In this example, we see that if the model were actually correct, 95% of the time we would see SRMR values below 0.01, RMSEA values below 0.022, and CFI values above 0.998.

“Level-1” corresponds to anticipated fit index values if the model were misspecified. In this case, the misspecification being tested is an omitted 0.30 residual correlation between one pair of items. The “Sensitivity” row is the percentage of replications with a misspecification of this magnitude that would be rejected at the printed cutoffs (e.g., the proportion of the red distribution that is to the right of the RMSEA cutoff in the plots). The default is 95%, but a smaller number may be reported when there is uncertainty due to sampling variability (e.g., small samples or low factor loadings). Sensitivity is always contingent on rejecting no more than 5% of correct models, represented by the blue distribution in the plot.
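As a toy illustration of this logic (with made-up distributions, not the package’s internal simulation), the cutoff for a badness-of-fit index like RMSEA is the value exceeded by 95% of the misspecified replications:

set.seed(1)
rmsea_true <- rbeta(500, 2, 60)      # hypothetical fit index values under a correct model
rmsea_misspec <- rbeta(500, 12, 60)  # hypothetical values under a misspecified model

cutoff <- quantile(rmsea_misspec, probs=.05)  # 95% of misspecified values exceed this
mean(rmsea_misspec > cutoff)  # sensitivity, approximately .95
mean(rmsea_true > cutoff)     # proportion of correct models incorrectly rejected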

The plot=T option in the code creates plots to help understand how these cutoffs were derived. The Level-1 plot shows the simulated distribution for the Level-1 misspecification. The blue distribution represents fit index values if the model were hypothetically true. The red distribution represents the distribution of fit index values if the model were hypothetically misspecified via an omitted 0.30 residual correlation. The cutoff comes from the fit index value that is sensitive to 95% of misspecified models.

With the model and data characteristics in this example, an SRMR cutoff of 0.03, an RMSEA cutoff of 0.09, and a CFI cutoff of 0.968 would each reject 95% of models with this misspecification. Because all of the fit indices for the model meet the DFI cutoffs, the conclusion would be that the cumulative misspecification in this model is less than one omitted 0.30 residual correlation.

The Level-2 DFI cutoffs repeat this process but use a larger misspecification (two omitted 0.30 residual correlations for this model). This provides a sensitivity analysis and DFI cutoffs that correspond to different magnitudes of hypothetical misfit.

In this case, we can see that the RMSEA value being above the traditional 0.06 threshold does not seem problematic because the equivalent cutoff for these model and data characteristics is 0.09. This model only has 5 df, so the DFI cutoffs can encode the small-df issues with RMSEA discussed by Kenny et al. (2015). That is, RMSEA becomes inflated when there are few degrees of freedom, so the DFI cutoff acknowledges this and increases to reflect the behavior of this fit index under these conditions. Similarly, the DFI cutoff for CFI is stricter than the traditional 0.95 cutoff because the WLSMV estimator tends to restrict the value of CFI relative to maximum likelihood (e.g., Xia & Yang, 2019).
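To see why few degrees of freedom inflate RMSEA, consider one common form of the RMSEA formula applied to the empirical values reported above; df appears in the denominator, so a small df magnifies any excess of the chi-square over df:

# RMSEA computed from the chi-square, df, and sample size reported above
chisq <- 213.491; df <- 5; n <- 6649
sqrt(max(chisq - df, 0)/(df*(n - 1)))  # approximately 0.079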

Example with Manual Model Entry (manual=TRUE)

dynamic also allows manual model entry for users who do not use lavaan as their primary latent variable modeling software or for users who may be assessing results reported by published research without having to completely rerun the analysis.

For instance, if we fit the same model in Mplus and ask for the standardized estimates, the output would be:

These standardized estimates can be transferred to lavaan syntax by manually placing the standardized estimate before each item name:

manmod<-"E=~.778*bfi_e1 + .656*bfi_e4 + .728*bfi_e5 + .550*bfi_e6 + .792*bfi_e8"

Where E is an arbitrary label for the latent factor. The values before each item are the standardized loadings from the Mplus output.
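One reason standardized loadings are sufficient here is that, in a standardized one-factor model, each item’s residual variance is implied by its loading, as the illustration below shows:

# in a standardized one-factor model, residual variance = 1 - loading^2
loadings <- c(.778, .656, .728, .550, .792)
round(1 - loadings^2, 3)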

The code to run DFI is then similar to the previous example but with two small changes. First, there is an extra option manual=T to tell the software that the model was manually entered, so the code parses a manually entered model statement rather than extracting information from a lavaan object. Second, users need to enter a sample size with the n= option. Otherwise, the code is the same:

dynamic::likertOne(manmod, data=dat, plot=T, manual=T, n=6649, estimator="WLSMV")
#> Your DFI cutoffs: 
#>             SRMR  RMSEA CFI  
#> Level-0     0.01  0.021 0.998
#> Specificity 95%   95%   95%  
#>                              
#> Level-1     0.03  0.091 0.966
#> Sensitivity 95%   95%   95%  
#>                              
#> Level-2     0.041 0.126 0.939
#> Sensitivity 95%   95%   95%  
#> 
#>  Notes:
#>   -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#>   -Cutoffs with 95% sensitivity are reported when possible
#>   -If sensitivity is <50%, cutoffs will be suppressed 
#> 
#>  The distributions for each level are in the Plots tab 
#> [[1]]

#> 
#> [[2]]

The cutoffs from this model are nearly identical to those in the previous section, and the small differences are attributable to rounding the standardized estimates in the manually entered model statement.

Citation Recommendations

To cite the ideas behind dynamic fit index cutoffs:

To cite the dynamic fit index R package discussed in this vignette:

This package relies on the following packages: