Dynamic Fit Index (DFI) Cutoffs for CFA Models with Non-Normal Variables or Ordinal Variables Treated as Continuous

Daniel McNeish

2023-04-24

Introduction

Dynamic fit index (DFI) cutoffs are a recently proposed method to derive benchmarks for fit indices like RMSEA, CFI, and SRMR that are optimally sensitive to misspecifications for the user’s specific model. The original intent of DFI was to extend the idea of Hu & Bentler (1999) – whose cutoffs are known to be sensitive to model characteristics like number of items or factor loading magnitude – so that cutoffs with desirable properties could be derived for any CFA, not just those with properties similar to those studied by Hu & Bentler (1999).

The original conception of DFI uses a simulation method that assumes that data are multivariate normal and that there is no missing data. The method has been devised for one-factor models (McNeish & Wolf, 2022) and multifactor models (McNeish & Wolf, 2021). Functions to implement these methods can be found in the dynamic package with the cfaOne and cfaHB functions, respectively. A vignette for implementing the multivariate normal version of DFI can be found here and a conceptual overview and common questions can be found here.

In many applications of CFA, however, researchers do not have truly continuous data and instead solicit item responses on ordinal formats like Likert scales (e.g., Flake et al., 2017). Even though these data are technically discrete, they are routinely modeled as if they are continuous (e.g., Jackson et al., 2009). Therefore, assuming multivariate normality in DFI simulations may result in inaccurate conclusions when the data deviate from normality or when missing data are present.

The dynamic package now has additional functions that are designed to provide customized fit index cutoffs that not only match the user’s model characteristics, but that also match the response distribution and missing data pattern for each item. This way, researchers can derive cutoffs that are optimally sensitive to their specific model characteristics and their specific data characteristics so that DFI cutoffs are maximally tailored to all aspects of the user’s analysis.

DFI with Non-Normal Continuous Data

The basic algorithm for how DFI searches for hypothetical misspecifications to include in its simulation is unchanged whether cutoffs are desired for multivariate normal or non-normal continuous data. For one-factor models, hypothetical misspecifications are based on residual correlations between items. For multi-factor models, hypothetical misspecifications are based on cross-loadings to mimic the approach from Hu and Bentler (1999).
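For intuition, these hypothetical misspecifications can be expressed in lavaan-style syntax. The loading and misspecification values below are illustrative placeholders, not the values the DFI algorithm actually selects for a given model.

#illustrative placeholders only (not the values the DFI algorithm selects)

#one-factor case: a hypothetically misspecified data-generating model adds one
#residual correlation that the fitted one-factor model omits
dgm_one_factor <- "
F  =~ .6*y1 + .6*y2 + .6*y3 + .6*y4 + .6*y5 + .6*y6
y1 ~~ .3*y2
"

#multi-factor case: the data-generating model adds a cross-loading that the
#fitted two-factor model omits
dgm_two_factor <- "
F1 =~ .6*y1 + .6*y2 + .6*y3 + .4*y4
F2 =~ .6*y4 + .6*y5 + .6*y6
F1 ~~ .3*F2
"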

The main difference between the multivariate normal functions in dynamic (cfaOne and cfaHB) and their non-normal counterparts (nnorOne and nnorHB) lies in how data are generated in the DFI algorithm.

In the multivariate normal functions, data are simulated from a multivariate normal distribution with a covariance matrix equal to a model-implied matrix (which is either consistent with the user’s model or perturbed so that it is misspecified to a particular degree). Because data are simulated, it can be difficult to accommodate specific patterns of missingness, so complete data are assumed.
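As a minimal sketch of this approach (a simplification, not the package's internal code), data could be drawn from a multivariate normal distribution whose covariance equals the model-implied covariance of a fitted lavaan object, here assumed to be called fit:

#minimal sketch, not dynamic's internal code
#assumes a fitted lavaan object called 'fit'; complete data are generated
library(lavaan)
library(MASS)

Sigma_hat <- lavInspect(fit, "cov.ov")   #model-implied covariance matrix
n_obs     <- lavInspect(fit, "nobs")     #sample size

#simulate complete multivariate normal data with covariance Sigma_hat
sim_data <- as.data.frame(MASS::mvrnorm(n = n_obs,
                                        mu = rep(0, ncol(Sigma_hat)),   #means set to zero for simplicity
                                        Sigma = Sigma_hat))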

However, generating non-normal multivariate data in this manner is notoriously difficult (e.g., Astivia & Zumbo, 2015; Qu et al., 2020). It is also difficult to incorporate specific missing data patterns in such a simulation without knowing the missing data mechanism and the process that leads to missing responses.

To better recapture the response distributions and missing data patterns in the empirical data, nnorOne and nnorHB replace generation of data from a parametric distribution with a Bollen-Stine bootstrap (Bollen & Stine, 1992).

What is a Bollen-Stine Bootstrap?

The typical approach to bootstrapping is to resample the data with replacement and reanalyze each bootstrapped sample. This will produce one estimate per bootstrapped sample, which creates a sampling distribution for the quantity of interest.

Though effective for model parameters like regression coefficients or their standard errors, this standard approach to bootstrapping does not work well for fit indices because it is uncertain whether the model actually fits the data. That is, the interpretation of a bootstrapped distribution of fit indices is unclear when the magnitude of misspecification is unknown.
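As an illustration of the naive approach described above (using hypothetical placeholders mod for a lavaan model statement and dat for a data frame), the resulting distribution is difficult to interpret because the degree of population misspecification is unknown:

#naive bootstrap of a fit index; 'mod' and 'dat' are hypothetical placeholders
library(lavaan)

set.seed(1234)
naive_cfi <- replicate(100, {
  boot_rows <- sample(nrow(dat), replace = TRUE)           #resample rows with replacement
  boot_fit  <- lavaan::cfa(mod, data = dat[boot_rows, ])   #refit the model to the bootstrap sample
  lavaan::fitmeasures(boot_fit, "cfi")                     #save one fit index per sample
})

#a sampling distribution of CFI, but around what degree of misspecification?
quantile(naive_cfi, c(.05, .50, .95))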

Bollen and Stine (1992) proposed a different approach to bootstrapping that is applicable specifically to model fit measures. Rather than bootstrap from the original data directly, the Bollen-Stine bootstrap first transforms the original data prior to bootstrapping. This transformation is done to ensure that the data are necessarily consistent with a particular model-implied covariance structure (i.e., it is known whether the fitted model is correct or incorrect). The transformed data are then bootstrapped instead of the original data. An extension for missing data was first proposed by Enders (2002), which was later refined by Savalei and Yuan (2009).
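To make the transformation concrete, below is a minimal sketch for complete data, assuming a numeric data matrix Y with no missing values and a fitted lavaan object fit. The classic presentation uses symmetric matrix square roots; Cholesky factors are used here for brevity, and either choice makes the sample covariance of the transformed data equal the model-implied covariance. (The semTools function used in the next section handles missing data and the mean structure.)

#minimal sketch of the Bollen-Stine transformation for complete data
#'Y' is an assumed numeric data matrix with no missing values; 'fit' is a fitted lavaan object
library(lavaan)

S     <- cov(Y)                     #observed covariance
Sigma <- lavInspect(fit, "cov.ov")  #model-implied covariance

R_S     <- chol(S)                  #S     = t(R_S) %*% R_S
R_Sigma <- chol(Sigma)              #Sigma = t(R_Sigma) %*% R_Sigma

Yc <- scale(Y, center = TRUE, scale = FALSE)  #center the data
Z  <- Yc %*% solve(R_S) %*% R_Sigma           #transformed data

max(abs(cov(Z) - Sigma))  #approximately 0: cov(Z) matches the model-implied covariance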

Bollen-Stine Bootstrapping Example

To demonstrate how this works, consider an example using data from Hussey & Hughes (2019), which is openly available from the Open Science Framework page associated with their paper. We will use the 9-item Agreeableness scale, whose items were solicited from 6649 people on a 6-point Likert scale.

After downloading the “trimmed_data” file from the OSF page associated with their paper, the following code can be used to select the specific items of interest (code adapted from Hussey & Hughes’ supplemental material):

library(dplyr)

trimmed_first_timepoint_per_user_per_scale <- trimmed_data %>%
  arrange(user_id, individual_differences_measure, datetime_ymdhms) %>%
  mutate(combined_var = paste0(user_id, individual_differences_measure)) %>%
  filter(combined_var != lag(combined_var)) %>%
  dplyr::select(-combined_var)

bfi_a_data <- trimmed_first_timepoint_per_user_per_scale %>%
  dplyr::select(bfi_a1, bfi_a2, bfi_a3, bfi_a4, bfi_a5, bfi_a6, bfi_a7, bfi_a8, bfi_a9)

If we fit a one-factor model to these data using the MLR estimator to account for possible mild-to-moderate violations of normality, we obtain the following results:

library(lavaan)

bfi_a_model <- "A =~ bfi_a1 + bfi_a2 + bfi_a3 + bfi_a4 + bfi_a5 + bfi_a6 + bfi_a7 + bfi_a8 + bfi_a9" 

fit<-lavaan::cfa(bfi_a_model, data=bfi_a_data, estimator="MLR")

fitmeasures(fit,c("srmr","rmsea.robust","cfi.robust","df","chisq"))
#>         srmr rmsea.robust   cfi.robust           df        chisq 
#>        0.054        0.093        0.881       27.000     1588.066

If we perform a Bollen-Stine bootstrap using the semTools package, we can transform the data so that they are necessarily consistent with the model-implied mean and covariance of the bfi_a_model fit above:

library(lavaan)
library(semTools)

#refit the model with a mean structure so the transformation matches the model-implied means and covariances
fit<-lavaan::cfa(bfi_a_model, data=bfi_a_data, estimator="MLR", meanstructure=TRUE)

#transform data according to these model-implied structures 
Data_t<-as.data.frame(semTools::bsBootMiss(fit, transDataOnly = TRUE))
      
#fit same model to the Bollen-Stine bootstrapped data       
tran_fit<-lavaan::cfa(bfi_a_model, data=Data_t, estimator="MLR")
fitmeasures(tran_fit,c("srmr","rmsea.robust","cfi.robust","df","chisq", "pvalue"))
#>         srmr rmsea.robust   cfi.robust           df        chisq       pvalue 
#>            0            0            1           27            0            1
## the model now fits the transformed data perfectly

After this transformation, we can see that the model fits perfectly.

Using this type of bootstrapping, we can transform the original data to be consistent with any model that we wish. The result is similar to generating data from a probability distribution (like the multivariate normal), except that the distributions of the generated data match the distributions of the original data (and its missing data patterns).

To demonstrate, compare the distribution of the original data (in teal) to the distribution of the transformed data (in red):

library(tidyr)
library(ggplot2)

#reshape the transformed and original data to long format for plotting
aa<-gather(Data_t[,1:9])
bb<-gather(bfi_a_data)

#label each dataset (aa holds the transformed data, bb holds the original data)
aa$model<-c("Bootstrapped Data")
bb$model<-c("Original Data")

cc<-rbind(aa,bb)
cc$value<-as.numeric(cc$value)
             
ggplot()+  
geom_histogram(data=cc,aes(x=value, y=..density.., fill=model),bins=8, alpha=.3,position="identity") +
facet_wrap(~key)+ 
scale_fill_manual(values=c("#E9798C","#66C2F5")) +
theme(axis.title.y = element_blank(), 
axis.title.x=element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
panel.background = element_blank(),
axis.line = element_line(color="black"),
legend.title=element_blank())

The distributions almost precisely overlap, indicating that the response distribution of each item is preserved. However, the red distribution is exactly consistent with a one-factor model whereas the teal distribution has an unknown underlying model. This demonstrates the power of the Bollen-Stine bootstrap (particularly for methodological studies): it can retain the observed variables’ original distributions while manipulating the model underlying the data.

Bollen-Stine Bootstrapping in DFI functions

In DFI, the non-normality functions start with the user’s original data. NOTE: the nnorOne and nnorHB functions require the user to provide a dataset in the function call – this deviates from other DFI functions, where no data are required because data are generated from a probability distribution. Because the non-normality functions use bootstrapping to match whatever distributions are present in the original data, the original data must be provided. We chose this route because it seemed more user-friendly than requiring users to specify the skewness and kurtosis of each variable.

The user’s data are transformed according to a model-implied covariance matrix. This covariance matrix may be consistent with the user’s fitted model (i.e., a Level-0 misspecification), such that it is known that there are no misspecifications, or it may be consistent with a hypothetically misspecified model (i.e., a Level-1 or higher misspecification). A Bollen-Stine bootstrap then resamples from the transformed data, and the user’s original model is fit to each bootstrapped sample.

The most important takeaway point for Bollen-Stine bootstrapping within DFI is that – for Level-1 misspecification and higher – the data are transformed so that they are inconsistent with the user’s original fitted model. This differs from typical applications of the Bollen-Stine bootstrap, where the interest is in the null sampling distribution when the model fits the data perfectly. However, transforming the data based on an incorrect model is consistent with the idea of DFI. In other words, the Bollen-Stine bootstrap is being repurposed to essentially simulate data with specific, idiosyncratic properties.

The DFI process is otherwise unchanged – the only difference is that data are generated with a Bollen-Stine bootstrap to create hypothetical data that more closely match the characteristics of the user’s data, rather than being simulated from a probability distribution. Doing so requires that the user also provide a dataset, however.
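For intuition, a single replication under this scheme might look roughly like the sketch below (a simplification, not the package's internal code; the added residual covariance and its value are hypothetical). The data are transformed to be consistent with a misspecified model, a bootstrap sample is drawn, the user's original model is fit, and the fit indices are recorded.

#rough sketch of one DFI replication (simplified; not dynamic's internal code)
#'bfi_a_model' and 'bfi_a_data' are from the example above; the added residual
#covariance is a hypothetical Level-1 misspecification with a placeholder value
library(lavaan)
library(semTools)

misspec_model <- paste(bfi_a_model, "bfi_a1 ~~ .3*bfi_a2", sep = "\n")

#transform the data so they are exactly consistent with the misspecified model
misspec_fit <- lavaan::cfa(misspec_model, data = bfi_a_data, meanstructure = TRUE)
data_t      <- as.data.frame(semTools::bsBootMiss(misspec_fit, transDataOnly = TRUE))

#draw one bootstrap sample, fit the user's *original* model, and record fit indices
boot_rows <- sample(nrow(data_t), replace = TRUE)
boot_fit  <- lavaan::cfa(bfi_a_model, data = data_t[boot_rows, ], estimator = "MLR")
lavaan::fitmeasures(boot_fit, c("srmr", "rmsea.robust", "cfi.robust"))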

Example with lavaan object (manual=FALSE)

For users who work within R, dynamic can directly interface with a lavaan object to extract pertinent information required to derive customized cutoffs. This is true for both the nnorOne and nnorHB functions. The nnorOne function is used as an example in this section.

We use the same data from the 9-item Agreeableness scale from Hussey & Hughes (2019) that was used in the previous example. Earlier, we saw that the fit indices for this model using the MLR estimator were srmr = .054, rmsea = .093, and cfi = .881. The chi-square statistic indicated that the model does not fit exactly; however, the approximate fit indices (SRMR, RMSEA, CFI) are ambiguous, so it is unclear how to interpret them. This is especially true because traditional cutoffs were not meant to generalize to one-factor models, non-continuous data, or the MLR estimator, all of which are present here.

Because the item responses were treated as continuous, we could use traditional DFI to derive custom cutoffs to help us contextualize these indices:

#if first time using dynamic, install it with following code:
#devtools::install_github("melissagwolf/dynamic")

library(dynamic)
library(lavaan)

#model statement
bfi_a_model <- "A =~ bfi_a1 + bfi_a2 + bfi_a3 + bfi_a4 + bfi_a5 + bfi_a6 + bfi_a7 + bfi_a8 + bfi_a9" 

#fit one-factor model in lavaan with MLR estimator
fit<-lavaan::cfa(bfi_a_model, data=bfi_a_data, estimator="MLR")

#traditional DFI function for one-factor model with MLR estimator
dynamic::cfaOne(fit, estimator="MLR")
#> Your DFI cutoffs: 
#>                SRMR RMSEA  CFI
#> Level 1: 95/5  .028  .047 .968
#> Level 1: 90/10   --    --   --
#> Level 2: 95/5  .038  .069 .935
#> Level 2: 90/10   --    --   --
#> Level 3: 95/5  .056  .104 .871
#> Level 3: 90/10   --    --   --
#> 
#> Empirical fit indices: 
#>  Chi-Square  df p-value   SRMR   RMSEA    CFI
#>    1588.066  27       0  0.054   0.093  0.881

To briefly explain the different levels: Level-1 is designed to be roughly consistent with traditional cutoffs and represents approximately “close” fit, while Level-2 and Level-3 represent approximately “fair” and “mediocre” fit, respectively. If the model’s fit indices exceed the Level-3 cutoffs, that suggests that the model fits poorly.

Under the normality assumption, the model’s fit indices do not meet the Level-1 or Level-2 DFI cutoffs, but they do meet the Level-3 DFI cutoffs. This suggests that there is evidence of meaningful misspecification but that the model may not be entirely worthless (Level-3 in DFI roughly corresponds to the “mediocre” fit discussed in the existing SEM literature).

However, this approach assumes multivariate normality. We know that this may not be reasonable to assume given that (a) the data are Likert and may be considered ordinal, and (b) the plots above show that most of the response distributions are not bell-shaped and are somewhat skewed. So, it is unclear how the fit indices may be affected if the item response distributions are not normal or even vaguely bell-shaped.

The nnorOne function was designed for this exact scenario: it tailors fit index cutoffs to both the model and the data characteristics. That is, we can derive DFI cutoffs that take the shape of the response distributions into account to produce more accurate cutoffs for this exact situation.

library(dynamic)
library(lavaan)

#model statement
bfi_a_model <- "A =~ bfi_a1 + bfi_a2 + bfi_a3 + bfi_a4 + bfi_a5 + bfi_a6 + bfi_a7 + bfi_a8 + bfi_a9" 

#fit one-factor model in lavaan with MLR estimator
fit<-lavaan::cfa(bfi_a_model, data=bfi_a_data, estimator="MLR")

#non-normal DFI function for one-factor model with MLR estimator
nnor<-dynamic::nnorOne(fit, data=bfi_a_data, estimator="MLR", plot=T)

#print DFI cutoffs
nnor$cutoffs
#>             SRMR  RMSEA CFI  
#> Level-0     0.009 0.01  0.999
#> Specificity 95%   95%   95%  
#>                              
#> Level-1     0.021 0.035 0.981
#> Sensitivity 95%   95%   95%  
#>                              
#> Level-2     0.028 0.05  0.964
#> Sensitivity 95%   95%   95%  
#>                              
#> Level-3     0.04  0.073 0.929
#> Sensitivity 95%   95%   95%

The nnorOne defaults are to use the MLR estimator and 500 replications in the DFI simulations. The estimator= option can be used to derive cutoffs with any estimator supported by lavaan. The plot option is not required, but it will print fit index distributions that are useful for understanding the behavior of fit indices with these specific model and data characteristics. The plot option will also create a plot called dist_plot that compares the bootstrapped data to the observed data (to view this plot, save the results to an object, then print the object name followed by $dist_plot, as shown below).
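For example, with the nnor object saved above:

#view the comparison of the bootstrapped data and the observed data (available because plot=T)
nnor$dist_plot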

In nnorOne, there will usually be four levels of misspecification (Levels 0 through 3) tested when there are 6 or more items without residual correlations in the fitted model (fewer levels are tested when there are fewer than 6 such items). The “Level-0” row corresponds to the anticipated fit index values if the fitted model were indeed the underlying population model, given the observed response distributions. For multi-factor models using the nnorHB function, the number of levels will depend on the size of the model.

When accounting for the specific response patterns in these data, we see that the DFI cutoffs are stricter. Ultimately, the cutoffs change so much that the conclusion of the fit assessment changes: whereas the DFI cutoffs assuming normality suggested fit consistent with a Level-3 misspecification, the DFI cutoffs accounting for non-normality suggest that the model fits worse than Level-3 and therefore fits poorly.

This highlights the importance of accounting for the data characteristics in addition to model characteristics – assuming normality when it is not appropriate can change DFI cutoffs and interpretations of model fit. The nnorOne and nnorHB functions can be used to incorporate information about the distributions of the observed items to further customize fit index cutoffs and obtain the most accurate conclusions possible.

Example with Manual Model Entry (manual=TRUE)

Although nnorOne and nnorHB require the user to include their dataset, it is still possible to enter a model manually for users who do not use lavaan as their primary latent variable modeling software. Using the same example as above, the standardized estimates (which could be obtained from any SEM software but are reproduced here with lavaan) are:


#model statement
bfi_a_model <- "A =~ bfi_a1 + bfi_a2 + bfi_a3 + bfi_a4 + bfi_a5 + bfi_a6 + bfi_a7 + bfi_a8 + bfi_a9" 

#fit one-factor model in lavaan with MLR estimator
fit<-lavaan::cfa(bfi_a_model, data=bfi_a_data, estimator="MLR")

#standardized estimates
lavaan::standardizedsolution(fit)
#>       lhs op    rhs est.std    se      z pvalue ci.lower ci.upper
#> 1       A =~ bfi_a1   0.548 0.011 47.751      0    0.525    0.570
#> 2       A =~ bfi_a2   0.545 0.014 40.197      0    0.519    0.572
#> 3       A =~ bfi_a3   0.513 0.012 42.123      0    0.489    0.536
#> 4       A =~ bfi_a4   0.538 0.012 44.762      0    0.514    0.562
#> 5       A =~ bfi_a5   0.495 0.013 37.939      0    0.469    0.520
#> 6       A =~ bfi_a6   0.546 0.011 48.328      0    0.524    0.568
#> 7       A =~ bfi_a7   0.670 0.011 63.320      0    0.649    0.690
#> 8       A =~ bfi_a8   0.582 0.011 50.674      0    0.560    0.605
#> 9       A =~ bfi_a9   0.541 0.012 44.863      0    0.517    0.564
#> 10 bfi_a1 ~~ bfi_a1   0.700 0.013 55.720      0    0.675    0.725
#> 11 bfi_a2 ~~ bfi_a2   0.703 0.015 47.460      0    0.673    0.732
#> 12 bfi_a3 ~~ bfi_a3   0.737 0.012 59.085      0    0.713    0.762
#> 13 bfi_a4 ~~ bfi_a4   0.711 0.013 54.955      0    0.685    0.736
#> 14 bfi_a5 ~~ bfi_a5   0.755 0.013 58.510      0    0.730    0.780
#> 15 bfi_a6 ~~ bfi_a6   0.702 0.012 56.902      0    0.678    0.726
#> 16 bfi_a7 ~~ bfi_a7   0.552 0.014 38.951      0    0.524    0.579
#> 17 bfi_a8 ~~ bfi_a8   0.661 0.013 49.397      0    0.635    0.687
#> 18 bfi_a9 ~~ bfi_a9   0.708 0.013 54.342      0    0.682    0.733
#> 19      A ~~      A   1.000 0.000     NA     NA    1.000    1.000

In lavaan syntax, the model with standardized estimates would be written out as

manmod <- "
A =~ .55*bfi_a1 + .55*bfi_a2 + .51*bfi_a3 + .54*bfi_a4 + .50*bfi_a5 + .55*bfi_a6 + .67*bfi_a7 + .58*bfi_a8 +.54*bfi_a9
"

Unlike with other DFI functions, it is important that the variable labels in a manually entered model exactly match the variable names in the user’s dataset. This congruence is needed so that the Bollen-Stine bootstrap is applied to the correct variables.
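For instance, if a dataset used different column names (the q1, q2, ... names below are hypothetical), the columns could be renamed to match the manual model before calling nnorOne:

#hypothetical illustration: rename columns q1, q2, ... to match the labels used
#in the manually entered model before running nnorOne
library(dplyr)

my_data <- my_data %>%
  dplyr::rename(bfi_a1 = q1, bfi_a2 = q2, bfi_a3 = q3)  #...and so on for all nine items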

The call to nnorOne then specifies the manually entered model manmod as the model object, with manual=TRUE telling dynamic to parse a manually entered model rather than a lavaan object. The sample size option n= is also required with manual=TRUE, and the data= option is required to use the nnorOne or nnorHB functions. The full code for this example would be:


#manually entered model statement using standardized estimates
manmod <- "
A =~ .55*bfi_a1 + .55*bfi_a2 + .51*bfi_a3 + .54*bfi_a4 + .50*bfi_a5 + .55*bfi_a6 + .67*bfi_a7 + .58*bfi_a8 +.54*bfi_a9"

#DFI function with manual=TRUE
dynamic::nnorOne(manmod, data=bfi_a_data, n=6649, manual=T,estimator="MLR")
#> Your DFI cutoffs: 
#>             SRMR  RMSEA CFI  
#> Level-0     0.009 0.01  0.999
#> Specificity 95%   95%   95%  
#>                              
#> Level-1     0.021 0.035 0.982
#> Sensitivity 95%   95%   95%  
#>                              
#> Level-2     0.028 0.05  0.964
#> Sensitivity 95%   95%   95%  
#>                              
#> Level-3     0.04  0.073 0.929
#> Sensitivity 95%   95%   95%  
#> 
#>  Notes:
#>   -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
#>   -Cutoffs with 95% sensitivity are reported when possible
#>   -If sensitivity is <50%, cutoffs will be suppressed

Citation Recommendations

A paper specifically on non-normal DFI cutoffs is in progress but not yet complete, so there is not yet a citation.

To cite the ideas behind dynamic fit index cutoffs, see McNeish & Wolf (2021) for multifactor models and McNeish & Wolf (2022) for one-factor models, as referenced above.

Citation information for the dynamic R package itself, and for the packages used alongside it in this vignette (such as lavaan and semTools), can be obtained in R with the citation() function (e.g., citation("dynamic")).