Spring 2020

Intro to CFA

  • Researcher uses model to confirm an existing theory/hypothesis
  • Must specify a particular model for testing. In the modeling world, to 'specify' a model means to define its structure: the number of factors, which items load on which factors, and whether the factors covary
  • Requires that the researcher have more a priori ideas about the item covariances than in the EFA approach

Model Constraints

  • May test specific hypotheses by using particular patterns of constraints in the model
  • For example, to specify which items load onto which factors, the researcher constrains the loadings of those items on all other factors in the model to 0
  • Once those constraints are in place, the remaining parameters are free to be estimated using maximum likelihood (a sketch follows below)
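To make this concrete, here is a minimal sketch of a model specification, assuming the lavaan package in R and its built-in HolzingerSwineford1939 dataset (my example choices; the slides don't prescribe a package):

```r
library(lavaan)

# Each '=~' line declares which items load on which factor; any
# loading not listed (e.g., x1 on textual) is implicitly fixed to 0.
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# cfa() leaves the remaining parameters (listed loadings, factor
# covariances, residual variances) free and estimates them by maximum
# likelihood. (By default it fixes each factor's first loading to 1
# for identification.)
fit <- cfa(model, data = HolzingerSwineford1939)
summary(fit)
```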

Model Constraints

  • The use of model constraints is an important component of CFA and one of the things that distinguishes CFA from EFA
  • Most basic hypothesis test in CFA: does the hypothesized model fit significantly better than the baseline model (the worst possible model for the data)?

Model Constraints

  • More sophisticated and/or specific hypotheses can also be tested in the CFA framework using model constraints
  • Example: Do two groups of individuals (e.g. representing two different cultures) have the same factor structure? This could be tested by constraining all factor loadings to be the same across the two groups and comparing the fit to that of the unconstrained model
  • If the unconstrained model fits significantly better than the constrained one, conclude that the two groups have different factor structures
  • This is a simplified example of a method called Measurement Invariance (or Factor Invariance) Testing (sketched below)
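A hedged sketch of the two-group comparison just described, again assuming lavaan and its HolzingerSwineford1939 data, where 'school' is the grouping variable in that dataset (standing in for, e.g., culture):

```r
library(lavaan)
model <- ' visual  =~ x1 + x2 + x3
           textual =~ x4 + x5 + x6
           speed   =~ x7 + x8 + x9 '

# Unconstrained (configural) model: same structure in both groups,
# loadings free to differ.
fit_free  <- cfa(model, data = HolzingerSwineford1939, group = "school")

# Constrained model: loadings forced to be equal across groups.
fit_equal <- cfa(model, data = HolzingerSwineford1939, group = "school",
                 group.equal = "loadings")

# Chi-square difference (likelihood-ratio) test of the nested pair;
# a significant result favors the unconstrained model.
lavTestLRT(fit_free, fit_equal)
```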

CFA Caution

  • It is important to remind you that a model can be specified in a number of ways, and just because a model has good fit doesn’t mean that it is a ‘good’ model theoretically.
  • Remember that a model is not useful if it doesn’t make sense theoretically, and the researcher is responsible for making sure this is the case.

Model Evaluation

  • How can you determine whether a hypothesis has been supported?
  • Use various fit statistics and fit indices to evaluate the model and draw conclusions about your hypotheses

Chi-Square

  • As long as your model has degrees of freedom remaining (an overidentified model; a model with 0 degrees of freedom is called a just-identified model), you can get a \(\chi^2\) value for your model
  • Represents the difference between the actual covariance structure and the covariance structure implied by the model
  • Larger values represent worse fit
  • Hope to achieve non-significance and fail to reject the null in this test of overall model fit (see the sketch below)
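A minimal sketch of pulling the \(\chi^2\) test out of a fitted model, assuming lavaan as before:

```r
library(lavaan)
model <- ' visual  =~ x1 + x2 + x3
           textual =~ x4 + x5 + x6
           speed   =~ x7 + x8 + x9 '
fit <- cfa(model, data = HolzingerSwineford1939)

# Chi-square test of exact fit: a non-significant p-value means the
# model-implied covariance matrix is not significantly different
# from the observed one.
fitMeasures(fit, c("chisq", "df", "pvalue"))
```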

Incremental Fit Indices

  • Allow you to compare the fit of your hypothesized model to the baseline model (also called the null model).
  • Tucker-Lewis Index (TLI) and the Comparative Fit Index (CFI)
  • Two slightly different formulas for model fit that generally range between 0 and 1 (definitions below). For both indices, values closer to 1 are desirable; a value of exactly 1 would suggest that the model misfit = 0
  • Many researchers agree that values should be greater than .90 for acceptable fit and greater than .95 for good model fit
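For reference, the standard definitions (not on the original slide), where \(M\) denotes the hypothesized model and \(B\) the baseline model:

\[
\text{CFI} = 1 - \frac{\max(\chi^2_M - df_M,\ 0)}{\max(\chi^2_M - df_M,\ \chi^2_B - df_B,\ 0)}
\qquad
\text{TLI} = \frac{\chi^2_B/df_B - \chi^2_M/df_M}{\chi^2_B/df_B - 1}
\]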

Incremental Fit Indices

Note: Some statisticians have pointed out that the CFI and TLI are not informative or reliable indicators of model fit unless the baseline model has an RMSEA > .158 (don't worry about the math on this one for now; a sketch of how to calculate the RMSEA for the baseline model in R follows below).
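A hedged sketch of that calculation, assuming lavaan, which stores the baseline model's \(\chi^2\) and degrees of freedom with every fit. Sources differ on using \(N\) vs. \(N-1\) in the denominator, so treat the exact value as approximate:

```r
library(lavaan)
model <- ' visual  =~ x1 + x2 + x3
           textual =~ x4 + x5 + x6
           speed   =~ x7 + x8 + x9 '
fit <- cfa(model, data = HolzingerSwineford1939)

# Baseline-model chi-square and df, stored with the fitted model.
b_chisq <- fitMeasures(fit, "baseline.chisq")
b_df    <- fitMeasures(fit, "baseline.df")
n       <- lavInspect(fit, "nobs")

# One common RMSEA formula (N - 1 convention). CFI/TLI are considered
# informative only if this value exceeds ~.158.
sqrt(max((b_chisq - b_df) / (b_df * (n - 1)), 0))
```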

Absolute Fit Indices

  • Compares model fit to the best possible model, or a model that perfectly describes the data
  • Root mean square error of approximation (RMSEA): a measure of misfit, so lower values are desirable; the best possible model would have an RMSEA of 0, meaning perfect fit (formula below)
  • A value < .05 is an indicator of good fit, and values < .08 or even .10 represent acceptable fit. Values greater than .10 should certainly give you pause.
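For reference, one standard formula (using the \(N-1\) convention; some software uses \(N\) instead):

\[
\text{RMSEA} = \sqrt{\frac{\max(\chi^2_M - df_M,\ 0)}{df_M\,(N-1)}}
\]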

Comparative Fit Indices

  • Used to compare two competing models fit to the same data to determine which fits better (unlike the \(\chi^2\) difference test, the models need not be nested)
  • Useful when you want to test competing hypotheses
  • Most popular: Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC)
  • The particular values are essentially meaningless when it comes to hypothesis testing. However, when two models are fit to the same data they can be used to compare the models: the one with the smaller AIC or BIC value is preferred (see the sketch below)
  • Both the AIC and BIC penalize overly complex models.
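A sketch of such a comparison, assuming lavaan; the one-factor alternative here is purely illustrative:

```r
library(lavaan)
model_3f <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
model_1f <- ' g =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 '

fit_3f <- cfa(model_3f, data = HolzingerSwineford1939)
fit_1f <- cfa(model_1f, data = HolzingerSwineford1939)

# The standard AIC()/BIC() generics work on lavaan fits; both add a
# penalty for extra free parameters. Prefer the model with the
# smaller value.
AIC(fit_1f); AIC(fit_3f)
BIC(fit_1f); BIC(fit_3f)
```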

Conclusions and Recommendations around Model Evaluation

  • There are many other fit statistics and fit indices in the literature; the ones I have described here are the most frequently reported
  • Increasing concern in the scientific community about which fit indices researchers choose to report, and whether some researchers pick only the ones that confirm their hypotheses
  • I recommend that you take a fully transparent approach and report all of the measures I have mentioned here, or all of those you have access to