Population & Sample Covariance

Population covariance between items \(j\) and \(k\), taken over all \(N\) units in the population:

\[\sigma_{jk} = \frac{1}{N}\sum_{i=1}^{N}(y_{ij} - \mu_j)(y_{ik} - \mu_k)\]

Sample covariance (estimating \(\sigma_{jk}\) from \(n\) observations):

\[s_{jk} = \frac{1}{n-1}\sum_{i=1}^{n}(y_{ij} - \bar{y}_j)(y_{ik} - \bar{y}_k)\]

Estimating Covariance: Worked Example 1

\(n = 4\) observations, \(\bar{x} = 5\), \(\bar{y} = 5\):

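| \(i\) | \(x_i\) | \(y_i\) | \(x_i - \bar{x}\) | \(y_i - \bar{y}\) | \((x_i-\bar{x})(y_i-\bar{y})\) |
|---|---|---|---|---|---|
| 1 | 2 | 4 | \(-3\) | \(-1\) | \(3\) |
| 2 | 4 | 5 | \(-1\) | \(0\) | \(0\) |
| 3 | 6 | 5 | \(1\) | \(0\) | \(0\) |
| 4 | 8 | 6 | \(3\) | \(1\) | \(3\) |
| Sum | | | | | \(\mathbf{6}\) |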

Estimating Covariance: Worked Example 2

\[s_{xy} = \frac{1}{n-1}\sum_{i=1}^{4}(x_i - \bar{x})(y_i - \bar{y}) = \frac{6}{4-1} = \frac{6}{3} = 2\]

```r
cov(c(2,4,6,8), c(4,5,5,6))
## [1] 2
```

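The positive covariance means that \(x\) and \(y\) tend to fall on the same side of their means together. Its size (2) is expressed in units of \(x\) times units of \(y\), so it is not a slope: it does not say how far \(y\) moves per unit of \(x\).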
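The hand calculation can be retraced step by step in R. This is a minimal sketch; the variable names (`dx`, `dy`, `cross`, `s_xy`, `v_xy`) are illustrative and not part of the original slides.

```r
# Data from the worked example
x <- c(2, 4, 6, 8)
y <- c(4, 5, 5, 6)
n <- length(x)

dx    <- x - mean(x)   # deviations of x from its mean: -3 -1 1 3
dy    <- y - mean(y)   # deviations of y from its mean: -1 0 0 1
cross <- dx * dy       # cross-products: 3 0 0 3

s_xy <- sum(cross) / (n - 1)   # sample covariance (divisor n - 1)
v_xy <- sum(cross) / n         # same sum with divisor n, as in the population formula

s_xy   # 2, the value returned by cov(x, y)
v_xy   # 1.5
```

The two divisors give different values in a sample this small; the gap shrinks as \(n\) grows.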

Correlation

Correlation is covariance scaled by the product of the standard deviations. The population version \(\rho_{jk}\) comes first, then its sample estimate \(r_{jk}\):

\[\rho_{jk} = \frac{E\!\left[(y_j - \mu_j)(y_k - \mu_k)\right]}{\sqrt{E\!\left[(y_j-\mu_j)^2\right]\cdot E\!\left[(y_k-\mu_k)^2\right]}} = \frac{\sigma_{jk}}{\sigma_j \sigma_k}\]

\[r_{jk} = \frac{\displaystyle\sum_{i=1}^{n}(y_{ij}-\bar{y}_j)(y_{ik}-\bar{y}_k)}{\displaystyle\sqrt{\sum_{i=1}^{n}(y_{ij}-\bar{y}_j)^2 \;\cdot\; \sum_{i=1}^{n}(y_{ik}-\bar{y}_k)^2}}\]
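As a check on the sample formula, the sketch below applies it to the four observations from the covariance worked example and compares the result with base R's `cor()`. The variable names are illustrative.

```r
x <- c(2, 4, 6, 8)
y <- c(4, 5, 5, 6)

dx <- x - mean(x)
dy <- y - mean(y)

# Sample correlation from sums of squares and cross-products: 6 / sqrt(20 * 2)
r_xy <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Equivalent form: covariance divided by the product of standard deviations
r_alt <- cov(x, y) / (sd(x) * sd(y))

c(r_xy, r_alt, cor(x, y))   # all three agree, about 0.949
```

The \(n - 1\) divisors cancel between numerator and denominator, which is why the sample formula needs no divisor at all.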

Covariance vs. Correlation

| | Covariance | Correlation |
|---|---|---|
| Scale | Unbounded | \(-1 \leq r \leq 1\) |
| Units | Product of item units | Unitless |
| Diagonal | \(\sigma^2_j\) (variance) | \(1\) |
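The contrast in units and diagonals can be seen with base R's `cov2cor()`. In this sketch the rescaling factor of 100 is an arbitrary choice standing in for a change of measurement units.

```r
x <- c(2, 4, 6, 8)
y <- c(4, 5, 5, 6)

S <- cov(cbind(x, y))   # covariance matrix: variances on the diagonal
R <- cov2cor(S)         # correlation matrix: ones on the diagonal

# Rescale x (a change of units): the covariance changes, the correlation does not
S_scaled <- cov(cbind(x = 100 * x, y))
R_scaled <- cov2cor(S_scaled)

c(S["x", "y"], S_scaled["x", "y"])   # 2 and 200
c(R["x", "y"], R_scaled["x", "y"])   # identical
diag(R)                              # 1 1
```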

Goal of CFA: find \(\theta\) such that \(\boldsymbol{\Sigma}(\theta) \approx \mathbf{S}\)

Measurement Model — Scalar Form

For a three-item, one-factor model, each observed item \(y_j\) is modeled as:

\[ \begin{matrix} y_1 = \tau_1 + \lambda_{1}\eta_{1} + \epsilon_{1} \\ y_2 = \tau_2 + \lambda_{2}\eta_{1} + \epsilon_{2} \\ y_3 = \tau_3 + \lambda_{3}\eta_{1} + \epsilon_{3} \end{matrix} \]

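| Symbol | Name | Interpretation |
|---|---|---|
| \(\tau\) | Intercepts | Expected item score when \(\eta = 0\) (equals the item mean when the factor mean is 0) |
| \(\lambda\) | Loadings | Expected change in the item per one-unit change in \(\eta\) (equals the item-factor correlation only in a fully standardized solution) |
| \(\eta\) | Latent factor | The underlying construct (e.g. SPSS Anxiety) |
| \(\epsilon\) | Residuals | What the factor does not explain |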
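A small simulation makes the roles of \(\tau\) and \(\lambda\) concrete. This is a sketch under stated assumptions: the parameter values, the sample size, the seed, and the normal distributions are all hypothetical choices, not values from the slides.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n      <- 5000                   # hypothetical sample size
tau    <- c(2.0, 3.0, 1.5)       # hypothetical intercepts
lambda <- c(1.0, 0.8, 0.6)       # hypothetical loadings

eta <- rnorm(n, mean = 0, sd = 1)             # latent factor scores
eps <- matrix(rnorm(n * 3, sd = 0.5), n, 3)   # residuals, independent of eta

# y_j = tau_j + lambda_j * eta + eps_j, one column per item
Y <- sweep(outer(eta, lambda), 2, tau, "+") + eps

colMeans(Y)              # close to tau, because the factor mean is 0
coef(lm(Y[, 2] ~ eta))   # intercept near tau_2, slope near lambda_2
```

In real data \(\eta\) is unobserved, so the regression in the last line is only possible here because the factor scores were simulated.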

Effect of \(\tau\)

Effect of \(\lambda\)

Measurement Model — Matrix Form

Stacking the three scalar equations of the three-item, one-factor model into vectors: \[ \begin{pmatrix} y_{1} \\ y_{2} \\ y_{3} \end{pmatrix} = \begin{pmatrix} \tau_1 \\ \tau_2 \\ \tau_3 \end{pmatrix} + \begin{pmatrix} \lambda_{1} \\ \lambda_{2} \\ \lambda_{3} \end{pmatrix} \begin{pmatrix} \eta_{1} \end{pmatrix} + \begin{pmatrix} \epsilon_{1} \\ \epsilon_{2} \\ \epsilon_{3} \end{pmatrix} \]

That is: \[\mathbf{y} = \boldsymbol{\tau} + \boldsymbol{\Lambda}\boldsymbol{\eta} + \boldsymbol{\epsilon}\]

Path diagram

Covariance Structure

The model-implied covariance matrix \(\Sigma(\theta)\) is:

\[\Sigma(\theta) = \mathbf{\Lambda \Psi \Lambda'} + \Theta_{\epsilon}\]

| Symbol | Name | Description |
|---|---|---|
| \(\Lambda\) | Factor loading matrix | Same \(\lambda\)’s from the measurement model |
| \(\Psi\) | Latent covariance matrix | Variance of \(\eta\); scalar for one factor |
| \(\Theta_{\epsilon}\) | Residual covariance matrix | Variance-covariance of residuals |

Covariance Structure — Expanded

\[ \Sigma(\theta) = \begin{pmatrix} \lambda_{1} \\ \lambda_{2} \\ \lambda_{3} \end{pmatrix} \begin{pmatrix} \psi_{11} \end{pmatrix} \begin{pmatrix} \lambda_{1} & \lambda_{2} & \lambda_{3} \end{pmatrix} + \begin{pmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \\ \theta_{31} & \theta_{32} & \theta_{33} \end{pmatrix} \]

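  • \(\boldsymbol{\Lambda}\boldsymbol{\Psi}\boldsymbol{\Lambda}'\) captures the shared variance due to the factor
  • \(\boldsymbol{\Theta}_{\epsilon}\) captures the unique variance of each item; its off-diagonal elements (residual covariances) are fixed to 0 unless explicitly freed

\[\boldsymbol{\Sigma}(\theta) = \text{Cov}(\mathbf{y}) = \boldsymbol{\Lambda}\boldsymbol{\Psi}\boldsymbol{\Lambda}' + \boldsymbol{\Theta}_{\epsilon}\]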
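The matrix product can be evaluated directly. The sketch below uses hypothetical parameter values and assumes a diagonal residual matrix (no residual covariances).

```r
Lambda <- matrix(c(1.0, 0.8, 0.6), ncol = 1)   # hypothetical loadings (3 x 1)
Psi    <- matrix(2.0, 1, 1)                    # hypothetical factor variance (1 x 1)
Theta  <- diag(c(1.0, 0.5, 0.7))               # hypothetical residual variances

common <- Lambda %*% Psi %*% t(Lambda)   # shared part due to the factor
Sigma  <- common + Theta                 # model-implied covariance matrix

Sigma
# Off-diagonal (j, k): lambda_j * lambda_k * psi_11, e.g. 0.8 * 1.0 * 2 = 1.6
# Diagonal (j, j):     lambda_j^2 * psi_11 + theta_jj, e.g. 1 * 2 + 1 = 3
```

With a diagonal residual matrix, every covariance between items is carried entirely by the factor term.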

Model Degrees of Freedom — The Logic

\[df = \underbrace{\frac{p(p+1)}{2}}_{\text{known}} \;-\; \underbrace{q}_{\text{free parameters}}\]

Known quantities = unique elements of the sample covariance matrix \(\mathbf{S}\)

For \(p\) observed variables: \(\frac{p(p+1)}{2}\) values (lower triangle + diagonal)

\[\mathbf{S} = \begin{pmatrix} s_{11} & & \\ s_{21} & s_{22} & \\ s_{31} & s_{32} & s_{33} \end{pmatrix} \quad \Rightarrow \quad \frac{3 \cdot 4}{2} = 6 \text{ unique values}\]

Free parameters \(q\) = everything CFA must estimate from those values
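The count of known quantities can be confirmed by counting the lower triangle of a \(p \times p\) matrix. In this sketch the zero matrix is only a placeholder for the shape of \(\mathbf{S}\).

```r
p <- 3
S <- matrix(0, p, p)   # placeholder; only the shape matters here

n_known <- sum(lower.tri(S, diag = TRUE))   # lower triangle plus diagonal
n_known           # 6
p * (p + 1) / 2   # 6, the closed-form count
```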

Model Degrees of Freedom — The Logic cont

\(\quad df > 0\) → model is over-identified (testable)
\(\quad df = 0\) → model is just-identified (no test possible)
\(\quad df < 0\) → model is under-identified (cannot be estimated)

Model Degrees of Freedom — Counting Parameters

For a one-factor CFA with \(p\) items (marker-variable identification: \(\lambda_1 = 1\)):

| Parameter | Symbol | Free count | Reason |
|---|---|---|---|
| Factor loadings | \(\lambda_2, \ldots, \lambda_p\) | \(p - 1\) | \(\lambda_1\) fixed to 1 |
| Factor variance | \(\psi_{11}\) | \(1\) | Scale of \(\eta\) |
| Residual variances | \(\theta_{\epsilon_1}, \ldots, \theta_{\epsilon_p}\) | \(p\) | One per item |
| Total | \(q\) | \(\mathbf{2p}\) | |

\[df = \frac{p(p+1)}{2} - 2p = \frac{p^2 + p - 4p}{2} = \frac{p(p-3)}{2}\]

Model Degrees of Freedom — Counting Parameters cont

| \(p\) | Knowns | Free \(q\) | \(df\) | Status |
|---|---|---|---|---|
| 3 | 6 | 6 | 0 | just-identified |
| 4 | 10 | 8 | 2 | over-identified |
| 5 | 15 | 10 | 5 | over-identified |
| 6 | 21 | 12 | 9 | over-identified |

Going from \(p - 1\) to \(p\) items adds \(p - 2\) \(df\) (2, then 3, then 4 in the table above), so more items give a richer test of fit.
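The counting rule is easy to tabulate. The helper names `cfa_df` and `status` below are hypothetical, and the sketch assumes a one-factor model with marker-variable identification and no residual covariances.

```r
# df for a one-factor CFA with marker-variable identification
cfa_df <- function(p) {
  knowns <- p * (p + 1) / 2   # unique elements of S
  q      <- 2 * p             # (p - 1) loadings + 1 factor variance + p residual variances
  knowns - q
}

status <- function(df) {
  if (df > 0) "over-identified" else if (df == 0) "just-identified" else "under-identified"
}

p  <- 3:6
df <- cfa_df(p)
data.frame(p = p, knowns = p * (p + 1) / 2, q = 2 * p, df = df,
           status = vapply(df, status, character(1)))

diff(df)   # 2 3 4: each step to p items adds p - 2 df
```

With two items the same rule gives \(df = -1\), which is why a single factor needs at least three indicators under this setup.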

Types of Model Fit Indices

Absolute fit — how well does \(\hat{\boldsymbol{\Sigma}}(\theta)\) reproduce the observed \(\mathbf{S}\)?

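Compares the model directly to the data, with no reference to any other model. For residual-based indices such as \(\chi^2\) and SRMR, a value of zero means perfect reproduction; for GFI, perfect reproduction corresponds to 1.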

Incremental fit (Comparative) — how much better is the target model than the worst possible model?

Scales fit relative to a baseline (null) model in which all observed variables are uncorrelated (\(\boldsymbol{\Sigma}_b = \text{diag}(\mathbf{S})\)). A value of 1 means the target model explains all the covariance structure the null model cannot.

Parsimony fit — does the model fit well given its complexity?

Penalises models for having many free parameters (\(q\)). A model that fits perfectly but uses every available degree of freedom is not parsimonious. Parsimony indices reward simpler explanations.

| Family | Logic | Representative indices |
|---|---|---|
| Absolute | \(\hat{\boldsymbol{\Sigma}} \approx \mathbf{S}\)? | \(\chi^2\), SRMR, GFI |
| Incremental | Better than null? | CFI, TLI, NFI |
| Parsimony | Fit per \(df\)? | RMSEA, AIC, BIC |

Model Fit Indices 1

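| Type | Index | Formula | Acceptable |
|---|---|---|---|
| Absolute | \(\chi^2\) | \((n-1)\,F(\mathbf{S},\,\hat{\boldsymbol{\Sigma}})\) | \(p\)-value \(> .05\) (sensitive to \(n\)) |
| Absolute | SRMR | \(\sqrt{\frac{2\sum_j\sum_{k\leq j}\left[(s_{jk}-\hat{\sigma}_{jk})/\sqrt{s_{jj}\,s_{kk}}\right]^2}{p(p+1)}}\) | \(< .08\) |
| Absolute | GFI | \(1 - \frac{\text{tr}\left[(\hat{\boldsymbol{\Sigma}}^{-1}\mathbf{S} - \mathbf{I})^2\right]}{\text{tr}\left[(\hat{\boldsymbol{\Sigma}}^{-1}\mathbf{S})^2\right]}\) (ML estimation) | \(> .90\) |
| Incremental | CFI | \(1 - \frac{\max(\chi^2_t - df_t,\; 0)}{\max(\chi^2_t - df_t,\; \chi^2_b - df_b,\; 0)}\) | \(> .95\) |
| Incremental | TLI / NNFI | \(\frac{\chi^2_b/df_b \;-\; \chi^2_t/df_t}{\chi^2_b/df_b - 1}\) | \(> .95\) |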

Subscript \(b\) = baseline (null) model; \(t\) = target model; \(p\) = number of observed variables

Model Fit Indices 2

| Type | Index | Formula | Acceptable |
|---|---|---|---|
| Incremental | NFI | \(\frac{\chi^2_b - \chi^2_t}{\chi^2_b}\) | \(> .90\) |
| Parsimony | RMSEA | \(\sqrt{\max\!\left(0,\frac{\chi^2 - df}{df(n-1)}\right)}\) | \(< .06\) |
| Parsimony | AIC | \(\chi^2 - 2\,df\) | Lower is better |
| Parsimony | BIC | \(\chi^2 - df\ln(n)\) | Lower is better |

Subscript \(b\) = baseline (null) model; \(t\) = target model; \(p\) = number of observed variables

Fit Index: \(\chi^2\) — Chi-Square Test

Absolute

\[\chi^2 = (n-1)\, F(\mathbf{S},\, \hat{\boldsymbol{\Sigma}}) \qquad df = \tfrac{p(p+1)}{2} - q\]

  • Tests whether the population covariance matrix equals the model-implied one, \(H_0\!: \boldsymbol{\Sigma} = \boldsymbol{\Sigma}(\theta)\), using \(\mathbf{S}\) as the sample estimate
  • Assesses overall fit through the discrepancy between the sample and fitted covariance matrices
  • \(H_0\): the model fits perfectly, so a non-significant \(p\)-value is desired
  • Problem: with large \(n\), trivial misfit becomes significant; with small \(n\), the test has little power and poor models can pass
  • Most useful as a relative comparison between nested models (\(\Delta\chi^2\) test)
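To make the statistic concrete, the sketch below computes \(\chi^2 = (n-1)F\) for a pair of small matrices. It assumes the maximum-likelihood discrepancy \(F_{ML} = \ln|\hat{\boldsymbol{\Sigma}}| - \ln|\mathbf{S}| + \text{tr}(\mathbf{S}\hat{\boldsymbol{\Sigma}}^{-1}) - p\), which the slides do not specify. The matrices, \(n\), and \(df\) are hypothetical and supplied by hand rather than taken from a fitted model.

```r
# ML discrepancy function (an assumption; the slides leave F unspecified)
f_ml <- function(S, Sigma) {
  p <- nrow(S)
  log(det(Sigma)) - log(det(S)) + sum(diag(S %*% solve(Sigma))) - p
}

S     <- matrix(c(1.0, 0.5, 0.5, 1.0), 2, 2)   # hypothetical sample covariance matrix
Sigma <- matrix(c(1.0, 0.3, 0.3, 1.0), 2, 2)   # hypothetical model-implied matrix
n     <- 200                                   # hypothetical sample size
df    <- 1                                     # hypothetical model df

chisq <- (n - 1) * f_ml(S, Sigma)
pval  <- pchisq(chisq, df = df, lower.tail = FALSE)
c(chisq = chisq, p = pval)

f_ml(S, S)   # 0 (up to rounding): a perfectly reproduced matrix gives chi-square 0
```

Because \(F\) is multiplied by \(n - 1\), the same discrepancy produces a larger \(\chi^2\) in a larger sample, which is the sample-size sensitivity noted above.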

Fit Index: \(\chi^2\) — Chi-Square Test

Absolute cont

Acceptable \(p > .05\)
Sensitive to Sample size \(n\), non-normality

Fit Index: SRMR — Standardized Root Mean Square Residual

Absolute

\[\text{SRMR} = \sqrt{\frac{2\sum_j\sum_{k \leq j}\left(\dfrac{s_{jk} - \hat{\sigma}_{jk}}{\sqrt{s_{jj}\,s_{kk}}}\right)^2}{p(p+1)}}\]

  • Root-mean-square discrepancy between observed and model-implied correlations
  • Standardizing makes it comparable across studies with different item scales
  • Value of 0 = perfect fit; increases as residuals grow
  • More sensitive to misspecified factor covariances than to misspecified factor loadings
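A direct implementation helps show what is being averaged. This sketch assumes residuals are standardized by the observed standard deviations, so each residual is on the correlation scale. The function name `srmr` and the two matrices are hypothetical.

```r
# SRMR over the lower triangle including the diagonal,
# with residuals standardized by the observed standard deviations
srmr <- function(S, Sigma) {
  p   <- nrow(S)
  sds <- sqrt(diag(S))
  res <- (S - Sigma) / outer(sds, sds)   # standardized residuals
  idx <- lower.tri(res, diag = TRUE)
  sqrt(2 * sum(res[idx]^2) / (p * (p + 1)))
}

S     <- matrix(c(1.0, 0.5, 0.5, 1.0), 2, 2)   # hypothetical observed matrix
Sigma <- matrix(c(1.0, 0.3, 0.3, 1.0), 2, 2)   # hypothetical model-implied matrix

srmr(S, S)       # 0: perfect reproduction
srmr(S, Sigma)   # sqrt(0.04 / 3), about 0.115
```

Here the only nonzero residual is 0.2 on the single off-diagonal element, averaged over the three unique elements.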

Fit Index: SRMR — Standardized Root Mean Square Residual

Absolute cont

Acceptable \(< .08\)
Good \(< .05\)

Fit Index: GFI — Goodness of Fit Index

Absolute

\[\text{GFI}_{ML} = 1 - \frac{\text{tr}\!\left[(\hat{\boldsymbol{\Sigma}}^{-1}\mathbf{S} - \mathbf{I})^2\right]}{\text{tr}\!\left[(\hat{\boldsymbol{\Sigma}}^{-1}\mathbf{S})^2\right]}\]

  • Proportion of variance in \(\mathbf{S}\) explained by \(\hat{\boldsymbol{\Sigma}}\); analogous to \(R^2\)
  • Ranges \([0, 1]\); higher = better fit
  • Limitation: inflated by large \(n\) and number of parameters; not recommended as a primary index in modern practice
  • AGFI (Adjusted GFI) penalises for degrees of freedom: \(\text{AGFI} = 1 - \frac{p(p+1)}{2\,df}(1-\text{GFI})\)

Fit Index: GFI — Goodness of Fit Index

Absolute cont

Acceptable \(> .90\)
Note Superseded by CFI/RMSEA in most guidelines

Fit Index: CFI — Comparative Fit Index

Incremental

\[\text{CFI} = 1 - \frac{\max(\chi^2_t - df_t,\; 0)}{\max(\chi^2_t - df_t,\; \chi^2_b - df_b,\; 0)}\]

  • Compares the target model to the null model (all covariances \(= 0\))
  • Corrects for the downward bias of NFI in small samples
  • Ranges \([0, 1]\); relatively insensitive to sample size
  • Most widely reported incremental index in CFA/SEM
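A short function shows how the truncation at zero works. The \(\chi^2\) values below are hypothetical. The denominator also includes the target model's term, a common safeguard that keeps the index inside \([0, 1]\); it makes no difference whenever the baseline misfit exceeds the target's. Returning 1 when both terms are zero is a convention assumed for this sketch.

```r
cfi <- function(chisq_t, df_t, chisq_b, df_b) {
  num <- max(chisq_t - df_t, 0)
  den <- max(chisq_t - df_t, chisq_b - df_b, 0)
  if (den == 0) return(1)   # neither model shows misfit beyond its df
  1 - num / den
}

# Hypothetical 5-item one-factor model: df_t = 5, baseline df_b = p(p - 1)/2 = 10
cfi(chisq_t = 12, df_t = 5, chisq_b = 250, df_b = 10)   # 1 - 7/240, about 0.971
cfi(chisq_t = 4,  df_t = 5, chisq_b = 250, df_b = 10)   # 1: chi-square below df is truncated
```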

Fit Index: CFI — Comparative Fit Index

Incremental cont

Acceptable \(> .95\) (Hu & Bentler, 1999)
Adequate \(> .90\)
Subscripts \(t\) = target model, \(b\) = baseline (null) model

Fit Index: TLI / NNFI — Tucker–Lewis Index

Incremental

\[\text{TLI} = \frac{\chi^2_b / df_b \;-\; \chi^2_t / df_t}{\chi^2_b / df_b \;-\; 1}\]

  • Also called Non-Normed Fit Index (NNFI)
  • Penalises model complexity: divides \(\chi^2\) by \(df\) before comparing
  • Can exceed 1 or fall below 0 for very good or very poor models
  • Preferred over NFI when comparing models of different complexity
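The same hypothetical \(\chi^2\) values show why TLI is "non-normed". The function name `tli` is illustrative.

```r
tli <- function(chisq_t, df_t, chisq_b, df_b) {
  ratio_b <- chisq_b / df_b   # baseline chi-square per df
  ratio_t <- chisq_t / df_t   # target chi-square per df
  (ratio_b - ratio_t) / (ratio_b - 1)
}

tli(12, 5, 250, 10)   # (25 - 2.4) / 24, about 0.942
tli(4,  5, 250, 10)   # (25 - 0.8) / 24, about 1.008: exceeds 1 when chi-square < df
```

No \(\max(\cdot)\) appears in the formula, so nothing stops the value from leaving \([0, 1]\).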

Fit Index: TLI / NNFI — Tucker–Lewis Index

Incremental cont

Acceptable \(> .95\)
Adequate \(> .90\)
Note Values \(> 1\) are often truncated to 1 when reported; software does not always do this automatically

Fit Index: NFI — Normed Fit Index

Incremental

\[\text{NFI} = \frac{\chi^2_b - \chi^2_t}{\chi^2_b}\]

  • Proportion of the null model’s \(\chi^2\) eliminated by the target model
  • First incremental index proposed (Bentler & Bonett, 1980)
  • Ranges strictly \([0, 1]\) — hence “normed”
  • Limitation: underestimates fit with small \(n\); does not penalise for complexity
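As a worked example with hypothetical values \(\chi^2_b = 250\) and \(\chi^2_t = 12\):

\[\text{NFI} = \frac{250 - 12}{250} = \frac{238}{250} = 0.952\]

Neither \(df_t\) nor \(df_b\) enters the calculation, which is why adding parameters can only raise NFI.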

Fit Index: NFI — Normed Fit Index

Incremental cont

Acceptable \(> .90\)
Limitation Biased downward when \(n < 200\)
Superseded by CFI, TLI for most applications

Fit Index: RMSEA — Root Mean Square Error of Approximation

Parsimony

\[\text{RMSEA} = \sqrt{\max\!\left(0,\; \frac{\chi^2 - df}{df\,(n-1)}\right)}\]

  • Measures misfit per degree of freedom — rewards parsimony
  • Acknowledges that no real-world model fits perfectly; tests “close fit”
  • 90% confidence interval routinely reported; test \(H_0\!: \text{RMSEA} \leq .05\)
  • Tends to favour models with large \(df\) (few free parameters relative to the data); can be inflated when \(df\) and \(n\) are small
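The formula translates directly into a one-line function. The inputs below are hypothetical, and the function name `rmsea` is illustrative.

```r
rmsea <- function(chisq, df, n) {
  sqrt(max(0, (chisq - df) / (df * (n - 1))))
}

rmsea(chisq = 12, df = 5, n = 200)    # sqrt(7 / 995), about 0.084
rmsea(chisq = 4,  df = 5, n = 200)    # 0: chi-square below df
rmsea(chisq = 12, df = 5, n = 1000)   # same chi-square, larger n gives a smaller value
```

The confidence interval mentioned above requires the noncentral \(\chi^2\) distribution and is not covered by this sketch.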

Fit Index: RMSEA — Root Mean Square Error of Approximation

Parsimony cont

Good \(< .05\)
Acceptable \(\leq .08\)
Poor \(> .10\)

Fit Index: AIC — Akaike Information Criterion

Parsimony

\[\text{AIC} = \chi^2 - 2\,df\]

  • Penalises fit function by the number of free parameters (\(q\)); \(df\) decreases as \(q\) increases
  • Derived from information theory (Kullback–Leibler divergence)
  • Has no absolute cutoff — used for model comparison only
  • Smaller AIC = better balance of fit and parsimony
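The \(df\)-based form can be related to the more familiar penalty on \(q\). Writing \(p^* = p(p+1)/2\) for the number of known quantities (a symbol introduced here for brevity), so that \(df = p^* - q\):

\[\chi^2 - 2\,df = \chi^2 - 2(p^* - q) = (\chi^2 + 2q) - 2p^*\]

Since \(p^*\) is the same for all models fitted to the same observed variables, ranking models by \(\chi^2 - 2\,df\) gives the same order as ranking them by \(\chi^2 + 2q\).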

Fit Index: AIC — Akaike Information Criterion

Parsimony cont

Use Compare non-nested models
Rule \(\Delta\text{AIC} > 10\) = strong preference for lower model
Note Penalty does not depend on sample size, so AIC tends to select more complex models than BIC as \(n\) grows

Fit Index: BIC — Bayesian Information Criterion

Parsimony

\[\text{BIC} = \chi^2 - df\,\ln(n)\]

  • Penalises more heavily than AIC when \(n > e^2 \approx 7.4\) (always in practice)
  • Differences in BIC between two models approximate twice the log Bayes factor
  • Favours simpler models more strongly than AIC as \(n\) grows
  • Preferred for model selection when sample size is large
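A small comparison shows how the two criteria can disagree. The sketch uses the \(\chi^2\)-based forms from these slides; the two models, their \(\chi^2\) values, and \(n\) are hypothetical.

```r
aic_chisq <- function(chisq, df)    chisq - 2 * df
bic_chisq <- function(chisq, df, n) chisq - df * log(n)

n       <- 200                        # hypothetical sample size
simple  <- list(chisq = 12, df = 5)   # hypothetical simpler model
complex <- list(chisq = 7,  df = 3)   # hypothetical model with two more free parameters

aic <- c(simple  = aic_chisq(simple$chisq, simple$df),
         complex = aic_chisq(complex$chisq, complex$df))
bic <- c(simple  = bic_chisq(simple$chisq, simple$df, n),
         complex = bic_chisq(complex$chisq, complex$df, n))

aic   # 2 vs 1: AIC slightly favours the complex model
bic   # about -14.5 vs -8.9: BIC favours the simpler model
```

The drop in \(\chi^2\) of 5 for two extra parameters is enough to offset AIC's penalty of 4 but not BIC's penalty of \(2\ln(200) \approx 10.6\).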

Fit Index: BIC — Bayesian Information Criterion

Parsimony cont

Use Compare non-nested models
Rule \(\Delta\text{BIC} > 10\) = strong evidence for lower model
vs AIC Both reward fit through the same \(\chi^2\) term; BIC's complexity penalty grows with \(\ln(n)\), while AIC's stays fixed at 2 per parameter