Philip D Parker
07 Nov 2013
Var1 Var2 Var3 Var4 Var5 Var6
Person1 -1.91 0.42 -0.82 -0.32 -1.13 -0.96
Person2 0.09 0.21 0.26 -0.08 -0.23 0.09
Person3 1.19 0.49 -0.69 0.38 0.58 0.17
Person4 0.73 1.42 1.66 0.06 2.04 1.11
Person5 -0.49 -0.47 -0.73 -0.72 0.16 -1.54
Person6 -0.01 -0.09 0.20 -0.53 0.78 0.67
Person7 0.02 -0.57 -0.51 1.50 0.24 0.24
Person8 0.66 0.31 1.17 0.05 0.24 1.62
Person9 0.89 0.04 0.46 -0.73 0.23 0.31
Person10 1.18 1.31 2.05 0.82 -0.16 0.60
Var1 Var2 Var3 Var4 Var5 Var6
Var1 0.804 0.399 0.500 0.367 0.451 0.510
Var2 0.399 0.833 0.433 0.283 0.372 0.377
Var3 0.500 0.433 0.805 0.339 0.551 0.543
Var4 0.367 0.283 0.339 0.733 0.332 0.341
Var5 0.451 0.372 0.551 0.332 0.780 0.556
Var6 0.510 0.377 0.543 0.341 0.556 0.825
We also have a model we want to fit to our data:
SEM tests how closely the covariance matrix expected under this model reproduces the observed covariances.
The expected covariance matrix formula is:
\[
\Sigma = \Lambda \Phi {\Lambda }' + \Theta
\]
But this model is NOT identified. As the latent variables are unobserved and have no scale of their own (more unknowns than knowns), there is an infinite set of solutions.
Typically one of the loadings is fixed to 1, but in this instance I fixed the latent variance to 1 (\( \Phi = 1 \)) so I could make things a little simpler:
\[
\Sigma = \Lambda {\Lambda }' + \Theta
\]
The model formula:
\[
\Sigma = \Lambda {\Lambda }' + \Theta
\]
is a compact representation of:
\[
\begin{Bmatrix}
\lambda_{11}
\\ \lambda_{21}
\\ \lambda_{31}
\\ \lambda_{41}
\\ \lambda_{51}
\\ \lambda_{61}
\end{Bmatrix}
\times
\begin{Bmatrix}
\lambda_{11} & \lambda_{21} & \lambda_{31} & \lambda_{41} & \lambda_{51} &\lambda_{61}
\end{Bmatrix}
+
\begin{Bmatrix}
\delta _{1} & & & & & \\
& \delta _{2} & & & & \\
& & \delta _{3} & & & \\
& & &\delta _{4} & & \\
& & & &\delta _{5} & \\
& & & & &\delta _{6}
\end{Bmatrix}
\]
Let's guess some values, say .7 for the factor loadings.
With standardized variables, a good guess for each error variance is \( 1-\lambda^2 \), so .51 for the errors.
\[
\begin{Bmatrix}.7
\\ .7
\\ .7
\\ .7
\\ .7
\\ .7
\end{Bmatrix}
\times
\begin{Bmatrix}
.7 & .7 & .7 & .7 & .7 &.7
\end{Bmatrix}
+
\begin{Bmatrix}
.51 & & & & & \\
& .51 & & & & \\
& & .51 & & & \\
& & &.51 & & \\
& & & &.51 & \\
& & & & &.51
\end{Bmatrix}
\]
This gives us:
Var1 Var2 Var3 Var4 Var5 Var6
Var1 1.00 0.49 0.49 0.49 0.49 0.49
Var2 0.49 1.00 0.49 0.49 0.49 0.49
Var3 0.49 0.49 1.00 0.49 0.49 0.49
Var4 0.49 0.49 0.49 1.00 0.49 0.49
Var5 0.49 0.49 0.49 0.49 1.00 0.49
Var6 0.49 0.49 0.49 0.49 0.49 1.00
Which is reasonably close to the observed covariance matrix:
Var1 Var2 Var3 Var4 Var5 Var6
Var1 0.80 0.40 0.50 0.37 0.45 0.51
Var2 0.40 0.83 0.43 0.28 0.37 0.38
Var3 0.50 0.43 0.80 0.34 0.55 0.54
Var4 0.37 0.28 0.34 0.73 0.33 0.34
Var5 0.45 0.37 0.55 0.33 0.78 0.56
Var6 0.51 0.38 0.54 0.34 0.56 0.82
Not Bad! We could do better though.
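The starting-value calculation above can be sketched in a few lines. This is only an illustration in Python (the slides themselves used R), and the function name `implied_cov` is an assumption, not something taken from the original code.

```python
def implied_cov(loadings, residuals):
    """Sigma = Lambda Lambda' + Theta for one factor with variance fixed to 1.

    Theta is diagonal, so residuals are added on the diagonal only.
    """
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (residuals[i] if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]


# Start values from the slide: .7 for every loading, 1 - .7^2 = .51 for every error.
guess = implied_cov([0.7] * 6, [0.51] * 6)
for row in guess:
    print(" ".join(f"{value:.2f}" for value in row))
```

Keeping the residuals on the diagonal matters: adding them as a plain vector to every element of a row would inflate all the off-diagonal covariances.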
ML estimation uses an iterative optimizer (gradient descent, for example) that seeks to minimize:
\[
D_{ML} = \log\left | \Sigma \right | + \mathrm{tr}(S\Sigma ^{-1}) - \log\left | S \right | - k
\]
The goal of ML is to minimize the difference between the expected covariance matrix \( \Sigma \) and the observed one \( S \); here \( k \) is the number of observed variables (6 in our case).
ML can give false solutions if it gets 'stuck' at a local minimum!
In practice, however, using multiple start values and allowing the computer to pick start values makes this unlikely.
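As a rough sketch of the discrepancy function, here is one way to evaluate \( D_{ML} \) with nothing but the Python standard library. It is not the R code behind these slides: the helper names (`cholesky`, `solve_spd`, `d_ml`, `implied_cov`) are assumptions, and a Cholesky factorization stands in for whatever determinant and inverse routines the original used. The estimates plugged in are the rounded values reported in the results table of these slides.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L' = a (a must be symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][m] * low[j][m] for m in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def log_det(a):
    """log|a| from the Cholesky factor: 2 * sum(log L_ii)."""
    return 2.0 * sum(math.log(row[i]) for i, row in enumerate(cholesky(a)))


def solve_spd(a, b):
    """Solve a x = b by forward then backward substitution."""
    n = len(a)
    low = cholesky(a)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][m] * y[m] for m in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[m][i] * x[m] for m in range(i + 1, n))) / low[i][i]
    return x


def d_ml(s, sigma):
    """D_ML = log|Sigma| + tr(S Sigma^-1) - log|S| - k."""
    k = len(s)
    # tr(S Sigma^-1) = tr(Sigma^-1 S): solve Sigma x = column j of S, keep x[j].
    trace = sum(solve_spd(sigma, [s[i][j] for i in range(k)])[j] for j in range(k))
    return log_det(sigma) + trace - log_det(s) - k


def implied_cov(loadings, residuals):
    """Lambda Lambda' + diag(Theta) for a one-factor model."""
    k = len(loadings)
    return [[loadings[i] * loadings[j] + (residuals[i] if i == j else 0.0)
             for j in range(k)] for i in range(k)]


# Observed covariance matrix S, copied from the slides.
S = [
    [0.804, 0.399, 0.500, 0.367, 0.451, 0.510],
    [0.399, 0.833, 0.433, 0.283, 0.372, 0.377],
    [0.500, 0.433, 0.805, 0.339, 0.551, 0.543],
    [0.367, 0.283, 0.339, 0.733, 0.332, 0.341],
    [0.451, 0.372, 0.551, 0.332, 0.780, 0.556],
    [0.510, 0.377, 0.543, 0.341, 0.556, 0.825],
]

d_guess = d_ml(S, implied_cov([0.7] * 6, [0.51] * 6))
d_est = d_ml(S, implied_cov([0.673, 0.549, 0.748, 0.476, 0.720, 0.742],
                            [0.351, 0.532, 0.245, 0.507, 0.262, 0.274]))
print(f"D_ML at the start values: {d_guess:.4f}")
print(f"D_ML at the reported estimates: {d_est:.4f}")
```

An optimizer would call `d_ml` repeatedly while adjusting the 12 free parameters; the discrepancy is zero only when \( \Sigma \) equals \( S \) exactly.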
I coded up my own \( D_{ML} \) function and used an optimizer in R:
loadings SE Residual SE
L1 0.673 0.042 0.351 0.031
L2 0.549 0.046 0.532 0.043
L3 0.748 0.040 0.245 0.025
L4 0.476 0.044 0.507 0.040
L5 0.720 0.040 0.262 0.026
L6 0.742 0.041 0.274 0.027
Some fairly complicated math gives the standard errors, but all code is available with these slides.
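Plugging these (rounded) estimates into \( \Lambda {\Lambda }' + \Theta \), with the residuals on the diagonal only, gives us:
Var1 Var2 Var3 Var4 Var5 Var6
Var1 0.80 0.37 0.50 0.32 0.48 0.50
Var2 0.37 0.83 0.41 0.26 0.40 0.41
Var3 0.50 0.41 0.80 0.36 0.54 0.56
Var4 0.32 0.26 0.36 0.73 0.34 0.35
Var5 0.48 0.40 0.54 0.34 0.78 0.53
Var6 0.50 0.41 0.56 0.35 0.53 0.82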
Which is much closer to:
Var1 Var2 Var3 Var4 Var5 Var6
Var1 0.80 0.40 0.50 0.37 0.45 0.51
Var2 0.40 0.83 0.43 0.28 0.37 0.38
Var3 0.50 0.43 0.80 0.34 0.55 0.54
Var4 0.37 0.28 0.34 0.73 0.33 0.34
Var5 0.45 0.37 0.55 0.33 0.78 0.56
Var6 0.51 0.38 0.54 0.34 0.56 0.82
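To make "closer" concrete, the comparison above can be summarised by the largest absolute residual between the observed and model-implied matrices. The sketch below is illustrative Python, not the slides' R code; `implied_cov` and `max_abs_residual` are hypothetical helper names, and the estimates are the rounded values from the results table.

```python
def implied_cov(loadings, residuals):
    """Lambda Lambda' + diag(Theta) for a one-factor model with unit factor variance."""
    k = len(loadings)
    return [[loadings[i] * loadings[j] + (residuals[i] if i == j else 0.0)
             for j in range(k)] for i in range(k)]


def max_abs_residual(observed, implied):
    """Largest |S_ij - Sigma_ij| over all cells."""
    return max(abs(o - m)
               for obs_row, imp_row in zip(observed, implied)
               for o, m in zip(obs_row, imp_row))


# Observed covariance matrix S, copied from the slides.
S = [
    [0.804, 0.399, 0.500, 0.367, 0.451, 0.510],
    [0.399, 0.833, 0.433, 0.283, 0.372, 0.377],
    [0.500, 0.433, 0.805, 0.339, 0.551, 0.543],
    [0.367, 0.283, 0.339, 0.733, 0.332, 0.341],
    [0.451, 0.372, 0.551, 0.332, 0.780, 0.556],
    [0.510, 0.377, 0.543, 0.341, 0.556, 0.825],
]

sigma_guess = implied_cov([0.7] * 6, [0.51] * 6)
sigma_est = implied_cov([0.673, 0.549, 0.748, 0.476, 0.720, 0.742],
                        [0.351, 0.532, 0.245, 0.507, 0.262, 0.274])

print(f"worst residual, start values: {max_abs_residual(S, sigma_guess):.3f}")
print(f"worst residual, ML estimates: {max_abs_residual(S, sigma_est):.3f}")
```

The largest residual is only a quick descriptive check; the formal test of fit uses the discrepancy function itself.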
Great! We have found an optimal fit given the model we hypothesised... But did we hypothesise the right model?! In other words, is our expected covariance matrix so close to the observed one that we can say the difference is due to chance?
\( \chi^2 \) takes the minimized value of my ML discrepancy function and is given by:
\[ D_{ml} \times (N-1) \]
chi-square = 21.73
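For the p-value we first need the \( df \): the number of unique elements in the covariance matrix, \( p \), minus the number of free parameters, \( k \) (6 loadings and 6 residual variances): \[ df = p - k = (\frac{6\times(6+1)}{2}) - (6+6) = 9 \]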
We can then use the \( \chi^2 \) and \( df \) to give the p-value:
p value = 0.01
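The degrees of freedom and the p-value can be checked with a short Python sketch. The slides presumably relied on R's built-in chi-square distribution; here the upper-tail probability is computed from the closed-form finite sums that exist for integer degrees of freedom, so that only the standard library is needed. The function name `chi2_sf` is an assumption.

```python
import math


def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with a positive integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term = total = 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: 2*P(Z > sqrt(x)) + 2*phi(sqrt(x)) * sum_r x^(r - 1/2) / (1*3*...*(2r-1))
    root = math.sqrt(x)
    normal_tail = math.erfc(root / math.sqrt(2.0))  # equals 2 * P(Z > root)
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term = root
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return normal_tail + 2.0 * density * total


n_vars = 6
knowns = n_vars * (n_vars + 1) // 2   # unique variances and covariances
free_params = 6 + 6                   # loadings plus residual variances
df = knowns - free_params

p_value = chi2_sf(21.73, df)          # chi-square value reported in the slides
print(f"df = {df}, p = {p_value:.4f}")
```

The chi-square value itself is taken as given, because computing it from \( D_{ML} \) needs the sample size.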
Note the role of N in this equation: for the same discrepancy, a bigger sample gives a bigger \( \chi^{2} \).
David Kenny provides a great reference for fit measures.
RMSEA:
\[
\frac{\sqrt{\chi^2-df}}{\sqrt{df(N-1)}}
\]
RMSEA = 0.063
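The RMSEA formula translates directly into code. In this Python sketch the function name `rmsea` is an assumption, and so is the sample size: the slides do not state N, and 357 is a hypothetical value used only to show a call. The `max(..., 0)` guard is a common convention (it returns 0 when \( \chi^2 < df \)) and is not part of the formula as written above.

```python
import math


def rmsea(chi2, df, n):
    """sqrt(chi2 - df) / sqrt(df * (N - 1)), floored at zero."""
    return math.sqrt(max(chi2 - df, 0.0)) / math.sqrt(df * (n - 1))


# ASSUMPTION: n = 357 is a hypothetical sample size, not taken from the slides.
print(round(rmsea(21.73, 9, 357), 3))
```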
Incremental fit requires the estimation of another model: a variance-only (null) model, in which all covariances are fixed to zero.
Var1 Var2 Var3 Var4 Var5 Var6
Var1 0.8 0.00 0.00 0.00 0.00 0.00
Var2 0.0 0.83 0.00 0.00 0.00 0.00
Var3 0.0 0.00 0.81 0.00 0.00 0.00
Var4 0.0 0.00 0.00 0.73 0.00 0.00
Var5 0.0 0.00 0.00 0.00 0.78 0.00
Var6 0.0 0.00 0.00 0.00 0.00 0.83
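We can go further from here only if the null RMSEA is large enough; Kenny notes that incremental fit indices are not informative when it is below about .158.
Here it is 0.4312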
With subscript \( n \) for the null model and \( h \) for the hypothesised model:
CFI: \( \frac{(\chi^2_n - df_n) - (\chi^2_h - df_h)}{\chi^2_n - df_n} \)
TLI: \( \frac{(\chi^2_n /df_n) - (\chi^2_h/df_h)}{(\chi^2_n /df_n) - 1} \)
In our case:
CFI = 0.987
TLI = 0.979
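Both incremental indices are simple functions of the two chi-square values. The Python sketch below uses assumed function names (`cfi`, `tli`) and writes TLI with the usual \( -1 \) in the denominator. The null model's chi-square is not reported in these slides, so `chi2_null = 994.0` is a made-up placeholder used only to show a call; the null df of 15 follows from 21 unique elements minus 6 estimated variances.

```python
def cfi(chi2_null, df_null, chi2_hyp, df_hyp):
    """Comparative fit index: relative reduction in (chi-square - df)."""
    null_excess = chi2_null - df_null
    return (null_excess - (chi2_hyp - df_hyp)) / null_excess


def tli(chi2_null, df_null, chi2_hyp, df_hyp):
    """Tucker-Lewis index: based on chi-square / df ratios."""
    null_ratio = chi2_null / df_null
    return (null_ratio - chi2_hyp / df_hyp) / (null_ratio - 1.0)


# ASSUMPTION: 994.0 is a hypothetical null chi-square, not a value from the slides.
chi2_null, df_null = 994.0, 15
chi2_hyp, df_hyp = 21.73, 9

print(f"CFI = {cfi(chi2_null, df_null, chi2_hyp, df_hyp):.3f}")
print(f"TLI = {tli(chi2_null, df_null, chi2_hyp, df_hyp):.3f}")
```

TLI penalises model complexity through the chi-square / df ratios, which is why it usually comes out a little lower than CFI for the same model.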
ESEM:
BSEM
Cronbach's alpha has many problems:
Model-based approaches from CFA are also possible that do NOT require \( \tau \) equivalence: \[ \Omega = \frac{(\sum \lambda)^2}{(\sum \lambda)^2 + \sum \delta } \]
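As an illustration, the omega formula can be applied to the loadings and residual variances estimated earlier in these slides. This is a Python sketch with an assumed function name (`omega`), and it reads the formula as the squared sum of the loadings over that squared sum plus the sum of the residual variances.

```python
def omega(loadings, residuals):
    """Model-based reliability: (sum of loadings)^2 / ((sum of loadings)^2 + sum of residuals)."""
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(residuals))


# Rounded estimates from the ML results table in these slides.
loadings = [0.673, 0.549, 0.748, 0.476, 0.720, 0.742]
residuals = [0.351, 0.532, 0.245, 0.507, 0.262, 0.274]

print(f"omega = {omega(loadings, residuals):.3f}")
```

Because each item keeps its own loading, no assumption that all loadings are equal is needed.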