The lavaan
package also supports multiple groups SEM by adding group
argument in the cfa()
function.
In this practice, we will fit a CFA model with the HolzingerSwineford1939
data set.
setwd("D:/Class Materials & Work/Summer 2020 practice/SEM/Multiple Group CFA")
library(lavaan)
library(semTools)
First, we will specify the model.
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
Then, we will fit the CFA model as usual.
fit <- cfa(HS.model,
data = HolzingerSwineford1939,
group = "school")
summary(fit)
## lavaan 0.6-6 ended normally after 57 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of free parameters 60
##
## Number of observations per group:
## Pasteur 156
## Grant-White 145
##
## Model Test User Model:
##
## Test statistic 115.851
## Degrees of freedom 48
## P-value (Chi-square) 0.000
## Test statistic for each group:
## Pasteur 64.309
## Grant-White 51.542
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
##
## Group 1 [Pasteur]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## visual =~
## x1 1.000
## x2 0.394 0.122 3.220 0.001
## x3 0.570 0.140 4.076 0.000
## textual =~
## x4 1.000
## x5 1.183 0.102 11.613 0.000
## x6 0.875 0.077 11.421 0.000
## speed =~
## x7 1.000
## x8 1.125 0.277 4.057 0.000
## x9 0.922 0.225 4.104 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## visual ~~
## textual 0.479 0.106 4.531 0.000
## speed 0.185 0.077 2.397 0.017
## textual ~~
## speed 0.182 0.069 2.628 0.009
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .x1 4.941 0.095 52.249 0.000
## .x2 5.984 0.098 60.949 0.000
## .x3 2.487 0.093 26.778 0.000
## .x4 2.823 0.092 30.689 0.000
## .x5 3.995 0.105 38.183 0.000
## .x6 1.922 0.079 24.321 0.000
## .x7 4.432 0.087 51.181 0.000
## .x8 5.563 0.078 71.214 0.000
## .x9 5.418 0.079 68.440 0.000
## visual 0.000
## textual 0.000
## speed 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .x1 0.298 0.232 1.286 0.198
## .x2 1.334 0.158 8.464 0.000
## .x3 0.989 0.136 7.271 0.000
## .x4 0.425 0.069 6.138 0.000
## .x5 0.456 0.086 5.292 0.000
## .x6 0.290 0.050 5.780 0.000
## .x7 0.820 0.125 6.580 0.000
## .x8 0.510 0.116 4.406 0.000
## .x9 0.680 0.104 6.516 0.000
## visual 1.097 0.276 3.967 0.000
## textual 0.894 0.150 5.963 0.000
## speed 0.350 0.126 2.778 0.005
##
##
## Group 2 [Grant-White]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## visual =~
## x1 1.000
## x2 0.736 0.155 4.760 0.000
## x3 0.925 0.166 5.583 0.000
## textual =~
## x4 1.000
## x5 0.990 0.087 11.418 0.000
## x6 0.963 0.085 11.377 0.000
## speed =~
## x7 1.000
## x8 1.226 0.187 6.569 0.000
## x9 1.058 0.165 6.429 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## visual ~~
## textual 0.408 0.098 4.153 0.000
## speed 0.276 0.076 3.639 0.000
## textual ~~
## speed 0.222 0.073 3.022 0.003
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .x1 4.930 0.095 51.696 0.000
## .x2 6.200 0.092 67.416 0.000
## .x3 1.996 0.086 23.195 0.000
## .x4 3.317 0.093 35.625 0.000
## .x5 4.712 0.096 48.986 0.000
## .x6 2.469 0.094 26.277 0.000
## .x7 3.921 0.086 45.819 0.000
## .x8 5.488 0.087 63.174 0.000
## .x9 5.327 0.085 62.571 0.000
## visual 0.000
## textual 0.000
## speed 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .x1 0.715 0.126 5.676 0.000
## .x2 0.899 0.123 7.339 0.000
## .x3 0.557 0.103 5.409 0.000
## .x4 0.315 0.065 4.870 0.000
## .x5 0.419 0.072 5.812 0.000
## .x6 0.406 0.069 5.880 0.000
## .x7 0.600 0.091 6.584 0.000
## .x8 0.401 0.094 4.249 0.000
## .x9 0.535 0.089 6.010 0.000
## visual 0.604 0.160 3.762 0.000
## textual 0.942 0.152 6.177 0.000
## speed 0.461 0.118 3.910 0.000
fitmeasures(fit, fit.measures = "all", output = "text")
##
## Model Test User Model:
##
## Test statistic 115.851
## Degrees of freedom 48
## P-value 0.000
##
## Model Test Baseline Model:
##
## Test statistic 957.769
## Degrees of freedom 72
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.923
## Tucker-Lewis Index (TLI) 0.885
## Bentler-Bonett Non-normed Fit Index (NNFI) 0.885
## Bentler-Bonett Normed Fit Index (NFI) 0.879
## Parsimony Normed Fit Index (PNFI) 0.586
## Bollen's Relative Fit Index (RFI) 0.819
## Bollen's Incremental Fit Index (IFI) 0.925
## Relative Noncentrality Index (RNI) 0.923
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3682.198
## Loglikelihood unrestricted model (H1) -3624.272
##
## Akaike (AIC) 7484.395
## Bayesian (BIC) 7706.822
## Sample-size adjusted Bayesian (BIC) 7516.536
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.097
## 90 Percent confidence interval - lower 0.075
## 90 Percent confidence interval - upper 0.120
## P-value RMSEA <= 0.05 0.001
##
## Standardized Root Mean Square Residual:
##
## RMR 0.083
## RMR (No Mean) 0.091
## SRMR 0.068
##
## Other Fit Indices:
##
## Hoelter Critical N (CN) alpha = 0.05 170.324
## Hoelter Critical N (CN) alpha = 0.01 192.439
##
## Goodness of Fit Index (GFI) 0.995
## Adjusted Goodness of Fit Index (AGFI) 0.989
## Parsimony Goodness of Fit Index (PGFI) 0.442
##
## McDonald Fit Index (MFI) 0.893
##
If you want to fix parameters, or provide starting values, you can use the same pre-multiplication techniques, but the single argument is now replaced by a vector of arguments, one for each group.
HS.model_2 <- ' visual =~ x1 + 0.5*x2 + c(0.6, 0.8)*x3
textual =~ x4 + start(c(1.2, 0.6))*x5 + a*x6
speed =~ x7 + x8 + x9 '
In the definition of the latent factor visual
, we have fixed the factor loading of the indicator x3
to the value ‘0.6’ in the first group, and to the value ‘0.8’ in the second group, while the factor loading of the indicator x2
is fixed to the value ‘0.5’ in both groups.
In the definition of the textual
factor, two different starting values are provided for the x5
indicator; one for each group. In addition, we have labeled the factor loading of the x6
indicator as ‘a’, but this label is only given to the parameter of the first group.
If you want to provide labels to each of the two groups, you can write something like c(a1,a2)*x6
Be careful: if you write c(a,a)*x6
, both parameters (in the first and second group) will get the same label, and hence they will be treated as a single parameter.
To verify the effects of these modifiers, we refit the model:
fit_2 <- cfa(HS.model_2,
data = HolzingerSwineford1939,
group = "school")
summary(fit_2)
## lavaan 0.6-6 ended normally after 45 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of free parameters 56
##
## Number of observations per group:
## Pasteur 156
## Grant-White 145
##
## Model Test User Model:
##
## Test statistic 118.976
## Degrees of freedom 52
## P-value (Chi-square) 0.000
## Test statistic for each group:
## Pasteur 64.901
## Grant-White 54.075
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
##
## Group 1 [Pasteur]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## visual =~
## x1 1.000
## x2 0.500
## x3 0.600
## textual =~
## x4 1.000
## x5 1.185 0.102 11.598 0.000
## x6 (a) 0.876 0.077 11.409 0.000
## speed =~
## x7 1.000
## x8 1.129 0.279 4.055 0.000
## x9 0.931 0.227 4.103 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## visual ~~
## textual 0.460 0.103 4.479 0.000
## speed 0.182 0.076 2.408 0.016
## textual ~~
## speed 0.181 0.069 2.625 0.009
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .x1 4.941 0.094 52.379 0.000
## .x2 5.984 0.100 59.945 0.000
## .x3 2.487 0.092 26.983 0.000
## .x4 2.823 0.092 30.689 0.000
## .x5 3.995 0.105 38.183 0.000
## .x6 1.922 0.079 24.320 0.000
## .x7 4.432 0.087 51.181 0.000
## .x8 5.563 0.078 71.214 0.000
## .x9 5.418 0.079 68.440 0.000
## visual 0.000
## textual 0.000
## speed 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .x1 0.388 0.129 3.005 0.003
## .x2 1.304 0.155 8.432 0.000
## .x3 0.965 0.120 8.016 0.000
## .x4 0.427 0.069 6.153 0.000
## .x5 0.454 0.086 5.270 0.000
## .x6 0.289 0.050 5.763 0.000
## .x7 0.824 0.124 6.617 0.000
## .x8 0.510 0.116 4.417 0.000
## .x9 0.677 0.105 6.479 0.000
## visual 1.001 0.172 5.803 0.000
## textual 0.892 0.150 5.953 0.000
## speed 0.346 0.125 2.768 0.006
##
##
## Group 2 [Grant-White]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## visual =~
## x1 1.000
## x2 0.500
## x3 0.800
## textual =~
## x4 1.000
## x5 0.990 0.087 11.425 0.000
## x6 0.963 0.085 11.374 0.000
## speed =~
## x7 1.000
## x8 1.228 0.188 6.539 0.000
## x9 1.081 0.168 6.417 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## visual ~~
## textual 0.454 0.099 4.585 0.000
## speed 0.315 0.079 4.004 0.000
## textual ~~
## speed 0.222 0.073 3.049 0.002
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .x1 4.930 0.097 50.688 0.000
## .x2 6.200 0.089 69.616 0.000
## .x3 1.996 0.086 23.223 0.000
## .x4 3.317 0.093 35.625 0.000
## .x5 4.712 0.096 48.986 0.000
## .x6 2.469 0.094 26.277 0.000
## .x7 3.921 0.086 45.819 0.000
## .x8 5.488 0.087 63.174 0.000
## .x9 5.327 0.085 62.571 0.000
## visual 0.000
## textual 0.000
## speed 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .x1 0.637 0.115 5.539 0.000
## .x2 0.966 0.120 8.076 0.000
## .x3 0.601 0.091 6.591 0.000
## .x4 0.316 0.065 4.877 0.000
## .x5 0.418 0.072 5.805 0.000
## .x6 0.407 0.069 5.887 0.000
## .x7 0.609 0.091 6.658 0.000
## .x8 0.411 0.094 4.385 0.000
## .x9 0.522 0.089 5.887 0.000
## visual 0.735 0.132 5.544 0.000
## textual 0.942 0.152 6.177 0.000
## speed 0.453 0.117 3.871 0.000
We can also fix parameters in some groups, but not all by using NA
to force a parameter to be free in one (or more) group(s). Suppose for example we have four groups. We define a latent variable (say f
) with three indicators. We wish to fix the factor loading of indicator item2
to 1.0 in all but the second group. We can write something like:
f =~ item1 + c(1,NA,1,1)*item2 + item3
We can constrain one or more parameters to be equal across groups by giving them the same label. For example, to constrain the factor loading of the indicator x3
to be equal across (two) groups, you can write:
HS.model_3 <- ' visual =~ x1 + x2 + c(v3,v3)*x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
Identical labels imply identical parameters, both within and across groups.
Although providing identical labels is a very flexible method to specify equality constraints for a few parameters, there is a more convenient way to impose equality constraints on a whole set of parameters (for example: all factor loadings, or all intercepts) We call these type of constraints group equality constraints and they can be specified by the argument group.equal
in the fitting function.
For example, to constrain (all) the factor loadings to be equal across groups, you can proceed as follows:
HS.model_4 <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
fit_3 <- cfa(HS.model,
data = HolzingerSwineford1939,
group = "school",
group.equal = c("loadings"))
summary(fit_3)
## lavaan 0.6-6 ended normally after 42 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of free parameters 60
## Number of equality constraints 6
##
## Number of observations per group:
## Pasteur 156
## Grant-White 145
##
## Model Test User Model:
##
## Test statistic 124.044
## Degrees of freedom 54
## P-value (Chi-square) 0.000
## Test statistic for each group:
## Pasteur 68.825
## Grant-White 55.219
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
##
## Group 1 [Pasteur]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## visual =~
## x1 1.000
## x2 (.p2.) 0.599 0.100 5.979 0.000
## x3 (.p3.) 0.784 0.108 7.267 0.000
## textual =~
## x4 1.000
## x5 (.p5.) 1.083 0.067 16.049 0.000
## x6 (.p6.) 0.912 0.058 15.785 0.000
## speed =~
## x7 1.000
## x8 (.p8.) 1.201 0.155 7.738 0.000
## x9 (.p9.) 1.038 0.136 7.629 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## visual ~~
## textual 0.416 0.097 4.271 0.000
## speed 0.169 0.064 2.643 0.008
## textual ~~
## speed 0.176 0.061 2.882 0.004
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .x1 4.941 0.093 52.991 0.000
## .x2 5.984 0.100 60.096 0.000
## .x3 2.487 0.094 26.465 0.000
## .x4 2.823 0.093 30.371 0.000
## .x5 3.995 0.101 39.714 0.000
## .x6 1.922 0.081 23.711 0.000
## .x7 4.432 0.086 51.540 0.000
## .x8 5.563 0.078 71.087 0.000
## .x9 5.418 0.079 68.153 0.000
## visual 0.000
## textual 0.000
## speed 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .x1 0.551 0.137 4.010 0.000
## .x2 1.258 0.155 8.117 0.000
## .x3 0.882 0.128 6.884 0.000
## .x4 0.434 0.070 6.238 0.000
## .x5 0.508 0.082 6.229 0.000
## .x6 0.266 0.050 5.294 0.000
## .x7 0.849 0.114 7.468 0.000
## .x8 0.515 0.095 5.409 0.000
## .x9 0.658 0.096 6.865 0.000
## visual 0.805 0.171 4.714 0.000
## textual 0.913 0.137 6.651 0.000
## speed 0.305 0.078 3.920 0.000
##
##
## Group 2 [Grant-White]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## visual =~
## x1 1.000
## x2 (.p2.) 0.599 0.100 5.979 0.000
## x3 (.p3.) 0.784 0.108 7.267 0.000
## textual =~
## x4 1.000
## x5 (.p5.) 1.083 0.067 16.049 0.000
## x6 (.p6.) 0.912 0.058 15.785 0.000
## speed =~
## x7 1.000
## x8 (.p8.) 1.201 0.155 7.738 0.000
## x9 (.p9.) 1.038 0.136 7.629 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## visual ~~
## textual 0.437 0.099 4.423 0.000
## speed 0.314 0.079 3.958 0.000
## textual ~~
## speed 0.226 0.072 3.144 0.002
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .x1 4.930 0.097 50.763 0.000
## .x2 6.200 0.091 68.379 0.000
## .x3 1.996 0.085 23.455 0.000
## .x4 3.317 0.092 35.950 0.000
## .x5 4.712 0.100 47.173 0.000
## .x6 2.469 0.091 27.248 0.000
## .x7 3.921 0.086 45.555 0.000
## .x8 5.488 0.087 63.257 0.000
## .x9 5.327 0.085 62.786 0.000
## visual 0.000
## textual 0.000
## speed 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .x1 0.645 0.127 5.084 0.000
## .x2 0.933 0.121 7.732 0.000
## .x3 0.605 0.096 6.282 0.000
## .x4 0.329 0.062 5.279 0.000
## .x5 0.384 0.073 5.270 0.000
## .x6 0.437 0.067 6.576 0.000
## .x7 0.599 0.090 6.651 0.000
## .x8 0.406 0.089 4.541 0.000
## .x9 0.532 0.086 6.202 0.000
## visual 0.722 0.161 4.490 0.000
## textual 0.906 0.136 6.646 0.000
## speed 0.475 0.109 4.347 0.000
If you omit the group.equal
argument, all parameters are freely estimated in each group (but the model structure is the same).
But what if you want to constrain a whole group of parameters (say all factor loadings and intercepts) across groups, except for one or two parameters that need to stay free in all groups. For this scenario, you can use the argument group.partial
, containing the names of those parameters that need to remain free.
fit_4 <- cfa(HS.model,
data = HolzingerSwineford1939,
group = "school",
group.equal = c("loadings", "intercepts"),
group.partial = c("visual=~x2", "x7~1"))