In this practice, we will use the built-in PoliticalDemocracy
dataset, which is also used by Bollen (1989) in his book.
The figure below contains a graphical representation of the model that we want to fit.
The above model can be syntactically explained as follows:
setwd("D:/Class Materials & Work/Summer 2020 practice/SEM/SEM")
library(lavaan)
library(semPlot)
library(OpenMx)
library(tidyverse)
library(knitr)
library(kableExtra)
library(GGally)
library(ggcorrplot)
model <- '
# measurement model
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + y2 + y3 + y4
dem65 =~ y5 + y6 + y7 + y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
In the syntax above, there are three formula types, latent variabele definitions (using the =~
operator), regression formulas (using the ~
operator), and (co)variance formulas (using the ~~
operator).
Residual variance can be recognized as disturbance variable.
About covariance (~~
), The variables can be either observed or latent variables. If the two variable names are the same, the expression refers to the variance (or residual variance) of that variable. If the two variable names are different, the expression refers to the (residual) covariance among these two variables.
Note that the two expressions y2 ~~ y4
and y2 ~~ y6
, can be combined into the expression y2 ~~ y4 + y6
. This is just a shorthand notation.
We can use a correlation matrix to visually represent the dataset, as well as its IV and DV.
ggcorr(PoliticalDemocracy,
nbreaks = 6,
label = T, low = "red3", high = "green3",
label_round = 2, name = "Correlation Scale", label_alpha = T, hjust = 0.75) +
ggtitle(label = "Correlation Plot") +
theme(plot.title = element_text(hjust = 0.6)) #move the title to the middle
To fit the model and see the results we can type:
fit <- sem(model, data=PoliticalDemocracy)
summary(fit, standardized=TRUE)
## lavaan 0.6-6 ended normally after 68 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of free parameters 31
##
## Number of observations 75
##
## Model Test User Model:
##
## Test statistic 38.125
## Degrees of freedom 35
## P-value (Chi-square) 0.329
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## ind60 =~
## x1 1.000 0.670 0.920
## x2 2.180 0.139 15.742 0.000 1.460 0.973
## x3 1.819 0.152 11.967 0.000 1.218 0.872
## dem60 =~
## y1 1.000 2.223 0.850
## y2 1.257 0.182 6.889 0.000 2.794 0.717
## y3 1.058 0.151 6.987 0.000 2.351 0.722
## y4 1.265 0.145 8.722 0.000 2.812 0.846
## dem65 =~
## y5 1.000 2.103 0.808
## y6 1.186 0.169 7.024 0.000 2.493 0.746
## y7 1.280 0.160 8.002 0.000 2.691 0.824
## y8 1.266 0.158 8.007 0.000 2.662 0.828
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## dem60 ~
## ind60 1.483 0.399 3.715 0.000 0.447 0.447
## dem65 ~
## ind60 0.572 0.221 2.586 0.010 0.182 0.182
## dem60 0.837 0.098 8.514 0.000 0.885 0.885
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .y1 ~~
## .y5 0.624 0.358 1.741 0.082 0.624 0.296
## .y2 ~~
## .y4 1.313 0.702 1.871 0.061 1.313 0.273
## .y6 2.153 0.734 2.934 0.003 2.153 0.356
## .y3 ~~
## .y7 0.795 0.608 1.308 0.191 0.795 0.191
## .y4 ~~
## .y8 0.348 0.442 0.787 0.431 0.348 0.109
## .y6 ~~
## .y8 1.356 0.568 2.386 0.017 1.356 0.338
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .x1 0.082 0.019 4.184 0.000 0.082 0.154
## .x2 0.120 0.070 1.718 0.086 0.120 0.053
## .x3 0.467 0.090 5.177 0.000 0.467 0.239
## .y1 1.891 0.444 4.256 0.000 1.891 0.277
## .y2 7.373 1.374 5.366 0.000 7.373 0.486
## .y3 5.067 0.952 5.324 0.000 5.067 0.478
## .y4 3.148 0.739 4.261 0.000 3.148 0.285
## .y5 2.351 0.480 4.895 0.000 2.351 0.347
## .y6 4.954 0.914 5.419 0.000 4.954 0.443
## .y7 3.431 0.713 4.814 0.000 3.431 0.322
## .y8 3.254 0.695 4.685 0.000 3.254 0.315
## ind60 0.448 0.087 5.173 0.000 1.000 1.000
## .dem60 3.956 0.921 4.295 0.000 0.800 0.800
## .dem65 0.172 0.215 0.803 0.422 0.039 0.039
semPaths(fit, what = 'std', layout = 'tree', edge.label.cex=.9, curvePivot = TRUE)
The argument standardized = TRUE
augments the output with standardized parameter values (Std.lv
and Std.all
), but we will not use fit.measures=TRUE
in this function. Std.lv
standardizes only the the latent variables, while Std.all
standardizes both latent and observed variables.