SEM with a covariance matrix

Author

Ty Partridge

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lavaan)
This is lavaan 0.6-19
lavaan is FREE software! Please report any bugs.

If an article reports the correlations and standard deviations for your variables, you can convert them to a covariance matrix with the following code. You can then copy the resulting covariance matrix into the lavaan code for your model (see block 3).

# Example correlation matrix
correlation_matrix <- matrix(c(1, 0.8, 0.6,
                                0.8, 1, 0.7,
                                0.6, 0.7, 1), 
                             nrow = 3, byrow = TRUE)

# Standard deviations for the variables
std_devs <- c(2, 3, 4)

# Transform correlation matrix to covariance matrix
covariance_matrix <- correlation_matrix * (std_devs %*% t(std_devs))

# Blank out the upper triangle so only the lower half is displayed
lower_half <- covariance_matrix
lower_half[upper.tri(covariance_matrix)] <- "" # "" turns the matrix into text for display; use NA or 0 to keep it numeric

# Print the lower half
print(lower_half, quote = FALSE)
     [,1] [,2] [,3]
[1,] 4             
[2,] 4.8  9        
[3,] 4.8  8.4  16  
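
As a quick check on the conversion above, the same covariance matrix can be written as D R D, where D is a diagonal matrix of the standard deviations, and base R's cov2cor() should recover the original correlations. This is a minimal sketch that reuses the example values from above; the names D, cov_check, and cor_back are introduced here for illustration only.

```r
# Example values repeated so the block runs on its own
correlation_matrix <- matrix(c(1, 0.8, 0.6,
                               0.8, 1, 0.7,
                               0.6, 0.7, 1),
                             nrow = 3, byrow = TRUE)
std_devs <- c(2, 3, 4)

# Element-wise form: each correlation r_ij is multiplied by sd_i * sd_j
covariance_matrix <- correlation_matrix * (std_devs %*% t(std_devs))

# Equivalent matrix form: D %*% R %*% D, with D = diag(std_devs)
D <- diag(std_devs)
cov_check <- D %*% correlation_matrix %*% D

# Round trip: converting back should return the original correlations
cor_back <- cov2cor(covariance_matrix)

# Both comparisons allow for floating-point rounding
isTRUE(all.equal(covariance_matrix, cov_check))
isTRUE(all.equal(cor_back, correlation_matrix))
```

If either comparison returned FALSE, that would point to a mismatch between the order of the standard deviations and the order of the rows and columns in the correlation matrix.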

Once you have your covariance matrix, you can paste it in as shown below. (This example uses a different data set, one with a known SEM model; to get the matrix for your own variables, use the step shown in block 2 above.)

lower <- '
 11.834
  6.947   9.364
  6.819   5.091  12.532
  4.783   5.028   7.495   9.986
 -3.839  -3.889  -3.841  -3.625  9.610
-21.899 -18.831 -21.748 -18.775 35.522 450.288 '

wheaton.cov <- 
    getCov(lower, names = c("anomia67", "powerless67", 
                            "anomia71", "powerless71",
                            "education", "sei"))

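If the source article reports correlations and standard deviations, getCov() can also do the rescaling itself through its sds argument, which would fold the conversion from block 2 into this step. The sketch below reuses the three-variable example from block 2; the variable names x1, x2, and x3 and the object names lower_cor and example.cov are placeholders for illustration, not part of any real data set.

```r
library(lavaan)

# Lower triangle of the example correlation matrix,
# entered row by row with the diagonal included
lower_cor <- '
 1
 0.8 1
 0.6 0.7 1 '

# Supplying sds tells getCov() to treat the input as correlations
# and rescale them to covariances
example.cov <- getCov(lower_cor,
                      sds = c(2, 3, 4),
                      names = c("x1", "x2", "x3"))
example.cov
```

The result should match the covariance matrix from block 2, with row and column names attached so that it can be passed to sample.cov directly.
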
Next, specify the model just as you would if you had the full raw data. The only difference comes when you create the fit object (i.e., fit): instead of the data argument, you use “sample.cov = covariance object name” and “sample.nobs = sample size”.

# classic wheaton et al. model
wheaton.model <- '
  # latent variables
    ses     =~ education + sei
    alien67 =~ anomia67 + powerless67
    alien71 =~ anomia71 + powerless71
  # regressions
    alien71 ~ alien67 + ses
    alien67 ~ ses
  # correlated residuals
    anomia67 ~~ anomia71
    powerless67 ~~ powerless71
'
fit <- sem(wheaton.model, 
           sample.cov = wheaton.cov, 
           sample.nobs = 932, mimic = "Mplus")
Warning: lavaan->lav_lavaan_step05_samplestats():  
   sample.mean= argument is missing, but model contains mean/intercept 
   parameters.
summary(fit, fit.measures = TRUE, standardized = TRUE)
lavaan 0.6-19 ended normally after 84 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        23

  Number of observations                           932

Model Test User Model:
                                                      
  Test statistic                                 4.735
  Degrees of freedom                                 4
  P-value (Chi-square)                           0.316

Model Test Baseline Model:

  Test statistic                              2133.722
  Degrees of freedom                                15
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       0.999

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)             -15213.274
  Loglikelihood unrestricted model (H1)     -15210.906
                                                      
  Akaike (AIC)                               30472.548
  Bayesian (BIC)                             30583.807
  Sample-size adjusted Bayesian (SABIC)      30510.761

Root Mean Square Error of Approximation:

  RMSEA                                          0.014
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.053
  P-value H_0: RMSEA <= 0.050                    0.930
  P-value H_0: RMSEA >= 0.080                    0.001

Standardized Root Mean Square Residual:

  SRMR                                           0.007

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  ses =~                                                                
    education         1.000                               2.607    0.842
    sei               5.219    0.422   12.364    0.000   13.609    0.642
  alien67 =~                                                            
    anomia67          1.000                               2.663    0.774
    powerless67       0.979    0.062   15.895    0.000    2.606    0.852
  alien71 =~                                                            
    anomia71          1.000                               2.850    0.805
    powerless71       0.922    0.059   15.498    0.000    2.628    0.832

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  alien71 ~                                                             
    alien67           0.607    0.051   11.898    0.000    0.567    0.567
    ses              -0.227    0.052   -4.334    0.000   -0.207   -0.207
  alien67 ~                                                             
    ses              -0.575    0.056  -10.195    0.000   -0.563   -0.563

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .anomia67 ~~                                                           
   .anomia71          1.623    0.314    5.176    0.000    1.623    0.356
 .powerless67 ~~                                                        
   .powerless71       0.339    0.261    1.298    0.194    0.339    0.121

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .education         0.000    0.101    0.000    1.000    0.000    0.000
   .sei               0.000    0.695    0.000    1.000    0.000    0.000
   .anomia67          0.000    0.113    0.000    1.000    0.000    0.000
   .powerless67       0.000    0.100    0.000    1.000    0.000    0.000
   .anomia71          0.000    0.116    0.000    1.000    0.000    0.000
   .powerless71       0.000    0.103    0.000    1.000    0.000    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .education         2.801    0.507    5.525    0.000    2.801    0.292
   .sei             264.597   18.126   14.597    0.000  264.597    0.588
   .anomia67          4.731    0.453   10.441    0.000    4.731    0.400
   .powerless67       2.563    0.403    6.359    0.000    2.563    0.274
   .anomia71          4.399    0.515    8.542    0.000    4.399    0.351
   .powerless71       3.070    0.434    7.070    0.000    3.070    0.308
    ses               6.798    0.649   10.475    0.000    1.000    1.000
   .alien67           4.841    0.467   10.359    0.000    0.683    0.683
   .alien71           4.083    0.404   10.104    0.000    0.503    0.503
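
The warning printed when the model was fitted appears to arise because mimic = "Mplus" asks for a mean structure while only a covariance matrix was supplied; the intercepts of 0.000 in the output are consistent with that. One possible way to avoid it, assuming the means are not of interest, is to fit with lavaan's defaults and leave out the mimic argument. The sketch below repeats the data and model so that it runs on its own; the name fit_nomeans is introduced here for illustration, and the fit will have fewer parameters than the output above because no intercepts are estimated.

```r
library(lavaan)

# Same lower-triangle covariance input as above
lower <- '
 11.834
  6.947   9.364
  6.819   5.091  12.532
  4.783   5.028   7.495   9.986
 -3.839  -3.889  -3.841  -3.625  9.610
-21.899 -18.831 -21.748 -18.775 35.522 450.288 '

wheaton.cov <-
    getCov(lower, names = c("anomia67", "powerless67",
                            "anomia71", "powerless71",
                            "education", "sei"))

# Same model as above
wheaton.model <- '
  # latent variables
    ses     =~ education + sei
    alien67 =~ anomia67 + powerless67
    alien71 =~ anomia71 + powerless71
  # regressions
    alien71 ~ alien67 + ses
    alien67 ~ ses
  # correlated residuals
    anomia67 ~~ anomia71
    powerless67 ~~ powerless71
'

# Assumption: no mean structure is needed, so mimic = "Mplus" is left out
fit_nomeans <- sem(wheaton.model,
                   sample.cov = wheaton.cov,
                   sample.nobs = 932)

# Pull a few fit indices instead of printing the full summary
fitMeasures(fit_nomeans, c("chisq", "df", "pvalue", "cfi", "rmsea", "srmr"))
```

If the Mplus-style output is wanted, the alternative is to keep mimic = "Mplus" and also supply the variable means from the article through the sample.mean argument.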