We will use the same ELS dataset that we used in logistic regression week. For packages, we will use tidyverse and lavaan, plus haven to read in the Stata data file.
library(tidyverse)
library(lavaan)
library(haven)
els <- read_dta("ELS_Logistic.dta")
glimpse(els)
Rows: 16,197
Columns: 55
$ stu_id <dbl> 101101, 101102, 101104, 101105, 101106, 101107, 101108…
$ sch_id <dbl> 1011, 1011, 1011, 1011, 1011, 1011, 1011, 1011, 1011, …
$ strat_id <dbl> 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101,…
$ psu <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ f1sch_id <dbl+lbl> 1011, 1011, 1011, 1011, 1011, 1011, 1011, -8, …
$ f1univ1 <dbl+lbl> 101, 101, 101, 101, 101, 101, 101, 107, 107, 101, …
$ f1univ2a <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ f1univ2b <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 7, 7, 1, 1, 1, 1, 1, 1, 1, 1,…
$ f2univ_p <dbl+lbl> 103, 101, 101, 101, 101, 101, 101, 104, 105, 101, …
$ bystuwt <dbl+lbl> 178.9513, 28.2951, 589.7248, 235.7822, 178.9513, …
$ bysex <dbl+lbl> 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 2,…
$ byrace <dbl+lbl> 5, 2, 7, 3, 4, 4, 4, 7, 4, 3, 3, 4, 3, 2, 2, 3, 3,…
$ bystlang <dbl+lbl> 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1,…
$ byses1 <dbl+lbl> -0.25, 0.58, -0.85, -0.80, -1.41, -1.07, 0.27, -…
$ byses1qu <dbl+lbl> 2, 4, 1, 1, 1, 1, 3, 2, 1, 1, 2, 2, 1, 2, 4, 3, 3,…
$ bygrdrpt <dbl+lbl> 0, 0, 0, 98, 0, 0, 98, 0, 0, 98, 0, 0, 1…
$ byhomlit <dbl+lbl> 0, 3, 2, 1, 1, 2, 1, 3, 0, 2, 0, 3, 2, 1, 2, 0, 1,…
$ byparasp <dbl+lbl> 5, 7, 7, 6, 2, 3, 5, 5, 5, 2, 5, 3, 6, 5, 6, 3, 5,…
$ byiepflg <dbl+lbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,…
$ bytxcstd <dbl+lbl> 56.21, 57.66, 66.50, 46.46, 36.17, 30.72, 45.46, 6…
$ bytxcqu <dbl+lbl> 3, 4, 4, 2, 1, 1, 2, 4, 1, 2, 4, 2, 1, 3, 4, 1, 1,…
$ bywrtnga <dbl+lbl> 1.191, 1.191, 0.996, -0.137, -0.435, -0.630, -0…
$ byxtracu <dbl+lbl> 1, 3, 2, 0, 0, 0, 1, 2, 1, 2, 0, 0, 2, 1, 2, 0, 0,…
$ byhmwrk <dbl+lbl> 7, 5, -9, 11, 10, 13, 12, 12, -9, 23, 9, 4, 7…
$ bytvvigm <dbl+lbl> 99, 4, 1, 99, 4, 3, 2, 3, -9, 3, 8, 1, 4…
$ f1qwt <dbl+lbl> 152.9769, 25.3577, 709.4246, 199.7193, 152.9…
$ f1pnlwt <dbl+lbl> 155.6312, 25.4906, 725.6926, 205.1919, 155.6…
$ f1trscwt <dbl+lbl> 284.6529, 42.0937, 730.2420, 280.0837, 287.2398, …
$ f2qtscwt <dbl+lbl> 0.0000, 50.3749, 755.4681, 245.5287, 265.5541, …
$ f2qwt <dbl+lbl> 0.0000, 28.2772, 660.0555, 198.5202, 153.7132, …
$ f2f1wt <dbl+lbl> 0.0000, 29.8462, 713.1807, 227.0875, 163.9503, …
$ f2bywt <dbl+lbl> 0.0000, 27.8257, 662.1222, 212.3764, 158.2162, …
$ bys14 <dbl+lbl> 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 2,…
$ bys15 <dbl+lbl> 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0,…
$ bys24d <dbl+lbl> 1, 1, 1, 1, 1, 2, 2, 1, 2, 1, 1, 1, 2…
$ bys24e <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2…
$ bys24f <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2…
$ bys33a <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ bys33b <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ bys43 <dbl+lbl> 2, 2, 3, 5, 2, 12, -9, 0, -9, 2, 0, 2, 0…
$ bys62a <dbl+lbl> -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3…
$ bys62g <dbl+lbl> -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3…
$ bys66a <dbl+lbl> 4, 1, 1, 1, 1, 6, 1, 1, 1, 4, -9, 1, 1…
$ bys66b <dbl+lbl> 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ bys66c <dbl+lbl> 3, -1, -1, 6, 1, -3, -1, 6, 1, 6, 1, 1, 1…
$ bys66f <dbl+lbl> 1, 1, -1, 1, 1, -3, 1, 1, 1, 1, 1, 1, -3…
$ bys66g <dbl+lbl> -3, -3, -1, 1, -3, -9, 1, 1, -3, -1, 1, 1, -3…
$ bys67 <dbl+lbl> 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1,…
$ bys71d <dbl+lbl> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0…
$ bys71e <dbl+lbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0…
$ bys83a <dbl+lbl> 2, 5, 2, 2, 1, -9, -1, -1, -1, 2, -1, 6, 2…
$ bys83b <dbl+lbl> 4, 7, 4, 2, 1, -9, -1, -1, -1, 2, -1, 2, 5…
$ bys87a <dbl+lbl> 3, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ bys87e <dbl+lbl> 2, 2, 1, 2, 2, -9, 2, 2, 2, 2, 1, 2, 2…
$ bys87f <dbl+lbl> 3, 2, 3, 1, 2, 1, 3, 1, 2, 1, 2, 2, 2…
transmute() is your friend!
We will use the transmute() function from the tidyverse. It works like mutate() and select() combined: only variables that are named inside transmute() are kept. So we can make copies of the variables we need under more readable names and, in doing so, create a smaller dataset containing just those variables. Pretty cool!
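As a minimal sketch of the idea (a made-up toy data frame, not the ELS data), these two pipelines produce the same one-column result:
toy <- tibble(x = 1:3, y = 4:6)
toy %>% transmute(x2 = x * 2)              # create x2, drop everything else
toy %>% mutate(x2 = x * 2) %>% select(x2)  # the same thing in two steps
Now for the real cleaning code: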
els.clean <- els %>%
  na_if(., -9) %>%   # convert ELS missing-data codes to NA
  na_if(., -8) %>%
  na_if(., -7) %>%
  na_if(., -4) %>%
  na_if(., -1) %>%
  transmute(.,        # keep (and rename) only the variables we need
            hrsread = bys43,
            absorbmath = bys87a,
            absorbread = bys87e,
            importmath = bys87f,
            comptest = bytxcstd,
            sex = as_factor(bysex),
            sesq = as_factor(byses1qu))
Before selecting variables, we first used the na_if() function from the tidyverse to convert ELS's missing-value codes (-9, -8, -7, -4, and -1) into NAs, just as we did last week.
Next, we selected variables related to math and reading engagement (absorbmath and absorbread) and math importance (importmath). We also selected hours read at home per week (hrsread), a standardized achievement test score (comptest), a measure of sex (sex), and a quartile measure of SES (sesq). We converted the last two categorical variables into factors using the as_factor() function.
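One caution: in newer versions of dplyr, na_if() is meant for vectors, so the data-frame-wide chain above may error for you. If it does, an equivalent formulation (a sketch; els.clean2 is just an illustrative name) applies the recode column by column with across():
els.clean2 <- els %>%
  mutate(across(everything(),
                ~ replace(.x, .x %in% c(-9, -8, -7, -4, -1), NA)))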
Our first lavaan model
For missing data, I recommend full-information maximum likelihood (FIML), which uses all available information instead of dropping incomplete cases. It is often preferred to multiple imputation! In lavaan, FIML is requested by adding missing = "ml" to the sem() command; the estimator = "MLR" option used below additionally gives robust (sandwich) standard errors and a scaled test statistic.
model1 <- 'comptest ~ absorbread'
fit1 <- sem(model1, data = els.clean, estimator = "MLR")
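Heads up: as the output below shows, lavaan still dropped incomplete cases here (11,603 of 16,197 used), because FIML has to be requested explicitly. A sketch of the FIML version of the call (fit1.fiml is just an illustrative name):
fit1.fiml <- sem(model1, data = els.clean,
                 estimator = "MLR", missing = "ml")  # FIML instead of listwise deletion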
The summary() function provides a nice summary of the fitted model:
summary(fit1, standardized = TRUE)
lavaan 0.6-7 ended normally after 16 iterations
Estimator ML
Optimization method NLMINB
Number of free parameters 2
Used Total
Number of observations 11603 16197
Model Test User Model:
Standard Robust
Test Statistic 0.000 0.000
Degrees of freedom 0 0
Parameter Estimates:
Standard errors Sandwich
Information bread Observed
Observed information based on Hessian
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
comptest ~
absorbread -2.374 0.114 -20.878 0.000 -2.374 -0.229
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.comptest 90.527 1.131 80.075 0.000 90.527 0.947
semPlot::semPaths(fit1, whatLabels = "standardized")
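If you would rather have the estimates as a data frame (handy for building your own tables), lavaan's parameterEstimates() function returns one; this is optional:
parameterEstimates(fit1, standardized = TRUE)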
Compare with the lm() function
As a check, we can fit the same regression with ordinary least squares. The slope estimate should match lavaan's exactly; the standard errors differ a bit because MLR uses robust (sandwich) standard errors.
model2 <- lm(comptest ~ absorbread, data = els.clean)
summary(model2)
Call:
lm(formula = comptest ~ absorbread, data = els.clean)
Residuals:
Min 1Q Median 3Q Max
-43.952 -6.393 0.419 6.868 29.052
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 57.07586 0.22911 249.12 <2e-16 ***
absorbread -2.37439 0.09354 -25.38 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 9.515 on 11601 degrees of freedom
(4594 observations deleted due to missingness)
Multiple R-squared: 0.05262, Adjusted R-squared: 0.05253
F-statistic: 644.3 on 1 and 11601 DF, p-value: < 2.2e-16
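A quick side-by-side look at the two fits (a minimal sketch):
coef(fit1)    # lavaan's free parameters: the slope and residual variance
coef(model2)  # lm's intercept and slope; the slopes should match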
table(els.clean$sex)
{Survey component legitimate skip/NA}      0
{Nonrespondent}                            0
Male                                    7653
Female                                  7717
Using case_when() to create a dummy variable
els.clean <- els.clean %>%
mutate(.,
male = case_when(
sex == "Male" ~ 1,
sex == "Female" ~ 0))
model3 <- 'absorbmath ~ male'
fit3 <- sem(model3, data = els.clean, estimator = "MLR")
summary(fit3, standardized = TRUE)
lavaan 0.6-7 ended normally after 11 iterations
Estimator ML
Optimization method NLMINB
Number of free parameters 2
Used Total
Number of observations 11755 16197
Model Test User Model:
Standard Robust
Test Statistic 0.000 0.000
Degrees of freedom 0 0
Parameter Estimates:
Standard errors Sandwich
Information bread Observed
Observed information based on Hessian
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
absorbmath ~
male -0.081 0.016 -5.055 0.000 -0.081 -0.047
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.absorbmath 0.743 0.025 30.138 0.000 0.743 0.998
semPlot::semPaths(fit3, whatLabels = "standardized")
table(els.clean$sesq)
{Survey component legitimate skip/NA}      0
{Nonrespondent}                            0
Lowest quartile                         3608
Second quartile                         3604
Third quartile                          3731
Highest quartile                        4301
We can use the same approach to build a full set of SES quartile dummies. One caution: the TRUE ~ 0 catch-all also codes missing values of sesq as 0, so cases with missing SES end up in the reference group.
els.clean <- els.clean %>%
mutate(.,
ses1 = case_when(
sesq== "Lowest quartile" ~ 1,
TRUE ~ 0),
ses2 = case_when(
sesq== "Second quartile" ~ 1,
TRUE ~ 0),
ses3 = case_when(
sesq== "Third quartile" ~ 1,
TRUE ~ 0),
ses4 = case_when(
sesq== "Highest quartile" ~ 1,
TRUE ~ 0))
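Again, a quick optional check that each dummy flags the right quartile:
els.clean %>% count(sesq, ses1, ses2, ses3, ses4)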
We leave ses4, the highest quartile, out of the model as the reference category:
model4 <- 'importmath ~ ses1 + ses2 + ses3'
fit4 <- sem(model4, data = els.clean, estimator = "MLR")
summary(fit4, standardized = TRUE)
lavaan 0.6-7 ended normally after 15 iterations
Estimator ML
Optimization method NLMINB
Number of free parameters 4
Used Total
Number of observations 11763 16197
Model Test User Model:
Standard Robust
Test Statistic 0.000 0.000
Degrees of freedom 0 0
Parameter Estimates:
Standard errors Sandwich
Information bread Observed
Observed information based on Hessian
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
importmath ~
ses1 -0.039 0.023 -1.662 0.097 -0.039 -0.017
ses2 -0.011 0.024 -0.481 0.630 -0.011 -0.005
ses3 0.033 0.022 1.485 0.138 0.033 0.016
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.importmath 0.822 0.016 51.731 0.000 0.822 0.999
semPlot::semPaths(fit4, whatLabels = "standardized")
Now let's put all of the predictors together in one model:
model5 <- 'comptest ~ hrsread + absorbmath + absorbread + importmath + male + ses1 + ses2 + ses3'
fit5 <- sem(model5, data = els.clean, estimator = "MLR")
summary(fit5, standardized = TRUE)
lavaan 0.6-7 ended normally after 69 iterations
Estimator ML
Optimization method NLMINB
Number of free parameters 9
Used Total
Number of observations 11235 16197
Model Test User Model:
Standard Robust
Test Statistic 0.000 0.000
Degrees of freedom 0 0
Parameter Estimates:
Standard errors Sandwich
Information bread Observed
Observed information based on Hessian
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
comptest ~
hrsread 0.053 0.025 2.108 0.035 0.053 0.021
absorbmath 0.255 0.104 2.439 0.015 0.255 0.023
absorbread -1.907 0.109 -17.449 0.000 -1.907 -0.184
importmath -0.589 0.100 -5.885 0.000 -0.589 -0.055
male 0.495 0.166 2.972 0.003 0.495 0.026
ses1 -10.302 0.231 -44.560 0.000 -10.302 -0.435
ses2 -6.981 0.224 -31.165 0.000 -6.981 -0.306
ses3 -4.221 0.217 -19.469 0.000 -4.221 -0.190
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.comptest 72.814 0.985 73.925 0.000 72.814 0.785
semPlot::semPaths(fit5, whatLabels = "standardized")
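Since the standardized residual variance of comptest is .785, the predictors explain about 21.5% of its variance. You can pull R-squared directly with lavInspect() (optional):
lavInspect(fit5, "rsquare")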
Next, let's try a measurement model: a latent reading-engagement variable with two indicators.
model6 <- 'Read =~ hrsread + absorbread'
fit6 <- sem(model6, data=els.clean, estimator = "MLR")
lavaan WARNING:
    The variance-covariance matrix of the estimated parameters (vcov)
    does not appear to be positive definite! The smallest eigenvalue
    (= -3.463451e+02) is smaller than zero. This may be a symptom that
    the model is not identified.
lavaan WARNING: some estimated ov variances are negative
summary(fit6, standardized = TRUE)
lavaan 0.6-7 ended normally after 21 iterations
Estimator ML
Optimization method NLMINB
Number of free parameters 4
Used Total
Number of observations 11319 16197
Model Test User Model:
Test statistic NA
Degrees of freedom -1
P-value (Unknown) NA
Parameter Estimates:
Standard errors Sandwich
Information bread Observed
Observed information based on Hessian
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Read =~
hrsread 1.000 1.140 0.295
absorbread -1.010 NA -1.151 -1.229
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.hrsread 13.607 NA 13.607 0.913
.absorbread -0.447 NA -0.447 -0.509
Read 1.300 NA 1.000 1.000
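What went wrong? With two indicators we have only three unique variances/covariances to work with, but the model estimates four free parameters (one loading, two residual variances, and the latent variance), hence the degrees of freedom of -1 and the impossible negative variance for absorbread above. One standard identification fix (a sketch; model6b is an illustrative name, and equal loadings are only substantively sensible when the indicators are on similar scales, which these are not) is to constrain the two loadings to be equal via a shared label:
model6b <- 'Read =~ a*hrsread + a*absorbread'  # shared label "a" forces equal loadings
fit6b <- sem(model6b, data = els.clean, estimator = "MLR")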
With four indicators, the latent variable is identified, so let's broaden the construct to general learning engagement:
model7 <- 'Learn =~ hrsread + absorbread + absorbmath + importmath'
fit7 <- sem(model7, data=els.clean, estimator = "MLR")
summary(fit7, fit.measures = TRUE, standardized = TRUE)
lavaan 0.6-7 ended normally after 50 iterations
Estimator ML
Optimization method NLMINB
Number of free parameters 8
Used Total
Number of observations 11235 16197
Model Test User Model:
Standard Robust
Test Statistic 1531.674 1195.624
Degrees of freedom 2 2
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.281
Yuan-Bentler correction (Mplus variant)
Model Test Baseline Model:
Test statistic 3954.430 1838.852
Degrees of freedom 6 6
P-value 0.000 0.000
Scaling correction factor 2.150
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.613 0.349
Tucker-Lewis Index (TLI) -0.162 -0.954
Robust Comparative Fit Index (CFI) 0.612
Robust Tucker-Lewis Index (TLI) -0.164
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -74091.256 -74091.256
Scaling correction factor 3.248
for the MLR correction
Loglikelihood unrestricted model (H1) NA NA
Scaling correction factor 2.855
for the MLR correction
Akaike (AIC) 148198.513 148198.513
Bayesian (BIC) 148257.127 148257.127
Sample-size adjusted Bayesian (BIC) 148231.704 148231.704
Root Mean Square Error of Approximation:
RMSEA 0.261 0.230
90 Percent confidence interval - lower 0.250 0.221
90 Percent confidence interval - upper 0.272 0.240
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.261
90 Percent confidence interval - lower 0.249
90 Percent confidence interval - upper 0.273
Standardized Root Mean Square Residual:
SRMR 0.109 0.109
Parameter Estimates:
Standard errors Sandwich
Information bread Observed
Observed information based on Hessian
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Learn =~
hrsread 1.000 0.445 0.115
absorbread -0.476 0.049 -9.739 0.000 -0.212 -0.228
absorbmath -1.163 0.178 -6.534 0.000 -0.517 -0.600
importmath -1.362 0.215 -6.339 0.000 -0.605 -0.672
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.hrsread 14.734 0.433 33.993 0.000 14.734 0.987
.absorbread 0.821 0.018 45.443 0.000 0.821 0.948
.absorbmath 0.475 0.030 15.689 0.000 0.475 0.640
.importmath 0.446 0.025 17.779 0.000 0.446 0.549
Learn 0.198 0.058 3.392 0.001 1.000 1.000
A structural model predicting comptest
The fit of the measurement model above is poor (CFI = .613, RMSEA = .261), but to illustrate a full structural equation model, let's regress comptest on the latent variable:
model8 <- 'Learn =~ hrsread + absorbread + absorbmath + importmath
comptest ~ Learn'
fit8 <- sem(model8, data=els.clean, estimator = "MLR")
summary(fit8, fit.measures = TRUE, standardized = TRUE)
lavaan 0.6-7 ended normally after 59 iterations
Estimator ML
Optimization method NLMINB
Number of free parameters 10
Used Total
Number of observations 11235 16197
Model Test User Model:
Standard Robust
Test Statistic 2118.391 1979.069
Degrees of freedom 5 5
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.070
Yuan-Bentler correction (Mplus variant)
Model Test Baseline Model:
Test statistic 4638.662 2646.109
Degrees of freedom 10 10
P-value 0.000 0.000
Scaling correction factor 1.753
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.543 0.251
Tucker-Lewis Index (TLI) 0.087 -0.498
Robust Comparative Fit Index (CFI) 0.543
Robust Tucker-Lewis Index (TLI) 0.085
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -115434.688 -115434.688
Scaling correction factor 2.865
for the MLR correction
Loglikelihood unrestricted model (H1) NA NA
Scaling correction factor 2.267
for the MLR correction
Akaike (AIC) 230889.376 230889.376
Bayesian (BIC) 230962.644 230962.644
Sample-size adjusted Bayesian (BIC) 230930.865 230930.865
Root Mean Square Error of Approximation:
RMSEA 0.194 0.187
90 Percent confidence interval - lower 0.187 0.181
90 Percent confidence interval - upper 0.201 0.194
P-value RMSEA <= 0.05 0.000 0.000
Robust RMSEA 0.194
90 Percent confidence interval - lower 0.187
90 Percent confidence interval - upper 0.201
Standardized Root Mean Square Residual:
SRMR 0.106 0.106
Parameter Estimates:
Standard errors Sandwich
Information bread Observed
Observed information based on Hessian
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Learn =~
hrsread 1.000 0.512 0.133
absorbread -0.462 0.043 -10.816 0.000 -0.237 -0.254
absorbmath -0.985 0.155 -6.368 0.000 -0.505 -0.586
importmath -1.178 0.198 -5.953 0.000 -0.603 -0.669
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
comptest ~
Learn 2.355 0.302 7.809 0.000 1.207 0.125
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.hrsread 14.669 0.436 33.651 0.000 14.669 0.982
.absorbread 0.810 0.019 42.099 0.000 0.810 0.935
.absorbmath 0.487 0.028 17.349 0.000 0.487 0.657
.importmath 0.448 0.025 17.934 0.000 0.448 0.552
.comptest 91.355 1.153 79.258 0.000 91.355 0.984
Learn 0.262 0.079 3.324 0.001 1.000 1.000
semPlot::semPaths(fit8, whatLabels = "standardized")
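Finally, if you just want the headline fit indices without the full summary() printout, lavaan's fitMeasures() function can extract them:
fitMeasures(fit8, c("cfi", "tli", "rmsea", "srmr"))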