687 HW3

load("/Users/zhoudongqiang/Desktop/2023 fall/687/687 homework/HW3/ess_belgium.rdata")

1. Provide a brief description of the variables of interest using tables or graphs. Are there any missing data? Justify the way that you handle the missing values.

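# Load the packages used below, then select the data we are going to use.
library(dplyr); library(ggplot2)
library(lavaan); library(poLCA)
data<-ess_belgium[,c("trstprl","trstlgl","trstplt","trstprt","imsmetn","imdfetn","impcntr","stfgov")]
# The columns are left numeric here; the immigration items are converted to factors later, where the models need them.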

“88” represents missing data for “trstprl”,“trstlgl”,“trstplt”, “trstprt”, and “stfgov”.

# For trust variables and government satisfaction
trust_value = rep(c("0","1", "2","3","4","5", 
                "6","7","8","9","10","88"), times = 5)

# Each variable name is repeated 12 times so that it lines up with the
# column-by-column frequency vector built below.
trust_vars <- rep(c("trstprl", "trstlgl", "trstplt", "trstprt", "stfgov"),
                  each = 12)

frequency = data[,c(1,2,3,4,8)]%>% # the frequency of each category 
  apply(MARGIN = 2, FUN = table) %>%
  apply(MARGIN = 2, FUN = unname)
frequency = c(frequency[,1],frequency[,2],frequency[,3],frequency[,4],frequency[,5])

trust_freq = data.frame(trust_value,trust_vars,frequency)

# Plotting trust variables frequency using ggplot2
ggplot(trust_freq, aes(x = trust_vars, y = frequency, fill = trust_value ,label = frequency)) +
  geom_bar(stat = "identity") +
  labs(fill = "Value")
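The plot above is only correct if the counts, the variable names, and the value labels are stacked in the same order. A minimal sketch of a less order-sensitive way to build the same kind of long table, assuming a small made-up data frame `toy` (hypothetical, not the ESS data) and base R only:

```r
# Hypothetical toy data standing in for two trust items (88 = missing code).
toy <- data.frame(
  trstprl = c(0, 5, 5, 10, 88),
  trstlgl = c(3, 3, 7, 88, 88)
)
codes <- c(0:10, 88)

# Count every code for every variable. factor() with fixed levels keeps
# zero counts, so each variable contributes the same 12 rows and the
# label always travels with its own count.
freq_long <- do.call(rbind, lapply(names(toy), function(v) {
  counts <- table(factor(toy[[v]], levels = codes))
  data.frame(variable  = v,
             value     = factor(names(counts), levels = as.character(codes)),
             frequency = as.vector(counts))
}))

head(freq_long)
```

A table built this way can be passed to `ggplot()` with the same aesthetics as `trust_freq`, and storing `value` as a factor keeps "10" after "9" in the legend.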

“8” represents missing data for “imsmetn”,“imdfetn”, and “impcntr”.

# For immigration variables
immigration_value <- rep(c("1", "2","3","4", "8"), times = 3)

# Each variable name is repeated 5 times to match the column-by-column frequency vector.
immigration_vars <- rep(c("imsmetn", "imdfetn", "impcntr"), each = 5)

frequency = data[,c(5,6,7)]%>%
  apply(MARGIN = 2, FUN = table) %>%
  apply(MARGIN = 2, FUN = unname)
frequency = c(frequency[,1],frequency[,2],frequency[,3])

immigration_freq = data.frame(immigration_value,immigration_vars,frequency)

# Plotting immigration variables frequency using ggplot2
ggplot(immigration_freq, aes(x = immigration_vars, y = frequency, 
                             fill = immigration_value,label = frequency)) +
  geom_bar(stat = "identity") +
  labs(fill = "Value")

Drop observations with missing values

# Note: stfgov (column 8) also uses 88 as its missing code but is not screened here.
rows_with_88 <- apply(data[, 1:4], 1, function(x) any(x == "88"))
rows_with_8 <- apply(data[, 5:7], 1, function(x) any(x == "8"))

# Combine the logical indices
rows_to_remove <- rows_with_88 | rows_with_8

# Subset the data to exclude these rows
data <- data[!rows_to_remove, ]

# Now there are 1657 complete observations
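An alternative to matching the codes row by row is to recode them to `NA` first, which also makes it easy to count the missing values per variable. A minimal sketch, assuming a made-up data frame `toy` (hypothetical, not the ESS data) with the same two missing codes:

```r
# Hypothetical toy data: 88 is the missing code for the 0-10 items,
# 8 is the missing code for the 1-4 items.
toy <- data.frame(
  trstprl = c(4, 88, 7, 2),
  stfgov  = c(6, 3, 88, 5),
  imsmetn = c(1, 2, 3, 8)
)

# Recode each sentinel value to NA, one column group at a time.
codes_88 <- c("trstprl", "stfgov")
codes_8  <- c("imsmetn")
toy[codes_88] <- lapply(toy[codes_88], function(x) replace(x, x == 88, NA))
toy[codes_8]  <- lapply(toy[codes_8],  function(x) replace(x, x == 8,  NA))

colSums(is.na(toy))                       # missing count per variable
toy_complete <- toy[complete.cases(toy), ] # listwise deletion
nrow(toy_complete)
```

Once the codes are `NA`, functions such as `complete.cases()` handle every column the same way, so no variable can be left out of the screening by accident.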

2. Create a dichotomous variable to indicate satisfaction with the national government: STF_IND = 0 if STFGOV<=5, and STF_IND = 1 otherwise.

data$STF_IND <- 1
data$STF_IND[data$stfgov<=5]<-0
table(data$STF_IND)

## 
##    0    1 
## 1273  384
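Because the indicator starts at 1 and is only set to 0 where `stfgov <= 5`, any value that is not at most 5, including a missing code, ends up as 1. A minimal sketch of a variant that propagates missing values instead, assuming the missing code has already been recoded to `NA` (the vector `stfgov` below is made up for illustration):

```r
# Hypothetical satisfaction scores; NA stands for a recoded missing code.
stfgov <- c(0, 5, 6, 10, NA)

# TRUE/FALSE becomes 1/0, and NA stays NA rather than defaulting to 1.
stf_ind <- as.integer(stfgov > 5)
stf_ind

table(stf_ind, useNA = "ifany")
```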

3. Perform a multiple group analysis to examine invariance in the way that a latent construct of trust in leadership is measured between individuals who are satisfied with the government and those who are not. What is your conclusion?

trust_model <- ' trstlead =~ trstplt + trstprt + trstprl+ trstlgl '

# Fit the configural invariance model (no constraints between groups)
fit_configural <- cfa(trust_model, group = "STF_IND", data = data)
# Check the fit
summary(fit_configural)

## lavaan 0.6.16 ended normally after 54 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        24
## 
##   Number of observations per group:                   
##     1                                              384
##     0                                             1273
## 
## Model Test User Model:
##                                                       
##   Test statistic                               131.603
##   Degrees of freedom                                 4
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     1                                           56.247
##     0                                           75.356
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## 
## Group 1 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   trstlead =~                                         
##     trstplt           1.000                           
##     trstprt           0.935    0.053   17.724    0.000
##     trstprl           0.527    0.048   11.037    0.000
##     trstlgl           0.457    0.053    8.617    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstplt           5.086    0.105   48.610    0.000
##    .trstprt           5.008    0.104   48.060    0.000
##    .trstprl           5.880    0.094   62.662    0.000
##    .trstlgl           5.914    0.101   58.517    0.000
##     trstlead          0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstplt           0.597    0.169    3.541    0.000
##    .trstprt           1.016    0.161    6.319    0.000
##    .trstprl           2.378    0.180   13.210    0.000
##    .trstlgl           3.170    0.235   13.505    0.000
##     trstlead          3.606    0.342   10.553    0.000
## 
## 
## Group 2 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   trstlead =~                                         
##     trstplt           1.000                           
##     trstprt           0.995    0.023   43.735    0.000
##     trstprl           0.824    0.027   30.976    0.000
##     trstlgl           0.741    0.031   23.912    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstplt           3.499    0.058   60.067    0.000
##    .trstprt           3.521    0.058   60.349    0.000
##    .trstprl           4.016    0.061   65.899    0.000
##    .trstlgl           4.623    0.065   70.617    0.000
##     trstlead          0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstplt           0.718    0.062   11.651    0.000
##    .trstprt           0.764    0.062   12.304    0.000
##    .trstprl           2.280    0.100   22.745    0.000
##    .trstlgl           3.480    0.145   23.940    0.000
##     trstlead          3.601    0.177   20.294    0.000

# Fit the metric invariance model (constrain factor loadings)
fit_metric <- cfa(trust_model, 
                  group.equal = c("loadings"),
                  group = "STF_IND", data = data)

summary(fit_metric)

## lavaan 0.6.16 ended normally after 48 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        24
##   Number of equality constraints                     3
## 
##   Number of observations per group:                   
##     1                                              384
##     0                                             1273
## 
## Model Test User Model:
##                                                       
##   Test statistic                               172.295
##   Degrees of freedom                                 7
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     1                                           89.401
##     0                                           82.894
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## 
## Group 1 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   trstlead =~                                         
##     trstplt           1.000                           
##     trstprt (.p2.)    0.987    0.021   47.239    0.000
##     trstprl (.p3.)    0.767    0.024   32.598    0.000
##     trstlgl (.p4.)    0.682    0.027   25.198    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstplt           5.086    0.101   50.414    0.000
##    .trstprt           5.008    0.102   49.043    0.000
##    .trstprl           5.880    0.104   56.697    0.000
##    .trstlgl           5.914    0.109   54.456    0.000
##     trstlead          0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstplt           0.835    0.114    7.299    0.000
##    .trstprt           1.007    0.120    8.367    0.000
##    .trstprl           2.324    0.185   12.578    0.000
##    .trstlgl           3.098    0.236   13.105    0.000
##     trstlead          3.073    0.260   11.839    0.000
## 
## 
## Group 2 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   trstlead =~                                         
##     trstplt           1.000                           
##     trstprt (.p2.)    0.987    0.021   47.239    0.000
##     trstprl (.p3.)    0.767    0.024   32.598    0.000
##     trstlgl (.p4.)    0.682    0.027   25.198    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstplt           3.499    0.059   59.561    0.000
##    .trstprt           3.521    0.059   60.088    0.000
##    .trstprl           4.016    0.059   67.533    0.000
##    .trstlgl           4.623    0.064   72.004    0.000
##     trstlead          0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstplt           0.691    0.063   10.974    0.000
##    .trstprt           0.761    0.063   12.066    0.000
##    .trstprl           2.325    0.100   23.142    0.000
##    .trstlgl           3.524    0.146   24.158    0.000
##     trstlead          3.702    0.178   20.761    0.000

#configural invariance model: Test statistic: 131.603 Degrees of freedom: 4

#metric invariance model: Test statistic: 172.295 Degrees of freedom: 7

# The difference in degrees of freedom is 7 - 4 = 3 (the three constrained loadings).
p_value<- pchisq(172.295-131.603, 7-4, lower.tail = FALSE)
signif(p_value, 2)

## [1] 7.6e-09

The p-value is about 7.6e-09, far below 0.001. The chi-square difference test therefore shows that the model with equal loadings fits significantly worse than the configural model, so metric invariance is rejected. The loadings of trstprl and trstlgl in particular differ between the groups (0.527 and 0.457 among the satisfied, against 0.824 and 0.741 among the not satisfied), which means the latent trust construct is not measured in the same way in the two satisfaction groups.
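The comparison of the two nested models can be written as a small helper, so that the degrees of freedom of the test are always the difference between the two models' degrees of freedom (here 7 - 4 = 3). This is a sketch in base R; the function name `chisq_diff_test` is made up for illustration. With the fitted objects at hand, lavaan's `lavTestLRT(fit_configural, fit_metric)` performs the same kind of test directly.

```r
# Chi-square difference (likelihood-ratio) test for two nested models.
# The "restricted" model is the one with more constraints (larger df).
chisq_diff_test <- function(chisq_restricted, df_restricted,
                            chisq_free, df_free) {
  d_chisq <- chisq_restricted - chisq_free
  d_df    <- df_restricted - df_free
  stopifnot(d_df > 0, d_chisq >= 0)
  list(d_chisq = d_chisq,
       d_df    = d_df,
       p_value = pchisq(d_chisq, df = d_df, lower.tail = FALSE))
}

# Test statistics and degrees of freedom copied from the two summaries
# above (metric model vs. configural model).
res <- chisq_diff_test(172.295, 7, 131.603, 4)
res$d_df
res$p_value < 0.001
```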

4. Divide the individuals into three groups using a latent class analysis (LCA) based on the three measures of attitudes toward immigrants. What are your qualitative descriptions of each of the three classes?

# Prepare Data:
data$imsmetn <- factor(data$imsmetn)
data$imdfetn <- factor(data$imdfetn)
data$impcntr <- factor(data$impcntr)

lca_model <- poLCA(cbind(imsmetn, imdfetn, impcntr) ~ 1,
                   data = data,
                   nclass = 3,
                   maxiter = 50000,
                   nrep = 10)

## Model 1: llik = -4723.374 ... best llik = -4723.374
## Model 2: llik = -4859.954 ... best llik = -4723.374
## Model 3: llik = -4723.374 ... best llik = -4723.374
## Model 4: llik = -4723.374 ... best llik = -4723.374
## Model 5: llik = -4723.374 ... best llik = -4723.374
## Model 6: llik = -4723.374 ... best llik = -4723.374
## Model 7: llik = -4749.13 ... best llik = -4723.374
## Model 8: llik = -4749.13 ... best llik = -4723.374
## Model 9: llik = -4749.13 ... best llik = -4723.374
## Model 10: llik = -4778.172 ... best llik = -4723.374
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $imsmetn
##                1      2      3      4
## class 1:  0.9848 0.0152 0.0000 0.0000
## class 2:  0.1247 0.8577 0.0176 0.0000
## class 3:  0.0373 0.2987 0.4701 0.1939
## 
## $imdfetn
##                1      2      3      4
## class 1:  0.9635 0.0365 0.0000 0.0000
## class 2:  0.0133 0.9276 0.0579 0.0011
## class 3:  0.0013 0.0239 0.6028 0.3719
## 
## $impcntr
##                1      2      3      4
## class 1:  0.7929 0.1664 0.0337 0.0070
## class 2:  0.0322 0.8246 0.1296 0.0136
## class 3:  0.0034 0.1117 0.5659 0.3190
## 
## Estimated class population shares 
##  0.0882 0.4605 0.4513 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.0899 0.4635 0.4466 
##  
## ========================================================= 
## Fit for 3 latent classes: 
## ========================================================= 
## number of observations: 1657 
## number of estimated parameters: 29 
## residual degrees of freedom: 34 
## maximum log-likelihood: -4723.374 
##  
## AIC(3): 9504.748
## BIC(3): 9661.718
## G^2(3): 719.3494 (Likelihood ratio/deviance statistic) 
## X^2(3): 944.0326 (Chi-square goodness of fit) 
##

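# Time a three-class fit. The timer has to wrap the model call itself;
# taking Sys.time() twice after the fit only measures the gap between the two calls.
time_spend <- system.time(
  poLCA(cbind(imsmetn, imdfetn, impcntr) ~ 1, data = data,
        nclass = 3, maxiter = 50000, nrep = 10, verbose = FALSE)
)["elapsed"]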
plot(lca_model)

Class 1 (Population share ≈ 0.088): This class almost always selects category 1 on all three measures of attitudes towards immigrants (probabilities 0.98 for imsmetn, 0.96 for imdfetn, and 0.79 for impcntr). Under the ESS coding, where 1 means allowing many immigrants to come and 4 means allowing none, this is the most open class, and it is a relatively small proportion of the population (approximately 8.8%).

Class 2 (Population share ≈ 0.46): This class mostly selects category 2 (probabilities 0.86, 0.93, and 0.82), that is, allowing some immigrants. This can be interpreted as a moderately open attitude towards immigration, and it is the largest class (approximately 46%).

Class 3 (Population share ≈ 0.451): This class concentrates on categories 3 and 4 (allowing a few or none), with combined probabilities of 0.66 for imsmetn, 0.97 for imdfetn, and 0.88 for impcntr. This is the most restrictive class regarding immigration, and it is nearly as large as Class 2 (approximately 45.1%).
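One way to summarize a class when reading the conditional probabilities is the expected response category per class. The sketch below does this for imsmetn with the probabilities copied from the three-class output; treating the ordinal categories 1 to 4 as numeric scores is a simplifying assumption made only for this summary, not part of the LCA itself.

```r
# Conditional response probabilities for imsmetn, copied from the
# three-class output above (rows = classes, columns = categories 1..4).
imsmetn_probs <- rbind(
  class1 = c(0.9848, 0.0152, 0.0000, 0.0000),
  class2 = c(0.1247, 0.8577, 0.0176, 0.0000),
  class3 = c(0.0373, 0.2987, 0.4701, 0.1939)
)

# Expected category per class: sum over categories of category * probability.
expected_category <- as.vector(imsmetn_probs %*% 1:4)
names(expected_category) <- rownames(imsmetn_probs)
round(expected_category, 2)

# Most likely category per class.
apply(imsmetn_probs, 1, which.max)
```

The expected category rises from class 1 to class 3, so the classes are ordered from the lowest to the highest response categories on this item.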

5. Consider latent class analyses with four or five classes as well. Which of the three LCA models would appear to have the best fit? What do you notice computationally as the hypothesized number of classes increases?

lca_model_2 <- poLCA(cbind(imsmetn, imdfetn, impcntr) ~ 1,
                   data = data,
                   nclass = 4,
                   maxiter = 50000,
                   nrep = 10)

## Model 1: llik = -4372.78 ... best llik = -4372.78
## Model 2: llik = -4372.78 ... best llik = -4372.78
## Model 3: llik = -4372.78 ... best llik = -4372.78
## Model 4: llik = -4372.78 ... best llik = -4372.78
## Model 5: llik = -4372.78 ... best llik = -4372.78
## Model 6: llik = -4372.78 ... best llik = -4372.78
## Model 7: llik = -4372.78 ... best llik = -4372.78
## Model 8: llik = -4372.78 ... best llik = -4372.78
## Model 9: llik = -4372.78 ... best llik = -4372.78
## Model 10: llik = -4372.78 ... best llik = -4372.78
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $imsmetn
##                1      2      3      4
## class 1:  0.9848 0.0152 0.0000 0.0000
## class 2:  0.0208 0.1495 0.2329 0.5968
## class 3:  0.1256 0.8588 0.0156 0.0000
## class 4:  0.0468 0.3850 0.5641 0.0041
## 
## $imdfetn
##                1      2      3      4
## class 1:  0.9637 0.0363 0.0000 0.0000
## class 2:  0.0000 0.0226 0.0000 0.9774
## class 3:  0.0136 0.9482 0.0257 0.0125
## class 4:  0.0021 0.0283 0.9019 0.0677
## 
## $impcntr
##                1      2      3      4
## class 1:  0.7931 0.1662 0.0337 0.0070
## class 2:  0.0042 0.0590 0.1039 0.8329
## class 3:  0.0321 0.8322 0.1201 0.0155
## class 4:  0.0042 0.1507 0.7730 0.0721
## 
## Estimated class population shares 
##  0.0882 0.1445 0.4489 0.3184 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.0899 0.1412 0.4424 0.3265 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## number of observations: 1657 
## number of estimated parameters: 39 
## residual degrees of freedom: 24 
## maximum log-likelihood: -4372.78 
##  
## AIC(4): 8823.561
## BIC(4): 9034.658
## G^2(4): 18.16198 (Likelihood ratio/deviance statistic) 
## X^2(4): 15.74141 (Chi-square goodness of fit) 
##

lca_model_3 <- poLCA(cbind(imsmetn, imdfetn, impcntr) ~ 1,
                   data = data,
                   nclass = 5,
                   maxiter = 50000,
                   nrep = 10)

## Model 1: llik = -4370.019 ... best llik = -4370.019
## Model 2: llik = -4370.019 ... best llik = -4370.019
## Model 3: llik = -4370.019 ... best llik = -4370.019
## Model 4: llik = -4369.333 ... best llik = -4369.333
## Model 5: llik = -4370.019 ... best llik = -4369.333
## Model 6: llik = -4370.209 ... best llik = -4369.333
## Model 7: llik = -4369.333 ... best llik = -4369.333
## Model 8: llik = -4369.804 ... best llik = -4369.333
## Model 9: llik = -4369.333 ... best llik = -4369.333
## Model 10: llik = -4370.209 ... best llik = -4369.333
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $imsmetn
##                1      2      3      4
## class 1:  0.9848 0.0152 0.0000 0.0000
## class 2:  0.1251 0.8595 0.0155 0.0000
## class 3:  0.0000 0.0937 0.0730 0.8333
## class 4:  0.0401 0.4008 0.5544 0.0047
## class 5:  0.0765 0.3088 0.6147 0.0000
## 
## $imdfetn
##                1      2      3      4
## class 1:  0.9635 0.0365 0.0000 0.0000
## class 2:  0.0136 0.9489 0.0249 0.0126
## class 3:  0.0000 0.0140 0.0000 0.9860
## class 4:  0.0026 0.0176 0.9798 0.0000
## class 5:  0.0000 0.0637 0.3510 0.5852
## 
## $impcntr
##                1      2      3      4
## class 1:  0.7930 0.1662 0.0337 0.0070
## class 2:  0.0320 0.8329 0.1217 0.0133
## class 3:  0.0058 0.0567 0.1042 0.8333
## class 4:  0.0054 0.1578 0.8368 0.0000
## class 5:  0.0000 0.1005 0.3395 0.5600
## 
## Estimated class population shares 
##  0.0882 0.4483 0.1036 0.2564 0.1035 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.0899 0.4424 0.0863 0.283 0.0984 
##  
## ========================================================= 
## Fit for 5 latent classes: 
## ========================================================= 
## number of observations: 1657 
## number of estimated parameters: 49 
## residual degrees of freedom: 14 
## maximum log-likelihood: -4369.333 
##  
## AIC(5): 8836.665
## BIC(5): 9101.891
## G^2(5): 11.26667 (Likelihood ratio/deviance statistic) 
## X^2(5): 8.52821 (Chi-square goodness of fit) 
##

c(lca_model$bic,lca_model_2$bic,lca_model_3$bic)

## [1] 9661.718 9034.658 9101.891

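I chose the four-class LCA model because it has the smallest BIC (9034.658, against 9661.718 for three classes and 9101.891 for five) and also the smallest AIC, indicating the best balance of fit and complexity.

Computationally, the number of estimated parameters increases from 29 to 49 as the number of classes goes from three to five, while the residual degrees of freedom shrink from 34 to 14, which would widen the standard errors of the estimated parameters. The ten random starts also do not always agree: the three-class and five-class runs each end at four different log-likelihoods, so several restarts are needed to find the global maximum.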
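The information criteria reported by poLCA can be recomputed from the maximum log-likelihood and the number of estimated parameters, which makes the trade-off between fit and complexity explicit. A sketch in base R, with all inputs copied from the three model summaries above:

```r
n    <- 1657                                  # number of observations
llik <- c(-4723.374, -4372.780, -4369.333)    # maximum log-likelihoods
npar <- c(29, 39, 49)                         # estimated parameters

# AIC = -2 * llik + 2 * k, BIC = -2 * llik + log(n) * k
aic <- -2 * llik + 2 * npar
bic <- -2 * llik + log(n) * npar

fit_table <- data.frame(nclass = 3:5, llik, npar,
                        AIC = round(aic, 3), BIC = round(bic, 3))
fit_table

# Number of classes with the smallest BIC.
fit_table$nclass[which.min(fit_table$BIC)]
```

Going from four to five classes raises the log-likelihood by only about 3.4 while adding ten parameters, so both penalties outweigh the gain in fit.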

6. Fit a structural equation model, using the latent trust in leadership measure indicated by the four measures to predict the latent immigration attitude measure indicated by the three measures. Interpret the estimated coefficient for the structural component of this model.

# Convert unordered factors to ordered factors
data$imsmetn <- factor(data$imsmetn, ordered = TRUE)
data$imdfetn <- factor(data$imdfetn, ordered = TRUE)
data$impcntr <- factor(data$impcntr, ordered = TRUE)

sem_model <- '
  trust_in_leadership =~ trstprl + trstlgl + trstplt + trstprt
  immigration_attitude =~ imsmetn + imdfetn + impcntr
  immigration_attitude ~ trust_in_leadership
'

fit <- sem(sem_model, data=data)

summary(fit)

## lavaan 0.6.16 ended normally after 46 iterations
## 
##   Estimator                                       DWLS
##   Optimization method                           NLMINB
##   Number of model parameters                        25
## 
##   Number of observations                          1657
## 
## Model Test User Model:
##                                               Standard      Scaled
##   Test Statistic                                65.151     143.909
##   Degrees of freedom                                13          13
##   P-value (Chi-square)                           0.000       0.000
##   Scaling correction factor                                  0.467
##   Shift parameter                                            4.417
##     simple second-order correction                                
## 
## Parameter Estimates:
## 
##   Standard errors                           Robust.sem
##   Information                                 Expected
##   Information saturated (h1) model        Unstructured
## 
## Latent Variables:
##                           Estimate  Std.Err  z-value  P(>|z|)
##   trust_in_leadership =~                                     
##     trstprl                  1.000                           
##     trstlgl                  0.865    0.037   23.545    0.000
##     trstplt                  0.927    0.037   24.830    0.000
##     trstprt                  0.915    0.038   24.204    0.000
##   immigration_attitude =~                                    
##     imsmetn                  1.000                           
##     imdfetn                  1.170    0.015   79.550    0.000
##     impcntr                  1.039    0.010  102.300    0.000
## 
## Regressions:
##                          Estimate  Std.Err  z-value  P(>|z|)
##   immigration_attitude ~                                    
##     trust_n_ldrshp         -0.140    0.011  -12.203    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstprl           4.448    0.057   78.164    0.000
##    .trstlgl           4.922    0.059   82.773    0.000
##    .trstplt           3.867    0.054   71.529    0.000
##    .trstprt           3.865    0.054   72.044    0.000
##    .imsmetn           0.000                           
##    .imdfetn           0.000                           
##    .impcntr           0.000                           
##     trust_n_ldrshp    0.000                           
##    .immigratn_tttd    0.000                           
## 
## Thresholds:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     imsmetn|t1       -0.990    0.037  -26.780    0.000
##     imsmetn|t2        0.502    0.032   15.569    0.000
##     imsmetn|t3        1.356    0.044   31.061    0.000
##     imdfetn|t1       -1.330    0.043  -30.887    0.000
##     imdfetn|t2        0.083    0.031    2.677    0.007
##     imdfetn|t3        0.961    0.037   26.273    0.000
##     impcntr|t1       -1.364    0.044  -31.108    0.000
##     impcntr|t2        0.078    0.031    2.529    0.011
##     impcntr|t3        1.033    0.038   27.481    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstprl           1.451    0.108   13.465    0.000
##    .trstlgl           2.714    0.115   23.578    0.000
##    .trstplt           1.663    0.092   18.102    0.000
##    .trstprt           1.685    0.096   17.574    0.000
##    .imsmetn           0.302                           
##    .imdfetn           0.044                           
##    .impcntr           0.247                           
##     trust_n_ldrshp    3.584    0.238   15.058    0.000
##    .immigratn_tttd    0.628    0.015   41.001    0.000
## 
## Scales y*:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     imsmetn           1.000                           
##     imdfetn           1.000                           
##     impcntr           1.000

A one-unit increase in latent trust in leadership is associated with a 0.140-unit decrease in the latent immigration attitude (p-value less than 0.001). Because higher values on the immigration items mean a more restrictive response, the negative coefficient implies that the higher the respondents’ trust in leadership, the more open their attitudes toward immigrants.

7. How might you improve the fitted model in part 6?

mod_indices <- modificationIndices(fit)

model_refined <-
  '
    trustleader =~ trstprl + trstlgl + trstplt + trstprt
    immigrationatt =~ imsmetn + imdfetn + impcntr
    trstprl ~~ trstlgl 
    trstplt ~~ trstprl 
    immigrationatt~ a*trustleader
'
fit_mod <- sem(model_refined, data=data)

summary(fit_mod)

## lavaan 0.6.16 ended normally after 60 iterations
## 
##   Estimator                                       DWLS
##   Optimization method                           NLMINB
##   Number of model parameters                        27
## 
##   Number of observations                          1657
## 
## Model Test User Model:
##                                               Standard      Scaled
##   Test Statistic                                63.258     140.134
##   Degrees of freedom                                11          11
##   P-value (Chi-square)                           0.000       0.000
##   Scaling correction factor                                  0.462
##   Shift parameter                                            3.304
##     simple second-order correction                                
## 
## Parameter Estimates:
## 
##   Standard errors                           Robust.sem
##   Information                                 Expected
##   Information saturated (h1) model        Unstructured
## 
## Latent Variables:
##                     Estimate  Std.Err  z-value  P(>|z|)
##   trustleader =~                                       
##     trstprl            1.000                           
##     trstlgl            0.835    0.041   20.329    0.000
##     trstplt            0.901    0.045   20.146    0.000
##     trstprt            0.856    0.052   16.620    0.000
##   immigrationatt =~                                    
##     imsmetn            1.000                           
##     imdfetn            1.170    0.015   79.548    0.000
##     impcntr            1.039    0.010  102.291    0.000
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   immigrationatt ~                                    
##     trustleadr (a)   -0.131    0.012  -10.707    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##  .trstprl ~~                                          
##    .trstlgl          -0.262    0.154   -1.702    0.089
##    .trstplt          -0.340    0.128   -2.646    0.008
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstprl           4.448    0.057   78.164    0.000
##    .trstlgl           4.922    0.059   82.773    0.000
##    .trstplt           3.867    0.054   71.529    0.000
##    .trstprt           3.865    0.054   72.044    0.000
##    .imsmetn           0.000                           
##    .imdfetn           0.000                           
##    .impcntr           0.000                           
##     trustleader       0.000                           
##    .immigrationatt    0.000                           
## 
## Thresholds:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     imsmetn|t1       -0.990    0.037  -26.780    0.000
##     imsmetn|t2        0.502    0.032   15.569    0.000
##     imsmetn|t3        1.356    0.044   31.061    0.000
##     imdfetn|t1       -1.330    0.043  -30.887    0.000
##     imdfetn|t2        0.083    0.031    2.677    0.007
##     imdfetn|t3        0.961    0.037   26.273    0.000
##     impcntr|t1       -1.364    0.044  -31.108    0.000
##     impcntr|t2        0.078    0.031    2.529    0.011
##     impcntr|t3        1.033    0.038   27.481    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .trstprl           1.092    0.247    4.430    0.000
##    .trstlgl           2.651    0.158   16.766    0.000
##    .trstplt           1.542    0.126   12.205    0.000
##    .trstprt           1.799    0.130   13.833    0.000
##    .imsmetn           0.302                           
##    .imdfetn           0.044                           
##    .impcntr           0.247                           
##     trustleader       3.942    0.336   11.735    0.000
##    .immigrationatt    0.630    0.015   41.152    0.000
## 
## Scales y*:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     imsmetn           1.000                           
##     imdfetn           1.000                           
##     impcntr           1.000

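# Attempted multi-group model. A single "group:" block in the model syntax does not
# split the sample (lavaan warns about it below), so this fit reproduces the model from part 6.
# A real multi-group fit would pass a grouping variable through the group argument of sem().
measurement_invariance_model <- '
  group: gender
  trust_in_leadership =~ trstprl + trstlgl + trstplt + trstprt
  immigration_attitude =~ imsmetn + imdfetn + impcntr
  immigration_attitude ~ trust_in_leadership
'
fit_invariance <- sem(measurement_invariance_model, data=data)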

## Warning in lavParseModelString(model): lavaan WARNING: syntax contains only a
## single block identifier: group

summary(fit_invariance)

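The refined model (BIC = 34953.34) has a lower BIC than the original model (BIC = 35062.51), which indicates a better fit. The gain is modest, though: the chi-square statistic only drops from 65.151 to 63.258 for two additional parameters, and of the two added residual covariances only trstprl ~~ trstplt is significant at the 5% level (p = 0.008, against p = 0.089 for trstprl ~~ trstlgl).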
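Global fit indices and sorted modification indices are a common way to decide which parameters to free. The sketch below shows the calls on lavaan's built-in `PoliticalDemocracy` data rather than on the ESS data, so the model and the numbers it produces are purely illustrative; only the workflow carries over.

```r
library(lavaan)

# Illustration on lavaan's built-in PoliticalDemocracy data, not the ESS data.
demo_model <- '
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem60 ~ ind60
'
fit_demo <- sem(demo_model, data = PoliticalDemocracy)

# Global fit indices for the fitted model.
fitMeasures(fit_demo, c("chisq", "df", "cfi", "tli", "rmsea", "srmr"))

# Largest modification indices first; each row is one candidate parameter
# that is currently fixed and could be freed.
modindices(fit_demo, sort. = TRUE, maximum.number = 5)
```

A parameter suggested this way is usually only added when it also makes substantive sense, for example a residual covariance between two items with similar wording.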

8. Write a one-page summary of the results and conclusions of the above analyses.

First, I created frequency plots for the two groups of variables of interest and found that each contained some missing values, coded 88 for the trust and satisfaction items and 8 for the immigration items. I removed the observations with missing values, 46 in total, which is few enough that it should not fundamentally distort the SEM results.

I divided the survey participants into two groups based on their satisfaction with the government. In the multiple group analysis (MGA), constraining the factor loadings to be equal across groups produced a significantly worse fit than the configural model, so I stopped the sequence of invariance tests at that step. Metric (weak) invariance is therefore rejected, and scalar (strong) invariance, which builds on it, was not tested: trust in leadership is not measured in the same way in the two groups.

Next, I conducted a latent class analysis (LCA) and grouped the individuals into three classes, based on the three items reflecting their views on immigration. The first class, about 9% of the sample, chooses mainly “allow many people to live here” on all questions, indicating a welcoming attitude towards immigrants. The second class, about 46%, mostly chooses the second category and is moderately open. The third class, about 45%, concentrates on the two most restrictive categories and is the most opposed to immigration. Among the three-, four-, and five-class models, the four-class model has the lowest BIC.

Finally, I fitted a structural equation model (SEM) in which latent trust in leadership, measured by four indicators, predicts the latent attitude toward immigrants, measured by three indicators. Higher trust in leadership is associated with less restrictive, that is more open, attitudes toward immigrants (coefficient -0.140, p < 0.001). To improve the SEM, I reviewed the modification indices and added the residual covariances “trstprl ~~ trstlgl” and “trstplt ~~ trstprl”.