── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(poLCA)
Warning: package 'poLCA' was built under R version 4.5.1
Loading required package: scatterplot3d
Loading required package: MASS
Attaching package: 'MASS'
The following object is masked from 'package:dplyr':
select
optimal_classes <- aic_bic$nclass[which.min(aic_bic$BIC)]cat("Optimal number of classes:", optimal_classes, "\n")
Optimal number of classes: 2
# Fit final LCA without covariatesset.seed(123)lca_model <-poLCA(f, data = Hydro_Data_hydro, nclass = optimal_classes, maxiter =1000, nrep =10, verbose =TRUE)
Model 1: llik = -938.9017 ... best llik = -938.9017
Model 2: llik = -935.799 ... best llik = -935.799
Model 3: llik = -934.9633 ... best llik = -934.9633
Model 4: llik = -937.7484 ... best llik = -934.9633
Model 5: llik = -935.6573 ... best llik = -934.9633
Model 6: llik = -933.1128 ... best llik = -933.1128
Model 7: llik = -933.2593 ... best llik = -933.1128
Model 8: llik = -937.11 ... best llik = -933.1128
Model 9: llik = -934.8324 ... best llik = -933.1128
Model 10: llik = -937.8992 ... best llik = -933.1128
Conditional item response (column) probabilities,
by outcome variable, for each class (row)
$Number.Triggers
Pr(1) Pr(2) Pr(3) Pr(4) Pr(5) Pr(6) Pr(7) Pr(8) Pr(9) Pr(10)
class 1: 0.0000 0.00 0.2432 0.0000 0.0000 0.00 0.1622 0.000 0.1486 0.0811
class 2: 0.0125 0.25 0.0000 0.2375 0.1375 0.15 0.0000 0.125 0.0000 0.0000
Pr(11) Pr(12) Pr(13) Pr(14) Pr(15) Pr(16) Pr(17) Pr(18)
class 1: 0.0676 0.0000 0.0811 0.0811 0.0541 0.0405 0.027 0.0135
class 2: 0.0000 0.0875 0.0000 0.0000 0.0000 0.0000 0.000 0.0000
$Trigger.Code
Pr(1) Pr(2) Pr(3) Pr(4) Pr(5) Pr(6) Pr(7) Pr(8) Pr(9) Pr(10)
class 1: 0.0000 0.00 0.0000 0.000 0.0000 0.2568 0.0000 0.0000 0.00 0.1622
class 2: 0.0125 0.15 0.0125 0.025 0.0625 0.0000 0.2375 0.1375 0.15 0.0000
Pr(11) Pr(12) Pr(13) Pr(14) Pr(15) Pr(16) Pr(17) Pr(18) Pr(19) Pr(20)
class 1: 0.000 0.1486 0.0811 0.0676 0.0000 0.0676 0.0811 0.0541 0.0405 0.027
class 2: 0.125 0.0000 0.0000 0.0000 0.0875 0.0000 0.0000 0.0000 0.0000 0.000
Pr(21)
class 1: 0.0135
class 2: 0.0000
$Sound.Sensitivity
Pr(1) Pr(2) Pr(3) Pr(4)
class 1: 0.2027 0.1351 0.2432 0.4189
class 2: 0.3500 0.1625 0.2000 0.2875
Estimated class population shares
0.4805 0.5195
Predicted class memberships (by modal posterior prob.)
0.4805 0.5195
=========================================================
Fit for 2 latent classes:
=========================================================
number of observations: 154
number of estimated parameters: 81
residual degrees of freedom: 73
maximum log-likelihood: -933.1128
AIC(2): 2028.226
BIC(2): 2274.219
G^2(2): 659.4516 (Likelihood ratio/deviance statistic)
X^2(2): 1932.648 (Chi-square goodness of fit)
# -------------------------------# 4. Prepare complete cases for covariate model# -------------------------------complete_rows <-complete.cases(Hydro_Data_hydro[, c("Number.Triggers", "Trigger.Code", "Sound.Sensitivity","Race","Sex","Hydro_Cause","Age.at.Dx","Condition_Anxiety")])Hydro_Data_hydro_complete <- Hydro_Data_hydro[complete_rows, ]# -------------------------------# 5. Fit LCA with covariates on complete cases# -------------------------------f_cov <-cbind(Number.Triggers, Trigger.Code, Sound.Sensitivity) ~ Race + Sex + Hydro_Cause + Age.at.Dx + Condition_Anxietyset.seed(123)lca_cov_model <-poLCA(f_cov, data = Hydro_Data_hydro_complete, nclass = optimal_classes,maxiter =1000, nrep =10, verbose =TRUE)
Model 1: llik = -906.5582 ... best llik = -906.5582
Model 2: llik = -921.1447 ... best llik = -906.5582
Model 3: llik = -904.0522 ... best llik = -904.0522
Model 4: llik = -900.9455 ... best llik = -900.9455
Model 5: llik = -919.7637 ... best llik = -900.9455
Model 6: llik = -896.4337 ... best llik = -896.4337
Model 7: llik = -911.8051 ... best llik = -896.4337
Model 8: llik = -897.7748 ... best llik = -896.4337
Model 9: llik = -915.6626 ... best llik = -896.4337
Model 10: llik = -904.8344 ... best llik = -896.4337
Conditional item response (column) probabilities,
by outcome variable, for each class (row)
$Number.Triggers
Pr(1) Pr(2) Pr(3) Pr(4) Pr(5) Pr(6) Pr(7) Pr(8) Pr(9) Pr(10)
class 1: 0 0.2092 0.2224 0.2348 0.000 0.0000 0.1483 0.0000 0.000 0.0741
class 2: 0 0.0296 0.0000 0.0000 0.157 0.1712 0.0000 0.1427 0.157 0.0000
Pr(11) Pr(12) Pr(13) Pr(14) Pr(15) Pr(16) Pr(17) Pr(18)
class 1: 0.0618 0.0000 0.0124 0.0000 0.0000 0.0371 0.0000 0.0000
class 2: 0.0000 0.0999 0.0714 0.0714 0.0571 0.0000 0.0285 0.0143
$Trigger.Code
Pr(1) Pr(2) Pr(3) Pr(4) Pr(5) Pr(6) Pr(7) Pr(8) Pr(9) Pr(10)
class 1: 0 0.1483 0.0124 0.0247 0.0238 0.2348 0.2348 0.000 0.0000 0.1483
class 2: 0 0.0000 0.0000 0.0000 0.0296 0.0000 0.0000 0.157 0.1712 0.0000
Pr(11) Pr(12) Pr(13) Pr(14) Pr(15) Pr(16) Pr(17) Pr(18) Pr(19) Pr(20)
class 1: 0.0000 0.000 0.0741 0.0618 0.0000 0.0000 0.0000 0.0000 0.0371 0.0000
class 2: 0.1427 0.157 0.0000 0.0000 0.0999 0.0714 0.0714 0.0571 0.0000 0.0285
Pr(21)
class 1: 0.0000
class 2: 0.0143
$Sound.Sensitivity
Pr(1) Pr(2) Pr(3) Pr(4)
class 1: 0.3460 0.1721 0.2101 0.2719
class 2: 0.1856 0.1295 0.2426 0.4424
Estimated class population shares
0.5359 0.4641
Predicted class memberships (by modal posterior prob.)
0.5364 0.4636
=========================================================
Fit for 2 latent classes:
=========================================================
2 / 1
Coefficient Std. error t value Pr(>|t|)
(Intercept) 12.99788 0.90551 1.435400e+01 0.000
Race3 -15.04760 1.16599 -1.290500e+01 0.000
Race4 5.54897 0.00000 2.889094e+07 0.000
Race5 -13.69285 0.59030 -2.319600e+01 0.000
Sex1 0.28058 0.52679 5.330000e-01 0.597
Hydro_Cause2 1.31395 1.28803 1.020000e+00 0.313
Hydro_Cause3 -0.74799 1.14137 -6.550000e-01 0.516
Hydro_Cause4 16.15040 0.00000 1.885722e+08 0.000
Hydro_Cause5 -1.99803 1.45413 -1.374000e+00 0.176
Hydro_Cause6 -0.37707 1.39498 -2.700000e-01 0.788
Hydro_Cause7 -15.09373 0.00000 -2.957812e+08 0.000
Hydro_Cause8 -0.07958 1.41202 -5.600000e-02 0.955
Hydro_Cause9 0.26808 2.11507 1.270000e-01 0.900
Hydro_Cause10 0.07600 0.81955 9.300000e-02 0.927
Hydro_Cause11 0.92943 0.90686 1.025000e+00 0.311
Hydro_Cause12 0.54058 0.69506 7.780000e-01 0.441
Hydro_Cause13 -0.09359 0.88263 -1.060000e-01 0.916
Hydro_Cause14 14.26098 0.00000 4.463102e+07 0.000
Age.at.Dx2 -0.51145 0.99513 -5.140000e-01 0.610
Age.at.Dx3 0.13169 0.95317 1.380000e-01 0.891
Age.at.Dx4 0.17160 1.05736 1.620000e-01 0.872
Age.at.Dx5 -0.74656 1.59839 -4.670000e-01 0.643
Age.at.Dx6 0.54526 1.12314 4.850000e-01 0.630
Age.at.Dx7 -0.25335 0.97240 -2.610000e-01 0.796
Age.at.Dx8 -2.94668 1.49361 -1.973000e+00 0.055
Age.at.Dx9 -0.16219 0.87758 -1.850000e-01 0.854
Condition_Anxiety1 1.15013 0.46461 2.475000e+00 0.017
=========================================================
number of observations: 151
number of estimated parameters: 107
residual degrees of freedom: 44
maximum log-likelihood: -896.4337
AIC(2): 2006.867
BIC(2): 2329.716
X^2(2): 1762.098 (Chi-square goodness of fit)
ALERT: estimation algorithm automatically restarted with new initial values
# -------------------------------# 6. Assign predicted classes back to original filtered dataset# -------------------------------Hydro_Data_hydro$pred_class <-NAHydro_Data_hydro$pred_class[complete_rows] <- lca_cov_model$predclass# View predicted class countstable(Hydro_Data_hydro$pred_class)
1 2
81 70
# Optional: view coefficients for covariateslca_cov_model$coeff
# A tibble: 21 × 1
Class
<fct>
1 Class 1
2 Class 11
3 Class 12
4 Class 13
5 Class 14
6 Class 15
7 Class 16
8 Class 17
9 Class 18
10 Class 2
# ℹ 11 more rows