# A tibble: 5 × 3
civic_class n percent
<fct> <int> <dbl>
1 submarine 691 15.4
2 tender_boat 1315 29.2
3 cruise_ship 1125 25.0
4 tugboat 1073 23.9
5 lightship 291 6.47
Replicating the Civic Navigator’s algorithms
1 Using MLM to replicate LCA civic classes
1.1 Background
In 2023, the Florida Poly group developed an algorithm to replicate the 5 civic classes generated by Mplus’ Latent Class Analysis (LCA). The online Civic Navigator relies on this algorithm to assign users to their corresponding civic class. The team wrote three Markdown files that I consulted to learn how they created this formula. Those files are:
COAT Data Analysis for insights dated June 29, 2023
COAT Data Analysis for insights-test June 29, 2023
COAT Data Analysis for insights dated August 29, 2023
The source of their analysis was the Excel file labeled Combinations 12 behaviors & classes - no missing data, created on June 5, 2022 (civic_classes_06_05_2022). In turn, the Excel file derives from an Mplus LCA of the Stata file 2021 National Survey on America Civic Health-Master File.dta.
For each of the 4,495 cases that comprise the 2021 National Survey on America Civic Health (2021 NASCH), the Excel file reports their answers to 12 civic behaviors as well as their corresponding civic class assignment. For more information, see the Civic Navigator Report.
In 2024, the original civic classes were relabeled from animal species to classes of ships as follows:
Mammals = Lightships
Birds = Cruise Ships
Reptiles = Tugboats
Fish = Submarines
Amphibians = Tender Boats
1.2 Strategy
To replicate the Civic Navigator’s civic class assignment algorithm, I used a model-training approach based on the 2021 NASCH dataset. Specifically, I estimated multinomial logistic regression (MLM) models using the same 12 civic behaviors that informed the original Latent Class Analysis (LCA). The goal was to determine whether a regression-based model could accurately reproduce the five civic classes previously identified by LCA.
To evaluate the performance of the MLM results, I conducted three validation checks. First, I compared the MLM-generated class assignments with the original LCA classifications for all respondents in the 2021 NASCH dataset. This step assessed the degree of agreement between the MLM and LCA methods. Second, I conducted a stress test by applying the MLM model to five hypothetical cases to determine whether the predicted classifications matched those generated by the online Civic Navigator. Finally, I conducted a split-sample validation test to assess the MLM model’s reliability in reproducing the civic classes under independent conditions.
In essence, this strategy tests whether two different statistical approaches converge on the same civic class structure (see Figure 1). LCA identifies hidden groups based on shared behavioral patterns, while MLM uses observed behaviors to estimate the probability that an individual belongs to each class and assigns the class with the highest probability. Strong agreement between the two methods increases confidence that the civic classification system is stable and can be reliably implemented using a regression-based algorithm. The split-sample validation provides additional evidence that the new algorithm is robust and not dependent on a particular sample.
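The assignment rule described above can be written compactly. The following is a sketch of the standard multinomial logit form, not a transcription of the Civic Navigator's code; the symbols (x_i, alpha_k, beta_k) are notation assumed here for illustration, with submarine taken as the reference class.

```latex
% Assumed notation: x_i is the vector of 12 binary civic behaviors for respondent i;
% class 1 (submarine) is the reference category; classes 2..5 are the other four ships.
P(y_i = k \mid x_i) = \frac{\exp(\alpha_k + x_i^\top \beta_k)}
                           {1 + \sum_{j=2}^{5} \exp(\alpha_j + x_i^\top \beta_j)},
\quad k = 2, \dots, 5,
\qquad
P(y_i = 1 \mid x_i) = \frac{1}{1 + \sum_{j=2}^{5} \exp(\alpha_j + x_i^\top \beta_j)}

% Each respondent is assigned the class with the highest estimated probability.
\hat{y}_i = \arg\max_{k \in \{1, \dots, 5\}} P(y_i = k \mid x_i)
```

With four non-reference classes, each carrying one intercept and 12 slopes, this form has 4 × 13 = 52 free parameters.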
1.3 Verifying LCA classes
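The civic_classes_06_05_2022 file contains the original Mplus LCA classes. Their frequencies are as follows: submarine, 691; tender_boat, 1,315; cruise_ship, 1,125; tugboat, 1,073; lightship, 291 (N = 4,495).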
1.4 Using MLM to estimate civic classes
This section documents the civic classes estimated under multinomial logistic regression analyses (MLM). It is based on the 2021 NASCH sample of 4,495 subjects who reported their civic engagement across 12 behaviors.
The first step was to estimate the intercept model. This model is a baseline. It helps to assess the extent to which adding the 12 civic behaviors as predictors of civic classes improves the model’s fit. Next, I estimated the full MLM with all 12 civic behaviors. Finally, I assessed the improvement in fit of the full model relative to the intercept model.
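The three steps described above can be sketched in R as follows. This is a minimal illustration on simulated data, assuming the models are fitted with `nnet::multinom`; the data frame `toy`, the seed, and the rule that links behaviors to classes are all hypothetical and do not reproduce the NASCH file or the LCA solution.

```r
library(nnet)  # multinom() fits multinomial logistic regression

set.seed(2021)  # hypothetical seed, only to make the simulation repeatable

# Simulated stand-in for the survey file: 12 binary behaviors.
behaviors <- c("vote_cong", "vote_state", "pol_rallies", "lobbying",
               "pol_campaign", "vol_faith", "vol_charity", "vol_cure",
               "vol_culture", "donate_pol", "give_noprof", "donate_food")
n <- 500
toy <- as.data.frame(matrix(rbinom(n * 12, 1, 0.4), nrow = n,
                            dimnames = list(NULL, behaviors)))

# Toy rule linking behaviors to five classes (NOT the LCA solution).
score <- rowSums(toy) + rnorm(n)
toy$civic_class <- cut(score, breaks = quantile(score, 0:5 / 5),
                       include.lowest = TRUE,
                       labels = c("submarine", "tender_boat", "cruise_ship",
                                  "tugboat", "lightship"))

# Step 1: intercept-only baseline.
fit_null <- multinom(civic_class ~ 1, data = toy, trace = FALSE)

# Step 2: full model with all 12 behaviors as predictors.
fit_full <- multinom(civic_class ~ ., data = toy, maxit = 500, trace = FALSE)

# Step 3: compare fit; lower AIC indicates better fit.
c(AIC_null = AIC(fit_null), AIC_full = AIC(fit_full))
```

The first factor level (submarine) serves as the reference class, which matches the response labels shown in the coefficient tables.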
1.4.1 Intercept model only
# weights: 10 (4 variable)
initial value 7234.423416
final value 6802.251610
converged
| civic class | |||
|---|---|---|---|
| Predictors | Odds Ratios | std. Error | Response |
| (Intercept) | 1.903 *** | 0.089 | tender_boat |
| (Intercept) | 1.628 *** | 0.079 | cruise_ship |
| (Intercept) | 1.553 *** | 0.076 | tugboat |
| (Intercept) | 0.421 *** | 0.029 | lightship |
| Observations | 4495 | ||
| R2 / R2 adjusted | 0.000 / -0.000 | ||
| AIC | 13612.503 | ||
| * p<0.05 ** p<0.01 *** p<0.001 | |||
The intercept model displays poor fit indices. Because it contains no predictors, its R2 = 0: the model accounts for none of the variation among the civic classes. Its AIC = 13,612 and BIC = 13,638 serve as the baseline against which the full model is judged.
1.4.2 Full model
This model estimates the impact of the 12 civic behaviors on the probability of being assigned to each of the 5 civic classes. Data scientists refer to this strategy as “training the model”. The trained model can then be used to predict the civic class membership of new users of the Civic Navigator.
1.4.2.1 MLM model
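The predictors and estimates of the civic class model are reported below. Note that the optimizer stopped at its 100-iteration limit rather than reporting convergence, and that the very large log-odds, several with standard errors near zero, are the pattern expected when the predictors separate the classes almost perfectly. The coefficients are therefore better read as a classification rule than as interpretable effect sizes.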
# weights: 70 (52 variable)
initial value 7234.423416
iter 10 value 2688.669387
iter 20 value 1493.066862
iter 30 value 1046.646970
iter 40 value 538.026844
iter 50 value 117.356639
iter 60 value 44.430657
iter 70 value 26.337779
iter 80 value 4.809437
iter 90 value 0.325707
iter 100 value 0.022822
final value 0.022822
stopped after 100 iterations
| civic class | |||
|---|---|---|---|
| Predictors | Log-Odds | std. Error | Response |
| (Intercept) | -1063.968 | 857.868 | tender_boat |
| vote cong | 2292.411 ** | 857.865 | tender_boat |
| vote state | 3211.766 *** | 857.868 | tender_boat |
| pol rallies | -1160.618 ** | 419.349 | tender_boat |
| lobbying | -2072.489 *** | 419.349 | tender_boat |
| pol campaign | -1481.351 *** | 0.000 | tender_boat |
| vol faith | -536.952 *** | 0.000 | tender_boat |
| vol charity | -2811.393 *** | 445.992 | tender_boat |
| vol cure | -2333.738 *** | 0.000 | tender_boat |
| vol culture | -1913.349 *** | 0.001 | tender_boat |
| donate pol | -2475.774 *** | 0.000 | tender_boat |
| give noprof | -1071.046 *** | 0.000 | tender_boat |
| donate food | -175.751 *** | 0.001 | tender_boat |
| (Intercept) | -3400.976 ** | 1057.455 | cruise_ship |
| vote cong | 2397.054 ** | 826.939 | cruise_ship |
| vote state | 2947.744 | 1573.186 | cruise_ship |
| pol rallies | -1071.585 *** | 0.153 | cruise_ship |
| lobbying | -2067.762 | 1787.521 | cruise_ship |
| pol campaign | -1233.252 | 706.529 | cruise_ship |
| vol faith | -288.295 | 2734.998 | cruise_ship |
| vol charity | -321.544 | 1233.029 | cruise_ship |
| vol cure | -566.890 | 1665.075 | cruise_ship |
| vol culture | -993.312 | 1319.717 | cruise_ship |
| donate pol | -268.786 | 496.010 | cruise_ship |
| give noprof | 632.374 | 964.209 | cruise_ship |
| donate food | 1435.514 | 1493.334 | cruise_ship |
| (Intercept) | -1380.904 | 4920.806 | tugboat |
| vote cong | 886.334 *** | 164.281 | tugboat |
| vote state | 662.352 | 2428.059 | tugboat |
| pol rallies | 525.918 | 419.181 | tugboat |
| lobbying | 508.538 | 1371.500 | tugboat |
| pol campaign | 672.019 | 706.516 | tugboat |
| vol faith | 350.025 | 4116.454 | tugboat |
| vol charity | 441.454 | 5161.604 | tugboat |
| vol cure | 568.968 | 5171.188 | tugboat |
| vol culture | 560.111 | 1319.703 | tugboat |
| donate pol | 453.504 | 531.675 | tugboat |
| give noprof | 368.435 | 927.915 | tugboat |
| donate food | 184.272 | 1455.890 | tugboat |
| (Intercept) | -7556.105 *** | 0.015 | lightship |
| vote cong | 1166.597 *** | 0.014 | lightship |
| vote state | 1299.892 *** | 0.014 | lightship |
| pol rallies | 1285.872 *** | 0.015 | lightship |
| lobbying | 984.141 *** | 0.001 | lightship |
| pol campaign | 1342.971 *** | 0.015 | lightship |
| vol faith | 717.002 *** | 0.015 | lightship |
| vol charity | 1298.641 *** | 0.015 | lightship |
| vol cure | 1293.679 *** | 0.015 | lightship |
| vol culture | 937.631 *** | 0.015 | lightship |
| donate pol | 1188.964 *** | 0.015 | lightship |
| give noprof | 880.624 *** | 0.001 | lightship |
| donate food | 915.719 *** | 0.015 | lightship |
| Observations | 4495 | ||
| R2 / R2 adjusted | 1.000 / 1.000 | ||
| AIC | 104.046 | ||
| * p<0.05 ** p<0.01 *** p<0.001 | |||
1.4.2.2 Saving the trained MLM model
I stored the fully trained MLM model (coefficients, reference class, and structure) as a transportable RDS file (mlm_civic_navigator_model.rds). This model can be used to replicate the algorithm embedded in the online Civic Navigator. To display the coefficients of the model, type: coef(mlm_civic_navigator_model).
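A minimal sketch of the save-and-reload cycle is shown below, assuming base R's `saveRDS()` and `readRDS()` and a model fitted with `nnet::multinom`. The toy data, the three-class outcome, the temporary file location, and the `new_user` record are hypothetical; only the file name and object name come from this report.

```r
library(nnet)

set.seed(1)  # hypothetical seed
# Toy training data with two behaviors (stand-in for the real 12-behavior file).
toy <- data.frame(vote_cong   = rbinom(300, 1, 0.5),
                  vol_charity = rbinom(300, 1, 0.5))
score <- toy$vote_cong + toy$vol_charity + rnorm(300, sd = 0.7)
toy$civic_class <- cut(score, breaks = c(-Inf, 0.5, 1.5, Inf),
                       labels = c("submarine", "tender_boat", "cruise_ship"))

fit <- multinom(civic_class ~ vote_cong + vol_charity, data = toy, trace = FALSE)

# Save the trained model to a temporary folder (the real path is not stated here).
path <- file.path(tempdir(), "mlm_civic_navigator_model.rds")
saveRDS(fit, path)

# Reload it, inspect the coefficients, and classify a new (hypothetical) user.
mlm_civic_navigator_model <- readRDS(path)
coef(mlm_civic_navigator_model)

new_user <- data.frame(vote_cong = 1, vol_charity = 0)
predict(mlm_civic_navigator_model, newdata = new_user, type = "class")
```

The nnet package must be loaded in the session that reads the file, so that `coef()` and `predict()` dispatch to the multinomial methods.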
1.4.2.3 Comparison between models
This section reports the extent to which the full model of 12 behaviors represents a significant improvement in fit relative to the intercept model. The tests included the scaled deviance of the model and indicators of overall fit, such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Lower AIC and BIC values signify a better fit. The ANOVA test represents a more rigorous assessment of the improvement of fit between the intercept model and the full model.
The AIC = 104.1 and BIC = 437.4 for the MLM full model are substantially lower than those of the MLM intercept model: AIC = 13,612.5, BIC = 13,638.2.
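These fit indices can be recovered from the optimizer output. The sketch below assumes that the "final value" printed for each model is its negative log-likelihood, which is consistent with the residual deviances reported in the ANOVA output (deviance = 2 × final value). The object names are illustrative.

```r
n_obs   <- 4495                                        # respondents in the 2021 NASCH
log_lik <- c(intercept = -6802.251610, full = -0.022822)  # minus the "final value" lines
n_par   <- c(intercept = 4, full = 52)                 # free parameters ("variable" counts)

deviance <- -2 * log_lik
aic <- deviance + 2 * n_par            # Akaike Information Criterion
bic <- deviance + log(n_obs) * n_par   # Bayesian Information Criterion

round(rbind(deviance, aic, bic), 3)
```

Both criteria add a penalty for the number of parameters to the deviance, so the full model's much lower values show that its 48 additional parameters are more than offset by the drop in deviance.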
1.4.2.3.1 ANOVA test
Likelihood ratio tests of Multinomial Models
Response: civic_class
Model
1 1
2 vote_cong + vote_state + pol_rallies + lobbying + pol_campaign + vol_faith + vol_charity + vol_cure + vol_culture + donate_pol + give_noprof + donate_food
Resid. df Resid. Dev Test Df LR stat. Pr(Chi)
1 17976 1.360450e+04
2 17928 4.564458e-02 1 vs 2 48 13604.46 0
1.4.2.3.2 Results
The full model with civic behaviors improves the prediction of civic class membership beyond the intercept-only model. The log-likelihood of the full model (-0.0228) is far closer to zero than that of the intercept-only model (-6,802.25). Equivalently, the residual deviance drops from 13,604.50 to 0.05, indicating a much better fit to the data.
Moreover, the ANOVA (likelihood ratio) test corroborates that the decrease in deviance associated with the full MLM model is statistically significant (χ² = 13,604, df = 48, p < .001).
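The likelihood ratio statistic in the ANOVA output can be checked by hand from the two residual deviances and the two parameter counts. The subscripts below are notation introduced here for illustration.

```latex
% D = residual deviance of each model, as printed in the ANOVA output.
\chi^2_{LR} = D_{\text{intercept}} - D_{\text{full}}
            = 13{,}604.503 - 0.046 \approx 13{,}604.46,
\qquad
df = 52 - 4 = 48
```

The degrees of freedom equal the number of parameters added by the 12 behaviors: 12 slopes for each of the 4 non-reference classes.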
1.5 Convergence between the MLM & the LCA model
This section documents the extent to which the MLM strategy reproduces the LCA model. The strategy consisted of estimating the probabilities of belonging to each of the 5 classes for every member of the 2021 NASCH. Next, it reports the results of comparing the civic classes predicted by MLM versus those documented by LCA.
As a second test, five carefully constructed hypothetical cases were entered into the online Civic Navigator. Each case represents one of the five civic categories based on the probabilities of engagement in the 12 civic behaviors for each of the five civic classes (see Table 2: Probabilities of Engagement in the Civic Navigator Report).
1.5.1 MLM Probabilities of class membership
Using the trained MLM model, I estimated the class-membership probabilities and identified the civic class for each member of the 2021 NASCH. Below are the predicted probabilities for 10 randomly selected cases. Notice that some probabilities are reported in scientific notation, making them difficult to read; values such as 2.07e-126 are effectively zero.
submarine tender_boat cruise_ship tugboat lightship
1131 0 1 2.072622e-126 0.000000e+00 0
6 0 1 0.000000e+00 0.000000e+00 0
2143 0 0 1.000000e+00 9.118273e-31 0
7 0 1 0.000000e+00 0.000000e+00 0
2575 1 0 0.000000e+00 0.000000e+00 0
1567 0 0 1.000000e+00 1.409121e-213 0
3134 0 0 0.000000e+00 0.000000e+00 1
3040 1 0 0.000000e+00 0.000000e+00 0
1442 0 0 1.000000e+00 0.000000e+00 0
1239 0 1 3.870810e-277 0.000000e+00 0
1.5.2 Identifying class membership
The procedure consisted of identifying each subject’s class membership by selecting the class with the highest probability across the 5 civic classes. Below, the table reports the number and proportion of cases in each of the 5 civic classes for the whole 2021 NASCH.
classified_df$civic_class_MLM n percent
submarine 691 0.1537264
tender_boat 1315 0.2925473
cruise_ship 1125 0.2502781
tugboat 1073 0.2387097
lightship 291 0.0647386
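The highest-probability rule can be sketched in a few lines of R. The three rows below are copied from the probability output shown earlier (cases 1131, 2143, and 3134); the object names are illustrative, and `max.col()` is one of several ways to pick the column with the largest value.

```r
classes <- c("submarine", "tender_boat", "cruise_ship", "tugboat", "lightship")

# Predicted probabilities for three of the ten cases listed above.
probs <- rbind(
  "1131" = c(0, 1, 2.072622e-126, 0, 0),
  "2143" = c(0, 0, 1, 9.118273e-31, 0),
  "3134" = c(0, 0, 0, 0, 1)
)
colnames(probs) <- classes

# Rounding removes the hard-to-read scientific notation.
round(probs, 3)

# Assign each case to the class with the highest probability.
civic_class_MLM <- factor(classes[max.col(probs, ties.method = "first")],
                          levels = classes)
data.frame(case = rownames(probs), civic_class_MLM)

# Proportion of cases per class, as in the frequency table above.
prop.table(table(civic_class_MLM))
```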
1.5.3 Congruence between the new algorithm and LCA
The table reports the level of congruence between the MLM model and the LCA in predicting class membership. The rows display the number of subjects assigned to each of the 5 classes under the MLM model. The columns show the number of subjects assigned to each class by the original LCA. The diagonal displays the degree of consistency between the two methods. Every subject falls on the diagonal, so the MLM model replicated the civic classes uncovered by LCA exactly.
| MLM Predicted Civic Class | LCA: submarine, N = 691¹ | LCA: tender_boat, N = 1,315¹ | LCA: cruise_ship, N = 1,125¹ | LCA: tugboat, N = 1,073¹ | LCA: lightship, N = 291¹ |
|---|---|---|---|---|---|
| submarine | 691 / 691 (100%) | 0 / 1,315 (0%) | 0 / 1,125 (0%) | 0 / 1,073 (0%) | 0 / 291 (0%) |
| tender_boat | 0 / 691 (0%) | 1,315 / 1,315 (100%) | 0 / 1,125 (0%) | 0 / 1,073 (0%) | 0 / 291 (0%) |
| cruise_ship | 0 / 691 (0%) | 0 / 1,315 (0%) | 1,125 / 1,125 (100%) | 0 / 1,073 (0%) | 0 / 291 (0%) |
| tugboat | 0 / 691 (0%) | 0 / 1,315 (0%) | 0 / 1,125 (0%) | 1,073 / 1,073 (100%) | 0 / 291 (0%) |
| lightship | 0 / 691 (0%) | 0 / 1,315 (0%) | 0 / 1,125 (0%) | 0 / 1,073 (0%) | 291 / 291 (100%) |

¹ n / N (%)
1.5.4 Assessing the degree of association
The degree of association between the LCA civic classes and the MLM civic classes was assessed using the chi-square test and Cramér’s V. The result (χ² = 17,980, df = 16, p < .01) indicates that the civic classes under MLM are statistically associated with the LCA civic classes. The association between the two classifications is perfect (Cramér’s V = 1).
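Both statistics can be reproduced from the cross-tabulation alone. The sketch below rebuilds the 5 × 5 agreement table from the class counts reported in this document and computes Cramér’s V by hand as sqrt(χ² / (N × (k − 1))), where k is the smaller of the number of rows and columns; the object names are illustrative.

```r
counts <- c(submarine = 691, tender_boat = 1315, cruise_ship = 1125,
            tugboat = 1073, lightship = 291)

# Perfect agreement: all cases sit on the diagonal of the MLM-by-LCA table.
tab <- diag(unname(counts))
dimnames(tab) <- list(MLM = names(counts), LCA = names(counts))

test <- chisq.test(tab)          # Pearson chi-square test of independence
chi_sq <- unname(test$statistic)
df <- unname(test$parameter)

# Cramer's V: chi-square scaled by its maximum possible value.
cramers_v <- sqrt(chi_sq / (sum(tab) * (min(dim(tab)) - 1)))

c(chi_sq = chi_sq, df = df, cramers_v = cramers_v)
```

With perfect agreement, the chi-square statistic reaches its maximum of N × (k − 1) = 4,495 × 4 = 17,980, which is why V equals 1.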
1.5.5 Conclusion
Because the LCA latent classes were derived from the 12 civic behavior indicators, I examined whether class membership could be reconstructed using a trained multinomial model (MLM). Results indicate that the MLM model perfectly reproduces the Mplus LCA civic classes. In essence, the five-class solution is a function of the 12 behaviors.
1.6 Additional stress test of the trained MLM model
This section examines the extent to which the trained MLM reproduces the classes predicted by the online Civic Navigator. As previously noted, I entered five constructed hypothetical cases into the online Civic Navigator. The pattern of simulated answers follows the probabilities of engagement in the 12 civic behaviors for each of the five classes under the LCA model. Those probabilities are reported in Table 2 of the Civic Navigator Report.
# A tibble: 5 × 14
id civic_cl_sim vote_cong vote_state pol_rallies lobbying pol_campaign
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 6000 submarine 0 0 0 0 0
2 6001 tender_boat 1 1 0 0 0
3 6002 cruise_ship 1 1 0 0 0
4 6003 tugboat 1 1 0 0 0
5 6004 lightship 1 1 1 1 1
# ℹ 7 more variables: vol_faith <dbl>, vol_charity <dbl>, vol_cure <dbl>,
# vol_culture <dbl>, donate_pol <dbl>, give_noprof <dbl>, donate_food <dbl>
As shown below, the MLM algorithm produced the same classifications as the online Civic Navigator’s algorithm across all 5 simulated cases. The diagonal displays 100% agreement between the Civic Navigator’s class assignments and those generated by the MLM algorithm.
| MLM Algorithm | Civic Navigator: submarine, N = 1¹ | Civic Navigator: tender_boat, N = 1¹ | Civic Navigator: cruise_ship, N = 1¹ | Civic Navigator: tugboat, N = 1¹ | Civic Navigator: lightship, N = 1¹ |
|---|---|---|---|---|---|
| submarine | 1 / 1 (100%) | 0 / 1 (0%) | 0 / 1 (0%) | 0 / 1 (0%) | 0 / 1 (0%) |
| tender_boat | 0 / 1 (0%) | 1 / 1 (100%) | 0 / 1 (0%) | 0 / 1 (0%) | 0 / 1 (0%) |
| cruise_ship | 0 / 1 (0%) | 0 / 1 (0%) | 1 / 1 (100%) | 0 / 1 (0%) | 0 / 1 (0%) |
| tugboat | 0 / 1 (0%) | 0 / 1 (0%) | 0 / 1 (0%) | 1 / 1 (100%) | 0 / 1 (0%) |
| lightship | 0 / 1 (0%) | 0 / 1 (0%) | 0 / 1 (0%) | 0 / 1 (0%) | 1 / 1 (100%) |

¹ n / N (%)
1.6.1 Summary
The evidence suggests that the Civic Navigator is functioning well. In addition, we now have a second, independently developed version of the classification method that future teams can use to verify or reproduce the results with high accuracy (see Figure 2).
1.7 Split-Sample validation
This approach documents the stability of the civic class classification itself. Accordingly, I randomly divided the original 2021 NASCH sample into two groups of approximately equal size: a training sample and a test sample.
The model was estimated on the training sample using the same 12 civic behavior indicators and the same reference category (submarine) as in the original specification. Then, the estimated model was used to predict civic class membership for respondents in the test sample. In essence, this approach tests whether the model’s classification structure remains stable when it is applied to new observations.
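The split-sample procedure described above can be sketched as follows. This is an illustration on simulated data, assuming `nnet::multinom` for estimation and a base R cross-tabulation for the comparison; the data frame `toy`, the seed, the three behaviors, and the four-class outcome are hypothetical simplifications of the real 12-behavior, 5-class setup.

```r
library(nnet)

set.seed(42)  # hypothetical seed, so the random split is repeatable
n <- 1000
toy <- data.frame(vote_cong   = rbinom(n, 1, 0.6),
                  vol_charity = rbinom(n, 1, 0.4),
                  donate_food = rbinom(n, 1, 0.5))
score <- with(toy, vote_cong + vol_charity + donate_food + rnorm(n, sd = 0.5))
# The first factor level (submarine) acts as the reference category.
toy$civic_class <- cut(score, breaks = c(-Inf, 0.5, 1.5, 2.5, Inf),
                       labels = c("submarine", "tender_boat", "tugboat", "lightship"))

# Randomly split the sample into two halves.
train_id <- sample(seq_len(n), size = n / 2)
train <- toy[train_id, ]
test  <- toy[-train_id, ]

# Estimate the model on the training half only.
fit <- multinom(civic_class ~ ., data = train, maxit = 500, trace = FALSE)

# Predict classes for the held-out half and cross-tabulate against the truth.
pred <- predict(fit, newdata = test, type = "class")
conf <- table(Prediction = pred, Reference = test$civic_class)
accuracy <- sum(diag(conf)) / sum(conf)

conf
accuracy
```

Because the test half plays no part in estimation, its accuracy reflects how the classification rule behaves on observations the model has not seen.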
1.7.1 Split-validation results
The model demonstrated exceptional stability. It achieved 99.3% classification accuracy (95% CI: 98.9% to 99.6%), with a Kappa of 0.99, indicating 99% agreement beyond chance. Only 16 of the 2,245 test cases were misclassified, and class-level sensitivity ranged from 98.1% to 100%. These results indicate that the civic classification system is highly reliable and robust under independent validation.
# weights: 70 (52 variable)
initial value 3621.235303
iter 10 value 1137.987191
iter 20 value 718.359134
iter 30 value 294.743502
iter 40 value 52.706094
iter 50 value 4.468170
iter 60 value 0.011323
final value 0.000061
converged
Confusion Matrix and Statistics
Reference
Prediction submarine tender_boat cruise_ship tugboat lightship
submarine 345 1 0 4 0
tender_boat 0 651 0 2 0
cruise_ship 0 3 562 0 0
tugboat 0 2 0 526 0
lightship 0 0 0 4 145
Overall Statistics
Accuracy : 0.9929
95% CI : (0.9885, 0.9959)
No Information Rate : 0.2927
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.9907
Mcnemar's Test P-Value : NA
Statistics by Class:
Class: submarine Class: tender_boat Class: cruise_ship
Sensitivity 1.0000 0.9909 1.0000
Specificity 0.9974 0.9987 0.9982
Pos Pred Value 0.9857 0.9969 0.9947
Neg Pred Value 1.0000 0.9962 1.0000
Prevalence 0.1537 0.2927 0.2503
Detection Rate 0.1537 0.2900 0.2503
Detection Prevalence 0.1559 0.2909 0.2517
Balanced Accuracy 0.9987 0.9948 0.9991
Class: tugboat Class: lightship
Sensitivity 0.9813 1.00000
Specificity 0.9988 0.99810
Pos Pred Value 0.9962 0.97315
Neg Pred Value 0.9942 1.00000
Prevalence 0.2388 0.06459
Detection Rate 0.2343 0.06459
Detection Prevalence 0.2352 0.06637
Balanced Accuracy 0.9901 0.99905
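The headline statistics can be recomputed directly from the confusion matrix printed above. The sketch below enters those counts by hand and derives accuracy, Cohen's kappa, and per-class sensitivity with base R; the object names are illustrative.

```r
classes <- c("submarine", "tender_boat", "cruise_ship", "tugboat", "lightship")

# Counts copied from the confusion matrix (rows = prediction, columns = reference).
conf <- matrix(c(345,   1,   0,   4,   0,
                   0, 651,   0,   2,   0,
                   0,   3, 562,   0,   0,
                   0,   2,   0, 526,   0,
                   0,   0,   0,   4, 145),
               nrow = 5, byrow = TRUE,
               dimnames = list(Prediction = classes, Reference = classes))

n_test   <- sum(conf)
accuracy <- sum(diag(conf)) / n_test

# Cohen's kappa: observed agreement corrected for agreement expected by chance.
expected    <- sum(rowSums(conf) * colSums(conf)) / n_test^2
cohen_kappa <- (accuracy - expected) / (1 - expected)

# Sensitivity: share of each reference class that was predicted correctly.
sensitivity <- diag(conf) / colSums(conf)

round(c(n = n_test, accuracy = accuracy, kappa = cohen_kappa), 4)
round(sensitivity, 4)
```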
1.7.1.1 Conclusion
The split-sample validation confirms that the five civic classes are highly stable and reproducible. When the model was trained on one half of the data and tested on the other, it reproduced the original class assignments almost perfectly.
This means the classification system is not dependent on a particular sample or statistical artifact. The underlying civic behavior patterns are consistent and reliable.
Importantly, the same model logic powers the online Civic Navigator. As a result, the Civic Navigator provides a dependable, empirically validated map of the civic landscape.
In short, users can trust the Civic Navigator. Its civic classifications are stable, reflecting meaningful patterns of civic engagement.