Replicating the Civic Navigator’s algorithms

Author

Alberto F Cabrera

Published

March 1, 2026

1 Using MLM to replicate LCA civic classes

1.1 Background

In 2023, the Florida Poly group developed an algorithm to replicate the 5 civic classes generated by Mplus’ Latent Class Analysis (LCA). The online Civic Navigator relies on this algorithm to assign users to their corresponding civic class. The team wrote three Markdown files that I consulted to learn how they created this formula. Those files are:

COAT Data Analysis for insights dated June 29, 2023
COAT Data Analysis for insights-test June 29, 2023
COAT Data Analysis for insights dated August 29, 2023

The source of their analysis was the Excel file labeled Combinations 12 behaviors & classes - no missing data created on June 5, 2022 (civic_classes_06_05_2022). In turn, the Excel file derives from an Mplus LCA analysis of the Stata file 2021 National Survey on America Civic Health-Master File.dta.

For each of the 4,495 cases that comprise the 2021 National Survey on America Civic Health (2021 NASCH), the Excel file reports their answers to 12 civic behaviors as well as their corresponding civic class assignment. For more information, see the Civic Navigator Report.

In 2024, the original civic classes were relabeled from animal species to classes of ships as follows:

Mammals = Lightships
Birds = Cruise Ships
Reptiles = Tugboats
Fish = Submarines
Amphibians = Tender Boats

1.2 Strategy

To replicate the Civic Navigator’s civic class assignment algorithm, I used a model-training approach based on the 2021 NASCH dataset. Specifically, I estimated multinomial logistic regression (MLM) models using the same 12 civic behaviors that informed the original Latent Class Analysis (LCA). The goal was to determine whether a regression-based model could accurately reproduce the five civic classes previously identified by LCA.

To evaluate the performance of the MLM results, I conducted three validation checks. First, I compared the MLM-generated class assignments with the original LCA classifications for all respondents in the 2021 NASCH dataset. This step assessed the degree of agreement between the MLM and LCA methods. Second, I conducted a stress test by applying the MLM model to five hypothetical cases to determine whether the predicted classifications matched those generated by the online Civic Navigator. Finally, I conducted a split-sample validation test to assess the MLM model’s reliability in reproducing the civic classes under independent conditions.

In essence, this strategy tests whether two different statistical approaches converge on the same civic class structure (see Figure 1). LCA identifies hidden groups based on shared behavioral patterns, while MLM uses observed behaviors to estimate the probability that an individual belongs to each class and assigns the class with the highest probability. Strong agreement between the two methods increases confidence that the civic classification system is stable and can be reliably implemented using a regression-based algorithm. The split-sample validation provides additional evidence that the new algorithm is robust and not dependent on a particular sample.

Figure 1. Replication & Validation Strategy: LCA & MLM

1.3 Verifying LCA classes

The civic_classes_06_05_2022 file contains the original Mplus LCA classes. Their frequencies are as follows:

# A tibble: 5 × 3
  civic_class     n percent
  <fct>       <int>   <dbl>
1 submarine     691   15.4 
2 tender_boat  1315   29.2 
3 cruise_ship  1125   25.0 
4 tugboat      1073   23.9 
5 lightship     291    6.47

1.4 Using MLM to estimate civic classes

This section documents the civic classes estimated under multinomial logistic regression analyses (MLM). It is based on the 2021 NASCH sample of 4,495 subjects who reported their civic engagement across 12 behaviors.

The first step was to estimate the intercept-only model, which serves as a baseline. It helps to assess the extent to which adding the 12 civic behaviors as predictors of civic classes improves the model’s fit. Next, I estimated the full MLM with all 12 civic behaviors. Finally, I assessed the improvement in fit of the full model relative to the intercept model.

1.4.1 Intercept model only

# weights:  10 (4 variable)
initial  value 7234.423416 
final  value 6802.251610 
converged

	civic class
Predictors	Odds Ratios	std. Error	Response
(Intercept)	1.903 ^***	0.089	tender_boat
(Intercept)	1.628 ^***	0.079	cruise_ship
(Intercept)	1.553 ^***	0.076	tugboat
(Intercept)	0.421 ^***	0.029	lightship
Observations	4495
R² / R² adjusted	0.000 / -0.000
AIC	13612.503
* p<0.05   ** p<0.01   *** p<0.001

The intercept model displays poor fit indices, as expected for a model with no predictors. R² = 0 indicates that it accounts for none of the variation among the civic classes. Moreover, its AIC = 13,612 and BIC = 13,638 are very large and provide the reference values against which the full model is compared.

1.4.2 Full model

The full model estimates the impact of the 12 civic behaviors on the probability of being assigned to each of the 5 civic classes. Data scientists refer to this strategy as “training the model”. Once trained, the full MLM model can be used to assign future survey takers, such as new users of the Civic Navigator, to their corresponding civic classes.
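For reference, a model of this kind takes the standard multinomial logit form sketched below. This is the textbook formulation, not something taken from the team's files. The notation is introduced here for illustration: $\mathbf{x}_i$ is the vector of 12 behavior indicators for respondent $i$, class 1 is the reference class (submarine, the class with no coefficient rows in the estimates table), and classes 2 to 5 are the remaining civic classes.

```latex
\[
P(y_i = j \mid \mathbf{x}_i) =
\frac{\exp\!\left(\beta_{j0} + \mathbf{x}_i^{\top}\boldsymbol{\beta}_j\right)}
     {1 + \sum_{k=2}^{5} \exp\!\left(\beta_{k0} + \mathbf{x}_i^{\top}\boldsymbol{\beta}_k\right)},
\qquad j = 2, \dots, 5,
\]
\[
P(y_i = 1 \mid \mathbf{x}_i) =
\frac{1}{1 + \sum_{k=2}^{5} \exp\!\left(\beta_{k0} + \mathbf{x}_i^{\top}\boldsymbol{\beta}_k\right)}.
\]
```

Each of the four non-reference classes has one intercept and 12 slopes, giving $4 \times 13 = 52$ free parameters, which agrees with the "52 variable" entry in the fitting log.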

1.4.2.1 MLM model

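The predictors and estimates of the civic class model are reported below. Note that the optimizer stopped at its 100-iteration limit rather than reporting convergence, and that the log-odds are extremely large in magnitude with erratic standard errors. This pattern is characteristic of (quasi-)complete separation: the 12 behaviors determine class membership almost perfectly, so the individual coefficients and significance stars should not be interpreted substantively, even though the model’s predictions remain usable.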

# weights:  70 (52 variable)
initial  value 7234.423416 
iter  10 value 2688.669387
iter  20 value 1493.066862
iter  30 value 1046.646970
iter  40 value 538.026844
iter  50 value 117.356639
iter  60 value 44.430657
iter  70 value 26.337779
iter  80 value 4.809437
iter  90 value 0.325707
iter 100 value 0.022822
final  value 0.022822 
stopped after 100 iterations

	civic class
Predictors	Log-Odds	std. Error	Response
(Intercept)	-1063.968	857.868	tender_boat
vote cong	2292.411 ^**	857.865	tender_boat
vote state	3211.766 ^***	857.868	tender_boat
pol rallies	-1160.618 ^**	419.349	tender_boat
lobbying	-2072.489 ^***	419.349	tender_boat
pol campaign	-1481.351 ^***	0.000	tender_boat
vol faith	-536.952 ^***	0.000	tender_boat
vol charity	-2811.393 ^***	445.992	tender_boat
vol cure	-2333.738 ^***	0.000	tender_boat
vol culture	-1913.349 ^***	0.001	tender_boat
donate pol	-2475.774 ^***	0.000	tender_boat
give noprof	-1071.046 ^***	0.000	tender_boat
donate food	-175.751 ^***	0.001	tender_boat
(Intercept)	-3400.976 ^**	1057.455	cruise_ship
vote cong	2397.054 ^**	826.939	cruise_ship
vote state	2947.744	1573.186	cruise_ship
pol rallies	-1071.585 ^***	0.153	cruise_ship
lobbying	-2067.762	1787.521	cruise_ship
pol campaign	-1233.252	706.529	cruise_ship
vol faith	-288.295	2734.998	cruise_ship
vol charity	-321.544	1233.029	cruise_ship
vol cure	-566.890	1665.075	cruise_ship
vol culture	-993.312	1319.717	cruise_ship
donate pol	-268.786	496.010	cruise_ship
give noprof	632.374	964.209	cruise_ship
donate food	1435.514	1493.334	cruise_ship
(Intercept)	-1380.904	4920.806	tugboat
vote cong	886.334 ^***	164.281	tugboat
vote state	662.352	2428.059	tugboat
pol rallies	525.918	419.181	tugboat
lobbying	508.538	1371.500	tugboat
pol campaign	672.019	706.516	tugboat
vol faith	350.025	4116.454	tugboat
vol charity	441.454	5161.604	tugboat
vol cure	568.968	5171.188	tugboat
vol culture	560.111	1319.703	tugboat
donate pol	453.504	531.675	tugboat
give noprof	368.435	927.915	tugboat
donate food	184.272	1455.890	tugboat
(Intercept)	-7556.105 ^***	0.015	lightship
vote cong	1166.597 ^***	0.014	lightship
vote state	1299.892 ^***	0.014	lightship
pol rallies	1285.872 ^***	0.015	lightship
lobbying	984.141 ^***	0.001	lightship
pol campaign	1342.971 ^***	0.015	lightship
vol faith	717.002 ^***	0.015	lightship
vol charity	1298.641 ^***	0.015	lightship
vol cure	1293.679 ^***	0.015	lightship
vol culture	937.631 ^***	0.015	lightship
donate pol	1188.964 ^***	0.015	lightship
give noprof	880.624 ^***	0.001	lightship
donate food	915.719 ^***	0.015	lightship
Observations	4495
R² / R² adjusted	1.000 / 1.000
AIC	104.046
* p<0.05   ** p<0.01   *** p<0.001

1.4.2.2 Saving the trained MLM model

I stored the fully trained MLM model (coefficients, reference class, and structure) as a transportable RDS file (mlm_civic_navigator_model.rds). This model can be used to replicate the algorithm embedded in the online Civic Navigator. To display the model’s coefficients, type coef(mlm_civic_navigator_model).

1.4.2.3 Comparison between models

This section reports the extent to which the full model of 12 behaviors represents a significant improvement in fit relative to the intercept model. The tests included the scaled deviance of the model and indicators of overall fit, such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Lower AIC and BIC values signify a better fit. The ANOVA test represents a more rigorous assessment of the improvement of fit between the intercept model and the full model.

The AIC = 104.0 and BIC = 437.4 for the MLM full model are substantially lower than those of the MLM intercept model: AIC = 13,612.5, BIC = 13,638.1.
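The information criteria can be recomputed by hand from the fitting logs. The sketch below assumes the usual definitions (AIC = 2k - 2 log L, BIC = k ln n - 2 log L) and takes each log-likelihood as the negative of the "final value" line in the corresponding log. The function and constant names are illustrative only.

```python
import math


def aic(log_lik: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * log-likelihood."""
    return n_params * math.log(n_obs) - 2 * log_lik


N_OBS = 4495
# Log-likelihoods: negated "final value" lines from the two fitting logs.
LOGLIK_INTERCEPT = -6802.251610
LOGLIK_FULL = -0.022822
# Free parameters: 4 intercepts vs. 4 x (1 intercept + 12 slopes).
K_INTERCEPT, K_FULL = 4, 52

for name, log_lik, k in [
    ("intercept", LOGLIK_INTERCEPT, K_INTERCEPT),
    ("full", LOGLIK_FULL, K_FULL),
]:
    print(f"{name}: AIC = {aic(log_lik, k):.3f}, BIC = {bic(log_lik, k, N_OBS):.3f}")
```

Because the full model's log-likelihood is essentially zero, its AIC and BIC consist almost entirely of the penalty terms for its 52 parameters.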

1.4.2.3.1 ANOVA test

Likelihood ratio tests of Multinomial Models

Response: civic_class
                                                                                                                                                       Model
1                                                                                                                                                          1
2 vote_cong + vote_state + pol_rallies + lobbying + pol_campaign + vol_faith + vol_charity + vol_cure + vol_culture + donate_pol + give_noprof + donate_food
  Resid. df   Resid. Dev   Test    Df LR stat. Pr(Chi)
1     17976 1.360450e+04                              
2     17928 4.564458e-02 1 vs 2    48 13604.46       0
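The test statistic and degrees of freedom in this output can be checked by hand. In the worked equations below, $D_0$ and $D_1$ denote the residual deviances of the intercept and full models, and $k_0$ and $k_1$ their numbers of free parameters; these symbols are introduced here for illustration.

```latex
\[
\mathrm{LR} = D_0 - D_1 = 13604.50 - 0.0456 \approx 13604.46,
\]
\[
\mathrm{df} = k_1 - k_0 = 52 - 4 = 48
\quad\text{(equivalently, } 17976 - 17928 = 48\text{)}.
\]
```

Under the null hypothesis that the 12 behaviors add nothing, the statistic is referred to a chi-square distribution with 48 degrees of freedom, which is why the reported p-value is effectively zero.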

1.4.2.3.2 Results

The full model with civic behaviors improves the prediction of civic class membership well beyond the intercept-only model. The residual deviance of the full model (0.046) is far smaller than that of the intercept-only model (13,604.5); equivalently, the log-likelihood rises from -6,802.25 to -0.02, indicating a much better fit to the data.

Moreover, the likelihood ratio (ANOVA) test corroborates that the decrease in deviance associated with the full MLM model is statistically significant (χ² = 13,604.5, df = 48, p < .001).

1.5 Convergence between the MLM & the LCA model

This section documents the extent to which the MLM strategy reproduces the LCA model. The strategy consisted of estimating the probabilities of belonging to each of the 5 classes for every member of the 2021 NASCH. Next, the section reports the results of comparing the civic classes predicted by MLM with those documented by LCA.

As a second test, five carefully constructed hypothetical cases were entered into the online Civic Navigator. Each case represents one of the five civic categories based on the probabilities of engagement in the 12 civic behaviors for each of the five civic classes (see Table 2: Probabilities of Engagement in the Civic Navigator Report).

1.5.1 MLM Probabilities of class membership

Using the trained MLM model, I estimated the class probabilities and identified the civic class membership of each member of the 2021 NASCH. Below are the predicted class probabilities for 10 randomly selected cases. Notice that some probabilities are reported in scientific notation; values such as 2.07e-126 are effectively zero.

     submarine tender_boat   cruise_ship       tugboat lightship
1131         0           1 2.072622e-126  0.000000e+00         0
6            0           1  0.000000e+00  0.000000e+00         0
2143         0           0  1.000000e+00  9.118273e-31         0
7            0           1  0.000000e+00  0.000000e+00         0
2575         1           0  0.000000e+00  0.000000e+00         0
1567         0           0  1.000000e+00 1.409121e-213         0
3134         0           0  0.000000e+00  0.000000e+00         1
3040         1           0  0.000000e+00  0.000000e+00         0
1442         0           0  1.000000e+00  0.000000e+00         0
1239         0           1 3.870810e-277  0.000000e+00         0

1.5.2 Identifying class membership

The procedure consisted of identifying each subject’s class membership by selecting the class with the highest probability across the 5 civic classes. The table below reports the number and proportion of cases in each of the 5 civic classes for the whole 2021 NASCH.

 classified_df$civic_class_MLM    n   percent
                     submarine  691 0.1537264
                   tender_boat 1315 0.2925473
                   cruise_ship 1125 0.2502781
                       tugboat 1073 0.2387097
                     lightship  291 0.0647386
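The assignment rule described above (compute a probability for each class, then pick the largest) can be sketched as follows. This is a minimal illustration, not the Civic Navigator's actual code. The coefficients in `TOY_COEFS` are hypothetical and use only two behaviors for brevity, whereas the real model has 12. The maximum score is subtracted before exponentiating because log-odds in the thousands, like those in the full model, would otherwise overflow.

```python
import math

CLASSES = ["submarine", "tender_boat", "cruise_ship", "tugboat", "lightship"]
REFERENCE = CLASSES[0]  # reference class: its linear predictor is fixed at 0


def class_probabilities(x, coefs):
    """Return {class: probability} for one respondent.

    x     : list of 0/1 behavior indicators
    coefs : {class: (intercept, [slopes])} for the non-reference classes
    """
    scores = {REFERENCE: 0.0}
    for cls, (intercept, slopes) in coefs.items():
        scores[cls] = intercept + sum(b * v for b, v in zip(slopes, x))
    top = max(scores.values())
    # Subtracting the maximum keeps every exponent <= 0, avoiding overflow.
    weights = {cls: math.exp(s - top) for cls, s in scores.items()}
    total = sum(weights.values())
    return {cls: w / total for cls, w in weights.items()}


def assign_class(x, coefs):
    """Assign the class with the highest predicted probability."""
    probs = class_probabilities(x, coefs)
    return max(probs, key=probs.get)


# Hypothetical coefficients for illustration only (two behaviors, not twelve).
TOY_COEFS = {
    "tender_boat": (-1000.0, [2000.0, -500.0]),
    "cruise_ship": (-2000.0, [1500.0, 200.0]),
    "tugboat": (-1500.0, [800.0, 400.0]),
    "lightship": (-3000.0, [1000.0, 3500.0]),
}

for answers in ([0, 0], [1, 0], [1, 1]):
    print(answers, assign_class(answers, TOY_COEFS))
```

With coefficients this large, the winning class receives a probability of essentially 1 and the others essentially 0, which mirrors the pattern seen in the sampled probabilities above.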

1.5.3 Congruence between the new algorithm and LCA

The table reports the level of congruence between the MLM model and the LCA in predicting class membership. The rows display the number of subjects assigned to each of the 5 classes under the MLM model. The columns show the number of subjects assigned to each class by the original LCA. The diagonal displays the degree of consistency between the two methods. As the diagonal shows, the MLM model replicated the civic classes uncovered by LCA in every case.

**Comparison of predicted classes by original LCA model and MLM model**
Diagonal displays the proportion of congruence between the two models
*LCA Model*	submarine N = 691¹	tender_boat N = 1,315¹	cruise_ship N = 1,125¹	tugboat N = 1,073¹	lightship N = 291¹
MLM Predicted Civic Class
submarine	691 / 691 (100%)	0 / 1,315 (0%)	0 / 1,125 (0%)	0 / 1,073 (0%)	0 / 291 (0%)
tender_boat	0 / 691 (0%)	1,315 / 1,315 (100%)	0 / 1,125 (0%)	0 / 1,073 (0%)	0 / 291 (0%)
cruise_ship	0 / 691 (0%)	0 / 1,315 (0%)	1,125 / 1,125 (100%)	0 / 1,073 (0%)	0 / 291 (0%)
tugboat	0 / 691 (0%)	0 / 1,315 (0%)	0 / 1,125 (0%)	1,073 / 1,073 (100%)	0 / 291 (0%)
lightship	0 / 691 (0%)	0 / 1,315 (0%)	0 / 1,125 (0%)	0 / 1,073 (0%)	291 / 291 (100%)
¹ n / N (%)

1.5.4 Assessing the degree of association

The degree of association between the LCA civic classes and the MLM civic classes was assessed using the chi-square test and Cramér’s V. The result, χ² = 17,980, df = 16, p < .01, indicates that the civic classes under MLM are statistically associated with the LCA civic classes. The association between the two classifications is perfect (Cramér’s V = 1).
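These two statistics can be reproduced directly from the congruence table. The sketch below assumes the standard Pearson chi-square formula and the usual definition of Cramér's V, and rebuilds the 5 × 5 table from its diagonal counts (all off-diagonal cells are zero). The function names are illustrative.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_square(table) / (n * (k - 1)))


# Diagonal counts: submarine, tender_boat, cruise_ship, tugboat, lightship.
DIAGONAL = [691, 1315, 1125, 1073, 291]
TABLE = [
    [count if i == j else 0 for j in range(len(DIAGONAL))]
    for i, count in enumerate(DIAGONAL)
]

print(f"chi-square = {chi_square(TABLE):.1f}")
print(f"degrees of freedom = {(len(TABLE) - 1) * (len(TABLE[0]) - 1)}")
print(f"Cramér's V = {cramers_v(TABLE):.3f}")
```

For a table with all cases on the diagonal, the chi-square statistic reaches its maximum of n × (k - 1) = 4,495 × 4 = 17,980, which is why Cramér's V equals exactly 1.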

1.5.5 Conclusion

Because the LCA latent classes were derived from the 12 civic behavior indicators, I examined whether class membership could be reconstructed using a trained multinomial model (MLM). Results indicate that the MLM model perfectly reproduces the Mplus LCA civic classes. In essence, membership in the five-class solution is a function of the 12 behaviors.

1.6 Additional stress test of the trained MLM model

This section examines the extent to which the trained MLM reproduces the classes predicted by the online Civic Navigator. As previously noted, I entered five constructed hypothetical cases into the online Civic Navigator. The pattern of simulated answers for each case follows the probabilities of engagement in the 12 civic behaviors for the corresponding class under the LCA model. Those probabilities are reported in Table 2 of the Civic Navigator Report.

# A tibble: 5 × 14
     id civic_cl_sim vote_cong vote_state pol_rallies lobbying pol_campaign
  <dbl> <chr>            <dbl>      <dbl>       <dbl>    <dbl>        <dbl>
1  6000 submarine            0          0           0        0            0
2  6001 tender_boat          1          1           0        0            0
3  6002 cruise_ship          1          1           0        0            0
4  6003 tugboat              1          1           0        0            0
5  6004 lightship            1          1           1        1            1
# ℹ 7 more variables: vol_faith <dbl>, vol_charity <dbl>, vol_cure <dbl>,
#   vol_culture <dbl>, donate_pol <dbl>, give_noprof <dbl>, donate_food <dbl>

As shown below, the MLM algorithm produced the same classifications as the online Civic Navigator’s algorithm across all 5 simulated cases. The diagonal displays 100% agreement between the Civic Navigator’s class assignments and those generated by the MLM algorithm.

**Comparison of predicted classes by Civic Navigator and the MLM algorithm**
Diagonal displays the proportion of congruence between the two models
*Civic Navigator*	submarine N = 1¹	tender_boat N = 1¹	cruise_ship N = 1¹	tugboat N = 1¹	lightship N = 1¹
MLM Algorithm
submarine	1 / 1 (100%)	0 / 1 (0%)	0 / 1 (0%)	0 / 1 (0%)	0 / 1 (0%)
tender_boat	0 / 1 (0%)	1 / 1 (100%)	0 / 1 (0%)	0 / 1 (0%)	0 / 1 (0%)
cruise_ship	0 / 1 (0%)	0 / 1 (0%)	1 / 1 (100%)	0 / 1 (0%)	0 / 1 (0%)
tugboat	0 / 1 (0%)	0 / 1 (0%)	0 / 1 (0%)	1 / 1 (100%)	0 / 1 (0%)
lightship	0 / 1 (0%)	0 / 1 (0%)	0 / 1 (0%)	0 / 1 (0%)	1 / 1 (100%)
¹ n / N (%)

1.6.1 Summary

The evidence suggests that the Civic Navigator is functioning as intended. In addition, we now have a second, independently developed version of the classification method that future teams can use to verify or reproduce the results with high accuracy (see Figure 2).

1.7 Split-Sample validation

This approach seeks to document the stability of the civic class classification itself. Accordingly, I randomly divided the original 2021 NASCH sample into two groups of nearly equal size: a training sample (n = 2,250) and a test sample (n = 2,245).

The split was stratified by civic class, which preserves the proportions of the civic classes in both samples.

The model was estimated on the training sample using the same 12 civic behavior indicators and the same reference category (submarine) as in the original specification. Then, the estimated model was used to predict civic class membership for respondents in the test sample. In essence, this approach tests whether the model’s classification structure remains stable when it is applied to new observations.
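A stratified 50/50 split of this kind can be sketched as follows. This is an illustration under stated assumptions, not the code used for the report: the seed is arbitrary, the function name is hypothetical, and each class's training share is rounded up, a rule chosen here because it reproduces the training and test sizes implied by the output in the next section.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_fraction=0.5, seed=2021):
    """Split case indices into train/test while preserving class proportions."""
    rng = random.Random(seed)  # arbitrary seed, for reproducibility only
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Assumption: round each class's training share up.
        n_train = math.ceil(len(indices) * train_fraction)
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return sorted(train), sorted(test)


# Class frequencies from the original LCA solution.
CLASS_COUNTS = {
    "submarine": 691,
    "tender_boat": 1315,
    "cruise_ship": 1125,
    "tugboat": 1073,
    "lightship": 291,
}
labels = [cls for cls, n in CLASS_COUNTS.items() for _ in range(n)]

train_idx, test_idx = stratified_split(labels)
test_counts = Counter(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx))
print(dict(test_counts))
```

Splitting within each class, rather than across the whole sample, guarantees that even the smallest class (lightship) is represented in both halves in proportion to its size.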

1.7.1 Split-validation results

The model demonstrated exceptional stability. On the 2,245 test cases it achieved 99.3% classification accuracy (95% CI: 98.9% to 99.6%), with a Cohen’s kappa of 0.99, that is, 99% agreement beyond chance. Misclassifications were rare across all five civic classes (16 cases in total), and class-level sensitivity ranged from 98.1% to 100%. These results indicate that the civic classification system is highly reliable and robust under independent validation.

# weights:  70 (52 variable)
initial  value 3621.235303 
iter  10 value 1137.987191
iter  20 value 718.359134
iter  30 value 294.743502
iter  40 value 52.706094
iter  50 value 4.468170
iter  60 value 0.011323
final  value 0.000061 
converged

Confusion Matrix and Statistics

             Reference
Prediction    submarine tender_boat cruise_ship tugboat lightship
  submarine         345           1           0       4         0
  tender_boat         0         651           0       2         0
  cruise_ship         0           3         562       0         0
  tugboat             0           2           0     526         0
  lightship           0           0           0       4       145

Overall Statistics
                                          
               Accuracy : 0.9929          
                 95% CI : (0.9885, 0.9959)
    No Information Rate : 0.2927          
    P-Value [Acc > NIR] : < 2.2e-16       
                                          
                  Kappa : 0.9907          
                                          
 Mcnemar's Test P-Value : NA              

Statistics by Class:

                     Class: submarine Class: tender_boat Class: cruise_ship
Sensitivity                    1.0000             0.9909             1.0000
Specificity                    0.9974             0.9987             0.9982
Pos Pred Value                 0.9857             0.9969             0.9947
Neg Pred Value                 1.0000             0.9962             1.0000
Prevalence                     0.1537             0.2927             0.2503
Detection Rate                 0.1537             0.2900             0.2503
Detection Prevalence           0.1559             0.2909             0.2517
Balanced Accuracy              0.9987             0.9948             0.9991
                     Class: tugboat Class: lightship
Sensitivity                  0.9813          1.00000
Specificity                  0.9988          0.99810
Pos Pred Value               0.9962          0.97315
Neg Pred Value               0.9942          1.00000
Prevalence                   0.2388          0.06459
Detection Rate               0.2343          0.06459
Detection Prevalence         0.2352          0.06637
Balanced Accuracy            0.9901          0.99905
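The headline statistics in this output can be recomputed from the confusion matrix alone. The sketch below assumes the standard definitions of accuracy, Cohen's kappa, and sensitivity, with rows as predictions and columns as reference classes, as in the printed matrix. The function names are illustrative.

```python
CLASSES = ["submarine", "tender_boat", "cruise_ship", "tugboat", "lightship"]

# Rows = predicted class, columns = reference (LCA) class.
CONFUSION = [
    [345, 1, 0, 4, 0],
    [0, 651, 0, 2, 0],
    [0, 3, 562, 0, 0],
    [0, 2, 0, 526, 0],
    [0, 0, 0, 4, 145],
]


def accuracy(matrix):
    """Share of cases on the diagonal (predicted class == reference class)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohens_kappa(matrix):
    """Agreement beyond chance: (p_observed - p_expected) / (1 - p_expected)."""
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    p_observed = accuracy(matrix)
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (p_observed - p_expected) / (1 - p_expected)


def sensitivity(matrix, j):
    """Share of reference-class j cases that were predicted as class j."""
    return matrix[j][j] / sum(row[j] for row in matrix)


print(f"accuracy = {accuracy(CONFUSION):.4f}")
print(f"kappa    = {cohens_kappa(CONFUSION):.4f}")
for j, cls in enumerate(CLASSES):
    print(f"sensitivity[{cls}] = {sensitivity(CONFUSION, j):.4f}")
```

Reading down the tugboat column shows where most of the 16 errors occur: 10 respondents whom the LCA placed in the tugboat class were assigned to other classes by the model.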

1.7.1.1 Conclusion

The split-sample validation confirms that the five civic classes are highly stable and reproducible. When the model was trained on one half of the data and tested on the other, it reproduced the original class assignments almost perfectly.

This means the classification system is not dependent on a particular sample or statistical artifact. The underlying civic behavior patterns are consistent and reliable.

Importantly, the same model logic powers the online Civic Navigator. As a result, the Civic Navigator provides a dependable, empirically validated map of the civic landscape.

In short, users can trust the Civic Navigator. Its civic classifications are stable, reflecting meaningful patterns of civic engagement.