Replicating the Civic Navigator’s algorithms

Author

Alberto F Cabrera

Published

March 1, 2026

1 Using MLM to replicate LCA civic classes

1.1 Background

In 2023, the Florida Poly group developed an algorithm to replicate the 5 civic classes generated by Mplus’ Latent Class Analysis (LCA). The online Civic Navigator relies on this algorithm to assign users to their corresponding civic class. The team wrote three Markdown files that I consulted to learn how they created this formula. Those files are:

  • COAT Data Analysis for insights dated June 29, 2023

  • COAT Data Analysis for insights-test dated June 29, 2023

  • COAT Data Analysis for insights dated August 29, 2023

The source of their analysis was the Excel file labeled Combinations 12 behaviors & classes - no missing data created on June 5, 2022 (civic_classes_06_05_2022). In turn, the source of the Excel file is an Mplus LCA analysis based on the Stata file 2021 National Survey on America Civic Health-Master File.dta.

For each of the 4,495 cases that comprise the 2021 National Survey on America Civic Health (2021 NASCH), the Excel file reports their answers to 12 civic behaviors as well as their corresponding civic class assignment. For more information, see the Civic Navigator Report.

In 2024, the original civic classes were relabeled from animal species to classes of ships as follows:

  1. Mammals = Lightships

  2. Birds = Cruise Ships

  3. Reptiles = Tugboats

  4. Fish = Submarines

  5. Amphibians = Tender Boats
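
For anyone working with files that still carry the animal labels, the relabeling above can be applied with a simple lookup. The sketch below is a minimal illustration in base R. The names `ship_label` and `old_labels` are hypothetical, and the ship names follow the spelling used in the model output (for example, `tender_boat`).

```r
# Lookup from the original animal labels to the 2024 ship labels.
# `ship_label` is a hypothetical helper name, not part of the original files.
ship_label <- c(
  Mammals    = "lightship",
  Birds      = "cruise_ship",
  Reptiles   = "tugboat",
  Fish       = "submarine",
  Amphibians = "tender_boat"
)

# Example: recode a character vector of legacy labels
old_labels <- c("Fish", "Birds", "Mammals")
new_labels <- unname(ship_label[old_labels])
new_labels
```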

1.2 Strategy

To replicate the Civic Navigator’s civic class assignment algorithm, I used a model-training approach based on the 2021 NASCH dataset. Specifically, I estimated multinomial logistic regression (MLM) models using the same 12 civic behaviors that informed the original Latent Class Analysis (LCA). The goal was to determine whether a regression-based model could accurately reproduce the five civic classes previously identified by LCA.

To evaluate the performance of the MLM results, I conducted three validation checks. First, I compared the MLM-generated class assignments with the original LCA classifications for all respondents in the 2021 NASCH dataset. This step assessed the degree of agreement between the MLM and LCA methods. Second, I conducted a stress test by applying the MLM model to five hypothetical cases to determine whether the predicted classifications matched those generated by the online Civic Navigator. Finally, I conducted a split-sample validation test to assess the MLM model’s reliability in reproducing the civic classes under independent conditions.

In essence, this strategy tests whether two different statistical approaches converge on the same civic class structure (see Figure 1). LCA identifies hidden groups based on shared behavioral patterns, while MLM uses observed behaviors to estimate the probability that an individual belongs to each class and assigns the class with the highest probability. Strong agreement between the two methods increases confidence that the civic classification system is stable and can be reliably implemented using a regression-based algorithm. The split-sample validation provides additional evidence that the new algorithm is robust and not dependent on a particular sample.

Figure 1. Replication & Validation Strategy: LCA & MLM
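
The assignment rule described above can be written compactly. The notation below is introduced here only for illustration and does not appear in the original analysis: class 1 is the reference class (submarine), x_i is the vector of 12 behavior indicators for respondent i, and each of the other four classes has its own intercept and 12 slopes, which gives 4 × 13 = 52 estimated parameters.

```latex
\[
P(C_i = k \mid \mathbf{x}_i) =
  \frac{\exp\!\left(\alpha_k + \mathbf{x}_i^{\top}\boldsymbol{\beta}_k\right)}
       {1 + \sum_{j=2}^{5} \exp\!\left(\alpha_j + \mathbf{x}_i^{\top}\boldsymbol{\beta}_j\right)},
  \qquad k = 2, \dots, 5,
\]
\[
P(C_i = 1 \mid \mathbf{x}_i) =
  \frac{1}{1 + \sum_{j=2}^{5} \exp\!\left(\alpha_j + \mathbf{x}_i^{\top}\boldsymbol{\beta}_j\right)},
  \qquad
  \hat{C}_i = \arg\max_{k \in \{1, \dots, 5\}} P(C_i = k \mid \mathbf{x}_i).
\]
```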

1.3 Verifying LCA classes

The civic_classes_06_05_2022 file contains the original Mplus LCA classes. Their frequencies are as follows:

# A tibble: 5 × 3
  civic_class     n percent
  <fct>       <int>   <dbl>
1 submarine     691   15.4 
2 tender_boat  1315   29.2 
3 cruise_ship  1125   25.0 
4 tugboat      1073   23.9 
5 lightship     291    6.47

1.4 Using MLM to estimate civic classes

This section documents the civic classes estimated under multinomial logistic regression analyses (MLM). It is based on the 2021 NASCH sample of 4,495 subjects who reported their civic engagement across 12 behaviors.

The first step was to estimate the intercept model. This model is a baseline. It helps to assess the extent to which adding the 12 civic behaviors as predictors of civic classes improves the model’s fit. Next, I estimated the full MLM with all 12 civic behaviors. Finally, I assessed the resulting improvement in fit of the full model relative to the intercept model.
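
The three steps above can be sketched in R as follows. This is a minimal illustration, not the original script: it assumes the `nnet` package (whose `multinom()` prints output in the format shown in this report) and uses a small simulated data set with only three behaviors and three classes. The object names `toy`, `m0`, and `m1` are hypothetical.

```r
library(nnet)  # provides multinom(); shipped with R as a recommended package

set.seed(123)  # arbitrary seed so the toy example is reproducible
n <- 300

# Hypothetical toy data, NOT the 2021 NASCH: three binary behaviors
toy <- data.frame(
  vote_cong   = rbinom(n, 1, 0.6),
  vol_charity = rbinom(n, 1, 0.4),
  donate_food = rbinom(n, 1, 0.5)
)

# A noisy three-level class, so the behaviors do not separate it perfectly
score <- toy$vote_cong + toy$vol_charity + toy$donate_food + rnorm(n)
toy$civic_class <- cut(score, breaks = c(-Inf, 1, 2, Inf),
                       labels = c("low", "mid", "high"))

# Step 1: intercept-only baseline
m0 <- multinom(civic_class ~ 1, data = toy, trace = FALSE)

# Step 2: full model with the behaviors as predictors
m1 <- multinom(civic_class ~ vote_cong + vol_charity + donate_food,
               data = toy, trace = FALSE)

# Step 3: compare fit (lower AIC is better; anova() runs a likelihood ratio test)
c(AIC_intercept = AIC(m0), AIC_full = AIC(m1))
anova(m0, m1)
```

With five classes and 12 behaviors the same calls apply; only the formula and the data change.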

1.4.1 Intercept model only

# weights:  10 (4 variable)
initial  value 7234.423416 
final  value 6802.251610 
converged
  civic class
Predictors Odds Ratios std. Error Response
(Intercept) 1.903 *** 0.089 tender_boat
(Intercept) 1.628 *** 0.079 cruise_ship
(Intercept) 1.553 *** 0.076 tugboat
(Intercept) 0.421 *** 0.029 lightship
Observations 4495
R2 / R2 adjusted 0.000 / -0.000
AIC 13612.503
* p<0.05   ** p<0.01   *** p<0.001

The intercept model displays poor fit indices. R2 = 0 indicates that this model, which contains no predictors, explains none of the variation among the civic classes. Its AIC = 13,612 and BIC = 13,638 serve as the baseline against which the full model is judged.

1.4.2 Full model

The full model estimates the impact of the 12 civic behaviors on the probability of being assigned to each of the 5 civic classes. Data scientists refer to this strategy as “training the model”. The trained model can be used to predict the civic class membership of new users of the Civic Navigator.

Tip: The MLM full model can be used to assign future survey takers to their corresponding civic classes.

1.4.2.1 MLM model

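The predictors and estimates of the civic class model are reported below. Note that the optimizer stopped at its 100-iteration limit rather than reporting convergence, and that the log-odds and several standard errors are extreme. This pattern is characteristic of complete separation, which is expected when the predictors determine class membership exactly. The individual coefficients and significance stars should therefore not be interpreted on their own, although the predicted classes remain usable.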

# weights:  70 (52 variable)
initial  value 7234.423416 
iter  10 value 2688.669387
iter  20 value 1493.066862
iter  30 value 1046.646970
iter  40 value 538.026844
iter  50 value 117.356639
iter  60 value 44.430657
iter  70 value 26.337779
iter  80 value 4.809437
iter  90 value 0.325707
iter 100 value 0.022822
final  value 0.022822 
stopped after 100 iterations
  civic class
Predictors Log-Odds std. Error Response
(Intercept) -1063.968 857.868 tender_boat
vote cong 2292.411 ** 857.865 tender_boat
vote state 3211.766 *** 857.868 tender_boat
pol rallies -1160.618 ** 419.349 tender_boat
lobbying -2072.489 *** 419.349 tender_boat
pol campaign -1481.351 *** 0.000 tender_boat
vol faith -536.952 *** 0.000 tender_boat
vol charity -2811.393 *** 445.992 tender_boat
vol cure -2333.738 *** 0.000 tender_boat
vol culture -1913.349 *** 0.001 tender_boat
donate pol -2475.774 *** 0.000 tender_boat
give noprof -1071.046 *** 0.000 tender_boat
donate food -175.751 *** 0.001 tender_boat
(Intercept) -3400.976 ** 1057.455 cruise_ship
vote cong 2397.054 ** 826.939 cruise_ship
vote state 2947.744 1573.186 cruise_ship
pol rallies -1071.585 *** 0.153 cruise_ship
lobbying -2067.762 1787.521 cruise_ship
pol campaign -1233.252 706.529 cruise_ship
vol faith -288.295 2734.998 cruise_ship
vol charity -321.544 1233.029 cruise_ship
vol cure -566.890 1665.075 cruise_ship
vol culture -993.312 1319.717 cruise_ship
donate pol -268.786 496.010 cruise_ship
give noprof 632.374 964.209 cruise_ship
donate food 1435.514 1493.334 cruise_ship
(Intercept) -1380.904 4920.806 tugboat
vote cong 886.334 *** 164.281 tugboat
vote state 662.352 2428.059 tugboat
pol rallies 525.918 419.181 tugboat
lobbying 508.538 1371.500 tugboat
pol campaign 672.019 706.516 tugboat
vol faith 350.025 4116.454 tugboat
vol charity 441.454 5161.604 tugboat
vol cure 568.968 5171.188 tugboat
vol culture 560.111 1319.703 tugboat
donate pol 453.504 531.675 tugboat
give noprof 368.435 927.915 tugboat
donate food 184.272 1455.890 tugboat
(Intercept) -7556.105 *** 0.015 lightship
vote cong 1166.597 *** 0.014 lightship
vote state 1299.892 *** 0.014 lightship
pol rallies 1285.872 *** 0.015 lightship
lobbying 984.141 *** 0.001 lightship
pol campaign 1342.971 *** 0.015 lightship
vol faith 717.002 *** 0.015 lightship
vol charity 1298.641 *** 0.015 lightship
vol cure 1293.679 *** 0.015 lightship
vol culture 937.631 *** 0.015 lightship
donate pol 1188.964 *** 0.015 lightship
give noprof 880.624 *** 0.001 lightship
donate food 915.719 *** 0.015 lightship
Observations 4495
R2 / R2 adjusted 1.000 / 1.000
AIC 104.046
* p<0.05   ** p<0.01   *** p<0.001

1.4.2.2 Saving the trained MLM model

I stored the fully trained MLM model (coefficients, reference class, and structure) as a transportable RDS file (mlm_civic_navigator_model.rds). This model can be used to replicate the algorithm embedded in the online Civic Navigator. To display the model’s coefficients, type coef(mlm_civic_navigator_model).
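
The save-and-reload workflow can be sketched as below. This is an assumption-laden illustration rather than the original code: the stand-in model is fitted to R's built-in `iris` data, not to the civic data, and it is written to a temporary file instead of mlm_civic_navigator_model.rds. The names `fit`, `fit_loaded`, and `path` are hypothetical.

```r
library(nnet)  # needed both to fit the model and to predict from it later

# Stand-in model fitted to R's built-in iris data (NOT the civic model)
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)

# Save to a temporary file; a real project would use a stable path instead
path <- tempfile(fileext = ".rds")
saveRDS(fit, path)

# Later, or on another machine: reload the model and use it
fit_loaded <- readRDS(path)
coef(fit_loaded)  # one row of coefficients per non-reference class
predict(fit_loaded, newdata = iris[c(1, 51, 101), ], type = "class")
```

Because the RDS file stores the whole fitted object, `predict()` works on the reloaded model as long as `nnet` is installed and the new data use the same column names.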

1.4.2.3 Comparison between models

This section reports the extent to which the full model of 12 behaviors represents a significant improvement in fit relative to the intercept model. The comparison uses the deviance of each model and indicators of overall fit, namely the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Lower AIC and BIC values signify a better fit. The ANOVA test, a likelihood ratio test between the two nested models, provides a formal assessment of the improvement in fit.

The AIC = 104.1 and BIC = 437.4 for the MLM full model are substantially lower than those of the MLM intercept model: AIC = 13,612.5, BIC = 13,638.2.
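
These indices can be recomputed by hand from the optimizer output, which is a useful consistency check. The sketch below assumes that the "final value" lines are negative log-likelihoods and that the "variable" count in the weights line is the number of estimated parameters. The object names are hypothetical.

```r
n <- 4495  # observations reported in the model tables

# Values copied from the optimizer output: final negative log-likelihood
# and number of estimated parameters for each model
nll <- c(intercept = 6802.251610, full = 0.022822)
k   <- c(intercept = 4,           full = 52)

dev <- 2 * nll            # residual deviance
aic <- dev + 2 * k        # Akaike Information Criterion
bic <- dev + log(n) * k   # Bayesian Information Criterion

round(rbind(deviance = dev, AIC = aic, BIC = bic), 3)
```

Up to rounding, the results agree with the AIC values printed in the model tables.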

1.4.2.3.1 ANOVA test

Likelihood ratio tests of Multinomial Models

Response: civic_class
  Model
1 1
2 vote_cong + vote_state + pol_rallies + lobbying + pol_campaign + vol_faith + vol_charity + vol_cure + vol_culture + donate_pol + give_noprof + donate_food
  Resid. df   Resid. Dev   Test    Df LR stat. Pr(Chi)
1     17976 1.360450e+04                              
2     17928 4.564458e-02 1 vs 2    48 13604.46       0

1.4.2.3.2 Results

The full model with civic behaviors improves the prediction of civic class membership well beyond the intercept-only model. Its log-likelihood (-0.0228) is far closer to zero than that of the intercept-only model (-6,802.25). Equivalently, the residual deviance drops from 13,604.50 to 0.046, indicating a much better fit to the data.

Moreover, the ANOVA test corroborates that the decrease in deviance associated with the full MLM model is statistically significant (χ² = 13,604, df = 48, p < .001).
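
The likelihood ratio statistic and its degrees of freedom can be recovered directly from the ANOVA table. The short check below uses only the residual deviances and residual degrees of freedom printed there. The object names are hypothetical.

```r
# Residual deviances and residual df copied from the likelihood ratio table
dev_intercept <- 13604.50
df_intercept  <- 17976
dev_full      <- 0.04564458
df_full       <- 17928

lr_stat <- dev_intercept - dev_full   # likelihood ratio statistic
lr_df   <- df_intercept - df_full     # number of added parameters
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

c(LR = lr_stat, df = lr_df, p = p_value)
```

The difference in residual degrees of freedom is 48, which equals the 4 non-reference classes times the 12 added predictors.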

1.5 Convergence between the MLM & the LCA model

This section documents the extent to which the MLM strategy reproduces the LCA model. The strategy consisted of estimating the probabilities of belonging to each of the 5 classes for every member of the 2021 NASCH. Next, the section reports the results of comparing the civic classes predicted by MLM with those documented by LCA.

As a second test, five carefully constructed hypothetical cases were entered into the online Civic Navigator. Each case represents one of the five civic categories based on the probabilities of engagement in the 12 civic behaviors for each of the five civic classes (see Table 2: Probabilities of Engagement in the Civic Navigator Report).

1.5.1 MLM Probabilities of class membership

Using the trained MLM model, I estimated the probabilities and identified the civic class membership for each member of the 2021 NASCH. Below are the predicted class probabilities for 10 randomly selected cases. Notice that some probabilities are reported in scientific notation (for example, 2.07e-126). These values are effectively zero.

     submarine tender_boat   cruise_ship       tugboat lightship
1131         0           1 2.072622e-126  0.000000e+00         0
6            0           1  0.000000e+00  0.000000e+00         0
2143         0           0  1.000000e+00  9.118273e-31         0
7            0           1  0.000000e+00  0.000000e+00         0
2575         1           0  0.000000e+00  0.000000e+00         0
1567         0           0  1.000000e+00 1.409121e-213         0
3134         0           0  0.000000e+00  0.000000e+00         1
3040         1           0  0.000000e+00  0.000000e+00         0
1442         0           0  1.000000e+00  0.000000e+00         0
1239         0           1 3.870810e-277  0.000000e+00         0

1.5.2 Identifying class membership

The procedure consisted of identifying each subject’s class membership by selecting the class with the highest probability across the 5 civic classes. The table below reports the number of cases in each of the 5 civic classes for the whole 2021 NASCH. Note that the percent column is expressed as a proportion.

 classified_df$civic_class_MLM    n   percent
                     submarine  691 0.1537264
                   tender_boat 1315 0.2925473
                   cruise_ship 1125 0.2502781
                       tugboat 1073 0.2387097
                     lightship  291 0.0647386
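
The highest-probability rule can be illustrated with a few rows of the probability table. The sketch below copies four of the ten cases shown in section 1.5.1 into a small matrix. The object name `probs` is hypothetical, and `civic_class_MLM` mirrors the column name in the table above.

```r
# Four rows copied from the probability table (row names are case IDs)
probs <- rbind(
  "1131" = c(0, 1, 2.072622e-126, 0, 0),
  "2143" = c(0, 0, 1, 9.118273e-31, 0),
  "2575" = c(1, 0, 0, 0, 0),
  "3134" = c(0, 0, 0, 0, 1)
)
colnames(probs) <- c("submarine", "tender_boat", "cruise_ship",
                     "tugboat", "lightship")

# Rounding removes the scientific notation for display purposes
round(probs, 3)

# Assign each case to the class with the highest probability
civic_class_MLM <- colnames(probs)[max.col(probs, ties.method = "first")]
data.frame(id = rownames(probs), civic_class_MLM)
```

On a fitted `multinom` object, `predict(model, type = "class")` applies the same rule without the intermediate matrix.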

1.5.3 Congruence between the new algorithm and LCA

The table reports the level of congruence between the MLM model and the LCA in predicting class membership. The rows display the number of subjects assigned to each of the 5 classes under the MLM model. The columns show the number of subjects assigned to each class by the original LCA model. The diagonal displays the degree of consistency between the two methods. As shown by the diagonal, the MLM model replicated the civic classes uncovered by LCA for all 4,495 cases.

Comparison of predicted classes by original LCA model and MLM model
Diagonal displays the proportion of congruence between the two models

                LCA Model
MLM Predicted   submarine             tender_boat           cruise_ship           tugboat               lightship
Civic Class     N = 691¹              N = 1,315¹            N = 1,125¹            N = 1,073¹            N = 291¹
submarine       691 / 691 (100%)      0 / 1,315 (0%)        0 / 1,125 (0%)        0 / 1,073 (0%)        0 / 291 (0%)
tender_boat     0 / 691 (0%)          1,315 / 1,315 (100%)  0 / 1,125 (0%)        0 / 1,073 (0%)        0 / 291 (0%)
cruise_ship     0 / 691 (0%)          0 / 1,315 (0%)        1,125 / 1,125 (100%)  0 / 1,073 (0%)        0 / 291 (0%)
tugboat         0 / 691 (0%)          0 / 1,315 (0%)        0 / 1,125 (0%)        1,073 / 1,073 (100%)  0 / 291 (0%)
lightship       0 / 691 (0%)          0 / 1,315 (0%)        0 / 1,125 (0%)        0 / 1,073 (0%)        291 / 291 (100%)
¹ n / N (%)

1.5.4 Assessing the degree of association

The degree of association between the LCA civic classes and the MLM civic classes was assessed using the chi-square test and Cramér’s V. The result (χ² = 17,980, df = 16, p < .01) indicates that the civic classes under MLM are statistically associated with the LCA civic classes. The association between the two classifications is perfect (Cramér’s V = 1).
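
Both statistics can be reproduced from the congruence table alone, because every case falls on the diagonal. The sketch below rebuilds that cross-tabulation in base R. Cramér’s V is computed from its standard definition rather than with a dedicated package, and the object names are hypothetical.

```r
classes <- c("submarine", "tender_boat", "cruise_ship", "tugboat", "lightship")
counts  <- c(691, 1315, 1125, 1073, 291)

# Cross-tabulation of MLM (rows) by LCA (columns): all cases on the diagonal
xtab <- diag(counts)
dimnames(xtab) <- list(MLM = classes, LCA = classes)

test <- chisq.test(xtab)
n    <- sum(xtab)

# Cramér's V = sqrt(chi-square / (N * (min(rows, cols) - 1)))
cramers_v <- sqrt(unname(test$statistic) / (n * (min(dim(xtab)) - 1)))

c(chi_square = unname(test$statistic), df = unname(test$parameter), V = cramers_v)
```

For a perfectly diagonal k × k table the chi-square statistic equals N × (k − 1), here 4,495 × 4 = 17,980, which is why V is exactly 1.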

1.5.5 Conclusion

Because the LCA latent classes were derived from the 12 civic behavior indicators, I examined whether class membership could be reconstructed using a trained multinomial model (MLM). Results indicate that the MLM model perfectly reproduces the Mplus LCA civic classes. In essence, membership in the five-class solution is a function of the 12 behaviors.

1.6 Additional stress test of the trained MLM model

This section examines the extent to which the trained MLM reproduces the classes predicted by the online Civic Navigator. As previously noted, I entered five constructed hypothetical cases into the online Civic Navigator. Each case’s pattern of simulated answers mirrors the probabilities of engagement in the 12 civic behaviors for one of the five classes under the LCA model. Those probabilities are reported in Table 2 of the Civic Navigator Report.

# A tibble: 5 × 14
     id civic_cl_sim vote_cong vote_state pol_rallies lobbying pol_campaign
  <dbl> <chr>            <dbl>      <dbl>       <dbl>    <dbl>        <dbl>
1  6000 submarine            0          0           0        0            0
2  6001 tender_boat          1          1           0        0            0
3  6002 cruise_ship          1          1           0        0            0
4  6003 tugboat              1          1           0        0            0
5  6004 lightship            1          1           1        1            1
# ℹ 7 more variables: vol_faith <dbl>, vol_charity <dbl>, vol_cure <dbl>,
#   vol_culture <dbl>, donate_pol <dbl>, give_noprof <dbl>, donate_food <dbl>

As shown below, the MLM algorithm produced the same classifications as the online Civic Navigator’s algorithm across all 5 simulated cases. The diagonal displays 100% agreement between the Civic Navigator’s class assignments and those generated by the MLM algorithm.

Comparison of predicted classes by Civic Navigator and the MLM algorithm
Diagonal displays the proportion of congruence between the two models

                Civic Navigator
MLM Algorithm   submarine     tender_boat   cruise_ship   tugboat       lightship
                N = 1¹        N = 1¹        N = 1¹        N = 1¹        N = 1¹
submarine       1 / 1 (100%)  0 / 1 (0%)    0 / 1 (0%)    0 / 1 (0%)    0 / 1 (0%)
tender_boat     0 / 1 (0%)    1 / 1 (100%)  0 / 1 (0%)    0 / 1 (0%)    0 / 1 (0%)
cruise_ship     0 / 1 (0%)    0 / 1 (0%)    1 / 1 (100%)  0 / 1 (0%)    0 / 1 (0%)
tugboat         0 / 1 (0%)    0 / 1 (0%)    0 / 1 (0%)    1 / 1 (100%)  0 / 1 (0%)
lightship       0 / 1 (0%)    0 / 1 (0%)    0 / 1 (0%)    0 / 1 (0%)    1 / 1 (100%)
¹ n / N (%)

1.6.1 Summary

The evidence suggests that the Civic Navigator is functioning well. In addition, we now have a second, independently developed version of the classification method that future teams can use to verify or reproduce the results with high accuracy (see Figure 2).

1.7 Split-Sample validation

This approach assesses the stability of the civic classification itself. Accordingly, I randomly divided the original 2021 NASCH sample into two groups of nearly equal size: a training sample and a test sample. The test sample contains 2,245 of the 4,495 cases.

Important: Stratified splitting preserves the proportions of the population’s civic classes in each sample.

The model was estimated on the training sample using the same 12 civic behavior indicators and the same reference category (submarine) as in the original specification. Then, the estimated model was used to predict civic class membership for respondents in the test sample. In essence, this approach tests whether the model’s classification structure remains stable when it is applied to new observations.
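
A stratified 50/50 split can be sketched in base R as follows. This is an illustration under stated assumptions, not the original code: the function used for the actual split is not documented here, the seed is arbitrary, and the class vector is rebuilt from the frequencies in section 1.3 rather than read from the data file. Under these assumptions the test-sample class counts come out to 345, 657, 562, 536, and 145, which are the column totals of the confusion matrix in the results.

```r
set.seed(2021)  # arbitrary seed, used only to make this sketch reproducible

# Class vector rebuilt from the reported LCA frequencies (stand-in for the data)
counts <- c(submarine = 691, tender_boat = 1315, cruise_ship = 1125,
            tugboat = 1073, lightship = 291)
civic_class <- factor(rep(names(counts), times = counts), levels = names(counts))

# Within each class, draw half of the row indices (rounded up) for training
idx_by_class <- split(seq_along(civic_class), civic_class)
train_idx <- unlist(
  lapply(idx_by_class, function(i) sample(i, size = ceiling(length(i) / 2))),
  use.names = FALSE
)
test_idx <- setdiff(seq_along(civic_class), train_idx)

# Class counts in each half; proportions match the full sample closely
table(civic_class[train_idx])
table(civic_class[test_idx])
```

The model would then be fitted on the rows in `train_idx` and evaluated on the rows in `test_idx`.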

1.7.1 Split-validation results

The model demonstrated exceptional stability on the test sample. It achieved 99.3% classification accuracy (95% CI: 98.9% to 99.6%), and Cohen’s Kappa = 0.99 indicates near-perfect agreement beyond chance. Only 16 of the 2,245 test cases were misclassified, and class-level sensitivity ranged from 98.1% to 100%. These results indicate that the civic classification system is highly reliable and robust under independent validation.

# weights:  70 (52 variable)
initial  value 3621.235303 
iter  10 value 1137.987191
iter  20 value 718.359134
iter  30 value 294.743502
iter  40 value 52.706094
iter  50 value 4.468170
iter  60 value 0.011323
final  value 0.000061 
converged
Confusion Matrix and Statistics

             Reference
Prediction    submarine tender_boat cruise_ship tugboat lightship
  submarine         345           1           0       4         0
  tender_boat         0         651           0       2         0
  cruise_ship         0           3         562       0         0
  tugboat             0           2           0     526         0
  lightship           0           0           0       4       145

Overall Statistics
                                          
               Accuracy : 0.9929          
                 95% CI : (0.9885, 0.9959)
    No Information Rate : 0.2927          
    P-Value [Acc > NIR] : < 2.2e-16       
                                          
                  Kappa : 0.9907          
                                          
 Mcnemar's Test P-Value : NA              

Statistics by Class:

                     Class: submarine Class: tender_boat Class: cruise_ship
Sensitivity                    1.0000             0.9909             1.0000
Specificity                    0.9974             0.9987             0.9982
Pos Pred Value                 0.9857             0.9969             0.9947
Neg Pred Value                 1.0000             0.9962             1.0000
Prevalence                     0.1537             0.2927             0.2503
Detection Rate                 0.1537             0.2900             0.2503
Detection Prevalence           0.1559             0.2909             0.2517
Balanced Accuracy              0.9987             0.9948             0.9991
                     Class: tugboat Class: lightship
Sensitivity                  0.9813          1.00000
Specificity                  0.9988          0.99810
Pos Pred Value               0.9962          0.97315
Neg Pred Value               0.9942          1.00000
Prevalence                   0.2388          0.06459
Detection Rate               0.2343          0.06459
Detection Prevalence         0.2352          0.06637
Balanced Accuracy            0.9901          0.99905
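
The headline statistics can be recomputed from the confusion matrix itself. The sketch below re-enters the matrix by hand and derives accuracy, Cohen’s kappa, and per-class sensitivity in base R. The object names are hypothetical, and the formulas are the standard definitions rather than calls to the package that produced the output.

```r
classes <- c("submarine", "tender_boat", "cruise_ship", "tugboat", "lightship")

# Confusion matrix copied from the output (rows = prediction, columns = reference)
cm <- matrix(
  c(345,   1,   0,   4,   0,
      0, 651,   0,   2,   0,
      0,   3, 562,   0,   0,
      0,   2,   0, 526,   0,
      0,   0,   0,   4, 145),
  nrow = 5, byrow = TRUE,
  dimnames = list(Prediction = classes, Reference = classes)
)

n           <- sum(cm)
accuracy    <- sum(diag(cm)) / n                       # observed agreement
expected    <- sum(rowSums(cm) * colSums(cm)) / n^2    # agreement expected by chance
cohen_kappa <- (accuracy - expected) / (1 - expected)  # agreement beyond chance
sensitivity <- diag(cm) / colSums(cm)                  # share of each true class recovered

round(c(accuracy = accuracy, kappa = cohen_kappa), 4)
round(sensitivity, 4)
```

The 16 off-diagonal cases account for the whole gap between the observed accuracy and 100%.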

1.7.1.1 Conclusion

The split-sample validation confirms that the five civic classes are highly stable and reproducible. When the model was trained on one half of the data and tested on the other, it reproduced the original class assignments almost perfectly.

This means the classification system is not dependent on a particular sample or statistical artifact. The underlying civic behavior patterns are consistent and reliable.

Importantly, the same model logic powers the online Civic Navigator. As a result, the Civic Navigator provides a dependable, empirically validated map of the civic landscape.

In short, users can trust the Civic Navigator. Its civic classifications are stable, reflecting meaningful patterns of civic engagement.