Research Title

Heterogeneous Preferences for Sustainable Urban Bus Services: Evidence from a Discrete Choice Experiment

Research Question

What factors influence preferences for sustainable urban bus services, and how do environmental attitudes contribute to heterogeneous choice behavior?

Hypotheses

First Hypothesis

Respondents prefer bus alternatives with:

shorter waiting times, higher reliability, greater safety, easier app usability, lower prices.

Second Hypothesis

Preferences for sustainable bus service attributes differ across latent population segments.

Third Hypothesis

Respondents with stronger environmental concern are more likely to attend to emission reduction attributes.

Setup, Data Loading & Cleaning

Load libraries

library(apollo)

## Warning: package 'apollo' was built under R version 4.4.3

## 
## 
##              . ,,                                                            
##             ,      ,,                                                        
##  ,,,,,,    ,         ,,                                                      
## ,     ,,  ,            ,,,,.                                                 
## ,,     , ,,   ,,,,,,    ,,,                                 //  //           
##   ,     ,,,.   ,,,,,.   ,,      ////                        //  //           
## ,,     ,,,,,.           ,,     // //     //////    /////    //  //    /////  
## ,,,        ,,           ,      //  //    /    //  //   //   //  //   //   // 
##               ,,       ,      ////////   /    //  //   //   //  //   //   // 
##                 ,,   ,,      //     //   /   ///  //   //   //  //   //   // 
##                    ,         //      //  /////      ///      //  //    ///   
##                                          //                                  
##                                          //                                  
## 
## Apollo 0.3.7
## https://www.ApolloChoiceModelling.com
## See url for a detailed manual, examples and a user forum.
## Sign up to the user forum to receive updates on new releases.
## 
## Please cite Apollo in all written material you produce:
## Hess S, Palma D (2019). "Apollo: a flexible, powerful and customisable
## freeware package for choice model estimation and application." Journal
## of Choice Modelling, 32. doi.org/10.1016/j.jocm.2019.100170
## 
## The developers of Apollo acknowledge the substantial support provided by
## the European Research Council (ERC) through the consolidator grant DECISIONS,
## the proof of concept grant APOLLO, and the advanced grant SYNERGY.
## 
## Apollo is integrated into SurveyEngine, a leading stated choice platform
## Try it at www.surveyengine.com/apollo

library(readxl)

## Warning: package 'readxl' was built under R version 4.4.3

library(dplyr)

## Warning: package 'dplyr' was built under R version 4.4.3

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

apollo_initialise()
apollo_lcPars <- NA   # fix Apollo 0.3.7 environment scan bug

Loading the dataset

raw <- read_excel(
  "C:/Users/MAGWALI/Desktop/Bus_DCE_DIB_Encoded_Data.xlsx"
) %>% rename(ID = censusno)

cat("Raw dimensions:", nrow(raw), "rows,", ncol(raw), "columns\n")

## Raw dimensions: 7320 rows, 31 columns

cat("Unique respondents:", n_distinct(raw$ID), "\n")

## Unique respondents: 366

Remove problematic cards (wrong number of alternatives)

raw <- raw %>%
  filter(!(ID == 223 & cardid == 20)) %>%
  filter(!(ID == 224 & cardid == 20)) %>%
  filter(!(ID == 322 & cardid ==  1)) %>%
  filter(!(ID == 323 & cardid ==  1)) %>%
  arrange(ID, cardid, alternative)

cat("After cleaning:", nrow(raw), "rows\n")

## After cleaning: 7312 rows

Convert long format to wide format

attr_cols  <- c("time_long","time_ave","time_short",
                "unaccurate","accurate_ave","accurate",
                "unsafe","safe_ave","safe",
                "difficult","easy",
                "undecrease","decrease_30","decrease_100",
                "price","ascfinal","choice")

indiv_cols <- c("ID","blockid","cardid",
                "gender","age","edu","income","marriage",
                "frequency","envi_con","emmis_con","agree","tax_agree")

alt1 <- raw %>% filter(alternative == 1) %>%
  select(all_of(c(indiv_cols, attr_cols))) %>%
  rename_with(~ paste0(., "_1"), all_of(attr_cols))

alt2 <- raw %>% filter(alternative == 2) %>%
  select(all_of(attr_cols)) %>%
  rename_with(~ paste0(., "_2"), everything())

database <- bind_cols(
  alt1 %>% select(all_of(indiv_cols)),
  alt1 %>% select(ends_with("_1")),
  alt2 %>% select(ends_with("_2"))
) %>%
  mutate(
    choiceVar   = ifelse(choice_1 == 1, 1, 2),
    log_price_1 = log(price_1 + 1),
    log_price_2 = log(price_2 + 1),
    envi_con_s  = as.numeric(scale(envi_con)),
    income_s    = as.numeric(scale(income)),
    age_s       = as.numeric(scale(age))
  )

cat("Wide format:", nrow(database), "rows,", ncol(database), "columns\n")

## Wide format: 3656 rows, 53 columns

Validation

cat("\n--- Validation ---\n")

## 
## --- Validation ---

cat("Rows (expect 3656):      ", nrow(database), "\n")

## Rows (expect 3656):       3656

cat("Respondents (expect 366):", n_distinct(database$ID), "\n")

## Respondents (expect 366): 366

cat("choiceVar values:        ", paste(sort(unique(database$choiceVar)), collapse=", "), "\n")

## choiceVar values:         1, 2

cat("ascfinal_1 (expect 0):  ", unique(database$ascfinal_1), "\n")

## ascfinal_1 (expect 0):   0

cat("ascfinal_2 (expect 1):  ", unique(database$ascfinal_2), "\n")

## ascfinal_2 (expect 1):   1

cat("Price levels (alt 1):   ", paste(sort(unique(database$price_1)), collapse=", "), "\n")

## Price levels (alt 1):    0, 10, 20, 40, 100

CHUNK 2 — BASELINE MNL (LINEAR PRICE)

Purpose: Estimate average preferences. Benchmark model. Reference levels: time_long, unaccurate, unsafe, difficult, undecrease

apollo_control_mnl <- list(
  modelName       = "MNL_Baseline",
  modelDescr      = "Baseline MNL: linear price, all attributes",
  indivID         = "ID",
  outputDirectory = "output/"
)

apollo_beta_mnl <- c(
  b_asc        =  0,
  b_time_ave   =  0.1,
  b_time_short =  0.3,
  b_acc_ave    =  0.1,
  b_acc_high   =  0.3,
  b_safe_ave   =  0.1,
  b_safe_high  =  0.3,
  b_easy       =  0.2,
  b_emiss30    =  0.1,
  b_emiss100   =  0.3,
  b_price      = -0.01
)

apollo_fixed_mnl <- c()

apollo_probabilities_mnl <- function(apollo_beta, apollo_inputs,
                                     functionality = "estimate") {
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  P <- list()
  V <- list()
  
  V[["1"]] <-
    b_asc * ascfinal_1 +
    b_time_ave * time_ave_1 + b_time_short * time_short_1 +
    b_acc_ave  * accurate_ave_1 + b_acc_high * accurate_1 +
    b_safe_ave * safe_ave_1     + b_safe_high * safe_1 +
    b_easy     * easy_1 +
    b_emiss30  * decrease_30_1  + b_emiss100 * decrease_100_1 +
    b_price    * price_1
  
  V[["2"]] <-
    b_asc * ascfinal_2 +
    b_time_ave * time_ave_2 + b_time_short * time_short_2 +
    b_acc_ave  * accurate_ave_2 + b_acc_high * accurate_2 +
    b_safe_ave * safe_ave_2     + b_safe_high * safe_2 +
    b_easy     * easy_2 +
    b_emiss30  * decrease_30_2  + b_emiss100 * decrease_100_2 +
    b_price    * price_2
  
  mnl_settings <- list(
    alternatives = c("1" = 1, "2" = 2),
    avail        = list("1" = 1, "2" = 1),
    choiceVar    = choiceVar,
    V            = V
  )
  
  P[["model"]] <- apollo_mnl(mnl_settings, functionality)
  P <- apollo_panelProd(P, apollo_inputs, functionality)
  P <- apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

apollo_lcPars <- NA
apollo_inputs_mnl <- apollo_validateInputs(
  apollo_beta    = apollo_beta_mnl,
  apollo_fixed   = apollo_fixed_mnl,
  database       = database,
  apollo_control = apollo_control_mnl
)

## Several observations per individual detected based on the value of ID.
##   Setting panelData in apollo_control set to TRUE.
## All checks on apollo_control completed.
## All checks on database completed.

apollo_lcPars <- NA
model_mnl <- apollo_estimate(
  apollo_beta          = apollo_beta_mnl,
  apollo_fixed         = apollo_fixed_mnl,
  apollo_probabilities = apollo_probabilities_mnl,
  apollo_inputs        = apollo_inputs_mnl
)

## WARNING: Element apollo_lcPars in the global environment differs from
##   that inside apollo_inputs. The latter will be used. If you wish to
##   use the former, stop this function by pressing the "Escape" key, and
##   rerun apollo_validateInputs before calling this function. 
## Preparing user-defined functions.
## 
## Testing likelihood function...
## 
## Overview of choices for MNL model component :
##                                        1       2
## Times available                  3656.00 3656.00
## Times chosen                     2125.00 1531.00
## Percentage chosen overall          58.12   41.88
## Percentage chosen when available   58.12   41.88
## 
## 
## Pre-processing likelihood function...
## 
## Testing influence of parameters
## Starting main estimation
## 
## BGW using analytic model derivatives supplied by caller...
## 
## 
## Iterates will be written to: 
##  output/MNL_Baseline_iterations.csv
##     it    nf     F            RELDF    PRELDF    RELDX    MODEL stppar
##      0     1 2.275262462e+03
##      1     3 2.252737914e+03 9.900e-03 9.896e-03 4.09e-02   G   7.89e+00
##      2     5 2.148039032e+03 4.648e-02 4.804e-02 3.30e-01   G   4.52e-01
##      3     6 2.122819176e+03 1.174e-02 1.153e-02 1.91e-01   S   0.00e+00
##      4     7 2.122697139e+03 5.749e-05 5.894e-05 9.18e-03   S   0.00e+00
##      5     8 2.122696723e+03 1.960e-07 2.216e-07 6.14e-04   G   0.00e+00
##      6     9 2.122696714e+03 3.961e-09 4.008e-09 9.70e-05   S   0.00e+00
##      7    10 2.122696714e+03 7.039e-12 7.498e-12 2.47e-06   S   0.00e+00
## 
## ***** Relative function convergence (favorable) *****
## 
## Estimated parameters with approximate standard errors from BHHH matrix:
##                 Estimate     BHHH se BHH t-ratio (0)
## b_asc          -0.162471     0.06298         -2.5798
## b_time_ave      0.068463     0.07545          0.9074
## b_time_short    0.317469     0.05331          5.9549
## b_acc_ave       0.283988     0.07248          3.9182
## b_acc_high      0.613894     0.05840         10.5126
## b_safe_ave      0.609304     0.06735          9.0474
## b_safe_high     1.107313     0.05954         18.5977
## b_easy          0.184909     0.05002          3.6969
## b_emiss30       0.071065     0.06161          1.1534
## b_emiss100      0.464873     0.04499         10.3317
## b_price        -0.009867  7.4271e-04        -13.2849
## 
## Final LL: -2122.6967
## 
## Calculating log-likelihood at equal shares (LL(0)) for applicable
##   models...
## Calculating log-likelihood at observed shares from estimation data
##   (LL(c)) for applicable models...
## Calculating LL of each model component...
## Calculating other model fit measures
## Computing covariance matrix using numerical jacobian of analytical
##   gradient.
##  0%....25%....50%....75%..100%
## Negative definite Hessian with maximum eigenvalue: -82.545791
## 
## Your model was estimated using the BGW algorithm. Please acknowledge
##   this by citing Bunch et al. (1993) - doi.org/10.1145/151271.151279
## 
## Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
##   doi.org/10.1016/j.jocm.2019.100170

apollo_modelOutput(model_mnl, modelOutput_settings = list(printPVal = TRUE))

## Model run by MAGWALI using Apollo 0.3.7 on R 4.4.1 for Windows.
## Please acknowledge the use of Apollo by citing Hess & Palma (2019)
##   DOI 10.1016/j.jocm.2019.100170
##   www.ApolloChoiceModelling.com
## 
## Model name                                  : MNL_Baseline
## Model description                           : Baseline MNL: linear price, all attributes
## Model run at                                : 2026-05-19 17:27:37.617629
## Estimation method                           : bgw
## Estimation diagnosis                        : Relative function convergence (favorable)
## Optimisation diagnosis                      : Maximum found
##      hessian properties                     : Negative definite
##      maximum eigenvalue                     : -82.545791
##      reciprocal of condition number         : 4.41078e-05
## Number of individuals                       : 366
## Number of rows in database                  : 3656
## Number of modelled outcomes                 : 3656
## 
## Number of cores used                        :  1 
## Model without mixing
## 
## LL(start)                                   : -2275.26
## LL at equal shares, LL(0)                   : -2534.15
## LL at observed shares, LL(C)                : -2485.68
## LL(final)                                   : -2122.7
## Rho-squared vs equal shares                  :  0.1624 
## Adj.Rho-squared vs equal shares              :  0.158 
## Rho-squared vs observed shares               :  0.146 
## Adj.Rho-squared vs observed shares           :  0.142 
## AIC                                         :  4267.39 
## BIC                                         :  4335.64 
## 
## Estimated parameters                        : 11
## Time taken (hh:mm:ss)                       :  00:00:1.52 
##      pre-estimation                         :  00:00:0.56 
##      estimation                             :  00:00:0.31 
##      post-estimation                        :  00:00:0.65 
## Iterations                                  :  7  
## 
## Unconstrained optimisation.
## 
## Estimates:
##                 Estimate        s.e.   t.rat.(0)  p(1-sided)    Rob.s.e.
## b_asc          -0.162471     0.05689      -2.856    0.002147     0.05348
## b_time_ave      0.068463     0.06237       1.098    0.136184     0.05740
## b_time_short    0.317469     0.05811       5.463   2.344e-08     0.06841
## b_acc_ave       0.283988     0.06226       4.562   2.539e-06     0.05628
## b_acc_high      0.613894     0.06505       9.438    0.000000     0.07569
## b_safe_ave      0.609304     0.06828       8.924    0.000000     0.07456
## b_safe_high     1.107313     0.06492      17.056    0.000000     0.07667
## b_easy          0.184909     0.04332       4.268   9.853e-06     0.03944
## b_emiss30       0.071065     0.05512       1.289    0.098639     0.05476
## b_emiss100      0.464873     0.05225       8.896    0.000000     0.06248
## b_price        -0.009867  8.2391e-04     -11.976    0.000000  9.7113e-04
##              Rob.t.rat.(0)  p(1-sided)
## b_asc               -3.038    0.001191
## b_time_ave           1.193    0.116498
## b_time_short         4.641   1.736e-06
## b_acc_ave            5.046   2.261e-07
## b_acc_high           8.111   2.220e-16
## b_safe_ave           8.172   1.110e-16
## b_safe_high         14.443    0.000000
## b_easy               4.688   1.378e-06
## b_emiss30            1.298    0.097187
## b_emiss100           7.440   5.018e-14
## b_price            -10.160    0.000000

rho2_mnl <- 1 - (model_mnl$LLout / model_mnl$LL0)
cat("\n--- MNL Baseline Fit ---\n")

## 
## --- MNL Baseline Fit ---

cat("LL:          ", round(model_mnl$LLout, 3), "\n")

## LL:           -2122.697

cat("McFadden R2: ", round(rho2_mnl, 4), "\n")

## McFadden R2:  0.1624

cat("AIC:         ", round(model_mnl$AIC, 3), "\n")

## AIC:          4267.393

cat("BIC:         ", round(model_mnl$BIC, 3), "\n")

## BIC:          4335.639

MNL WITH NONLINEAR (LOG) PRICE

Purpose: Test diminishing price sensitivity. Only change from baseline: b_price * price replaced with b_price_log * log(price + 1)

apollo_control_mnl_log <- list(
  modelName       = "MNL_LogPrice",
  modelDescr      = "MNL: log-transformed price",
  indivID         = "ID",
  outputDirectory = "output/"
)

apollo_beta_mnl_log <- c(
  b_asc        =  0,
  b_time_ave   =  0.1,
  b_time_short =  0.3,
  b_acc_ave    =  0.1,
  b_acc_high   =  0.3,
  b_safe_ave   =  0.1,
  b_safe_high  =  0.3,
  b_easy       =  0.2,
  b_emiss30    =  0.1,
  b_emiss100   =  0.3,
  b_price_log  = -0.3
)

apollo_fixed_mnl_log <- c()

apollo_probabilities_mnl_log <- function(apollo_beta, apollo_inputs,
                                         functionality = "estimate") {
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  P <- list()
  V <- list()
  
  V[["1"]] <-
    b_asc * ascfinal_1 +
    b_time_ave * time_ave_1 + b_time_short * time_short_1 +
    b_acc_ave  * accurate_ave_1 + b_acc_high * accurate_1 +
    b_safe_ave * safe_ave_1     + b_safe_high * safe_1 +
    b_easy     * easy_1 +
    b_emiss30  * decrease_30_1  + b_emiss100 * decrease_100_1 +
    b_price_log * log_price_1
  
  V[["2"]] <-
    b_asc * ascfinal_2 +
    b_time_ave * time_ave_2 + b_time_short * time_short_2 +
    b_acc_ave  * accurate_ave_2 + b_acc_high * accurate_2 +
    b_safe_ave * safe_ave_2     + b_safe_high * safe_2 +
    b_easy     * easy_2 +
    b_emiss30  * decrease_30_2  + b_emiss100 * decrease_100_2 +
    b_price_log * log_price_2
  
  mnl_settings <- list(
    alternatives = c("1" = 1, "2" = 2),
    avail        = list("1" = 1, "2" = 1),
    choiceVar    = choiceVar,
    V            = V
  )
  
  P[["model"]] <- apollo_mnl(mnl_settings, functionality)
  P <- apollo_panelProd(P, apollo_inputs, functionality)
  P <- apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

apollo_lcPars <- NA
apollo_inputs_mnl_log <- apollo_validateInputs(
  apollo_beta    = apollo_beta_mnl_log,
  apollo_fixed   = apollo_fixed_mnl_log,
  database       = database,
  apollo_control = apollo_control_mnl_log
)

## Several observations per individual detected based on the value of ID.
##   Setting panelData in apollo_control set to TRUE.
## All checks on apollo_control completed.
## All checks on database completed.

apollo_lcPars <- NA
model_mnl_log <- apollo_estimate(
  apollo_beta          = apollo_beta_mnl_log,
  apollo_fixed         = apollo_fixed_mnl_log,
  apollo_probabilities = apollo_probabilities_mnl_log,
  apollo_inputs        = apollo_inputs_mnl_log
)

## WARNING: Element apollo_lcPars in the global environment differs from
##   that inside apollo_inputs. The latter will be used. If you wish to
##   use the former, stop this function by pressing the "Escape" key, and
##   rerun apollo_validateInputs before calling this function. 
## Preparing user-defined functions.
## 
## Testing likelihood function...
## 
## Overview of choices for MNL model component :
##                                        1       2
## Times available                  3656.00 3656.00
## Times chosen                     2125.00 1531.00
## Percentage chosen overall          58.12   41.88
## Percentage chosen when available   58.12   41.88
## 
## 
## Pre-processing likelihood function...
## 
## Testing influence of parameters
## Starting main estimation
## 
## BGW using analytic model derivatives supplied by caller...
## 
## 
## Iterates will be written to: 
##  output/MNL_LogPrice_iterations.csv
##     it    nf     F            RELDF    PRELDF    RELDX    MODEL stppar
##      0     1 2.395124849e+03
##      1     3 2.359863853e+03 1.472e-02 1.461e-02 2.74e-02   G   9.00e+00
##      2     5 2.184927979e+03 7.413e-02 7.055e-02 3.13e-01   G   2.97e-01
##      3     6 2.158124379e+03 1.227e-02 1.247e-02 1.43e-01   S   0.00e+00
##      4     7 2.158023232e+03 4.687e-05 4.261e-05 7.54e-03   S   0.00e+00
##      5     8 2.158019858e+03 1.564e-06 1.581e-06 1.42e-03   S   0.00e+00
##      6     9 2.158019842e+03 7.659e-09 8.367e-09 1.43e-04   S   0.00e+00
##      7    10 2.158019841e+03 9.084e-11 9.211e-11 1.29e-05   S   0.00e+00
## 
## ***** Relative function convergence (favorable) *****
## 
## Estimated parameters with approximate standard errors from BHHH matrix:
##                 Estimate     BHHH se BHH t-ratio (0)
## b_asc           -0.33225     0.06310          -5.266
## b_time_ave      -0.08003     0.07442          -1.075
## b_time_short     0.16955     0.05311           3.193
## b_acc_ave        0.35961     0.07294           4.931
## b_acc_high       0.74717     0.05868          12.733
## b_safe_ave       0.45779     0.06618           6.918
## b_safe_high      1.02859     0.06078          16.922
## b_easy           0.19629     0.04958           3.959
## b_emiss30        0.07620     0.06100           1.249
## b_emiss100       0.43048     0.04354           9.887
## b_price_log     -0.16830     0.01759          -9.568
## 
## Final LL: -2158.0198
## 
## Calculating log-likelihood at equal shares (LL(0)) for applicable
##   models...
## Calculating log-likelihood at observed shares from estimation data
##   (LL(c)) for applicable models...
## Calculating LL of each model component...
## Calculating other model fit measures
## Computing covariance matrix using numerical jacobian of analytical
##   gradient.
##  0%....25%....50%....75%..100%
## Negative definite Hessian with maximum eigenvalue: -88.501112
## 
## Your model was estimated using the BGW algorithm. Please acknowledge
##   this by citing Bunch et al. (1993) - doi.org/10.1145/151271.151279
## 
## Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
##   doi.org/10.1016/j.jocm.2019.100170

apollo_modelOutput(model_mnl_log, modelOutput_settings = list(printPVal = TRUE))

## Model run by MAGWALI using Apollo 0.3.7 on R 4.4.1 for Windows.
## Please acknowledge the use of Apollo by citing Hess & Palma (2019)
##   DOI 10.1016/j.jocm.2019.100170
##   www.ApolloChoiceModelling.com
## 
## Model name                                  : MNL_LogPrice
## Model description                           : MNL: log-transformed price
## Model run at                                : 2026-05-19 17:27:39.244725
## Estimation method                           : bgw
## Estimation diagnosis                        : Relative function convergence (favorable)
## Optimisation diagnosis                      : Maximum found
##      hessian properties                     : Negative definite
##      maximum eigenvalue                     : -88.501112
##      reciprocal of condition number         : 0.0248896
## Number of individuals                       : 366
## Number of rows in database                  : 3656
## Number of modelled outcomes                 : 3656
## 
## Number of cores used                        :  1 
## Model without mixing
## 
## LL(start)                                   : -2395.12
## LL at equal shares, LL(0)                   : -2534.15
## LL at observed shares, LL(C)                : -2485.68
## LL(final)                                   : -2158.02
## Rho-squared vs equal shares                  :  0.1484 
## Adj.Rho-squared vs equal shares              :  0.1441 
## Rho-squared vs observed shares               :  0.1318 
## Adj.Rho-squared vs observed shares           :  0.1278 
## AIC                                         :  4338.04 
## BIC                                         :  4406.29 
## 
## Estimated parameters                        : 11
## Time taken (hh:mm:ss)                       :  00:00:0.96 
##      pre-estimation                         :  00:00:0.22 
##      estimation                             :  00:00:0.15 
##      post-estimation                        :  00:00:0.58 
## Iterations                                  :  7  
## 
## Unconstrained optimisation.
## 
## Estimates:
##                 Estimate        s.e.   t.rat.(0)  p(1-sided)    Rob.s.e.
## b_asc           -0.33225     0.05642      -5.889   1.946e-09     0.05221
## b_time_ave      -0.08003     0.05892      -1.358    0.087171     0.05370
## b_time_short     0.16955     0.05530       3.066    0.001085     0.06285
## b_acc_ave        0.35961     0.06256       5.749   4.495e-09     0.05782
## b_acc_high       0.74717     0.06483      11.525    0.000000     0.07635
## b_safe_ave       0.45779     0.06697       6.835   4.088e-12     0.07439
## b_safe_high      1.02859     0.06387      16.105    0.000000     0.07342
## b_easy           0.19629     0.04276       4.590   2.214e-06     0.03894
## b_emiss30        0.07620     0.05549       1.373    0.084826     0.05554
## b_emiss100       0.43048     0.05152       8.355    0.000000     0.06207
## b_price_log     -0.16830     0.01845      -9.120    0.000000     0.02171
##              Rob.t.rat.(0)  p(1-sided)
## b_asc               -6.363   9.874e-11
## b_time_ave          -1.490    0.068071
## b_time_short         2.698    0.003491
## b_acc_ave            6.220   2.491e-10
## b_acc_high           9.786    0.000000
## b_safe_ave           6.154   3.775e-10
## b_safe_high         14.010    0.000000
## b_easy               5.040   2.325e-07
## b_emiss30            1.372    0.085030
## b_emiss100           6.935   2.031e-12
## b_price_log         -7.753   4.441e-15

rho2_mnl_log <- 1 - (model_mnl_log$LLout / model_mnl_log$LL0)
cat("\n--- MNL Log-Price Fit ---\n")

## 
## --- MNL Log-Price Fit ---

cat("LL:          ", round(model_mnl_log$LLout, 3), "\n")

## LL:           -2158.02

cat("McFadden R2: ", round(rho2_mnl_log, 4), "\n")

## McFadden R2:  0.1484

cat("AIC:         ", round(model_mnl_log$AIC, 3), "\n")

## AIC:          4338.04

cat("BIC:         ", round(model_mnl_log$BIC, 3), "\n")

## BIC:          4406.285

cat("\nFunctional form comparison (lower BIC = better):\n")

## 
## Functional form comparison (lower BIC = better):

cat("MNL linear BIC: ", round(model_mnl$BIC, 3), "\n")

## MNL linear BIC:  4335.639

cat("MNL log BIC:    ", round(model_mnl_log$BIC, 3), "\n")

## MNL log BIC:     4406.285

2-CLASS LATENT CLASS MODEL (LCM)

Purpose: Capture preference heterogeneity across two latent segments. Membership predicted by environmental concern and income.

apollo_control_lcm2 <- list(
  modelName       = "LCM_2Class",
  modelDescr      = "2-class LCM with environmental and income predictors",
  indivID         = "ID",
  outputDirectory = "output/"
)

apollo_beta_lcm2 <- c(
  # Class 1: environmentally oriented
  asc_c1        =  0,
  b_time_ave_c1 =  0.1,
  b_time_sh_c1  =  0.3,
  b_acc_ave_c1  =  0.1,
  b_acc_hi_c1   =  0.3,
  b_safe_ave_c1 =  0.1,
  b_safe_hi_c1  =  0.3,
  b_easy_c1     =  0.2,
  b_em30_c1     =  0.2,
  b_em100_c1    =  0.5,
  b_price_c1    = -0.01,
  
  # Class 2: price sensitive (reference class)
  asc_c2        =  0,
  b_time_ave_c2 =  0.1,
  b_time_sh_c2  =  0.3,
  b_acc_ave_c2  =  0.1,
  b_acc_hi_c2   =  0.3,
  b_safe_ave_c2 =  0.1,
  b_safe_hi_c2  =  0.3,
  b_easy_c2     =  0.2,
  b_em30_c2     =  0.05,
  b_em100_c2    =  0.1,
  b_price_c2    = -0.03,
  
  # Class membership (class 2 = reference)
  delta_c1     =  0,
  gamma_envi   =  0.3,
  gamma_income =  0.2
)

apollo_fixed_lcm2 <- c()

apollo_lcPars_lcm2 <- function(apollo_beta, apollo_inputs) {
  # Access directly — no apollo_attach, no apollo_mnl
  delta_c1     <- apollo_beta["delta_c1"]
  gamma_envi   <- apollo_beta["gamma_envi"]
  gamma_income <- apollo_beta["gamma_income"]
  envi_con_s   <- apollo_inputs$database$envi_con_s
  income_s     <- apollo_inputs$database$income_s
  
  # Manual softmax
  V1    <- delta_c1 + gamma_envi * envi_con_s + gamma_income * income_s
  denom <- exp(V1) + 1
  
  lcpars <- list()
  lcpars[["pi_values"]] <- list(
    class1 = exp(V1) / denom,
    class2 = 1       / denom
  )
  return(lcpars)
}

apollo_probabilities_lcm2 <- function(apollo_beta, apollo_inputs,
                                      functionality = "estimate") {
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  P <- list()
  
  # Class 1 utilities
  V1 <- list()
  V1[["1"]] <-
    asc_c1 * ascfinal_1 +
    b_time_ave_c1 * time_ave_1 + b_time_sh_c1 * time_short_1 +
    b_acc_ave_c1  * accurate_ave_1 + b_acc_hi_c1 * accurate_1 +
    b_safe_ave_c1 * safe_ave_1     + b_safe_hi_c1 * safe_1 +
    b_easy_c1     * easy_1 +
    b_em30_c1     * decrease_30_1  + b_em100_c1 * decrease_100_1 +
    b_price_c1    * price_1
  V1[["2"]] <-
    asc_c1 * ascfinal_2 +
    b_time_ave_c1 * time_ave_2 + b_time_sh_c1 * time_short_2 +
    b_acc_ave_c1  * accurate_ave_2 + b_acc_hi_c1 * accurate_2 +
    b_safe_ave_c1 * safe_ave_2     + b_safe_hi_c1 * safe_2 +
    b_easy_c1     * easy_2 +
    b_em30_c1     * decrease_30_2  + b_em100_c1 * decrease_100_2 +
    b_price_c1    * price_2
  
  # Class 2 utilities
  V2 <- list()
  V2[["1"]] <-
    asc_c2 * ascfinal_1 +
    b_time_ave_c2 * time_ave_1 + b_time_sh_c2 * time_short_1 +
    b_acc_ave_c2  * accurate_ave_1 + b_acc_hi_c2 * accurate_1 +
    b_safe_ave_c2 * safe_ave_1     + b_safe_hi_c2 * safe_1 +
    b_easy_c2     * easy_1 +
    b_em30_c2     * decrease_30_1  + b_em100_c2 * decrease_100_1 +
    b_price_c2    * price_1
  V2[["2"]] <-
    asc_c2 * ascfinal_2 +
    b_time_ave_c2 * time_ave_2 + b_time_sh_c2 * time_short_2 +
    b_acc_ave_c2  * accurate_ave_2 + b_acc_hi_c2 * accurate_2 +
    b_safe_ave_c2 * safe_ave_2     + b_safe_hi_c2 * safe_2 +
    b_easy_c2     * easy_2 +
    b_em30_c2     * decrease_30_2  + b_em100_c2 * decrease_100_2 +
    b_price_c2    * price_2
  
  mnl_set <- list(
    alternatives = c("1" = 1, "2" = 2),
    avail        = list("1" = 1, "2" = 1),
    choiceVar    = choiceVar
  )
  
  P[["class_1"]] <- apollo_panelProd(
    apollo_mnl(c(mnl_set, list(componentName = "MNL_c1", V = V1)), functionality),
    apollo_inputs, functionality)
  P[["class_2"]] <- apollo_panelProd(
    apollo_mnl(c(mnl_set, list(componentName = "MNL_c2", V = V2)), functionality),
    apollo_inputs, functionality)
  
  P[["model"]] <- apollo_lc(
    list(
      inClassProb = list(P[["class_1"]], P[["class_2"]]),
      classProb   = pi_values
    ),
    apollo_inputs, functionality
  )
  
  P <- apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

apollo_inputs_lcm2 <- apollo_validateInputs(
  apollo_beta    = apollo_beta_lcm2,
  apollo_fixed   = apollo_fixed_lcm2,
  database       = database,
  apollo_control = apollo_control_lcm2,
  apollo_lcPars  = apollo_lcPars_lcm2
)

## Several observations per individual detected based on the value of ID.
##   Setting panelData in apollo_control set to TRUE.
## All checks on apollo_control completed.
## All checks on database completed.

model_lcm2 <- apollo_estimate(
  apollo_beta          = apollo_beta_lcm2,
  apollo_fixed         = apollo_fixed_lcm2,
  apollo_probabilities = apollo_probabilities_lcm2,
  apollo_inputs        = apollo_inputs_lcm2
)

## WARNING: Element apollo_lcPars in the global environment differs from
##   that inside apollo_inputs. The latter will be used. If you wish to
##   use the former, stop this function by pressing the "Escape" key, and
##   rerun apollo_validateInputs before calling this function. 
## Preparing user-defined functions.
## WARNING: The pre-processing of 'apollo_probabilities' failed during
##   loop expansion. Your model may still run, but this indicates a
##   potential problem. Please contact the developers for assistance! 
## 
## Testing likelihood function...
## 
## Overview of choices for MNL model component MNL_c1:
##                                        1       2
## Times available                  3656.00 3656.00
## Times chosen                     2125.00 1531.00
## Percentage chosen overall          58.12   41.88
## Percentage chosen when available   58.12   41.88
## 
## 
## Overview of choices for MNL model component MNL_c2:
##                                        1       2
## Times available                  3656.00 3656.00
## Times chosen                     2125.00 1531.00
## Percentage chosen overall          58.12   41.88
## Percentage chosen when available   58.12   41.88
## 
## 
## Summary of class allocation for model component :
##          Mean prob.
## Class_1      0.5002
## Class_2      0.4998
## The class allocation probabilities for model component "model" are
##   calculated at the observation level in 'apollo_lcPars', but are used
##   in 'apollo_probabilities' to multiply within class probabilities that
##   are at the individual level. Apollo will average the class allocation
##   probabilities across observations for the same individual level
##   before using them to multiply the within-class probabilities. If your
##   class allocation probabilities are constant across choice situations
##   for the same individual, then this is of no concern. If your class
##   allocation probabilities however vary across choice tasks, then you
##   should change your model specification in 'apollo_probabilities' to
##   only call 'apollo_panelProd' after calling 'apollo_lc'.
## 
## Pre-processing likelihood function...
## INFORMATION: Apollo was not able to compute analytical gradients for
##   your model. This could be because you are using model components for
##   which analytical gradients are not yet implemented, or because you
##   coded your own model functions. If however you only used apollo_mnl,
##   apollo_fmnl, apollo_normalDensity, apollo_ol or apollo_op, then there
##   could be another issue. You might want to ask for help in the Apollo
##   forum (https://www.ApolloChoiceModelling.com/forum) on how to solve
##   this issue. If you do, please post your code and data (if not
##   confidential). 
## Analytical gradients could not be calculated for all components,
##   numerical gradients will be used.
## 
## Testing influence of parameters.........................
## Starting main estimation
## 
## BGW is using FD derivatives for model Jacobian. (Caller did not provide derivatives.)
## 
## 
## Iterates will be written to: 
##  output/LCM_2Class_iterations.csv
##     it    nf     F            RELDF    PRELDF    RELDX    MODEL stppar
##      0     1 2.323544922e+03
##      1     4 2.193202586e+03 5.610e-02 4.227e-02 1.15e-01   G   4.57e-01
##      2     5 2.095329992e+03 4.463e-02 3.548e-02 2.78e-01   G   0.00e+00
##      3     6 2.063504984e+03 1.519e-02 2.857e-02 2.22e-01   S   5.65e-03
##      4     8 2.027886820e+03 1.726e-02 1.485e-02 1.27e-01  G-S  0.00e+00
##      5     9 2.014171652e+03 6.763e-03 6.736e-03 7.37e-02   S   0.00e+00
##      6    10 2.008878572e+03 2.628e-03 2.686e-03 5.56e-02   S   0.00e+00
##      7    11 2.007698848e+03 5.873e-04 5.145e-04 1.95e-02   S   0.00e+00
##      8    12 2.007521453e+03 8.836e-05 1.397e-04 1.72e-02   G   0.00e+00
##      9    14 2.007334110e+03 9.332e-05 7.067e-05 1.37e-02  G-S  0.00e+00
##     10    15 2.007242976e+03 4.540e-05 2.864e-05 1.21e-02   G   0.00e+00
##     11    16 2.007147927e+03 4.735e-05 2.608e-05 1.02e-02   G   0.00e+00
##     12    17 2.007020774e+03 6.335e-05 3.214e-05 1.40e-02   G   0.00e+00
##     13    18 2.006829204e+03 9.545e-05 4.717e-05 1.45e-02   G   0.00e+00
##     14    19 2.006526456e+03 1.509e-04 7.334e-05 2.10e-02   G   0.00e+00
##     15    20 2.006112806e+03 2.062e-04 1.107e-04 3.56e-02   G   0.00e+00
##     16    21 2.005678697e+03 2.164e-04 1.395e-04 1.96e-02   G   0.00e+00
##     17    22 2.005377629e+03 1.501e-04 1.590e-04 1.83e-02   G   0.00e+00
##     18    23 2.005311376e+03 3.304e-05 2.249e-04 1.75e-02   G   0.00e+00
##     19    25 2.004901081e+03 2.046e-04 1.568e-04 1.36e-02  G-S  0.00e+00
##     20    26 2.004765450e+03 6.765e-05 5.075e-05 5.52e-03   S   0.00e+00
##     21    27 2.004705586e+03 2.986e-05 2.340e-05 6.21e-03   S   0.00e+00
##     22    28 2.004684849e+03 1.034e-05 8.440e-06 2.55e-03   S   0.00e+00
##     23    29 2.004680011e+03 2.413e-06 1.928e-06 1.14e-03   S   0.00e+00
##     24    30 2.004678823e+03 5.927e-07 4.685e-07 6.75e-04   S   0.00e+00
##     25    31 2.004678427e+03 1.975e-07 1.614e-07 4.46e-04   S   0.00e+00
##     26    32 2.004678299e+03 6.397e-08 5.082e-08 1.76e-04   S   0.00e+00
##     27    33 2.004678266e+03 1.640e-08 1.447e-08 9.02e-05   S   0.00e+00
##     28    35 2.004678263e+03 1.574e-09 2.651e-09 6.35e-05  G-S  0.00e+00
##     29    36 2.004678261e+03 8.780e-10 7.372e-10 1.35e-05   S   0.00e+00
##     30    37 2.004678261e+03 1.213e-10 1.016e-10 1.02e-05   S   0.00e+00
##     31    38 2.004678261e+03 2.439e-11 2.221e-11 4.35e-06   S   0.00e+00
## 
## ***** Relative function convergence (favorable) *****
## 
## Estimated parameters with approximate standard errors from BHHH matrix:
##                  Estimate     BHHH se BHH t-ratio (0)
## asc_c1          -0.298034    0.078728         -3.7856
## b_time_ave_c1    0.097132    0.092443          1.0507
## b_time_sh_c1     0.294847    0.072236          4.0817
## b_acc_ave_c1     0.385454    0.093058          4.1421
## b_acc_hi_c1      0.634826    0.078091          8.1293
## b_safe_ave_c1    0.927671    0.091113         10.1816
## b_safe_hi_c1     1.402875    0.072588         19.3266
## b_easy_c1        0.038453    0.065682          0.5854
## b_em30_c1       -0.085296    0.077666         -1.0982
## b_em100_c1       0.391250    0.062595          6.2505
## b_price_c1      -0.002923    0.001253         -2.3327
## asc_c2           0.107685    0.196347          0.5484
## b_time_ave_c2   -0.125731    0.233450         -0.5386
## b_time_sh_c2     0.234295    0.197557          1.1860
## b_acc_ave_c2     0.071664    0.221332          0.3238
## b_acc_hi_c2      0.771684    0.238387          3.2371
## b_safe_ave_c2    0.104995    0.202783          0.5178
## b_safe_hi_c2     0.538205    0.195796          2.7488
## b_easy_c2        0.709985    0.150630          4.7134
## b_em30_c2        0.853335    0.199757          4.2719
## b_em100_c2       1.055390    0.179279          5.8869
## b_price_c2      -0.037359    0.004091         -9.1320
## delta_c1         0.876711    0.165693          5.2912
## gamma_envi      -0.016156    0.144540         -0.1118
## gamma_income     0.466706    0.173735          2.6863
## 
## Final LL: -2004.6783
## 
## 
## Summary of class allocation for model component :
##          Mean prob.
## Class_1      0.6966
## Class_2      0.3034
## 
## Calculating log-likelihood at equal shares (LL(0)) for applicable
##   models...
## Calculating log-likelihood at observed shares from estimation data
##   (LL(c)) for applicable models...
## Calculating LL of each model component...
## Calculating other model fit measures
## Computing covariance matrix using numerical methods (numDeriv).
##  0%....25%....50%....75%....100%
## Negative definite Hessian with maximum eigenvalue: -8.574092
## 
## Your model was estimated using the BGW algorithm. Please acknowledge
##   this by citing Bunch et al. (1993) - doi.org/10.1145/151271.151279
## 
## Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
##   doi.org/10.1016/j.jocm.2019.100170

apollo_modelOutput(model_lcm2, modelOutput_settings = list(printPVal = TRUE))

## Model run by MAGWALI using Apollo 0.3.7 on R 4.4.1 for Windows.
## Please acknowledge the use of Apollo by citing Hess & Palma (2019)
##   DOI 10.1016/j.jocm.2019.100170
##   www.ApolloChoiceModelling.com
## 
## Model name                                  : LCM_2Class
## Model description                           : 2-class LCM with environmental and income predictors
## Model run at                                : 2026-05-19 17:27:40.302126
## Estimation method                           : bgw
## Estimation diagnosis                        : Relative function convergence (favorable)
## Optimisation diagnosis                      : Maximum found
##      hessian properties                     : Negative definite
##      maximum eigenvalue                     : -8.574092
##      reciprocal of condition number         : 7.79621e-06
## Number of individuals                       : 366
## Number of rows in database                  : 3656
## Number of modelled outcomes                 : 10968
##                           MNL_c1 : 3656
##                           MNL_c2 : 3656
##                            model : 3656
## 
## Number of cores used                        :  1 
## Model without mixing
## 
## LL(start)                                   : -2323.54
## LL (whole model) at equal shares, LL(0)     : -2534.15
## LL (whole model) at observed shares, LL(C)  : -2485.68
## LL(final, whole model)                      : -2004.68
## Rho-squared vs equal shares                  :  0.2089 
## Adj.Rho-squared vs equal shares              :  0.1991 
## Rho-squared vs observed shares               :  0.1935 
## Adj.Rho-squared vs observed shares           :  0.1843 
## AIC                                         :  4059.36 
## BIC                                         :  4214.46 
## 
## LL(0,class_1)                    : -2534.15
## LL(final,class_1)                : -2240.3
## LL(0,class_2)                    : -2534.15
## LL(final,class_2)                : -3149.59
## 
## Estimated parameters                        : 25
## Time taken (hh:mm:ss)                       :  00:00:49.41 
##      pre-estimation                         :  00:00:0.68 
##      estimation                             :  00:00:12.8 
##      post-estimation                        :  00:00:35.93 
## Iterations                                  :  31  
## 
## Unconstrained optimisation.
## 
## Estimates:
##                  Estimate        s.e.   t.rat.(0)  p(1-sided)    Rob.s.e.
## asc_c1          -0.298034    0.076413     -3.9003   4.804e-05    0.078257
## b_time_ave_c1    0.097132    0.080052      1.2134    0.112496    0.079747
## b_time_sh_c1     0.294847    0.077508      3.8041   7.117e-05    0.090718
## b_acc_ave_c1     0.385454    0.084430      4.5654   2.493e-06    0.082753
## b_acc_hi_c1      0.634826    0.086529      7.3366   1.096e-13    0.101457
## b_safe_ave_c1    0.927671    0.090586     10.2408    0.000000    0.101454
## b_safe_hi_c1     1.402875    0.087052     16.1154    0.000000    0.122546
## b_easy_c1        0.038453    0.057490      0.6689    0.251791    0.054998
## b_em30_c1       -0.085296    0.075039     -1.1367    0.127834    0.088388
## b_em100_c1       0.391250    0.073014      5.3586   4.194e-08    0.098676
## b_price_c1      -0.002923    0.001134     -2.5771    0.004982    0.001172
## asc_c2           0.107685    0.184734      0.5829    0.279974    0.226332
## b_time_ave_c2   -0.125731    0.190687     -0.6594    0.254833    0.211992
## b_time_sh_c2     0.234295    0.177484      1.3201    0.093402    0.185024
## b_acc_ave_c2     0.071664    0.170341      0.4207    0.336983    0.173151
## b_acc_hi_c2      0.771684    0.208214      3.7062  1.0519e-04    0.207189
## b_safe_ave_c2    0.104995    0.179820      0.5839    0.279647    0.217419
## b_safe_hi_c2     0.538205    0.205477      2.6193    0.004406    0.261039
## b_easy_c2        0.709985    0.140066      5.0689   2.000e-07    0.180006
## b_em30_c2        0.853335    0.200904      4.2475   1.081e-05    0.256718
## b_em100_c2       1.055390    0.192472      5.4833   2.087e-08    0.248533
## b_price_c2      -0.037359    0.004301     -8.6863    0.000000    0.006034
## delta_c1         0.876711    0.178341      4.9159   4.418e-07    0.234458
## gamma_envi      -0.016156    0.136290     -0.1185    0.452820    0.171232
## gamma_income     0.466706    0.166195      2.8082    0.002491    0.173801
##               Rob.t.rat.(0)  p(1-sided)
## asc_c1             -3.80837   6.994e-05
## b_time_ave_c1       1.21801    0.111610
## b_time_sh_c1        3.25016  5.7670e-04
## b_acc_ave_c1        4.65787   1.597e-06
## b_acc_hi_c1         6.25707   1.961e-10
## b_safe_ave_c1       9.14376    0.000000
## b_safe_hi_c1       11.44772    0.000000
## b_easy_c1           0.69916    0.242225
## b_em30_c1          -0.96501    0.167269
## b_em100_c1          3.96501   3.670e-05
## b_price_c1         -2.49406    0.006314
## asc_c2              0.47578    0.317114
## b_time_ave_c2      -0.59309    0.276560
## b_time_sh_c2        1.26630    0.102703
## b_acc_ave_c2        0.41388    0.339480
## b_acc_hi_c2         3.72454   9.784e-05
## b_safe_ave_c2       0.48292    0.314577
## b_safe_hi_c2        2.06178    0.019614
## b_easy_c2           3.94424   4.003e-05
## b_em30_c2           3.32401  4.4366e-04
## b_em100_c2          4.24648   1.086e-05
## b_price_c2         -6.19181   2.974e-10
## delta_c1            3.73931   9.226e-05
## gamma_envi         -0.09435    0.462416
## gamma_income        2.68528    0.003623
## 
## 
## Summary of class allocation for model component :
##          Mean prob.
## Class_1      0.6966
## Class_2      0.3034

rho2_lcm2 <- 1 - (model_lcm2$LLout / model_lcm2$LL0)
cat("\n--- 2-Class LCM Fit ---\n")

## 
## --- 2-Class LCM Fit ---

cat("LL:          ", round(model_lcm2$LLout, 3), "\n")

## LL:           -2004.678 -2240.296 -3149.591

cat("McFadden R2: ", round(rho2_lcm2, 4), "\n")

## McFadden R2:  0.2089 0.116 -0.2429

cat("AIC:         ", round(model_lcm2$AIC, 3), "\n")

## AIC:          4059.357

cat("BIC:         ", round(model_lcm2$BIC, 3), "\n")

## BIC:          4214.46

LC-ANA: LATENT CLASS + ATTRIBUTE NON-ATTENDANCE

Purpose: Behavioral extension. Class 1 (Attenders): emission coefficients active Class 2 (Non-attenders): emission coefficients omitted Membership predicted by envi_con_s Tests H3: environmentally concerned respondents attend to emission reduction attributes.

apollo_control_lcana <- list(
  modelName       = "LC_ANA",
  modelDescr      = "LC-ANA: emission attribute non-attendance",
  indivID         = "ID",
  outputDirectory = "output/"
)

apollo_beta_lcana <- c(
  # Class 1: attends to emissions
  asc_c1        =  0,
  b_time_ave_c1 =  0.1,
  b_time_sh_c1  =  0.3,
  b_acc_ave_c1  =  0.1,
  b_acc_hi_c1   =  0.3,
  b_safe_ave_c1 =  0.1,
  b_safe_hi_c1  =  0.3,
  b_easy_c1     =  0.2,
  b_em30_c1     =  0.2,
  b_em100_c1    =  0.5,
  b_price_c1    = -0.01,
  
  # Class 2: ignores emissions (non-attender)
  asc_c2        =  0,
  b_time_ave_c2 =  0.1,
  b_time_sh_c2  =  0.3,
  b_acc_ave_c2  =  0.1,
  b_acc_hi_c2   =  0.3,
  b_safe_ave_c2 =  0.1,
  b_safe_hi_c2  =  0.3,
  b_easy_c2     =  0.2,
  b_price_c2    = -0.02,
  # No emission parameters — this IS the non-attendance
  
  # Class membership
  delta_c1   =  0,
  gamma_envi =  0.5
)

apollo_fixed_lcana <- c()

apollo_lcPars_lcana <- function(apollo_beta, apollo_inputs) {
  # Access directly — no apollo_attach, no apollo_mnl
  delta_c1   <- apollo_beta["delta_c1"]
  gamma_envi <- apollo_beta["gamma_envi"]
  envi_con_s <- apollo_inputs$database$envi_con_s
  
  # Manual softmax
  V1    <- delta_c1 + gamma_envi * envi_con_s
  denom <- exp(V1) + 1
  
  lcpars <- list()
  lcpars[["pi_values"]] <- list(
    class1 = exp(V1) / denom,
    class2 = 1       / denom
  )
  return(lcpars)
}

apollo_probabilities_lcana <- function(apollo_beta, apollo_inputs,
                                       functionality = "estimate") {
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  P <- list()
  
  # Class 1: full utility including emissions
  V1 <- list()
  V1[["1"]] <-
    asc_c1 * ascfinal_1 +
    b_time_ave_c1 * time_ave_1 + b_time_sh_c1 * time_short_1 +
    b_acc_ave_c1  * accurate_ave_1 + b_acc_hi_c1 * accurate_1 +
    b_safe_ave_c1 * safe_ave_1     + b_safe_hi_c1 * safe_1 +
    b_easy_c1     * easy_1 +
    b_em30_c1     * decrease_30_1  + b_em100_c1 * decrease_100_1 +
    b_price_c1    * price_1
  V1[["2"]] <-
    asc_c1 * ascfinal_2 +
    b_time_ave_c1 * time_ave_2 + b_time_sh_c1 * time_short_2 +
    b_acc_ave_c1  * accurate_ave_2 + b_acc_hi_c1 * accurate_2 +
    b_safe_ave_c1 * safe_ave_2     + b_safe_hi_c1 * safe_2 +
    b_easy_c1     * easy_2 +
    b_em30_c1     * decrease_30_2  + b_em100_c1 * decrease_100_2 +
    b_price_c1    * price_2
  
  # Class 2: emission attributes omitted (non-attendance)
  V2 <- list()
  V2[["1"]] <-
    asc_c2 * ascfinal_1 +
    b_time_ave_c2 * time_ave_1 + b_time_sh_c2 * time_short_1 +
    b_acc_ave_c2  * accurate_ave_1 + b_acc_hi_c2 * accurate_1 +
    b_safe_ave_c2 * safe_ave_1     + b_safe_hi_c2 * safe_1 +
    b_easy_c2     * easy_1 +
    b_price_c2    * price_1
  V2[["2"]] <-
    asc_c2 * ascfinal_2 +
    b_time_ave_c2 * time_ave_2 + b_time_sh_c2 * time_short_2 +
    b_acc_ave_c2  * accurate_ave_2 + b_acc_hi_c2 * accurate_2 +
    b_safe_ave_c2 * safe_ave_2     + b_safe_hi_c2 * safe_2 +
    b_easy_c2     * easy_2 +
    b_price_c2    * price_2
  
  mnl_set <- list(
    alternatives = c("1" = 1, "2" = 2),
    avail        = list("1" = 1, "2" = 1),
    choiceVar    = choiceVar
  )
  
  P[["class_1"]] <- apollo_panelProd(
    apollo_mnl(c(mnl_set, list(componentName = "MNL_c1", V = V1)), functionality),
    apollo_inputs, functionality)
  P[["class_2"]] <- apollo_panelProd(
    apollo_mnl(c(mnl_set, list(componentName = "MNL_c2", V = V2)), functionality),
    apollo_inputs, functionality)
  
  P[["model"]] <- apollo_lc(
    list(
      inClassProb = list(P[["class_1"]], P[["class_2"]]),
      classProb   = pi_values
    ),
    apollo_inputs, functionality
  )
  
  P <- apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

apollo_inputs_lcana <- apollo_validateInputs(
  apollo_beta    = apollo_beta_lcana,
  apollo_fixed   = apollo_fixed_lcana,
  database       = database,
  apollo_control = apollo_control_lcana,
  apollo_lcPars  = apollo_lcPars_lcana
)

## Several observations per individual detected based on the value of ID.
##   Setting panelData in apollo_control set to TRUE.
## All checks on apollo_control completed.
## All checks on database completed.

model_lcana <- apollo_estimate(
  apollo_beta          = apollo_beta_lcana,
  apollo_fixed         = apollo_fixed_lcana,
  apollo_probabilities = apollo_probabilities_lcana,
  apollo_inputs        = apollo_inputs_lcana
)

## WARNING: Element apollo_lcPars in the global environment differs from
##   that inside apollo_inputs. The latter will be used. If you wish to
##   use the former, stop this function by pressing the "Escape" key, and
##   rerun apollo_validateInputs before calling this function. 
## Preparing user-defined functions.
## WARNING: The pre-processing of 'apollo_probabilities' failed during
##   loop expansion. Your model may still run, but this indicates a
##   potential problem. Please contact the developers for assistance! 
## 
## Testing likelihood function...
## 
## Overview of choices for MNL model component MNL_c1:
##                                        1       2
## Times available                  3656.00 3656.00
## Times chosen                     2125.00 1531.00
## Percentage chosen overall          58.12   41.88
## Percentage chosen when available   58.12   41.88
## 
## 
## Overview of choices for MNL model component MNL_c2:
##                                        1       2
## Times available                  3656.00 3656.00
## Times chosen                     2125.00 1531.00
## Percentage chosen overall          58.12   41.88
## Percentage chosen when available   58.12   41.88
## 
## 
## Summary of class allocation for model component :
##          Mean prob.
## Class_1      0.5016
## Class_2      0.4984
## The class allocation probabilities for model component "model" are
##   calculated at the observation level in 'apollo_lcPars', but are used
##   in 'apollo_probabilities' to multiply within class probabilities that
##   are at the individual level. Apollo will average the class allocation
##   probabilities across observations for the same individual level
##   before using them to multiply the within-class probabilities. If your
##   class allocation probabilities are constant across choice situations
##   for the same individual, then this is of no concern. If your class
##   allocation probabilities however vary across choice tasks, then you
##   should change your model specification in 'apollo_probabilities' to
##   only call 'apollo_panelProd' after calling 'apollo_lc'.
## 
## Pre-processing likelihood function...
## INFORMATION: Apollo was not able to compute analytical gradients for
##   your model. This could be because you are using model components for
##   which analytical gradients are not yet implemented, or because you
##   coded your own model functions. If however you only used apollo_mnl,
##   apollo_fmnl, apollo_normalDensity, apollo_ol or apollo_op, then there
##   could be another issue. You might want to ask for help in the Apollo
##   forum (https://www.ApolloChoiceModelling.com/forum) on how to solve
##   this issue. If you do, please post your code and data (if not
##   confidential). 
## Analytical gradients could not be calculated for all components,
##   numerical gradients will be used.
## 
## Testing influence of parameters......................
## Starting main estimation
## 
## BGW is using FD derivatives for model Jacobian. (Caller did not provide derivatives.)
## 
## 
## Iterates will be written to: 
##  output/LC_ANA_iterations.csv
##     it    nf     F            RELDF    PRELDF    RELDX    MODEL stppar
##      0     1 2.303998017e+03
##      1     4 2.161634940e+03 6.179e-02 4.264e-02 2.18e-01   G   0.00e+00
##      2     5 2.075322208e+03 3.993e-02 3.012e-02 2.34e-01   G   0.00e+00
##      3     6 2.045028424e+03 1.460e-02 1.462e-02 1.13e-01   G   0.00e+00
##      4     7 2.026567695e+03 9.027e-03 7.728e-03 8.29e-02   S   0.00e+00
##      5     8 2.021228821e+03 2.634e-03 2.388e-03 3.93e-02   S   0.00e+00
##      6     9 2.019125370e+03 1.041e-03 8.753e-04 2.31e-02   S   0.00e+00
##      7    10 2.018453395e+03 3.328e-04 2.961e-04 1.16e-02   S   0.00e+00
##      8    11 2.018205443e+03 1.228e-04 1.028e-04 9.43e-03   S   0.00e+00
##      9    12 2.018186023e+03 9.623e-06 3.552e-05 8.85e-03   G   0.00e+00
##     10    13 2.018143457e+03 2.109e-05 1.962e-05 3.91e-03   S   0.00e+00
##     11    14 2.018138783e+03 2.316e-06 2.090e-06 1.39e-03   S   0.00e+00
##     12    15 2.018137856e+03 4.592e-07 8.067e-07 7.19e-04   G   0.00e+00
##     13    16 2.018136860e+03 4.936e-07 3.835e-07 4.09e-04   S   0.00e+00
##     14    17 2.018136723e+03 6.815e-08 6.314e-08 1.49e-04   G   0.00e+00
##     15    18 2.018136675e+03 2.369e-08 3.941e-08 1.93e-04   G   0.00e+00
##     16    20 2.018136624e+03 2.550e-08 1.875e-08 1.00e-04  G-S  0.00e+00
##     17    21 2.018136613e+03 5.061e-09 6.830e-09 6.33e-05   G   0.00e+00
##     18    22 2.018136611e+03 1.077e-09 6.014e-09 7.43e-05   G   0.00e+00
##     19    23 2.018136602e+03 4.368e-09 3.723e-09 4.97e-05   S   0.00e+00
##     20    24 2.018136601e+03 6.523e-10 5.682e-10 2.19e-05   S   0.00e+00
##     21    25 2.018136601e+03 1.941e-10 1.624e-10 7.15e-06   S   0.00e+00
##     22    26 2.018136601e+03 4.535e-11 4.225e-11 7.57e-06   S   0.00e+00
## 
## ***** Relative function convergence (favorable) *****
## 
## Estimated parameters with approximate standard errors from BHHH matrix:
##                  Estimate     BHHH se BHH t-ratio (0)
## asc_c1          -0.235319    0.070222         -3.3511
## b_time_ave_c1    0.096862    0.085080          1.1385
## b_time_sh_c1     0.292059    0.064409          4.5344
## b_acc_ave_c1     0.306528    0.082069          3.7350
## b_acc_hi_c1      0.580501    0.068544          8.4691
## b_safe_ave_c1    0.843489    0.083110         10.1491
## b_safe_hi_c1     1.329788    0.067255         19.7723
## b_easy_c1        0.121670    0.058951          2.0639
## b_em30_c1        0.042306    0.068562          0.6170
## b_em100_c1       0.519816    0.054825          9.4814
## b_price_c1      -0.004248    0.001096         -3.8747
## asc_c2           0.037150    0.267331          0.1390
## b_time_ave_c2    0.436715    0.381066          1.1460
## b_time_sh_c2     1.292701    0.384172          3.3649
## b_acc_ave_c2     0.575447    0.343712          1.6742
## b_acc_hi_c2      1.669257    0.332724          5.0169
## b_safe_ave_c2   -0.248548    0.277115         -0.8969
## b_safe_hi_c2     0.117955    0.298566          0.3951
## b_easy_c2        0.484515    0.218082          2.2217
## b_price_c2      -0.059294    0.008724         -6.7965
## delta_c1         1.295743    0.159279          8.1351
## gamma_envi       0.148521    0.153752          0.9660
## 
## Final LL: -2018.1366
## 
## 
## Summary of class allocation for model component :
##          Mean prob.
## Class_1      0.7841
## Class_2      0.2159
## 
## Calculating log-likelihood at equal shares (LL(0)) for applicable
##   models...
## Calculating log-likelihood at observed shares from estimation data
##   (LL(c)) for applicable models...
## Calculating LL of each model component...
## Calculating other model fit measures
## Computing covariance matrix using numerical methods (numDeriv).
##  0%....25%....50%....75%....100%
## Negative definite Hessian with maximum eigenvalue: -3.436019
## 
## Your model was estimated using the BGW algorithm. Please acknowledge
##   this by citing Bunch et al. (1993) - doi.org/10.1145/151271.151279
## 
## Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
##   doi.org/10.1016/j.jocm.2019.100170

apollo_modelOutput(model_lcana, modelOutput_settings = list(printPVal = TRUE))

## Model run by MAGWALI using Apollo 0.3.7 on R 4.4.1 for Windows.
## Please acknowledge the use of Apollo by citing Hess & Palma (2019)
##   DOI 10.1016/j.jocm.2019.100170
##   www.ApolloChoiceModelling.com
## 
## Model name                                  : LC_ANA
## Model description                           : LC-ANA: emission attribute non-attendance
## Model run at                                : 2026-05-19 17:28:29.805928
## Estimation method                           : bgw
## Estimation diagnosis                        : Relative function convergence (favorable)
## Optimisation diagnosis                      : Maximum found
##      hessian properties                     : Negative definite
##      maximum eigenvalue                     : -3.436019
##      reciprocal of condition number         : 2.65151e-06
## Number of individuals                       : 366
## Number of rows in database                  : 3656
## Number of modelled outcomes                 : 10968
##                           MNL_c1 : 3656
##                           MNL_c2 : 3656
##                            model : 3656
## 
## Number of cores used                        :  1 
## Model without mixing
## 
## LL(start)                                   : -2304
## LL (whole model) at equal shares, LL(0)     : -2534.15
## LL (whole model) at observed shares, LL(C)  : -2485.68
## LL(final, whole model)                      : -2018.14
## Rho-squared vs equal shares                  :  0.2036 
## Adj.Rho-squared vs equal shares              :  0.1949 
## Rho-squared vs observed shares               :  0.1881 
## Adj.Rho-squared vs observed shares           :  0.18 
## AIC                                         :  4080.27 
## BIC                                         :  4216.76 
## 
## LL(0,class_1)                    : -2534.15
## LL(final,class_1)                : -2186.96
## LL(0,class_2)                    : -2534.15
## LL(final,class_2)                : -4191.03
## 
## Estimated parameters                        : 22
## Time taken (hh:mm:ss)                       :  00:00:35.97 
##      pre-estimation                         :  00:00:0.61 
##      estimation                             :  00:00:8.08 
##      post-estimation                        :  00:00:27.28 
## Iterations                                  :  22  
## 
## Unconstrained optimisation.
## 
## Estimates:
##                  Estimate        s.e.   t.rat.(0)  p(1-sided)    Rob.s.e.
## asc_c1          -0.235319    0.066831     -3.5211  2.1486e-04    0.064890
## b_time_ave_c1    0.096862    0.074312      1.3034    0.096211    0.074908
## b_time_sh_c1     0.292059    0.071773      4.0692   2.359e-05    0.087360
## b_acc_ave_c1     0.306528    0.073834      4.1516   1.651e-05    0.071301
## b_acc_hi_c1      0.580501    0.077507      7.4897   3.453e-14    0.092163
## b_safe_ave_c1    0.843489    0.082667     10.2035    0.000000    0.088565
## b_safe_hi_c1     1.329788    0.078330     16.9767    0.000000    0.101165
## b_easy_c1        0.121670    0.051166      2.3779    0.008705    0.047817
## b_em30_c1        0.042306    0.064156      0.6594    0.254812    0.068148
## b_em100_c1       0.519816    0.063034      8.2465   1.110e-16    0.077489
## b_price_c1      -0.004248    0.001003     -4.2378   1.129e-05  9.8696e-04
## asc_c2           0.037150    0.208882      0.1779    0.429419    0.207006
## b_time_ave_c2    0.436715    0.325742      1.3407    0.090012    0.332459
## b_time_sh_c2     1.292701    0.394245      3.2789  5.2101e-04    0.455495
## b_acc_ave_c2     0.575447    0.243159      2.3665    0.008977    0.216720
## b_acc_hi_c2      1.669257    0.304044      5.4902   2.008e-08    0.334891
## b_safe_ave_c2   -0.248548    0.235799     -1.0541    0.145925    0.284347
## b_safe_hi_c2     0.117955    0.247574      0.4764    0.316879    0.228603
## b_easy_c2        0.484515    0.157615      3.0741    0.001056    0.150626
## b_price_c2      -0.059294    0.008344     -7.1059   5.976e-13    0.009982
## delta_c1         1.295743    0.165710      7.8193   2.665e-15    0.187351
## gamma_envi       0.148521    0.139180      1.0671    0.142959    0.154157
##               Rob.t.rat.(0)  p(1-sided)
## asc_c1              -3.6264  1.4368e-04
## b_time_ave_c1        1.2931    0.097992
## b_time_sh_c1         3.3432  4.1416e-04
## b_acc_ave_c1         4.2991   8.576e-06
## b_acc_hi_c1          6.2986   1.501e-10
## b_safe_ave_c1        9.5239    0.000000
## b_safe_hi_c1        13.1447    0.000000
## b_easy_c1            2.5445    0.005472
## b_em30_c1            0.6208    0.267366
## b_em100_c1           6.7082   9.849e-12
## b_price_c1          -4.3045   8.366e-06
## asc_c2               0.1795    0.428786
## b_time_ave_c2        1.3136    0.094492
## b_time_sh_c2         2.8380    0.002270
## b_acc_ave_c2         2.6553    0.003962
## b_acc_hi_c2          4.9845   3.106e-07
## b_safe_ave_c2       -0.8741    0.191032
## b_safe_hi_c2         0.5160    0.302933
## b_easy_c2            3.2167  6.4844e-04
## b_price_c2          -5.9399   1.426e-09
## delta_c1             6.9161   2.321e-12
## gamma_envi           0.9634    0.167663
## 
## 
## Summary of class allocation for model component :
##          Mean prob.
## Class_1      0.7841
## Class_2      0.2159

rho2_lcana <- 1 - (model_lcana$LLout / model_lcana$LL0)
cat("\n--- LC-ANA Fit ---\n")

## 
## --- LC-ANA Fit ---

cat("LL:          ", round(model_lcana$LLout, 3), "\n")

## LL:           -2018.137 -2186.958 -4191.027

cat("McFadden R2: ", round(rho2_lcana, 4), "\n")

## McFadden R2:  0.2036 0.137 -0.6538

cat("AIC:         ", round(model_lcana$AIC, 3), "\n")

## AIC:          4080.273

cat("BIC:         ", round(model_lcana$BIC, 3), "\n")

## BIC:          4216.764

MODEL COMPARISON & WTP SUMMARY

cat("\n============================================================\n")

## 
## ============================================================

cat("MODEL COMPARISON TABLE\n")

## MODEL COMPARISON TABLE

cat("============================================================\n")

## ============================================================

cat(sprintf("%-20s %10s %6s %10s %10s\n",
            "Model", "LL", "Params", "AIC", "BIC"))

## Model                        LL Params        AIC        BIC

cat(sprintf("%-20s %10.1f %6d %10.1f %10.1f\n",
            "MNL (linear)",
            model_mnl$LLout, length(model_mnl$estimate),
            model_mnl$AIC, model_mnl$BIC))

## MNL (linear)            -2122.7     11     4267.4     4335.6

cat(sprintf("%-20s %10.1f %6d %10.1f %10.1f\n",
            "MNL (log price)",
            model_mnl_log$LLout, length(model_mnl_log$estimate),
            model_mnl_log$AIC, model_mnl_log$BIC))

## MNL (log price)         -2158.0     11     4338.0     4406.3

cat(sprintf("%-20s %10.1f %6d %10.1f %10.1f\n",
            "LCM 2-class",
            model_lcm2$LLout, length(model_lcm2$estimate),
            model_lcm2$AIC, model_lcm2$BIC))

## LCM 2-class             -2004.7     25     4059.4     4214.5
##  LCM 2-class             -2240.3     25     4059.4     4214.5
##  LCM 2-class             -3149.6     25     4059.4     4214.5

cat(sprintf("%-20s %10.1f %6d %10.1f %10.1f\n",
            "LC-ANA",
            model_lcana$LLout, length(model_lcana$estimate),
            model_lcana$AIC, model_lcana$BIC))

## LC-ANA                  -2018.1     22     4080.3     4216.8
##  LC-ANA                  -2187.0     22     4080.3     4216.8
##  LC-ANA                  -4191.0     22     4080.3     4216.8

cat("(Lower AIC/BIC = better fit)\n")

## (Lower AIC/BIC = better fit)

WTP from MNL baseline

cat("\n--- WTP: MNL Baseline (% fare increase accepted) ---\n")

## 
## --- WTP: MNL Baseline (% fare increase accepted) ---

est <- model_mnl$estimate
cat("30%  emission reduction: ",
    round(-(est["b_emiss30"]  / est["b_price"]), 2), "%\n")

## 30%  emission reduction:  7.2 %

cat("100% emission reduction: ",
    round(-(est["b_emiss100"] / est["b_price"]), 2), "%\n")

## 100% emission reduction:  47.11 %

cat("Short vs long wait time: ",
    round(-(est["b_time_short"] / est["b_price"]), 2), "%\n")

## Short vs long wait time:  32.18 %

cat("High vs low safety:      ",
    round(-(est["b_safe_high"]  / est["b_price"]), 2), "%\n")

## High vs low safety:       112.23 %

WTP from 2-Class LCM

cat("\n--- WTP: 2-Class LCM (% fare increase accepted) ---\n")

## 
## --- WTP: 2-Class LCM (% fare increase accepted) ---

est2 <- model_lcm2$estimate
cat("Class 1 — 100% emission reduction: ",
    round(-(est2["b_em100_c1"] / est2["b_price_c1"]), 2), "%\n")

## Class 1 — 100% emission reduction:  133.86 %

cat("Class 2 — 100% emission reduction: ",
    round(-(est2["b_em100_c2"] / est2["b_price_c2"]), 2), "%\n")

## Class 2 — 100% emission reduction:  28.25 %

LC-ANA class membership interpretation

cat("\n--- LC-ANA: Class Membership ---\n")

## 
## --- LC-ANA: Class Membership ---

esta <- model_lcana$estimate
cat("delta_c1 (intercept):     ", round(esta["delta_c1"],   3), "\n")

## delta_c1 (intercept):      1.296

cat("gamma_envi (env concern): ", round(esta["gamma_envi"], 3), "\n")

## gamma_envi (env concern):  0.149

cat("If gamma_envi > 0 and significant:\n")

## If gamma_envi > 0 and significant:

cat("  Higher environmental concern predicts\n")

##   Higher environmental concern predicts

cat("  attending to emission reduction attributes (H3 supported)\n")

##   attending to emission reduction attributes (H3 supported)

Overall Conclusion

This study provides strong evidence that Hanoi bus users value improvements in service quality and are willing to pay for more sustainable urban bus services. However, the analysis also demonstrates substantial preference heterogeneity, indicating that different groups of travelers evaluate service attributes in markedly different ways.

Hypothesis 1 — Supported

The baseline Multinomial Logit (MNL) model confirms that respondents systematically prefer bus alternatives offering higher service quality across several dimensions.

Among all attributes, safety emerged as the most influential factor in mode choice. Respondents were willing to accept an estimated 112% fare increase in exchange for high safety conditions relative to unsafe alternatives. This suggests that perceptions of personal safety are a dominant determinant of bus usage decisions in Hanoi.

Environmental performance was also positively valued. Respondents were willing to pay approximately 47% higher fares for fully zero-emission buses and around 7% more for buses achieving a 30% reduction in emissions. While the coefficient associated with a 30% emission reduction was only marginally significant, the coefficient for full emission elimination was highly significant, indicating that respondents respond much more strongly to complete environmental improvements than to partial reductions.

Service reliability and reduced waiting times also significantly influenced utility. In particular, respondents were willing to pay an estimated 32% fare premium for shorter waiting times compared to long waiting times. Similarly, highly accurate and reliable services generated significant positive utility gains.

Digital accessibility additionally played an important role. The positive and significant coefficient associated with app usability suggests that passengers value convenient and user-friendly digital interfaces when using bus services.

As expected, fare price carried a negative and highly significant coefficient, confirming that higher costs reduce utility and choice probability.

One notable finding is that average waiting time was not statistically significant. This suggests that respondents only perceive substantial utility gains when waiting times become very short, while average waiting times are not considered meaningfully different from long waits.

Hypothesis 2 — Strongly Supported

The 2-Class Latent Class Model (LCM) reveals the presence of two clearly differentiated behavioral segments within the sample population.

Class 1: Service-Quality and Sustainability Oriented (~70%)

The larger segment places substantial emphasis on safety, reliability, and environmental performance. This class exhibits particularly strong preferences for high safety standards and significant emission reductions. Their willingness-to-pay for fully zero-emission buses exceeds 130% of the current fare level, reflecting a strong valuation of environmental sustainability and service quality.

This class also demonstrates relatively low price sensitivity, indicating that respondents in this segment are more willing to trade higher fares for improved service outcomes.

Class 2: Price- and Convenience-Sensitive Travelers (~30%)

The second segment is substantially more price sensitive, with fare coefficients approximately thirteen times larger in magnitude than those observed for Class 1. Although this group still values reliability, convenience, and environmental improvements, their willingness-to-pay for cleaner buses is considerably lower.

Interestingly, environmental attributes remain positive and statistically significant within this class, suggesting that environmentally sustainable services are valued across both segments. However, respondents in this group are less willing or able to absorb large fare increases associated with sustainability improvements.

Income was found to significantly predict class membership. Higher-income respondents were more likely to belong to the quality-oriented segment, while lower-income respondents were more likely to belong to the price-sensitive group. In contrast, environmental concern was not a statistically significant predictor of class membership, which represents an important and somewhat unexpected finding.

The latent class specification also produced a substantial improvement in model fit. The Bayesian Information Criterion (BIC) decreased from 4335.6 in the baseline MNL model to 4214.5 in the 2-Class LCM, confirming the importance of accounting for preference heterogeneity within the population.

Hypothesis 3 — Directionally Supported but Not Statistically Confirmed

The Latent Class Attribute Non-Attendance (LC-ANA) specification classified respondents into two groups: an emission-attending class and a non-attending class that ignored emission attributes during decision-making.

The coefficient associated with environmental concern was positive, indicating that individuals with stronger environmental attitudes were more likely to attend to emission-related attributes. Although this relationship was consistent with theoretical expectations, it did not achieve conventional levels of statistical significance.

Accordingly, Hypothesis 3 can be considered directionally supported but not statistically confirmed. One possible explanation is that the sample size may have limited the statistical power necessary to detect this effect more precisely.

At the same time, the highly significant class allocation intercept suggests that the majority of respondents attended to emission attributes regardless of their stated environmental concern. This finding itself is substantively important, as it indicates that environmental performance is broadly relevant to bus users even among individuals who do not strongly identify as environmentally concerned.

Model Comparison

The linear-price MNL specification outperformed the nonlinear log-price specification based on both AIC and BIC measures. This suggests that respondents exhibit approximately linear sensitivity to fare increases within the experimental range considered in the study.

Among all estimated models, the 2-Class Latent Class Model achieved the best overall performance. The substantial improvement in fit statistics confirms that preference heterogeneity is a defining characteristic of the dataset and that a two-segment representation adequately captures behavioral variation without introducing excessive model complexity.

Policy Implications

Several important policy implications emerge from the findings.

First, investments in safety improvements should be considered a top priority for public transport operators and policymakers. The exceptionally high willingness-to-pay estimates associated with safer services suggest that improvements in vehicle maintenance, operational reliability, driver training, and passenger security are likely to generate substantial increases in perceived service value and ridership.

Second, the findings indicate that transitioning toward cleaner and zero-emission bus fleets is economically and socially feasible, but pricing strategies should account for heterogeneous user preferences. While a majority of respondents appear willing to pay substantial premiums for cleaner services, a sizeable price-sensitive segment may be adversely affected by uniform fare increases. Consequently, phased implementation strategies, targeted subsidies, or differentiated pricing mechanisms may be necessary to maintain accessibility while supporting sustainable transport objectives.

Finally, the significance of app usability highlights the growing importance of digital service integration within urban transport systems. Investments in intuitive mobile applications, real-time service information, and simplified digital ticketing systems may represent relatively low-cost yet highly effective strategies for improving the attractiveness and competitiveness of public transport services, particularly among younger and technology-oriented commuters.

Choice Modelling Project

Langelihle Magwali (472619)

2026-05-07