Heterogeneous Preferences for Sustainable Urban Bus Services: Evidence from a Discrete Choice Experiment
What factors influence preferences for sustainable urban bus services, and how do environmental attitudes contribute to heterogeneous choice behavior?
Respondents prefer bus alternatives with:
shorter waiting times, higher reliability, greater safety, easier app usability, lower prices.
Preferences for sustainable bus service attributes differ across latent population segments.
Respondents with stronger environmental concern are more likely to attend to emission reduction attributes.
Load libraries
library(apollo)
## Warning: package 'apollo' was built under R version 4.4.3
##
##
## . ,,
## , ,,
## ,,,,,, , ,,
## , ,, , ,,,,.
## ,, , ,, ,,,,,, ,,, // //
## , ,,,. ,,,,,. ,, //// // //
## ,, ,,,,,. ,, // // ////// ///// // // /////
## ,,, ,, , // // / // // // // // // //
## ,, , //////// / // // // // // // //
## ,, ,, // // / /// // // // // // //
## , // // ///// /// // // ///
## //
## //
##
## Apollo 0.3.7
## https://www.ApolloChoiceModelling.com
## See url for a detailed manual, examples and a user forum.
## Sign up to the user forum to receive updates on new releases.
##
## Please cite Apollo in all written material you produce:
## Hess S, Palma D (2019). "Apollo: a flexible, powerful and customisable
## freeware package for choice model estimation and application." Journal
## of Choice Modelling, 32. doi.org/10.1016/j.jocm.2019.100170
##
## The developers of Apollo acknowledge the substantial support provided by
## the European Research Council (ERC) through the consolidator grant DECISIONS,
## the proof of concept grant APOLLO, and the advanced grant SYNERGY.
##
## Apollo is integrated into SurveyEngine, a leading stated choice platform
## Try it at www.surveyengine.com/apollo
library(readxl)
## Warning: package 'readxl' was built under R version 4.4.3
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.4.3
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
apollo_initialise()
apollo_lcPars <- NA # fix Apollo 0.3.7 environment scan bug
Loading the dataset
raw <- read_excel(
"C:/Users/MAGWALI/Desktop/Bus_DCE_DIB_Encoded_Data.xlsx"
) %>% rename(ID = censusno)
cat("Raw dimensions:", nrow(raw), "rows,", ncol(raw), "columns\n")
## Raw dimensions: 7320 rows, 31 columns
cat("Unique respondents:", n_distinct(raw$ID), "\n")
## Unique respondents: 366
raw <- raw %>%
filter(!(ID == 223 & cardid == 20)) %>%
filter(!(ID == 224 & cardid == 20)) %>%
filter(!(ID == 322 & cardid == 1)) %>%
filter(!(ID == 323 & cardid == 1)) %>%
arrange(ID, cardid, alternative)
cat("After cleaning:", nrow(raw), "rows\n")
## After cleaning: 7312 rows
attr_cols <- c("time_long","time_ave","time_short",
"unaccurate","accurate_ave","accurate",
"unsafe","safe_ave","safe",
"difficult","easy",
"undecrease","decrease_30","decrease_100",
"price","ascfinal","choice")
indiv_cols <- c("ID","blockid","cardid",
"gender","age","edu","income","marriage",
"frequency","envi_con","emmis_con","agree","tax_agree")
alt1 <- raw %>% filter(alternative == 1) %>%
select(all_of(c(indiv_cols, attr_cols))) %>%
rename_with(~ paste0(., "_1"), all_of(attr_cols))
alt2 <- raw %>% filter(alternative == 2) %>%
select(all_of(attr_cols)) %>%
rename_with(~ paste0(., "_2"), everything())
database <- bind_cols(
alt1 %>% select(all_of(indiv_cols)),
alt1 %>% select(ends_with("_1")),
alt2 %>% select(ends_with("_2"))
) %>%
mutate(
choiceVar = ifelse(choice_1 == 1, 1, 2),
log_price_1 = log(price_1 + 1),
log_price_2 = log(price_2 + 1),
envi_con_s = as.numeric(scale(envi_con)),
income_s = as.numeric(scale(income)),
age_s = as.numeric(scale(age))
)
cat("Wide format:", nrow(database), "rows,", ncol(database), "columns\n")
## Wide format: 3656 rows, 53 columns
cat("\n--- Validation ---\n")
##
## --- Validation ---
cat("Rows (expect 3656): ", nrow(database), "\n")
## Rows (expect 3656): 3656
cat("Respondents (expect 366):", n_distinct(database$ID), "\n")
## Respondents (expect 366): 366
cat("choiceVar values: ", paste(sort(unique(database$choiceVar)), collapse=", "), "\n")
## choiceVar values: 1, 2
cat("ascfinal_1 (expect 0): ", unique(database$ascfinal_1), "\n")
## ascfinal_1 (expect 0): 0
cat("ascfinal_2 (expect 1): ", unique(database$ascfinal_2), "\n")
## ascfinal_2 (expect 1): 1
cat("Price levels (alt 1): ", paste(sort(unique(database$price_1)), collapse=", "), "\n")
## Price levels (alt 1): 0, 10, 20, 40, 100
Purpose: Estimate average preferences. Benchmark model. Reference levels: time_long, unaccurate, unsafe, difficult, undecrease
apollo_control_mnl <- list(
modelName = "MNL_Baseline",
modelDescr = "Baseline MNL: linear price, all attributes",
indivID = "ID",
outputDirectory = "output/"
)
apollo_beta_mnl <- c(
b_asc = 0,
b_time_ave = 0.1,
b_time_short = 0.3,
b_acc_ave = 0.1,
b_acc_high = 0.3,
b_safe_ave = 0.1,
b_safe_high = 0.3,
b_easy = 0.2,
b_emiss30 = 0.1,
b_emiss100 = 0.3,
b_price = -0.01
)
apollo_fixed_mnl <- c()
apollo_probabilities_mnl <- function(apollo_beta, apollo_inputs,
functionality = "estimate") {
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P <- list()
V <- list()
V[["1"]] <-
b_asc * ascfinal_1 +
b_time_ave * time_ave_1 + b_time_short * time_short_1 +
b_acc_ave * accurate_ave_1 + b_acc_high * accurate_1 +
b_safe_ave * safe_ave_1 + b_safe_high * safe_1 +
b_easy * easy_1 +
b_emiss30 * decrease_30_1 + b_emiss100 * decrease_100_1 +
b_price * price_1
V[["2"]] <-
b_asc * ascfinal_2 +
b_time_ave * time_ave_2 + b_time_short * time_short_2 +
b_acc_ave * accurate_ave_2 + b_acc_high * accurate_2 +
b_safe_ave * safe_ave_2 + b_safe_high * safe_2 +
b_easy * easy_2 +
b_emiss30 * decrease_30_2 + b_emiss100 * decrease_100_2 +
b_price * price_2
mnl_settings <- list(
alternatives = c("1" = 1, "2" = 2),
avail = list("1" = 1, "2" = 1),
choiceVar = choiceVar,
V = V
)
P[["model"]] <- apollo_mnl(mnl_settings, functionality)
P <- apollo_panelProd(P, apollo_inputs, functionality)
P <- apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
apollo_lcPars <- NA
apollo_inputs_mnl <- apollo_validateInputs(
apollo_beta = apollo_beta_mnl,
apollo_fixed = apollo_fixed_mnl,
database = database,
apollo_control = apollo_control_mnl
)
## Several observations per individual detected based on the value of ID.
## Setting panelData in apollo_control set to TRUE.
## All checks on apollo_control completed.
## All checks on database completed.
apollo_lcPars <- NA
model_mnl <- apollo_estimate(
apollo_beta = apollo_beta_mnl,
apollo_fixed = apollo_fixed_mnl,
apollo_probabilities = apollo_probabilities_mnl,
apollo_inputs = apollo_inputs_mnl
)
## WARNING: Element apollo_lcPars in the global environment differs from
## that inside apollo_inputs. The latter will be used. If you wish to
## use the former, stop this function by pressing the "Escape" key, and
## rerun apollo_validateInputs before calling this function.
## Preparing user-defined functions.
##
## Testing likelihood function...
##
## Overview of choices for MNL model component :
## 1 2
## Times available 3656.00 3656.00
## Times chosen 2125.00 1531.00
## Percentage chosen overall 58.12 41.88
## Percentage chosen when available 58.12 41.88
##
##
## Pre-processing likelihood function...
##
## Testing influence of parameters
## Starting main estimation
##
## BGW using analytic model derivatives supplied by caller...
##
##
## Iterates will be written to:
## output/MNL_Baseline_iterations.csv
## it nf F RELDF PRELDF RELDX MODEL stppar
## 0 1 2.275262462e+03
## 1 3 2.252737914e+03 9.900e-03 9.896e-03 4.09e-02 G 7.89e+00
## 2 5 2.148039032e+03 4.648e-02 4.804e-02 3.30e-01 G 4.52e-01
## 3 6 2.122819176e+03 1.174e-02 1.153e-02 1.91e-01 S 0.00e+00
## 4 7 2.122697139e+03 5.749e-05 5.894e-05 9.18e-03 S 0.00e+00
## 5 8 2.122696723e+03 1.960e-07 2.216e-07 6.14e-04 G 0.00e+00
## 6 9 2.122696714e+03 3.961e-09 4.008e-09 9.70e-05 S 0.00e+00
## 7 10 2.122696714e+03 7.039e-12 7.498e-12 2.47e-06 S 0.00e+00
##
## ***** Relative function convergence (favorable) *****
##
## Estimated parameters with approximate standard errors from BHHH matrix:
## Estimate BHHH se BHH t-ratio (0)
## b_asc -0.162471 0.06298 -2.5798
## b_time_ave 0.068463 0.07545 0.9074
## b_time_short 0.317469 0.05331 5.9549
## b_acc_ave 0.283988 0.07248 3.9182
## b_acc_high 0.613894 0.05840 10.5126
## b_safe_ave 0.609304 0.06735 9.0474
## b_safe_high 1.107313 0.05954 18.5977
## b_easy 0.184909 0.05002 3.6969
## b_emiss30 0.071065 0.06161 1.1534
## b_emiss100 0.464873 0.04499 10.3317
## b_price -0.009867 7.4271e-04 -13.2849
##
## Final LL: -2122.6967
##
## Calculating log-likelihood at equal shares (LL(0)) for applicable
## models...
## Calculating log-likelihood at observed shares from estimation data
## (LL(c)) for applicable models...
## Calculating LL of each model component...
## Calculating other model fit measures
## Computing covariance matrix using numerical jacobian of analytical
## gradient.
## 0%....25%....50%....75%..100%
## Negative definite Hessian with maximum eigenvalue: -82.545791
##
## Your model was estimated using the BGW algorithm. Please acknowledge
## this by citing Bunch et al. (1993) - doi.org/10.1145/151271.151279
##
## Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
## doi.org/10.1016/j.jocm.2019.100170
apollo_modelOutput(model_mnl, modelOutput_settings = list(printPVal = TRUE))
## Model run by MAGWALI using Apollo 0.3.7 on R 4.4.1 for Windows.
## Please acknowledge the use of Apollo by citing Hess & Palma (2019)
## DOI 10.1016/j.jocm.2019.100170
## www.ApolloChoiceModelling.com
##
## Model name : MNL_Baseline
## Model description : Baseline MNL: linear price, all attributes
## Model run at : 2026-05-19 17:27:37.617629
## Estimation method : bgw
## Estimation diagnosis : Relative function convergence (favorable)
## Optimisation diagnosis : Maximum found
## hessian properties : Negative definite
## maximum eigenvalue : -82.545791
## reciprocal of condition number : 4.41078e-05
## Number of individuals : 366
## Number of rows in database : 3656
## Number of modelled outcomes : 3656
##
## Number of cores used : 1
## Model without mixing
##
## LL(start) : -2275.26
## LL at equal shares, LL(0) : -2534.15
## LL at observed shares, LL(C) : -2485.68
## LL(final) : -2122.7
## Rho-squared vs equal shares : 0.1624
## Adj.Rho-squared vs equal shares : 0.158
## Rho-squared vs observed shares : 0.146
## Adj.Rho-squared vs observed shares : 0.142
## AIC : 4267.39
## BIC : 4335.64
##
## Estimated parameters : 11
## Time taken (hh:mm:ss) : 00:00:1.52
## pre-estimation : 00:00:0.56
## estimation : 00:00:0.31
## post-estimation : 00:00:0.65
## Iterations : 7
##
## Unconstrained optimisation.
##
## Estimates:
## Estimate s.e. t.rat.(0) p(1-sided) Rob.s.e.
## b_asc -0.162471 0.05689 -2.856 0.002147 0.05348
## b_time_ave 0.068463 0.06237 1.098 0.136184 0.05740
## b_time_short 0.317469 0.05811 5.463 2.344e-08 0.06841
## b_acc_ave 0.283988 0.06226 4.562 2.539e-06 0.05628
## b_acc_high 0.613894 0.06505 9.438 0.000000 0.07569
## b_safe_ave 0.609304 0.06828 8.924 0.000000 0.07456
## b_safe_high 1.107313 0.06492 17.056 0.000000 0.07667
## b_easy 0.184909 0.04332 4.268 9.853e-06 0.03944
## b_emiss30 0.071065 0.05512 1.289 0.098639 0.05476
## b_emiss100 0.464873 0.05225 8.896 0.000000 0.06248
## b_price -0.009867 8.2391e-04 -11.976 0.000000 9.7113e-04
## Rob.t.rat.(0) p(1-sided)
## b_asc -3.038 0.001191
## b_time_ave 1.193 0.116498
## b_time_short 4.641 1.736e-06
## b_acc_ave 5.046 2.261e-07
## b_acc_high 8.111 2.220e-16
## b_safe_ave 8.172 1.110e-16
## b_safe_high 14.443 0.000000
## b_easy 4.688 1.378e-06
## b_emiss30 1.298 0.097187
## b_emiss100 7.440 5.018e-14
## b_price -10.160 0.000000
rho2_mnl <- 1 - (model_mnl$LLout / model_mnl$LL0)
cat("\n--- MNL Baseline Fit ---\n")
##
## --- MNL Baseline Fit ---
cat("LL: ", round(model_mnl$LLout, 3), "\n")
## LL: -2122.697
cat("McFadden R2: ", round(rho2_mnl, 4), "\n")
## McFadden R2: 0.1624
cat("AIC: ", round(model_mnl$AIC, 3), "\n")
## AIC: 4267.393
cat("BIC: ", round(model_mnl$BIC, 3), "\n")
## BIC: 4335.639
Purpose: Test diminishing price sensitivity. Only change from baseline: b_price * price replaced with b_price_log * log(price + 1)
apollo_control_mnl_log <- list(
modelName = "MNL_LogPrice",
modelDescr = "MNL: log-transformed price",
indivID = "ID",
outputDirectory = "output/"
)
apollo_beta_mnl_log <- c(
b_asc = 0,
b_time_ave = 0.1,
b_time_short = 0.3,
b_acc_ave = 0.1,
b_acc_high = 0.3,
b_safe_ave = 0.1,
b_safe_high = 0.3,
b_easy = 0.2,
b_emiss30 = 0.1,
b_emiss100 = 0.3,
b_price_log = -0.3
)
apollo_fixed_mnl_log <- c()
apollo_probabilities_mnl_log <- function(apollo_beta, apollo_inputs,
functionality = "estimate") {
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P <- list()
V <- list()
V[["1"]] <-
b_asc * ascfinal_1 +
b_time_ave * time_ave_1 + b_time_short * time_short_1 +
b_acc_ave * accurate_ave_1 + b_acc_high * accurate_1 +
b_safe_ave * safe_ave_1 + b_safe_high * safe_1 +
b_easy * easy_1 +
b_emiss30 * decrease_30_1 + b_emiss100 * decrease_100_1 +
b_price_log * log_price_1
V[["2"]] <-
b_asc * ascfinal_2 +
b_time_ave * time_ave_2 + b_time_short * time_short_2 +
b_acc_ave * accurate_ave_2 + b_acc_high * accurate_2 +
b_safe_ave * safe_ave_2 + b_safe_high * safe_2 +
b_easy * easy_2 +
b_emiss30 * decrease_30_2 + b_emiss100 * decrease_100_2 +
b_price_log * log_price_2
mnl_settings <- list(
alternatives = c("1" = 1, "2" = 2),
avail = list("1" = 1, "2" = 1),
choiceVar = choiceVar,
V = V
)
P[["model"]] <- apollo_mnl(mnl_settings, functionality)
P <- apollo_panelProd(P, apollo_inputs, functionality)
P <- apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
apollo_lcPars <- NA
apollo_inputs_mnl_log <- apollo_validateInputs(
apollo_beta = apollo_beta_mnl_log,
apollo_fixed = apollo_fixed_mnl_log,
database = database,
apollo_control = apollo_control_mnl_log
)
## Several observations per individual detected based on the value of ID.
## Setting panelData in apollo_control set to TRUE.
## All checks on apollo_control completed.
## All checks on database completed.
apollo_lcPars <- NA
model_mnl_log <- apollo_estimate(
apollo_beta = apollo_beta_mnl_log,
apollo_fixed = apollo_fixed_mnl_log,
apollo_probabilities = apollo_probabilities_mnl_log,
apollo_inputs = apollo_inputs_mnl_log
)
## WARNING: Element apollo_lcPars in the global environment differs from
## that inside apollo_inputs. The latter will be used. If you wish to
## use the former, stop this function by pressing the "Escape" key, and
## rerun apollo_validateInputs before calling this function.
## Preparing user-defined functions.
##
## Testing likelihood function...
##
## Overview of choices for MNL model component :
## 1 2
## Times available 3656.00 3656.00
## Times chosen 2125.00 1531.00
## Percentage chosen overall 58.12 41.88
## Percentage chosen when available 58.12 41.88
##
##
## Pre-processing likelihood function...
##
## Testing influence of parameters
## Starting main estimation
##
## BGW using analytic model derivatives supplied by caller...
##
##
## Iterates will be written to:
## output/MNL_LogPrice_iterations.csv
## it nf F RELDF PRELDF RELDX MODEL stppar
## 0 1 2.395124849e+03
## 1 3 2.359863853e+03 1.472e-02 1.461e-02 2.74e-02 G 9.00e+00
## 2 5 2.184927979e+03 7.413e-02 7.055e-02 3.13e-01 G 2.97e-01
## 3 6 2.158124379e+03 1.227e-02 1.247e-02 1.43e-01 S 0.00e+00
## 4 7 2.158023232e+03 4.687e-05 4.261e-05 7.54e-03 S 0.00e+00
## 5 8 2.158019858e+03 1.564e-06 1.581e-06 1.42e-03 S 0.00e+00
## 6 9 2.158019842e+03 7.659e-09 8.367e-09 1.43e-04 S 0.00e+00
## 7 10 2.158019841e+03 9.084e-11 9.211e-11 1.29e-05 S 0.00e+00
##
## ***** Relative function convergence (favorable) *****
##
## Estimated parameters with approximate standard errors from BHHH matrix:
## Estimate BHHH se BHH t-ratio (0)
## b_asc -0.33225 0.06310 -5.266
## b_time_ave -0.08003 0.07442 -1.075
## b_time_short 0.16955 0.05311 3.193
## b_acc_ave 0.35961 0.07294 4.931
## b_acc_high 0.74717 0.05868 12.733
## b_safe_ave 0.45779 0.06618 6.918
## b_safe_high 1.02859 0.06078 16.922
## b_easy 0.19629 0.04958 3.959
## b_emiss30 0.07620 0.06100 1.249
## b_emiss100 0.43048 0.04354 9.887
## b_price_log -0.16830 0.01759 -9.568
##
## Final LL: -2158.0198
##
## Calculating log-likelihood at equal shares (LL(0)) for applicable
## models...
## Calculating log-likelihood at observed shares from estimation data
## (LL(c)) for applicable models...
## Calculating LL of each model component...
## Calculating other model fit measures
## Computing covariance matrix using numerical jacobian of analytical
## gradient.
## 0%....25%....50%....75%..100%
## Negative definite Hessian with maximum eigenvalue: -88.501112
##
## Your model was estimated using the BGW algorithm. Please acknowledge
## this by citing Bunch et al. (1993) - doi.org/10.1145/151271.151279
##
## Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
## doi.org/10.1016/j.jocm.2019.100170
apollo_modelOutput(model_mnl_log, modelOutput_settings = list(printPVal = TRUE))
## Model run by MAGWALI using Apollo 0.3.7 on R 4.4.1 for Windows.
## Please acknowledge the use of Apollo by citing Hess & Palma (2019)
## DOI 10.1016/j.jocm.2019.100170
## www.ApolloChoiceModelling.com
##
## Model name : MNL_LogPrice
## Model description : MNL: log-transformed price
## Model run at : 2026-05-19 17:27:39.244725
## Estimation method : bgw
## Estimation diagnosis : Relative function convergence (favorable)
## Optimisation diagnosis : Maximum found
## hessian properties : Negative definite
## maximum eigenvalue : -88.501112
## reciprocal of condition number : 0.0248896
## Number of individuals : 366
## Number of rows in database : 3656
## Number of modelled outcomes : 3656
##
## Number of cores used : 1
## Model without mixing
##
## LL(start) : -2395.12
## LL at equal shares, LL(0) : -2534.15
## LL at observed shares, LL(C) : -2485.68
## LL(final) : -2158.02
## Rho-squared vs equal shares : 0.1484
## Adj.Rho-squared vs equal shares : 0.1441
## Rho-squared vs observed shares : 0.1318
## Adj.Rho-squared vs observed shares : 0.1278
## AIC : 4338.04
## BIC : 4406.29
##
## Estimated parameters : 11
## Time taken (hh:mm:ss) : 00:00:0.96
## pre-estimation : 00:00:0.22
## estimation : 00:00:0.15
## post-estimation : 00:00:0.58
## Iterations : 7
##
## Unconstrained optimisation.
##
## Estimates:
## Estimate s.e. t.rat.(0) p(1-sided) Rob.s.e.
## b_asc -0.33225 0.05642 -5.889 1.946e-09 0.05221
## b_time_ave -0.08003 0.05892 -1.358 0.087171 0.05370
## b_time_short 0.16955 0.05530 3.066 0.001085 0.06285
## b_acc_ave 0.35961 0.06256 5.749 4.495e-09 0.05782
## b_acc_high 0.74717 0.06483 11.525 0.000000 0.07635
## b_safe_ave 0.45779 0.06697 6.835 4.088e-12 0.07439
## b_safe_high 1.02859 0.06387 16.105 0.000000 0.07342
## b_easy 0.19629 0.04276 4.590 2.214e-06 0.03894
## b_emiss30 0.07620 0.05549 1.373 0.084826 0.05554
## b_emiss100 0.43048 0.05152 8.355 0.000000 0.06207
## b_price_log -0.16830 0.01845 -9.120 0.000000 0.02171
## Rob.t.rat.(0) p(1-sided)
## b_asc -6.363 9.874e-11
## b_time_ave -1.490 0.068071
## b_time_short 2.698 0.003491
## b_acc_ave 6.220 2.491e-10
## b_acc_high 9.786 0.000000
## b_safe_ave 6.154 3.775e-10
## b_safe_high 14.010 0.000000
## b_easy 5.040 2.325e-07
## b_emiss30 1.372 0.085030
## b_emiss100 6.935 2.031e-12
## b_price_log -7.753 4.441e-15
rho2_mnl_log <- 1 - (model_mnl_log$LLout / model_mnl_log$LL0)
cat("\n--- MNL Log-Price Fit ---\n")
##
## --- MNL Log-Price Fit ---
cat("LL: ", round(model_mnl_log$LLout, 3), "\n")
## LL: -2158.02
cat("McFadden R2: ", round(rho2_mnl_log, 4), "\n")
## McFadden R2: 0.1484
cat("AIC: ", round(model_mnl_log$AIC, 3), "\n")
## AIC: 4338.04
cat("BIC: ", round(model_mnl_log$BIC, 3), "\n")
## BIC: 4406.285
cat("\nFunctional form comparison (lower BIC = better):\n")
##
## Functional form comparison (lower BIC = better):
cat("MNL linear BIC: ", round(model_mnl$BIC, 3), "\n")
## MNL linear BIC: 4335.639
cat("MNL log BIC: ", round(model_mnl_log$BIC, 3), "\n")
## MNL log BIC: 4406.285
Purpose: Capture preference heterogeneity across two latent segments. Membership predicted by environmental concern and income.
apollo_control_lcm2 <- list(
modelName = "LCM_2Class",
modelDescr = "2-class LCM with environmental and income predictors",
indivID = "ID",
outputDirectory = "output/"
)
apollo_beta_lcm2 <- c(
# Class 1: environmentally oriented
asc_c1 = 0,
b_time_ave_c1 = 0.1,
b_time_sh_c1 = 0.3,
b_acc_ave_c1 = 0.1,
b_acc_hi_c1 = 0.3,
b_safe_ave_c1 = 0.1,
b_safe_hi_c1 = 0.3,
b_easy_c1 = 0.2,
b_em30_c1 = 0.2,
b_em100_c1 = 0.5,
b_price_c1 = -0.01,
# Class 2: price sensitive (reference class)
asc_c2 = 0,
b_time_ave_c2 = 0.1,
b_time_sh_c2 = 0.3,
b_acc_ave_c2 = 0.1,
b_acc_hi_c2 = 0.3,
b_safe_ave_c2 = 0.1,
b_safe_hi_c2 = 0.3,
b_easy_c2 = 0.2,
b_em30_c2 = 0.05,
b_em100_c2 = 0.1,
b_price_c2 = -0.03,
# Class membership (class 2 = reference)
delta_c1 = 0,
gamma_envi = 0.3,
gamma_income = 0.2
)
apollo_fixed_lcm2 <- c()
apollo_lcPars_lcm2 <- function(apollo_beta, apollo_inputs) {
# Access directly — no apollo_attach, no apollo_mnl
delta_c1 <- apollo_beta["delta_c1"]
gamma_envi <- apollo_beta["gamma_envi"]
gamma_income <- apollo_beta["gamma_income"]
envi_con_s <- apollo_inputs$database$envi_con_s
income_s <- apollo_inputs$database$income_s
# Manual softmax
V1 <- delta_c1 + gamma_envi * envi_con_s + gamma_income * income_s
denom <- exp(V1) + 1
lcpars <- list()
lcpars[["pi_values"]] <- list(
class1 = exp(V1) / denom,
class2 = 1 / denom
)
return(lcpars)
}
apollo_probabilities_lcm2 <- function(apollo_beta, apollo_inputs,
functionality = "estimate") {
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P <- list()
# Class 1 utilities
V1 <- list()
V1[["1"]] <-
asc_c1 * ascfinal_1 +
b_time_ave_c1 * time_ave_1 + b_time_sh_c1 * time_short_1 +
b_acc_ave_c1 * accurate_ave_1 + b_acc_hi_c1 * accurate_1 +
b_safe_ave_c1 * safe_ave_1 + b_safe_hi_c1 * safe_1 +
b_easy_c1 * easy_1 +
b_em30_c1 * decrease_30_1 + b_em100_c1 * decrease_100_1 +
b_price_c1 * price_1
V1[["2"]] <-
asc_c1 * ascfinal_2 +
b_time_ave_c1 * time_ave_2 + b_time_sh_c1 * time_short_2 +
b_acc_ave_c1 * accurate_ave_2 + b_acc_hi_c1 * accurate_2 +
b_safe_ave_c1 * safe_ave_2 + b_safe_hi_c1 * safe_2 +
b_easy_c1 * easy_2 +
b_em30_c1 * decrease_30_2 + b_em100_c1 * decrease_100_2 +
b_price_c1 * price_2
# Class 2 utilities
V2 <- list()
V2[["1"]] <-
asc_c2 * ascfinal_1 +
b_time_ave_c2 * time_ave_1 + b_time_sh_c2 * time_short_1 +
b_acc_ave_c2 * accurate_ave_1 + b_acc_hi_c2 * accurate_1 +
b_safe_ave_c2 * safe_ave_1 + b_safe_hi_c2 * safe_1 +
b_easy_c2 * easy_1 +
b_em30_c2 * decrease_30_1 + b_em100_c2 * decrease_100_1 +
b_price_c2 * price_1
V2[["2"]] <-
asc_c2 * ascfinal_2 +
b_time_ave_c2 * time_ave_2 + b_time_sh_c2 * time_short_2 +
b_acc_ave_c2 * accurate_ave_2 + b_acc_hi_c2 * accurate_2 +
b_safe_ave_c2 * safe_ave_2 + b_safe_hi_c2 * safe_2 +
b_easy_c2 * easy_2 +
b_em30_c2 * decrease_30_2 + b_em100_c2 * decrease_100_2 +
b_price_c2 * price_2
mnl_set <- list(
alternatives = c("1" = 1, "2" = 2),
avail = list("1" = 1, "2" = 1),
choiceVar = choiceVar
)
P[["class_1"]] <- apollo_panelProd(
apollo_mnl(c(mnl_set, list(componentName = "MNL_c1", V = V1)), functionality),
apollo_inputs, functionality)
P[["class_2"]] <- apollo_panelProd(
apollo_mnl(c(mnl_set, list(componentName = "MNL_c2", V = V2)), functionality),
apollo_inputs, functionality)
P[["model"]] <- apollo_lc(
list(
inClassProb = list(P[["class_1"]], P[["class_2"]]),
classProb = pi_values
),
apollo_inputs, functionality
)
P <- apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
apollo_inputs_lcm2 <- apollo_validateInputs(
apollo_beta = apollo_beta_lcm2,
apollo_fixed = apollo_fixed_lcm2,
database = database,
apollo_control = apollo_control_lcm2,
apollo_lcPars = apollo_lcPars_lcm2
)
## Several observations per individual detected based on the value of ID.
## Setting panelData in apollo_control set to TRUE.
## All checks on apollo_control completed.
## All checks on database completed.
model_lcm2 <- apollo_estimate(
apollo_beta = apollo_beta_lcm2,
apollo_fixed = apollo_fixed_lcm2,
apollo_probabilities = apollo_probabilities_lcm2,
apollo_inputs = apollo_inputs_lcm2
)
## WARNING: Element apollo_lcPars in the global environment differs from
## that inside apollo_inputs. The latter will be used. If you wish to
## use the former, stop this function by pressing the "Escape" key, and
## rerun apollo_validateInputs before calling this function.
## Preparing user-defined functions.
## WARNING: The pre-processing of 'apollo_probabilities' failed during
## loop expansion. Your model may still run, but this indicates a
## potential problem. Please contact the developers for assistance!
##
## Testing likelihood function...
##
## Overview of choices for MNL model component MNL_c1:
## 1 2
## Times available 3656.00 3656.00
## Times chosen 2125.00 1531.00
## Percentage chosen overall 58.12 41.88
## Percentage chosen when available 58.12 41.88
##
##
## Overview of choices for MNL model component MNL_c2:
## 1 2
## Times available 3656.00 3656.00
## Times chosen 2125.00 1531.00
## Percentage chosen overall 58.12 41.88
## Percentage chosen when available 58.12 41.88
##
##
## Summary of class allocation for model component :
## Mean prob.
## Class_1 0.5002
## Class_2 0.4998
## The class allocation probabilities for model component "model" are
## calculated at the observation level in 'apollo_lcPars', but are used
## in 'apollo_probabilities' to multiply within class probabilities that
## are at the individual level. Apollo will average the class allocation
## probabilities across observations for the same individual level
## before using them to multiply the within-class probabilities. If your
## class allocation probabilities are constant across choice situations
## for the same individual, then this is of no concern. If your class
## allocation probabilities however vary across choice tasks, then you
## should change your model specification in 'apollo_probabilities' to
## only call 'apollo_panelProd' after calling 'apollo_lc'.
##
## Pre-processing likelihood function...
## INFORMATION: Apollo was not able to compute analytical gradients for
## your model. This could be because you are using model components for
## which analytical gradients are not yet implemented, or because you
## coded your own model functions. If however you only used apollo_mnl,
## apollo_fmnl, apollo_normalDensity, apollo_ol or apollo_op, then there
## could be another issue. You might want to ask for help in the Apollo
## forum (https://www.ApolloChoiceModelling.com/forum) on how to solve
## this issue. If you do, please post your code and data (if not
## confidential).
## Analytical gradients could not be calculated for all components,
## numerical gradients will be used.
##
## Testing influence of parameters.........................
## Starting main estimation
##
## BGW is using FD derivatives for model Jacobian. (Caller did not provide derivatives.)
##
##
## Iterates will be written to:
## output/LCM_2Class_iterations.csv
## it nf F RELDF PRELDF RELDX MODEL stppar
## 0 1 2.323544922e+03
## 1 4 2.193202586e+03 5.610e-02 4.227e-02 1.15e-01 G 4.57e-01
## 2 5 2.095329992e+03 4.463e-02 3.548e-02 2.78e-01 G 0.00e+00
## 3 6 2.063504984e+03 1.519e-02 2.857e-02 2.22e-01 S 5.65e-03
## 4 8 2.027886820e+03 1.726e-02 1.485e-02 1.27e-01 G-S 0.00e+00
## 5 9 2.014171652e+03 6.763e-03 6.736e-03 7.37e-02 S 0.00e+00
## 6 10 2.008878572e+03 2.628e-03 2.686e-03 5.56e-02 S 0.00e+00
## 7 11 2.007698848e+03 5.873e-04 5.145e-04 1.95e-02 S 0.00e+00
## 8 12 2.007521453e+03 8.836e-05 1.397e-04 1.72e-02 G 0.00e+00
## 9 14 2.007334110e+03 9.332e-05 7.067e-05 1.37e-02 G-S 0.00e+00
## 10 15 2.007242976e+03 4.540e-05 2.864e-05 1.21e-02 G 0.00e+00
## 11 16 2.007147927e+03 4.735e-05 2.608e-05 1.02e-02 G 0.00e+00
## 12 17 2.007020774e+03 6.335e-05 3.214e-05 1.40e-02 G 0.00e+00
## 13 18 2.006829204e+03 9.545e-05 4.717e-05 1.45e-02 G 0.00e+00
## 14 19 2.006526456e+03 1.509e-04 7.334e-05 2.10e-02 G 0.00e+00
## 15 20 2.006112806e+03 2.062e-04 1.107e-04 3.56e-02 G 0.00e+00
## 16 21 2.005678697e+03 2.164e-04 1.395e-04 1.96e-02 G 0.00e+00
## 17 22 2.005377629e+03 1.501e-04 1.590e-04 1.83e-02 G 0.00e+00
## 18 23 2.005311376e+03 3.304e-05 2.249e-04 1.75e-02 G 0.00e+00
## 19 25 2.004901081e+03 2.046e-04 1.568e-04 1.36e-02 G-S 0.00e+00
## 20 26 2.004765450e+03 6.765e-05 5.075e-05 5.52e-03 S 0.00e+00
## 21 27 2.004705586e+03 2.986e-05 2.340e-05 6.21e-03 S 0.00e+00
## 22 28 2.004684849e+03 1.034e-05 8.440e-06 2.55e-03 S 0.00e+00
## 23 29 2.004680011e+03 2.413e-06 1.928e-06 1.14e-03 S 0.00e+00
## 24 30 2.004678823e+03 5.927e-07 4.685e-07 6.75e-04 S 0.00e+00
## 25 31 2.004678427e+03 1.975e-07 1.614e-07 4.46e-04 S 0.00e+00
## 26 32 2.004678299e+03 6.397e-08 5.082e-08 1.76e-04 S 0.00e+00
## 27 33 2.004678266e+03 1.640e-08 1.447e-08 9.02e-05 S 0.00e+00
## 28 35 2.004678263e+03 1.574e-09 2.651e-09 6.35e-05 G-S 0.00e+00
## 29 36 2.004678261e+03 8.780e-10 7.372e-10 1.35e-05 S 0.00e+00
## 30 37 2.004678261e+03 1.213e-10 1.016e-10 1.02e-05 S 0.00e+00
## 31 38 2.004678261e+03 2.439e-11 2.221e-11 4.35e-06 S 0.00e+00
##
## ***** Relative function convergence (favorable) *****
##
## Estimated parameters with approximate standard errors from BHHH matrix:
## Estimate BHHH se BHH t-ratio (0)
## asc_c1 -0.298034 0.078728 -3.7856
## b_time_ave_c1 0.097132 0.092443 1.0507
## b_time_sh_c1 0.294847 0.072236 4.0817
## b_acc_ave_c1 0.385454 0.093058 4.1421
## b_acc_hi_c1 0.634826 0.078091 8.1293
## b_safe_ave_c1 0.927671 0.091113 10.1816
## b_safe_hi_c1 1.402875 0.072588 19.3266
## b_easy_c1 0.038453 0.065682 0.5854
## b_em30_c1 -0.085296 0.077666 -1.0982
## b_em100_c1 0.391250 0.062595 6.2505
## b_price_c1 -0.002923 0.001253 -2.3327
## asc_c2 0.107685 0.196347 0.5484
## b_time_ave_c2 -0.125731 0.233450 -0.5386
## b_time_sh_c2 0.234295 0.197557 1.1860
## b_acc_ave_c2 0.071664 0.221332 0.3238
## b_acc_hi_c2 0.771684 0.238387 3.2371
## b_safe_ave_c2 0.104995 0.202783 0.5178
## b_safe_hi_c2 0.538205 0.195796 2.7488
## b_easy_c2 0.709985 0.150630 4.7134
## b_em30_c2 0.853335 0.199757 4.2719
## b_em100_c2 1.055390 0.179279 5.8869
## b_price_c2 -0.037359 0.004091 -9.1320
## delta_c1 0.876711 0.165693 5.2912
## gamma_envi -0.016156 0.144540 -0.1118
## gamma_income 0.466706 0.173735 2.6863
##
## Final LL: -2004.6783
##
##
## Summary of class allocation for model component :
## Mean prob.
## Class_1 0.6966
## Class_2 0.3034
##
## Calculating log-likelihood at equal shares (LL(0)) for applicable
## models...
## Calculating log-likelihood at observed shares from estimation data
## (LL(c)) for applicable models...
## Calculating LL of each model component...
## Calculating other model fit measures
## Computing covariance matrix using numerical methods (numDeriv).
## 0%....25%....50%....75%....100%
## Negative definite Hessian with maximum eigenvalue: -8.574092
##
## Your model was estimated using the BGW algorithm. Please acknowledge
## this by citing Bunch et al. (1993) - doi.org/10.1145/151271.151279
##
## Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
## doi.org/10.1016/j.jocm.2019.100170
apollo_modelOutput(model_lcm2, modelOutput_settings = list(printPVal = TRUE))
## Model run by MAGWALI using Apollo 0.3.7 on R 4.4.1 for Windows.
## Please acknowledge the use of Apollo by citing Hess & Palma (2019)
## DOI 10.1016/j.jocm.2019.100170
## www.ApolloChoiceModelling.com
##
## Model name : LCM_2Class
## Model description : 2-class LCM with environmental and income predictors
## Model run at : 2026-05-19 17:27:40.302126
## Estimation method : bgw
## Estimation diagnosis : Relative function convergence (favorable)
## Optimisation diagnosis : Maximum found
## hessian properties : Negative definite
## maximum eigenvalue : -8.574092
## reciprocal of condition number : 7.79621e-06
## Number of individuals : 366
## Number of rows in database : 3656
## Number of modelled outcomes : 10968
## MNL_c1 : 3656
## MNL_c2 : 3656
## model : 3656
##
## Number of cores used : 1
## Model without mixing
##
## LL(start) : -2323.54
## LL (whole model) at equal shares, LL(0) : -2534.15
## LL (whole model) at observed shares, LL(C) : -2485.68
## LL(final, whole model) : -2004.68
## Rho-squared vs equal shares : 0.2089
## Adj.Rho-squared vs equal shares : 0.1991
## Rho-squared vs observed shares : 0.1935
## Adj.Rho-squared vs observed shares : 0.1843
## AIC : 4059.36
## BIC : 4214.46
##
## LL(0,class_1) : -2534.15
## LL(final,class_1) : -2240.3
## LL(0,class_2) : -2534.15
## LL(final,class_2) : -3149.59
##
## Estimated parameters : 25
## Time taken (hh:mm:ss) : 00:00:49.41
## pre-estimation : 00:00:0.68
## estimation : 00:00:12.8
## post-estimation : 00:00:35.93
## Iterations : 31
##
## Unconstrained optimisation.
##
## Estimates:
## Estimate s.e. t.rat.(0) p(1-sided) Rob.s.e.
## asc_c1 -0.298034 0.076413 -3.9003 4.804e-05 0.078257
## b_time_ave_c1 0.097132 0.080052 1.2134 0.112496 0.079747
## b_time_sh_c1 0.294847 0.077508 3.8041 7.117e-05 0.090718
## b_acc_ave_c1 0.385454 0.084430 4.5654 2.493e-06 0.082753
## b_acc_hi_c1 0.634826 0.086529 7.3366 1.096e-13 0.101457
## b_safe_ave_c1 0.927671 0.090586 10.2408 0.000000 0.101454
## b_safe_hi_c1 1.402875 0.087052 16.1154 0.000000 0.122546
## b_easy_c1 0.038453 0.057490 0.6689 0.251791 0.054998
## b_em30_c1 -0.085296 0.075039 -1.1367 0.127834 0.088388
## b_em100_c1 0.391250 0.073014 5.3586 4.194e-08 0.098676
## b_price_c1 -0.002923 0.001134 -2.5771 0.004982 0.001172
## asc_c2 0.107685 0.184734 0.5829 0.279974 0.226332
## b_time_ave_c2 -0.125731 0.190687 -0.6594 0.254833 0.211992
## b_time_sh_c2 0.234295 0.177484 1.3201 0.093402 0.185024
## b_acc_ave_c2 0.071664 0.170341 0.4207 0.336983 0.173151
## b_acc_hi_c2 0.771684 0.208214 3.7062 1.0519e-04 0.207189
## b_safe_ave_c2 0.104995 0.179820 0.5839 0.279647 0.217419
## b_safe_hi_c2 0.538205 0.205477 2.6193 0.004406 0.261039
## b_easy_c2 0.709985 0.140066 5.0689 2.000e-07 0.180006
## b_em30_c2 0.853335 0.200904 4.2475 1.081e-05 0.256718
## b_em100_c2 1.055390 0.192472 5.4833 2.087e-08 0.248533
## b_price_c2 -0.037359 0.004301 -8.6863 0.000000 0.006034
## delta_c1 0.876711 0.178341 4.9159 4.418e-07 0.234458
## gamma_envi -0.016156 0.136290 -0.1185 0.452820 0.171232
## gamma_income 0.466706 0.166195 2.8082 0.002491 0.173801
## Rob.t.rat.(0) p(1-sided)
## asc_c1 -3.80837 6.994e-05
## b_time_ave_c1 1.21801 0.111610
## b_time_sh_c1 3.25016 5.7670e-04
## b_acc_ave_c1 4.65787 1.597e-06
## b_acc_hi_c1 6.25707 1.961e-10
## b_safe_ave_c1 9.14376 0.000000
## b_safe_hi_c1 11.44772 0.000000
## b_easy_c1 0.69916 0.242225
## b_em30_c1 -0.96501 0.167269
## b_em100_c1 3.96501 3.670e-05
## b_price_c1 -2.49406 0.006314
## asc_c2 0.47578 0.317114
## b_time_ave_c2 -0.59309 0.276560
## b_time_sh_c2 1.26630 0.102703
## b_acc_ave_c2 0.41388 0.339480
## b_acc_hi_c2 3.72454 9.784e-05
## b_safe_ave_c2 0.48292 0.314577
## b_safe_hi_c2 2.06178 0.019614
## b_easy_c2 3.94424 4.003e-05
## b_em30_c2 3.32401 4.4366e-04
## b_em100_c2 4.24648 1.086e-05
## b_price_c2 -6.19181 2.974e-10
## delta_c1 3.73931 9.226e-05
## gamma_envi -0.09435 0.462416
## gamma_income 2.68528 0.003623
##
##
## Summary of class allocation for model component :
## Mean prob.
## Class_1 0.6966
## Class_2 0.3034
rho2_lcm2 <- 1 - (model_lcm2$LLout / model_lcm2$LL0)
cat("\n--- 2-Class LCM Fit ---\n")
##
## --- 2-Class LCM Fit ---
cat("LL: ", round(model_lcm2$LLout, 3), "\n")
## LL: -2004.678 -2240.296 -3149.591
cat("McFadden R2: ", round(rho2_lcm2, 4), "\n")
## McFadden R2: 0.2089 0.116 -0.2429
cat("AIC: ", round(model_lcm2$AIC, 3), "\n")
## AIC: 4059.357
cat("BIC: ", round(model_lcm2$BIC, 3), "\n")
## BIC: 4214.46
Purpose: Behavioral extension. Class 1 (Attenders): emission coefficients active Class 2 (Non-attenders): emission coefficients omitted Membership predicted by envi_con_s Tests H3: environmentally concerned respondents attend to emission reduction attributes.
apollo_control_lcana <- list(
modelName = "LC_ANA",
modelDescr = "LC-ANA: emission attribute non-attendance",
indivID = "ID",
outputDirectory = "output/"
)
apollo_beta_lcana <- c(
# Class 1: attends to emissions
asc_c1 = 0,
b_time_ave_c1 = 0.1,
b_time_sh_c1 = 0.3,
b_acc_ave_c1 = 0.1,
b_acc_hi_c1 = 0.3,
b_safe_ave_c1 = 0.1,
b_safe_hi_c1 = 0.3,
b_easy_c1 = 0.2,
b_em30_c1 = 0.2,
b_em100_c1 = 0.5,
b_price_c1 = -0.01,
# Class 2: ignores emissions (non-attender)
asc_c2 = 0,
b_time_ave_c2 = 0.1,
b_time_sh_c2 = 0.3,
b_acc_ave_c2 = 0.1,
b_acc_hi_c2 = 0.3,
b_safe_ave_c2 = 0.1,
b_safe_hi_c2 = 0.3,
b_easy_c2 = 0.2,
b_price_c2 = -0.02,
# No emission parameters — this IS the non-attendance
# Class membership
delta_c1 = 0,
gamma_envi = 0.5
)
apollo_fixed_lcana <- c()
apollo_lcPars_lcana <- function(apollo_beta, apollo_inputs) {
# Access directly — no apollo_attach, no apollo_mnl
delta_c1 <- apollo_beta["delta_c1"]
gamma_envi <- apollo_beta["gamma_envi"]
envi_con_s <- apollo_inputs$database$envi_con_s
# Manual softmax
V1 <- delta_c1 + gamma_envi * envi_con_s
denom <- exp(V1) + 1
lcpars <- list()
lcpars[["pi_values"]] <- list(
class1 = exp(V1) / denom,
class2 = 1 / denom
)
return(lcpars)
}
apollo_probabilities_lcana <- function(apollo_beta, apollo_inputs,
functionality = "estimate") {
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P <- list()
# Class 1: full utility including emissions
V1 <- list()
V1[["1"]] <-
asc_c1 * ascfinal_1 +
b_time_ave_c1 * time_ave_1 + b_time_sh_c1 * time_short_1 +
b_acc_ave_c1 * accurate_ave_1 + b_acc_hi_c1 * accurate_1 +
b_safe_ave_c1 * safe_ave_1 + b_safe_hi_c1 * safe_1 +
b_easy_c1 * easy_1 +
b_em30_c1 * decrease_30_1 + b_em100_c1 * decrease_100_1 +
b_price_c1 * price_1
V1[["2"]] <-
asc_c1 * ascfinal_2 +
b_time_ave_c1 * time_ave_2 + b_time_sh_c1 * time_short_2 +
b_acc_ave_c1 * accurate_ave_2 + b_acc_hi_c1 * accurate_2 +
b_safe_ave_c1 * safe_ave_2 + b_safe_hi_c1 * safe_2 +
b_easy_c1 * easy_2 +
b_em30_c1 * decrease_30_2 + b_em100_c1 * decrease_100_2 +
b_price_c1 * price_2
# Class 2: emission attributes omitted (non-attendance)
V2 <- list()
V2[["1"]] <-
asc_c2 * ascfinal_1 +
b_time_ave_c2 * time_ave_1 + b_time_sh_c2 * time_short_1 +
b_acc_ave_c2 * accurate_ave_1 + b_acc_hi_c2 * accurate_1 +
b_safe_ave_c2 * safe_ave_1 + b_safe_hi_c2 * safe_1 +
b_easy_c2 * easy_1 +
b_price_c2 * price_1
V2[["2"]] <-
asc_c2 * ascfinal_2 +
b_time_ave_c2 * time_ave_2 + b_time_sh_c2 * time_short_2 +
b_acc_ave_c2 * accurate_ave_2 + b_acc_hi_c2 * accurate_2 +
b_safe_ave_c2 * safe_ave_2 + b_safe_hi_c2 * safe_2 +
b_easy_c2 * easy_2 +
b_price_c2 * price_2
mnl_set <- list(
alternatives = c("1" = 1, "2" = 2),
avail = list("1" = 1, "2" = 1),
choiceVar = choiceVar
)
P[["class_1"]] <- apollo_panelProd(
apollo_mnl(c(mnl_set, list(componentName = "MNL_c1", V = V1)), functionality),
apollo_inputs, functionality)
P[["class_2"]] <- apollo_panelProd(
apollo_mnl(c(mnl_set, list(componentName = "MNL_c2", V = V2)), functionality),
apollo_inputs, functionality)
P[["model"]] <- apollo_lc(
list(
inClassProb = list(P[["class_1"]], P[["class_2"]]),
classProb = pi_values
),
apollo_inputs, functionality
)
P <- apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
apollo_inputs_lcana <- apollo_validateInputs(
apollo_beta = apollo_beta_lcana,
apollo_fixed = apollo_fixed_lcana,
database = database,
apollo_control = apollo_control_lcana,
apollo_lcPars = apollo_lcPars_lcana
)
## Several observations per individual detected based on the value of ID.
## Setting panelData in apollo_control set to TRUE.
## All checks on apollo_control completed.
## All checks on database completed.
model_lcana <- apollo_estimate(
apollo_beta = apollo_beta_lcana,
apollo_fixed = apollo_fixed_lcana,
apollo_probabilities = apollo_probabilities_lcana,
apollo_inputs = apollo_inputs_lcana
)
## WARNING: Element apollo_lcPars in the global environment differs from
## that inside apollo_inputs. The latter will be used. If you wish to
## use the former, stop this function by pressing the "Escape" key, and
## rerun apollo_validateInputs before calling this function.
## Preparing user-defined functions.
## WARNING: The pre-processing of 'apollo_probabilities' failed during
## loop expansion. Your model may still run, but this indicates a
## potential problem. Please contact the developers for assistance!
##
## Testing likelihood function...
##
## Overview of choices for MNL model component MNL_c1:
## 1 2
## Times available 3656.00 3656.00
## Times chosen 2125.00 1531.00
## Percentage chosen overall 58.12 41.88
## Percentage chosen when available 58.12 41.88
##
##
## Overview of choices for MNL model component MNL_c2:
## 1 2
## Times available 3656.00 3656.00
## Times chosen 2125.00 1531.00
## Percentage chosen overall 58.12 41.88
## Percentage chosen when available 58.12 41.88
##
##
## Summary of class allocation for model component :
## Mean prob.
## Class_1 0.5016
## Class_2 0.4984
## The class allocation probabilities for model component "model" are
## calculated at the observation level in 'apollo_lcPars', but are used
## in 'apollo_probabilities' to multiply within class probabilities that
## are at the individual level. Apollo will average the class allocation
## probabilities across observations for the same individual level
## before using them to multiply the within-class probabilities. If your
## class allocation probabilities are constant across choice situations
## for the same individual, then this is of no concern. If your class
## allocation probabilities however vary across choice tasks, then you
## should change your model specification in 'apollo_probabilities' to
## only call 'apollo_panelProd' after calling 'apollo_lc'.
##
## Pre-processing likelihood function...
## INFORMATION: Apollo was not able to compute analytical gradients for
## your model. This could be because you are using model components for
## which analytical gradients are not yet implemented, or because you
## coded your own model functions. If however you only used apollo_mnl,
## apollo_fmnl, apollo_normalDensity, apollo_ol or apollo_op, then there
## could be another issue. You might want to ask for help in the Apollo
## forum (https://www.ApolloChoiceModelling.com/forum) on how to solve
## this issue. If you do, please post your code and data (if not
## confidential).
## Analytical gradients could not be calculated for all components,
## numerical gradients will be used.
##
## Testing influence of parameters......................
## Starting main estimation
##
## BGW is using FD derivatives for model Jacobian. (Caller did not provide derivatives.)
##
##
## Iterates will be written to:
## output/LC_ANA_iterations.csv
## it nf F RELDF PRELDF RELDX MODEL stppar
## 0 1 2.303998017e+03
## 1 4 2.161634940e+03 6.179e-02 4.264e-02 2.18e-01 G 0.00e+00
## 2 5 2.075322208e+03 3.993e-02 3.012e-02 2.34e-01 G 0.00e+00
## 3 6 2.045028424e+03 1.460e-02 1.462e-02 1.13e-01 G 0.00e+00
## 4 7 2.026567695e+03 9.027e-03 7.728e-03 8.29e-02 S 0.00e+00
## 5 8 2.021228821e+03 2.634e-03 2.388e-03 3.93e-02 S 0.00e+00
## 6 9 2.019125370e+03 1.041e-03 8.753e-04 2.31e-02 S 0.00e+00
## 7 10 2.018453395e+03 3.328e-04 2.961e-04 1.16e-02 S 0.00e+00
## 8 11 2.018205443e+03 1.228e-04 1.028e-04 9.43e-03 S 0.00e+00
## 9 12 2.018186023e+03 9.623e-06 3.552e-05 8.85e-03 G 0.00e+00
## 10 13 2.018143457e+03 2.109e-05 1.962e-05 3.91e-03 S 0.00e+00
## 11 14 2.018138783e+03 2.316e-06 2.090e-06 1.39e-03 S 0.00e+00
## 12 15 2.018137856e+03 4.592e-07 8.067e-07 7.19e-04 G 0.00e+00
## 13 16 2.018136860e+03 4.936e-07 3.835e-07 4.09e-04 S 0.00e+00
## 14 17 2.018136723e+03 6.815e-08 6.314e-08 1.49e-04 G 0.00e+00
## 15 18 2.018136675e+03 2.369e-08 3.941e-08 1.93e-04 G 0.00e+00
## 16 20 2.018136624e+03 2.550e-08 1.875e-08 1.00e-04 G-S 0.00e+00
## 17 21 2.018136613e+03 5.061e-09 6.830e-09 6.33e-05 G 0.00e+00
## 18 22 2.018136611e+03 1.077e-09 6.014e-09 7.43e-05 G 0.00e+00
## 19 23 2.018136602e+03 4.368e-09 3.723e-09 4.97e-05 S 0.00e+00
## 20 24 2.018136601e+03 6.523e-10 5.682e-10 2.19e-05 S 0.00e+00
## 21 25 2.018136601e+03 1.941e-10 1.624e-10 7.15e-06 S 0.00e+00
## 22 26 2.018136601e+03 4.535e-11 4.225e-11 7.57e-06 S 0.00e+00
##
## ***** Relative function convergence (favorable) *****
##
## Estimated parameters with approximate standard errors from BHHH matrix:
## Estimate BHHH se BHH t-ratio (0)
## asc_c1 -0.235319 0.070222 -3.3511
## b_time_ave_c1 0.096862 0.085080 1.1385
## b_time_sh_c1 0.292059 0.064409 4.5344
## b_acc_ave_c1 0.306528 0.082069 3.7350
## b_acc_hi_c1 0.580501 0.068544 8.4691
## b_safe_ave_c1 0.843489 0.083110 10.1491
## b_safe_hi_c1 1.329788 0.067255 19.7723
## b_easy_c1 0.121670 0.058951 2.0639
## b_em30_c1 0.042306 0.068562 0.6170
## b_em100_c1 0.519816 0.054825 9.4814
## b_price_c1 -0.004248 0.001096 -3.8747
## asc_c2 0.037150 0.267331 0.1390
## b_time_ave_c2 0.436715 0.381066 1.1460
## b_time_sh_c2 1.292701 0.384172 3.3649
## b_acc_ave_c2 0.575447 0.343712 1.6742
## b_acc_hi_c2 1.669257 0.332724 5.0169
## b_safe_ave_c2 -0.248548 0.277115 -0.8969
## b_safe_hi_c2 0.117955 0.298566 0.3951
## b_easy_c2 0.484515 0.218082 2.2217
## b_price_c2 -0.059294 0.008724 -6.7965
## delta_c1 1.295743 0.159279 8.1351
## gamma_envi 0.148521 0.153752 0.9660
##
## Final LL: -2018.1366
##
##
## Summary of class allocation for model component :
## Mean prob.
## Class_1 0.7841
## Class_2 0.2159
##
## Calculating log-likelihood at equal shares (LL(0)) for applicable
## models...
## Calculating log-likelihood at observed shares from estimation data
## (LL(c)) for applicable models...
## Calculating LL of each model component...
## Calculating other model fit measures
## Computing covariance matrix using numerical methods (numDeriv).
## 0%....25%....50%....75%....100%
## Negative definite Hessian with maximum eigenvalue: -3.436019
##
## Your model was estimated using the BGW algorithm. Please acknowledge
## this by citing Bunch et al. (1993) - doi.org/10.1145/151271.151279
##
## Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
## doi.org/10.1016/j.jocm.2019.100170
apollo_modelOutput(model_lcana, modelOutput_settings = list(printPVal = TRUE))
## Model run by MAGWALI using Apollo 0.3.7 on R 4.4.1 for Windows.
## Please acknowledge the use of Apollo by citing Hess & Palma (2019)
## DOI 10.1016/j.jocm.2019.100170
## www.ApolloChoiceModelling.com
##
## Model name : LC_ANA
## Model description : LC-ANA: emission attribute non-attendance
## Model run at : 2026-05-19 17:28:29.805928
## Estimation method : bgw
## Estimation diagnosis : Relative function convergence (favorable)
## Optimisation diagnosis : Maximum found
## hessian properties : Negative definite
## maximum eigenvalue : -3.436019
## reciprocal of condition number : 2.65151e-06
## Number of individuals : 366
## Number of rows in database : 3656
## Number of modelled outcomes : 10968
## MNL_c1 : 3656
## MNL_c2 : 3656
## model : 3656
##
## Number of cores used : 1
## Model without mixing
##
## LL(start) : -2304
## LL (whole model) at equal shares, LL(0) : -2534.15
## LL (whole model) at observed shares, LL(C) : -2485.68
## LL(final, whole model) : -2018.14
## Rho-squared vs equal shares : 0.2036
## Adj.Rho-squared vs equal shares : 0.1949
## Rho-squared vs observed shares : 0.1881
## Adj.Rho-squared vs observed shares : 0.18
## AIC : 4080.27
## BIC : 4216.76
##
## LL(0,class_1) : -2534.15
## LL(final,class_1) : -2186.96
## LL(0,class_2) : -2534.15
## LL(final,class_2) : -4191.03
##
## Estimated parameters : 22
## Time taken (hh:mm:ss) : 00:00:35.97
## pre-estimation : 00:00:0.61
## estimation : 00:00:8.08
## post-estimation : 00:00:27.28
## Iterations : 22
##
## Unconstrained optimisation.
##
## Estimates:
## Estimate s.e. t.rat.(0) p(1-sided) Rob.s.e.
## asc_c1 -0.235319 0.066831 -3.5211 2.1486e-04 0.064890
## b_time_ave_c1 0.096862 0.074312 1.3034 0.096211 0.074908
## b_time_sh_c1 0.292059 0.071773 4.0692 2.359e-05 0.087360
## b_acc_ave_c1 0.306528 0.073834 4.1516 1.651e-05 0.071301
## b_acc_hi_c1 0.580501 0.077507 7.4897 3.453e-14 0.092163
## b_safe_ave_c1 0.843489 0.082667 10.2035 0.000000 0.088565
## b_safe_hi_c1 1.329788 0.078330 16.9767 0.000000 0.101165
## b_easy_c1 0.121670 0.051166 2.3779 0.008705 0.047817
## b_em30_c1 0.042306 0.064156 0.6594 0.254812 0.068148
## b_em100_c1 0.519816 0.063034 8.2465 1.110e-16 0.077489
## b_price_c1 -0.004248 0.001003 -4.2378 1.129e-05 9.8696e-04
## asc_c2 0.037150 0.208882 0.1779 0.429419 0.207006
## b_time_ave_c2 0.436715 0.325742 1.3407 0.090012 0.332459
## b_time_sh_c2 1.292701 0.394245 3.2789 5.2101e-04 0.455495
## b_acc_ave_c2 0.575447 0.243159 2.3665 0.008977 0.216720
## b_acc_hi_c2 1.669257 0.304044 5.4902 2.008e-08 0.334891
## b_safe_ave_c2 -0.248548 0.235799 -1.0541 0.145925 0.284347
## b_safe_hi_c2 0.117955 0.247574 0.4764 0.316879 0.228603
## b_easy_c2 0.484515 0.157615 3.0741 0.001056 0.150626
## b_price_c2 -0.059294 0.008344 -7.1059 5.976e-13 0.009982
## delta_c1 1.295743 0.165710 7.8193 2.665e-15 0.187351
## gamma_envi 0.148521 0.139180 1.0671 0.142959 0.154157
## Rob.t.rat.(0) p(1-sided)
## asc_c1 -3.6264 1.4368e-04
## b_time_ave_c1 1.2931 0.097992
## b_time_sh_c1 3.3432 4.1416e-04
## b_acc_ave_c1 4.2991 8.576e-06
## b_acc_hi_c1 6.2986 1.501e-10
## b_safe_ave_c1 9.5239 0.000000
## b_safe_hi_c1 13.1447 0.000000
## b_easy_c1 2.5445 0.005472
## b_em30_c1 0.6208 0.267366
## b_em100_c1 6.7082 9.849e-12
## b_price_c1 -4.3045 8.366e-06
## asc_c2 0.1795 0.428786
## b_time_ave_c2 1.3136 0.094492
## b_time_sh_c2 2.8380 0.002270
## b_acc_ave_c2 2.6553 0.003962
## b_acc_hi_c2 4.9845 3.106e-07
## b_safe_ave_c2 -0.8741 0.191032
## b_safe_hi_c2 0.5160 0.302933
## b_easy_c2 3.2167 6.4844e-04
## b_price_c2 -5.9399 1.426e-09
## delta_c1 6.9161 2.321e-12
## gamma_envi 0.9634 0.167663
##
##
## Summary of class allocation for model component :
## Mean prob.
## Class_1 0.7841
## Class_2 0.2159
rho2_lcana <- 1 - (model_lcana$LLout / model_lcana$LL0)
cat("\n--- LC-ANA Fit ---\n")
##
## --- LC-ANA Fit ---
cat("LL: ", round(model_lcana$LLout, 3), "\n")
## LL: -2018.137 -2186.958 -4191.027
cat("McFadden R2: ", round(rho2_lcana, 4), "\n")
## McFadden R2: 0.2036 0.137 -0.6538
cat("AIC: ", round(model_lcana$AIC, 3), "\n")
## AIC: 4080.273
cat("BIC: ", round(model_lcana$BIC, 3), "\n")
## BIC: 4216.764
cat("\n============================================================\n")
##
## ============================================================
cat("MODEL COMPARISON TABLE\n")
## MODEL COMPARISON TABLE
cat("============================================================\n")
## ============================================================
cat(sprintf("%-20s %10s %6s %10s %10s\n",
"Model", "LL", "Params", "AIC", "BIC"))
## Model LL Params AIC BIC
cat(sprintf("%-20s %10.1f %6d %10.1f %10.1f\n",
"MNL (linear)",
model_mnl$LLout, length(model_mnl$estimate),
model_mnl$AIC, model_mnl$BIC))
## MNL (linear) -2122.7 11 4267.4 4335.6
cat(sprintf("%-20s %10.1f %6d %10.1f %10.1f\n",
"MNL (log price)",
model_mnl_log$LLout, length(model_mnl_log$estimate),
model_mnl_log$AIC, model_mnl_log$BIC))
## MNL (log price) -2158.0 11 4338.0 4406.3
cat(sprintf("%-20s %10.1f %6d %10.1f %10.1f\n",
"LCM 2-class",
model_lcm2$LLout, length(model_lcm2$estimate),
model_lcm2$AIC, model_lcm2$BIC))
## LCM 2-class -2004.7 25 4059.4 4214.5
## LCM 2-class -2240.3 25 4059.4 4214.5
## LCM 2-class -3149.6 25 4059.4 4214.5
cat(sprintf("%-20s %10.1f %6d %10.1f %10.1f\n",
"LC-ANA",
model_lcana$LLout, length(model_lcana$estimate),
model_lcana$AIC, model_lcana$BIC))
## LC-ANA -2018.1 22 4080.3 4216.8
## LC-ANA -2187.0 22 4080.3 4216.8
## LC-ANA -4191.0 22 4080.3 4216.8
cat("(Lower AIC/BIC = better fit)\n")
## (Lower AIC/BIC = better fit)
cat("\n--- WTP: MNL Baseline (% fare increase accepted) ---\n")
##
## --- WTP: MNL Baseline (% fare increase accepted) ---
est <- model_mnl$estimate
cat("30% emission reduction: ",
round(-(est["b_emiss30"] / est["b_price"]), 2), "%\n")
## 30% emission reduction: 7.2 %
cat("100% emission reduction: ",
round(-(est["b_emiss100"] / est["b_price"]), 2), "%\n")
## 100% emission reduction: 47.11 %
cat("Short vs long wait time: ",
round(-(est["b_time_short"] / est["b_price"]), 2), "%\n")
## Short vs long wait time: 32.18 %
cat("High vs low safety: ",
round(-(est["b_safe_high"] / est["b_price"]), 2), "%\n")
## High vs low safety: 112.23 %
cat("\n--- WTP: 2-Class LCM (% fare increase accepted) ---\n")
##
## --- WTP: 2-Class LCM (% fare increase accepted) ---
est2 <- model_lcm2$estimate
cat("Class 1 — 100% emission reduction: ",
round(-(est2["b_em100_c1"] / est2["b_price_c1"]), 2), "%\n")
## Class 1 — 100% emission reduction: 133.86 %
cat("Class 2 — 100% emission reduction: ",
round(-(est2["b_em100_c2"] / est2["b_price_c2"]), 2), "%\n")
## Class 2 — 100% emission reduction: 28.25 %
cat("\n--- LC-ANA: Class Membership ---\n")
##
## --- LC-ANA: Class Membership ---
esta <- model_lcana$estimate
cat("delta_c1 (intercept): ", round(esta["delta_c1"], 3), "\n")
## delta_c1 (intercept): 1.296
cat("gamma_envi (env concern): ", round(esta["gamma_envi"], 3), "\n")
## gamma_envi (env concern): 0.149
cat("If gamma_envi > 0 and significant:\n")
## If gamma_envi > 0 and significant:
cat(" Higher environmental concern predicts\n")
## Higher environmental concern predicts
cat(" attending to emission reduction attributes (H3 supported)\n")
## attending to emission reduction attributes (H3 supported)
This study provides strong evidence that Hanoi bus users value improvements in service quality and are willing to pay for more sustainable urban bus services. However, the analysis also demonstrates substantial preference heterogeneity, indicating that different groups of travelers evaluate service attributes in markedly different ways.
The baseline Multinomial Logit (MNL) model confirms that respondents systematically prefer bus alternatives offering higher service quality across several dimensions.
Among all attributes, safety emerged as the most influential factor in mode choice. Respondents were willing to accept an estimated 112% fare increase in exchange for high safety conditions relative to unsafe alternatives. This suggests that perceptions of personal safety are a dominant determinant of bus usage decisions in Hanoi.
Environmental performance was also positively valued. Respondents were willing to pay approximately 47% higher fares for fully zero-emission buses and around 7% more for buses achieving a 30% reduction in emissions. While the coefficient associated with a 30% emission reduction was only marginally significant, the coefficient for full emission elimination was highly significant, indicating that respondents respond much more strongly to complete environmental improvements than to partial reductions.
Service reliability and reduced waiting times also significantly influenced utility. In particular, respondents were willing to pay an estimated 32% fare premium for shorter waiting times compared to long waiting times. Similarly, highly accurate and reliable services generated significant positive utility gains.
Digital accessibility additionally played an important role. The positive and significant coefficient associated with app usability suggests that passengers value convenient and user-friendly digital interfaces when using bus services.
As expected, fare price carried a negative and highly significant coefficient, confirming that higher costs reduce utility and choice probability.
One notable finding is that average waiting time was not statistically significant. This suggests that respondents only perceive substantial utility gains when waiting times become very short, while average waiting times are not considered meaningfully different from long waits.
The 2-Class Latent Class Model (LCM) reveals the presence of two clearly differentiated behavioral segments within the sample population.
Class 1: Service-Quality and Sustainability Oriented (~70%)
The larger segment places substantial emphasis on safety, reliability, and environmental performance. This class exhibits particularly strong preferences for high safety standards and significant emission reductions. Their willingness-to-pay for fully zero-emission buses exceeds 130% of the current fare level, reflecting a strong valuation of environmental sustainability and service quality.
This class also demonstrates relatively low price sensitivity, indicating that respondents in this segment are more willing to trade higher fares for improved service outcomes.
Class 2: Price- and Convenience-Sensitive Travelers (~30%)
The second segment is substantially more price sensitive, with fare coefficients approximately thirteen times larger in magnitude than those observed for Class 1. Although this group still values reliability, convenience, and environmental improvements, their willingness-to-pay for cleaner buses is considerably lower.
Interestingly, environmental attributes remain positive and statistically significant within this class, suggesting that environmentally sustainable services are valued across both segments. However, respondents in this group are less willing or able to absorb large fare increases associated with sustainability improvements.
Income was found to significantly predict class membership. Higher-income respondents were more likely to belong to the quality-oriented segment, while lower-income respondents were more likely to belong to the price-sensitive group. In contrast, environmental concern was not a statistically significant predictor of class membership, which represents an important and somewhat unexpected finding.
The latent class specification also produced a substantial improvement in model fit. The Bayesian Information Criterion (BIC) decreased from 4335.6 in the baseline MNL model to 4214.5 in the 2-Class LCM, confirming the importance of accounting for preference heterogeneity within the population.
The Latent Class Attribute Non-Attendance (LC-ANA) specification classified respondents into two groups: an emission-attending class and a non-attending class that ignored emission attributes during decision-making.
The coefficient associated with environmental concern was positive, indicating that individuals with stronger environmental attitudes were more likely to attend to emission-related attributes. Although this relationship was consistent with theoretical expectations, it did not achieve conventional levels of statistical significance.
Accordingly, Hypothesis 3 can be considered directionally supported but not statistically confirmed. One possible explanation is that the sample size may have limited the statistical power necessary to detect this effect more precisely.
At the same time, the highly significant class allocation intercept suggests that the majority of respondents attended to emission attributes regardless of their stated environmental concern. This finding itself is substantively important, as it indicates that environmental performance is broadly relevant to bus users even among individuals who do not strongly identify as environmentally concerned.
The linear-price MNL specification outperformed the nonlinear log-price specification based on both AIC and BIC measures. This suggests that respondents exhibit approximately linear sensitivity to fare increases within the experimental range considered in the study.
Among all estimated models, the 2-Class Latent Class Model achieved the best overall performance. The substantial improvement in fit statistics confirms that preference heterogeneity is a defining characteristic of the dataset and that a two-segment representation adequately captures behavioral variation without introducing excessive model complexity.
Several important policy implications emerge from the findings.
First, investments in safety improvements should be considered a top priority for public transport operators and policymakers. The exceptionally high willingness-to-pay estimates associated with safer services suggest that improvements in vehicle maintenance, operational reliability, driver training, and passenger security are likely to generate substantial increases in perceived service value and ridership.
Second, the findings indicate that transitioning toward cleaner and zero-emission bus fleets is economically and socially feasible, but pricing strategies should account for heterogeneous user preferences. While a majority of respondents appear willing to pay substantial premiums for cleaner services, a sizeable price-sensitive segment may be adversely affected by uniform fare increases. Consequently, phased implementation strategies, targeted subsidies, or differentiated pricing mechanisms may be necessary to maintain accessibility while supporting sustainable transport objectives.
Finally, the significance of app usability highlights the growing importance of digital service integration within urban transport systems. Investments in intuitive mobile applications, real-time service information, and simplified digital ticketing systems may represent relatively low-cost yet highly effective strategies for improving the attractiveness and competitiveness of public transport services, particularly among younger and technology-oriented commuters.