flowchart LR A(Session) --> B(Set Working Directory) B --> C[To Project Directory]
Today we will have a detailed look in our data analysis. This includes:
working directory to Project directoryflowchart LR A(Session) --> B(Set Working Directory) B --> C[To Project Directory]
---- or ####For this session we will use data from a DCE on agri-environmental contracts choice (see Schulze et al. (2026)).
| Attribute | Description | Level | Type |
|---|---|---|---|
| Payment mechanism | Payment criteria of the agri-environmental contract | Practice based; Result based | Binary (0-Practice based, 1-Result based) |
| Contract partners | Option to involve partners in the contract and implement measure cooperatively | none; cooperative | Binary (0-none, 1-cooperatives) |
| Advisory services | Option to have advisory services included in the contract | none; inclusive | Binary (0-none, 1-inclusive) |
| Compensation | Compensation payments per hectare | 350–830 Euros/ha/year | 8 equally-separated levels |
The R package we are working with here (Apollo) will require the long format.
\(\rightarrow\) transform data accordingly
In our mock dataset, we have two distinct files: choice data and respondent characteristics.
# include the R script which includes all the packages
source("01_scripts/00_header.R")
#load the database including choice data
1 database_choice <- read_excel("00_data/agora_live_1_v1.xlsx")
#load the database including the covariates
database_covar <- read_excel("00_data/agora_live_1_covariates.xlsx")
#merge
2 database <- database_choice %>%
left_join(database_covar, by="RID")RID
In our raw datafile, we can hardly see what variables mean. For that reason, we rename and recode them accordingly.
# Recoding variables ----
database <- database %>%
mutate(pay_mode_1 = ifelse(a1_x1==2,0,1),
partner_1 = ifelse(a1_x2==1,0,1),
consult_1 = ifelse(a1_x3==1,0,1),
compensation_1 = (case_when((a1_x4==1) ~ 350,
(a1_x4==2) ~ 410,
(a1_x4==3) ~ 470,
(a1_x4==4) ~ 530,
(a1_x4==5) ~ 590,
(a1_x4==6) ~ 650,
(a1_x4==7) ~ 710,
(a1_x4==8) ~ 770,
(a1_x4==9) ~ 830)),
pay_mode_2 = ifelse(a2_x1==2,0,1),
partner_2 = ifelse(a2_x2==1,0,1),
consult_2 = ifelse(a2_x3==1,0,1),
compensation_2 = (case_when((a2_x4==1) ~ 350,
(a2_x4==2) ~ 410,
(a2_x4==3) ~ 470,
(a2_x4==4) ~ 530,
(a2_x4==5) ~ 590,
(a2_x4==6) ~ 650,
(a2_x4==7) ~ 710,
(a2_x4==8) ~ 770,
(a2_x4==9) ~ 830)))
# Generating opt out alternatives and relocating them to the correct place ----
database <- database %>%
mutate(pay_mode_3 = 0,
partner_3 = 0,
consult_3 = 0,
compensation_3 = 0) %>%
relocate(pay_mode_3, .after=compensation_2) %>%
relocate(partner_3, .after=pay_mode_3) %>%
relocate(consult_3, .after=partner_3) %>%
relocate(compensation_3, .after=consult_3) %>%
as.data.frame()We only want to include individuals that completed the survey.
STATUS variable for levels7 means complete10 means incompleteThere might be individuals in our dataset that always chose the opt out alternative.
| Variable | N | N = 3471 |
|---|---|---|
| ID | 347 | 174.00 (100.31) |
| Someone else should carry out nature conservation (yes) | 83 | 14 / 83 (17%) |
| The compensation amount was not sufficient (yes) | 83 | 21 / 83 (25%) |
| I have already done a lot for nature conservation in the past (yes) | 83 | 28 / 83 (34%) |
| There was not enough information available (yes) | 83 | 17 / 83 (20%) |
| The measure does not fit into my farm operations (yes) | 83 | 51 / 83 (61%) |
| 1 Mean (SD); n / N (%) | ||
Creating the dataset which excludes all serial protest responses.
database_np <- database %>%
group_by(ID) %>%
1 summarise(always_opt_out = all(pref1 == 3 & q30_2==1)) %>%
filter(!always_opt_out) %>%
select(ID) %>%
inner_join(database, by = "ID")
cset <- 12
nsets <- nrow(database_np)
respondents <- nsets/cset %>%
as.numeric()
database_np <- database_np %>%
mutate(ID = rep(1:respondents, each=cset))
2write.csv(database_np, "02_processed/agora_data.csv")We may want to know a few things prior to getting into the choice modelling analysis about the representativeness of our sample.
source("01_scripts/00_header.R")
1apollo_initialise()
nalt=12
2database <- read.csv("02_processed/agora_data.csv")
3apollo_control = list(
modelName ="Agora baseline",
modelDescr ="Simple MNL model",
indivID ="ID",
outputDirectory = "output",
panelData = TRUE
)apollo
ID variable, and create an output directory for all estimation files
Maximum Likelihood Estimation
Which values of \(\beta\) make the choices we observe (\(d_{nj}=1\)) most probable?
Building utlity functions \[\begin{align} U_i = \underbrace{V_i}_{\text{deterministic}} + \underbrace{\epsilon_i}_{\text{random}} \label{eq:rum} \end{align}\] with \[\begin{align} V_i = ASC + \beta_1X_{1_i} + \beta_2X_{2_i} + \cdots + \beta_{COST_i}X_{COST_i} \label{eq:det_part} \end{align}\]
Finally, we estimate the model…
AgoraBaselineMNL = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs,
estimate_settings=list(maxIterations=400, estimationRoutine="bfgs",
hessianRoutine="analytic"))Preparing user-defined functions.
Testing likelihood function...
Overview of choices for MNL model component :
first second third
Times available 3420.00 3420.00 3420.00
Times chosen 1374.00 845.00 1201.00
Percentage chosen overall 40.18 24.71 35.12
Percentage chosen when available 40.18 24.71 35.12
Pre-processing likelihood function...
Testing influence of parameters
Starting main estimation
Initial function value: -3757.254
Initial gradient value:
asc_pub b_pay_mode_pub b_partner_pub b_consult_pub asc_priv
234.0000 103.0000 94.0000 143.0000 -295.0000
b_pay_mode_priv b_partner_priv b_consult_priv b_comp
-172.3333 -168.0000 -104.6667 83310.0000
initial value 3757.254027
iter 2 value 3745.828230
iter 3 value 3684.427878
iter 4 value 3600.239896
iter 5 value 3595.030966
iter 6 value 3587.807902
iter 7 value 3579.101758
iter 8 value 3536.370271
iter 9 value 3536.231559
iter 10 value 3531.239777
iter 11 value 3456.412293
iter 12 value 3454.913940
iter 13 value 3450.303799
iter 14 value 3450.295732
iter 15 value 3450.295635
iter 15 value 3450.295634
iter 15 value 3450.295634
final value 3450.295634
converged
Estimated parameters with approximate standard errors from BHHH matrix:
Estimate BHHH se BHH t-ratio (0)
asc_pub -2.191634 0.12037 -18.2076
b_pay_mode_pub -0.131633 0.07954 -1.6548
b_partner_pub -0.039843 0.07582 -0.5255
b_consult_pub 0.225681 0.09753 2.3141
asc_priv -2.696137 0.12172 -22.1495
b_pay_mode_priv -0.227101 0.08420 -2.6971
b_partner_priv -0.068259 0.09859 -0.6924
b_consult_priv 0.278566 0.07897 3.5277
b_comp 0.003876 1.4192e-04 27.3091
Final LL: -3450.2956
Calculating log-likelihood at equal shares (LL(0)) for applicable
models...
Calculating log-likelihood at observed shares from estimation data
(LL(c)) for applicable models...
Calculating LL of each model component...
Calculating other model fit measures
Computing covariance matrix using numerical jacobian of analytical
gradient.
0%....25%....50%....75%...100%
Negative definite Hessian with maximum eigenvalue: -30.659283
Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
doi.org/10.1016/j.jocm.2019.100170
… and save the output in the output directory.
Model run by cschulze using Apollo 0.3.5 on R 4.5.0 for Windows.
Please acknowledge the use of Apollo by citing Hess & Palma (2019)
DOI 10.1016/j.jocm.2019.100170
www.ApolloChoiceModelling.com
Model name : Agora baseline
Model description : Simple MNL model
Model run at : 2026-03-05 16:52:35.367829
Estimation method : bfgs
Model diagnosis : successful convergence
Optimisation diagnosis : Maximum found
hessian properties : Negative definite
maximum eigenvalue : -30.659283
reciprocal of condition number : 1.02635e-07
Number of individuals : 285
Number of rows in database : 3420
Number of modelled outcomes : 3420
Number of cores used : 1
Model without mixing
LL(start) : -3757.25
LL at equal shares, LL(0) : -3757.25
LL at observed shares, LL(C) : -3691.16
LL(final) : -3450.3
Rho-squared vs equal shares : 0.0817
Adj.Rho-squared vs equal shares : 0.0793
Rho-squared vs observed shares : 0.0653
Adj.Rho-squared vs observed shares : 0.0634
AIC : 6918.59
BIC : 6973.83
Estimated parameters : 9
Time taken (hh:mm:ss) : 00:00:1.34
pre-estimation : 00:00:0.6
estimation : 00:00:0.37
post-estimation : 00:00:0.37
Iterations : 17
Unconstrained optimisation.
Estimates:
Estimate s.e. t.rat.(0) Rob.s.e. Rob.t.rat.(0)
asc_pub -2.191634 0.13367 -16.3955 0.17500 -12.5236
b_pay_mode_pub -0.131633 0.07303 -1.8024 0.07134 -1.8452
b_partner_pub -0.039843 0.07304 -0.5455 0.07307 -0.5453
b_consult_pub 0.225681 0.07304 3.0898 0.05607 4.0250
asc_priv -2.696137 0.13914 -19.3772 0.17740 -15.1984
b_pay_mode_priv -0.227101 0.08287 -2.7405 0.09130 -2.4874
b_partner_priv -0.068259 0.08322 -0.8203 0.07461 -0.9149
b_consult_priv 0.278566 0.08312 3.3512 0.09250 3.0114
b_comp 0.003876 1.8683e-04 20.7443 2.6245e-04 14.7671
Old result file "output/Agora baseline_output.txt"
renamed to: "output/Agora baseline_OLD49_output.txt"
Old result file "output/Agora baseline_estimates.csv"
renamed to: "output/Agora baseline_OLD49_estimates.csv"
Old result file "output/Agora baseline_model.rds"
renamed to: "output/Agora baseline_OLD49_model.rds"
Model output saved to output/Agora baseline_output.txt
Estimates saved to output/Agora baseline_estimates.csv
Model object saved to output/Agora baseline.rds
Remember, willingness to accept is the marginal rate of substitution, hence the quantification of how respondents trade-off different attributes.
How do farmers trade off different attributes, specifically compensation payments to other contract attributes, while supposedly maintaining the same level of utility?
Whenever we need standard errors from a nonlinear function of estimated coefficients, we use the delta method. The delta method turns uncertainty in your estimated \(\beta\)’s into uncertainty for a derived measure (like WTP or CV) by linearizing the formula around \(\hat\beta\) (Daly, Hess, and De Jong 2012).
If \(g(\beta)\) is in our case WTP or CV, then \[\begin{align} \operatorname{Var}\!\big(g(\hat{\boldsymbol{\beta}})\big) &\approx \nabla g(\hat{\boldsymbol{\beta}})^{\top}\, \widehat{\boldsymbol{\Sigma}}\, \nabla g(\hat{\boldsymbol{\beta}}) \\ \operatorname{SE}\!\big(g(\hat{\boldsymbol{\beta}})\big) &\approx \sqrt{\nabla g(\hat{\boldsymbol{\beta}})^{\top}\, \widehat{\boldsymbol{\Sigma}}\, \nabla g(\hat{\boldsymbol{\beta}})} \end{align}\]
Constructing WTA measures from parameters estimates.
deltaMethod_settings=list(expression=c(
wta_asc_pub ="asc_pub/b_comp",
wta_paymode_pub = "b_pay_mode_pub/b_comp",
wta_consult_pub = "b_consult_pub/b_comp",
wta_asc_priv ="asc_priv/b_comp",
wta_paymode_priv = "b_pay_mode_priv/b_comp",
wta_consult_priv = "b_consult_priv/b_comp"
))
apollo_deltaMethod(AgoraBaselineMNL, deltaMethod_settings) Running Delta method computation for user-defined function using robust standard errors
Expression Value s.e. t-ratio (0)
wta_asc_pub -565.4905 28.9924 -19.50
wta_paymode_pub -33.9642 18.3939 -1.85
wta_consult_pub 58.2309 15.0827 3.86
wta_asc_priv -695.6637 35.9319 -19.36
wta_paymode_priv -58.5972 23.7508 -2.47
wta_consult_priv 71.8763 24.3206 2.96
INFORMATION: The results of the Delta method calculations are returned invisibly as
an output from this function. Calling the function via
result=apollo_deltaMethod(...) will save this output in an object
called result (or otherwise named object).
Constructing compensating variation from parameters estimates.
deltaMethod_settings <- list(expression = c(
# Scenario utilities (partner not significant)
V1_pub = "asc_pub + 1*b_pay_mode_pub + 1*b_consult_pub + 0*b_partner_pub",
V1_priv = "asc_priv + 1*b_pay_mode_priv + 1*b_consult_priv + 0*b_partner_priv",
# Compensating variation (simple): CV = -(V1 - V0) / b_comp
cv_pub = "-((asc_pub + 1*b_pay_mode_pub + 1*b_consult_pub + 0*b_partner_pub) - 0)/b_comp",
cv_priv = "-((asc_priv + 1*b_pay_mode_priv + 1*b_consult_priv + 0*b_partner_priv) - 0)/b_comp"
))
apollo_deltaMethod(AgoraBaselineMNL, deltaMethod_settings) Running Delta method computation for user-defined function using robust standard errors
Expression Value s.e. t-ratio (0)
V1_pub -2.0976 0.1853 -11.32
V1_priv -2.6447 0.1769 -14.95
cv_pub 541.2238 31.0348 17.44
cv_priv 682.3846 34.5646 19.74
INFORMATION: The results of the Delta method calculations are returned invisibly as
an output from this function. Calling the function via
result=apollo_deltaMethod(...) will save this output in an object
called result (or otherwise named object).
What about impacts of:
We could either:
However, be aware of sample sizes of subsamples!
We could split the sample, for example, based on certified organic and non-organic farms (variable c9 in our data).
And then run the analysis for the split sample.
c11) and the partner attribute.# set apollo prior values ----
apollo_beta=c(asc_pub = 0,
b_pay_mode_pub = 0,
b_partner_pub = 0,
b_consult_pub = 0,
asc_priv = 0,
b_pay_mode_priv = 0,
b_partner_priv = 0,
b_consult_priv = 0,
b_comp = 0,
1 int_partner_advice_pub=0,
int_partner_advice_priv=0)Including new dummy coded variable in utility
apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P = list()
V = list()
V[['first']] = asc_pub + b_pay_mode_pub * pay_mode_1 + b_partner_pub * partner_1 + int_partner_advice_pub * advice_past * partner_1 + b_consult_pub * consult_1 + b_comp * compensation_1
V[['second']] = asc_priv + b_pay_mode_priv * pay_mode_2 + b_partner_priv * partner_2 + int_partner_advice_priv * advice_past * partner_1 + b_consult_priv * consult_2 + b_comp * compensation_2
V[['third']] = 0
mnl_settings = list(
alternatives = c(first=1, second=2, third=3),
avail = list(first=1, second=1, third=1),
choiceVar = pref1,
V = V
)
P[['model']] = apollo_mnl(mnl_settings, functionality)
P = apollo_panelProd(P, apollo_inputs, functionality)
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}AgoraBaselineMNL_int = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs,
estimate_settings=list(maxIterations=400, estimationRoutine="bfgs",
hessianRoutine="analytic"))Preparing user-defined functions.
Testing likelihood function...
Overview of choices for MNL model component :
first second third
Times available 3420.00 3420.00 3420.00
Times chosen 1374.00 845.00 1201.00
Percentage chosen overall 40.18 24.71 35.12
Percentage chosen when available 40.18 24.71 35.12
Pre-processing likelihood function...
Testing influence of parameters
Starting main estimation
Initial function value: -3757.254
Initial gradient value:
asc_pub b_pay_mode_pub b_partner_pub
234.0000 103.0000 94.0000
b_consult_pub asc_priv b_pay_mode_priv
143.0000 -295.0000 -172.3333
b_partner_priv b_consult_priv b_comp
-168.0000 -104.6667 83310.0000
int_partner_advice_pub int_partner_advice_priv
56.0000 -34.0000
initial value 3757.254027
iter 2 value 3745.828213
iter 3 value 3683.862865
iter 4 value 3595.432140
iter 5 value 3519.048374
iter 6 value 3502.433816
iter 7 value 3498.853609
iter 8 value 3496.198891
iter 9 value 3473.747313
iter 10 value 3456.913127
iter 11 value 3455.016138
iter 12 value 3454.533078
iter 13 value 3453.972938
iter 14 value 3444.988131
iter 15 value 3442.834990
iter 16 value 3442.830708
iter 17 value 3442.830571
iter 17 value 3442.830569
iter 17 value 3442.830569
final value 3442.830569
converged
Estimated parameters with approximate standard errors from BHHH matrix:
Estimate BHHH se BHH t-ratio (0)
asc_pub -2.203688 0.12115 -18.1893
b_pay_mode_pub -0.141271 0.07993 -1.7673
b_partner_pub -0.155886 0.07700 -2.0246
b_consult_pub 0.231883 0.09769 2.3736
asc_priv -2.722525 0.12170 -22.3716
b_pay_mode_priv -0.227577 0.08469 -2.6872
b_partner_priv -0.067603 0.09863 -0.6854
b_consult_priv 0.267883 0.08000 3.3485
b_comp 0.003881 1.4208e-04 27.3145
int_partner_advice_pub 0.506046 0.07193 7.0352
int_partner_advice_priv 0.219875 0.09235 2.3809
Final LL: -3442.8306
Calculating log-likelihood at equal shares (LL(0)) for applicable
models...
Calculating log-likelihood at observed shares from estimation data
(LL(c)) for applicable models...
Calculating LL of each model component...
Calculating other model fit measures
Computing covariance matrix using numerical jacobian of analytical
gradient.
0%....25%....50%....75%..100%
Negative definite Hessian with maximum eigenvalue: -29.886489
Please acknowledge the use of Apollo by citing Hess & Palma (2019) -
doi.org/10.1016/j.jocm.2019.100170
Model run by cschulze using Apollo 0.3.5 on R 4.5.0 for Windows.
Please acknowledge the use of Apollo by citing Hess & Palma (2019)
DOI 10.1016/j.jocm.2019.100170
www.ApolloChoiceModelling.com
Model name : Agora baseline interaction
Model description : Simple MNL model with interaction
Model run at : 2026-03-05 16:52:38.332485
Estimation method : bfgs
Model diagnosis : successful convergence
Optimisation diagnosis : Maximum found
hessian properties : Negative definite
maximum eigenvalue : -29.886489
reciprocal of condition number : 1.00342e-07
Number of individuals : 285
Number of rows in database : 3420
Number of modelled outcomes : 3420
Number of cores used : 1
Model without mixing
LL(start) : -3757.25
LL at equal shares, LL(0) : -3757.25
LL at observed shares, LL(C) : -3691.16
LL(final) : -3442.83
Rho-squared vs equal shares : 0.0837
Adj.Rho-squared vs equal shares : 0.0808
Rho-squared vs observed shares : 0.0673
Adj.Rho-squared vs observed shares : 0.0648
AIC : 6907.66
BIC : 6975.17
Estimated parameters : 11
Time taken (hh:mm:ss) : 00:00:1.15
pre-estimation : 00:00:0.3
estimation : 00:00:0.39
post-estimation : 00:00:0.45
Iterations : 20
Unconstrained optimisation.
Estimates:
Estimate s.e. t.rat.(0) Rob.s.e.
asc_pub -2.203688 0.13403 -16.4414 0.17528
b_pay_mode_pub -0.141271 0.07320 -1.9299 0.07172
b_partner_pub -0.155886 0.08035 -1.9401 0.09723
b_consult_pub 0.231883 0.07319 3.1681 0.05619
asc_priv -2.722525 0.13981 -19.4729 0.17984
b_pay_mode_priv -0.227577 0.08292 -2.7445 0.09128
b_partner_priv -0.067603 0.08326 -0.8119 0.07468
b_consult_priv 0.267883 0.08339 3.2123 0.09222
b_comp 0.003881 1.8729e-04 20.7212 2.6322e-04
int_partner_advice_pub 0.506046 0.13218 3.8285 0.24693
int_partner_advice_priv 0.219875 0.13756 1.5984 0.21803
Rob.t.rat.(0)
asc_pub -12.5725
b_pay_mode_pub -1.9696
b_partner_pub -1.6032
b_consult_pub 4.1265
asc_priv -15.1387
b_pay_mode_priv -2.4931
b_partner_priv -0.9053
b_consult_priv 2.9049
b_comp 14.7443
int_partner_advice_pub 2.0493
int_partner_advice_priv 1.0085
Old result file "output/Agora baseline interaction_output.txt"
renamed to: "output/Agora baseline interaction_OLD47_output.txt"
Old result file "output/Agora baseline interaction_estimates.csv"
renamed to: "output/Agora baseline interaction_OLD47_estimates.csv"
Old result file "output/Agora baseline interaction_model.rds"
renamed to: "output/Agora baseline interaction_OLD47_model.rds"
Model output saved to output/Agora baseline interaction_output.txt
Estimates saved to output/Agora baseline interaction_estimates.csv
Model object saved to output/Agora baseline interaction.rds
deltaMethod_settings=list(expression=c(
wta_asc_pub ="asc_pub/b_comp",
wta_paymode_pub = "b_pay_mode_pub/b_comp",
wta_partner_advice_pub = "(b_partner_pub + int_partner_advice_pub)/b_comp",
wta_partner_pub = "b_partner_pub/b_comp",
wta_consult_pub = "b_consult_pub/b_comp",
wta_asc_priv ="asc_priv/b_comp",
wta_paymode_priv = "b_pay_mode_priv/b_comp",
wta_partner_advice_priv = "(b_partner_pub + int_partner_advice_priv)/b_comp",
wta_partner_priv = "b_partner_priv/b_comp",
wta_consult_priv = "b_consult_priv/b_comp"
))
apollo_deltaMethod(AgoraBaselineMNL_int, deltaMethod_settings) Running Delta method computation for user-defined function using robust standard errors
Expression Value s.e. t-ratio (0)
wta_asc_pub -567.8231 28.9293 -19.63
wta_paymode_pub -36.4013 18.4762 -1.97
wta_partner_advice_pub 90.2255 52.3928 1.72
wta_partner_pub -40.1671 25.0477 -1.60
wta_consult_pub 59.7491 15.1144 3.95
wta_asc_priv -701.5114 36.9420 -18.99
wta_paymode_priv -58.6395 23.6923 -2.48
wta_partner_advice_priv 16.4878 55.2773 0.30
wta_partner_priv -17.4192 19.0249 -0.92
wta_consult_priv 69.0253 24.2324 2.85
INFORMATION: The results of the Delta method calculations are returned invisibly as
an output from this function. Calling the function via
result=apollo_deltaMethod(...) will save this output in an object
called result (or otherwise named object).
If we force both contracts to be result-based + include advisory services, how does uptake change as compensation increases?
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.08363 0.27175 0.37849 0.39096 0.49881 0.71242
first second third
0.4017534 0.2470772 0.3511694
We simulate a Discrete Choice Experiment dataset that follows the Random Utility Model (RUM): \[\begin{align} U_{njt}=V_{njt}+\epsilon_{njt} \end{align}\]
We create data where we know the truth, so we can later estimate an MNL and check whether we recover it.
We will:
Building a dataset with n rows (No choices yet!).
SIM_N <- 300 # number of respondents
SIM_T <- 12 # choice tasks per respondent
SIM_SEED <- 123
sim_agora <- function(N = SIM_N, T = SIM_T, seed = SIM_SEED){
comp_levels <- c(350,410,470,530,590,650,710,770,830)
n <- N*T
df <- data.frame(
ID = rep(1:N, each = T),
pay_mode_1 = rbinom(n, 1, 0.5),
partner_1 = rbinom(n, 1, 0.5),
consult_1 = rbinom(n, 1, 0.5),
compensation_1 = sample(comp_levels, n, replace=TRUE),
pay_mode_2 = rbinom(n, 1, 0.5),
partner_2 = rbinom(n, 1, 0.5),
consult_2 = rbinom(n, 1, 0.5),
compensation_2 = sample(comp_levels, n, replace=TRUE),
pay_mode_3 = 0,
partner_3 = 0,
consult_3 = 0,
compensation_3 = 0
)Defining “true” preference parameters (the data-generating process)
Building deterministic utilities \(V\)
V1 <- with(df, beta_true["asc_pub"] +
beta_true["b_pay_mode_pub"]*pay_mode_1 +
beta_true["b_partner_pub"]*partner_1 +
beta_true["b_consult_pub"]*consult_1 +
beta_true["b_comp"]*compensation_1)
V2 <- with(df, beta_true["asc_priv"] +
beta_true["b_pay_mode_priv"]*pay_mode_2 +
beta_true["b_partner_priv"]*partner_2 +
beta_true["b_consult_priv"]*consult_2 +
beta_true["b_comp"]*compensation_2)
V3 <- 0Adding randomness: \(\epsilon\) shocks
# i.i.d. Gumbel shocks (MNL kernel)
gumbel <- function(n) -log(-log(runif(n)))
U1 <- V1 + gumbel(n)
U2 <- V2 + gumbel(n)
U3 <- V3 + gumbel(n)This gives us \[\begin{align} U_{njt}=V_{njt}+\epsilon_{njt} \end{align}\]
Turning utilities into choices
So the rule is
the chosen alternative is the one with the highest utility
That is the behavioral assumption of random utility maximization.
apollo_control <- list(
modelName ="Simulated Agora baseline",
modelDescr ="MNL on simulated data",
indivID ="ID",
outputDirectory = "output_sim",
panelData = TRUE
)
apollo_beta <- c(
asc_pub = 0,
b_pay_mode_pub = 0,
b_partner_pub = 0,
b_consult_pub = 0,
asc_priv = 0,
b_pay_mode_priv = 0,
b_partner_priv = 0,
b_consult_priv = 0,
b_comp = 0
)
apollo_fixed <- c()
apollo_inputs <- apollo_validateInputs()
SimMNL <- apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)Model run by cschulze using Apollo 0.3.5 on R 4.5.0 for Windows.
Please acknowledge the use of Apollo by citing Hess & Palma (2019)
DOI 10.1016/j.jocm.2019.100170
www.ApolloChoiceModelling.com
Model name : Simulated Agora baseline
Model description : MNL on simulated data
Model run at : 2026-03-05 16:52:44.170779
Estimation method : bgw
Model diagnosis : Relative function convergence
Optimisation diagnosis : Maximum found
hessian properties : Negative definite
maximum eigenvalue : -28.46654
reciprocal of condition number : 1.03905e-07
Number of individuals : 300
Number of rows in database : 3600
Number of modelled outcomes : 3600
Number of cores used : 1
Model without mixing
LL(start) : -3955
LL at equal shares, LL(0) : -3955
LL at observed shares, LL(C) : -3418.98
LL(final) : -3239.6
Rho-squared vs equal shares : 0.1809
Adj.Rho-squared vs equal shares : 0.1786
Rho-squared vs observed shares : 0.0525
Adj.Rho-squared vs observed shares : 0.0504
AIC : 6497.2
BIC : 6552.9
Estimated parameters : 9
Time taken (hh:mm:ss) : 00:00:0.67
pre-estimation : 00:00:0.16
estimation : 00:00:0.17
post-estimation : 00:00:0.34
Iterations : 7
Unconstrained optimisation.
Estimates:
Estimate s.e. t.rat.(0) Rob.s.e. Rob.t.rat.(0)
asc_pub 0.368745 0.13848 2.66289 0.13420 2.74780
b_pay_mode_pub 0.527084 0.08098 6.50865 0.08276 6.36893
b_partner_pub 0.031601 0.08032 0.39342 0.08052 0.39244
b_consult_pub 0.500961 0.08083 6.19785 0.08285 6.04628
asc_priv 0.251496 0.14828 1.69605 0.15708 1.60102
b_pay_mode_priv 0.106741 0.09248 1.15425 0.09353 1.14127
b_partner_priv 0.003690 0.09220 0.04002 0.08982 0.04108
b_consult_priv 0.362350 0.09298 3.89720 0.09466 3.82801
b_comp -0.003185 2.0330e-04 -15.66826 2.0696e-04 -15.39111
est <- get_apollo_estimates(SimMNL)
comp <- data.frame(
param = names(beta_true),
true = as.numeric(beta_true),
est = as.numeric(est[names(beta_true)])
)
# add diagnostics
comp$diff <- comp$est - comp$true
comp$abs_diff <- abs(comp$diff)
comp$rel_diff <- comp$diff / comp$true
comp param true est diff abs_diff rel_diff
1 asc_pub 0.300 0.368745049 0.0687450494 0.0687450494 0.229150165
2 asc_priv 0.050 0.251496272 0.2014962722 0.2014962722 4.029925445
3 b_pay_mode_pub 0.400 0.527083718 0.1270837183 0.1270837183 0.317709296
4 b_partner_pub 0.150 0.031601110 -0.1183988897 0.1183988897 -0.789325931
5 b_consult_pub 0.500 0.500961199 0.0009611994 0.0009611994 0.001922399
6 b_pay_mode_priv 0.200 0.106741015 -0.0932589849 0.0932589849 -0.466294925
7 b_partner_priv 0.100 0.003690012 -0.0963099884 0.0963099884 -0.963099884
8 b_consult_priv 0.350 0.362350353 0.0123503530 0.0123503530 0.035286723
9 b_comp -0.003 -0.003185373 -0.0001853733 0.0001853733 0.061791103
What happens if we change size of N?
Re-run code with
N=100,N=500andN=2000!