Simulation experimental design

Published

July 3, 2025

Preliminary information

This document describes how we generated and tested different designs for a study with 5 attributes and two alternatives. We find the best design in the following steps.

1. Generating different designs: The parameters used to generate the designs are denoted as priors.
2. Simulating a data set: The simulation assumes a clearly defined data generating process (DGP). The DGP is specified in terms of error variance, utility parameters and a specification of the utility function. The parameters used to generate the data are denoted as true parameters.
3. Estimating a model on the simulated data set: Here, we can estimate a model that is exactly the same as the DGP or a model that deviates from the DGP. We denote the parameters estimated in each model as the estimated parameters.
4. We repeat steps 2 and 3 $N$ times to infer which design is the best. Here we look at statistical power (at a 5% significance level), unbiasedness and efficiency.
- Power: If we have 100 simulations, how often do we find an effect (knowing that the effect exists)?
- Unbiasedness: If we have 100 simulations, we know that each estimation will not give the true parameter, but something around the true parameter. This is because the DGP has some randomness involved. But the estimated parameters should fluctuate around the true parameter. Thus, the mean of the $N$ estimated parameters should be equal to the true parameter.
- Efficiency: This is a relative measure. The design we consider efficient is the one that has lower mean estimated standard errors (and hence lower p-values) than all other designs.
5. Once we have found a design, we check each choice set manually to make sure the choice sets do not look bogus to the respondents. If we find a bogus choice set, we either take another design or change this choice set.
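The simulation loop in steps 2 to 4 can be sketched in a few lines. The example below is a minimal illustration and not the code used for this report: it assumes a binary logit with a single attribute (coded as the difference between the two alternatives), a hypothetical true parameter of 0.4 and 360 choice observations per run. All function names are made up for illustration.

```python
import math
import random
from statistics import NormalDist, mean


def choice_prob(beta, x_diff):
    """Logit probability of choosing alternative 1 given the attribute difference."""
    return 1.0 / (1.0 + math.exp(-beta * x_diff))


def simulate_choices(beta_true, x_diffs, rng):
    """Step 2: draw one simulated data set (1 = alternative 1 chosen)."""
    return [1 if rng.random() < choice_prob(beta_true, x) else 0 for x in x_diffs]


def fit_logit(x_diffs, choices, iterations=15):
    """Step 3: maximum likelihood via Newton-Raphson.

    Returns (estimate, standard error, two-sided p-value).
    """
    beta = 0.0
    for _ in range(iterations):
        probs = [choice_prob(beta, x) for x in x_diffs]
        gradient = sum((y - p) * x for x, y, p in zip(x_diffs, choices, probs))
        information = sum(p * (1 - p) * x * x for x, p in zip(x_diffs, probs))
        beta += gradient / information
    # Recompute the information at the final estimate for the standard error.
    probs = [choice_prob(beta, x) for x in x_diffs]
    information = sum(p * (1 - p) * x * x for x, p in zip(x_diffs, probs))
    std_error = 1.0 / math.sqrt(information)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(beta / std_error)))
    return beta, std_error, p_value


def run_simulation(beta_true, x_diffs, n_runs, seed=1):
    """Step 4: repeat steps 2 and 3, then summarise power, bias and mean standard error."""
    rng = random.Random(seed)
    fits = [fit_logit(x_diffs, simulate_choices(beta_true, x_diffs, rng))
            for _ in range(n_runs)]
    return {
        "power": mean(1.0 if p < 0.05 else 0.0 for _, _, p in fits),
        "bias": mean(b for b, _, _ in fits) - beta_true,
        "mean_se": mean(se for _, se, _ in fits),
    }


# Hypothetical setup: attribute difference of -1 or +1, 360 observations per run.
x_diffs = [-1.0, 1.0] * 180
summary = run_simulation(beta_true=0.4, x_diffs=x_diffs, n_runs=200)
print(summary)
```

Running the same function once per candidate design (that is, once per set of `x_diffs`) gives the three measures side by side, which is the comparison the later sections make for the full model.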

Note: Unbiasedness and efficiency are independent of the error variance and effect size. Power, in contrast, depends on the magnitude of the true parameters and on the error variance. As we will not know these values before we have collected the data, power is a somewhat arbitrary measure. The best strategy is to use the parameters from a similar study as the true parameters and multiply them by a constant $c < 1$. Then we get a rather conservative power estimate. Still, we can always fool ourselves (and reviewers) by using parameters that give us the power we want.
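To see how strongly the assumed effect size drives power, the following sketch scales a parameter by $c$ and computes the approximate power of a two-sided z-test. It relies on a normal approximation of the estimator, which is an assumption made here for illustration and not the simulation-based power reported below. The parameter value 0.4 and the standard error 0.1 are hypothetical.

```python
from statistics import NormalDist


def approx_power(beta, std_error, alpha=0.05):
    """Approximate power of a two-sided Wald (z) test of H0: beta = 0."""
    normal = NormalDist()
    critical = normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(beta) / std_error
    return normal.cdf(shift - critical) + normal.cdf(-shift - critical)


# Hypothetical values: parameter from a similar study and the expected standard error.
beta_similar_study = 0.4
expected_se = 0.1
for c in (1.0, 0.75, 0.5):
    power = approx_power(c * beta_similar_study, expected_se)
    print(f"c = {c:.2f}: approximate power = {power:.2f}")
```

With a true parameter of zero the function returns the significance level itself, which is a quick sanity check on the formula.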

For the simulations, we use 360 respondents for a DGP with homogeneous preferences (conditional logit model).

Design Generation

We created four different experimental designs:

orthogonal design
efficient design
Bayesian efficient design
efficient design capturing alternative specific parameters

Simulation

The following presents the results from a DGP of the conditional logit model. This is the standard approach and assumes homogeneous preferences.

The simulation has 360 respondents and 1000 runs. The simulation itself took 35 minutes and 43 seconds.

The simulation is based on the output file output/agora_simulation.rds
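For readers unfamiliar with the conditional logit DGP, the sketch below shows the mechanism under the usual assumption of independent standard Gumbel errors: a choice is simulated by adding an error draw to each deterministic utility and picking the maximum, and the resulting choice shares approach the closed-form logit probabilities. The three utilities are made up and not taken from the study, and the function names are illustrative.

```python
import math
import random


def logit_probabilities(utilities):
    """Closed-form conditional logit choice probabilities (softmax of utilities)."""
    highest = max(utilities)
    weights = [math.exp(u - highest) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def gumbel_draw(rng):
    """Standard Gumbel (type I extreme value) draw via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def draw_choice(utilities, rng):
    """Add a Gumbel error to each utility and return the index of the best option."""
    totals = [u + gumbel_draw(rng) for u in utilities]
    return totals.index(max(totals))


# Hypothetical deterministic utilities for the options in one choice set.
utilities = [0.0, -0.8, -1.1]
rng = random.Random(42)
n_draws = 20000
counts = [0] * len(utilities)
for _ in range(n_draws):
    counts[draw_choice(utilities, rng)] += 1
shares = [count / n_draws for count in counts]

print("logit probabilities:", [round(p, 3) for p in logit_probabilities(utilities)])
print("simulated shares:   ", [round(s, 3) for s in shares])
```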

Unbiasedness

Table 1 shows summary statistics of the estimated parameters for the 1000 runs. We want to make sure that they are nearly equal to the true parameters.

Table 1: Means and medians of estimated parameters over all runs

Att_Name	truepar	altscfeff.mean	befficientdesign.mean	efficientdesign.mean	orthodesign.mean	altscfeff.median	befficientdesign.median	efficientdesign.median	orthodesign.median
basc	-1.20	-1.20	-1.20	-1.21	-1.20	-1.20	-1.20	-1.20	-1.20
baction1	0.10	0.10	0.10	0.10	0.10	0.10	0.10	0.10	0.10
badvisory1	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40
bpartner1	0.30	0.30	0.30	0.30	0.30	0.30	0.30	0.30	0.30
bcomp1	0.02	0.02	0.02	0.02	0.02	0.02	0.02	0.02	0.02
basc2	-1.60	-1.60	-1.60	-1.61	-1.61	-1.60	-1.60	-1.62	-1.60
baction2	0.20	0.20	0.19	0.21	0.20	0.20	0.20	0.21	0.20
badvisory2	0.60	0.60	0.60	0.60	0.60	0.60	0.60	0.60	0.60
bpartner2	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40	0.40
bcomp2	0.01	0.02	0.02	0.02	0.02	0.01	0.01	0.02	0.02

Efficiency

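Table 2 shows summary statistics of the robust p-values over all runs. For most parameters, the mean p-value rounds to 0.00 in every design. The exceptions are baction1 and baction2 in all designs, and bpartner1 in the two efficient designs (befficientdesign and efficientdesign).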

Table 2: Summary statistics of estimated robust p values over all runs

parname	altscfeff.mean	befficientdesign.mean	efficientdesign.mean	orthodesign.mean	altscfeff.sd	befficientdesign.sd	efficientdesign.sd	orthodesign.sd	altscfeff.max	befficientdesign.max	efficientdesign.max	orthodesign.max	altscfeff.range	befficientdesign.range	efficientdesign.range	orthodesign.range	altscfeff.se	befficientdesign.se	efficientdesign.se	orthodesign.se	altscfeff.median	altscfeff.skew	altscfeff.kurtosis	befficientdesign.median	befficientdesign.skew	befficientdesign.kurtosis	efficientdesign.median	efficientdesign.skew	efficientdesign.kurtosis	orthodesign.median	orthodesign.skew	orthodesign.kurtosis
rob_pval0_basc	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.01	0.01	0.00	0.00	0.01	0.01	0.00	0.00	0.00	0.00	0.00	0.0	24.83	676.90	0.00	13.46	194.05	0.00	26.99	786.56	0.00	28.24	844.65
rob_pval0_baction1	0.22	0.33	0.33	0.24	0.27	0.30	0.29	0.28	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	0.01	0.01	0.01	0.01	0.1	1.31	0.65	0.24	0.68	-0.85	0.25	0.69	-0.73	0.10	1.24	0.36
rob_pval0_badvisory1	0.00	0.00	0.00	0.00	0.00	0.01	0.01	0.00	0.00	0.13	0.15	0.00	0.00	0.13	0.15	0.00	0.00	0.00	0.00	0.00	0.0	22.21	539.74	0.00	14.13	260.83	0.00	17.05	335.80	0.00	27.09	779.78
rob_pval0_bpartner1	0.00	0.02	0.02	0.00	0.00	0.06	0.06	0.01	0.07	0.67	0.97	0.13	0.07	0.67	0.97	0.13	0.00	0.00	0.00	0.00	0.0	11.62	151.62	0.00	6.42	52.14	0.00	8.54	94.00	0.00	13.19	206.37
rob_pval0_bcomp1	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.0	17.86	348.54	0.00	22.43	557.33	0.00	26.69	754.34	0.00	31.50	991.89
rob_pval0_basc2	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.0	24.60	663.58	0.00	26.63	768.97	0.00	20.55	470.56	0.00	21.78	520.52
rob_pval0_baction2	0.05	0.14	0.13	0.05	0.12	0.21	0.20	0.11	0.97	0.99	0.98	0.92	0.97	0.99	0.98	0.92	0.00	0.01	0.01	0.00	0.0	4.37	21.55	0.04	2.07	3.96	0.03	2.28	4.87	0.01	3.99	18.76
rob_pval0_badvisory2	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.0	29.13	880.56	0.00	12.86	203.27	0.00	11.91	166.91	0.00	31.53	992.99
rob_pval0_bpartner2	0.00	0.00	0.00	0.00	0.00	0.01	0.02	0.00	0.01	0.25	0.27	0.02	0.01	0.25	0.27	0.02	0.00	0.00	0.00	0.00	0.0	11.10	144.98	0.00	11.98	191.25	0.00	9.26	102.35	0.00	17.79	335.09
rob_pval0_bcomp2	0.00	0.00	0.00	0.00	0.00	0.01	0.00	0.00	0.00	0.12	0.09	0.00	0.00	0.12	0.09	0.00	0.00	0.00	0.00	0.00	0.0	10.87	142.62	0.00	17.34	360.86	0.00	13.78	214.20	0.00	27.13	781.95

To get better insights into efficiency, we can look at the standard deviations of the estimated parameters. Smaller standard deviations mean smaller fluctuations around the mean. From an efficiency perspective, we should select the design with the lowest standard deviations. It turns out that no single design outperforms all others. But the design with simple priors seems to be better for many attributes. This is a bit surprising, as the DGP was based on the priors of the avclasspap design.

Table 3: Standard deviations and other measures of dispersion of estimated parameters over all runs

parname	truepar	altscfeff.sd	befficientdesign.sd	efficientdesign.sd	orthodesign.sd	altscfeff.min	befficientdesign.min	efficientdesign.min	orthodesign.min	altscfeff.max	befficientdesign.max	efficientdesign.max	orthodesign.max	altscfeff.range	befficientdesign.range	efficientdesign.range	orthodesign.range	befficientdesign.se	efficientdesign.se	orthodesign.se	altscfeff.median	altscfeff.skew	altscfeff.kurtosis	befficientdesign.median	befficientdesign.skew	befficientdesign.kurtosis	efficientdesign.median	efficientdesign.skew	efficientdesign.kurtosis	orthodesign.median	orthodesign.skew	orthodesign.kurtosis
basc	-1.20	0.14	0.21	0.21	0.14	-1.70	-1.90	-1.87	-1.68	-0.78	-0.57	-0.49	-0.78	0.92	1.33	1.38	0.90	0.01	0.01	0.00	-1.20	-0.16	0.15	-1.20	-0.02	0.02	-1.20	-0.01	-0.04	-1.20	0.02	0.03
baction1	0.10	0.06	0.08	0.09	0.06	-0.10	-0.15	-0.25	-0.13	0.31	0.37	0.38	0.28	0.41	0.51	0.63	0.41	0.00	0.00	0.00	0.10	-0.07	-0.14	0.10	0.01	-0.36	0.10	-0.04	0.30	0.10	-0.07	-0.14
badvisory1	0.40	0.06	0.09	0.08	0.06	0.21	0.14	0.12	0.19	0.60	0.69	0.72	0.58	0.40	0.56	0.59	0.39	0.00	0.00	0.00	0.40	0.03	-0.10	0.40	0.13	0.06	0.40	0.09	0.16	0.40	-0.06	-0.02
bpartner1	0.30	0.06	0.09	0.09	0.06	0.11	-0.05	0.00	0.10	0.50	0.55	0.56	0.49	0.38	0.60	0.56	0.39	0.00	0.00	0.00	0.30	0.07	-0.02	0.30	-0.10	-0.21	0.30	-0.07	0.04	0.30	-0.01	-0.03
bcomp1	0.02	0.00	0.00	0.00	0.00	0.01	0.01	0.01	0.01	0.03	0.03	0.03	0.03	0.01	0.02	0.02	0.01	0.00	0.00	0.00	0.02	0.08	-0.10	0.02	0.04	-0.03	0.02	0.05	-0.10	0.02	-0.09	0.21
basc2	-1.60	0.15	0.24	0.25	0.17	-2.11	-2.36	-2.47	-2.11	-1.18	-0.89	-0.76	-1.10	0.93	1.48	1.71	1.00	0.01	0.01	0.01	-1.60	-0.10	-0.11	-1.60	-0.02	-0.21	-1.62	0.05	-0.03	-1.60	-0.11	-0.21
baction2	0.20	0.07	0.09	0.10	0.07	-0.02	-0.15	-0.12	-0.05	0.43	0.52	0.51	0.43	0.46	0.67	0.63	0.48	0.00	0.00	0.00	0.20	-0.05	-0.12	0.20	0.05	0.03	0.21	-0.03	-0.02	0.20	-0.01	-0.13
badvisory2	0.60	0.07	0.10	0.10	0.07	0.40	0.34	0.33	0.33	0.83	0.94	0.91	0.84	0.44	0.60	0.57	0.52	0.00	0.00	0.00	0.60	0.10	0.06	0.60	0.04	-0.15	0.60	0.04	-0.28	0.60	-0.03	0.03
bpartner2	0.40	0.07	0.10	0.10	0.07	0.20	0.11	0.11	0.16	0.61	0.69	0.72	0.62	0.41	0.58	0.61	0.46	0.00	0.00	0.00	0.40	-0.06	-0.16	0.40	0.18	-0.32	0.40	0.01	-0.05	0.40	0.13	0.12
bcomp2	0.01	0.00	0.00	0.00	0.00	0.01	0.01	0.01	0.01	0.02	0.02	0.03	0.02	0.01	0.02	0.02	0.01	0.00	0.00	0.00	0.01	0.08	-0.17	0.01	-0.13	-0.19	0.02	-0.05	-0.11	0.02	0.08	-0.21
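One way to read the standard deviations in Table 3 is in terms of sample size. Assuming that standard errors shrink with the square root of the number of respondents (a standard large-sample approximation, not something tested in this report), the squared ratio of two standard deviations approximates how many times more respondents the less precise design needs to match the more precise one. The sketch uses the basc row of Table 3; the function name is illustrative.

```python
def respondent_ratio(sd_reference, sd_other):
    """Approximate factor by which the sample must grow for sd_other to match sd_reference."""
    return (sd_other / sd_reference) ** 2


# Standard deviations of the basc estimates, copied from Table 3 (rounded to two decimals).
sd_basc = {
    "altscfeff": 0.14,
    "befficientdesign": 0.21,
    "efficientdesign": 0.21,
    "orthodesign": 0.14,
}
reference = min(sd_basc.values())
ratios = {design: respondent_ratio(reference, sd) for design, sd in sd_basc.items()}
for design, ratio in ratios.items():
    print(f"{design}: {ratio:.2f}")
```

Because the table values are rounded, the ratios are only indicative. For basc they suggest that befficientdesign and efficientdesign would need roughly 2.25 times as many respondents as altscfeff or orthodesign to reach the same precision.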

Statistical power

Table 4 shows the power (at a 5% significance level) for all designs.

Table 4: Power simulations for Conditional Logit DGP

Design	Power in % (5% significance level)
altscfeff	31.1
befficientdesign	5.3
efficientdesign	4.3
orthodesign	29.0
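Power can be counted per parameter or jointly over all parameters, and the two definitions can give very different numbers. This report does not state which definition underlies Table 4, so the sketch below shows both. The p-values are invented for illustration and the function names are hypothetical.

```python
def power_per_parameter(pvalues_by_run, alpha=0.05):
    """Share of runs in which each parameter is individually significant."""
    n_runs = len(pvalues_by_run)
    names = pvalues_by_run[0].keys()
    return {
        name: sum(run[name] < alpha for run in pvalues_by_run) / n_runs
        for name in names
    }


def joint_power(pvalues_by_run, alpha=0.05):
    """Share of runs in which all parameters are significant at the same time."""
    n_runs = len(pvalues_by_run)
    return sum(all(p < alpha for p in run.values()) for run in pvalues_by_run) / n_runs


# Invented p-values from four simulation runs with two parameters each.
runs = [
    {"baction1": 0.03, "badvisory1": 0.001},
    {"baction1": 0.20, "badvisory1": 0.002},
    {"baction1": 0.04, "badvisory1": 0.060},
    {"baction1": 0.01, "badvisory1": 0.000},
]
per_parameter = power_per_parameter(runs)
joint = joint_power(runs)
print(per_parameter)
print(joint)
```

Joint power can never exceed the lowest per-parameter power, so a single weakly identified parameter is enough to pull it down.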

Illustration of simulated parameter values

To facilitate interpretation and judgement of the different designs, we plot the densities of the estimated parameters from the four experimental designs.

[Figures: density plots of the estimated parameters by design, one panel each for basc, baction1, badvisory1, bpartner1, bcomp1, basc2, baction2, badvisory2, bpartner2 and bcomp2.]