Simulation experimental design

Published: April 16, 2025

Preliminary information

This document describes how we generate and test different designs for a study with five attributes and two alternatives. Finding the best design involves the following steps.

1. Generating different designs: The parameters used to generate the designs are denoted as priors.
2. Simulating a data set: The simulation assumes a clearly defined data generating process (DGP). The DGP is specified in terms of the error variance, the utility parameters and the functional form of the utility function. If we assume a latent class structure, we use different utility functions for different people or groups of people. The parameters used to generate the data are denoted as true parameters.
3. Estimating a model on the simulated data set: Here, we can estimate a model that is exactly the same as the DGP or a model that deviates from the DGP. We denote the parameters estimated in each model as the estimated parameters.
4. We repeat steps 2 and 3 $N$ times to infer which design is the best. Here we look at statistical power (at a 5% significance level), unbiasedness and efficiency.
   - Power: The share of the $N$ simulations in which we find an effect, knowing that the effect exists.
   - Unbiasedness: No single estimation returns exactly the true parameter, because the DGP involves randomness. The estimated parameters should, however, fluctuate around the true parameter. Thus, the mean of the $N$ estimated parameters should be approximately equal to the true parameter.
   - Efficiency: This is a relative measure. We consider a design efficient if it has lower mean estimated standard errors (and, equivalently, lower p-values) than all other designs.
5. Once we have found a design, we check each choice set manually to make sure the choice sets do not appear bogus to the respondents. If we find a bogus choice set, we either take another design or change this choice set.
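
Steps 2 to 4 can be sketched in a few lines of code. The following is a minimal illustration, not the simulation code used for this report: it assumes a single attribute, two alternatives, a plain (non-robust) Wald test, and a hypothetical design vector `design` holding the attribute difference between the two alternatives in each choice task. All function and variable names are invented for this sketch.

```python
import math
import random
from statistics import NormalDist


def choice_prob(beta, dx):
    """P(choose alternative A) in a two-alternative logit; dx = x_A - x_B."""
    return 1.0 / (1.0 + math.exp(-beta * dx))


def fisher_info(beta, design):
    """Fisher information of beta for the given attribute differences."""
    total = 0.0
    for d in design:
        p = choice_prob(beta, d)
        total += p * (1.0 - p) * d * d
    return total


def fit_logit(y, design, max_iter=50, tol=1e-10):
    """Newton-Raphson MLE for one coefficient; returns (estimate, std. error)."""
    b = 0.0
    for _ in range(max_iter):
        score = sum((yi - choice_prob(b, d)) * d for yi, d in zip(y, design))
        step = score / fisher_info(b, design)
        b += step
        if abs(step) < tol:
            break
    return b, 1.0 / math.sqrt(fisher_info(b, design))


def run_simulation(true_beta, design, n_runs, seed=1, alpha=0.05):
    rng = random.Random(seed)
    estimates, std_errors, p_values = [], [], []
    for _ in range(n_runs):
        # Step 2: simulate choices from the DGP (1 = alternative A chosen).
        y = [1 if rng.random() < choice_prob(true_beta, d) else 0 for d in design]
        # Step 3: estimate the model on the simulated data set.
        b, se = fit_logit(y, design)
        estimates.append(b)
        std_errors.append(se)
        # Two-sided Wald p-value (assumption: not the robust p-value of the report).
        p_values.append(2.0 * (1.0 - NormalDist().cdf(abs(b / se))))
    # Step 4: summarise the runs.
    return {
        "power": sum(p < alpha for p in p_values) / n_runs,
        "bias": sum(estimates) / n_runs - true_beta,
        "mean_se": sum(std_errors) / n_runs,
    }


# Hypothetical design: 1000 choice tasks with attribute differences of +1 or -1.
design = [1.0, -1.0] * 500
summary = run_simulation(true_beta=0.25, design=design, n_runs=200)
print(summary)
```

Running the same function with several candidate `design` vectors and comparing `mean_se` across them mirrors the efficiency comparison described above.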

Note: Unbiasedness and efficiency are independent of the error variance and effect size. Power, in contrast, depends on the magnitude of the true parameters and on the error variance. As we will not know these values before we have collected the data, power is a somewhat arbitrary measure. The best strategy is to take the parameters from a similar study and multiply them by a constant $c < 1$ to obtain the true parameters. This gives a rather conservative power estimate. Still, we can always fool ourselves (and reviewers) by using parameters that give us the power we want.
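
How strongly power reacts to the scaling constant $c$ can be illustrated with a normal approximation. The sketch below assumes a two-sided Wald test and an estimator that is approximately normal around the true parameter; the values of `prior` and `se` are hypothetical and are not taken from this study.

```python
from statistics import NormalDist


def wald_power(beta, se, alpha=0.05):
    """Approximate power of a two-sided Wald test of H0: beta = 0,
    assuming the estimator is normal with mean beta and standard error se."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    shift = abs(beta) / se
    upper = 1.0 - NormalDist().cdf(z_crit - shift)  # reject in the upper tail
    lower = NormalDist().cdf(-z_crit - shift)       # reject in the lower tail
    return upper + lower


prior = 0.33  # hypothetical estimate from a similar study
se = 0.08     # hypothetical standard error under the candidate design

for c in (1.0, 0.75, 0.5):
    true_par = c * prior
    print(f"c = {c:.2f}  true parameter = {true_par:.3f}  power = {wald_power(true_par, se):.3f}")
```

A smaller $c$ shrinks the assumed effect and therefore lowers the computed power, which is why a choice of $c < 1$ yields the more conservative estimate.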

For the simulations, we use 1500 respondents for a DGP with homogeneous preferences (conditional logit model).

Simulation

Conditional logit

The following presents the results from a DGP of the conditional logit model. This is the standard approach and assumes homogeneous preferences.
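
For reference, the standard conditional logit model can be written as follows. The notation is an assumption of this sketch and does not appear elsewhere in the report: $n$ indexes respondents, $j$ the alternatives (here $J = 2$), $x_{nj}$ the attribute levels, $\beta$ the utility parameters, and $\varepsilon_{nj}$ an i.i.d. type I extreme value error.

$$
U_{nj} = V_{nj} + \varepsilon_{nj}, \qquad V_{nj} = x_{nj}'\beta, \qquad P_{nj} = \frac{\exp(V_{nj})}{\sum_{k=1}^{J} \exp(V_{nk})}
$$

Homogeneous preferences mean that the same $\beta$ applies to every respondent.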

The simulation uses 1500 respondents and 500 runs. It took 2 h 13 min 38 s.

The simulation is based on the output file PreregUpload/simulationresults_Pretest_2_powa95sim500respondents1500.RDS

Unbiasedness

Table 1 shows the true parameters used for the simulation. Note that these are the average latent class values multiplied by 0.75.

Table 1: True parameter values used in the DGP

parname	truepar
bescorganisator2	0.25
bescorganisator3	-0.25
bparticipation2	-0.25
bparticipation3	0.28
bmotive2	0.25
bmotive3	0.44
bmotive4	0.37
bsimplicity	-0.20
bprice2	-0.92
bprice3	-1.07
bprice4	-2.07
bprice5	-1.68
bprice6	-2.62
bsq	-0.25

Table 2 shows summary statistics of the estimated parameters for the 500 runs. We want to make sure that they are nearly equal to the true parameters.

Table 2: Means and medians of estimated parameters over all runs

parname	truepar	Pretest214.mean	Pretest214.median
bescorganisator2	0.25	0.25	0.25
bescorganisator3	-0.25	-0.25	-0.25
bparticipation2	-0.25	-0.25	-0.25
bparticipation3	0.28	0.28	0.27
bmotive2	0.25	0.25	0.25
bmotive3	0.44	0.44	0.43
bmotive4	0.37	0.38	0.38
bsimplicity	-0.20	-0.20	-0.20
bprice2	-0.92	-0.92	-0.92
bprice3	-1.07	-1.07	-1.07
bprice4	-2.07	-2.07	-2.07
bprice5	-1.68	-1.69	-1.69
bprice6	-2.62	-2.62	-2.62
bsq	-0.25	-0.25	-0.25
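
The comparison between true parameters and mean estimates can also be done programmatically. The following sketch copies the `truepar` and `Pretest214.mean` columns from Table 2 and lists the parameters whose rounded mean differs from the true value; the variable names are chosen for this illustration only.

```python
# (true value, mean over runs) copied from Table 2
table2 = {
    "bescorganisator2": (0.25, 0.25),
    "bescorganisator3": (-0.25, -0.25),
    "bparticipation2": (-0.25, -0.25),
    "bparticipation3": (0.28, 0.28),
    "bmotive2": (0.25, 0.25),
    "bmotive3": (0.44, 0.44),
    "bmotive4": (0.37, 0.38),
    "bsimplicity": (-0.20, -0.20),
    "bprice2": (-0.92, -0.92),
    "bprice3": (-1.07, -1.07),
    "bprice4": (-2.07, -2.07),
    "bprice5": (-1.68, -1.69),
    "bprice6": (-2.62, -2.62),
    "bsq": (-0.25, -0.25),
}

# Deviation of the mean estimate from the true value, at the table's precision.
deviations = {
    name: round(mean_est - true, 2) for name, (true, mean_est) in table2.items()
}

flagged = [name for name, dev in deviations.items() if dev != 0]
largest = max(abs(dev) for dev in deviations.values())

print(flagged, largest)  # ['bmotive4', 'bprice5'] 0.01
```

At the two-decimal precision reported in the table, the largest deviation is 0.01, which is consistent with an unbiased design.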

Efficiency

Table 3 shows summary statistics of the robust p-values over all runs.

Table 3: Summary statistics of estimated robust p values over all runs

	parname	Pretest214.mean	Pretest214.sd	Pretest214.min	Pretest214.max	Pretest214.range	Pretest214.se	Pretest214.median	Pretest214.skew	Pretest214.kurtosis
15	rob_pval0_bescorganisator2	0.00	0.00	0	0.01	0.01	0	0	12.76	161.01
16	rob_pval0_bescorganisator3	0.00	0.01	0	0.17	0.17	0	0	19.37	403.17
17	rob_pval0_bparticipation2	0.00	0.00	0	0.01	0.01	0	0	7.69	57.27
18	rob_pval0_bparticipation3	0.00	0.00	0	0.01	0.01	0	0	12.76	161.01
19	rob_pval0_bmotive2	0.00	0.00	0	0.06	0.06	0	0	7.33	66.09
20	rob_pval0_bmotive3	0.00	0.00	0	0.00	0.00	0	0	NaN	NaN
21	rob_pval0_bmotive4	0.00	0.00	0	0.00	0.00	0	0	NaN	NaN
22	rob_pval0_bsimplicity	0.00	0.00	0	0.04	0.04	0	0	12.92	184.79
23	rob_pval0_bprice2	0.00	0.00	0	0.00	0.00	0	0	NaN	NaN
24	rob_pval0_bprice3	0.00	0.00	0	0.00	0.00	0	0	NaN	NaN
25	rob_pval0_bprice4	0.00	0.00	0	0.00	0.00	0	0	NaN	NaN
26	rob_pval0_bprice5	0.00	0.00	0	0.00	0.00	0	0	NaN	NaN
27	rob_pval0_bprice6	0.00	0.00	0	0.00	0.00	0	0	NaN	NaN
28	rob_pval0_bsq	0.01	0.03	0	0.44	0.44	0	0	7.63	76.10

Statistical power

Table 4 shows the power (at the 5% significance level) for all designs.

Table 4: Power simulations for Conditional Logit DGP

Design	Power (95%)
Pretest214	95

Illustration of simulated parameter values

To facilitate interpretation and judgement of the designs, we plot the densities of the estimated parameters for each experimental design.

[Density plots of the estimated parameters, one panel per parameter: bescorganisator2, bescorganisator3, bparticipation2, bparticipation3, bmotive2, bmotive3, bmotive4, bsimplicity, bprice2, bprice3, bprice4, bprice5, bprice6, bsq]