Modeling Preferences for Lifestyle Interventions in Individuals with Overweight or Obesity and Chronic Low Back Pain

A Discrete Choice Experiment Using Random Utility Models

Author

Affiliations

Alexander Schurz

Bern University of Applied Sciences, Switzerland

Vrije Universiteit Brussels, Belgium

Bern University, Switzerland

Published

February 23, 2026

Code

knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
library(kableExtra)
library(tidyverse)
library(pakret)

1 Study aim

The primary aim of this study is to conduct a DCE to evaluate the preferences of individuals with overweight or obesity and CLBP, and to determine the relative importance of key attributes characterising lifestyle interventions tailored to this population.

2 Methods

Each alternative is described in terms of $p$ attributes, where each attribute has $l$ different levels. A literature review was conducted to identify pertinent attributes of lifestyle interventions in this area. The number of attributes in a DCE is limited to avoid cognitive overload, as too many attributes can complicate participants’ decision-making. As a result, the number of attributes was refined, following consultations with experts in CLBP care (three healthcare researchers, two physiotherapists, two general practitioners, and one psychologist) as well as two CLBP patients. To facilitate the scalability of the pain reduction attribute, predetermined values corresponding to the minimal clinically important difference (MCID) for numeric rating scales in individuals with low back pain were used (Lauridsen et al. 2006). The finalized set of attributes and their corresponding levels underwent external validation by health professionals and patients to ensure the credibility and relevance of the design.

Attribute	Variable	Levels	Levels (R)
Type of realization	`type`	On-site, individual session	`ffs`
		On-site, group session	`ffg`
		Videocalls	`vc`
		Online, time & place independent	`oti`
Content of program	`content`	Dietary intervention	`D`
		Exercise program	`E`
		Dietary and exercise program	`ED`
Weekly time effort	`effort`	2 h	`2`
		3 h	`3`
		4 h	`4`
		5 h	`5`
Expected weight loss	`weight`	5 kg	`5`
		10 kg	`10`
		15 kg	`15`
		20 kg	`20`
Pain reduction	`pain`	10%	`10`
		20%	`20`
		30%	`30`
		40%	`40`
		50%	`50`
Treatment costs	`cost`	500 CHF	`500`
		1000 CHF	`1000`
		1500 CHF	`1500`
		2000 CHF	`2000`

Additionally, we have the following person-specific attributes:

BMI
age
sex
Fanchise
FABQ_PA: Fear Avoidance Belief Questionnaire regarding physical activity (Staerkle et al. 2004; Pfingsten et al. 2000; Monticone et al. 2020)
FABQ_W: Fear Avoidance Belief Questionnaire regarding Work (Staerkle et al. 2004; Pfingsten et al. 2000; Monticone et al. 2020).

Code

Realization.type <- factor(
  c("face-to-face_single", "face-to-face_group", "videocalls", "online-time & place independent"),
  levels = c("face-to-face_single", "face-to-face_group", "videocalls", "online-time & place independent")
  )

Content <- factor(
  c("D+E", "Dietary intervention", "Exercise program"), 
  levels = c("D+E", "Dietary intervention", "Exercise program")
  )

Average.effort.per.week <- seq(2, 5, by = 1)

Weight.loss <- seq(-5,-20, by = -5)

Pain.reduction <- seq(-10,-50, by = -10)

Total.costs <- seq(500, 2000, by = 500)

Attributes <- list(Realization.type, Content, Average.effort.per.week, Weight.loss, Pain.reduction, Total.costs)

Levels <- c(4, 3, 4, 4, 5, 4)

names(Attributes) <- c("Realization.type", "Content", "Average.effort.per.week", "Weight.loss", "Pain.reduction", "Total.costs")

A full factorial design was initially created. There are $\prod_{q = 1}^{p} l_q = 3840$ possible combinations of choice sets. From the full factorial design, we created an efficient fractional design with 30 choice sets by using the Federov’s exchange algorithm (R. E. Wheeler 2004) from the R package AlgDesign. The developed unlabelled choice sets were divided into two blocks of 15 choice sets to reduce the workload of the patients. Each choice set included two unlabelled alternatives (alternative A and alternative B) describing different lifestyle interventions, as well as one opt-out option, allowing participants to choose not to participate in such a treatment (Campbell and Erdem 2019).

Code

library(AlgDesign) 
mat.fullfactorial <- gen.factorial(Levels, 
                     varNames = names(Attributes),
                     factors = "all")

list.fullfactorial.qual <- eval.design(~., mat.fullfactorial)

Code

set.seed(123) # Do not change this line!
list.fractional <- optFederov(~., mat.fullfactorial, nTrials = 30, approximate = FALSE) 
df.fractional <- data.frame(list.fractional$design)

The final choice sets, which align with the fractional design were created using the R package support.CEs (Aizaki 2012b).

The final choice sets are shown in appendix. An example of one choice set is presented in Figure 1.

Code

library(support.CEs)
library(apollo)

if(getRversion() >= "4.3.2") RNGkind(sample.kind = "Rounding") # this is necessary to get the right version of the used questionnaire

# Perform rotation design
set.seed(123) # This must stay the same during the whole procedure!
list.design <- rotation.design(
 candidate.array = df.fractional,
 attribute.names = Attributes,
 nalternatives = 2,
 nblocks = 2,
 row.renames = FALSE,
 randomize = TRUE, # FALSE: corresponds to rotation method, if set on TRUE: mix-and-match method
 seed = 123)

Patients were eligible to complete the DCE if they were 18 years or older, classified as overweight or obese (BMI > 25), and suffered from CLBP. These eligibility criteria were assessed at the start of the online questionnaire. A dominance test was administered prior to the DCE to ensure participants’ understanding of the task.

Following the completion of the 15 choice sets, participants were asked to complete the Fear Avoidance Belief Questionnaire (FABQ) (Staerkle et al. 2004; Pfingsten et al. 2000; Monticone et al. 2020). In addition, participants provided demographic information, including sex, age, weight, height, and the level of their Swiss health insurance coverage (Franchise). The subjective difficulty of the questionnaire was also assessed.

To distribute the online survey, flyers containing QR codes were distributed to medical doctors and physiotherapists who treat patients within the relevant population, as well as to the target population directly. The survey was further disseminated across Switzerland via social media advertisements on platforms such as Instagram and Facebook.

According to de Bekker-Grob and colleagues (2015), the following rule of thumb is proposed for calculating the required sample size for a DCE:

\[ N > \frac{500 \times c}{t \times a}, \]

where:

$N$ is the required sample size
$c$ represents the number of analysis cells
$t$ is the number of choice tasks per respondent
$a$ is the number of alternatives per choice task.

To account for all two-way interactions, the number of analysis cells $c$ should correspond to the largest product of the levels of two attributes. In our case, we have 4 levels for one attribute and 5 levels for another attribute, thus $c = 4 \times 5 = 20$.

Thus, at least 333 respondents are needed.

3 Statistial analysis

Descriptive statistics were applied to summarize demographics and clinical characteristics of the respondents, as well as the outcomes of other patient-reported measurement scores.

We estimated logit models in R using the mlogit package. The dependent variable was binary, indicating whether an alternative was chosen (0 = not chosen, 1 = chosen) within each choice set. Choices were nested within choice sets, which in turn were nested within decision makers. The independent variables were the attributes describing the alternatives as well as decision-maker-specific attributes. More formally, a decision maker, labeled $i$, faces a choice among $J$ alternatives. The decision maker chooses the alternative with the highest utility $U_{ij}$, $j = 1, ..., J$: choose alternative $k$ if $U_{ik} > U_{ij} \forall j \ne k$.

We can’t observe the decision maker’s utility, but only some attributes of the alternatives, labeled $Z_{ij} \forall j$ , and some attributes of the decision maker, labeled $X_i$. The function $V_{ij} = V (X_i, Z_{ij}) \forall j$ represents the observed/systematic part of $U_{ij}$. Therefore:

\[ U_{ij} = V_{ij} + \epsilon_{ij}. \]

$\epsilon_{ij}$ is the stochastic part, treated as random. The random vector $\epsilon_i = (\epsilon_{i1}, \epsilon_{i2}, \dots, \epsilon_{iJ})$ has a joint density function $f(\epsilon_i)$.

We modeled $V_{ij}$ as a linear function of $X_i$ and $Z_{ij}$:

\[ V_{ij} = \beta^T X_i + \gamma^T Z_{ij}, \] where $\beta$ and $\gamma$ are generic parameter vectors representing the effects of the individual-specific ($X_i$) and alternative-specific ($Z_ij$) variables, respectively. The parameters are generic and not alternative specific, because we have an unlabelled DCE.

Note that we model $V_{ij}$ and not the probability of choosing an alternative. Thus, only the difference of $V_{ik}-V_{ij}$ is relevant.

The probability that decision maker $i$ chooses alternative $k$ is: \[\begin{align*} P_{ik} &= \Pr(U_{ik} > U_{ij} \; \forall j \ne k) \\ &= \Pr(V_{ik} + \epsilon_{ik} > V_{ij} + \epsilon_{ij} \; \forall j \ne k) \\ &= \Pr(\epsilon_{ij} - \epsilon_{ik} < V_{ik} - V_{ij} \; \forall j \ne k). \end{align*}\]

Using the density $f(\epsilon_{i})$, the cumulative probability that each random term $\epsilon_{ik} − \epsilon_{ij}$ is below the observed quantity $V_{ik} − V_{ij}$ is:

\[ \int I[\epsilon_{ik} − \epsilon_{ij} < V_{ik} − V_{ij}] f(\epsilon_i)d\epsilon_i \]

where $I(·)$ is the indicator function equal to 1 when the expression in parentheses is true and 0 otherwise. It turns out that the solution to this integral is:

\[ P_{ij} = \frac{e^{V_{ij}}}{\Sigma_{k = 1}^J e^{V_{ik}}} = \frac{e^{\beta^TX_{i}+\gamma^TZ_{ij}}}{\Sigma_{k = 1}^J \ e^{\beta^TX_{i} + \gamma^TZ_{ik}}} \]

where:

$P_{ij}$ is the probability that person $i$ chooses alternative $j$.
$X_i$ are the attributes of person $i$.
$Z_{ij}$ are the attributes of alternative $j$ for person $i$.
$k$ in the denominator sums over all alternatives (from 1 to $J$).

Assumption: $\epsilon_{ij} \sim$ i.i.d. type I extreme value (Gumbel) with $Var(\epsilon_{ij}) = \frac{\pi^2}{6}$.

Depending on the type of variables included in the model, different names are used to describe the resulting logit model. There is no universally consistent nomenclature. However, a common convention is:

When $V_{ij} = \beta^TX_{i}$ $\to$ multinomial logit model.
When $V_{ij} = \gamma^TZ_{ij}$, $\to$ conditional logit model.
When $V_{ij} = \beta^TX_{i}+\gamma^TZ_{ij}$, $\to$ mixed logit model.

Note that sometimes mixed logit models refer to models with random coefficients.

An additional (reference) level None for the two categorical alternative-specific attributes type and content was introduced. For the quantitative attributes effort, pain, weight and cost, the value 0 was used. This creates a meaningful opt-out option, but with the disadvantage of estimating two more coefficients. This way, the estimates are the effects ($\log(Odds~ratios)$), relative to opt-out (effort = 0, pain = 0, weight = 0, cost = 0, content = None). In other words:

\[ V_{opt-out} = intercept = \gamma_0. \]

We are interested in whether fear-avoidance beliefs (FABQ_W and FABQ_PA) interact with the weekly effort required for therapy. Since this is an unlabelled DCE, modeling FABQ_W and FABQ_PA as person-specific variables using the mlogit() function is not appropriate: this would estimate alternative-specific coefficients $\beta$, which are not meaningful when alternatives are unlabelled.

Therefore, to investigate preference heterogeneity in a meaningful way, we included generic interaction terms between effort and the fear-avoidance variables, i.e., $effort \times FABQ_W$ and $effort \times FABQ_{PA}$.

Furthermore, we are interested in the interaction $cost \times franchise$.

While our aim was to model all alternative-specific attributes, the current sample size is not sufficient to estimate all parameters. Therefore, I excluded the variable content from the model, as it is of comparatively lower practical relevance for the research question.

We model the observable part of utility $V_{ij}$ as a linear function of alternative-specific attributes $Z_{ij}$ and interactions between the individual-specific variable FABQ_W and FABQ_PA and effort:

\[ \begin{aligned} V_{ij} =\ & \gamma_1 \times effort_{ij} + \gamma_2 \times pain_{ij} + \gamma_3 \times weight_{ij} + \gamma_4 \times cost_{ij} \\ & + \gamma_5 \times type_{oti,i} + \gamma_6 \times type_{vc,i} + \gamma_7 \times type_{ffg,i} + \gamma_8 \times type_{ffs,i} \\ & + \delta_1 \times (effort_{ij} \times FABQ_{PA,i}) + \delta_2 \times (effort_{ij} \times FABQ_{W,i}) + \delta_3 \times (cost_{ij} \times Franchise_i) \end{aligned} \]

where:

$\gamma_1, \dots, \gamma_8$ are the generic coefficients for the alternative-specific attributes (type was dummy coded)
$\delta_1, \dots, \delta_3$ capture the interaction effects of individual-specific variables $FABQ_{PA}$ and $FABQ_W$ with the attribute effort and the individual-specific variable Franchise with the attribute cost.

3.1 Main effects

In a linear model, coefficients represent the marginal effects of explanatory variables on the dependent variable. This direct interpretation does not hold in the case of a multinomial logit model. Nevertheless, useful insights can be derived through appropriate transformations of the coefficients. For variables that are specific to an alternative, the sign of the coefficient can still be directly interpreted. The marginal effect in this context is calculated by multiplying the coefficient by the product of two probabilities, which has a maximum value of 0.25. As a rule of thumb, dividing the coefficient by four provides an upper bound estimate of the marginal effect (Croissant 2020b). However, marginal utilities (i.e., the raw coefficients themselves) cannot be directly interpreted in isolation because they’re on a latent utility scale, meaning they don’t have a natural unit or reference point. What can be interpreted are ratios of these coefficients, which give us the marginal rate of substitution (MRS) between attributes:

\[ MRS = \frac{\left| \gamma_1 \right|}{\left| \gamma_2 \right|}. \]

In other words, the MRS tells us how much of one attribute a person is willing to give up to gain more of another attribute, while keeping their overall utility constant. For example, in the present DCE, the MRS between effort and pain would tell us how much time effort someone is willing to invest for an additional unit (%) of pain reduction. In our DCE, we were interested in the MRS between effort and pain, effort and weight, cost and pain as well as between cost and weight.

The range of the marginal effects of each attribute can be used to assess its relative importance. For each attribute, the range is the difference between its maximum and minimum marginal effects across all choices. The relative importance of each attribute can be calculated by normalizing the range of the attribute’s coefficient relative to the sum of the ranges for all attributes. This gives the proportion of the total effect attributable to each attribute. The relative importance of attribute $j$ is given by:

\[ \text{Relative Importance of Attribute } j = \frac{\left| (\max(Z_{ij}) - \min(Z_{ij})) \cdot \gamma_j \right|}{\sum_{j=1}^{J} \left| (\max(Z_{ij}) - \min(Z_{ij})) \cdot \gamma_j \right|} \]

Thus, the relative importance represents the share of the total variation in utility that is explained by each attribute. Relative importance was determined from a model containing alternative-specific attributes only.

The basic bootstrap confidence interval for $MRS$ and the relative importance is given by:

\[ \left[ 2\hat{\theta}_n - q^*_{1 - \alpha/2}, \; 2\hat{\theta}_n - q^*_{\alpha/2} \right] \]

Where: - $\hat{\theta}_n$ is the observed estimate from the original sample. - $ q^*_{} $ is the $\alpha$-quantile of the bootstrap distribution of the estimate. - $ $ is the significance level

Specific contrasts were tested for the categorical variable type. Let the coefficients for type levels be represented as:

$\gamma_{\text{oti}}$ for oti
$\gamma_{\text{vc}}$ for vc
$\gamma_{\text{ffg}}$ for ffg
$\gamma_{\text{ffs}}$ for ffs

To test whether the alternatives oti and vc jointly differ from ffg and ffs (i.e. online versus face-to-face), we specified the following contrast:

\[ \frac{1}{2} \left( \gamma_{\text{oti}} + \gamma_{\text{vc}} \right) - \frac{1}{2} \left( \gamma_{\text{ffg}} + \gamma_{\text{ffs}} \right). \]

This can be interpreted as comparing the average utility of choosing either oti or vc to that of choosing ffg or ffs.

Furthermore, pairwise comparisons for any two alternatives (e.g., oti and vc) were performed, which simply is the difference between their corresponding coefficients, for example $\gamma_{\text{oti}} - \gamma_{\text{vc}}$.

3.2 R functions

R version 4.4.2 (R Core Team 2025), together with the following R packages was used:

the ‘tidyverse’ package version 2.0.0 (Wickham et al. 2019) for data manipulation
the ‘AlgDesign’ package version 1.2.1.2 (B. Wheeler 2025) to create a fractional design.
the ‘support.CEs’ package version 0.7-0 (Aizaki 2012a) to create choice sets.
the ‘mlogit’ package version 1.1-3 (Croissant 2020a) to fit the models
the ‘multcomp’ package version 1.4-29 (Hothorn, Bretz, and Westfall 2008) to test contrasts
the ‘lmtest’ package version 0.9-40 (Zeileis and Hothorn 2002) for likelihood ratio tests.

4 Results

Code

df.qualtrics.raw <- rio::import("../data/DCE_data.csv")

# remove the first two rows (tries by the author)
df.qualtrics.adapted <- df.qualtrics.raw[-c(1:2), ]

Code

library(tableone)
library(officer)


# correct data types
df.qualtrics.adapted <- df.qualtrics.adapted %>%
  mutate_at(vars(S_1:Difficulty_survey_1), as.integer) %>% 
  mutate(Progress=as.numeric(Progress)) %>% 
  mutate_at(vars(Age_1:Height_1),as.numeric) %>% 
  mutate(ID = 1:nrow(df.qualtrics.adapted))

# Rename variables 
# Screening questionnaires S_1 & S_2--> applicable/not applicable
# Sex --> male/female/not binary/other/not specified
# Franchise --> 300/500/1000/1500/2000/2500CHF
df.qualtrics.adapted <- df.qualtrics.adapted %>%
  mutate(S_1 = ifelse(S_1 == 1, "overweight", "normal weight"),
         S_2 = ifelse(S_2 == 1, "CLBP", "no CLBP"),
         Sex = case_when(
           Sex == 1 ~ "Male",
           Sex == 2 ~ "Female",
           Sex == 3 ~ "not binary",
           Sex == 4 ~ "other",
           Sex == 5 ~ "not specified"
         ),
         Sex_bin = case_when(
           Sex == "Male" ~ "Male",
           Sex == "Female" ~ "Female",
           Sex == "not binary" ~ NA,
           Sex == "other" ~ NA,
           Sex == "not specified" ~ NA
         ),
         Franchise = case_when(
           Franchise == 1 ~ 300,
           Franchise == 2 ~ 500,
           Franchise == 3 ~ 1000,
           Franchise == 4 ~ 1500,
           Franchise == 5 ~ 2000,
           Franchise == 6 ~ 2500
         )
  )

# Calculation of outcomes from FABQ for the 2 subcategories "physical activity" (PA) and "workload" (W)
# Physical activity subscore = sum of questions 2-5
# Work subscore = sum of questions 6, 7, 9, 10, 11, 12, 15 = FABQ_W_1,2,4,5,6,7,10
df.qualtrics.adapted$FABQ_PA <- df.qualtrics.adapted$FABQ_PA_2 + df.qualtrics.adapted$FABQ_PA_3 + 
  df.qualtrics.adapted$FABQ_PA_4 + df.qualtrics.adapted$FABQ_PA_5

df.qualtrics.adapted$FABQ_W <- df.qualtrics.adapted$FABQ_W_1 + 
  df.qualtrics.adapted$FABQ_W_2 +df.qualtrics.adapted$FABQ_W_4 +
  df.qualtrics.adapted$FABQ_W_5 + df.qualtrics.adapted$FABQ_W_6 + df.qualtrics.adapted$FABQ_W_7 +
  df.qualtrics.adapted$FABQ_W_10

# Add a variable called "BLOCK" 
# All questions with NA values are now allocated to BLOCK 2. This causes the imbalance between BLOCK 1 and 2
df.qualtrics.adapted <- df.qualtrics.adapted %>%
  mutate(BLOCK = ifelse(is.na(Q1.1),2,1))

# Add a variable called BMI based on Weight_1 and Height_1
df.qualtrics.adapted <- df.qualtrics.adapted %>% 
  mutate(BMI = round(Weight_1 / ((Height_1 / 100)^2), 1))

# A variable called "Dominance_test" is added for the variable T1 
df.qualtrics.adapted$Dominance_test <- df.qualtrics.adapted$`T1`

# Responses for "Dominance_test" are adapted to failed (=1) and passed (=2 or 3)
df.qualtrics.adapted$Dominance_test <- ifelse(
  df.qualtrics.adapted$Dominance_test == 1, "failed",
  ifelse(df.qualtrics.adapted$Dominance_test == 2, "passed",
         ifelse(df.qualtrics.adapted$Dominance_test == 3, "passed", NA)))

Block.1 <- df.qualtrics.adapted %>% 
  filter(BLOCK == 1) %>% 
  dplyr::select(ID,BLOCK,Q1.1:Q1.15, 
         Age_1, Sex, Sex_bin, BMI, Franchise, FABQ_PA, FABQ_W) # predictors: Age, Sex, BMI, Franchise, FABQ_PA, FABQ_W
names(Block.1) <- c("ID", "BLOCK", "q1", "q2", "q3", "q4", "q5", "q6",
                    "q7", "q8", "q9", "q10", "q11", "q12", "q13", "q14", "q15", 
                    "age", "sex", "sex_bin", "BMI", "Franchise", "FABQ_PA", "FABQ_W")

Block.2 <- df.qualtrics.adapted %>% 
  filter(BLOCK == 2) %>% 
  dplyr::select(ID,BLOCK,Q2.1:Q2.15, 
         Age_1, Sex, Sex_bin, BMI, Franchise, FABQ_PA, FABQ_W) # predictors: Age, Sex, BMI, Franchise, FABQ_PA, FABQ_W
names(Block.2) <- c("ID", "BLOCK",
                    "q1", "q2", "q3", "q4", "q5", "q6", "q7", "q8",
                    "q9", "q10", "q11", "q12", "q13", "q14", "q15",
                    "age", "sex", "sex_bin", "BMI", "Franchise", "FABQ_PA", "FABQ_W")

df.qualtrics.prep <- rbind(Block.1, Block.2)

Code

df.qualtrics.prep.long <- df.qualtrics.prep %>% 
  pivot_longer(starts_with("q"), values_to = "choice", names_to = "question") %>% 
  mutate(QES = parse_number(question))


df.design.dce <- rbind(list.design$alternatives$alt.1, list.design$alternatives$alt.2) %>% 
  pivot_wider(names_from = ALT, values_from = c(Realization.type, Content, Average.effort.per.week, Weight.loss, Pain.reduction, Total.costs)) %>% 
  mutate(Realization.type_3 = "None",
         Content_3 = "None",
         Average.effort.per.week_3 = 0,
         Weight.loss_3 = 0,
         Pain.reduction_3 = 0,
         Total.costs_3 = 0) %>% 
  mutate(across(Average.effort.per.week_1:Total.costs_2, ~ parse_number(as.character(as.factor(.))))) %>% 
  mutate(across(Realization.type_1:Realization.type_2, ~ fct_recode(as.factor(.),
                                  "ffs" = "face-to-face_single",
                                  "ffg" = "face-to-face_group",
                                  "vc" = "videocalls",
                                  "oti" = "online-time & place independent"))) %>% 
  mutate(across(Content_1:Content_2, ~ fct_recode(as.factor(.),
                                  "ED" = "D+E",
                                  "E" = "Exercise program",
                                  "D" = "Dietary intervention"))) %>% 
  dplyr::select(BLOCK, QES, sort(setdiff(names(.), c("BLOCK", "ALT"))))


df.dce.wide <- left_join(df.qualtrics.prep.long, df.design.dce, by = c("BLOCK", "QES")) %>% 
  rename(effort1 = Average.effort.per.week_1,
         effort2 = Average.effort.per.week_2,
         effort3 = Average.effort.per.week_3,
         content1 = Content_1,
         content2 = Content_2,
         content3 = Content_3,
         pain1 = Pain.reduction_1,
         pain2 = Pain.reduction_2,
         pain3 = Pain.reduction_3,
         type1 = Realization.type_1,
         type2 = Realization.type_2,
         type3 = Realization.type_3,
         cost1 = Total.costs_1,
         cost2 = Total.costs_2,
         cost3 = Total.costs_3,
         weight1 = Weight.loss_1,
         weight2 = Weight.loss_2,
         weight3 = Weight.loss_3) %>% 
  mutate(across(starts_with("pain"), ~ .x * -1)) %>% 
  mutate(across(starts_with("weight"), ~ .x * -1))

save(df.dce.wide, file = "../data/df-dce-wide.rda")



df.dce.long <- df.dce.wide %>% 
  pivot_longer(cols = effort1:weight3, 
               names_to = c(".value", "ALT"),  
               names_pattern = "([a-zA-Z]+)(\\d)") %>% 
  mutate(
    choice_bin = ifelse(choice == ALT, 1, 0),
    content = as.character(content),
    content = if_else(content == "ED", "D+E", content),
    content = factor(content, levels = c("None", "D", "E", "D+E")),
    type = factor(type, levels = c("None", "oti", "vc", "ffg", "ffs")),
    TC = factor(paste(type, content, sep = ":")),
    STR = 100 * ID + QES) %>% 
  relocate(TC, .after = type)%>% 
  dplyr::select(-type, -content) 

save(df.dce.long, file = "../data/df-dce-long.rda")

# DCE long without TC but content + type separated
df.dce.long.noTC <- df.dce.wide %>% 
  pivot_longer(cols = effort1:weight3, 
               names_to = c(".value", "ALT"),  
               names_pattern = "([a-zA-Z]+)(\\d)") %>% 
  mutate(
    choice_bin = ifelse(choice == ALT, 1, 0),
    content = as.character(content),
    content = if_else(content == "ED", "D+E", content),
    content = factor(content, levels = c("None", "D", "E", "D+E")),
    type = factor(type, levels = c("None", "oti", "vc", "ffg", "ffs")),
    TC = factor(paste(type, content, sep = ":")),
    STR = 100 * ID + QES) %>% 
  relocate(TC, .after = type) %>% 
  dplyr::select(-TC) 

save(df.dce.long.noTC, file = "../data/df-dce-long-noTC.rda")

Code

load("../data/df-dce-long.rda")

library(mlogit)
df.mlogit <- dfidx::dfidx(df.dce.long, idx = list("STR", "ALT"), drop.index = FALSE, pkg = "mlogit")
df.mlogit$TC <- relevel(as.factor(df.mlogit$TC), ref = "None:None")
df.mlogit <- df.mlogit %>% 
  arrange(ID)

Code

library(table1)
# Categorize the variable Progress based on a cutoff of 80 and rename it
# 80% completion of the survey would mean that all choice sets were answered but not further questions (demographics, FABQ)
df.qualtrics.adapted$Progress_cutoff <- ifelse(df.qualtrics.adapted$Progress>=80, ">=80", "<80")


# Define variables of interest
vars.demographics.DCE <- c("Sex", "Age_1", "Height_1", "Weight_1", "BMI", "Progress_cutoff", "Dominance_test", "Franchise", "FABQ_PA", "FABQ_W", "Difficulty_survey_1", "Duration (min)")

df.qualtrics.adapted <- df.qualtrics.adapted %>%
  mutate(across(where(is.factor), droplevels)) # Entfernt ungenutzte Faktorlevel

# Format labels for table1
label(df.qualtrics.adapted$Sex_bin) <- "Sex"
label(df.qualtrics.adapted$Age_1) <- "Age (years)"
label(df.qualtrics.adapted$Height_1) <- "Height (cm)"
label(df.qualtrics.adapted$Weight_1) <- "Weight (kg)"
label(df.qualtrics.adapted$BMI) <- "Body Mass Index"
label(df.qualtrics.adapted$Progress_cutoff) <- "Progress cutoff (%)"
label(df.qualtrics.adapted$Franchise) <- "Franchise (CHF)"
label(df.qualtrics.adapted$FABQ_PA) <- "FABQ - Physical activity"
label(df.qualtrics.adapted$FABQ_W) <- "FABQ - Workload"
label(df.qualtrics.adapted$Difficulty_survey_1) <- "Survey difficulty (NAS 0-10)"
label(df.qualtrics.adapted$`Duration (in seconds)`) <- "Duration (sec)"

# Franchise should be reported as a factor in the demographics
df.qualtrics.adapted.2 <- df.qualtrics.adapted %>%
  mutate(
    Franchise = factor(Franchise,
                       levels = c(300, 500, 1000, 1500, 2000, 2500),
                       labels = c("300", "500", "1000", "1500", "2000", "2500")),
    Duration_min = round(`Duration (in seconds)`/60,2)
  ) %>%
  drop_na(Franchise) %>% 
  filter(BMI >= 25.0) %>% 
  labelled::set_variable_labels(
    Sex_bin = "Sex",
    Age_1 = "Age (years)",
    Height_1 = "Height (cm)",
    Weight_1 = "Weight (kg)",
    BMI = "Body Mass Index",
    Progress_cutoff = "Progress cutoff (%)",
    Franchise = "Franchise (CHF)",
    FABQ_PA = "FABQ - Physical activity",
    FABQ_W = "FABQ - Workload",
    Difficulty_survey_1 = "Survey difficulty (NAS 0–10)",
    Duration_min = "Duration (min)"
  )

# Create summary table stratified by BMI category
tbl.DCE.demographics <- table1(~ Sex_bin + `Age_1`+ `BMI`+ `Progress_cutoff`+ `Franchise`+ `FABQ_PA`+ `FABQ_W` + `Difficulty_survey_1` + `Duration_min`,
       data = df.qualtrics.adapted.2, 
       overall = "Overall", 
       droplevels = FALSE)

df.tab1 <- as.data.frame(tbl.DCE.demographics)


library(gtsummary)
library(gt)
library(officer)

# Create a gtsummary table
tbl1.DCE <- df.qualtrics.adapted.2 %>%
  drop_na(Franchise) %>%
  filter(BMI >= 25.0) %>% 
  mutate(
    `BMI category` = case_when(
      BMI >= 25 & BMI < 30 ~ "Overweight",
      BMI >= 30 ~ "Obesity",
      TRUE ~ NA),
    `BMI category` = factor(`BMI category`, levels = c("Overweight", "Obesity"))) %>% 
  dplyr::select(Sex_bin, Age_1, BMI, `BMI category`, `Franchise`, FABQ_PA, FABQ_W) %>%
  tbl_summary(
    statistic = list(all_continuous() ~ "{mean} ± {sd}"),
    digits = all_continuous() ~ 1,
    missing = "no"
  ) %>%
  bold_labels() %>%
  as_flex_table() %>%
  flextable::set_table_properties(width = 1, layout = "autofit") %>%
  flextable::padding(padding.top = 1, padding.bottom = 1) %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 

tbl1.DCE

Characteristic	N = 1591
Sex
Female	96 (60%)
Male	63 (40%)
Age (years)	45.2 ± 12.6
Body Mass Index	31.8 ± 5.2
BMI category
Overweight	68 (43%)
Obesity	91 (57%)
Franchise (CHF)
300	83 (52%)
500	20 (13%)
1000	7 (4.4%)
1500	6 (3.8%)
2000	2 (1.3%)
2500	41 (26%)
FABQ - Physical activity	12.8 ± 5.6
FABQ - Workload	19.4 ± 10.1
1n (%); Mean ± SD

Code

tbl1.DCE %>%
  flextable::save_as_docx(path = "../tables/tbl1.characteristics.docx")

Code

library(gridExtra)

make_plot_data <- function(var, var_name) {
  df <- as.data.frame(prop.table(table(df.mlogit$choice_bin, var), margin = 2))
  colnames(df) <- c("choice", "level", "freq")
  df$variable <- var_name
  return(df)
}

df.typecontent    <- make_plot_data(df.mlogit$TC,    "Type:Content")
df.cost    <- make_plot_data(df.mlogit$cost,    "Cost")
df.effort  <- make_plot_data(df.mlogit$effort,  "Effort")
df.pain    <- make_plot_data(df.mlogit$pain,    "Pain")
df.weight  <- make_plot_data(df.mlogit$weight,  "Weight")


df.plot.descr <- bind_rows(df.typecontent, df.cost, df.effort, df.pain, df.weight)

level_orders <- list(
  "Weight loss" = c(0, 5, 10, 15, 20))

plots <- df.plot.descr %>%
  group_split(variable) %>%
  lapply(function(df) {
    var_name <- unique(df$variable)
    if (var_name %in% names(level_orders)) {
      df$level <- factor(as.character(df$level), levels = level_orders[[var_name]])
    }

    ggplot(df, aes(x = level, y = freq, fill = choice)) +
      geom_bar(stat = "identity", position = "dodge") +
      labs(title = var_name, x = NULL, y = "Rel. frequency", size = 8) +
      theme_minimal() +
      scale_fill_grey() +
      theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8),
            plot.title = element_text(size = 9),
            axis.title.y = element_text(size = 8),
            legend.title = element_text(size = 8))
  })

# export ggsave
library(ggplot2)
library(gridExtra)
library(grid)
combined_plot <- grid.arrange(grobs = plots, ncol = 2, nrow = 3, widht = 8, height = 8)
ggsave(filename = "../figures/plot_grid.png", plot = combined_plot, width = 10, height = 8)

Figure 2: Univariate relationships between attribute levels and choices (TC).

Code

load("../data/df-dce-long-noTC.rda")

library(mlogit)
df.mlogit.noTC <- dfidx::dfidx(df.dce.long.noTC, idx = list("STR", "ALT"), drop.index = FALSE, pkg = "mlogit")

library(gridExtra)

make_plot_data <- function(var, var_name) {
  df <- as.data.frame(prop.table(table(df.mlogit.noTC$choice_bin, var), margin = 2))
  colnames(df) <- c("choice", "level", "freq")
  df$variable <- var_name
  return(df)
}

df.type    <- make_plot_data(df.mlogit.noTC$type,    "Type")
df.content   <- make_plot_data(df.mlogit.noTC$content,    "Content")
df.cost    <- make_plot_data(df.mlogit.noTC$cost,    "Cost")
df.effort  <- make_plot_data(df.mlogit.noTC$effort,  "Effort")
df.pain    <- make_plot_data(df.mlogit.noTC$pain,    "Pain")
df.weight  <- make_plot_data(df.mlogit.noTC$weight,  "Weight")


df.plot.descr.noTC <- bind_rows(df.type, df.content, df.cost, df.effort, df.pain, df.weight)

level_orders.noTC <- list(
  "Weight loss" = c(0, 5, 10, 15, 20))

plots.noTC <- df.plot.descr.noTC %>%
  group_split(variable) %>%
  lapply(function(df) {
    var_name <- unique(df$variable)
    if (var_name %in% names(level_orders)) {
      df$level <- factor(as.character(df$level), levels = level_orders[[var_name]])
    }

    ggplot(df, aes(x = level, y = freq, fill = choice)) +
      geom_bar(stat = "identity", position = "dodge") +
      labs(title = var_name, x = NULL, y = "Rel. frequency", size = 8) +
      theme_minimal() +
      scale_fill_grey() +
      theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8),
            plot.title = element_text(size = 9),
            axis.title.y = element_text(size = 8),
            legend.title = element_text(size = 8))
  })


# export ggsave
# library(ggplot2)
# library(gridExtra)
# library(grid)
# combined_plot_noTC <- grid.arrange(grobs = plots.noTC, ncol = 2, nrow = 3, widht = 8, height = 8)
# ggsave(filename = "../figures/plot_grid_noTC.png", plot = combined_plot_noTC, width = 10, height = 8)

4.1 Logit model

Code

mlogit.simple <- mlogit(choice_bin ~ TC + effort + weight + pain + cost | 0, data = df.mlogit)

library(broom)
mlogit.simple.tidy <- broom::tidy(mlogit.simple)

df.mlogit.simple <- as.data.frame(mlogit.simple.tidy) %>%
  rename(
    Term = term,
    Estimate = estimate,
    Std_Error = std.error,
    Z_value = statistic,
    p_value = p.value
  ) %>%
  mutate(Term = sub("^TC", "", Term))


# export as docx as estimate
library(flextable)
tbl.mlogit.simple.est <-  df.mlogit.simple %>% 
  mutate(across(c(Estimate, Std_Error, Z_value, p_value), ~ round(.x, 3)),
         Term = c("On-site, group session: Dietary intervention",
                  "On-site, group session: Dietary & Exercise intervention",
                  "On-site, group session: Exercise intervention",
                  "On-site, individual session: Dietary intervention",
                  "On-site, individual session: Dietary & Exercise  intervention",
                  "On-site, individual session: Exercise intervention",
                  "Online, time & place independent: Dietary intervention",
                  "Online, time & place independent: Dietary & Exercise intervention",
                  "Online, time & place independent: Exercise intervention",
                  "Videocalls: Dietary intervention",
                  "Videocalls: Dietary & Exercise  intervention",
                  "Videocalls: Exercise intervention",
                  "Effort",
                  "Weight",
                  "Pain",
                  "Cost"),
         "Standard error" = Std_Error,
         "z value" = Z_value,
         "p value" = p_value) %>%
  dplyr::select(Term, Estimate, "Standard error", "z value", "p value") %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") %>% 
  flextable::save_as_docx(path = "../tables/tbl.mlogit.simple.est.docx")

# export as docx as OR
library(flextable)
tbl.mlogit.simple.OR <-  df.mlogit.simple %>% 
  mutate(
    Odds_Ratio = round(exp(Estimate), 3),
    Std_Error = round(Std_Error, 3),
    Z_value = round(Z_value, 3),
    p_value = round(p_value, 3),
    Estimate = round(Estimate, 3),,
         Term = c("On-site, group session: Dietary intervention",
                  "On-site, group session: Dietary & Exercise intervention",
                  "On-site, group session: Exercise intervention",
                  "On-site, individual session: Dietary intervention",
                  "On-site, individual session: Dietary & Exercise  intervention",
                  "On-site, individual session: Exercise intervention",
                  "Online, time & place independent: Dietary intervention",
                  "Online, time & place independent: Dietary & Exercise intervention",
                  "Online, time & place independent: Exercise intervention",
                  "Videocalls: Dietary intervention",
                  "Videocalls: Dietary & Exercise  intervention",
                  "Videocalls: Exercise intervention",
                  "Effort",
                  "Weight",
                  "Pain",
                  "Cost"),
        "Odds ratio" = Odds_Ratio,
         "Standard error" = Std_Error,
         "z value" = Z_value,
         "p value" = p_value) %>%
  dplyr::select(Term, "Odds ratio", "Standard error", "z value", "p value") %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 

tbl.mlogit.simple.OR

Term	Odds ratio	Standard error	z value
On-site, group session: Dietary intervention	3.679	0.195	6.671
On-site, group session: Dietary & Exercise intervention	5.456	0.195	8.682
On-site, group session: Exercise intervention	3.718	0.227	5.784
On-site, individual session: Dietary intervention	3.274	0.212	5.583
On-site, individual session: Dietary & Exercise intervention	6.229	0.213	8.596
On-site, individual session: Exercise intervention	4.253	0.198	7.293
Online, time & place independent: Dietary intervention	4.038	0.199	7.015
Online, time & place independent: Dietary & Exercise intervention	5.688	0.214	8.119
Online, time & place independent: Exercise intervention	3.315	0.195	6.142
Videocalls: Dietary intervention	2.702	0.226	4.403
Videocalls: Dietary & Exercise intervention	4.225	0.181	7.956
Videocalls: Exercise intervention	3.878	0.240	5.648
Effort	0.799	0.033	-6.851
Weight	1.050	0.006	7.823
Pain	1.034	0.003	12.596
Cost	0.999	0.000	-14.189

Code

tbl.mlogit.simple.OR %>% 
  flextable::save_as_docx(path = "../tables/tbl.mlogit.simple.OR.docx")

For the main model, data from 159 respondents were used, resulting in 7155 individual observations. Details on the data structure is shown in the appendix (?@sec-data-structure). The model output supports the expected effects, as illustrated in Figure 2. Specifically, higher time effort and costs are associated with decreased utility, while greater expected reductions in pain and weight are associated with increased utility. Furthermore, all four treatment delivery modes (oti, vc, ffg, ffs) significantly increase utility compared to the reference category, which is opt-out (i.e., no treatment). The main effect of type was tested by comparing the reduced model (without type) to the full model. The null hypothesis, which stated that the reduced model is valid, was rejected ($\chi^2 = 146.61$, $df = 12$, $p < 0.001$).

Code

mlogit.main <- mlogit(choice_bin ~ TC + effort + weight + pain + cost +
                        effort:FABQ_PA + effort:FABQ_W + cost:Franchise  | 
                        0, data = df.mlogit)

library(broom)
mlogit.main.tidy <- broom::tidy(mlogit.main)

df.mlogit.main <- as.data.frame(mlogit.main.tidy) %>%
    rename(
    Term = term,
    Estimate = estimate,
    Std_Error = std.error,
    Z_value = statistic,
    p_value = p.value
  ) %>%
  mutate(Term = sub("^TC", "", Term))

# export as docx as estimate
library(flextable)
tbl.mlogit.main <-  df.mlogit.main %>% 
  mutate(across(c(Estimate, Std_Error, Z_value, p_value), ~ round(.x, 3)),
         Term = c("On-site, group session: Dietary intervention",
                  "On-site, group session: Dietary & Exercise intervention",
                  "On-site, group session: Exercise intervention",
                  "On-site, individual session: Dietary intervention",
                  "On-site, individual session: Dietary & Exercise  intervention",
                  "On-site, individual session: Exercise intervention",
                  "Online, time & place independent: Dietary intervention",
                  "Online, time & place independent: Dietary & Exercise intervention",
                  "Online, time & place independent: Exercise intervention",
                  "Videocalls: Dietary intervention",
                  "Videocalls: Dietary & Exercise  intervention",
                  "Videocalls: Exercise intervention",
                  "Effort",
                  "Weight",
                  "Pain",
                  "Cost",
                  "Effort: FABQ-PA", 
                  "Effort: FABQ-W",
                  "Cost: Franchise"),
         "Standard error" = Std_Error,
         "z value" = Z_value,
         "p value" = p_value) %>%
  dplyr::select(Term, Estimate, "Standard error", "z value", "p value") %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") %>% 
  flextable::save_as_docx(path = "../tables/tbl.mlogit.main.est.docx")

# Export as docx as OR
tbl.mlogit.main.OR <-  df.mlogit.main %>% 
  mutate(
    Odds_Ratio = round(exp(Estimate), 3),
    Std_Error = round(Std_Error, 3),
    Z_value = round(Z_value, 3),
    p_value = round(p_value, 3),
    Estimate = round(Estimate, 3),
         Term = c("On-site, group session: Dietary intervention",
                  "On-site, group session: Dietary & Exercise intervention",
                  "On-site, group session: Exercise intervention",
                  "On-site, individual session: Dietary intervention",
                  "On-site, individual session: Dietary & Exercise  intervention",
                  "On-site, individual session: Exercise intervention",
                  "Online, time & place independent: Dietary intervention",
                  "Online, time & place independent: Dietary & Exercise intervention",
                  "Online, time & place independent: Exercise intervention",
                  "Videocalls: Dietary intervention",
                  "Videocalls: Dietary & Exercise  intervention",
                  "Videocalls: Exercise intervention",
                  "Effort",
                  "Weight",
                  "Pain",
                  "Cost",
                  "Effort: FABQ-PA", 
                  "Effort: FABQ-W",
                  "Cost: Franchise"),
        "Odds ratio" = Odds_Ratio,
         "Standard error" = Std_Error,
         "z value" = Z_value,
         "p value" = p_value) %>%
  dplyr::select(Term, "Odds ratio", "Standard error", "z value", "p value") %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 

tbl.mlogit.main.OR

Term	Odds ratio	Standard error	z value	p value
On-site, group session: Dietary intervention	3.858	0.202	6.691	0.000
On-site, group session: Dietary & Exercise intervention	5.830	0.205	8.607	0.000
On-site, group session: Exercise intervention	4.142	0.235	6.042	0.000
On-site, individual session: Dietary intervention	3.523	0.222	5.663	0.000
On-site, individual session: Dietary & Exercise intervention	6.323	0.222	8.295	0.000
On-site, individual session: Exercise intervention	4.320	0.207	7.053	0.000
Online, time & place independent: Dietary intervention	4.137	0.207	6.863	0.000
Online, time & place independent: Dietary & Exercise intervention	6.034	0.225	8.001	0.000
Online, time & place independent: Exercise intervention	3.584	0.205	6.238	0.000
Videocalls: Dietary intervention	2.964	0.236	4.595	0.000
Videocalls: Dietary & Exercise intervention	4.504	0.189	7.977	0.000
Videocalls: Exercise intervention	4.302	0.250	5.834	0.000
Effort	0.723	0.049	-6.641	0.000
Weight	1.051	0.006	7.638	0.000
Pain	1.035	0.003	12.329	0.000
Cost	0.999	0.000	-12.619	0.000
Effort: FABQ-PA	1.007	0.003	2.441	0.015
Effort: FABQ-W	0.999	0.002	-0.447	0.655
Cost: Franchise	1.000	0.000	1.025	0.305

Code

tbl.mlogit.main.OR %>% 
  flextable::save_as_docx(path = "../tables/tbl.mlogit.main.OR.docx")

5 Further analyses

Specific contrasts for the attribute type were examined. No significant difference in utility was found between online and face-to-face intervention delivery overall. However, in pairwise comparisons, the difference between face-to-face delivery in a single session and delivery via videocalls was statistically significant, with a preference for the face-to-face single-session format.

Code

# library(multcomp)
# 
# K <- rbind(
#   "oti - vc"   = c(0, 0, 0, 0,  1, -1,  0, 0, 0, 0, 0),
#   "oti - ffg"  = c(0, 0, 0, 0,  1,  0, -1, 0, 0, 0, 0),
#   "oti - ffs"  = c(0, 0, 0, 0,  1,  0, 0, -1, 0, 0, 0),
#   "vc - ffg"   = c(0, 0, 0, 0,  0,  1, -1, 0, 0, 0, 0),
#   "vc - ffs"   = c(0, 0, 0, 0,  0,  1, 0, -1, 0, 0, 0),
#   "ffg - ffs"  = c(0, 0, 0, 0,  0,  0, 1, -1, 0, 0, 0),
#   "online_vs_ftof" = c(0, 0, 0, 0, 0.5, 0.5, -0.5, -0.5, 0, 0, 0)
# )
# 
# glht.test <- glht(mlogit.main, linfct = K)
# 
# glht.summary <- summary(glht.test)
# 
# df.contrasts <- data.frame(
#   Estimate = glht.summary$test$coefficients,
#   Std_Error = glht.summary$test$sigma,
#   z_value = glht.summary$test$tstat,
#   p_value = glht.summary$test$pvalues
# )
# 
# df.contrasts %>%
#   kbl(digits = 3, escape = TRUE) %>%
#   kable_styling(font_size = 8)
# 
# # export as docx
# df.contrasts.exp <- df.contrasts %>%
#     rownames_to_column(var = "Contrast")
# 
# tbl.contrasts <- df.contrasts.exp %>%
#   mutate(across(c(Estimate, Std_Error, z_value, p_value), ~ round(.x, 3))) %>%
#   transmute(
#     Contrast = c(
#       "Online, time & place independent - Videocalls",
#       "Online, time & place independent - On-site, group session",
#       "Online, time & place independent - On-site individual session",
#       "Videocalls - On-site, group session",
#       "Videocalls - On-site, individual session",
#       "On-site, group session - On-site, individual session",
#       "Online (online+videocalls) - On-site (individual or group session)"
#     ),
#     Estimate,
#     `Standard error` = Std_Error,
#     `z value` = z_value,
#     `p value` = p_value
#   ) %>%
#   flextable::flextable() %>%
#   autofit() %>%
#   bold(part = "header") %>%
#   align(align = "center", part = "all") %>%
#   flextable::line_spacing(space = 1.0) %>%
#   flextable::fontsize(size = 8, part = "all") %>%
#   flextable::font(fontname = "Times New Roman", part = "all") %>%
#   flextable::save_as_docx(path = "../tables/tbl.contrasts.docx")

Code

# names of first 12 coefficients
coef_names <- sub("^TC", "", names(coef(mlogit.main))[1:12])

# all contrasts
contrasts_list <- combn(1:12, 2, simplify = FALSE)

K <- do.call(rbind, lapply(contrasts_list, function(pair) {
  k_vec <- rep(0, length(coef(mlogit.main)))
  k_vec[pair[1]] <- 1
  k_vec[pair[2]] <- -1
  names(k_vec) <- names(coef(mlogit.main))
  return(k_vec)
}))

rownames(K) <- sapply(contrasts_list, function(pair) {
  paste0(coef_names[pair[1]], " - ", coef_names[pair[2]])
})


library(multcomp)
glht.test <- glht(mlogit.main, linfct = K)
glht.summary <- summary(glht.test)

df.contrasts <- data.frame(
  Estimate = glht.summary$test$coefficients,
  Std_Error = glht.summary$test$sigma,
  z_value = glht.summary$test$tstat,
  p_value = glht.summary$test$pvalues
)

df.contrasts %>%
  kbl(
  digits = 3,
  escape = TRUE,
  format = "latex") %>%
  kable_styling(
    font_size = 8,
    latex_options = c("striped", "hold_position"),
    full_width = FALSE)
# most prefered content: D+E
# most prefered type: ffs>oti>ffg
# export as docx
df.contrasts.exp <- df.contrasts %>%
    rownames_to_column(var = "Contrast")

tbl.contrasts <- df.contrasts.exp %>%
  mutate(across(c(Estimate, Std_Error, z_value, p_value), ~ round(.x, 3))) %>%
  flextable::flextable() %>%
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 

tbl.contrasts
tbl.contrasts %>%
  flextable::save_as_docx(path = "../tables/tbl.contrasts.docx")

Table 1: Tested contrases for the attribute type.

Contrast	Estimate	Std_Error	z_value	p_value
ffg:D - ffg:D+E	-0.413	0.153	-2.700	0.217
ffg:D - ffg:E	-0.071	0.218	-0.326	1.000
ffg:D - ffs:D	0.091	0.205	0.442	1.000
ffg:D - ffs:D+E	-0.494	0.174	-2.844	0.154
ffg:D - ffs:E	-0.113	0.167	-0.680	1.000
ffg:D - oti:D	-0.070	0.158	-0.443	1.000
ffg:D - oti:D+E	-0.447	0.211	-2.117	0.595
ffg:D - oti:E	0.073	0.171	0.430	1.000
ffg:D - vc:D	0.264	0.180	1.462	0.946
ffg:D - vc:D+E	-0.155	0.152	-1.018	0.997
ffg:D - vc:E	-0.109	0.205	-0.532	1.000
ffg:D+E - ffg:E	0.342	0.197	1.738	0.841
ffg:D+E - ffs:D	0.504	0.192	2.618	0.259
ffg:D+E - ffs:D+E	-0.081	0.165	-0.492	1.000
ffg:D+E - ffs:E	0.300	0.156	1.920	0.734
ffg:D+E - oti:D	0.343	0.164	2.094	0.611
ffg:D+E - oti:D+E	-0.034	0.209	-0.165	1.000
ffg:D+E - oti:E	0.486	0.152	3.199	0.058
ffg:D+E - vc:D	0.677	0.173	3.918	0.005
ffg:D+E - vc:D+E	0.258	0.165	1.566	0.914
ffg:D+E - vc:E	0.304	0.190	1.599	0.902
ffg:E - ffs:D	0.162	0.203	0.797	1.000
ffg:E - ffs:D+E	-0.423	0.181	-2.336	0.435
ffg:E - ffs:E	-0.042	0.195	-0.215	1.000
ffg:E - oti:D	0.001	0.195	0.007	1.000
ffg:E - oti:D+E	-0.376	0.228	-1.650	0.882
ffg:E - oti:E	0.145	0.168	0.861	0.999
ffg:E - vc:D	0.335	0.203	1.647	0.883
ffg:E - vc:D+E	-0.084	0.197	-0.425	1.000
ffg:E - vc:E	-0.038	0.188	-0.201	1.000
ffs:D - ffs:D+E	-0.585	0.166	-3.517	0.021
ffs:D - ffs:E	-0.204	0.174	-1.174	0.990
ffs:D - oti:D	-0.160	0.177	-0.908	0.999
ffs:D - oti:D+E	-0.538	0.194	-2.779	0.181
ffs:D - oti:E	-0.017	0.184	-0.094	1.000
ffs:D - vc:D	0.173	0.191	0.906	0.999
ffs:D - vc:D+E	-0.246	0.188	-1.304	0.976
ffs:D - vc:E	-0.200	0.202	-0.988	0.998
ffs:D+E - ffs:E	0.381	0.145	2.621	0.256
ffs:D+E - oti:D	0.424	0.147	2.884	0.139
ffs:D+E - oti:D+E	0.047	0.205	0.228	1.000
ffs:D+E - oti:E	0.568	0.159	3.564	0.018
ffs:D+E - vc:D	0.758	0.147	5.153	0.000
ffs:D+E - vc:D+E	0.339	0.160	2.113	0.598
ffs:D+E - vc:E	0.385	0.175	2.203	0.532
ffs:E - oti:D	0.043	0.161	0.269	1.000
ffs:E - oti:D+E	-0.334	0.197	-1.699	0.859
ffs:E - oti:E	0.187	0.153	1.224	0.985
ffs:E - vc:D	0.377	0.164	2.293	0.467
ffs:E - vc:D+E	-0.042	0.151	-0.277	1.000
ffs:E - vc:E	0.004	0.187	0.023	1.000
oti:D - oti:D+E	-0.378	0.210	-1.794	0.810
oti:D - oti:E	0.143	0.162	0.883	0.999
oti:D - vc:D	0.333	0.174	1.913	0.738
oti:D - vc:D+E	-0.085	0.155	-0.551	1.000
oti:D - vc:E	-0.039	0.186	-0.210	1.000
oti:D+E - oti:E	0.521	0.194	2.691	0.221
oti:D+E - vc:D	0.711	0.203	3.506	0.022
oti:D+E - vc:D+E	0.292	0.204	1.436	0.952
oti:D+E - vc:E	0.338	0.220	1.540	0.923
oti:E - vc:D	0.190	0.175	1.086	0.995
oti:E - vc:D+E	-0.228	0.156	-1.467	0.944
oti:E - vc:E	-0.182	0.181	-1.007	0.997
vc:D - vc:D+E	-0.419	0.171	-2.448	0.360
vc:D - vc:E	-0.373	0.171	-2.173	0.554
vc:D+E - vc:E	0.046	0.181	0.254	1.000

The MRS provide insights into the trade-offs participants were willing to make between different attributes. The MRS between effort and pain was -8.99, indicating that participants were willing to invest nearly 9 hours per week to achieve a 1% reduction in pain. The MRS between effort and weight was -5.08, suggesting that participants were willing to invest about 5 hours per week to achieve a 1% reduction in weight. The MRS between cost and pain was -0.038, implying that participants were willing to pay 0.038 CHF to gain a 1% reduction in pain. The MRS between cost and weight was -0.021, indicating a willingness to pay 0.021 CHF for a 1% reduction in body weight.

These results suggest that participants placed a higher value on reducing pain and weight through effort than through financial cost, highlighting a greater willingness to trade personal effort over monetary expense to achieve better outcomes.

Since effort and cost are considered burdens, whereas pain and weight are regarded as benefits, a negative MRS indicates that participants are willing to spend either in terms of effort or monetary cost to achieve a reduction in pain or weight.

Code

mlogit.summary <- summary(mlogit.main)

coefs <- mlogit.summary$coefficients # 1st column = estimates

MRS_effort_pain  <- coefs["pain"] / coefs["effort"]
MRS_effort_weight <- coefs["weight"] / coefs["effort"]
MRS_cost_pain    <- coefs["pain"] / coefs["cost"]
MRS_cost_weight   <- coefs["weight"] / coefs["cost"]


df.mrs <- data.frame(
  Comparison = c("Effort vs Pain", "Effort vs Weight", "Cost vs Pain", "Cost vs Weight"),
  MRS = c(MRS_effort_pain, MRS_effort_weight, MRS_cost_pain, MRS_cost_weight)
)


library(kableExtra)
df.mrs %>%
  kbl(
  digits = 3,
  escape = TRUE,
  format = "latex") %>%
  kable_styling(
    font_size = 8,
    latex_options = c("striped", "hold_position"),
    full_width = FALSE)
# export as docx
library(flextable)
tbl.mrs <-  df.mrs %>% 
  mutate(MRS = round(MRS, 3),
         Comparison = c("Effort vs Pain reduction", "Effort vs Weight loss", "Cost vs Pain reduction", "Cost vs Weight loss")) %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 
tbl.mrs
tbl.mrs %>%
  flextable::save_as_docx(path = "../tables/tbl.mrs.docx")

Table 2: Marginal rates of substitution.

Comparison	MRS
Effort vs Pain reduction	-0.105
Effort vs Weight loss	-0.153
Cost vs Pain reduction	-30.832
Cost vs Weight loss	-44.786

The relative importance of each attribute (Table 3) was derived from the multinomial logit model containing alternative-specific attributes only. Among the attributes, intervention costs had the greatest influence on participants’ choices, accounting for 35.1% of the decision-making weight. This was followed by pain reduction (23.8%), weekly time effort (17.3%), and weight loss (17.0%). The type of intervention was the least influential attribute, contributing only 6.7% to the overall decision. These findings indicate that monetary considerations and pain reduction were the primary drivers of choice behavior in this study.

Code

m.rel.imp <- mlogit(choice_bin ~ TC + effort + pain + weight + cost 
                    | 0, data = df.mlogit)

coefs <- summary(m.rel.imp)$coefficients


range_effort <- abs((max(df.mlogit$effort) - min(df.mlogit$effort)) * coefs["effort"])
range_pain <- abs((max(df.mlogit$pain) - min(df.mlogit$pain)) * coefs["pain"])
range_weight <- abs((max(df.mlogit$weight) - min(df.mlogit$weight)) * coefs["weight"])
range_cost <- abs((max(df.mlogit$cost) - min(df.mlogit$cost)) * coefs["cost"])
tc_coefs <- coefs[grep("^TC", names(coefs))]
range_TC <- abs(max(tc_coefs) - min(tc_coefs))

total_range <- range_effort + range_pain + range_weight + range_cost + range_TC

Rel.importance <- c(
  effort = range_effort / total_range,
  pain = range_pain / total_range,
  weight = range_weight / total_range,
  cost = range_cost / total_range,
  TC = range_TC / total_range
)

names(Rel.importance) <- c("effort", "pain", "weight", "cost", "TC")

as.data.frame(Rel.importance) %>%
  mutate(sign = c("-", "+", "+", "-", "NA")) %>% 
  kbl(
  digits = 3,
  escape = TRUE,
  format = "latex") %>%
  kable_styling(
    font_size = 8,
    latex_options = c("striped", "hold_position"),
    full_width = FALSE)
library(flextable)
tbl.rel.importance <- as.data.frame(Rel.importance) %>%
  rownames_to_column(var = "Attribute") %>%
  mutate(
    Attribute = c("Effort", "Pain", "Weight", "Cost", "Type:Content"),
    Sign = c("-", "+", "+", "-", "NA"),
    Rel.importance = round(Rel.importance, 3)
  ) %>%
  arrange(desc(Rel.importance)) %>%
  rename(`Relative importance` = Rel.importance) %>% 
  flextable() %>%
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  line_spacing(space = 1.0) %>%
  fontsize(size = 8, part = "all") %>%
  font(fontname = "Times New Roman", part = "all") 
tbl.rel.importance
tbl.rel.importance %>%
  save_as_docx(path = "../tables/tbl.rel.importance.docx")

Table 3: Relative importance of attributes.

Attribute	Relative importance	Sign
Cost	0.319	-
Pain	0.246	+
Effort	0.166	-
Weight	0.145	+
Type:Content	0.124	NA

Code

# 
# mrs.effidxmrs.effortmrs.effortmrs.effort.pain.temp  <- vector()
# mrs.effort.weight.temp <- vector()
# mrs.cost.pain.temp <- vector()
# mrs.cost.weight.temp <- vector()
# 
# rel.imp.effort <- vector()
# rel.imp.pain <- vector()
# rel.imp.weight <- vector()
# rel.imp.cost <- vector()
# rel.imp.TC <- vector()
# 
# 
# # for(i in 1:1000){
# # 
# #   index <- sample(1:max(unique(df.mlogit$ID)), max(unique(df.mlogit$ID)), replace = TRUE) # create indexes
# # 
# #   df.mlogit.temp <- bind_rows(lapply(index, function(i) df.mlogit[df.mlogit$ID==i,]))
# # 
# #   df.mlogit.temp <- mlogit.data(df.mlogit.temp, choice="choice_bin", shape="long",
# #                                alt.var="ALT", chid.var="STR")
# # 
# #   df.mlogit.temp <- lapply(index, function(i) {
# #   df.mlogit[df.mlogit$ID == i, ]}) %>%
# #   bind_rows()
# # 
# #   df.mlogit.temp$ID <- rep(1:max(unique(df.mlogit$ID)), each = 45)
# # 
# #   df.mlogit.temp$STR <- 100 * df.mlogit.temp$ID + df.mlogit.temp$QES # make new unique STR
# # 
# #   df.mlogit.temp <- as.data.frame(df.mlogit.temp)
# # 
# #   df.mlogit.temp <- dfidx::dfidx(df.mlogit.temp,idx = list(c("STR", "ALT")),drop.index = TRUE)
# # 
# #   mlogit.main.temp <- mlogit(choice_bin ~ TC + effort + weight + pain + cost +
# #                         effort:FABQ_PA + effort:FABQ_W + cost:Franchise  |
# #                         0, data = df.mlogit.temp)
# # 
# # 
# #   mlogit.summary.temp <- summary(mlogit.main.temp)
# # 
# #   coefs <- mlogit.summary.temp$coefficients # 1st column = estimates
# # 
# #   mrs.effort.pain.temp[i]  <- coefs["pain"] / coefs["effort"]
# #   mrs.effort.weight.temp[i] <- coefs["weight"] / coefs["effort"]
# #   mrs.cost.pain.temp[i] <- coefs["pain"] / coefs["cost"]
# #   mrs.cost.weight.temp[i]   <- coefs["weight"] / coefs["cost"]
# # 
# # 
# #   # Relative importance
# #   m.rel.imp <- mlogit(choice_bin ~ TC + effort + pain + weight + cost
# #                     | 0, data = df.mlogit.temp)
# # 
# #   coefs <- summary(m.rel.imp)$coefficients
# # 
# # 
# #   range_effort <- abs((max(df.mlogit$effort) - min(df.mlogit$effort)) * coefs["effort"])
# #   range_pain <- abs((max(df.mlogit$pain) - min(df.mlogit$pain)) * coefs["pain"])
# #   range_weight <- abs((max(df.mlogit$weight) - min(df.mlogit$weight)) * coefs["weight"])
# #   range_cost <- abs((max(df.mlogit$cost) - min(df.mlogit$cost)) * coefs["cost"])
# #   tc_coefs <- coefs[grep("^TC", names(coefs))]
# #   range_TC <- abs(max(tc_coefs) - min(tc_coefs))
# #   total_range <- range_effort + range_pain + range_weight + range_cost + range_TC
# # 
# # 
# #   rel.imp.effort[i] <- range_effort / total_range
# #   rel.imp.pain[i] <- range_pain / total_range
# #   rel.imp.weight[i] <- range_weight / total_range
# #   rel.imp.cost[i] <- range_cost / total_range
# #   rel.imp.TC[i] <- range_TC / total_range
# # 
# # }
# 
# 
# 
# df.mrs$lower[1] <- 2*df.mrs$MRS[1] - quantile(mrs.effort.pain.temp, probs = c(0.975))
# df.mrs$upper[1] <- 2*df.mrs$MRS[1] - quantile(mrs.effort.pain.temp, probs = c(0.025))
# df.mrs$lower[2] <- 2*df.mrs$MRS[2] - quantile(mrs.effort.weight.temp, probs = c(0.975))
# df.mrs$upper[2] <- 2*df.mrs$MRS[2] - quantile(mrs.effort.weight.temp, probs = c(0.025))
# df.mrs$lower[3] <- 2*df.mrs$MRS[3] - quantile(mrs.cost.pain.temp, probs = c(0.975))
# df.mrs$upper[3] <- 2*df.mrs$MRS[3] - quantile(mrs.cost.pain.temp, probs = c(0.025))
# df.mrs$lower[4] <- 2*df.mrs$MRS[4] - quantile(mrs.cost.weight.temp, probs = c(0.975))
# df.mrs$upper[4] <- 2*df.mrs$MRS[4] - quantile(mrs.cost.weight.temp, probs = c(0.025))
# 
# 
# df.rel.imp <- data.frame(rel_imp = Rel.importance,
#            attr = names(Rel.importance))
# 
# row.names(df.rel.imp) <- NULL
# df.rel.imp$lower[1] <- 2*df.rel.imp$rel_imp[1] - quantile(rel.imp.effort, probs = c(0.975))
# df.rel.imp$upper[1] <- 2*df.rel.imp$rel_imp[1] - quantile(rel.imp.effort, probs = c(0.025))
# df.rel.imp$lower[2] <- 2*df.rel.imp$rel_imp[2] - quantile(rel.imp.pain, probs = c(0.975))
# df.rel.imp$upper[2] <- 2*df.rel.imp$rel_imp[2] - quantile(rel.imp.pain, probs = c(0.025))
# df.rel.imp$lower[3] <- 2*df.rel.imp$rel_imp[3] - quantile(rel.imp.weight, probs = c(0.975))
# df.rel.imp$upper[3] <- 2*df.rel.imp$rel_imp[3] - quantile(rel.imp.weight, probs = c(0.025))
# df.rel.imp$lower[4] <- 2*df.rel.imp$rel_imp[4] - quantile(rel.imp.cost, probs = c(0.975))
# df.rel.imp$upper[4] <- 2*df.rel.imp$rel_imp[4] - quantile(rel.imp.cost, probs = c(0.025))
# df.rel.imp$lower[5] <- 2*df.rel.imp$rel_imp[5] - quantile(rel.imp.TC, probs = c(0.975))
# df.rel.imp$upper[5] <- 2*df.rel.imp$rel_imp[5] - quantile(rel.imp.TC, probs = c(0.025))
#

Table 4

Code

# # prepare dataset mrs
# df.mrs.adapted <- df.mrs %>% 
#   transmute(
#     Attribute = Comparison,
#     Estimate = round(MRS, 3),
#     `95% CI` = paste0("[", round(lower, 3), "; ", round(upper, 3), "]")) %>%
#   mutate(Metric = "Marginal rate of substitution") %>%
#   relocate(Metric) %>%
#   mutate(Metric = replace(Metric, row_number() != 1, ""))
# 
# # Adapt df.rel.imp
# df.rel.imp.adapted <- df.rel.imp %>%
#   dplyr::select(attr, rel_imp, lower, upper) %>% 
#   transmute(
#     Attribute = case_when(
#       attr == "effort" ~ "Effort",
#       attr == "pain" ~ "Pain",
#       attr == "weight" ~ "Weight",
#       attr == "cost" ~ "Cost",
#       TRUE ~ attr),
#     Estimate = round(rel_imp, 3),
#     `95% CI` = paste0("[", 
#                       formatC(lower, format = "f", digits = 3), 
#                       "; ", 
#                       formatC(upper, format = "f", digits = 3), 
#                       "]")) %>%
#   mutate(Metric = "Relative importance") %>%
#   relocate(Metric) %>%
#   mutate(Metric = replace(Metric, row_number() != 1, ""))
# 
# # combine datasets
# df.mrs.ri.combined <- bind_rows(df.mrs.adapted, df.rel.imp.adapted) %>% 
#   flextable::flextable() %>% 
#   autofit() %>%
#   bold(part = "header") %>%
#   align(align = "center", part = "all") %>%
#   flextable::line_spacing(space = 1.0) %>%
#   flextable::fontsize(size = 8, part = "all") %>%
#   flextable::font(fontname = "Times New Roman", part = "all") %>%
#   flextable::save_as_docx(path = "../tables/tbl.mrs.ri.docx")

Code

library(ggplot2)
library(dplyr)

# Data preparation
df.rel.importance <- data.frame(
  Attribute = c("Effort", "Pain", "Weight", "Cost", "Type:Content"),
  Importance = round(Rel.importance * 100, 1)  # Convert to percentage
) %>%
  mutate(
    # Invert sign for Cost and Effort
    Importance = ifelse(Attribute %in% c("Cost", "Effort"), -Importance, Importance),
    
    # Set desired factor level order
    Attribute = factor(Attribute, levels = c("Cost", "Pain", "Effort", "Weight", "Type:Content")))

grey_palette <- c("#333333", "#555555", "#777777", "#999999", "#BBBBBB")

# Plot
fig.rel.importance <- ggplot(df.rel.importance, aes(x = Attribute, y = Importance, fill = Attribute)) +
  geom_bar(stat = "identity", width = 0.7) +
  geom_text(aes(label = paste0(Importance, "%"),
                vjust = ifelse(Importance < 0, 1.5, -0.5)),
            size = 3) +
  geom_hline(yintercept = 0, size = 0.3, color = "black") +
  scale_fill_manual(values = grey_palette) +
  scale_y_continuous(
    labels = scales::percent_format(scale = 1), 
    limits = c(-100, 100),
    breaks = seq(-100, 100, by = 20)) +
  labs(
    x = NULL,
    y = "Relative importance (%)",
    fill = NULL
  ) +
  theme_minimal(base_size = 10) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    axis.ticks.x = element_blank(),
    legend.position = "none",
    panel.grid.major.x = element_blank(),  # remove vertical grid lines
    panel.grid.minor.x = element_blank()
  )
fig.rel.importance

Code

ggsave("../figures/fig.rel.importance.png", fig.rel.importance, width = 6, height = 4, dpi = 300)

6 References

Aizaki, Hideo. 2012a. “Basic Functions for Supporting an Implementation of Choice Experiments in R.” Journal of Statistical Software 50 (C2): 1–24. https://doi.org/10.18637/jss.v050.c02.

———. 2012b. “Basic Functions for Supporting an Implementation of Choice Experiments in R.” Journal of Statistical Software 50 (September): 1–24. https://doi.org/10.18637/jss.v050.c02.

Bekker-Grob, Esther W. de, Bas Donkers, Marcel F. Jonker, and Elly A. Stolk. 2015. “Sample Size Requirements for Discrete-Choice Experiments in Healthcare: A Practical Guide.” The Patient - Patient-Centered Outcomes Research 8 (5): 373–84. https://doi.org/10.1007/s40271-015-0118-z.

Campbell, Danny, and Seda Erdem. 2019. “Including Opt-Out Options in Discrete Choice Experiments: Issues to Consider.” The Patient - Patient-Centered Outcomes Research 12 (1): 1–14. https://doi.org/10.1007/s40271-018-0324-6.

Croissant, Yves. 2020a. “Estimation of Random Utility Models in R: The mlogit Package.” Journal of Statistical Software 95 (11): 1–41. https://doi.org/10.18637/jss.v095.i11.

———. 2020b. “Estimation of Random Utility Models in R: The Mlogit Package.” Journal of Statistical Software 95 (October): 1–41. https://doi.org/10.18637/jss.v095.i11.

Hothorn, Torsten, Frank Bretz, and Peter Westfall. 2008. “Simultaneous Inference in General Parametric Models.” Biometrical Journal 50 (3): 346–63.

Lauridsen, Henrik H., Jan Hartvigsen, Claus Manniche, Lars Korsholm, and Niels Grunnet-Nilsson. 2006. “Responsiveness and Minimal Clinically Important Difference for Pain and Disability Instruments in Low Back Pain Patients.” BMC Musculoskeletal Disorders 7 (October): 82. https://doi.org/10.1186/1471-2474-7-82.

Monticone, Marco, Luca Frigau, Howard Vernon, Barbara Rocca, Andrea Giordano, Salvatore Simone Vullo, Francesco Mola, and Franco Franchignoni. 2020. “Reliability, Responsiveness and Minimal Clinically Important Difference of the Two Fear Avoidance and Beliefs Questionnaire Scales in Italian Subjects with Chronic Low Back Pain Undergoing Multidisciplinary Rehabilitation.” European Journal of Physical and Rehabilitation Medicine 56 (5): 600–606. https://doi.org/10.23736/S1973-9087.20.06158-4.

Pfingsten, M., B. Kröner-Herwig, E. Leibing, U. Kronshage, and J. Hildebrandt. 2000. “Validation of the German Version of the Fear-Avoidance Beliefs Questionnaire (FABQ).” European Journal of Pain (London, England) 4 (3): 259–66. https://doi.org/10.1053/eujp.2000.0178.

R Core Team. 2025. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Staerkle, Ralph, Anne F. Mannion, Achim Elfering, Astrid Junge, Norbert K. Semmer, Nicola Jacobshagen, Dieter Grob, Jiri Dvorak, and Norbert Boos. 2004. “Longitudinal Validation of the Fear-Avoidance Beliefs Questionnaire (FABQ) in a Swiss-German Sample of Low Back Pain Patients.” European Spine Journal: Official Publication of the European Spine Society, the European Spinal Deformity Society, and the European Section of the Cervical Spine Research Society 13 (4): 332–40. https://doi.org/10.1007/s00586-003-0663-3.

Wheeler, Bob. 2025. AlgDesign: Algorithmic Experimental Design. https://doi.org/10.32614/CRAN.package.AlgDesign.

Wheeler, R. E. 2004. “optFederov. AlgDesign. The R Project for Statistical Computing https://www.r-project.org/.” https://www.rdocumentation.org/packages/AlgDesign/versions/1.2.1.1/topics/optFederov.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Zeileis, Achim, and Torsten Hothorn. 2002. “Diagnostic Checking in Regression Relationships.” R News 2 (3): 7–10. https://CRAN.R-project.org/doc/Rnews/.

7 Appendix

Code

tbl.dce.alternatives <- rbind(
  list.design$alternatives$alt.1,
  list.design$alternatives$alt.2)

tbl.dce.alternatives %>%
  flextable() %>%
  set_header_labels(.labels = colnames(tbl.dce.alternatives)) %>%  # optional
  autofit() %>%
  bold(part = "header") %>%
  align(j = 1:ncol(tbl.dce.alternatives), align = "center", part = "all") %>%
  line_spacing(space = 1.0) %>%
  fontsize(size = 8, part = "all") %>%
  font(fontname = "Times New Roman", part = "all")
# Save as CSV
write.csv(tbl.dce.alternatives, "../tables/tbl.choicesets.csv", row.names = FALSE)

Table 5: Choice sets, devided into two blocks.

BLOCK	QES	ALT	Realization.type	Content	Average.effort.per.week	Weight.loss	Pain.reduction	Total.costs
1	1	1	online-time & place independent	Exercise program	4	-20	-10	1500
1	2	1	face-to-face_group	D+E	2	-5	-40	500
1	3	1	online-time & place independent	D+E	3	-20	-30	500
1	4	1	face-to-face_single	Dietary intervention	4	-10	-20	1000
1	5	1	online-time & place independent	Dietary intervention	5	-5	-30	1000
1	6	1	videocalls	Dietary intervention	3	-10	-40	2000
1	7	1	videocalls	Dietary intervention	5	-20	-50	500
1	8	1	videocalls	D+E	4	-5	-20	1500
1	9	1	online-time & place independent	Dietary intervention	3	-10	-40	1500
1	10	1	face-to-face_group	D+E	5	-10	-10	2000
1	11	1	videocalls	Exercise program	4	-5	-40	2000
1	12	1	face-to-face_group	Dietary intervention	4	-15	-10	500
1	13	1	videocalls	D+E	3	-15	-50	1000
1	14	1	face-to-face_group	Exercise program	3	-5	-20	1000
1	15	1	face-to-face_group	Dietary intervention	2	-20	-50	1500
2	1	1	face-to-face_group	Exercise program	5	-10	-30	1500
2	2	1	online-time & place independent	Exercise program	2	-5	-50	2000
2	3	1	videocalls	Exercise program	5	-20	-20	500
2	4	1	face-to-face_single	D+E	5	-15	-20	1500
2	5	1	online-time & place independent	Exercise program	5	-15	-40	1000
2	6	1	online-time & place independent	D+E	4	-10	-50	500
2	7	1	videocalls	Dietary intervention	2	-15	-30	1500
2	8	1	face-to-face_single	Exercise program	2	-10	-30	500
2	9	1	face-to-face_single	Dietary intervention	3	-5	-10	500
2	10	1	face-to-face_single	D+E	5	-5	-50	1500
2	11	1	face-to-face_group	D+E	4	-20	-30	2000
2	12	1	face-to-face_single	Exercise program	3	-15	-50	2000
2	13	1	face-to-face_single	D+E	2	-20	-40	1000
2	14	1	online-time & place independent	Dietary intervention	2	-15	-20	2000
2	15	1	videocalls	D+E	2	-10	-10	1000
1	1	2	videocalls	D+E	2	-15	-40	2000
1	2	2	videocalls	Dietary intervention	5	-5	-40	500
1	3	2	face-to-face_group	Dietary intervention	2	-10	-10	2000
1	4	2	online-time & place independent	D+E	2	-5	-30	1000
1	5	2	face-to-face_group	Dietary intervention	3	-5	-50	1500
1	6	2	face-to-face_single	D+E	5	-5	-20	2000
1	7	2	face-to-face_single	D+E	3	-10	-10	500
1	8	2	online-time & place independent	Dietary intervention	5	-10	-30	2000
1	9	2	face-to-face_single	Dietary intervention	5	-15	-10	1000
1	10	2	face-to-face_group	Dietary intervention	2	-20	-30	2000
1	11	2	face-to-face_group	Exercise program	5	-15	-30	1500
1	12	2	videocalls	D+E	4	-10	-30	1500
1	13	2	face-to-face_single	Exercise program	4	-15	-50	2000
1	14	2	online-time & place independent	Exercise program	3	-20	-40	2000
1	15	2	face-to-face_group	D+E	4	-20	-10	500
2	1	2	online-time & place independent	Exercise program	4	-15	-50	500
2	2	2	online-time & place independent	D+E	5	-10	-50	500
2	3	2	videocalls	Dietary intervention	3	-10	-50	1000
2	4	2	face-to-face_single	Dietary intervention	4	-5	-40	1000
2	5	2	face-to-face_group	D+E	3	-15	-40	1000
2	6	2	videocalls	Exercise program	3	-5	-10	2000
2	7	2	videocalls	Dietary intervention	2	-15	-20	500
2	8	2	face-to-face_single	D+E	2	-20	-50	1500
2	9	2	face-to-face_single	Exercise program	3	-20	-30	500
2	10	2	online-time & place independent	Dietary intervention	3	-15	-20	1500
2	11	2	face-to-face_single	Exercise program	2	-10	-40	1500
2	12	2	online-time & place independent	Exercise program	2	-5	-10	1000
2	13	2	face-to-face_group	Exercise program	4	-10	-20	1000
2	14	2	online-time & place independent	Dietary intervention	4	-20	-10	1500
2	15	2	videocalls	Exercise program	5	-20	-20	1000

Code

# df.dce.long %>% 
#   slice(1:45) %>% 
#   kbl() %>% 
#   kable_styling(font_size = 8, 
#                 full_width = F)

library(dplyr)
library(flextable)
library(tibble)

# Create the subset of the data
df.dce.long.2 <- df.dce.long %>% 
  slice(1:45) %>% 
  dplyr::select(-sex_bin)

flextable.dce.long.2 <- df.dce.long.2 %>%
  flextable() %>%
  colformat_num(big.mark = "", digits = 0) %>%  # Kein Tausendertrennzeichen
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  line_spacing(space = 1.0) %>%
  fontsize(size = 8, part = "all") %>%
  font(fontname = "Times New Roman", part = "all")

flextable.dce.long.2
flextable.dce.long.2 %>% 
  flextable::save_as_docx(flextable.dce.long.2, path = "../tables/tbl.data.structure.docx")

Table 6: Data from the first patient. There are 15 choice sets, each with three alternatives. Thus, there are 45 rows per patient.

ID	BLOCK	age	sex	BMI	Franchise	FABQ_PA	FABQ_W	question	choice	QES	ALT	effort	pain	TC	cost	weight	choice_bin	STR
2	1	60	Female	30.8	300	25	31	q1	2	1	1	4	10	oti:E	1500	20	0	201
2	1	60	Female	30.8	300	25	31	q1	2	1	2	2	40	vc:D+E	2000	15	1	201
2	1	60	Female	30.8	300	25	31	q1	2	1	3	0	0	None:None	0	0	0	201
2	1	60	Female	30.8	300	25	31	q2	1	2	1	2	40	ffg:D+E	500	5	1	202
2	1	60	Female	30.8	300	25	31	q2	1	2	2	5	40	vc:D	500	5	0	202
2	1	60	Female	30.8	300	25	31	q2	1	2	3	0	0	None:None	0	0	0	202
2	1	60	Female	30.8	300	25	31	q3	1	3	1	3	30	oti:D+E	500	20	1	203
2	1	60	Female	30.8	300	25	31	q3	1	3	2	2	10	ffg:D	2000	10	0	203
2	1	60	Female	30.8	300	25	31	q3	1	3	3	0	0	None:None	0	0	0	203
2	1	60	Female	30.8	300	25	31	q4	2	4	1	4	20	ffs:D	1000	10	0	204
2	1	60	Female	30.8	300	25	31	q4	2	4	2	2	30	oti:D+E	1000	5	1	204
2	1	60	Female	30.8	300	25	31	q4	2	4	3	0	0	None:None	0	0	0	204
2	1	60	Female	30.8	300	25	31	q5	2	5	1	5	30	oti:D	1000	5	0	205
2	1	60	Female	30.8	300	25	31	q5	2	5	2	3	50	ffg:D	1500	5	1	205
2	1	60	Female	30.8	300	25	31	q5	2	5	3	0	0	None:None	0	0	0	205
2	1	60	Female	30.8	300	25	31	q6	2	6	1	3	40	vc:D	2000	10	0	206
2	1	60	Female	30.8	300	25	31	q6	2	6	2	5	20	ffs:D+E	2000	5	1	206
2	1	60	Female	30.8	300	25	31	q6	2	6	3	0	0	None:None	0	0	0	206
2	1	60	Female	30.8	300	25	31	q7	2	7	1	5	50	vc:D	500	20	0	207
2	1	60	Female	30.8	300	25	31	q7	2	7	2	3	10	ffs:D+E	500	10	1	207
2	1	60	Female	30.8	300	25	31	q7	2	7	3	0	0	None:None	0	0	0	207
2	1	60	Female	30.8	300	25	31	q8	1	8	1	4	20	vc:D+E	1500	5	1	208
2	1	60	Female	30.8	300	25	31	q8	1	8	2	5	30	oti:D	2000	10	0	208
2	1	60	Female	30.8	300	25	31	q8	1	8	3	0	0	None:None	0	0	0	208
2	1	60	Female	30.8	300	25	31	q9	2	9	1	3	40	oti:D	1500	10	0	209
2	1	60	Female	30.8	300	25	31	q9	2	9	2	5	10	ffs:D	1000	15	1	209
2	1	60	Female	30.8	300	25	31	q9	2	9	3	0	0	None:None	0	0	0	209
2	1	60	Female	30.8	300	25	31	q10	1	10	1	5	10	ffg:D+E	2000	10	1	210
2	1	60	Female	30.8	300	25	31	q10	1	10	2	2	30	ffg:D	2000	20	0	210
2	1	60	Female	30.8	300	25	31	q10	1	10	3	0	0	None:None	0	0	0	210
2	1	60	Female	30.8	300	25	31	q11	2	11	1	4	40	vc:E	2000	5	0	211
2	1	60	Female	30.8	300	25	31	q11	2	11	2	5	30	ffg:E	1500	15	1	211
2	1	60	Female	30.8	300	25	31	q11	2	11	3	0	0	None:None	0	0	0	211
2	1	60	Female	30.8	300	25	31	q12	1	12	1	4	10	ffg:D	500	15	1	212
2	1	60	Female	30.8	300	25	31	q12	1	12	2	4	30	vc:D+E	1500	10	0	212
2	1	60	Female	30.8	300	25	31	q12	1	12	3	0	0	None:None	0	0	0	212
2	1	60	Female	30.8	300	25	31	q13	1	13	1	3	50	vc:D+E	1000	15	1	213
2	1	60	Female	30.8	300	25	31	q13	1	13	2	4	50	ffs:E	2000	15	0	213
2	1	60	Female	30.8	300	25	31	q13	1	13	3	0	0	None:None	0	0	0	213
2	1	60	Female	30.8	300	25	31	q14	1	14	1	3	20	ffg:E	1000	5	1	214
2	1	60	Female	30.8	300	25	31	q14	1	14	2	3	40	oti:E	2000	20	0	214
2	1	60	Female	30.8	300	25	31	q14	1	14	3	0	0	None:None	0	0	0	214
2	1	60	Female	30.8	300	25	31	q15	2	15	1	2	50	ffg:D	1500	20	0	215
2	1	60	Female	30.8	300	25	31	q15	2	15	2	4	10	ffg:D+E	500	20	1	215
2	1	60	Female	30.8	300	25	31	q15	2	15	3	0	0	None:None	0	0	0	215

I estimated the same model as described in the methods using the clogit function from the survival package. The resulting coefficients are identical, see Table 8.

Code

model.formula.simple <- choice_bin ~
  TC +
  effort +
  weight +
  pain +
  cost +
  strata(STR)

library(survival)
clogit.simple <- clogit(model.formula.simple, 
                         data = df.mlogit, 
                         method = "approximate")
# goodness-of-fit measures
support.CEs::gofm(clogit.simple)


Rho-squared = 0.1928412 
Adjusted rho-squared = 0.1872868 
Akaike information criterion (AIC) = 4682.141 
Bayesian information criterion (BIC) = 4776.088 
Number of coefficients = 16 
Log likelihood at start = -2880.561 
Log likelihood at convergence = -2325.07

Code

clogit.simple.tidy <- tidy(clogit.simple)


kbl(clogit.simple.tidy, digits = 3) %>% 
  kable_styling(font_size = 8)
# df.clogit.simple <- as.data.frame(clogit.simple.tidy) %>%
#   rename(
#     Term = term,
#     Estimate = estimate,
#     Std_Error = std.error,
#     Z_value = statistic,
#     p_value = p.value) %>%
#   mutate(Term = forcats::fct_recode(
#     Term,
#     "Ffg:D"             = "TC1",
#     "Ffg:D+E"           = "TC2",
#     "Ffg:E"             = "TC3",
#     "Ffs:D"             = "TC4",
#     "Ffs:D+E"           = "TC5",
#     "Ffs:E"             = "TC6",
#     "Oti:D"             = "TC7",
#     "Oti:D+E"           = "TC8",
#     "Oti:E"             = "TC9",
#     "Vc:D"              = "TC10",
#     "Vc:D+E"            = "TC11",
#     "Vc:E"              = "TC12",
#     "Effort" = "effort",
#     "Weight" = "weight",
#     "Pain" = "pain",
#     "Cost" = "cost"))


#save(df.clogit.simple, file = "../data/df-clogit-simple.rda")
#colnames(df.clogit.simple) <- c("Term", "Estimate", "Std. Error", "z value", "p value")

#kbl(df.clogit.simple, digits = 3) %>%
#  kable_styling(font_size = 8)

#library(flextable)
# tbl.clogit.simple <-  df.clogit.simple %>% 
#   mutate(across(c(Estimate, `Std. Error`, `z value`, `p value`), ~ round(.x, 3)),
#          Term = c("On-site, group session: Dietary intervention",
#                   "On-site, group session: Dietary & Exercise intervention",
#                   "On-site, group session: Exercise intervention",
#                   "On-site, individual session: Dietary intervention",
#                   "On-site, individual session: Dietary & Exercise  intervention",
#                   "On-site, individual session: Exercise intervention",
#                   "Online, time & place independent: Dietary intervention",
#                   "Online, time & place independent: Dietary & Exercise intervention",
#                   "Online, time & place independent: Exercise intervention",
#                   "Videocalls: Dietary intervention",
#                   "Videocalls: Dietary & Exercise  intervention",
#                   "Videocalls: Exercise intervention",
#                   "Effort",
#                   "Weight",
#                   "Pain",
#                   "Cost")) %>%
#   dplyr::select(Term, Estimate, "Std. Error", "z value", "p value") %>% 
#   flextable::flextable() %>% 
#   autofit() %>%
#   bold(part = "header") %>%
#   align(align = "center", part = "all") %>%
#   flextable::line_spacing(space = 1.0) %>%
#   flextable::fontsize(size = 8, part = "all") %>%
#   flextable::font(fontname = "Times New Roman", part = "all") %>% 
#   flextable::save_as_docx(path = "../tables/tbl.clogit.simple.docx")

Table 7: Output from the clogit function for the simple model.

term	estimate	std.error	statistic
TCffg:D	1.303	0.195	6.671
TCffg:D+E	1.697	0.195	8.682
TCffg:E	1.313	0.227	5.784
TCffs:D	1.186	0.212	5.583
TCffs:D+E	1.829	0.213	8.596
TCffs:E	1.448	0.198	7.293
TCoti:D	1.396	0.199	7.015
TCoti:D+E	1.738	0.214	8.119
TCoti:E	1.199	0.195	6.142
TCvc:D	0.994	0.226	4.403
TCvc:D+E	1.441	0.181	7.956
TCvc:E	1.355	0.240	5.648
effort	-0.224	0.033	-6.851
weight	0.049	0.006	7.823
pain	0.033	0.003	12.596
cost	-0.001	0.000	-14.189

Code

model.formula.main <- choice_bin ~
  TC +
  effort +
  pain +
  weight +
  cost +
  effort:FABQ_PA + 
  effort:FABQ_W + 
  cost:Franchise +
  strata(STR)

library(survival)
clogit.main <- clogit(model.formula.main, 
                         data = df.dce.long, 
                         method = "approximate")
# goodness-of-fit measure # McFadden’s Rho-squared = 0.1966 = moderate model fit
support.CEs::gofm(clogit.main)


Rho-squared = 0.1966142 
Adjusted rho-squared = 0.1894529 
Akaike information criterion (AIC) = 4301.004 
Bayesian information criterion (BIC) = 4411.004 
Number of coefficients = 19 
Log likelihood at start = -2653.149 
Log likelihood at convergence = -2131.502

Code

# MRS/WTP for cost and pain/weight
support.CEs::mwtp(
  output = clogit.main,
  monetary.variables = c("cost"),
  nonmonetary.variables = c("pain", "weight"),
  nreplications = 5000,
  confidence.level = c(0.95),
  seed = 123
)


        MWTP  2.5% 97.5%
pain   30.83 25.80 36.93
weight 44.79 32.69 59.35

method = Krinsky and Robb

Code

# MRS/WTP for effort and pain/weight
support.CEs::mwtp(
  output = clogit.main,
  monetary.variables = c("effort"),
  nonmonetary.variables = c("pain", "weight"),
  nreplications = 5000,
  confidence.level = c(0.95),
  seed = 123
)


          MWTP    2.5%   97.5%
pain   0.10533 0.07675 0.15375
weight 0.15300 0.10817 0.22464

method = Krinsky and Robb

Code

df.clogit.main <- tidy(clogit.main)

save(df.clogit.main, file = "../data/df-clogit-main.rda")
colnames(df.clogit.main) <- c("Term", "Estimate", "Std. Error", "z value", "p value")

kbl(df.clogit.main, digits = 3) %>% 
  kable_styling(font_size = 8)

Table 8: Output from the clogit function for the main model.

Term	Estimate	Std. Error	z value	p value
TCffg:D+E	0.413	0.153	2.700	0.007
TCffg:E	0.071	0.218	0.326	0.744
TCffs:D	-0.091	0.205	-0.442	0.658
TCffs:D+E	0.494	0.174	2.844	0.004
TCffs:E	0.113	0.167	0.680	0.497
TCNone:None	-1.350	0.202	-6.691	0.000
TCoti:D	0.070	0.158	0.443	0.658
TCoti:D+E	0.447	0.211	2.117	0.034
TCoti:E	-0.073	0.171	-0.430	0.667
TCvc:D	-0.264	0.180	-1.462	0.144
TCvc:D+E	0.155	0.152	1.018	0.309
TCvc:E	0.109	0.205	0.532	0.595
effort	-0.324	0.049	-6.641	0.000
pain	0.034	0.003	12.329	0.000
weight	0.050	0.006	7.638	0.000
cost	-0.001	0.000	-12.619	0.000
effort:FABQ_PA	0.007	0.003	2.441	0.015
effort:FABQ_W	-0.001	0.002	-0.447	0.655
cost:Franchise	0.000	0.000	1.025	0.305

The model can also be estimated when including the variable content.

Code

# model.formula.content <- choice_bin ~
#   type +
#   content +
#   effort +
#   pain +
#   weight +
#   cost +
#   effort:FABQ_PA + 
#   effort:FABQ_W + 
#   cost:Franchise +
#   strata(STR)
# 
# library(survival)
# clogit.content <- clogit(model.formula.content, 
#                          data = df.dce.long, 
#                          method = "approximate")
# 
# df.clogit.content <- tidy(clogit.content)
# 
# colnames(df.clogit.content) <- c("Term", "Estimate", "Std. Error", "z value", "p value")
# 
# kbl(df.clogit.content, digits = 3) %>% 
#   kable_styling(font_size = 8)
# 
# tbl.model.content <-  df.clogit.content %>% 
#   mutate(across(c("Estimate", "Std. Error", "z value", "p value"), ~ round(.x, 3)),
#           Term = c("Effort", "Pain reduction", "Weight reduction", "Cost", "Type: Online, time & place independent", "Type: Videocalls", "Type: On-site, group session", 
#                   "Type: On-site, individual session", "Content: Dietary intervention", "Content: Exercise intervention", "Content: Dietary + Exercise intervention",
#                   "Effort: FABQ-PA", "Effort: FABQ-W", "Cost: Franchise")) %>% 
#   flextable::flextable() %>% 
#   autofit() %>%
#   bold(part = "header") %>%
#   align(align = "center", part = "all") %>%
#   flextable::line_spacing(space = 1.0) %>%
#   flextable::fontsize(size = 8, part = "all") %>%
#   flextable::font(fontname = "Times New Roman", part = "all") %>%
#   flextable::save_as_docx(path = "../tables/tbl.model.content.docx")

Once again, there appears to be an estimation issue with one level of content, specifically, level ED. Additionally, the coefficients for levels E and D are negative, which contradicts the findings: according to the univariate descriptives, E and D were preferred more than None.

Code

# prop.table(table(df.dce.long$choice_bin, df.dce.long$content), margin = 2) %>% 
#   kbl(digits = 3) %>% 
#   kable_styling(font_size = 8)

Code

Block.KM.1 <- df.qualtrics.adapted %>% 
  filter(BLOCK == 1) %>% 
  dplyr::select(ID, BLOCK,S_1, S_2, T1, Q1.1:Q1.15, 
                FABQ_PA_1: FABQ_PA_5, FABQ_W_1: FABQ_W_11,
                Sex, Age_1, Weight_1, Height_1, Franchise, Difficulty_survey_1)
         
names(Block.KM.1) <- c("ID", "BLOCK", "S1", "S2", "T1", "q1", "q2", "q3", "q4", "q5", "q6",
                    "q7", "q8", "q9", "q10", "q11", "q12", "q13", "q14", "q15", 
                    "FABQ_PA_1", "FABQ_PA_2", "FABQ_PA_3", "FABQ_PA_4", "FABQ_PA_5",
                    "FABQ_W_1", "FABQ_W_2", "FABQ_W_3", "FABQ_W_4", "FABQ_W_5", "FABQ_W_6",
                    "FABQ_W_7", "FABQ_W_8", "FABQ_W_9", "FABQ_W_10", "FABQ_W_11", 
                    "Sex", "Age", "Weight", "Height", "Franchise", "Survey_difficulty")

Block.KM.2 <- df.qualtrics.adapted %>% 
  filter(BLOCK == 2) %>% 
  dplyr::select(ID, BLOCK,S_1, S_2, T1, Q2.1:Q2.15, 
                FABQ_PA_1: FABQ_PA_5, FABQ_W_1: FABQ_W_11,
                Sex, Age_1, Weight_1, Height_1, Franchise, Difficulty_survey_1)
         
names(Block.KM.2) <- c("ID", "BLOCK", "S1", "S2", "T1", "q1", "q2", "q3", "q4", "q5", "q6",
                    "q7", "q8", "q9", "q10", "q11", "q12", "q13", "q14", "q15", 
                    "FABQ_PA_1", "FABQ_PA_2", "FABQ_PA_3", "FABQ_PA_4", "FABQ_PA_5",
                    "FABQ_W_1", "FABQ_W_2", "FABQ_W_3", "FABQ_W_4", "FABQ_W_5", "FABQ_W_6",
                    "FABQ_W_7", "FABQ_W_8", "FABQ_W_9", "FABQ_W_10", "FABQ_W_11", 
                    "Sex", "Age", "Weight", "Height", "Franchise", "Survey_difficulty")

df.qualtrics.KM <- rbind(Block.KM.1, Block.KM.2)

Code

library(dplyr)
library(tidyr)
library(survival)
library(ggplot2)

# Select variables
df.km <- df.qualtrics.KM %>% 
  dplyr::select(ID, S1:Survey_difficulty) %>% 
  mutate(across(S1:Survey_difficulty, as.character))

step_names <- names(df.km)[-1]  # drop ID

# Develop dataset for KM survival model
df.km.long <- df.km %>%
  pivot_longer(cols = -ID, names_to = "step", values_to = "value") %>%
  group_by(ID) %>%
  mutate(step_number = row_number(),
         is_na = is.na(value)) %>%
  summarise(
    time = ifelse(any(is_na), min(step_number[is_na]), max(step_number)),
    event = ifelse(any(is_na), 1, 0)
  )

df.km.long <- df.km.long %>%
  bind_rows(
    tibble(ID = unique(df.km.long$ID), time = 0, event = 0)
  ) %>%
  arrange(ID, time)

# Fit KM survival model
km.fit <- survfit(Surv(time, event) ~ 1, data = df.km.long)

# Convert survfit to data frame for ggplot
df.km <- data.frame(
  time = km.fit$time,
  surv = km.fit$surv
)

# Plot using ggplot2
library(ggpubr)

ggplot.km <- ggplot(df.km, aes(x = time, y = surv)) +
  geom_step(linewidth = 1.2, color = "#2C3E50") +        # thicker line, nice navy color
  scale_x_continuous(
    breaks = 1:length(step_names),
    labels = step_names
  ) +
  scale_y_continuous(
    limits = c(0, 1),
    breaks = seq(0, 1, by = 0.1)
  ) +
  labs(
    x = "Survey Step",
    y = "Probability of Continuing Participation",
    title = "Survival of Participants Across Survey Steps"
  ) +
  theme_pubr(base_size = 14, base_family = "Times New Roman") +  # ggpubr theme
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    axis.title.x = element_text(face = "bold"),
    axis.title.y = element_text(face = "bold"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5)
  )

ggplot.km

Code

ggsave(filename = "../figures/plot_kaplan_meier.png", plot = ggplot.km, width = 10, height = 8, dpi = 500)