Modeling Preferences for Lifestyle Interventions in Individuals with Overweight or Obesity and Chronic Low Back Pain

A Discrete Choice Experiment Using Random Utility Models

Author
Affiliations

Alexander Schurz

Bern University of Applied Sciences, Switzerland

Vrije Universiteit Brussels, Belgium

Bern University, Switzerland

Published

February 23, 2026

Code
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
library(kableExtra)
library(tidyverse)
library(pakret)

1 Study aim

The primary aim of this study is to conduct a DCE to evaluate the preferences of individuals with overweight or obesity and CLBP, and to determine the relative importance of key attributes characterising lifestyle interventions tailored to this population.

2 Methods

Each alternative is described in terms of \(p\) attributes, where each attribute has \(l\) different levels. A literature review was conducted to identify pertinent attributes of lifestyle interventions in this area. The number of attributes in a DCE is limited to avoid cognitive overload, as too many attributes can complicate participants’ decision-making. As a result, the number of attributes was refined, following consultations with experts in CLBP care (three healthcare researchers, two physiotherapists, two general practitioners, and one psychologist) as well as two CLBP patients. To facilitate the scalability of the pain reduction attribute, predetermined values corresponding to the minimal clinically important difference (MCID) for numeric rating scales in individuals with low back pain were used (Lauridsen et al. 2006). The finalized set of attributes and their corresponding levels underwent external validation by health professionals and patients to ensure the credibility and relevance of the design.

Attribute Variable Levels Levels (R)
Type of realization type On-site, individual session ffs
On-site, group session ffg
Videocalls vc
Online, time & place independent oti
Content of program content Dietary intervention D
Exercise program E
Dietary and exercise program ED
Weekly time effort effort 2 h 2
3 h 3
4 h 4
5 h 5
Expected weight loss weight 5 kg 5
10 kg 10
15 kg 15
20 kg 20
Pain reduction pain 10% 10
20% 20
30% 30
40% 40
50% 50
Treatment costs cost 500 CHF 500
1000 CHF 1000
1500 CHF 1500
2000 CHF 2000

Additionally, we have the following person-specific attributes:

Code
Realization.type <- factor(
  c("face-to-face_single", "face-to-face_group", "videocalls", "online-time & place independent"),
  levels = c("face-to-face_single", "face-to-face_group", "videocalls", "online-time & place independent")
  )

Content <- factor(
  c("D+E", "Dietary intervention", "Exercise program"), 
  levels = c("D+E", "Dietary intervention", "Exercise program")
  )

Average.effort.per.week <- seq(2, 5, by = 1)

Weight.loss <- seq(-5,-20, by = -5)

Pain.reduction <- seq(-10,-50, by = -10)

Total.costs <- seq(500, 2000, by = 500)

Attributes <- list(Realization.type, Content, Average.effort.per.week, Weight.loss, Pain.reduction, Total.costs)

Levels <- c(4, 3, 4, 4, 5, 4)

names(Attributes) <- c("Realization.type", "Content", "Average.effort.per.week", "Weight.loss", "Pain.reduction", "Total.costs")

A full factorial design was initially created. There are \(\prod_{q = 1}^{p} l_q = 3840\) possible combinations of choice sets. From the full factorial design, we created an efficient fractional design with 30 choice sets by using the Federov’s exchange algorithm (R. E. Wheeler 2004) from the R package AlgDesign. The developed unlabelled choice sets were divided into two blocks of 15 choice sets to reduce the workload of the patients. Each choice set included two unlabelled alternatives (alternative A and alternative B) describing different lifestyle interventions, as well as one opt-out option, allowing participants to choose not to participate in such a treatment (Campbell and Erdem 2019).

Code
library(AlgDesign) 
mat.fullfactorial <- gen.factorial(Levels, 
                     varNames = names(Attributes),
                     factors = "all")

list.fullfactorial.qual <- eval.design(~., mat.fullfactorial)
Code
set.seed(123) # Do not change this line!
list.fractional <- optFederov(~., mat.fullfactorial, nTrials = 30, approximate = FALSE) 
df.fractional <- data.frame(list.fractional$design)

The final choice sets, which align with the fractional design were created using the R package support.CEs (Aizaki 2012b).

The final choice sets are shown in appendix. An example of one choice set is presented in Figure 1.

Figure 1
Code
library(support.CEs)
library(apollo)

if(getRversion() >= "4.3.2") RNGkind(sample.kind = "Rounding") # this is necessary to get the right version of the used questionnaire

# Perform rotation design
set.seed(123) # This must stay the same during the whole procedure!
list.design <- rotation.design(
 candidate.array = df.fractional,
 attribute.names = Attributes,
 nalternatives = 2,
 nblocks = 2,
 row.renames = FALSE,
 randomize = TRUE, # FALSE: corresponds to rotation method, if set on TRUE: mix-and-match method
 seed = 123)

Patients were eligible to complete the DCE if they were 18 years or older, classified as overweight or obese (BMI > 25), and suffered from CLBP. These eligibility criteria were assessed at the start of the online questionnaire. A dominance test was administered prior to the DCE to ensure participants’ understanding of the task.

Following the completion of the 15 choice sets, participants were asked to complete the Fear Avoidance Belief Questionnaire (FABQ) (Staerkle et al. 2004; Pfingsten et al. 2000; Monticone et al. 2020). In addition, participants provided demographic information, including sex, age, weight, height, and the level of their Swiss health insurance coverage (Franchise). The subjective difficulty of the questionnaire was also assessed.

To distribute the online survey, flyers containing QR codes were distributed to medical doctors and physiotherapists who treat patients within the relevant population, as well as to the target population directly. The survey was further disseminated across Switzerland via social media advertisements on platforms such as Instagram and Facebook.

According to de Bekker-Grob and colleagues (2015), the following rule of thumb is proposed for calculating the required sample size for a DCE:

\[ N > \frac{500 \times c}{t \times a}, \]

where:

  • \(N\) is the required sample size
  • \(c\) represents the number of analysis cells
  • \(t\) is the number of choice tasks per respondent
  • \(a\) is the number of alternatives per choice task.

To account for all two-way interactions, the number of analysis cells \(c\) should correspond to the largest product of the levels of two attributes. In our case, we have 4 levels for one attribute and 5 levels for another attribute, thus \(c = 4 \times 5 = 20\).

Thus, at least 333 respondents are needed.

3 Statistial analysis

Descriptive statistics were applied to summarize demographics and clinical characteristics of the respondents, as well as the outcomes of other patient-reported measurement scores.

We estimated logit models in R using the mlogit package. The dependent variable was binary, indicating whether an alternative was chosen (0 = not chosen, 1 = chosen) within each choice set. Choices were nested within choice sets, which in turn were nested within decision makers. The independent variables were the attributes describing the alternatives as well as decision-maker-specific attributes. More formally, a decision maker, labeled \(i\), faces a choice among \(J\) alternatives. The decision maker chooses the alternative with the highest utility \(U_{ij}\), \(j = 1, ..., J\): choose alternative \(k\) if \(U_{ik} > U_{ij} \forall j \ne k\).

We can’t observe the decision maker’s utility, but only some attributes of the alternatives, labeled \(Z_{ij} \forall j\) , and some attributes of the decision maker, labeled \(X_i\). The function \(V_{ij} = V (X_i, Z_{ij}) \forall j\) represents the observed/systematic part of \(U_{ij}\). Therefore:

\[ U_{ij} = V_{ij} + \epsilon_{ij}. \]

\(\epsilon_{ij}\) is the stochastic part, treated as random. The random vector \(\epsilon_i = (\epsilon_{i1}, \epsilon_{i2}, \dots, \epsilon_{iJ})\) has a joint density function \(f(\epsilon_i)\).

We modeled \(V_{ij}\) as a linear function of \(X_i\) and \(Z_{ij}\):

\[ V_{ij} = \beta^T X_i + \gamma^T Z_{ij}, \] where \(\beta\) and \(\gamma\) are generic parameter vectors representing the effects of the individual-specific (\(X_i\)) and alternative-specific (\(Z_ij\)) variables, respectively. The parameters are generic and not alternative specific, because we have an unlabelled DCE.

Note that we model \(V_{ij}\) and not the probability of choosing an alternative. Thus, only the difference of \(V_{ik}-V_{ij}\) is relevant.

The probability that decision maker \(i\) chooses alternative \(k\) is: \[\begin{align*} P_{ik} &= \Pr(U_{ik} > U_{ij} \; \forall j \ne k) \\ &= \Pr(V_{ik} + \epsilon_{ik} > V_{ij} + \epsilon_{ij} \; \forall j \ne k) \\ &= \Pr(\epsilon_{ij} - \epsilon_{ik} < V_{ik} - V_{ij} \; \forall j \ne k). \end{align*}\]

Using the density \(f(\epsilon_{i})\), the cumulative probability that each random term \(\epsilon_{ik} − \epsilon_{ij}\) is below the observed quantity \(V_{ik} − V_{ij}\) is:

\[ \int I[\epsilon_{ik} − \epsilon_{ij} < V_{ik} − V_{ij}] f(\epsilon_i)d\epsilon_i \]

where \(I(·)\) is the indicator function equal to 1 when the expression in parentheses is true and 0 otherwise. It turns out that the solution to this integral is:

\[ P_{ij} = \frac{e^{V_{ij}}}{\Sigma_{k = 1}^J e^{V_{ik}}} = \frac{e^{\beta^TX_{i}+\gamma^TZ_{ij}}}{\Sigma_{k = 1}^J \ e^{\beta^TX_{i} + \gamma^TZ_{ik}}} \]

where:

  • \(P_{ij}\) is the probability that person \(i\) chooses alternative \(j\).
  • \(X_i\) are the attributes of person \(i\).
  • \(Z_{ij}\) are the attributes of alternative \(j\) for person \(i\).
  • \(k\) in the denominator sums over all alternatives (from 1 to \(J\)).

Assumption: \(\epsilon_{ij} \sim\) i.i.d. type I extreme value (Gumbel) with \(Var(\epsilon_{ij}) = \frac{\pi^2}{6}\).

Depending on the type of variables included in the model, different names are used to describe the resulting logit model. There is no universally consistent nomenclature. However, a common convention is:

  • When \(V_{ij} = \beta^TX_{i}\) \(\to\) multinomial logit model.

  • When \(V_{ij} = \gamma^TZ_{ij}\), \(\to\) conditional logit model.

  • When \(V_{ij} = \beta^TX_{i}+\gamma^TZ_{ij}\), \(\to\) mixed logit model.

Note that sometimes mixed logit models refer to models with random coefficients.

An additional (reference) level None for the two categorical alternative-specific attributes type and content was introduced. For the quantitative attributes effort, pain, weight and cost, the value 0 was used. This creates a meaningful opt-out option, but with the disadvantage of estimating two more coefficients. This way, the estimates are the effects (\(\log(Odds~ratios)\)), relative to opt-out (effort = 0, pain = 0, weight = 0, cost = 0, content = None). In other words:

\[ V_{opt-out} = intercept = \gamma_0. \]

We are interested in whether fear-avoidance beliefs (FABQ_W and FABQ_PA) interact with the weekly effort required for therapy. Since this is an unlabelled DCE, modeling FABQ_W and FABQ_PA as person-specific variables using the mlogit() function is not appropriate: this would estimate alternative-specific coefficients \(\beta\), which are not meaningful when alternatives are unlabelled.

Therefore, to investigate preference heterogeneity in a meaningful way, we included generic interaction terms between effort and the fear-avoidance variables, i.e., \(effort \times FABQ_W\) and \(effort \times FABQ_{PA}\).

Furthermore, we are interested in the interaction \(cost \times franchise\).

While our aim was to model all alternative-specific attributes, the current sample size is not sufficient to estimate all parameters. Therefore, I excluded the variable content from the model, as it is of comparatively lower practical relevance for the research question.

We model the observable part of utility \(V_{ij}\) as a linear function of alternative-specific attributes \(Z_{ij}\) and interactions between the individual-specific variable FABQ_W and FABQ_PA and effort:

\[ \begin{aligned} V_{ij} =\ & \gamma_1 \times effort_{ij} + \gamma_2 \times pain_{ij} + \gamma_3 \times weight_{ij} + \gamma_4 \times cost_{ij} \\ & + \gamma_5 \times type_{oti,i} + \gamma_6 \times type_{vc,i} + \gamma_7 \times type_{ffg,i} + \gamma_8 \times type_{ffs,i} \\ & + \delta_1 \times (effort_{ij} \times FABQ_{PA,i}) + \delta_2 \times (effort_{ij} \times FABQ_{W,i}) + \delta_3 \times (cost_{ij} \times Franchise_i) \end{aligned} \]

where:

  • \(\gamma_1, \dots, \gamma_8\) are the generic coefficients for the alternative-specific attributes (type was dummy coded)
  • \(\delta_1, \dots, \delta_3\) capture the interaction effects of individual-specific variables \(FABQ_{PA}\) and \(FABQ_W\) with the attribute effort and the individual-specific variable Franchise with the attribute cost.

3.1 Main effects

In a linear model, coefficients represent the marginal effects of explanatory variables on the dependent variable. This direct interpretation does not hold in the case of a multinomial logit model. Nevertheless, useful insights can be derived through appropriate transformations of the coefficients. For variables that are specific to an alternative, the sign of the coefficient can still be directly interpreted. The marginal effect in this context is calculated by multiplying the coefficient by the product of two probabilities, which has a maximum value of 0.25. As a rule of thumb, dividing the coefficient by four provides an upper bound estimate of the marginal effect (Croissant 2020b). However, marginal utilities (i.e., the raw coefficients themselves) cannot be directly interpreted in isolation because they’re on a latent utility scale, meaning they don’t have a natural unit or reference point. What can be interpreted are ratios of these coefficients, which give us the marginal rate of substitution (MRS) between attributes:

\[ MRS = \frac{\left| \gamma_1 \right|}{\left| \gamma_2 \right|}. \]

In other words, the MRS tells us how much of one attribute a person is willing to give up to gain more of another attribute, while keeping their overall utility constant. For example, in the present DCE, the MRS between effort and pain would tell us how much time effort someone is willing to invest for an additional unit (%) of pain reduction. In our DCE, we were interested in the MRS between effort and pain, effort and weight, cost and pain as well as between cost and weight.

The range of the marginal effects of each attribute can be used to assess its relative importance. For each attribute, the range is the difference between its maximum and minimum marginal effects across all choices. The relative importance of each attribute can be calculated by normalizing the range of the attribute’s coefficient relative to the sum of the ranges for all attributes. This gives the proportion of the total effect attributable to each attribute. The relative importance of attribute \(j\) is given by:

\[ \text{Relative Importance of Attribute } j = \frac{\left| (\max(Z_{ij}) - \min(Z_{ij})) \cdot \gamma_j \right|}{\sum_{j=1}^{J} \left| (\max(Z_{ij}) - \min(Z_{ij})) \cdot \gamma_j \right|} \]

Thus, the relative importance represents the share of the total variation in utility that is explained by each attribute. Relative importance was determined from a model containing alternative-specific attributes only.

The basic bootstrap confidence interval for \(MRS\) and the relative importance is given by:

\[ \left[ 2\hat{\theta}_n - q^*_{1 - \alpha/2}, \; 2\hat{\theta}_n - q^*_{\alpha/2} \right] \]

Where: - \(\hat{\theta}_n\) is the observed estimate from the original sample. - $ q^*_{} $ is the \(\alpha\)-quantile of the bootstrap distribution of the estimate. - $ $ is the significance level

Specific contrasts were tested for the categorical variable type. Let the coefficients for type levels be represented as:

  • \(\gamma_{\text{oti}}\) for oti
  • \(\gamma_{\text{vc}}\) for vc
  • \(\gamma_{\text{ffg}}\) for ffg
  • \(\gamma_{\text{ffs}}\) for ffs

To test whether the alternatives oti and vc jointly differ from ffg and ffs (i.e. online versus face-to-face), we specified the following contrast:

\[ \frac{1}{2} \left( \gamma_{\text{oti}} + \gamma_{\text{vc}} \right) - \frac{1}{2} \left( \gamma_{\text{ffg}} + \gamma_{\text{ffs}} \right). \]

This can be interpreted as comparing the average utility of choosing either oti or vc to that of choosing ffg or ffs.

Furthermore, pairwise comparisons for any two alternatives (e.g., oti and vc) were performed, which simply is the difference between their corresponding coefficients, for example \(\gamma_{\text{oti}} - \gamma_{\text{vc}}\).

3.2 R functions

R version 4.4.2 (R Core Team 2025), together with the following R packages was used:

4 Results

Code
df.qualtrics.raw <- rio::import("../data/DCE_data.csv")

# remove the first two rows (tries by the author)
df.qualtrics.adapted <- df.qualtrics.raw[-c(1:2), ]
Code
library(tableone)
library(officer)


# correct data types
df.qualtrics.adapted <- df.qualtrics.adapted %>%
  mutate_at(vars(S_1:Difficulty_survey_1), as.integer) %>% 
  mutate(Progress=as.numeric(Progress)) %>% 
  mutate_at(vars(Age_1:Height_1),as.numeric) %>% 
  mutate(ID = 1:nrow(df.qualtrics.adapted))

# Rename variables 
# Screening questionnaires S_1 & S_2--> applicable/not applicable
# Sex --> male/female/not binary/other/not specified
# Franchise --> 300/500/1000/1500/2000/2500CHF
df.qualtrics.adapted <- df.qualtrics.adapted %>%
  mutate(S_1 = ifelse(S_1 == 1, "overweight", "normal weight"),
         S_2 = ifelse(S_2 == 1, "CLBP", "no CLBP"),
         Sex = case_when(
           Sex == 1 ~ "Male",
           Sex == 2 ~ "Female",
           Sex == 3 ~ "not binary",
           Sex == 4 ~ "other",
           Sex == 5 ~ "not specified"
         ),
         Sex_bin = case_when(
           Sex == "Male" ~ "Male",
           Sex == "Female" ~ "Female",
           Sex == "not binary" ~ NA,
           Sex == "other" ~ NA,
           Sex == "not specified" ~ NA
         ),
         Franchise = case_when(
           Franchise == 1 ~ 300,
           Franchise == 2 ~ 500,
           Franchise == 3 ~ 1000,
           Franchise == 4 ~ 1500,
           Franchise == 5 ~ 2000,
           Franchise == 6 ~ 2500
         )
  )

# Calculation of outcomes from FABQ for the 2 subcategories "physical activity" (PA) and "workload" (W)
# Physical activity subscore = sum of questions 2-5
# Work subscore = sum of questions 6, 7, 9, 10, 11, 12, 15 = FABQ_W_1,2,4,5,6,7,10
df.qualtrics.adapted$FABQ_PA <- df.qualtrics.adapted$FABQ_PA_2 + df.qualtrics.adapted$FABQ_PA_3 + 
  df.qualtrics.adapted$FABQ_PA_4 + df.qualtrics.adapted$FABQ_PA_5

df.qualtrics.adapted$FABQ_W <- df.qualtrics.adapted$FABQ_W_1 + 
  df.qualtrics.adapted$FABQ_W_2 +df.qualtrics.adapted$FABQ_W_4 +
  df.qualtrics.adapted$FABQ_W_5 + df.qualtrics.adapted$FABQ_W_6 + df.qualtrics.adapted$FABQ_W_7 +
  df.qualtrics.adapted$FABQ_W_10

# Add a variable called "BLOCK" 
# All questions with NA values are now allocated to BLOCK 2. This causes the imbalance between BLOCK 1 and 2
df.qualtrics.adapted <- df.qualtrics.adapted %>%
  mutate(BLOCK = ifelse(is.na(Q1.1),2,1))

# Add a variable called BMI based on Weight_1 and Height_1
df.qualtrics.adapted <- df.qualtrics.adapted %>% 
  mutate(BMI = round(Weight_1 / ((Height_1 / 100)^2), 1))

# A variable called "Dominance_test" is added for the variable T1 
df.qualtrics.adapted$Dominance_test <- df.qualtrics.adapted$`T1`

# Responses for "Dominance_test" are adapted to failed (=1) and passed (=2 or 3)
df.qualtrics.adapted$Dominance_test <- ifelse(
  df.qualtrics.adapted$Dominance_test == 1, "failed",
  ifelse(df.qualtrics.adapted$Dominance_test == 2, "passed",
         ifelse(df.qualtrics.adapted$Dominance_test == 3, "passed", NA)))

Block.1 <- df.qualtrics.adapted %>% 
  filter(BLOCK == 1) %>% 
  dplyr::select(ID,BLOCK,Q1.1:Q1.15, 
         Age_1, Sex, Sex_bin, BMI, Franchise, FABQ_PA, FABQ_W) # predictors: Age, Sex, BMI, Franchise, FABQ_PA, FABQ_W
names(Block.1) <- c("ID", "BLOCK", "q1", "q2", "q3", "q4", "q5", "q6",
                    "q7", "q8", "q9", "q10", "q11", "q12", "q13", "q14", "q15", 
                    "age", "sex", "sex_bin", "BMI", "Franchise", "FABQ_PA", "FABQ_W")

Block.2 <- df.qualtrics.adapted %>% 
  filter(BLOCK == 2) %>% 
  dplyr::select(ID,BLOCK,Q2.1:Q2.15, 
         Age_1, Sex, Sex_bin, BMI, Franchise, FABQ_PA, FABQ_W) # predictors: Age, Sex, BMI, Franchise, FABQ_PA, FABQ_W
names(Block.2) <- c("ID", "BLOCK",
                    "q1", "q2", "q3", "q4", "q5", "q6", "q7", "q8",
                    "q9", "q10", "q11", "q12", "q13", "q14", "q15",
                    "age", "sex", "sex_bin", "BMI", "Franchise", "FABQ_PA", "FABQ_W")

df.qualtrics.prep <- rbind(Block.1, Block.2)
Code
df.qualtrics.prep.long <- df.qualtrics.prep %>% 
  pivot_longer(starts_with("q"), values_to = "choice", names_to = "question") %>% 
  mutate(QES = parse_number(question))


df.design.dce <- rbind(list.design$alternatives$alt.1, list.design$alternatives$alt.2) %>% 
  pivot_wider(names_from = ALT, values_from = c(Realization.type, Content, Average.effort.per.week, Weight.loss, Pain.reduction, Total.costs)) %>% 
  mutate(Realization.type_3 = "None",
         Content_3 = "None",
         Average.effort.per.week_3 = 0,
         Weight.loss_3 = 0,
         Pain.reduction_3 = 0,
         Total.costs_3 = 0) %>% 
  mutate(across(Average.effort.per.week_1:Total.costs_2, ~ parse_number(as.character(as.factor(.))))) %>% 
  mutate(across(Realization.type_1:Realization.type_2, ~ fct_recode(as.factor(.),
                                  "ffs" = "face-to-face_single",
                                  "ffg" = "face-to-face_group",
                                  "vc" = "videocalls",
                                  "oti" = "online-time & place independent"))) %>% 
  mutate(across(Content_1:Content_2, ~ fct_recode(as.factor(.),
                                  "ED" = "D+E",
                                  "E" = "Exercise program",
                                  "D" = "Dietary intervention"))) %>% 
  dplyr::select(BLOCK, QES, sort(setdiff(names(.), c("BLOCK", "ALT"))))


df.dce.wide <- left_join(df.qualtrics.prep.long, df.design.dce, by = c("BLOCK", "QES")) %>% 
  rename(effort1 = Average.effort.per.week_1,
         effort2 = Average.effort.per.week_2,
         effort3 = Average.effort.per.week_3,
         content1 = Content_1,
         content2 = Content_2,
         content3 = Content_3,
         pain1 = Pain.reduction_1,
         pain2 = Pain.reduction_2,
         pain3 = Pain.reduction_3,
         type1 = Realization.type_1,
         type2 = Realization.type_2,
         type3 = Realization.type_3,
         cost1 = Total.costs_1,
         cost2 = Total.costs_2,
         cost3 = Total.costs_3,
         weight1 = Weight.loss_1,
         weight2 = Weight.loss_2,
         weight3 = Weight.loss_3) %>% 
  mutate(across(starts_with("pain"), ~ .x * -1)) %>% 
  mutate(across(starts_with("weight"), ~ .x * -1))

save(df.dce.wide, file = "../data/df-dce-wide.rda")



df.dce.long <- df.dce.wide %>% 
  pivot_longer(cols = effort1:weight3, 
               names_to = c(".value", "ALT"),  
               names_pattern = "([a-zA-Z]+)(\\d)") %>% 
  mutate(
    choice_bin = ifelse(choice == ALT, 1, 0),
    content = as.character(content),
    content = if_else(content == "ED", "D+E", content),
    content = factor(content, levels = c("None", "D", "E", "D+E")),
    type = factor(type, levels = c("None", "oti", "vc", "ffg", "ffs")),
    TC = factor(paste(type, content, sep = ":")),
    STR = 100 * ID + QES) %>% 
  relocate(TC, .after = type)%>% 
  dplyr::select(-type, -content) 

save(df.dce.long, file = "../data/df-dce-long.rda")

# DCE long without TC but content + type separated
df.dce.long.noTC <- df.dce.wide %>% 
  pivot_longer(cols = effort1:weight3, 
               names_to = c(".value", "ALT"),  
               names_pattern = "([a-zA-Z]+)(\\d)") %>% 
  mutate(
    choice_bin = ifelse(choice == ALT, 1, 0),
    content = as.character(content),
    content = if_else(content == "ED", "D+E", content),
    content = factor(content, levels = c("None", "D", "E", "D+E")),
    type = factor(type, levels = c("None", "oti", "vc", "ffg", "ffs")),
    TC = factor(paste(type, content, sep = ":")),
    STR = 100 * ID + QES) %>% 
  relocate(TC, .after = type) %>% 
  dplyr::select(-TC) 

save(df.dce.long.noTC, file = "../data/df-dce-long-noTC.rda")
Code
load("../data/df-dce-long.rda")

library(mlogit)
df.mlogit <- dfidx::dfidx(df.dce.long, idx = list("STR", "ALT"), drop.index = FALSE, pkg = "mlogit")
df.mlogit$TC <- relevel(as.factor(df.mlogit$TC), ref = "None:None")
df.mlogit <- df.mlogit %>% 
  arrange(ID)
Code
library(table1)
# Categorize the variable Progress based on a cutoff of 80 and rename it
# 80% completion of the survey would mean that all choice sets were answered but not further questions (demographics, FABQ)
df.qualtrics.adapted$Progress_cutoff <- ifelse(df.qualtrics.adapted$Progress>=80, ">=80", "<80")


# Define variables of interest
vars.demographics.DCE <- c("Sex", "Age_1", "Height_1", "Weight_1", "BMI", "Progress_cutoff", "Dominance_test", "Franchise", "FABQ_PA", "FABQ_W", "Difficulty_survey_1", "Duration (min)")

df.qualtrics.adapted <- df.qualtrics.adapted %>%
  mutate(across(where(is.factor), droplevels)) # Entfernt ungenutzte Faktorlevel

# Format labels for table1
label(df.qualtrics.adapted$Sex_bin) <- "Sex"
label(df.qualtrics.adapted$Age_1) <- "Age (years)"
label(df.qualtrics.adapted$Height_1) <- "Height (cm)"
label(df.qualtrics.adapted$Weight_1) <- "Weight (kg)"
label(df.qualtrics.adapted$BMI) <- "Body Mass Index"
label(df.qualtrics.adapted$Progress_cutoff) <- "Progress cutoff (%)"
label(df.qualtrics.adapted$Franchise) <- "Franchise (CHF)"
label(df.qualtrics.adapted$FABQ_PA) <- "FABQ - Physical activity"
label(df.qualtrics.adapted$FABQ_W) <- "FABQ - Workload"
label(df.qualtrics.adapted$Difficulty_survey_1) <- "Survey difficulty (NAS 0-10)"
label(df.qualtrics.adapted$`Duration (in seconds)`) <- "Duration (sec)"

# Franchise should be reported as a factor in the demographics
df.qualtrics.adapted.2 <- df.qualtrics.adapted %>%
  mutate(
    Franchise = factor(Franchise,
                       levels = c(300, 500, 1000, 1500, 2000, 2500),
                       labels = c("300", "500", "1000", "1500", "2000", "2500")),
    Duration_min = round(`Duration (in seconds)`/60,2)
  ) %>%
  drop_na(Franchise) %>% 
  filter(BMI >= 25.0) %>% 
  labelled::set_variable_labels(
    Sex_bin = "Sex",
    Age_1 = "Age (years)",
    Height_1 = "Height (cm)",
    Weight_1 = "Weight (kg)",
    BMI = "Body Mass Index",
    Progress_cutoff = "Progress cutoff (%)",
    Franchise = "Franchise (CHF)",
    FABQ_PA = "FABQ - Physical activity",
    FABQ_W = "FABQ - Workload",
    Difficulty_survey_1 = "Survey difficulty (NAS 0–10)",
    Duration_min = "Duration (min)"
  )

# Create summary table stratified by BMI category
tbl.DCE.demographics <- table1(~ Sex_bin + `Age_1`+ `BMI`+ `Progress_cutoff`+ `Franchise`+ `FABQ_PA`+ `FABQ_W` + `Difficulty_survey_1` + `Duration_min`,
       data = df.qualtrics.adapted.2, 
       overall = "Overall", 
       droplevels = FALSE)

df.tab1 <- as.data.frame(tbl.DCE.demographics)


library(gtsummary)
library(gt)
library(officer)

# Create a gtsummary table
tbl1.DCE <- df.qualtrics.adapted.2 %>%
  drop_na(Franchise) %>%
  filter(BMI >= 25.0) %>% 
  mutate(
    `BMI category` = case_when(
      BMI >= 25 & BMI < 30 ~ "Overweight",
      BMI >= 30 ~ "Obesity",
      TRUE ~ NA),
    `BMI category` = factor(`BMI category`, levels = c("Overweight", "Obesity"))) %>% 
  dplyr::select(Sex_bin, Age_1, BMI, `BMI category`, `Franchise`, FABQ_PA, FABQ_W) %>%
  tbl_summary(
    statistic = list(all_continuous() ~ "{mean} ± {sd}"),
    digits = all_continuous() ~ 1,
    missing = "no"
  ) %>%
  bold_labels() %>%
  as_flex_table() %>%
  flextable::set_table_properties(width = 1, layout = "autofit") %>%
  flextable::padding(padding.top = 1, padding.bottom = 1) %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 

tbl1.DCE 

Characteristic

N = 1591

Sex

Female

96 (60%)

Male

63 (40%)

Age (years)

45.2 ± 12.6

Body Mass Index

31.8 ± 5.2

BMI category

Overweight

68 (43%)

Obesity

91 (57%)

Franchise (CHF)

300

83 (52%)

500

20 (13%)

1000

7 (4.4%)

1500

6 (3.8%)

2000

2 (1.3%)

2500

41 (26%)

FABQ - Physical activity

12.8 ± 5.6

FABQ - Workload

19.4 ± 10.1

1n (%); Mean ± SD

Code
tbl1.DCE %>%
  flextable::save_as_docx(path = "../tables/tbl1.characteristics.docx")
Code
library(gridExtra)

make_plot_data <- function(var, var_name) {
  df <- as.data.frame(prop.table(table(df.mlogit$choice_bin, var), margin = 2))
  colnames(df) <- c("choice", "level", "freq")
  df$variable <- var_name
  return(df)
}

df.typecontent    <- make_plot_data(df.mlogit$TC,    "Type:Content")
df.cost    <- make_plot_data(df.mlogit$cost,    "Cost")
df.effort  <- make_plot_data(df.mlogit$effort,  "Effort")
df.pain    <- make_plot_data(df.mlogit$pain,    "Pain")
df.weight  <- make_plot_data(df.mlogit$weight,  "Weight")


df.plot.descr <- bind_rows(df.typecontent, df.cost, df.effort, df.pain, df.weight)

level_orders <- list(
  "Weight loss" = c(0, 5, 10, 15, 20))

plots <- df.plot.descr %>%
  group_split(variable) %>%
  lapply(function(df) {
    var_name <- unique(df$variable)
    if (var_name %in% names(level_orders)) {
      df$level <- factor(as.character(df$level), levels = level_orders[[var_name]])
    }

    ggplot(df, aes(x = level, y = freq, fill = choice)) +
      geom_bar(stat = "identity", position = "dodge") +
      labs(title = var_name, x = NULL, y = "Rel. frequency", size = 8) +
      theme_minimal() +
      scale_fill_grey() +
      theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8),
            plot.title = element_text(size = 9),
            axis.title.y = element_text(size = 8),
            legend.title = element_text(size = 8))
  })

# export ggsave
library(ggplot2)
library(gridExtra)
library(grid)
combined_plot <- grid.arrange(grobs = plots, ncol = 2, nrow = 3, widht = 8, height = 8)
ggsave(filename = "../figures/plot_grid.png", plot = combined_plot, width = 10, height = 8)
Figure 2: Univariate relationships between attribute levels and choices (TC).
Code
load("../data/df-dce-long-noTC.rda")

library(mlogit)
df.mlogit.noTC <- dfidx::dfidx(df.dce.long.noTC, idx = list("STR", "ALT"), drop.index = FALSE, pkg = "mlogit")

library(gridExtra)

make_plot_data <- function(var, var_name) {
  df <- as.data.frame(prop.table(table(df.mlogit.noTC$choice_bin, var), margin = 2))
  colnames(df) <- c("choice", "level", "freq")
  df$variable <- var_name
  return(df)
}

df.type    <- make_plot_data(df.mlogit.noTC$type,    "Type")
df.content   <- make_plot_data(df.mlogit.noTC$content,    "Content")
df.cost    <- make_plot_data(df.mlogit.noTC$cost,    "Cost")
df.effort  <- make_plot_data(df.mlogit.noTC$effort,  "Effort")
df.pain    <- make_plot_data(df.mlogit.noTC$pain,    "Pain")
df.weight  <- make_plot_data(df.mlogit.noTC$weight,  "Weight")


df.plot.descr.noTC <- bind_rows(df.type, df.content, df.cost, df.effort, df.pain, df.weight)

level_orders.noTC <- list(
  "Weight loss" = c(0, 5, 10, 15, 20))

plots.noTC <- df.plot.descr.noTC %>%
  group_split(variable) %>%
  lapply(function(df) {
    var_name <- unique(df$variable)
    if (var_name %in% names(level_orders)) {
      df$level <- factor(as.character(df$level), levels = level_orders[[var_name]])
    }

    ggplot(df, aes(x = level, y = freq, fill = choice)) +
      geom_bar(stat = "identity", position = "dodge") +
      labs(title = var_name, x = NULL, y = "Rel. frequency", size = 8) +
      theme_minimal() +
      scale_fill_grey() +
      theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8),
            plot.title = element_text(size = 9),
            axis.title.y = element_text(size = 8),
            legend.title = element_text(size = 8))
  })


# export ggsave
# library(ggplot2)
# library(gridExtra)
# library(grid)
# combined_plot_noTC <- grid.arrange(grobs = plots.noTC, ncol = 2, nrow = 3, widht = 8, height = 8)
# ggsave(filename = "../figures/plot_grid_noTC.png", plot = combined_plot_noTC, width = 10, height = 8)

4.1 Logit model

Code
mlogit.simple <- mlogit(choice_bin ~ TC + effort + weight + pain + cost | 0, data = df.mlogit)

library(broom)
mlogit.simple.tidy <- broom::tidy(mlogit.simple)

df.mlogit.simple <- as.data.frame(mlogit.simple.tidy) %>%
  rename(
    Term = term,
    Estimate = estimate,
    Std_Error = std.error,
    Z_value = statistic,
    p_value = p.value
  ) %>%
  mutate(Term = sub("^TC", "", Term))


# export as docx as estimate
library(flextable)
tbl.mlogit.simple.est <-  df.mlogit.simple %>% 
  mutate(across(c(Estimate, Std_Error, Z_value, p_value), ~ round(.x, 3)),
         Term = c("On-site, group session: Dietary intervention",
                  "On-site, group session: Dietary & Exercise intervention",
                  "On-site, group session: Exercise intervention",
                  "On-site, individual session: Dietary intervention",
                  "On-site, individual session: Dietary & Exercise  intervention",
                  "On-site, individual session: Exercise intervention",
                  "Online, time & place independent: Dietary intervention",
                  "Online, time & place independent: Dietary & Exercise intervention",
                  "Online, time & place independent: Exercise intervention",
                  "Videocalls: Dietary intervention",
                  "Videocalls: Dietary & Exercise  intervention",
                  "Videocalls: Exercise intervention",
                  "Effort",
                  "Weight",
                  "Pain",
                  "Cost"),
         "Standard error" = Std_Error,
         "z value" = Z_value,
         "p value" = p_value) %>%
  dplyr::select(Term, Estimate, "Standard error", "z value", "p value") %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") %>% 
  flextable::save_as_docx(path = "../tables/tbl.mlogit.simple.est.docx")

# export as docx as OR
library(flextable)
tbl.mlogit.simple.OR <-  df.mlogit.simple %>% 
  mutate(
    Odds_Ratio = round(exp(Estimate), 3),
    Std_Error = round(Std_Error, 3),
    Z_value = round(Z_value, 3),
    p_value = round(p_value, 3),
    Estimate = round(Estimate, 3),,
         Term = c("On-site, group session: Dietary intervention",
                  "On-site, group session: Dietary & Exercise intervention",
                  "On-site, group session: Exercise intervention",
                  "On-site, individual session: Dietary intervention",
                  "On-site, individual session: Dietary & Exercise  intervention",
                  "On-site, individual session: Exercise intervention",
                  "Online, time & place independent: Dietary intervention",
                  "Online, time & place independent: Dietary & Exercise intervention",
                  "Online, time & place independent: Exercise intervention",
                  "Videocalls: Dietary intervention",
                  "Videocalls: Dietary & Exercise  intervention",
                  "Videocalls: Exercise intervention",
                  "Effort",
                  "Weight",
                  "Pain",
                  "Cost"),
        "Odds ratio" = Odds_Ratio,
         "Standard error" = Std_Error,
         "z value" = Z_value,
         "p value" = p_value) %>%
  dplyr::select(Term, "Odds ratio", "Standard error", "z value", "p value") %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 

tbl.mlogit.simple.OR

Term

Odds ratio

Standard error

z value

p value

On-site, group session: Dietary intervention

3.679

0.195

6.671

0

On-site, group session: Dietary & Exercise intervention

5.456

0.195

8.682

0

On-site, group session: Exercise intervention

3.718

0.227

5.784

0

On-site, individual session: Dietary intervention

3.274

0.212

5.583

0

On-site, individual session: Dietary & Exercise intervention

6.229

0.213

8.596

0

On-site, individual session: Exercise intervention

4.253

0.198

7.293

0

Online, time & place independent: Dietary intervention

4.038

0.199

7.015

0

Online, time & place independent: Dietary & Exercise intervention

5.688

0.214

8.119

0

Online, time & place independent: Exercise intervention

3.315

0.195

6.142

0

Videocalls: Dietary intervention

2.702

0.226

4.403

0

Videocalls: Dietary & Exercise intervention

4.225

0.181

7.956

0

Videocalls: Exercise intervention

3.878

0.240

5.648

0

Effort

0.799

0.033

-6.851

0

Weight

1.050

0.006

7.823

0

Pain

1.034

0.003

12.596

0

Cost

0.999

0.000

-14.189

0

Code
tbl.mlogit.simple.OR %>% 
  flextable::save_as_docx(path = "../tables/tbl.mlogit.simple.OR.docx")

For the main model, data from 159 respondents were used, resulting in 7155 individual observations. Details on the data structure is shown in the appendix (?@sec-data-structure). The model output supports the expected effects, as illustrated in Figure 2. Specifically, higher time effort and costs are associated with decreased utility, while greater expected reductions in pain and weight are associated with increased utility. Furthermore, all four treatment delivery modes (oti, vc, ffg, ffs) significantly increase utility compared to the reference category, which is opt-out (i.e., no treatment). The main effect of type was tested by comparing the reduced model (without type) to the full model. The null hypothesis, which stated that the reduced model is valid, was rejected (\(\chi^2 = 146.61\), \(df = 12\), \(p < 0.001\)).

Code
mlogit.main <- mlogit(choice_bin ~ TC + effort + weight + pain + cost +
                        effort:FABQ_PA + effort:FABQ_W + cost:Franchise  | 
                        0, data = df.mlogit)

library(broom)
mlogit.main.tidy <- broom::tidy(mlogit.main)

df.mlogit.main <- as.data.frame(mlogit.main.tidy) %>%
    rename(
    Term = term,
    Estimate = estimate,
    Std_Error = std.error,
    Z_value = statistic,
    p_value = p.value
  ) %>%
  mutate(Term = sub("^TC", "", Term))

# export as docx as estimate
library(flextable)
tbl.mlogit.main <-  df.mlogit.main %>% 
  mutate(across(c(Estimate, Std_Error, Z_value, p_value), ~ round(.x, 3)),
         Term = c("On-site, group session: Dietary intervention",
                  "On-site, group session: Dietary & Exercise intervention",
                  "On-site, group session: Exercise intervention",
                  "On-site, individual session: Dietary intervention",
                  "On-site, individual session: Dietary & Exercise  intervention",
                  "On-site, individual session: Exercise intervention",
                  "Online, time & place independent: Dietary intervention",
                  "Online, time & place independent: Dietary & Exercise intervention",
                  "Online, time & place independent: Exercise intervention",
                  "Videocalls: Dietary intervention",
                  "Videocalls: Dietary & Exercise  intervention",
                  "Videocalls: Exercise intervention",
                  "Effort",
                  "Weight",
                  "Pain",
                  "Cost",
                  "Effort: FABQ-PA", 
                  "Effort: FABQ-W",
                  "Cost: Franchise"),
         "Standard error" = Std_Error,
         "z value" = Z_value,
         "p value" = p_value) %>%
  dplyr::select(Term, Estimate, "Standard error", "z value", "p value") %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") %>% 
  flextable::save_as_docx(path = "../tables/tbl.mlogit.main.est.docx")

# Export as docx as OR
tbl.mlogit.main.OR <-  df.mlogit.main %>% 
  mutate(
    Odds_Ratio = round(exp(Estimate), 3),
    Std_Error = round(Std_Error, 3),
    Z_value = round(Z_value, 3),
    p_value = round(p_value, 3),
    Estimate = round(Estimate, 3),
         Term = c("On-site, group session: Dietary intervention",
                  "On-site, group session: Dietary & Exercise intervention",
                  "On-site, group session: Exercise intervention",
                  "On-site, individual session: Dietary intervention",
                  "On-site, individual session: Dietary & Exercise  intervention",
                  "On-site, individual session: Exercise intervention",
                  "Online, time & place independent: Dietary intervention",
                  "Online, time & place independent: Dietary & Exercise intervention",
                  "Online, time & place independent: Exercise intervention",
                  "Videocalls: Dietary intervention",
                  "Videocalls: Dietary & Exercise  intervention",
                  "Videocalls: Exercise intervention",
                  "Effort",
                  "Weight",
                  "Pain",
                  "Cost",
                  "Effort: FABQ-PA", 
                  "Effort: FABQ-W",
                  "Cost: Franchise"),
        "Odds ratio" = Odds_Ratio,
         "Standard error" = Std_Error,
         "z value" = Z_value,
         "p value" = p_value) %>%
  dplyr::select(Term, "Odds ratio", "Standard error", "z value", "p value") %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 

tbl.mlogit.main.OR

Term

Odds ratio

Standard error

z value

p value

On-site, group session: Dietary intervention

3.858

0.202

6.691

0.000

On-site, group session: Dietary & Exercise intervention

5.830

0.205

8.607

0.000

On-site, group session: Exercise intervention

4.142

0.235

6.042

0.000

On-site, individual session: Dietary intervention

3.523

0.222

5.663

0.000

On-site, individual session: Dietary & Exercise intervention

6.323

0.222

8.295

0.000

On-site, individual session: Exercise intervention

4.320

0.207

7.053

0.000

Online, time & place independent: Dietary intervention

4.137

0.207

6.863

0.000

Online, time & place independent: Dietary & Exercise intervention

6.034

0.225

8.001

0.000

Online, time & place independent: Exercise intervention

3.584

0.205

6.238

0.000

Videocalls: Dietary intervention

2.964

0.236

4.595

0.000

Videocalls: Dietary & Exercise intervention

4.504

0.189

7.977

0.000

Videocalls: Exercise intervention

4.302

0.250

5.834

0.000

Effort

0.723

0.049

-6.641

0.000

Weight

1.051

0.006

7.638

0.000

Pain

1.035

0.003

12.329

0.000

Cost

0.999

0.000

-12.619

0.000

Effort: FABQ-PA

1.007

0.003

2.441

0.015

Effort: FABQ-W

0.999

0.002

-0.447

0.655

Cost: Franchise

1.000

0.000

1.025

0.305

Code
tbl.mlogit.main.OR %>% 
  flextable::save_as_docx(path = "../tables/tbl.mlogit.main.OR.docx")

5 Further analyses

Specific contrasts for the attribute type were examined. No significant difference in utility was found between online and face-to-face intervention delivery overall. However, in pairwise comparisons, the difference between face-to-face delivery in a single session and delivery via videocalls was statistically significant, with a preference for the face-to-face single-session format.

Code
# library(multcomp)
# 
# K <- rbind(
#   "oti - vc"   = c(0, 0, 0, 0,  1, -1,  0, 0, 0, 0, 0),
#   "oti - ffg"  = c(0, 0, 0, 0,  1,  0, -1, 0, 0, 0, 0),
#   "oti - ffs"  = c(0, 0, 0, 0,  1,  0, 0, -1, 0, 0, 0),
#   "vc - ffg"   = c(0, 0, 0, 0,  0,  1, -1, 0, 0, 0, 0),
#   "vc - ffs"   = c(0, 0, 0, 0,  0,  1, 0, -1, 0, 0, 0),
#   "ffg - ffs"  = c(0, 0, 0, 0,  0,  0, 1, -1, 0, 0, 0),
#   "online_vs_ftof" = c(0, 0, 0, 0, 0.5, 0.5, -0.5, -0.5, 0, 0, 0)
# )
# 
# glht.test <- glht(mlogit.main, linfct = K)
# 
# glht.summary <- summary(glht.test)
# 
# df.contrasts <- data.frame(
#   Estimate = glht.summary$test$coefficients,
#   Std_Error = glht.summary$test$sigma,
#   z_value = glht.summary$test$tstat,
#   p_value = glht.summary$test$pvalues
# )
# 
# df.contrasts %>%
#   kbl(digits = 3, escape = TRUE) %>%
#   kable_styling(font_size = 8)
# 
# # export as docx
# df.contrasts.exp <- df.contrasts %>%
#     rownames_to_column(var = "Contrast")
# 
# tbl.contrasts <- df.contrasts.exp %>%
#   mutate(across(c(Estimate, Std_Error, z_value, p_value), ~ round(.x, 3))) %>%
#   transmute(
#     Contrast = c(
#       "Online, time & place independent - Videocalls",
#       "Online, time & place independent - On-site, group session",
#       "Online, time & place independent - On-site individual session",
#       "Videocalls - On-site, group session",
#       "Videocalls - On-site, individual session",
#       "On-site, group session - On-site, individual session",
#       "Online (online+videocalls) - On-site (individual or group session)"
#     ),
#     Estimate,
#     `Standard error` = Std_Error,
#     `z value` = z_value,
#     `p value` = p_value
#   ) %>%
#   flextable::flextable() %>%
#   autofit() %>%
#   bold(part = "header") %>%
#   align(align = "center", part = "all") %>%
#   flextable::line_spacing(space = 1.0) %>%
#   flextable::fontsize(size = 8, part = "all") %>%
#   flextable::font(fontname = "Times New Roman", part = "all") %>%
#   flextable::save_as_docx(path = "../tables/tbl.contrasts.docx")
Code
# names of first 12 coefficients
coef_names <- sub("^TC", "", names(coef(mlogit.main))[1:12])

# all contrasts
contrasts_list <- combn(1:12, 2, simplify = FALSE)

K <- do.call(rbind, lapply(contrasts_list, function(pair) {
  k_vec <- rep(0, length(coef(mlogit.main)))
  k_vec[pair[1]] <- 1
  k_vec[pair[2]] <- -1
  names(k_vec) <- names(coef(mlogit.main))
  return(k_vec)
}))

rownames(K) <- sapply(contrasts_list, function(pair) {
  paste0(coef_names[pair[1]], " - ", coef_names[pair[2]])
})


library(multcomp)
glht.test <- glht(mlogit.main, linfct = K)
glht.summary <- summary(glht.test)

df.contrasts <- data.frame(
  Estimate = glht.summary$test$coefficients,
  Std_Error = glht.summary$test$sigma,
  z_value = glht.summary$test$tstat,
  p_value = glht.summary$test$pvalues
)

df.contrasts %>%
  kbl(
  digits = 3,
  escape = TRUE,
  format = "latex") %>%
  kable_styling(
    font_size = 8,
    latex_options = c("striped", "hold_position"),
    full_width = FALSE)
# most prefered content: D+E
# most prefered type: ffs>oti>ffg
# export as docx
df.contrasts.exp <- df.contrasts %>%
    rownames_to_column(var = "Contrast")

tbl.contrasts <- df.contrasts.exp %>%
  mutate(across(c(Estimate, Std_Error, z_value, p_value), ~ round(.x, 3))) %>%
  flextable::flextable() %>%
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 

tbl.contrasts
tbl.contrasts %>%
  flextable::save_as_docx(path = "../tables/tbl.contrasts.docx")
Table 1: Tested contrases for the attribute type.

Contrast

Estimate

Std_Error

z_value

p_value

ffg:D - ffg:D+E

-0.413

0.153

-2.700

0.217

ffg:D - ffg:E

-0.071

0.218

-0.326

1.000

ffg:D - ffs:D

0.091

0.205

0.442

1.000

ffg:D - ffs:D+E

-0.494

0.174

-2.844

0.154

ffg:D - ffs:E

-0.113

0.167

-0.680

1.000

ffg:D - oti:D

-0.070

0.158

-0.443

1.000

ffg:D - oti:D+E

-0.447

0.211

-2.117

0.595

ffg:D - oti:E

0.073

0.171

0.430

1.000

ffg:D - vc:D

0.264

0.180

1.462

0.946

ffg:D - vc:D+E

-0.155

0.152

-1.018

0.997

ffg:D - vc:E

-0.109

0.205

-0.532

1.000

ffg:D+E - ffg:E

0.342

0.197

1.738

0.841

ffg:D+E - ffs:D

0.504

0.192

2.618

0.259

ffg:D+E - ffs:D+E

-0.081

0.165

-0.492

1.000

ffg:D+E - ffs:E

0.300

0.156

1.920

0.734

ffg:D+E - oti:D

0.343

0.164

2.094

0.611

ffg:D+E - oti:D+E

-0.034

0.209

-0.165

1.000

ffg:D+E - oti:E

0.486

0.152

3.199

0.058

ffg:D+E - vc:D

0.677

0.173

3.918

0.005

ffg:D+E - vc:D+E

0.258

0.165

1.566

0.914

ffg:D+E - vc:E

0.304

0.190

1.599

0.902

ffg:E - ffs:D

0.162

0.203

0.797

1.000

ffg:E - ffs:D+E

-0.423

0.181

-2.336

0.435

ffg:E - ffs:E

-0.042

0.195

-0.215

1.000

ffg:E - oti:D

0.001

0.195

0.007

1.000

ffg:E - oti:D+E

-0.376

0.228

-1.650

0.882

ffg:E - oti:E

0.145

0.168

0.861

0.999

ffg:E - vc:D

0.335

0.203

1.647

0.883

ffg:E - vc:D+E

-0.084

0.197

-0.425

1.000

ffg:E - vc:E

-0.038

0.188

-0.201

1.000

ffs:D - ffs:D+E

-0.585

0.166

-3.517

0.021

ffs:D - ffs:E

-0.204

0.174

-1.174

0.990

ffs:D - oti:D

-0.160

0.177

-0.908

0.999

ffs:D - oti:D+E

-0.538

0.194

-2.779

0.181

ffs:D - oti:E

-0.017

0.184

-0.094

1.000

ffs:D - vc:D

0.173

0.191

0.906

0.999

ffs:D - vc:D+E

-0.246

0.188

-1.304

0.976

ffs:D - vc:E

-0.200

0.202

-0.988

0.998

ffs:D+E - ffs:E

0.381

0.145

2.621

0.256

ffs:D+E - oti:D

0.424

0.147

2.884

0.139

ffs:D+E - oti:D+E

0.047

0.205

0.228

1.000

ffs:D+E - oti:E

0.568

0.159

3.564

0.018

ffs:D+E - vc:D

0.758

0.147

5.153

0.000

ffs:D+E - vc:D+E

0.339

0.160

2.113

0.598

ffs:D+E - vc:E

0.385

0.175

2.203

0.532

ffs:E - oti:D

0.043

0.161

0.269

1.000

ffs:E - oti:D+E

-0.334

0.197

-1.699

0.859

ffs:E - oti:E

0.187

0.153

1.224

0.985

ffs:E - vc:D

0.377

0.164

2.293

0.467

ffs:E - vc:D+E

-0.042

0.151

-0.277

1.000

ffs:E - vc:E

0.004

0.187

0.023

1.000

oti:D - oti:D+E

-0.378

0.210

-1.794

0.810

oti:D - oti:E

0.143

0.162

0.883

0.999

oti:D - vc:D

0.333

0.174

1.913

0.738

oti:D - vc:D+E

-0.085

0.155

-0.551

1.000

oti:D - vc:E

-0.039

0.186

-0.210

1.000

oti:D+E - oti:E

0.521

0.194

2.691

0.221

oti:D+E - vc:D

0.711

0.203

3.506

0.022

oti:D+E - vc:D+E

0.292

0.204

1.436

0.952

oti:D+E - vc:E

0.338

0.220

1.540

0.923

oti:E - vc:D

0.190

0.175

1.086

0.995

oti:E - vc:D+E

-0.228

0.156

-1.467

0.944

oti:E - vc:E

-0.182

0.181

-1.007

0.997

vc:D - vc:D+E

-0.419

0.171

-2.448

0.360

vc:D - vc:E

-0.373

0.171

-2.173

0.554

vc:D+E - vc:E

0.046

0.181

0.254

1.000

The MRS provide insights into the trade-offs participants were willing to make between different attributes. The MRS between effort and pain was -8.99, indicating that participants were willing to invest nearly 9 hours per week to achieve a 1% reduction in pain. The MRS between effort and weight was -5.08, suggesting that participants were willing to invest about 5 hours per week to achieve a 1% reduction in weight. The MRS between cost and pain was -0.038, implying that participants were willing to pay 0.038 CHF to gain a 1% reduction in pain. The MRS between cost and weight was -0.021, indicating a willingness to pay 0.021 CHF for a 1% reduction in body weight.

These results suggest that participants placed a higher value on reducing pain and weight through effort than through financial cost, highlighting a greater willingness to trade personal effort over monetary expense to achieve better outcomes.

Since effort and cost are considered burdens, whereas pain and weight are regarded as benefits, a negative MRS indicates that participants are willing to spend either in terms of effort or monetary cost to achieve a reduction in pain or weight.

Code
mlogit.summary <- summary(mlogit.main)

coefs <- mlogit.summary$coefficients # 1st column = estimates

MRS_effort_pain  <- coefs["pain"] / coefs["effort"]
MRS_effort_weight <- coefs["weight"] / coefs["effort"]
MRS_cost_pain    <- coefs["pain"] / coefs["cost"]
MRS_cost_weight   <- coefs["weight"] / coefs["cost"]


df.mrs <- data.frame(
  Comparison = c("Effort vs Pain", "Effort vs Weight", "Cost vs Pain", "Cost vs Weight"),
  MRS = c(MRS_effort_pain, MRS_effort_weight, MRS_cost_pain, MRS_cost_weight)
)


library(kableExtra)
df.mrs %>%
  kbl(
  digits = 3,
  escape = TRUE,
  format = "latex") %>%
  kable_styling(
    font_size = 8,
    latex_options = c("striped", "hold_position"),
    full_width = FALSE)
# export as docx
library(flextable)
tbl.mrs <-  df.mrs %>% 
  mutate(MRS = round(MRS, 3),
         Comparison = c("Effort vs Pain reduction", "Effort vs Weight loss", "Cost vs Pain reduction", "Cost vs Weight loss")) %>% 
  flextable::flextable() %>% 
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  flextable::line_spacing(space = 1.0) %>%
  flextable::fontsize(size = 8, part = "all") %>%
  flextable::font(fontname = "Times New Roman", part = "all") 
tbl.mrs
tbl.mrs %>%
  flextable::save_as_docx(path = "../tables/tbl.mrs.docx")
Table 2: Marginal rates of substitution.

Comparison

MRS

Effort vs Pain reduction

-0.105

Effort vs Weight loss

-0.153

Cost vs Pain reduction

-30.832

Cost vs Weight loss

-44.786

The relative importance of each attribute (Table 3) was derived from the multinomial logit model containing alternative-specific attributes only. Among the attributes, intervention costs had the greatest influence on participants’ choices, accounting for 35.1% of the decision-making weight. This was followed by pain reduction (23.8%), weekly time effort (17.3%), and weight loss (17.0%). The type of intervention was the least influential attribute, contributing only 6.7% to the overall decision. These findings indicate that monetary considerations and pain reduction were the primary drivers of choice behavior in this study.

Code
m.rel.imp <- mlogit(choice_bin ~ TC + effort + pain + weight + cost 
                    | 0, data = df.mlogit)

coefs <- summary(m.rel.imp)$coefficients


range_effort <- abs((max(df.mlogit$effort) - min(df.mlogit$effort)) * coefs["effort"])
range_pain <- abs((max(df.mlogit$pain) - min(df.mlogit$pain)) * coefs["pain"])
range_weight <- abs((max(df.mlogit$weight) - min(df.mlogit$weight)) * coefs["weight"])
range_cost <- abs((max(df.mlogit$cost) - min(df.mlogit$cost)) * coefs["cost"])
tc_coefs <- coefs[grep("^TC", names(coefs))]
range_TC <- abs(max(tc_coefs) - min(tc_coefs))

total_range <- range_effort + range_pain + range_weight + range_cost + range_TC

Rel.importance <- c(
  effort = range_effort / total_range,
  pain = range_pain / total_range,
  weight = range_weight / total_range,
  cost = range_cost / total_range,
  TC = range_TC / total_range
)

names(Rel.importance) <- c("effort", "pain", "weight", "cost", "TC")

as.data.frame(Rel.importance) %>%
  mutate(sign = c("-", "+", "+", "-", "NA")) %>% 
  kbl(
  digits = 3,
  escape = TRUE,
  format = "latex") %>%
  kable_styling(
    font_size = 8,
    latex_options = c("striped", "hold_position"),
    full_width = FALSE)
library(flextable)
tbl.rel.importance <- as.data.frame(Rel.importance) %>%
  rownames_to_column(var = "Attribute") %>%
  mutate(
    Attribute = c("Effort", "Pain", "Weight", "Cost", "Type:Content"),
    Sign = c("-", "+", "+", "-", "NA"),
    Rel.importance = round(Rel.importance, 3)
  ) %>%
  arrange(desc(Rel.importance)) %>%
  rename(`Relative importance` = Rel.importance) %>% 
  flextable() %>%
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  line_spacing(space = 1.0) %>%
  fontsize(size = 8, part = "all") %>%
  font(fontname = "Times New Roman", part = "all") 
tbl.rel.importance
tbl.rel.importance %>%
  save_as_docx(path = "../tables/tbl.rel.importance.docx")
Table 3: Relative importance of attributes.

Attribute

Relative importance

Sign

Cost

0.319

-

Pain

0.246

+

Effort

0.166

-

Weight

0.145

+

Type:Content

0.124

NA

Code
# 
# mrs.effidxmrs.effortmrs.effortmrs.effort.pain.temp  <- vector()
# mrs.effort.weight.temp <- vector()
# mrs.cost.pain.temp <- vector()
# mrs.cost.weight.temp <- vector()
# 
# rel.imp.effort <- vector()
# rel.imp.pain <- vector()
# rel.imp.weight <- vector()
# rel.imp.cost <- vector()
# rel.imp.TC <- vector()
# 
# 
# # for(i in 1:1000){
# # 
# #   index <- sample(1:max(unique(df.mlogit$ID)), max(unique(df.mlogit$ID)), replace = TRUE) # create indexes
# # 
# #   df.mlogit.temp <- bind_rows(lapply(index, function(i) df.mlogit[df.mlogit$ID==i,]))
# # 
# #   df.mlogit.temp <- mlogit.data(df.mlogit.temp, choice="choice_bin", shape="long",
# #                                alt.var="ALT", chid.var="STR")
# # 
# #   df.mlogit.temp <- lapply(index, function(i) {
# #   df.mlogit[df.mlogit$ID == i, ]}) %>%
# #   bind_rows()
# # 
# #   df.mlogit.temp$ID <- rep(1:max(unique(df.mlogit$ID)), each = 45)
# # 
# #   df.mlogit.temp$STR <- 100 * df.mlogit.temp$ID + df.mlogit.temp$QES # make new unique STR
# # 
# #   df.mlogit.temp <- as.data.frame(df.mlogit.temp)
# # 
# #   df.mlogit.temp <- dfidx::dfidx(df.mlogit.temp,idx = list(c("STR", "ALT")),drop.index = TRUE)
# # 
# #   mlogit.main.temp <- mlogit(choice_bin ~ TC + effort + weight + pain + cost +
# #                         effort:FABQ_PA + effort:FABQ_W + cost:Franchise  |
# #                         0, data = df.mlogit.temp)
# # 
# # 
# #   mlogit.summary.temp <- summary(mlogit.main.temp)
# # 
# #   coefs <- mlogit.summary.temp$coefficients # 1st column = estimates
# # 
# #   mrs.effort.pain.temp[i]  <- coefs["pain"] / coefs["effort"]
# #   mrs.effort.weight.temp[i] <- coefs["weight"] / coefs["effort"]
# #   mrs.cost.pain.temp[i] <- coefs["pain"] / coefs["cost"]
# #   mrs.cost.weight.temp[i]   <- coefs["weight"] / coefs["cost"]
# # 
# # 
# #   # Relative importance
# #   m.rel.imp <- mlogit(choice_bin ~ TC + effort + pain + weight + cost
# #                     | 0, data = df.mlogit.temp)
# # 
# #   coefs <- summary(m.rel.imp)$coefficients
# # 
# # 
# #   range_effort <- abs((max(df.mlogit$effort) - min(df.mlogit$effort)) * coefs["effort"])
# #   range_pain <- abs((max(df.mlogit$pain) - min(df.mlogit$pain)) * coefs["pain"])
# #   range_weight <- abs((max(df.mlogit$weight) - min(df.mlogit$weight)) * coefs["weight"])
# #   range_cost <- abs((max(df.mlogit$cost) - min(df.mlogit$cost)) * coefs["cost"])
# #   tc_coefs <- coefs[grep("^TC", names(coefs))]
# #   range_TC <- abs(max(tc_coefs) - min(tc_coefs))
# #   total_range <- range_effort + range_pain + range_weight + range_cost + range_TC
# # 
# # 
# #   rel.imp.effort[i] <- range_effort / total_range
# #   rel.imp.pain[i] <- range_pain / total_range
# #   rel.imp.weight[i] <- range_weight / total_range
# #   rel.imp.cost[i] <- range_cost / total_range
# #   rel.imp.TC[i] <- range_TC / total_range
# # 
# # }
# 
# 
# 
# df.mrs$lower[1] <- 2*df.mrs$MRS[1] - quantile(mrs.effort.pain.temp, probs = c(0.975))
# df.mrs$upper[1] <- 2*df.mrs$MRS[1] - quantile(mrs.effort.pain.temp, probs = c(0.025))
# df.mrs$lower[2] <- 2*df.mrs$MRS[2] - quantile(mrs.effort.weight.temp, probs = c(0.975))
# df.mrs$upper[2] <- 2*df.mrs$MRS[2] - quantile(mrs.effort.weight.temp, probs = c(0.025))
# df.mrs$lower[3] <- 2*df.mrs$MRS[3] - quantile(mrs.cost.pain.temp, probs = c(0.975))
# df.mrs$upper[3] <- 2*df.mrs$MRS[3] - quantile(mrs.cost.pain.temp, probs = c(0.025))
# df.mrs$lower[4] <- 2*df.mrs$MRS[4] - quantile(mrs.cost.weight.temp, probs = c(0.975))
# df.mrs$upper[4] <- 2*df.mrs$MRS[4] - quantile(mrs.cost.weight.temp, probs = c(0.025))
# 
# 
# df.rel.imp <- data.frame(rel_imp = Rel.importance,
#            attr = names(Rel.importance))
# 
# row.names(df.rel.imp) <- NULL
# df.rel.imp$lower[1] <- 2*df.rel.imp$rel_imp[1] - quantile(rel.imp.effort, probs = c(0.975))
# df.rel.imp$upper[1] <- 2*df.rel.imp$rel_imp[1] - quantile(rel.imp.effort, probs = c(0.025))
# df.rel.imp$lower[2] <- 2*df.rel.imp$rel_imp[2] - quantile(rel.imp.pain, probs = c(0.975))
# df.rel.imp$upper[2] <- 2*df.rel.imp$rel_imp[2] - quantile(rel.imp.pain, probs = c(0.025))
# df.rel.imp$lower[3] <- 2*df.rel.imp$rel_imp[3] - quantile(rel.imp.weight, probs = c(0.975))
# df.rel.imp$upper[3] <- 2*df.rel.imp$rel_imp[3] - quantile(rel.imp.weight, probs = c(0.025))
# df.rel.imp$lower[4] <- 2*df.rel.imp$rel_imp[4] - quantile(rel.imp.cost, probs = c(0.975))
# df.rel.imp$upper[4] <- 2*df.rel.imp$rel_imp[4] - quantile(rel.imp.cost, probs = c(0.025))
# df.rel.imp$lower[5] <- 2*df.rel.imp$rel_imp[5] - quantile(rel.imp.TC, probs = c(0.975))
# df.rel.imp$upper[5] <- 2*df.rel.imp$rel_imp[5] - quantile(rel.imp.TC, probs = c(0.025))
# 
Table 4
Code
# # prepare dataset mrs
# df.mrs.adapted <- df.mrs %>% 
#   transmute(
#     Attribute = Comparison,
#     Estimate = round(MRS, 3),
#     `95% CI` = paste0("[", round(lower, 3), "; ", round(upper, 3), "]")) %>%
#   mutate(Metric = "Marginal rate of substitution") %>%
#   relocate(Metric) %>%
#   mutate(Metric = replace(Metric, row_number() != 1, ""))
# 
# # Adapt df.rel.imp
# df.rel.imp.adapted <- df.rel.imp %>%
#   dplyr::select(attr, rel_imp, lower, upper) %>% 
#   transmute(
#     Attribute = case_when(
#       attr == "effort" ~ "Effort",
#       attr == "pain" ~ "Pain",
#       attr == "weight" ~ "Weight",
#       attr == "cost" ~ "Cost",
#       TRUE ~ attr),
#     Estimate = round(rel_imp, 3),
#     `95% CI` = paste0("[", 
#                       formatC(lower, format = "f", digits = 3), 
#                       "; ", 
#                       formatC(upper, format = "f", digits = 3), 
#                       "]")) %>%
#   mutate(Metric = "Relative importance") %>%
#   relocate(Metric) %>%
#   mutate(Metric = replace(Metric, row_number() != 1, ""))
# 
# # combine datasets
# df.mrs.ri.combined <- bind_rows(df.mrs.adapted, df.rel.imp.adapted) %>% 
#   flextable::flextable() %>% 
#   autofit() %>%
#   bold(part = "header") %>%
#   align(align = "center", part = "all") %>%
#   flextable::line_spacing(space = 1.0) %>%
#   flextable::fontsize(size = 8, part = "all") %>%
#   flextable::font(fontname = "Times New Roman", part = "all") %>%
#   flextable::save_as_docx(path = "../tables/tbl.mrs.ri.docx")
Code
library(ggplot2)
library(dplyr)

# Data preparation
df.rel.importance <- data.frame(
  Attribute = c("Effort", "Pain", "Weight", "Cost", "Type:Content"),
  Importance = round(Rel.importance * 100, 1)  # Convert to percentage
) %>%
  mutate(
    # Invert sign for Cost and Effort
    Importance = ifelse(Attribute %in% c("Cost", "Effort"), -Importance, Importance),
    
    # Set desired factor level order
    Attribute = factor(Attribute, levels = c("Cost", "Pain", "Effort", "Weight", "Type:Content")))

grey_palette <- c("#333333", "#555555", "#777777", "#999999", "#BBBBBB")

# Plot
fig.rel.importance <- ggplot(df.rel.importance, aes(x = Attribute, y = Importance, fill = Attribute)) +
  geom_bar(stat = "identity", width = 0.7) +
  geom_text(aes(label = paste0(Importance, "%"),
                vjust = ifelse(Importance < 0, 1.5, -0.5)),
            size = 3) +
  geom_hline(yintercept = 0, size = 0.3, color = "black") +
  scale_fill_manual(values = grey_palette) +
  scale_y_continuous(
    labels = scales::percent_format(scale = 1), 
    limits = c(-100, 100),
    breaks = seq(-100, 100, by = 20)) +
  labs(
    x = NULL,
    y = "Relative importance (%)",
    fill = NULL
  ) +
  theme_minimal(base_size = 10) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    axis.ticks.x = element_blank(),
    legend.position = "none",
    panel.grid.major.x = element_blank(),  # remove vertical grid lines
    panel.grid.minor.x = element_blank()
  )
fig.rel.importance

Code
ggsave("../figures/fig.rel.importance.png", fig.rel.importance, width = 6, height = 4, dpi = 300)

6 References

Aizaki, Hideo. 2012a. “Basic Functions for Supporting an Implementation of Choice Experiments in R.” Journal of Statistical Software 50 (C2): 1–24. https://doi.org/10.18637/jss.v050.c02.
———. 2012b. “Basic Functions for Supporting an Implementation of Choice Experiments in R.” Journal of Statistical Software 50 (September): 1–24. https://doi.org/10.18637/jss.v050.c02.
Bekker-Grob, Esther W. de, Bas Donkers, Marcel F. Jonker, and Elly A. Stolk. 2015. “Sample Size Requirements for Discrete-Choice Experiments in Healthcare: A Practical Guide.” The Patient - Patient-Centered Outcomes Research 8 (5): 373–84. https://doi.org/10.1007/s40271-015-0118-z.
Campbell, Danny, and Seda Erdem. 2019. “Including Opt-Out Options in Discrete Choice Experiments: Issues to Consider.” The Patient - Patient-Centered Outcomes Research 12 (1): 1–14. https://doi.org/10.1007/s40271-018-0324-6.
Croissant, Yves. 2020a. “Estimation of Random Utility Models in R: The mlogit Package.” Journal of Statistical Software 95 (11): 1–41. https://doi.org/10.18637/jss.v095.i11.
———. 2020b. “Estimation of Random Utility Models in R: The Mlogit Package.” Journal of Statistical Software 95 (October): 1–41. https://doi.org/10.18637/jss.v095.i11.
Hothorn, Torsten, Frank Bretz, and Peter Westfall. 2008. “Simultaneous Inference in General Parametric Models.” Biometrical Journal 50 (3): 346–63.
Lauridsen, Henrik H., Jan Hartvigsen, Claus Manniche, Lars Korsholm, and Niels Grunnet-Nilsson. 2006. “Responsiveness and Minimal Clinically Important Difference for Pain and Disability Instruments in Low Back Pain Patients.” BMC Musculoskeletal Disorders 7 (October): 82. https://doi.org/10.1186/1471-2474-7-82.
Monticone, Marco, Luca Frigau, Howard Vernon, Barbara Rocca, Andrea Giordano, Salvatore Simone Vullo, Francesco Mola, and Franco Franchignoni. 2020. “Reliability, Responsiveness and Minimal Clinically Important Difference of the Two Fear Avoidance and Beliefs Questionnaire Scales in Italian Subjects with Chronic Low Back Pain Undergoing Multidisciplinary Rehabilitation.” European Journal of Physical and Rehabilitation Medicine 56 (5): 600–606. https://doi.org/10.23736/S1973-9087.20.06158-4.
Pfingsten, M., B. Kröner-Herwig, E. Leibing, U. Kronshage, and J. Hildebrandt. 2000. “Validation of the German Version of the Fear-Avoidance Beliefs Questionnaire (FABQ).” European Journal of Pain (London, England) 4 (3): 259–66. https://doi.org/10.1053/eujp.2000.0178.
R Core Team. 2025. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Staerkle, Ralph, Anne F. Mannion, Achim Elfering, Astrid Junge, Norbert K. Semmer, Nicola Jacobshagen, Dieter Grob, Jiri Dvorak, and Norbert Boos. 2004. “Longitudinal Validation of the Fear-Avoidance Beliefs Questionnaire (FABQ) in a Swiss-German Sample of Low Back Pain Patients.” European Spine Journal: Official Publication of the European Spine Society, the European Spinal Deformity Society, and the European Section of the Cervical Spine Research Society 13 (4): 332–40. https://doi.org/10.1007/s00586-003-0663-3.
Wheeler, Bob. 2025. AlgDesign: Algorithmic Experimental Design. https://doi.org/10.32614/CRAN.package.AlgDesign.
Wheeler, R. E. 2004. optFederov. AlgDesign. The R Project for Statistical Computing https://www.r-project.org/.” https://www.rdocumentation.org/packages/AlgDesign/versions/1.2.1.1/topics/optFederov.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Zeileis, Achim, and Torsten Hothorn. 2002. “Diagnostic Checking in Regression Relationships.” R News 2 (3): 7–10. https://CRAN.R-project.org/doc/Rnews/.

7 Appendix

Code
tbl.dce.alternatives <- rbind(
  list.design$alternatives$alt.1,
  list.design$alternatives$alt.2)

tbl.dce.alternatives %>%
  flextable() %>%
  set_header_labels(.labels = colnames(tbl.dce.alternatives)) %>%  # optional
  autofit() %>%
  bold(part = "header") %>%
  align(j = 1:ncol(tbl.dce.alternatives), align = "center", part = "all") %>%
  line_spacing(space = 1.0) %>%
  fontsize(size = 8, part = "all") %>%
  font(fontname = "Times New Roman", part = "all")
# Save as CSV
write.csv(tbl.dce.alternatives, "../tables/tbl.choicesets.csv", row.names = FALSE)
Table 5: Choice sets, devided into two blocks.

BLOCK

QES

ALT

Realization.type

Content

Average.effort.per.week

Weight.loss

Pain.reduction

Total.costs

1

1

1

online-time & place independent

Exercise program

4

-20

-10

1500

1

2

1

face-to-face_group

D+E

2

-5

-40

500

1

3

1

online-time & place independent

D+E

3

-20

-30

500

1

4

1

face-to-face_single

Dietary intervention

4

-10

-20

1000

1

5

1

online-time & place independent

Dietary intervention

5

-5

-30

1000

1

6

1

videocalls

Dietary intervention

3

-10

-40

2000

1

7

1

videocalls

Dietary intervention

5

-20

-50

500

1

8

1

videocalls

D+E

4

-5

-20

1500

1

9

1

online-time & place independent

Dietary intervention

3

-10

-40

1500

1

10

1

face-to-face_group

D+E

5

-10

-10

2000

1

11

1

videocalls

Exercise program

4

-5

-40

2000

1

12

1

face-to-face_group

Dietary intervention

4

-15

-10

500

1

13

1

videocalls

D+E

3

-15

-50

1000

1

14

1

face-to-face_group

Exercise program

3

-5

-20

1000

1

15

1

face-to-face_group

Dietary intervention

2

-20

-50

1500

2

1

1

face-to-face_group

Exercise program

5

-10

-30

1500

2

2

1

online-time & place independent

Exercise program

2

-5

-50

2000

2

3

1

videocalls

Exercise program

5

-20

-20

500

2

4

1

face-to-face_single

D+E

5

-15

-20

1500

2

5

1

online-time & place independent

Exercise program

5

-15

-40

1000

2

6

1

online-time & place independent

D+E

4

-10

-50

500

2

7

1

videocalls

Dietary intervention

2

-15

-30

1500

2

8

1

face-to-face_single

Exercise program

2

-10

-30

500

2

9

1

face-to-face_single

Dietary intervention

3

-5

-10

500

2

10

1

face-to-face_single

D+E

5

-5

-50

1500

2

11

1

face-to-face_group

D+E

4

-20

-30

2000

2

12

1

face-to-face_single

Exercise program

3

-15

-50

2000

2

13

1

face-to-face_single

D+E

2

-20

-40

1000

2

14

1

online-time & place independent

Dietary intervention

2

-15

-20

2000

2

15

1

videocalls

D+E

2

-10

-10

1000

1

1

2

videocalls

D+E

2

-15

-40

2000

1

2

2

videocalls

Dietary intervention

5

-5

-40

500

1

3

2

face-to-face_group

Dietary intervention

2

-10

-10

2000

1

4

2

online-time & place independent

D+E

2

-5

-30

1000

1

5

2

face-to-face_group

Dietary intervention

3

-5

-50

1500

1

6

2

face-to-face_single

D+E

5

-5

-20

2000

1

7

2

face-to-face_single

D+E

3

-10

-10

500

1

8

2

online-time & place independent

Dietary intervention

5

-10

-30

2000

1

9

2

face-to-face_single

Dietary intervention

5

-15

-10

1000

1

10

2

face-to-face_group

Dietary intervention

2

-20

-30

2000

1

11

2

face-to-face_group

Exercise program

5

-15

-30

1500

1

12

2

videocalls

D+E

4

-10

-30

1500

1

13

2

face-to-face_single

Exercise program

4

-15

-50

2000

1

14

2

online-time & place independent

Exercise program

3

-20

-40

2000

1

15

2

face-to-face_group

D+E

4

-20

-10

500

2

1

2

online-time & place independent

Exercise program

4

-15

-50

500

2

2

2

online-time & place independent

D+E

5

-10

-50

500

2

3

2

videocalls

Dietary intervention

3

-10

-50

1000

2

4

2

face-to-face_single

Dietary intervention

4

-5

-40

1000

2

5

2

face-to-face_group

D+E

3

-15

-40

1000

2

6

2

videocalls

Exercise program

3

-5

-10

2000

2

7

2

videocalls

Dietary intervention

2

-15

-20

500

2

8

2

face-to-face_single

D+E

2

-20

-50

1500

2

9

2

face-to-face_single

Exercise program

3

-20

-30

500

2

10

2

online-time & place independent

Dietary intervention

3

-15

-20

1500

2

11

2

face-to-face_single

Exercise program

2

-10

-40

1500

2

12

2

online-time & place independent

Exercise program

2

-5

-10

1000

2

13

2

face-to-face_group

Exercise program

4

-10

-20

1000

2

14

2

online-time & place independent

Dietary intervention

4

-20

-10

1500

2

15

2

videocalls

Exercise program

5

-20

-20

1000

Code
# df.dce.long %>% 
#   slice(1:45) %>% 
#   kbl() %>% 
#   kable_styling(font_size = 8, 
#                 full_width = F)

library(dplyr)
library(flextable)
library(tibble)

# Create the subset of the data
df.dce.long.2 <- df.dce.long %>% 
  slice(1:45) %>% 
  dplyr::select(-sex_bin)

flextable.dce.long.2 <- df.dce.long.2 %>%
  flextable() %>%
  colformat_num(big.mark = "", digits = 0) %>%  # Kein Tausendertrennzeichen
  autofit() %>%
  bold(part = "header") %>%
  align(align = "center", part = "all") %>%
  line_spacing(space = 1.0) %>%
  fontsize(size = 8, part = "all") %>%
  font(fontname = "Times New Roman", part = "all")

flextable.dce.long.2
flextable.dce.long.2 %>% 
  flextable::save_as_docx(flextable.dce.long.2, path = "../tables/tbl.data.structure.docx")
Table 6: Data from the first patient. There are 15 choice sets, each with three alternatives. Thus, there are 45 rows per patient.

ID

BLOCK

age

sex

BMI

Franchise

FABQ_PA

FABQ_W

question

choice

QES

ALT

effort

pain

TC

cost

weight

choice_bin

STR

2

1

60

Female

30.8

300

25

31

q1

2

1

1

4

10

oti:E

1500

20

0

201

2

1

60

Female

30.8

300

25

31

q1

2

1

2

2

40

vc:D+E

2000

15

1

201

2

1

60

Female

30.8

300

25

31

q1

2

1

3

0

0

None:None

0

0

0

201

2

1

60

Female

30.8

300

25

31

q2

1

2

1

2

40

ffg:D+E

500

5

1

202

2

1

60

Female

30.8

300

25

31

q2

1

2

2

5

40

vc:D

500

5

0

202

2

1

60

Female

30.8

300

25

31

q2

1

2

3

0

0

None:None

0

0

0

202

2

1

60

Female

30.8

300

25

31

q3

1

3

1

3

30

oti:D+E

500

20

1

203

2

1

60

Female

30.8

300

25

31

q3

1

3

2

2

10

ffg:D

2000

10

0

203

2

1

60

Female

30.8

300

25

31

q3

1

3

3

0

0

None:None

0

0

0

203

2

1

60

Female

30.8

300

25

31

q4

2

4

1

4

20

ffs:D

1000

10

0

204

2

1

60

Female

30.8

300

25

31

q4

2

4

2

2

30

oti:D+E

1000

5

1

204

2

1

60

Female

30.8

300

25

31

q4

2

4

3

0

0

None:None

0

0

0

204

2

1

60

Female

30.8

300

25

31

q5

2

5

1

5

30

oti:D

1000

5

0

205

2

1

60

Female

30.8

300

25

31

q5

2

5

2

3

50

ffg:D

1500

5

1

205

2

1

60

Female

30.8

300

25

31

q5

2

5

3

0

0

None:None

0

0

0

205

2

1

60

Female

30.8

300

25

31

q6

2

6

1

3

40

vc:D

2000

10

0

206

2

1

60

Female

30.8

300

25

31

q6

2

6

2

5

20

ffs:D+E

2000

5

1

206

2

1

60

Female

30.8

300

25

31

q6

2

6

3

0

0

None:None

0

0

0

206

2

1

60

Female

30.8

300

25

31

q7

2

7

1

5

50

vc:D

500

20

0

207

2

1

60

Female

30.8

300

25

31

q7

2

7

2

3

10

ffs:D+E

500

10

1

207

2

1

60

Female

30.8

300

25

31

q7

2

7

3

0

0

None:None

0

0

0

207

2

1

60

Female

30.8

300

25

31

q8

1

8

1

4

20

vc:D+E

1500

5

1

208

2

1

60

Female

30.8

300

25

31

q8

1

8

2

5

30

oti:D

2000

10

0

208

2

1

60

Female

30.8

300

25

31

q8

1

8

3

0

0

None:None

0

0

0

208

2

1

60

Female

30.8

300

25

31

q9

2

9

1

3

40

oti:D

1500

10

0

209

2

1

60

Female

30.8

300

25

31

q9

2

9

2

5

10

ffs:D

1000

15

1

209

2

1

60

Female

30.8

300

25

31

q9

2

9

3

0

0

None:None

0

0

0

209

2

1

60

Female

30.8

300

25

31

q10

1

10

1

5

10

ffg:D+E

2000

10

1

210

2

1

60

Female

30.8

300

25

31

q10

1

10

2

2

30

ffg:D

2000

20

0

210

2

1

60

Female

30.8

300

25

31

q10

1

10

3

0

0

None:None

0

0

0

210

2

1

60

Female

30.8

300

25

31

q11

2

11

1

4

40

vc:E

2000

5

0

211

2

1

60

Female

30.8

300

25

31

q11

2

11

2

5

30

ffg:E

1500

15

1

211

2

1

60

Female

30.8

300

25

31

q11

2

11

3

0

0

None:None

0

0

0

211

2

1

60

Female

30.8

300

25

31

q12

1

12

1

4

10

ffg:D

500

15

1

212

2

1

60

Female

30.8

300

25

31

q12

1

12

2

4

30

vc:D+E

1500

10

0

212

2

1

60

Female

30.8

300

25

31

q12

1

12

3

0

0

None:None

0

0

0

212

2

1

60

Female

30.8

300

25

31

q13

1

13

1

3

50

vc:D+E

1000

15

1

213

2

1

60

Female

30.8

300

25

31

q13

1

13

2

4

50

ffs:E

2000

15

0

213

2

1

60

Female

30.8

300

25

31

q13

1

13

3

0

0

None:None

0

0

0

213

2

1

60

Female

30.8

300

25

31

q14

1

14

1

3

20

ffg:E

1000

5

1

214

2

1

60

Female

30.8

300

25

31

q14

1

14

2

3

40

oti:E

2000

20

0

214

2

1

60

Female

30.8

300

25

31

q14

1

14

3

0

0

None:None

0

0

0

214

2

1

60

Female

30.8

300

25

31

q15

2

15

1

2

50

ffg:D

1500

20

0

215

2

1

60

Female

30.8

300

25

31

q15

2

15

2

4

10

ffg:D+E

500

20

1

215

2

1

60

Female

30.8

300

25

31

q15

2

15

3

0

0

None:None

0

0

0

215

I estimated the same model as described in the methods using the clogit function from the survival package. The resulting coefficients are identical, see Table 8.

Code
model.formula.simple <- choice_bin ~
  TC +
  effort +
  weight +
  pain +
  cost +
  strata(STR)

library(survival)
clogit.simple <- clogit(model.formula.simple, 
                         data = df.mlogit, 
                         method = "approximate")
# goodness-of-fit measures
support.CEs::gofm(clogit.simple)

Rho-squared = 0.1928412 
Adjusted rho-squared = 0.1872868 
Akaike information criterion (AIC) = 4682.141 
Bayesian information criterion (BIC) = 4776.088 
Number of coefficients = 16 
Log likelihood at start = -2880.561 
Log likelihood at convergence = -2325.07 
Code
clogit.simple.tidy <- tidy(clogit.simple)


kbl(clogit.simple.tidy, digits = 3) %>% 
  kable_styling(font_size = 8)
# df.clogit.simple <- as.data.frame(clogit.simple.tidy) %>%
#   rename(
#     Term = term,
#     Estimate = estimate,
#     Std_Error = std.error,
#     Z_value = statistic,
#     p_value = p.value) %>%
#   mutate(Term = forcats::fct_recode(
#     Term,
#     "Ffg:D"             = "TC1",
#     "Ffg:D+E"           = "TC2",
#     "Ffg:E"             = "TC3",
#     "Ffs:D"             = "TC4",
#     "Ffs:D+E"           = "TC5",
#     "Ffs:E"             = "TC6",
#     "Oti:D"             = "TC7",
#     "Oti:D+E"           = "TC8",
#     "Oti:E"             = "TC9",
#     "Vc:D"              = "TC10",
#     "Vc:D+E"            = "TC11",
#     "Vc:E"              = "TC12",
#     "Effort" = "effort",
#     "Weight" = "weight",
#     "Pain" = "pain",
#     "Cost" = "cost"))


#save(df.clogit.simple, file = "../data/df-clogit-simple.rda")
#colnames(df.clogit.simple) <- c("Term", "Estimate", "Std. Error", "z value", "p value")

#kbl(df.clogit.simple, digits = 3) %>%
#  kable_styling(font_size = 8)

#library(flextable)
# tbl.clogit.simple <-  df.clogit.simple %>% 
#   mutate(across(c(Estimate, `Std. Error`, `z value`, `p value`), ~ round(.x, 3)),
#          Term = c("On-site, group session: Dietary intervention",
#                   "On-site, group session: Dietary & Exercise intervention",
#                   "On-site, group session: Exercise intervention",
#                   "On-site, individual session: Dietary intervention",
#                   "On-site, individual session: Dietary & Exercise  intervention",
#                   "On-site, individual session: Exercise intervention",
#                   "Online, time & place independent: Dietary intervention",
#                   "Online, time & place independent: Dietary & Exercise intervention",
#                   "Online, time & place independent: Exercise intervention",
#                   "Videocalls: Dietary intervention",
#                   "Videocalls: Dietary & Exercise  intervention",
#                   "Videocalls: Exercise intervention",
#                   "Effort",
#                   "Weight",
#                   "Pain",
#                   "Cost")) %>%
#   dplyr::select(Term, Estimate, "Std. Error", "z value", "p value") %>% 
#   flextable::flextable() %>% 
#   autofit() %>%
#   bold(part = "header") %>%
#   align(align = "center", part = "all") %>%
#   flextable::line_spacing(space = 1.0) %>%
#   flextable::fontsize(size = 8, part = "all") %>%
#   flextable::font(fontname = "Times New Roman", part = "all") %>% 
#   flextable::save_as_docx(path = "../tables/tbl.clogit.simple.docx")
Table 7: Output from the clogit function for the simple model.
term estimate std.error statistic p.value
TCffg:D 1.303 0.195 6.671 0
TCffg:D+E 1.697 0.195 8.682 0
TCffg:E 1.313 0.227 5.784 0
TCffs:D 1.186 0.212 5.583 0
TCffs:D+E 1.829 0.213 8.596 0
TCffs:E 1.448 0.198 7.293 0
TCoti:D 1.396 0.199 7.015 0
TCoti:D+E 1.738 0.214 8.119 0
TCoti:E 1.199 0.195 6.142 0
TCvc:D 0.994 0.226 4.403 0
TCvc:D+E 1.441 0.181 7.956 0
TCvc:E 1.355 0.240 5.648 0
effort -0.224 0.033 -6.851 0
weight 0.049 0.006 7.823 0
pain 0.033 0.003 12.596 0
cost -0.001 0.000 -14.189 0
Code
model.formula.main <- choice_bin ~
  TC +
  effort +
  pain +
  weight +
  cost +
  effort:FABQ_PA + 
  effort:FABQ_W + 
  cost:Franchise +
  strata(STR)

library(survival)
clogit.main <- clogit(model.formula.main, 
                         data = df.dce.long, 
                         method = "approximate")
# goodness-of-fit measure # McFadden’s Rho-squared = 0.1966 = moderate model fit
support.CEs::gofm(clogit.main)

Rho-squared = 0.1966142 
Adjusted rho-squared = 0.1894529 
Akaike information criterion (AIC) = 4301.004 
Bayesian information criterion (BIC) = 4411.004 
Number of coefficients = 19 
Log likelihood at start = -2653.149 
Log likelihood at convergence = -2131.502 
Code
# MRS/WTP for cost and pain/weight
support.CEs::mwtp(
  output = clogit.main,
  monetary.variables = c("cost"),
  nonmonetary.variables = c("pain", "weight"),
  nreplications = 5000,
  confidence.level = c(0.95),
  seed = 123
)

        MWTP  2.5% 97.5%
pain   30.83 25.80 36.93
weight 44.79 32.69 59.35

method = Krinsky and Robb 
Code
# MRS/WTP for effort and pain/weight
support.CEs::mwtp(
  output = clogit.main,
  monetary.variables = c("effort"),
  nonmonetary.variables = c("pain", "weight"),
  nreplications = 5000,
  confidence.level = c(0.95),
  seed = 123
)

          MWTP    2.5%   97.5%
pain   0.10533 0.07675 0.15375
weight 0.15300 0.10817 0.22464

method = Krinsky and Robb 
Code
df.clogit.main <- tidy(clogit.main)

save(df.clogit.main, file = "../data/df-clogit-main.rda")
colnames(df.clogit.main) <- c("Term", "Estimate", "Std. Error", "z value", "p value")

kbl(df.clogit.main, digits = 3) %>% 
  kable_styling(font_size = 8)
Table 8: Output from the clogit function for the main model.
Term Estimate Std. Error z value p value
TCffg:D+E 0.413 0.153 2.700 0.007
TCffg:E 0.071 0.218 0.326 0.744
TCffs:D -0.091 0.205 -0.442 0.658
TCffs:D+E 0.494 0.174 2.844 0.004
TCffs:E 0.113 0.167 0.680 0.497
TCNone:None -1.350 0.202 -6.691 0.000
TCoti:D 0.070 0.158 0.443 0.658
TCoti:D+E 0.447 0.211 2.117 0.034
TCoti:E -0.073 0.171 -0.430 0.667
TCvc:D -0.264 0.180 -1.462 0.144
TCvc:D+E 0.155 0.152 1.018 0.309
TCvc:E 0.109 0.205 0.532 0.595
effort -0.324 0.049 -6.641 0.000
pain 0.034 0.003 12.329 0.000
weight 0.050 0.006 7.638 0.000
cost -0.001 0.000 -12.619 0.000
effort:FABQ_PA 0.007 0.003 2.441 0.015
effort:FABQ_W -0.001 0.002 -0.447 0.655
cost:Franchise 0.000 0.000 1.025 0.305

The model can also be estimated when including the variable content.

Code
# model.formula.content <- choice_bin ~
#   type +
#   content +
#   effort +
#   pain +
#   weight +
#   cost +
#   effort:FABQ_PA + 
#   effort:FABQ_W + 
#   cost:Franchise +
#   strata(STR)
# 
# library(survival)
# clogit.content <- clogit(model.formula.content, 
#                          data = df.dce.long, 
#                          method = "approximate")
# 
# df.clogit.content <- tidy(clogit.content)
# 
# colnames(df.clogit.content) <- c("Term", "Estimate", "Std. Error", "z value", "p value")
# 
# kbl(df.clogit.content, digits = 3) %>% 
#   kable_styling(font_size = 8)
# 
# tbl.model.content <-  df.clogit.content %>% 
#   mutate(across(c("Estimate", "Std. Error", "z value", "p value"), ~ round(.x, 3)),
#           Term = c("Effort", "Pain reduction", "Weight reduction", "Cost", "Type: Online, time & place independent", "Type: Videocalls", "Type: On-site, group session", 
#                   "Type: On-site, individual session", "Content: Dietary intervention", "Content: Exercise intervention", "Content: Dietary + Exercise intervention",
#                   "Effort: FABQ-PA", "Effort: FABQ-W", "Cost: Franchise")) %>% 
#   flextable::flextable() %>% 
#   autofit() %>%
#   bold(part = "header") %>%
#   align(align = "center", part = "all") %>%
#   flextable::line_spacing(space = 1.0) %>%
#   flextable::fontsize(size = 8, part = "all") %>%
#   flextable::font(fontname = "Times New Roman", part = "all") %>%
#   flextable::save_as_docx(path = "../tables/tbl.model.content.docx")

Once again, there appears to be an estimation issue with one level of content, specifically, level ED. Additionally, the coefficients for levels E and D are negative, which contradicts the findings: according to the univariate descriptives, E and D were preferred more than None.

Code
# prop.table(table(df.dce.long$choice_bin, df.dce.long$content), margin = 2) %>% 
#   kbl(digits = 3) %>% 
#   kable_styling(font_size = 8)
Code
Block.KM.1 <- df.qualtrics.adapted %>% 
  filter(BLOCK == 1) %>% 
  dplyr::select(ID, BLOCK,S_1, S_2, T1, Q1.1:Q1.15, 
                FABQ_PA_1: FABQ_PA_5, FABQ_W_1: FABQ_W_11,
                Sex, Age_1, Weight_1, Height_1, Franchise, Difficulty_survey_1)
         
names(Block.KM.1) <- c("ID", "BLOCK", "S1", "S2", "T1", "q1", "q2", "q3", "q4", "q5", "q6",
                    "q7", "q8", "q9", "q10", "q11", "q12", "q13", "q14", "q15", 
                    "FABQ_PA_1", "FABQ_PA_2", "FABQ_PA_3", "FABQ_PA_4", "FABQ_PA_5",
                    "FABQ_W_1", "FABQ_W_2", "FABQ_W_3", "FABQ_W_4", "FABQ_W_5", "FABQ_W_6",
                    "FABQ_W_7", "FABQ_W_8", "FABQ_W_9", "FABQ_W_10", "FABQ_W_11", 
                    "Sex", "Age", "Weight", "Height", "Franchise", "Survey_difficulty")

Block.KM.2 <- df.qualtrics.adapted %>% 
  filter(BLOCK == 2) %>% 
  dplyr::select(ID, BLOCK,S_1, S_2, T1, Q2.1:Q2.15, 
                FABQ_PA_1: FABQ_PA_5, FABQ_W_1: FABQ_W_11,
                Sex, Age_1, Weight_1, Height_1, Franchise, Difficulty_survey_1)
         
names(Block.KM.2) <- c("ID", "BLOCK", "S1", "S2", "T1", "q1", "q2", "q3", "q4", "q5", "q6",
                    "q7", "q8", "q9", "q10", "q11", "q12", "q13", "q14", "q15", 
                    "FABQ_PA_1", "FABQ_PA_2", "FABQ_PA_3", "FABQ_PA_4", "FABQ_PA_5",
                    "FABQ_W_1", "FABQ_W_2", "FABQ_W_3", "FABQ_W_4", "FABQ_W_5", "FABQ_W_6",
                    "FABQ_W_7", "FABQ_W_8", "FABQ_W_9", "FABQ_W_10", "FABQ_W_11", 
                    "Sex", "Age", "Weight", "Height", "Franchise", "Survey_difficulty")

df.qualtrics.KM <- rbind(Block.KM.1, Block.KM.2)
Code
library(dplyr)
library(tidyr)
library(survival)
library(ggplot2)

# Select variables
df.km <- df.qualtrics.KM %>% 
  dplyr::select(ID, S1:Survey_difficulty) %>% 
  mutate(across(S1:Survey_difficulty, as.character))

step_names <- names(df.km)[-1]  # drop ID

# Develop dataset for KM survival model
df.km.long <- df.km %>%
  pivot_longer(cols = -ID, names_to = "step", values_to = "value") %>%
  group_by(ID) %>%
  mutate(step_number = row_number(),
         is_na = is.na(value)) %>%
  summarise(
    time = ifelse(any(is_na), min(step_number[is_na]), max(step_number)),
    event = ifelse(any(is_na), 1, 0)
  )

df.km.long <- df.km.long %>%
  bind_rows(
    tibble(ID = unique(df.km.long$ID), time = 0, event = 0)
  ) %>%
  arrange(ID, time)

# Fit KM survival model
km.fit <- survfit(Surv(time, event) ~ 1, data = df.km.long)

# Convert survfit to data frame for ggplot
df.km <- data.frame(
  time = km.fit$time,
  surv = km.fit$surv
)

# Plot using ggplot2
library(ggpubr)

ggplot.km <- ggplot(df.km, aes(x = time, y = surv)) +
  geom_step(linewidth = 1.2, color = "#2C3E50") +        # thicker line, nice navy color
  scale_x_continuous(
    breaks = 1:length(step_names),
    labels = step_names
  ) +
  scale_y_continuous(
    limits = c(0, 1),
    breaks = seq(0, 1, by = 0.1)
  ) +
  labs(
    x = "Survey Step",
    y = "Probability of Continuing Participation",
    title = "Survival of Participants Across Survey Steps"
  ) +
  theme_pubr(base_size = 14, base_family = "Times New Roman") +  # ggpubr theme
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    axis.title.x = element_text(face = "bold"),
    axis.title.y = element_text(face = "bold"),
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5)
  )

ggplot.km

Code
ggsave(filename = "../figures/plot_kaplan_meier.png", plot = ggplot.km, width = 10, height = 8, dpi = 500)