Introduction

Therapeutic attrition in oncology, whether formal (treatment discontinuation) or functional (disengagement from follow-up assessments), remains a major barrier to optimal outcomes in patients with advanced cancer. Clinical trials investigating supportive care interventions frequently suffer from high attrition rates, limiting both the internal validity and the generalizability of their findings.

The randomized controlled trial NCT02349412, comparing early integrated palliative and oncology care to standard oncology care in patients with incurable lung or non-colorectal gastrointestinal cancers, provides a valuable opportunity to examine therapeutic trajectories and attrition patterns in a real-world trial context.

Rationale

While several studies have identified clinical and psychosocial factors associated with dropout in oncology, traditional approaches have largely treated attrition as a binary or static event. Consequently, little is known about the dynamic, time-dependent processes that lead patients to formally or functionally withdraw from cancer care protocols. Understanding when, how, and why patients transition between therapeutic states (engagement, disengagement, or death) is essential for optimizing clinical decision-making and trial design.

Objectives

Primary Aim

To improve the understanding and prediction of therapeutic attrition patterns, both formal (e.g., treatment discontinuation) and functional (e.g., disengagement from assessments), among patients undergoing advanced cancer treatment by applying a multi-state and predictive modeling framework.

Specific Objectives

  • To define and quantify therapeutic states including functional and formal attrition.
  • To estimate transition probabilities and the timing between therapeutic states and clinical outcomes.
  • To identify clinical, demographic, and psychosocial predictors associated with transitions.
  • To develop and validate a predictive model for early identification of patients at high risk of therapeutic attrition.

Data Source

The analysis uses patient-level data from the clinical trial NCT02349412 conducted by the Alliance for Clinical Trials in Oncology. The trial enrolled patients within 8 weeks of diagnosis of incurable lung or non-colorectal gastrointestinal cancer and randomized them to either standard oncology care or early integrated palliative care.

Study Design and Arms

The NCT02349412 trial enrolled 405 patients with newly diagnosed, incurable lung or non-colorectal gastrointestinal cancers. Participants were randomized to receive either early integrated palliative care (Arm 1) or standard oncology care (Arm 2). The follow-up schedule included quality-of-life (QOL) assessments at weeks 6, 12, and 24, and survival follow-up every 4 months until death or for up to 3 years.

Study Completion and Attrition Overview

Summary of Study Arms and Participant Outcomes
Arm | Description | N Randomized | QOL at Week 12 | QOL at Week 24 | Completed | Withdrawals
Arm 1: Early Palliative Care | Standard oncology care + structured palliative care visits | 202 | 92 | 68 | 195 | 7 (all subject withdrawal)
Arm 2: Usual Care | Standard oncology care, palliative care only upon request | 203 | 101 | 80 | 196 | 7 (all subject withdrawal)

Notes:

  • Both arms involved self-report assessments by patients and family caregivers.
  • The primary outcome was quality of life at 12 and 24 weeks.
  • Attrition occurred gradually, mostly due to voluntary withdrawal.

Interpretation

Although only 7 patients in each study arm officially withdrew from the trial, a much larger proportion failed to complete quality-of-life (QOL) assessments at week 12 or week 24. This pattern suggests the presence of hidden attrition: patients who remained technically enrolled but disengaged from study procedures.

Justification for Defining Functional Attrition

In traditional analyses, attrition is typically measured through formal withdrawal. However, this approach underestimates the complexity of disengagement in oncology trials.

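In this study, only 7 of 202 patients in Arm 1 and 7 of 203 patients in Arm 2 formally withdrew, yet only 92 and 101 patients, respectively, have a QOL assessment at week 12, and only 68 and 80 at week 24 (see the summary of study arms above).

This discrepancy highlights a phenomenon we define as functional attrition: patients do not formally withdraw but stop participating in study-related assessments, which reduces data completeness and threatens trial validity.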

Recognizing functional attrition is crucial for several reasons:

Therefore, we define both formal and functional attrition as therapeutic states, allowing us to model them dynamically using a multi-state framework.

Data Management

Install and load needed packages

Prior to conducting the analyses, all necessary R packages were identified based on the planned data processing, statistical modeling, and performance evaluation procedures. A procedure was implemented to automatically check for and install any missing packages to ensure a consistent and reproducible computational environment. Once verified, all required libraries were loaded into the R session. This step guaranteed that all tools needed for data management, multi-state modeling, regression modeling (including logistic regression, LASSO, random forest, and XGBoost), model validation, and result visualization were available throughout the analysis workflow.

Data Import and Cleaning

Nine datasets were imported, each containing complementary clinical and psychological data from the same patient cohort. To prepare the data for analysis:

  • Datasets were merged using a common patient identifier to create a unified dataset containing all available variables.

  • Repeated or redundant columns generated during the merging process were identified and removed to prevent duplication.

  • Irrelevant variables not required for the planned analyses, such as intermediate or derived variables, site-level identifiers, and redundant assessment variables, were excluded to streamline the dataset.

  • The number of missing values per variable was examined to identify patterns of missingness. Variables with the highest proportions of missing data were documented (Table 1).

In this study context, missing values are informative rather than purely technical. The absence of data for key outcomes (such as psychological scores or assessment completions) often reflects clinically significant events like death, formal withdrawal from the study, or loss to follow-up. These missing data points are thus integral to defining attrition-related outcomes in subsequent analyses.
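The merge and missing-value steps above can be sketched as follows. This is only an illustration in Python with the standard library (the analysis itself was run in R); the key name patient_id, the helper names, and the two toy datasets are hypothetical, while age, fact_bl, and fact_chg12 are variable names taken from Table 1.

```python
from collections import Counter


def merge_on_id(datasets, id_key="patient_id"):
    """Merge datasets (lists of row dicts) on a shared patient identifier.

    A column that appears in several datasets is kept once (first value wins),
    which mirrors dropping the duplicated columns created by a merge.
    """
    merged = {}
    for rows in datasets:
        for row in rows:
            target = merged.setdefault(row[id_key], {id_key: row[id_key]})
            for column, value in row.items():
                target.setdefault(column, value)
    return list(merged.values())


def missing_counts(rows, columns):
    """Count missing (None or absent) values per column, most missing first."""
    counts = Counter({column: 0 for column in columns})
    for row in rows:
        for column in columns:
            if row.get(column) is None:
                counts[column] += 1
    return counts.most_common()


# Hypothetical toy data: two of the source datasets, two patients.
demographics = [{"patient_id": 1, "age": 64}, {"patient_id": 2, "age": None}]
qol = [
    {"patient_id": 1, "fact_bl": 81.0, "fact_chg12": None},
    {"patient_id": 2, "fact_bl": None, "fact_chg12": None},
]

merged = merge_on_id([demographics, qol])
print(missing_counts(merged, ["age", "fact_bl", "fact_chg12"]))
```

Counting missing values only after the merge matters here, because a patient who is absent from one source dataset shows up as missing on every variable that dataset contributes.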

Table 1: Top 10 Variables with Most Missing Values
Variable Missing_Values
fact_chg24 257
fact_chg12 212
fact_bl 53
hads_pt_anx_bl 39
hads_pt_dep_bl 39
bsl_assess_complete 14
wk12_assess_complete.x 14
wk24_assess_complete.x 14
age 14
race1 14

Objective 1: To define and quantify therapeutic states including functional and formal attrition.

Step 1- Definition of Therapeutic States and creation of a new column assigning one specific state to each participant

To capture the complexity of therapeutic engagement over time, we defined a set of discrete therapeutic states reflecting patients’ participation status at three key timepoints: baseline, week 12, and week 24. These states incorporate both formal and functional attrition, as well as survival status, and serve as the foundation for subsequent multi-state modeling.

Note: In this trial, QOL assessments at weeks 12 and 24 were scheduled for all participants, regardless of completion at prior timepoints. Thus, the classification of functional attrition at week 24 applies independently of week 12 status: a patient may experience functional attrition at both week 12 and week 24 if they remain alive and officially enrolled but fail to complete QOL assessments at both timepoints. The multi-state framework accommodates such longitudinal disengagement patterns.

Baseline State

  • S0 – Baseline Engagement: The patient completed the baseline assessment and was considered fully engaged at study entry.

Week 12 States

  • S1 – QOL Completed at Week 12: The patient completed the week 12 quality-of-life (QOL) assessment and remained actively engaged.
  • S2 – Functional Attrition at Week 12: The patient was alive and not formally withdrawn, but did not complete the QOL assessment at week 12.
  • S3 – Formal Withdrawal before Week 12: The patient officially withdrew from the study prior to week 12.
  • S4 – Death before or at Week 12: The patient died before or at the time of the week 12 assessment.

Week 24 States

  • S5 – QOL Completed at Week 24: The patient completed the week 24 QOL assessment and was alive at that time.
  • S6 – Functional Attrition at Week 24: The patient was alive, not withdrawn, but did not complete the QOL assessment at week 24.
  • S7 – Death between Week 12 and Week 24: The patient died in the interval between the week 12 and week 24 assessments.

These states are mutually exclusive and temporally ordered, enabling the application of a multi-state modeling framework to estimate transitions between them. The explicit distinction between formal attrition (voluntary withdrawal) and functional attrition (loss to follow-up without formal withdrawal) is particularly relevant in oncology trials, where patient engagement may gradually decline without administrative documentation.
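The state definitions above amount to a small set of classification rules, sketched below in Python for illustration only (the analysis itself was run in R). The function and argument names are hypothetical. Two choices are assumptions rather than rules stated in the report: death takes precedence over formal withdrawal when both apply, and patients who died by week 12 are carried forward to S7 at week 24, which is how they appear in the observed transition counts (Table 3).

```python
def week12_state(died_by_wk12, withdrew_by_wk12, qol_wk12_done):
    """Return the week 12 state (S1 to S4) from three yes/no flags."""
    if died_by_wk12:
        return "S4"  # assumption: death takes precedence over withdrawal
    if withdrew_by_wk12:
        return "S3"
    return "S1" if qol_wk12_done else "S2"


def week24_state(state_wk12, died_by_wk24, qol_wk24_done):
    """Return the week 24 state (S5 to S7), or None when no state applies."""
    if state_wk12 == "S3":
        return None  # formally withdrawn: no week 24 state is assigned
    if state_wk12 == "S4" or died_by_wk24:
        return "S7"  # assumption: week 12 deaths are carried forward
    return "S5" if qol_wk24_done else "S6"


# A patient alive and enrolled who skipped both assessments: S2, then S6.
first = week12_state(died_by_wk12=False, withdrew_by_wk12=False, qol_wk12_done=False)
second = week24_state(first, died_by_wk24=False, qol_wk24_done=False)
print(first, second)
```

Evaluating the rules in a fixed order is what keeps the states mutually exclusive: each patient falls through to exactly one return statement per timepoint.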

Step 2- Create the transition dataset in long format

To enable multi-state modeling, the dataset was transformed from a wide format, where each patient’s state at baseline, week 12, and week 24 was recorded in separate columns, to a long format. In this structure, each row represents a patient’s state at a specific timepoint, alongside a corresponding time variable (0, 12, or 24 weeks).

This format is required for multi-state analyses as it facilitates modeling transitions between therapeutic states over time. Each patient may therefore contribute multiple observations to the dataset, one for each timepoint at which their state is defined.

This restructuring ensures compatibility with multi-state modeling functions and allows for an explicit, time-dependent analysis of therapeutic engagement patterns.
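The wide-to-long reshaping can be sketched as follows, in Python for illustration only (the analysis itself was run in R). The column names state_bl, state_wk12, and state_wk24, the key patient_id, and the example patients are hypothetical. Timepoints with no defined state are skipped, so each patient contributes one row per timepoint at which a state exists.

```python
def wide_to_long(wide_rows, id_key="patient_id"):
    """Turn one row per patient into one row per patient and timepoint."""
    timepoints = (("state_bl", 0), ("state_wk12", 12), ("state_wk24", 24))
    long_rows = []
    for row in wide_rows:
        for column, week in timepoints:
            state = row.get(column)
            if state is not None:  # skip timepoints without a defined state
                long_rows.append({id_key: row[id_key], "time": week, "state": state})
    return long_rows


# Hypothetical patients: one complete trajectory, one formal withdrawal.
wide = [
    {"patient_id": 1, "state_bl": "S0", "state_wk12": "S1", "state_wk24": "S5"},
    {"patient_id": 2, "state_bl": "S0", "state_wk12": "S3", "state_wk24": None},
]
for record in wide_to_long(wide):
    print(record)
```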

Step 3- Frequencies and Proportions of Therapeutic States at Each Timepoint

To quantify therapeutic engagement and attrition over time, we examined the distribution of discrete states defined at baseline, week 12, and week 24. The table below summarizes the frequency and proportion of patients in each state.

Table 2: Frequency and Proportion of States at Each Timepoint
Timepoint State N Patients %
Baseline S0 352 86.9
Baseline NA 53 13.1
Week12 S1 192 47.4
Week12 S2 134 33.1
Week12 S4 65 16.0
Week12 S3 14 3.5
Week24 S5 148 36.5
Week24 S7 125 30.9
Week24 S6 118 29.1
Week24 NA 14 3.5

Baseline (Study Entry)

At baseline, 352 participants (86.9%) completed the initial quality-of-life (QOL) assessment and were considered fully engaged (State S0). A notable 53 participants (13.1%) had missing baseline data, likely reflecting early refusal, logistical barriers, or incomplete study onboarding. These patients were excluded from transition analyses.

Week 12

At week 12:

  • 192 patients (47.4%) completed the scheduled QOL assessment and remained engaged (State S1).
  • 134 patients (33.1%) were alive and not formally withdrawn but did not complete the QOL assessment, indicating functional attrition (State S2).
  • 65 patients (16.0%) had died by week 12 (State S4).
  • Only 14 patients (3.5%) formally withdrew from the study (State S3).

Interpretation: Functional attrition (33.1%) substantially exceeded formal withdrawal (3.5%), suggesting that many patients disengaged from the study without formally exiting. Death occurred in 16% of the cohort within the first 12 weeks, reflecting the high-risk nature of the advanced cancer population.

Week 24

By week 24:

  • 148 patients (36.5%) remained engaged and alive, having completed the QOL assessment (State S5).
  • 125 patients (30.9%) had died by week 24 (State S7). This count includes the 65 patients who had already died by week 12 (S4) and are carried forward into S7 (see Table 3), so 60 deaths occurred between weeks 12 and 24.
  • 118 patients (29.1%) missed the QOL assessment despite being alive and not withdrawn, reflecting continued functional attrition (State S6).
  • 14 patients (3.5%) have no week 24 state (NA) because they had formally withdrawn before week 12.

Interpretation: Fewer than 40% of patients completed follow-up at week 24 per protocol. Both functional attrition and mortality were major contributors to study discontinuation. Importantly, the persistence of functional attrition over time highlights the need to model disengagement as a dynamic and informative process.

Overall Implications

These findings confirm that functional attrition is a dominant mode of disengagement, far more prevalent than formal withdrawal. Alongside mortality, it results in significant early loss of follow-up data. This justifies the use of a multi-state modeling approach that explicitly accounts for informal dropout and death as distinct and informative transitions in therapeutic adherence.

Step 4- Definition of Transition criteria (Transition Matrix)

The transition matrix outlines the set of allowable transitions between therapeutic states from baseline to week 24, as defined in our multi-state framework. Each transition reflects a clinically or behaviorally meaningful change in patient status within the study period.

Transition Matrix (rows: origin state; columns: destination state; 1 = transition allowed)
From/To S0 S1 S2 S3 S4 S6 S7 S5
S0 0 1 1 1 1 0 0 0
S1 0 0 0 0 0 1 1 1
S2 0 0 0 0 0 1 1 1
S3 0 0 0 0 0 0 0 0
S4 0 0 0 0 0 0 0 0
S6 0 0 0 0 0 0 0 0
S7 0 0 0 0 0 0 0 0
S5 0 0 0 0 0 0 0 0
  • From baseline (S0), patients can transition to one of four mutually exclusive states at week 12:
    • S1: Actively engaged (QOL assessment completed),
    • S2: Functionally disengaged (alive but missed the QOL assessment),
    • S3: Formally withdrawn,
    • S4: Deceased before or at week 12.
  • Patients in S1 or S2 (week 12) can experience further transitions by week 24:
    • S5: Completed the QOL assessment at week 24 while still alive (i.e., retained and fully compliant),
    • S6: Functionally disengaged at week 24 (alive but did not complete the assessment),
    • S7: Deceased between weeks 12 and 24.
  • States S3, S4, S5, S6, and S7 are defined as absorbing states, meaning that once patients enter these states, no further transitions are expected or permitted. This reflects either formal discontinuation (S3), death (S4, S7), or final outcome states at study end (S5 and S6).

This structure supports a realistic representation of therapeutic attrition over time, especially by distinguishing between formal and functional attrition — an essential nuance in oncology trials. It ensures that the model accounts not only for survival and withdrawal but also for subtler forms of disengagement that might otherwise be overlooked in conventional trial analyses.
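The allowed-transition structure can be written down compactly and the absorbing states derived from it instead of being listed by hand. The sketch below is in Python for illustration only (the analysis itself was run in R); the names STATES, ALLOWED, and both helper functions are hypothetical.

```python
STATES = ["S0", "S1", "S2", "S3", "S4", "S5", "S6", "S7"]

# Allowed destinations per origin state, as described above.
ALLOWED = {
    "S0": ["S1", "S2", "S3", "S4"],
    "S1": ["S5", "S6", "S7"],
    "S2": ["S5", "S6", "S7"],
}


def build_transition_matrix(states, allowed):
    """Return a nested dict: matrix[origin][dest] is True if allowed."""
    return {
        origin: {dest: dest in allowed.get(origin, []) for dest in states}
        for origin in states
    }


def absorbing_states(matrix):
    """A state is absorbing when it has no allowed outgoing transition."""
    return [state for state, row in matrix.items() if not any(row.values())]


matrix = build_transition_matrix(STATES, ALLOWED)
print(absorbing_states(matrix))
```

Deriving the absorbing states from the matrix gives a quick check that the matrix and the written description agree.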

Step 5- Visualization of allowed transitions

Step 6- Distribution of observed transitions

To characterize how participants moved between therapeutic states over time, we constructed a transition dataset capturing each patient’s sequential state changes across baseline, week 12, and week 24.

  • For each patient, successive state pairs were identified (e.g., from S0 at baseline to S1 at week 12).

  • We computed the total number of observed transitions between every possible pair of states.

  • The proportion of each transition relative to the total number of departures from a given starting state was also calculated, providing insight into the most frequent transition patterns.

  • A transition count matrix summarizing these observed movements between therapeutic states was then generated to support subsequent multi-state modeling.

This step allowed us to quantify the dynamics of patient engagement, attrition, and survival within the trial cohort.
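The counting logic described above can be sketched as follows, in Python for illustration only (the analysis itself was run in R). The function names and the toy patient paths are hypothetical; the second part reuses the observed counts reported in Table 3 to show how the percentage among each from-state is obtained.

```python
from collections import Counter


def count_transitions(paths):
    """Count (from, to) pairs between consecutive timepoints.

    Each path is (baseline, week 12, week 24); a pair is counted only when
    both of its states are defined.
    """
    counts = Counter()
    for path in paths:
        for origin, dest in zip(path, path[1:]):
            if origin is not None and dest is not None:
                counts[(origin, dest)] += 1
    return counts


def transition_shares(counts):
    """Percentage of each transition among all departures from its origin."""
    totals = Counter()
    for (origin, _), n in counts.items():
        totals[origin] += n
    return {pair: round(100 * n / totals[pair[0]], 1) for pair, n in counts.items()}


# Hypothetical toy paths, including one patient with undefined states.
toy_paths = [("S0", "S1", "S5"), ("S0", "S1", "S6"), ("S0", "S2", "S6"),
             ("S0", "S4", "S7"), (None, "S3", None)]
toy_counts = count_transitions(toy_paths)

# Observed counts as reported in Table 3.
table3_counts = Counter({
    ("S0", "S1"): 192, ("S0", "S2"): 102, ("S0", "S4"): 58,
    ("S1", "S5"): 128, ("S1", "S6"): 41, ("S1", "S7"): 23,
    ("S2", "S5"): 20, ("S2", "S6"): 77, ("S2", "S7"): 37,
})
print(transition_shares(table3_counts))
```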

Table 3: Distribution of Observed Transitions Between Therapeutic States
From To Count % Among From-State
S0 S1 192 54.5
S0 S2 102 29.0
S0 S4 58 16.5
S1 S5 128 66.7
S1 S6 41 21.4
S1 S7 23 12.0
S2 S5 20 14.9
S2 S6 77 57.5
S2 S7 37 27.6
S4 S7 65 100.0

Each cell shows the number of participants transitioning from one state to another, alongside the percentage relative to the total number of participants in the originating state.

Matrix of Observed Transitions Between Therapeutic States (rows: origin state; columns: destination state)
From/To S1 S2 S4 S5 S6 S7
S0 192 102 58 0 0 0
S1 0 0 0 128 41 23
S2 0 0 0 20 77 37
S4 0 0 0 0 0 65

Step 7- Visualization of observed transitions

Transitions Between Therapeutic States (Sankey plot)

Interpretation of Observed Transitions Between Therapeutic States

The observed transitions reflect the dynamic evolution of patients across therapeutic states, offering insights into both engagement patterns and attrition risks throughout the trial.

  • From Baseline (S0):
    • A majority of patients (54.5%) transitioned to S1, indicating they remained engaged and completed the QOL assessment at week 12.
    • Nearly one-third (29.0%) experienced functional attrition (S2) — alive but did not complete the assessment.
    • A notable 16.5% died before or at week 12 (S4), highlighting the high early mortality burden.
  • From Week 12 Engagement (S1):
    • Two-thirds (66.7%) of patients who were engaged at week 12 went on to complete the QOL assessment at week 24 (S5), indicating sustained engagement.
    • However, 21.4% transitioned to S6, showing functional attrition despite earlier engagement.
    • 12.0% died between week 12 and 24 (S7), reflecting clinical deterioration despite initial participation.
  • From Week 12 Functional Attrition (S2):
    • 57.5% remained disengaged at week 24 (S6), indicating persistent functional attrition.
    • 27.6% died by week 24 (S7), which may suggest underlying health deterioration among those disengaged.
    • Only 14.9% re-engaged and completed the QOL assessment at week 24 (S5), showing limited recovery from earlier disengagement.
  • From Week 12 Death (S4):
    • All 65 patients in this state are carried forward to S7 at week 24. This is a bookkeeping convention rather than a new event: S4 is absorbing in the transition matrix, so S7 at week 24 records all deaths up to that timepoint, including those that occurred before week 12.

Key Implications

These findings highlight that:

  • Functional attrition is a significant and persistent phenomenon, especially among those who disengage early.
  • Initial engagement is predictive of continued participation, but even engaged patients are not immune to later attrition or mortality.
  • Death and disengagement are intertwined, underscoring the need to monitor functional attrition as a possible early signal of clinical decline.

By quantifying these transitions, the model provides a nuanced understanding of trial dynamics that extends beyond formal withdrawal metrics.

Objective 2: To estimate transition probabilities and the timing (sojourn time) between therapeutic states and clinical outcomes using the “etm” package

Step 1- Restructuring the wide-format merged trial data and defining transition times

The merged trial dataset (wide format) was restructured, and transition times and event indicators (status) were explicitly defined for each possible therapeutic state transition.

For each patient:

  • The baseline time was set at 0.

  • Transition times were set at week 12 (time = 12) or week 24 (time = 24) depending on the observed state.

  • An event indicator (Status) was created for each possible transition, marking whether the transition occurred (1) or not (0).

Subsequently, the dataset was reshaped into a long format, with one row per patient and transition. Each row contained the patient ID, starting and ending state, start and end time of the transition, and event status. Transition numbers were mapped to corresponding starting and destination states using a lookup table.
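A minimal sketch of this expansion is given below, in Python for illustration only (the analysis itself was run in R). It follows the description above, with one row per patient and possible transition, and is not meant to reproduce the exact input layout expected by any particular package. The function name, the field names (id, from, to, entry, exit, status), and the example patient are hypothetical.

```python
# Allowed destinations per origin state (same structure as the transition matrix).
ALLOWED = {
    "S0": ["S1", "S2", "S3", "S4"],
    "S1": ["S5", "S6", "S7"],
    "S2": ["S5", "S6", "S7"],
}


def expand_transitions(patient_id, state_bl, state_wk12, state_wk24):
    """Return one row per transition the patient was at risk of.

    status is 1 for the transition actually observed and 0 otherwise.
    """
    visits = [(0, state_bl), (12, state_wk12), (24, state_wk24)]
    rows = []
    for (entry, origin), (exit_time, reached) in zip(visits, visits[1:]):
        for dest in ALLOWED.get(origin, []):  # absorbing states add no rows
            rows.append({
                "id": patient_id, "from": origin, "to": dest,
                "entry": entry, "exit": exit_time,
                "status": int(dest == reached),
            })
    return rows


# Hypothetical patient: engaged at week 12, functionally disengaged at week 24.
for row in expand_transitions(1, "S0", "S1", "S6"):
    print(row)
```

In this layout a patient contributes several rows per interval but at most one row with status 1, which is what lets each transition be analyzed among exactly the patients who were at risk of it.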

Step 2- Defining the allowed transition matrix (tra) reflecting the clinical study design (required by etm)

A transition matrix was then defined to reflect the clinical study design and allowable patient pathways within the multi-state framework. This matrix specifies which transitions between states are permitted. For example:

  • From S0 (Baseline engagement), patients could transition to S1, S2, S3, or S4 by week 12.

  • From S1 and S2, patients could transition to S5, S6, or S7 by week 24.

This transition matrix ensures the model respects the temporal and clinical logic of the trial.

Step 3- Fitting a non-parametric multi-state model using etm():

A non-parametric multi-state model was then fitted using the etm() function from the etm package. The model inputs included:

  • the long-format dataset with one row per patient transition,
  • the matrix of allowed transitions,
  • the initial state (S0),
  • the start and end times for each transition,
  • and event indicators specifying whether each transition occurred.

The model estimated cumulative transition probabilities between states at each observed event time.

Sojourn times (the average time patients spend in each state before transitioning) can in principle also be derived from such a model, but they were not estimated here, for the reasons given in Step 4.

These estimates provide insight into patient engagement patterns and the timing of clinical outcomes throughout the study period.

Step 4- Estimation of Sojourn Time:

Although sojourn times are commonly reported in multi-state analyses to quantify the expected time patients spend in each state before transitioning elsewhere, this was not estimated in the present study.

The multi-state model applied here was built on predetermined, discrete assessment timepoints at baseline, 12 weeks, and 24 weeks. Consequently, state transitions could only occur at these scheduled visits, and expected sojourn times would merely reflect the fixed calendar intervals rather than intrinsic, continuous transition dynamics.

For this reason, and in accordance with good multi-state modeling practices in discrete-time clinical trial settings, sojourn times were not estimated. Cumulative transition probabilities remain the most informative and appropriate metrics for describing patient trajectories and event risks in this trial.

Step 5- Estimation of Cumulative Transition Probabilities:

This section presents the cumulative transition probabilities between therapeutic and clinical states over the course of the trial, using a non-parametric multi-state model.

Method

Cumulative transition probabilities were estimated at weeks 12 and 24 using the Aalen-Johansen estimator implemented via the {etm} package in R. These estimates account for censoring and competing transitions between mutually exclusive states as defined by the clinical trial protocol.

Results

The table below presents the estimated cumulative transition probabilities at weeks 12 and 24, with their 95% confidence intervals (CIs), number of patients at risk (n.risk), and number of observed events (n.event) for each transition.

Table 4: Cumulative Transition Probabilities with 95% CI, number at risk, and number of events
from to time P lower upper n.risk n.event var
S0 S1 12 0.4741 0.4254 0.5227 405 192 0.0006
S0 S2 12 0.3309 0.2850 0.3767 405 134 0.0005
S0 S3 12 0.0346 0.0168 0.0524 405 14 0.0001
S0 S4 12 0.1605 0.1247 0.1962 405 65 0.0003
S1 S5 24 0.5614 0.4970 0.6258 228 128 0.0011
S1 S6 24 0.1009 0.0618 0.1400 228 23 0.0004
S1 S7 24 0.3377 0.2763 0.3991 228 77 0.0010
S2 S5 24 0.4184 0.3207 0.5160 98 41 0.0025
S2 S6 24 0.2041 0.1243 0.2839 98 20 0.0017
S2 S7 24 0.3776 0.2816 0.4735 98 37 0.0024
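The baseline rows of Table 4 can be checked by hand. When all exits from S0 occur at a single timepoint (week 12) and no patient is censored before it, the Aalen-Johansen estimate of a transition probability reduces to the simple proportion n.event / n.risk. The sketch below, in Python for illustration only (the analysis itself was run in R), assumes a Wald interval with variance p(1 − p)/n; that assumption is not stated in the report but is consistent with the reported values, and the function name is hypothetical.

```python
import math


def proportion_with_wald_ci(events, at_risk, z=1.96):
    """Return (p, lower, upper, variance) for a simple proportion."""
    p = events / at_risk
    variance = p * (1 - p) / at_risk
    half_width = z * math.sqrt(variance)
    return p, p - half_width, p + half_width, variance


# n.event per destination state for transitions out of S0 (Table 4, n.risk = 405).
baseline_events = {"S1": 192, "S2": 134, "S3": 14, "S4": 65}
for dest, n_event in baseline_events.items():
    p, lower, upper, variance = proportion_with_wald_ci(n_event, 405)
    print(f"S0 -> {dest}: P={p:.4f} [{lower:.4f}, {upper:.4f}] var={variance:.4f}")
```

Run on the four S0 rows, this reproduces the reported P, lower, upper, and var columns to four decimals, for example 0.4741 [0.4254, 0.5227] for S0 to S1.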

Visualization of the Cumulative Transition Probabilities by Starting State

Interpretation of Cumulative Transition Probabilities

Transition Probabilities at 12 Weeks

From the baseline state (S0), the probability of transitioning to:

  • S1 (QOL assessment completed): 47.4% (95% CI: 42.5%–52.3%), based on 192 events among 405 patients at risk.

  • S2 (Functional attrition): 33.1% (95% CI: 28.5%–37.7%), with 134 events.

  • S3 (Formal withdrawal): 3.5% (95% CI: 1.7%–5.2%), reflecting formal disengagement.

  • S4 (Death before or at 12 weeks): 16.0% (95% CI: 12.5%–19.6%), indicating early mortality.


Transition Probabilities at 24 Weeks

▸ Among patients in S1 at 12 weeks:

  • S5 (QOL completed at 24 weeks): 56.1% (95% CI: 49.7%–62.6%)

  • S6 (Functional attrition): 10.1% (95% CI: 6.2%–14.0%)

  • S7 (Death between 12 and 24 weeks): 33.8% (95% CI: 27.6%–39.9%)

▸ Among patients in S2 at 12 weeks:

  • S5 (Re-engaged, QOL completed at 24 weeks): 41.8% (95% CI: 32.1%–51.6%)

  • S6 (Sustained functional attrition): 20.4% (95% CI: 12.4%–28.4%)

  • S7 (Death between 12 and 24 weeks): 37.8% (95% CI: 28.2%–47.4%)


Key Insights

  • By 12 weeks, nearly half of the patients (47.4%) remained actively engaged with completed assessments.

  • A substantial proportion (33.1%) experienced functional attrition without formal withdrawal.

  • Mortality before 12 weeks reached 16%, highlighting early risk.

  • Among patients still engaged at 12 weeks (S1), 56.1% successfully completed the 24-week follow-up.

  • However, attrition and mortality between 12 and 24 weeks remained considerable, particularly for patients who were already functionally disengaged at 12 weeks (S2), whose cumulative mortality reached 38% by 24 weeks.


These results underscore the dynamic nature of therapeutic engagement and disengagement in advanced cancer care. They highlight the importance of distinguishing between formal withdrawal and functional attrition, both of which have meaningful implications for patient outcomes, study validity, and clinical interventions.


Objective 3: Explanatory Modeling of State Transitions

This section investigates individual-level predictors of transitions between therapeutic and clinical states using regression-based multi-state models. The aim is to identify baseline and follow-up variables significantly associated with changes in therapeutic engagement and clinical outcomes.

Separate Cox proportional hazards models were fitted for each observed transition. The data were structured in a long format, where each row represented a potential transition for an individual, including information on origin state, destination state, entry time, exit time, event status, and associated covariates. This structure allowed each individual to contribute multiple observations, corresponding to the different transitions they were at risk of experiencing.

For each model:

  • Only patients at risk for the specific transition were included.
  • Transition-specific hazard ratios and 95% confidence intervals were estimated.

Models were only estimated when the number of events was sufficient (≥10). The proportional hazards assumption was assessed for each fitted model.

Candidate Variables

The following variables were selected based on clinical relevance and previous studies:

  • Clinical Variables: ECOG performance status

  • Demographic Variables: Age, Sex

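  • Psychosocial Variables: FACT-G score (Functional Assessment of Cancer Therapy - General; fact_bl, and fact_chg12 for transitions from S1), HADS anxiety and depression scores at baseline (hads_pt_anx_bl, hads_pt_dep_bl), cgpart (Caregiver participation)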

  • Treatment Variables: Randomization arm (Standard Care vs Early Palliative Care), Baseline QOL assessments

Cox proportional hazards models

Table 5: Summary of the Cox model without covariates (null model)
Item Value
Model type Null model (no covariates, stratified by transition)
Log-likelihood -2645.478
Number of observations (used) 731

Interpretation: A stratified Cox proportional hazards model was fitted to the multi-state transition data using Surv(entry, exit, status) ~ strata(from, to), which allows a separate baseline hazard for each transition type. A total of 731 transition intervals were analyzed, and the log-likelihood of the null model was -2645.478. This model is used as the input for estimating cumulative transition hazards and probabilities in the next step.

Table 6: Cox Model Results by Transition
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S0_S1 age 0.991 [0.976, 1.007] 0.262
S0_S1 sexm 1.158 [0.859, 1.56] 0.336
S0_S1 ecogps 0.824 [0.639, 1.063] 0.136
S0_S1 arm.x 1.061 [0.797, 1.411] 0.686
S0_S1 fact_bl 1.002 [0.989, 1.016] 0.729
S0_S1 hads_pt_anx_bl 0.994 [0.942, 1.049] 0.831
S0_S1 hads_pt_dep_bl 0.954 [0.902, 1.01] 0.106
S0_S1 cgpartYes 1.333 [0.962, 1.849] 0.084
S0_S2 age 0.989 [0.97, 1.009] 0.272
S0_S2 sexm 0.944 [0.632, 1.411] 0.78
S0_S2 ecogps 0.938 [0.678, 1.297] 0.698
S0_S2 arm.x 0.936 [0.633, 1.383] 0.74
S0_S2 fact_bl 1.007 [0.989, 1.024] 0.464
S0_S2 hads_pt_anx_bl 1.083 [1.009, 1.162] 0.027
S0_S2 hads_pt_dep_bl 0.992 [0.922, 1.067] 0.821
S0_S2 cgpartYes 0.708 [0.471, 1.064] 0.096
S0_S4 age 1.042 [1.013, 1.073] 0.005
S0_S4 sexm 0.725 [0.421, 1.246] 0.244
S0_S4 ecogps 2.026 [1.257, 3.266] 0.004
S0_S4 arm.x 1.102 [0.65, 1.867] 0.719
S0_S4 fact_bl 0.980 [0.956, 1.005] 0.112
S0_S4 hads_pt_anx_bl 0.906 [0.823, 0.998] 0.044
S0_S4 hads_pt_dep_bl 1.081 [0.995, 1.175] 0.065
S0_S4 cgpartYes 0.942 [0.529, 1.679] 0.839
S1_S5 age 1.002 [0.984, 1.021] 0.815
S1_S5 sexm 1.048 [0.725, 1.516] 0.803
S1_S5 ecogps 0.803 [0.594, 1.086] 0.154
S1_S5 arm.x 1.284 [0.891, 1.85] 0.179
S1_S5 fact_bl 1.008 [0.99, 1.026] 0.373
S1_S5 fact_chg12 1.017 [1.003, 1.031] 0.019
S1_S5 hads_pt_anx_bl 1.103 [1.027, 1.184] 0.007
S1_S5 hads_pt_dep_bl 0.964 [0.898, 1.035] 0.317
S1_S5 cgpartYes 0.888 [0.593, 1.329] 0.564
S1_S6 age 0.999 [0.968, 1.031] 0.951
S1_S6 sexm 1.099 [0.573, 2.107] 0.776
S1_S6 ecogps 1.344 [0.774, 2.335] 0.294
S1_S6 arm.x 0.794 [0.419, 1.502] 0.477
S1_S6 fact_bl 1.002 [0.97, 1.035] 0.925
S1_S6 fact_chg12 1.002 [0.977, 1.027] 0.897
S1_S6 hads_pt_anx_bl 0.921 [0.809, 1.047] 0.207
S1_S6 hads_pt_dep_bl 1.054 [0.929, 1.196] 0.414
S1_S6 cgpartYes 1.652 [0.756, 3.611] 0.208
S1_S7 age 1.007 [0.967, 1.049] 0.742
S1_S7 sexm 0.917 [0.378, 2.224] 0.848
S1_S7 ecogps 1.333 [0.627, 2.834] 0.455
S1_S7 arm.x 0.587 [0.25, 1.382] 0.223
S1_S7 fact_bl 0.965 [0.927, 1.004] 0.081
S1_S7 fact_chg12 0.934 [0.901, 0.968] 0.001
S1_S7 hads_pt_anx_bl 0.794 [0.661, 0.954] 0.014
S1_S7 hads_pt_dep_bl 1.021 [0.84, 1.24] 0.834
S1_S7 cgpartYes 0.706 [0.293, 1.699] 0.437
S2_S5 age 1.028 [0.976, 1.082] 0.3
S2_S5 sexm 0.391 [0.143, 1.071] 0.068
S2_S5 ecogps 0.730 [0.29, 1.839] 0.504
S2_S5 arm.x 1.377 [0.55, 3.45] 0.494
S2_S5 fact_bl 1.053 [1.004, 1.105] 0.035
S2_S5 hads_pt_anx_bl 1.051 [0.902, 1.223] 0.526
S2_S5 hads_pt_dep_bl 1.099 [0.915, 1.32] 0.311
S2_S5 cgpartYes 1.538 [0.57, 4.149] 0.395
S2_S6 age 1.004 [0.974, 1.035] 0.79
S2_S6 sexm 1.279 [0.728, 2.249] 0.392
S2_S6 ecogps 1.304 [0.771, 2.204] 0.322
S2_S6 arm.x 0.654 [0.371, 1.154] 0.143
S2_S6 fact_bl 0.995 [0.969, 1.023] 0.739
S2_S6 hads_pt_anx_bl 1.027 [0.931, 1.132] 0.593
S2_S6 hads_pt_dep_bl 0.977 [0.887, 1.075] 0.633
S2_S6 cgpartYes 0.852 [0.483, 1.504] 0.581
S2_S7 age 0.971 [0.931, 1.013] 0.171
S2_S7 sexm 1.242 [0.576, 2.676] 0.581
S2_S7 ecogps 0.856 [0.417, 1.758] 0.671
S2_S7 arm.x 1.533 [0.703, 3.342] 0.283
S2_S7 fact_bl 0.976 [0.942, 1.012] 0.186
S2_S7 hads_pt_anx_bl 0.931 [0.817, 1.061] 0.285
S2_S7 hads_pt_dep_bl 0.980 [0.868, 1.105] 0.74
S2_S7 cgpartYes 1.098 [0.496, 2.432] 0.818
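As a reading aid for Table 6, the sketch below shows how a hazard ratio, its 95% confidence interval, and its p-value relate to one another under a Wald test on the log-hazard scale. It is written in Python for illustration only (the analysis itself was run in R), the function name is hypothetical, and the assumption that the reported intervals are Wald intervals is not stated in the report. Because the inputs are rounded to three decimals, the recovered p-value is approximate.

```python
import math


def wald_p_from_hr(hr, ci_lower, ci_upper, z_crit=1.96):
    """Recover beta, its standard error, z, and the two-sided Wald p-value."""
    beta = math.log(hr)
    # A Wald CI on the log scale spans beta +/- z_crit * se.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return beta, se, z, p_value


# ECOG performance status for transition S0_S4 (Table 6): HR 2.026 [1.257, 3.266].
beta, se, z, p_value = wald_p_from_hr(2.026, 1.257, 3.266)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p_value:.3f}")
```

For this row the recovered p-value rounds to 0.004, in line with the value reported in the table.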

Results per transition

Results for Transition: S0_S1
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S0_S1 age 0.991 [0.976, 1.007] 0.262
S0_S1 sexm 1.158 [0.859, 1.56] 0.336
S0_S1 ecogps 0.824 [0.639, 1.063] 0.136
S0_S1 arm.x 1.061 [0.797, 1.411] 0.686
S0_S1 fact_bl 1.002 [0.989, 1.016] 0.729
S0_S1 hads_pt_anx_bl 0.994 [0.942, 1.049] 0.831
S0_S1 hads_pt_dep_bl 0.954 [0.902, 1.01] 0.106
S0_S1 cgpartYes 1.333 [0.962, 1.849] 0.084
Results for Transition: S0_S2
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S0_S2 age 0.989 [0.97, 1.009] 0.272
S0_S2 sexm 0.944 [0.632, 1.411] 0.78
S0_S2 ecogps 0.938 [0.678, 1.297] 0.698
S0_S2 arm.x 0.936 [0.633, 1.383] 0.74
S0_S2 fact_bl 1.007 [0.989, 1.024] 0.464
S0_S2 hads_pt_anx_bl 1.083 [1.009, 1.162] 0.027
S0_S2 hads_pt_dep_bl 0.992 [0.922, 1.067] 0.821
S0_S2 cgpartYes 0.708 [0.471, 1.064] 0.096
Results for Transition: S0_S4
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S0_S4 age 1.042 [1.013, 1.073] 0.005
S0_S4 sexm 0.725 [0.421, 1.246] 0.244
S0_S4 ecogps 2.026 [1.257, 3.266] 0.004
S0_S4 arm.x 1.102 [0.65, 1.867] 0.719
S0_S4 fact_bl 0.980 [0.956, 1.005] 0.112
S0_S4 hads_pt_anx_bl 0.906 [0.823, 0.998] 0.044
S0_S4 hads_pt_dep_bl 1.081 [0.995, 1.175] 0.065
S0_S4 cgpartYes 0.942 [0.529, 1.679] 0.839
Results for Transition: S1_S5
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S1_S5 age 1.002 [0.984, 1.021] 0.815
S1_S5 sexm 1.048 [0.725, 1.516] 0.803
S1_S5 ecogps 0.803 [0.594, 1.086] 0.154
S1_S5 arm.x 1.284 [0.891, 1.85] 0.179
S1_S5 fact_bl 1.008 [0.99, 1.026] 0.373
S1_S5 fact_chg12 1.017 [1.003, 1.031] 0.019
S1_S5 hads_pt_anx_bl 1.103 [1.027, 1.184] 0.007
S1_S5 hads_pt_dep_bl 0.964 [0.898, 1.035] 0.317
S1_S5 cgpartYes 0.888 [0.593, 1.329] 0.564
Results for Transition: S1_S6
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S1_S6 age 0.999 [0.968, 1.031] 0.951
S1_S6 sexm 1.099 [0.573, 2.107] 0.776
S1_S6 ecogps 1.344 [0.774, 2.335] 0.294
S1_S6 arm.x 0.794 [0.419, 1.502] 0.477
S1_S6 fact_bl 1.002 [0.97, 1.035] 0.925
S1_S6 fact_chg12 1.002 [0.977, 1.027] 0.897
S1_S6 hads_pt_anx_bl 0.921 [0.809, 1.047] 0.207
S1_S6 hads_pt_dep_bl 1.054 [0.929, 1.196] 0.414
S1_S6 cgpartYes 1.652 [0.756, 3.611] 0.208
Results for Transition: S1_S7
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S1_S7 age 1.007 [0.967, 1.049] 0.742
S1_S7 sexm 0.917 [0.378, 2.224] 0.848
S1_S7 ecogps 1.333 [0.627, 2.834] 0.455
S1_S7 arm.x 0.587 [0.25, 1.382] 0.223
S1_S7 fact_bl 0.965 [0.927, 1.004] 0.081
S1_S7 fact_chg12 0.934 [0.901, 0.968] 0.001
S1_S7 hads_pt_anx_bl 0.794 [0.661, 0.954] 0.014
S1_S7 hads_pt_dep_bl 1.021 [0.84, 1.24] 0.834
S1_S7 cgpartYes 0.706 [0.293, 1.699] 0.437
Results for Transition: S2_S5
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S2_S5 age 1.028 [0.976, 1.082] 0.3
S2_S5 sexm 0.391 [0.143, 1.071] 0.068
S2_S5 ecogps 0.730 [0.29, 1.839] 0.504
S2_S5 arm.x 1.377 [0.55, 3.45] 0.494
S2_S5 fact_bl 1.053 [1.004, 1.105] 0.035
S2_S5 hads_pt_anx_bl 1.051 [0.902, 1.223] 0.526
S2_S5 hads_pt_dep_bl 1.099 [0.915, 1.32] 0.311
S2_S5 cgpartYes 1.538 [0.57, 4.149] 0.395
Results for Transition: S2_S6
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S2_S6 age 1.004 [0.974, 1.035] 0.79
S2_S6 sexm 1.279 [0.728, 2.249] 0.392
S2_S6 ecogps 1.304 [0.771, 2.204] 0.322
S2_S6 arm.x 0.654 [0.371, 1.154] 0.143
S2_S6 fact_bl 0.995 [0.969, 1.023] 0.739
S2_S6 hads_pt_anx_bl 1.027 [0.931, 1.132] 0.593
S2_S6 hads_pt_dep_bl 0.977 [0.887, 1.075] 0.633
S2_S6 cgpartYes 0.852 [0.483, 1.504] 0.581
Results for Transition: S2_S7
Transition Covariate Hazard Ratio 95% Confidence Interval p-value
S2_S7 age 0.971 [0.931, 1.013] 0.171
S2_S7 sexm 1.242 [0.576, 2.676] 0.581
S2_S7 ecogps 0.856 [0.417, 1.758] 0.671
S2_S7 arm.x 1.533 [0.703, 3.342] 0.283
S2_S7 fact_bl 0.976 [0.942, 1.012] 0.186
S2_S7 hads_pt_anx_bl 0.931 [0.817, 1.061] 0.285
S2_S7 hads_pt_dep_bl 0.980 [0.868, 1.105] 0.74
S2_S7 cgpartYes 1.098 [0.496, 2.432] 0.818

Summary of Covariate Effects on Transition Hazards

Across the different transitions between therapeutic states, the following findings were observed:

  • S0 → S1: No covariates were significantly associated with the hazard of this transition.

  • S0 → S2: The only significant predictor was HADS Anxiety at baseline (HR = 1.08, p = 0.027).

  • S0 → S4: Significant predictors were age (HR = 1.04, p = 0.005), ECOG performance status (HR = 2.03, p = 0.004), and HADS Anxiety at baseline (HR = 0.91, p = 0.044).

  • S1 → S5: Significant predictors were FACT score change at 12 weeks (HR = 1.02, p = 0.019) and HADS Anxiety at baseline (HR = 1.10, p = 0.007).

  • S1 → S6: No covariates were significantly associated with the hazard of this transition.

  • S1 → S7: Significant predictors were FACT score change at 12 weeks (HR = 0.93, p = 0.001) and HADS Anxiety at baseline (HR = 0.79, p = 0.014).

  • S2 → S5: The only significant predictor was the FACT baseline score (HR = 1.05, p = 0.035).

  • S2 → S6 and S2 → S7: No covariates were significantly associated with the hazard.

Visualization of Hazard Ratios (HR)

Hazard Ratio (HR) Heatmap by Transition and Covariate

Forest Plot of Hazard Ratios by Transition

Interpretation Guide for the Forest Plot

Explanatory table of forest plot elements
Element Meaning
🔵 Point Estimated hazard ratio (HR) for the covariate
─── Line 95% confidence interval around the HR
Line crosses 1 Effect not statistically significant (p > 0.05)
Line does not cross 1 Effect statistically significant (p < 0.05)
HR > 1 Increased hazard of transition associated with the covariate
HR < 1 Decreased hazard of transition associated with the covariate
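
The reading rules in this guide can be applied mechanically to the tabulated estimates. The sketch below is illustrative Python, not the R code used for the analysis; the helper name `read_forest_row` is hypothetical, and the rows are copied from the per-transition tables above.

```python
# Rows copied from the per-transition tables:
# (transition, covariate, HR, CI lower, CI upper, p-value)
rows = [
    ("S0_S4", "age", 1.042, 1.013, 1.073, 0.005),
    ("S0_S4", "ecogps", 2.026, 1.257, 3.266, 0.004),
    ("S0_S4", "arm.x", 1.102, 0.650, 1.867, 0.719),
    ("S1_S7", "hads_pt_anx_bl", 0.794, 0.661, 0.954, 0.014),
    ("S1_S7", "cgpartYes", 0.706, 0.293, 1.699, 0.437),
]


def read_forest_row(hr, lower, upper):
    """Apply the forest plot reading rules to one estimate (hypothetical helper)."""
    significant = not (lower <= 1.0 <= upper)  # the interval does not cross 1
    if not significant:
        direction = "no clear effect"
    elif hr > 1.0:
        direction = "increased hazard"
    else:
        direction = "decreased hazard"
    return significant, direction


readings = {(t, c): read_forest_row(hr, lo, hi) for t, c, hr, lo, hi, _ in rows}

# For these rows, the interval rule and the p < 0.05 rule should agree.
agree = all(readings[(t, c)][0] == (p < 0.05) for t, c, _, _, _, p in rows)

for (t, c), (sig, direction) in readings.items():
    print(f"{t:6s} {c:15s} significant={sig!s:5s} {direction}")
print("interval rule agrees with p-values:", agree)
```

The agreement check is only a consistency test on the reported numbers; it does not re-estimate anything.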

Global Trends of Covariates

Summary of Average Hazard Ratios and Significant Transitions by Covariate
Covariate Average.HR Significant.Transitions
hads_pt_anx_bl 0.979 4
fact_chg12 0.984 2
fact_bl 0.999 1
age 1.004 1
ecogps 1.129 1
sex 0.978 0
hads_pt_dep_bl 1.014 0
arm.x 1.036 0
cgpart 1.080 0
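
The summary columns can be reproduced from the per-transition tables. The sketch below assumes that Average.HR is the plain arithmetic mean of the nine transition-specific hazard ratios and that Significant.Transitions counts p-values below 0.05; the report does not state either definition, but both assumptions reproduce the hads_pt_anx_bl row. It is illustrative Python, not the original R code.

```python
from statistics import mean

# (HR, p-value) for hads_pt_anx_bl in each of the nine transitions,
# copied from the per-transition tables above.
anxiety = {
    "S0_S1": (0.994, 0.831), "S0_S2": (1.083, 0.027), "S0_S4": (0.906, 0.044),
    "S1_S5": (1.103, 0.007), "S1_S6": (0.921, 0.207), "S1_S7": (0.794, 0.014),
    "S2_S5": (1.051, 0.526), "S2_S6": (1.027, 0.593), "S2_S7": (0.931, 0.285),
}

# Assumed definitions: arithmetic mean of HRs, count of p < 0.05.
average_hr = round(mean(hr for hr, _ in anxiety.values()), 3)
n_significant = sum(p < 0.05 for _, p in anxiety.values())

print(average_hr, n_significant)
```

Because hazard ratios above and below 1 offset each other in such an average, a value near 1 can coexist with several significant transitions in opposite directions, as is the case for anxiety here.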

Key Patterns Across Transitions

A cross-transition analysis revealed some consistent and divergent associations:

  • Age showed modest and inconsistent associations across transitions, with a significant increased risk only in transition S0→S4 (HR=1.042, p=0.005), suggesting older patients may be more likely to move directly from enrollment to dropout/death.

  • Sex was not a consistent predictor in any transition.

  • ECOG Performance Status was strongly associated with higher risk of adverse transitions in S0→S4 (HR=2.03, p=0.004), reflecting that lower baseline functioning predicts worse early outcomes.

  • Treatment Arm (arm.x) did not show a significant effect in any transition, suggesting limited differential effectiveness across arms regarding state transitions.


Psychosocial Factors

Baseline anxiety (HADS-A) was significantly associated with four transitions, but the direction of the association differed across them:

  • S0→S2 (worsening symptoms): HR=1.08, p=0.027
  • S1→S5 (remaining stable from W12 to W24): HR=1.10, p=0.007
  • S1→S7 (transition to death): HR=0.79, p=0.014
  • S0→S4 (early dropout or death): HR=0.91, p=0.044

Interestingly, baseline anxiety was associated with both an increased likelihood of trial completion (S1→S5) and a decreased hazard of early dropout or death (S0→S4) and of later death (S1→S7). This counterintuitive finding may reflect greater health vigilance and proactive behavior among moderately anxious patients, potentially leading to earlier symptom reporting and better adherence to medical advice. Conversely, anxiety was also associated with increased risk of functional attrition (S0→S2), suggesting that while anxiety may motivate clinical engagement, it could simultaneously exacerbate subjective distress leading to study withdrawal for functional reasons. Further studies are warranted to explore the non-linear, context-dependent effects of anxiety on health outcomes in this population, as well as the possibility of confounding by severity.

Depression (HADS-D) was not significantly associated with any transition, although a borderline effect appeared in S0→S4 (HR=1.08, p=0.065).

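FACT scores showed their clearest effects for the change from baseline to week 12:

  • A larger FACT change at W12 was associated with a significantly lower hazard of death in S1→S7 (HR=0.93, p=0.001)
  • A larger FACT change at W12 was also associated with a higher hazard of the S1→S5 transition (HR=1.02, p=0.019)
  • A higher FACT baseline score was associated with a higher hazard of the S2→S5 transition (HR=1.05, p=0.035)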

These patterns highlight the predictive value of patient-reported outcomes, especially for identifying vulnerable patients during follow-up.


Caregiver Participation (cgpart)

Caregiver participation showed varied and mostly non-significant effects, but borderline associations appeared in:

  • S0→S1: HR=1.33, p=0.084
  • S0→S2: HR=0.71, p=0.096

These suggest potential roles of caregiver support in modulating early transitions, warranting further investigation.


Transition-Specific Observations

S0 → S1 (Improved Symptoms at Week 12)

This transition captures patients who report improvement in symptoms between baseline and week 12. None of the covariates reached statistical significance at the 5% level, although caregiver participation (cgpart) approached significance (HR=1.33, p=0.084), suggesting a potential positive influence on symptom improvement. Other clinical or psychosocial factors, including FACT and HADS scores, were not associated with this transition.


S0 → S2 (Worsening Symptoms at Week 12)

Anxiety at baseline (HADS-A) was significantly associated with increased likelihood of symptom worsening (HR=1.08, p=0.027). This emphasizes the role of psychological distress at baseline in shaping early subjective health deterioration. Caregiver participation was borderline protective (HR=0.71, p=0.096), possibly moderating worsening trends.


S0 → S4 (Dropout or Death Before Week 12)

Three variables were significantly associated:

  • Age: HR=1.04, p=0.005 — Older patients had a higher likelihood of early attrition.
  • ECOG performance status: HR=2.03, p=0.004 — Functional impairment strongly predicted transition to dropout/death.
  • Anxiety (HADS-A): HR=0.91, p=0.044 — Interestingly, higher anxiety was protective in this transition, possibly due to increased help-seeking behavior or more attentive clinical care.

This transition reflects early therapeutic vulnerability, highlighting the value of initial clinical assessments.


S1 → S5 (Remain Stable from W12 to W24)

  • Change in FACT score between baseline and week 12 was a significant predictor (HR=1.02, p=0.019), indicating that improvements in quality of life during the first 12 weeks increased the probability of remaining stable later.
  • Anxiety at baseline was also positively associated with this transition (HR=1.10, p=0.007), consistent with the greater clinical engagement among anxious patients discussed under Psychosocial Factors.

S1 → S7 (Transition to Death after Week 12)

Two psychosocial indicators emerged as significant:

  • FACT change: HR=0.93, p=0.001 — Suggesting that improvements in quality of life at week 12 reduce the risk of death.
  • Anxiety (HADS-A): HR=0.79, p=0.014 — Again, appearing protective, which might reflect active care-seeking or heightened treatment engagement among anxious patients.

These findings suggest that patient-reported outcomes at week 12 could serve as early warning signs for long-term outcomes.


S2 → S5 (Remain Stable from W12 to W24 after Symptom Worsening)

Only FACT baseline score was significant (HR=1.05, p=0.035), suggesting that higher initial perceived quality of life was associated with stability, even after a worsening phase. This may imply resilience among patients with higher baseline functioning.


Other Transitions (S1→S6, S2→S6, S2→S7)

None of the variables reached statistical significance. The wide confidence intervals and high p-values reflect lower statistical power, likely due to small sample sizes or rare events. These models should be interpreted cautiously.

Objective 4: Risk Prediction Model for Early Therapeutic Attrition

The aim of this section is to develop and internally validate a risk prediction model to identify patients at high risk of experiencing therapeutic attrition during the initial phase of treatment. This model seeks to support early intervention strategies by leveraging baseline and early follow-up clinical and psychosocial indicators.

Outcome Definition

The outcome variable was defined as early therapeutic attrition within the first 12 weeks, operationalized as:

  • Attrition (1): Transition to either S2 (Functional Attrition) or S3 (Formal Attrition) before or at 12 weeks.
  • No Attrition (0): Remaining in the active treatment state (S1) at 12 weeks or beyond.

This binary outcome was constructed from the multi-state transition data, ensuring time-consistent classification.

Predictor Variables

Predictors included baseline demographic, psychosocial, and clinical measures, along with early follow-up quality-of-life (QOL) scores. All variables were selected a priori based on theoretical relevance and availability before or at week 12:

  • Demographics: Age, sex, and race (categorical, with White as the reference). Race categories were recoded to group less common categories (e.g., “American Indian or Alaska Native,” “Native Hawaiian or Other Pacific Islander”) to ensure stability in the model.

  • Functional Status: ECOG Performance Status (ecogps)

  • Psychosocial Metrics: HADS Anxiety (hads_pt_anx_bl), HADS Depression (hads_pt_dep_bl), Caregiver participation (cgpartYes)

  • Quality of Life (QOL) Scores: FACT-G at baseline (fact_bl), change in FACT-G at 12 weeks (fact_chg12), patient-reported QOL at week 12

  • Treatment Arm: Randomization group – Early Palliative Care (EPC) vs Standard Care (arm.x)

Data Preparation

Prior to model development, the dataset was prepared to ensure consistency, completeness, and appropriate coding of variables for analysis. The following steps were undertaken:

Patient Selection: Only patients with available outcome data at the 12-week follow-up were included in the analysis. This ensured accurate classification of early therapeutic attrition status.

Outcome Variable Construction: A binary outcome variable, attrition_12, was created to indicate whether a patient experienced therapeutic attrition within the first 12 weeks. Patients who transitioned to either Functional Attrition (S2) or Formal Attrition (S3) at or before week 12 were coded as 1, while those remaining in Active Treatment (S1) at week 12 or beyond were coded as 0.

Recoding and Cleaning of Predictor Variables:

  • Demographic variables were formatted appropriately: Age was converted to numeric, and Sex was coded as 1 for male and 0 for female. Caregiver participation (cgpart) was recoded as 1 for “Yes” and 0 for “No.”

  • The race variable was recategorized to improve model stability by combining less frequent categories (American Indian or Alaska Native, Native Hawaiian or Other Pacific Islander) into an “Other” group. Patients who refused to report their race or for whom it was unavailable were coded as “Unknown,” and missing data were categorized separately.

  • The race variable was then converted into a factor and re-leveled to use “White” as the reference category for regression modeling.

This data preparation process ensured a clean, consistent, and clinically meaningful dataset for subsequent risk model development.

Modeling Strategy

To identify and quantify the risk of early attrition, the following modeling pipeline was applied:

Logistic Regression

A binary logistic regression model using the glm() function in R was fit as the initial step to assess associations between predictors and the outcome. Odds ratios (ORs), 95% confidence intervals, and p-values were reported. This model served as the benchmark.

Linearity Assumption Check (Restricted Cubic Splines)
Table 4: Wald Chi-Square Test Results for Logistic Regression Model
Factor Chi-Square Degrees of Freedom p-value
age 2.860 2 0.239
  Nonlinear (age) 1.084 1 0.298
fact_bl_bt 1.225 2 0.542
  Nonlinear (fact_bl_bt) 0.973 1 0.324
hads_pt_anx_bl_bt 6.451 2 0.040
  Nonlinear (hads_pt_anx_bl_bt) 0.039 1 0.843
hads_pt_dep_bl_bt 3.739 2 0.154
  Nonlinear (hads_pt_dep_bl_bt) 3.643 1 0.056
TOTAL NONLINEAR 4.974 4 0.290
TOTAL 12.432 8 0.133

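We assessed the linearity assumption between continuous predictors and the logit of the probability of attrition using restricted cubic splines within a logistic regression framework. The Wald chi-square tests for non-linearity were not statistically significant for any of the examined variables (all p-values > 0.05), although the non-linear term for baseline depression was borderline (p = 0.056). The significant overall test for baseline anxiety (p = 0.040) reflects its association with attrition, not a departure from linearity.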

This indicates that the relationships between age, baseline FACT-G score, baseline anxiety and depression scores, and the log odds of attrition can reasonably be modeled as linear on the logit scale in this dataset.

Accordingly, these continuous variables will be entered as linear terms in the subsequent multivariable logistic regression models.
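
The p-values in Table 4 follow directly from the Wald chi-square statistics and their degrees of freedom. As a check on the arithmetic, the sketch below computes the chi-square upper-tail probability in closed form for the cases that occur in the table (1 degree of freedom, or an even number). It is illustrative Python using only the standard library, and `chi2_sf` is a hypothetical helper name, not a function from the original analysis.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Closed forms exist for df = 1 and for even df, which covers Table 4.
    """
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("this sketch only covers df = 1 and even df")


# (label, chi-square, df) copied from Table 4
table4 = [
    ("age", 2.860, 2),
    ("Nonlinear (hads_pt_dep_bl_bt)", 3.643, 1),
    ("hads_pt_anx_bl_bt", 6.451, 2),
    ("TOTAL NONLINEAR", 4.974, 4),
    ("TOTAL", 12.432, 8),
]
p_values = {label: chi2_sf(stat, df) for label, stat, df in table4}
for label, p in p_values.items():
    print(f"{label:32s} p = {p:.3f}")
```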

Multicollinearity assumption check (VIF)

Table 5: Variance Inflation Factors (VIF)
Variable GVIF Df GVIF^(1/(2*Df))
age 1.098 1 1.048
sex 1.068 1 1.033
ecogps 1.211 1 1.101
hads_pt_anx_bl 1.788 1 1.337
hads_pt_dep_bl 2.546 1 1.596
cgpart 1.092 1 1.045
fact_bl 2.302 1 1.517
arm.x 1.020 1 1.010
race 1.252 4 1.028

Variance Inflation Factors (VIF) were calculated to assess multicollinearity among predictor variables. All GVIF values were below 3 (the largest being 2.546 for hads_pt_dep_bl) and all GVIF^(1/(2*Df)) values were below 1.6, indicating that there was no problematic multicollinearity in the model.
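
The last column of Table 5 rescales each GVIF by its degrees of freedom so that a multi-level factor such as race can be compared with single-coefficient predictors. A minimal sketch of that transformation, in illustrative Python with values copied from Table 5:

```python
# (GVIF, Df) copied from Table 5
gvif = {
    "hads_pt_dep_bl": (2.546, 1),
    "fact_bl": (2.302, 1),
    "race": (1.252, 4),
}

# GVIF^(1/(2*Df)): for Df = 1 this is simply the square root of the VIF.
adjusted = {name: round(value ** (1 / (2 * df)), 3) for name, (value, df) in gvif.items()}
print(adjusted)
```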

Model result

  • Higher baseline anxiety scores significantly increased the odds of early attrition (OR = 1.11 per point, 95% CI 1.01 to 1.22, p = 0.031).

  • Patients with unknown race status had roughly 15 times higher odds of early attrition (OR = 15.57, 95% CI 2.10 to 321.89, p = 0.018); the very wide interval indicates an imprecise estimate that may reflect underlying data quality issues or unmeasured confounding.

  • Other variables such as age, sex, ECOG performance status, depression, baseline FACT-G, caregiver participation, and treatment arm were not statistically significant.

Confusion Matrix

Table 6: Confusion Matrix for Attrition Status at 12 Weeks (Prediction vs Reference)
No attrition (12w) Attrition (12w)
Pred: No attrition (12w) 242 88
Pred: Attrition (12w) 2 7
Table 7: Performance Metrics (Threshold = 0.5)
Metric Value
Accuracy 0.7345
95% CI (Accuracy) (0.6841, 0.7808)
Kappa 0.0905
Mcnemar’s Test P-Value <2e-16
Sensitivity 0.0737
Specificity 0.9918
Positive Predictive Value (PPV) 0.7778
Negative Predictive Value (NPV) 0.7333
Prevalence 0.2802
Detection Rate 0.0206
Detection Prevalence 0.0265
Balanced Accuracy 0.5327

The model yielded an overall accuracy of 73.5% and a kappa of 0.09, indicating very low agreement beyond chance. Despite high specificity (99.2%) and an NPV of 73.3%, sensitivity remained poor (7.4%): only 7 of the 95 patients with attrition were flagged, which severely limits the model’s ability to detect true attrition cases.
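
Every entry in Table 7 except the confidence interval and McNemar's test can be recomputed from the four counts in Table 6. The sketch below does this in illustrative Python, taking attrition as the positive class, which is what the reported sensitivity of 0.0737 (7/95) implies.

```python
# Counts from Table 6 (positive class = attrition within 12 weeks)
tp, fn = 7, 88    # reference: attrition
fp, tn = 2, 242   # reference: no attrition
n = tp + fn + fp + tn

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
accuracy = (tp + tn) / n
balanced_accuracy = (sensitivity + specificity) / 2

# Cohen's kappa: observed agreement corrected for agreement expected by chance
expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
kappa = (accuracy - expected) / (1 - expected)

for name, value in [("accuracy", accuracy), ("kappa", kappa),
                    ("sensitivity", sensitivity), ("specificity", specificity),
                    ("PPV", ppv), ("NPV", npv),
                    ("balanced accuracy", balanced_accuracy)]:
    print(f"{name:18s} {value:.4f}")
```

The high accuracy is driven almost entirely by the majority class: predicting "no attrition" for every patient would already be correct for 244 of 339 patients (72.0%).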

Model Calibration

Table 8: Hosmer-Lemeshow Test and Nagelkerke R²
Test Value
Hosmer-Lemeshow Chi² 13.360
Hosmer-Lemeshow DF 8.000
Hosmer-Lemeshow p-value 0.100
Nagelkerke R² 0.066

Since the p-value (0.100) is greater than 0.05, we fail to reject the null hypothesis of adequate fit, so the Hosmer-Lemeshow test provides no evidence of miscalibration. A non-significant result does not by itself prove good calibration, and the low Nagelkerke R² (0.066) shows that the model explains little of the variation in attrition.

Coefficient table

Table 9: Logistic Regression Results (Early Attrition ≤12 weeks)
Covariate OR std.error statistic p-value Lower 95% CI Upper 95% CI
(Intercept) 0.244 1.400 -1.009 0.313 0.015 3.751
age 0.993 0.013 -0.530 0.596 0.967 1.019
sex 1.001 0.258 0.004 0.997 0.605 1.666
ecogps 0.938 0.224 -0.285 0.776 0.603 1.458
hads_pt_anx_bl 1.107 0.047 2.153 0.031 1.010 1.216
hads_pt_dep_bl 0.972 0.047 -0.595 0.552 0.885 1.066
cgpart 0.630 0.274 -1.682 0.092 0.369 1.084
fact_bl 1.009 0.011 0.747 0.455 0.986 1.032
arm.x 1.002 0.251 0.007 0.994 0.613 1.640
raceAsian 1.140 0.637 0.206 0.837 0.290 3.743
raceBlack 1.278 0.376 0.652 0.514 0.598 2.636
raceOther 1.098 0.907 0.103 0.918 0.143 6.115
raceUnknown 15.568 1.165 2.357 0.018 2.097 321.888
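
The columns of Table 9 are linked: the Wald statistic is the log odds ratio divided by its standard error, and the p-value is the two-sided normal tail of that statistic. The sketch below assumes that std.error is reported on the log-odds scale, which is what the numbers imply but the report does not state. It is illustrative Python, and `wald_from_or` is a hypothetical helper name. Small differences from the table arise because the published ORs and standard errors are rounded.

```python
import math


def wald_from_or(odds_ratio, std_error):
    """Return (log odds ratio, Wald z, two-sided p), assuming SE is on the log-odds scale."""
    beta = math.log(odds_ratio)
    z = beta / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided standard normal tail
    return beta, z, p


# (OR, std.error) copied from Table 9
table9 = {
    "hads_pt_anx_bl": (1.107, 0.047),
    "cgpart": (0.630, 0.274),
    "raceUnknown": (15.568, 1.165),
}
wald = {name: wald_from_or(or_, se) for name, (or_, se) in table9.items()}
for name, (beta, z, p) in wald.items():
    print(f"{name:15s} beta={beta:+.3f} z={z:+.3f} p={p:.3f}")
```

The reported confidence limits for raceUnknown (2.097 to 321.888) are not symmetric around the estimate on the log scale, so they were presumably not computed as exp(beta ± 1.96 × SE); this sketch therefore does not attempt to reproduce them.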

Model Performance

Penalized Logistic Regression (LASSO)

To address potential overfitting and multicollinearity among predictors, we applied a penalized logistic regression using the Least Absolute Shrinkage and Selection Operator (LASSO). This approach allows simultaneous variable selection and coefficient shrinkage by constraining the sum of absolute values of coefficients, potentially setting some to zero.

Model tuning was performed via 5-fold cross-validation to identify the optimal penalty parameter (lambda).

Cross-validation curve (lambda vs. deviance)

Figure 4: Cross-validation curve for the LASSO logistic regression model.

  • Each red dot represents the mean binomial deviance estimated via cross-validation for a given log(λ) value, with vertical bars indicating ±1 standard error.

  • The left vertical dashed line indicates the value of λ that minimizes the deviance (lambda.min).

  • The right vertical dashed line shows the largest value of λ within one standard error of the minimum deviance (lambda.1se), often preferred for a more parsimonious model.

In this analysis, we selected lambda.min to optimize the model’s predictive performance while enabling automatic variable selection.

Coefficients table at lambda.min

Table 10: LASSO Coefficients (lambda.min = 0.0251)
Variable Coefficient
(Intercept) -0.9748883
hads_pt_anx_bl 0.0225685
cgpart -0.2242994
raceUnknown 1.3995157

At the optimal λ (lambda.min = 0.0251), the following coefficients were retained in the model: hads_pt_anx_bl, cgpart, raceUnknown. Coefficients equal to zero were effectively excluded by the LASSO penalty.
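
The retained coefficients define a simple risk equation on the logit scale. The sketch below turns Table 10 into predicted probabilities. It assumes that the coefficients apply to the unstandardized predictors, that cgpart and raceUnknown are 0/1 indicators, and that the HADS anxiety subscale ranges from 0 to 21. It is illustrative Python, and `predicted_risk` is a hypothetical helper name.

```python
import math

# Coefficients copied from Table 10 (log-odds scale)
COEF = {
    "(Intercept)": -0.9748883,
    "hads_pt_anx_bl": 0.0225685,
    "cgpart": -0.2242994,
    "raceUnknown": 1.3995157,
}


def predicted_risk(hads_pt_anx_bl, cgpart, race_unknown):
    """Predicted probability of attrition by week 12 under the Table 10 coefficients."""
    eta = (COEF["(Intercept)"]
           + COEF["hads_pt_anx_bl"] * hads_pt_anx_bl
           + COEF["cgpart"] * cgpart
           + COEF["raceUnknown"] * race_unknown)
    return 1.0 / (1.0 + math.exp(-eta))


baseline = predicted_risk(0, 0, 0)          # no anxiety, no caregiver, race known
max_anxiety = predicted_risk(21, 0, 0)      # highest possible HADS-A score
unknown_race = predicted_risk(0, 1, 1)      # caregiver present, race unknown
print(round(baseline, 3), round(max_anxiety, 3), round(unknown_race, 3))
```

Under these assumptions, a patient with known race never reaches a predicted risk of 0.5, even at the maximum anxiety score, which is consistent with the very low sensitivity observed at the 0.5 threshold.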

ROC curve and AUC

The LASSO logistic regression model achieved an AUC of 0.597, indicating modest discriminative ability in distinguishing between patients who experienced therapeutic attrition within 12 weeks and those who did not. An AUC close to 0.6 reflects limited predictive performance, suggesting that while some discriminative signal exists, the model’s clinical utility for individual risk prediction remains modest.

Machine Learning Approaches

Random Forest:

A Random Forest was applied to capture non-linearities and high-order interactions. Variable importance was extracted to identify key drivers of attrition.


Variable Importance Plot

The Variable Importance Plot highlights which predictors contribute most to the model’s classification decisions.

Mean Decrease in Accuracy: Measures how much out-of-bag prediction accuracy drops when a variable’s values are randomly permuted. Higher values mean more predictive contribution.

Mean Decrease in Gini (Node Impurity): Measures how much a variable contributes to reducing node impurity when it is used to split nodes in the trees.

Top Predictors (by both metrics)

  • FACT_BL (baseline FACT-G quality-of-life score) is the top contributor by Gini.

  • Treatment arm (arm.x) is the top contributor by MeanDecreaseAccuracy, though it ranks lower by Gini.

  • Age and HADS anxiety score (hads_pt_anx_bl) are important predictors across both metrics.

  • HADS depression score (hads_pt_dep_bl) and ECOG Performance Status (ecogps) show moderate importance.

  • Race, sex, and caregiver participation (cgpart) appear less influential.

Implications:

  • Psychosocial factors and baseline quality of life ranked highest among the predictors of attrition in our cohort.

  • Treatment assignment (arm.x) appears particularly influential in improving overall classification accuracy, though its role in splitting decision trees (Gini) is more limited.

  • Clinical and demographic features like age and ECOG performance status also play notable roles.

  • Lower contribution from variables like sex and caregiver participation might suggest limited direct association in this dataset, though interaction effects might still exist.

ROC and AUC

The Random Forest model, trained after class balancing and hyperparameter tuning, achieved an AUC of 0.55, reflecting poor discriminative ability, below both the logistic regression model (AUC = 0.62) and the LASSO model (AUC = 0.60). Despite tuning, overall predictive performance remains limited.

XGBoost (Extreme Gradient Boosting):

To further enhance predictive performance, we applied an advanced ensemble technique: Extreme Gradient Boosting (XGBoost). Hyperparameters were tuned using 5-fold cross-validation with early stopping to avoid overfitting.

Data preparation involved selecting the same predictors as previous models, converting categorical variables into dummy variables, and creating an optimized DMatrix structure for the XGBoost algorithm.

Model fitting used a binary logistic objective with the AUC metric as the evaluation criterion. The optimal number of boosting iterations was determined via cross-validation.

Feature Importance (XGBoost)

Table 11: XGBoost – Feature Importance (by Gain)
Feature Gain Cover Frequency
fact_bl 0.357 0.299 0.305
hads_pt_anx_bl 0.148 0.180 0.195
age 0.146 0.119 0.183
hads_pt_dep_bl 0.134 0.087 0.134
arm.x 0.076 0.029 0.037
raceUnknown 0.055 0.139 0.037
ecogps 0.034 0.031 0.049
cgpart 0.029 0.091 0.024
sex 0.018 0.022 0.024
raceBlack 0.004 0.004 0.012

Top predictors by Gain:

  • FACT_BL (baseline FACT-G quality-of-life score) was by far the most influential predictor (Gain = 0.357).

  • HADS anxiety and age followed, with Gain values of 0.148 and 0.146 respectively.

  • HADS depression and treatment arm (arm.x) also contributed meaningfully.

  • Variables like race, sex, and caregiver participation had limited importance.

The XGBoost model again ranked psychosocial factors and baseline quality of life highest for predicting attrition, while also suggesting a stronger influence of treatment assignment (arm.x) than previously observed with Random Forest’s Gini index.

Model Performance (XGBoost)

The final XGBoost model achieved an AUC of 0.94 on the full dataset, substantially outperforming logistic regression (AUC = 0.62), LASSO regression (AUC = 0.60), and Random Forest (AUC = 0.55 after hyperparameter tuning).

On this full-data measure, XGBoost appears to be the strongest model. However, this apparent discrimination was estimated on the same data used to fit the model and is therefore likely optimistic; the cross-validated results in Table 14 give a more realistic picture. External validation on independent data would be required before any clinical implementation.

Models Performance and Evaluation

Model performance was assessed through discrimination (AUC) and threshold-based classification metrics:

Table 12: Comparative Performance of Predictive Models
Model AUC Accuracy Sensitivity Specificity Positive Predictive Value Negative Predictive Value Balanced Accuracy
Logistic Regression 0.622 0.735 0.074 0.992 0.778 0.733 0.533
LASSO Regression 0.597 0.729 0.042 0.996 0.800 0.728 0.519
Random Forest 0.552 0.566 0.582 0.549 0.564 0.568 0.566
XGBoost 0.943 0.838 0.463 0.984 0.917 0.825 0.723

Interpretation of Model Performance

Across all models, performance varied substantially in both discrimination and classification metrics:

  • XGBoost demonstrated the highest discriminative ability, with an AUC of 0.943, considerably outperforming Logistic Regression (AUC = 0.622), LASSO Regression (AUC = 0.597), and Random Forest (AUC = 0.552). Its balanced accuracy (0.723) was also markedly superior.

  • In terms of accuracy, XGBoost again achieved the highest score (0.838), followed by Logistic Regression (0.735) and LASSO (0.729).

  • Sensitivity (true positive rate for attrition detection) was highest in the Random Forest model (0.582), but this came at the cost of the lowest specificity (0.549) and lower overall discriminative power.

  • Both Logistic and LASSO regression models showed high specificity (0.992 and 0.996 respectively) but poor sensitivity (0.074 and 0.042), reflecting the challenge of detecting attrition cases in an imbalanced dataset.

  • Positive Predictive Value (PPV) was highest in XGBoost (0.917), indicating strong precision in identifying attrition cases when predicted positive.

  • Negative Predictive Value (NPV) was also highest in XGBoost (0.825), confirming its reliable identification of non-attrition cases.

Implications

These results suggest that:

  • On the full data, XGBoost is the best-performing model, combining the highest discrimination with the best balance between sensitivity and specificity, although its sensitivity (0.463) still misses more than half of attrition cases.

  • Random Forest achieved the highest sensitivity (0.582) but the lowest AUC (0.552) and specificity (0.549), making it unsuitable for reliable clinical prediction without further adjustment.

  • Traditional Logistic and LASSO regression models provided limited discrimination, with a strong bias toward the majority class (high specificity, low sensitivity), illustrating how models evaluated at a 0.5 threshold struggle with imbalanced outcomes.

Validation Strategy

Internal Validation Procedure

We performed an internal validation to assess the robustness and discriminative performance of four supervised classification models using cross-validation and compared their results to the initial full-data model fits.

Methodology

A 5-fold stratified cross-validation procedure was implemented. The dataset was randomly split into five folds, maintaining the proportion of the outcome classes in each fold. In each iteration, four folds (80%) were used for training and one fold (20%) for validation. This procedure was repeated five times, with each fold serving once as the validation set.

This approach does not prevent overfitting, but it exposes it: because each model is evaluated on data not used for training, it provides an approximately unbiased estimate of the models’ out-of-sample predictive performance.
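
The fold construction described above can be sketched in a few lines. The example is illustrative Python, not the procedure used in the analysis: the function name `stratified_folds` and the seed are hypothetical, and the class counts (95 attrition, 244 no attrition) are taken from the reference totals of the confusion matrix in Table 6.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=42):
    """Split indices into k folds while preserving class proportions (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


labels = [1] * 95 + [0] * 244  # 1 = attrition by week 12, 0 = no attrition
folds = stratified_folds(labels)

for i, validation in enumerate(folds, start=1):
    held_out = set(validation)
    training = [j for j in range(len(labels)) if j not in held_out]
    n_pos = sum(labels[j] for j in validation)
    print(f"fold {i}: train={len(training)} validation={len(validation)} attrition={n_pos}")
```

Each validation fold then contains 19 attrition cases, so fold-level AUC estimates rest on few events and can be expected to vary noticeably from fold to fold.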

The following supervised classification models were evaluated:

  • Logistic Regression (GLM)
  • LASSO Regularized Logistic Regression (GLMnet)
  • Random Forest
  • XGBoost

Each model was trained and evaluated using the same cross-validation folds to ensure comparability.

Performance Metrics

For each model and each fold, the following metrics were computed:

  • Area Under the ROC Curve (AUC): a measure of overall model discrimination.
  • Balanced Accuracy: the average of sensitivity and specificity, accounting for class imbalance.

Mean AUC and mean Balanced Accuracy were calculated across the five folds for each model.
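
Both metrics can be computed directly from predicted probabilities and observed outcomes. The sketch below uses the rank-based definition of the AUC (the probability that a randomly chosen attrition case receives a higher score than a randomly chosen non-attrition case, with ties counted as one half). The scores and labels are made-up toy values, not study data, and the code is illustrative Python, not the original implementation.

```python
def auc(scores, labels):
    """Rank-based AUC: share of (positive, negative) pairs ordered correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_accuracy(scores, labels, threshold=0.5):
    """Mean of sensitivity and specificity at a given probability threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Toy example (hypothetical values for illustration only)
toy_scores = [0.9, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_bacc = balanced_accuracy(toy_scores, toy_labels)
print(toy_auc, toy_bacc)
```

The AUC does not depend on a threshold, whereas balanced accuracy does, which is one reason the two metrics can rank models differently.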

Comparison to Initial Full-Data Models

The cross-validated performance estimates were compared to the performance metrics obtained previously when the models were initially fitted on the full dataset (prior to internal validation).

This comparison allows for evaluating potential overfitting and assessing the generalizability of each model’s performance.

Results

Table 13: AUC by Fold for each Model
Model 1 2 3 4 5
XGBoost 0.615 0.587 0.607 0.525 0.540
Logistic Regression 0.691 0.530 0.535 0.495 0.447
LASSO Regression 0.759 0.487 0.519 0.543 0.642
Random Forest 0.488 0.724 0.554 0.652 0.586
Table 14: Comparison of Performances : Full Data vs Mean CV
Model AUC_Full_Data Mean_AUC_CV Balanced_Accuracy_Full_Data Mean_Balanced_Accuracy_CV
Logistic Regression 0.622 0.540 0.533 0.530
LASSO Regression 0.597 0.590 0.519 0.513
Random Forest 0.552 0.601 0.566 0.566
XGBoost 0.943 0.575 0.723 0.512
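
The Mean_AUC_CV column of Table 14 is the average of the five fold-level AUCs in Table 13, and the gap between the full-data AUC and that average gives a simple measure of optimism. The sketch below recomputes both from the tabulated values. It is illustrative Python; the term "optimism" for this difference is used here for convenience and is not taken from the report.

```python
from statistics import mean

# Fold-level AUCs copied from Table 13
fold_auc = {
    "XGBoost": [0.615, 0.587, 0.607, 0.525, 0.540],
    "Logistic Regression": [0.691, 0.530, 0.535, 0.495, 0.447],
    "LASSO Regression": [0.759, 0.487, 0.519, 0.543, 0.642],
    "Random Forest": [0.488, 0.724, 0.554, 0.652, 0.586],
}
# Full-data AUCs copied from Table 14
full_auc = {
    "XGBoost": 0.943,
    "Logistic Regression": 0.622,
    "LASSO Regression": 0.597,
    "Random Forest": 0.552,
}

summary = {}
for model, aucs in fold_auc.items():
    cv_mean = mean(aucs)
    summary[model] = {
        "mean_cv_auc": round(cv_mean, 3),
        "optimism": round(full_auc[model] - cv_mean, 3),  # full-data minus cross-validated
        "fold_range": (min(aucs), max(aucs)),
    }

for model, row in summary.items():
    print(f"{model:20s} {row}")
```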

Interpretation

  • XGBoost showed the highest AUC on the full data (0.943), but its cross-validated mean AUC dropped to 0.575 and its Balanced Accuracy from 0.723 to 0.512, indicating substantial overfitting.
  • Random Forest exhibited a higher mean AUC in cross-validation (0.601) than on the full data (0.552), with no sign of optimism, although its fold-level AUCs varied widely (0.488 to 0.724).
  • Logistic Regression lost more discrimination under cross-validation (AUC 0.622 to 0.540) than LASSO Regression (0.597 to 0.590), while Balanced Accuracy stayed close to 0.5 for both, indicating mild overfitting and limited discrimination overall.

This direct comparison between full-data and cross-validated performances provided clear insights into each model’s internal validity and robustness.

Conclusion

The internal validation strategy offered reliable out-of-sample performance estimates, highlighting the potential overfitting of certain models (especially XGBoost) and reinforcing the importance of cross-validation for trustworthy model evaluation in predictive modeling studies.