Section 1

(1.1.2) Demographics Table for Baseline Population Visit 1

Here we define the demographics of the “baseline population”: patients who completed at least one survey by phone-call interview, described at their first instance (visit 1).

Demographics have been re-coded into the following groups:

  • Education:

    • College Graduates vs others
  • Employment

    • Full/part-time employed vs others
  • Income

    • Greater than or equal to median (75k) vs less than median
  • Marital Status

    • Married vs others
  • Insurance

    • Insured (Commercial + Medicaid) vs Uninsured (Medicaid Part D + No Coverage)
  • Line of Therapy

    • First line vs others
  • Months on Therapy

    • Less than 13 months vs greater than or equal to 13 months
  • Daily Dosing Frequency

    • once daily vs twice daily vs other (more complex)
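
The recoding rules listed above can be sketched in base R as follows. This is a minimal illustration rather than the code used for the report: the data frame `demo` and the raw column names `income_k`, `months_on_tx`, and `marital` are hypothetical, and only three of the recodes are shown. The output labels follow the levels printed in the table below.

```r
# Toy data; the raw column names and values are hypothetical stand-ins.
demo <- data.frame(
  income_k     = c(40, 75, 120, NA),   # household income in thousands
  months_on_tx = c(3, 13, 30, NA),     # months on therapy
  marital      = c("Married", "Single", "Widowed", "Married"),
  stringsAsFactors = FALSE
)

# Income: >= median (75k) vs < median; missing values kept as their own level.
demo$income_recoded <- ifelse(
  is.na(demo$income_k), "Missing",
  ifelse(demo$income_k >= 75, ">=75k", "<75k")
)

# Months on therapy: < 13 months vs >= 13 months.
demo$months_therapy_recoded <- ifelse(
  is.na(demo$months_on_tx), "Missing",
  ifelse(demo$months_on_tx >= 13, ">=13 months", "<13 months")
)

# Marital status: Married vs everyone else.
demo$marital_recoded <- ifelse(demo$marital == "Married", "Married", "Other")

print(demo)
```

Keeping "Missing" as an explicit level, as assumed here, is what allows those patients to appear as their own row in the demographics table.
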
##                               
##                                level              Overall      
##   n                                               99           
##   Q6 (mean (SD))                                  63.12 (14.43)
##   Q1 (%)                       Male               59 (60.2)    
##                                Female             39 (39.8)    
##   daily_freq_recoded (%)       Once daily         76 (76.8)    
##                                Other              13 (13.1)    
##                                Twice daily        10 (10.1)    
##   edu_recoded (%)              HighSchool_or_Less 28 (28.6)    
##                                Other              70 (71.4)    
##   employment_recoded (%)       Employed           31 (31.3)    
##                                Other              68 (68.7)    
##   income_recoded (%)           <75k               53 (53.5)    
##                                >=75k              32 (32.3)    
##                                Missing            14 (14.1)    
##   marital_recoded (%)          Married            72 (72.7)    
##                                Other              27 (27.3)    
##   insurance_recoded (%)        Covered            46 (49.5)    
##                                Other              47 (50.5)    
##   line_therapy_recoded (%)     FirstLine          64 (66.7)    
##                                Other              32 (33.3)    
##   months_therapy_recoded (%)   <13 months         46 (46.5)    
##                                >=13 months        46 (46.5)    
##                                Missing            7 (7.1)      
##   wilson_adherence (mean (SD))                    92.15 (13.14)
Table 1B. Recoded Demographics and Baseline Characteristics
Baseline Population Characteristics for Visit 1
Categorical variables are shown as n (%).

Variable                 Level                Overall
n                                             99
Age (Years)              Median [Min, Max]    66 [25, 87]
Sex                      Male                 59 (60.2)
                         Female               39 (39.8)
Daily Dosing Frequency   Once daily           76 (76.8)
                         Other                13 (13.1)
                         Twice daily          10 (10.1)
Education                HighSchool_or_Less   28 (28.6)
                         Other                70 (71.4)
Employment               Employed             31 (31.3)
                         Other                68 (68.7)
Income                   <75k                 53 (53.5)
                         >=75k                32 (32.3)
                         Missing              14 (14.1)
Marital Status           Married              72 (72.7)
                         Other                27 (27.3)
Insurance Coverage       Covered              46 (49.5)
                         Other                47 (50.5)
Line of Therapy          FirstLine            64 (66.7)
                         Other                32 (33.3)
Months on Therapy        <13 months           46 (46.5)
                         >=13 months          46 (46.5)
                         Missing              7 (7.1)
Wilson Adherence         Mean (SD)            92.15 (13.14)

(1.1.3) Demographics Table for Final Population Visit 1

In this section we define the demographics of the “final population”: this table includes the demographics, at their first instance, of patients who completed all three surveys.

##                               
##                                level              Overall      
##   n                                               42           
##   Q6 (mean (SD))                                  61.05 (12.16)
##   Q1 (%)                       Male               25 (59.5)    
##                                Female             17 (40.5)    
##   daily_freq_recoded (%)       Once daily         36 (85.7)    
##                                Other              3 (7.1)      
##                                Twice daily        3 (7.1)      
##   edu_recoded (%)              HighSchool_or_Less 14 (34.1)    
##                                Other              27 (65.9)    
##   employment_recoded (%)       Employed           15 (35.7)    
##                                Other              27 (64.3)    
##   income_recoded (%)           <75k               28 (66.7)    
##                                >=75k              14 (33.3)    
##   marital_recoded (%)          Married            30 (71.4)    
##                                Other              12 (28.6)    
##   insurance_recoded (%)        Covered            19 (45.2)    
##                                Other              23 (54.8)    
##   line_therapy_recoded (%)     FirstLine          24 (57.1)    
##                                Other              18 (42.9)    
##   months_therapy_recoded (%)   <13 months         22 (52.4)    
##                                >=13 months        19 (45.2)    
##                                Missing            1 (2.4)      
##   wilson_adherence (mean (SD))                    90.49 (13.03)
Table 1B. Recoded Demographics and Baseline Characteristics
Final Population Characteristics for Visit 1
Categorical variables are shown as n (%).

Variable                 Level                Overall
n                                             42
Age (Years)              Median [Min, Max]    65 [27, 79]
Sex                      Male                 25 (59.5)
                         Female               17 (40.5)
Daily Dosing Frequency   Once daily           36 (85.7)
                         Other                3 (7.1)
                         Twice daily          3 (7.1)
Education                HighSchool_or_Less   14 (34.1)
                         Other                27 (65.9)
Employment               Employed             15 (35.7)
                         Other                27 (64.3)
Income                   <75k                 28 (66.7)
                         >=75k                14 (33.3)
Marital Status           Married              30 (71.4)
                         Other                12 (28.6)
Insurance Coverage       Covered              19 (45.2)
                         Other                23 (54.8)
Line of Therapy          FirstLine            24 (57.1)
                         Other                18 (42.9)
Months on Therapy        <13 months           22 (52.4)
                         >=13 months          19 (45.2)
                         Missing              1 (2.4)
Wilson Adherence         Mean (SD)            90.49 (13.03)

Section 2

(2.1.1) Missing Wilson Adherence Data

This table describes how many patients in each instance had missing Wilson Adherence values. Wilson Adherence is calculated as a function of 3 questions: self-rated adherence performance, self-rated frequency performance, and self-rated number of missed medication days. If any of these responses were missing, a Wilson adherence score was not calculated.

instance n_total n_missing n_nonmissing pct_missing
1 99 4 95 4.0
2 51 2 49 3.9
3 42 0 42 0.0
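
A table of this shape can be produced with a few lines of base R. The sketch below assumes a long-format data frame with one row per patient and instance. The name `scores` and the toy values are invented for illustration, while `instance` and `wilson_adherence` mirror names that appear in the report's output. The last line re-derives the percentages from the counts reported above.

```r
# Toy long-format data: one row per patient and instance.
# Scores are made up purely to illustrate the counting.
scores <- data.frame(
  instance         = c(1, 1, 1, 1, 2, 2, 3),
  wilson_adherence = c(100, NA, 86.7, 93.3, NA, 100, 90)
)

summarise_missing <- function(df) {
  n_total   <- tapply(df$wilson_adherence, df$instance, length)
  n_missing <- tapply(is.na(df$wilson_adherence), df$instance, sum)
  data.frame(
    instance     = as.integer(names(n_total)),
    n_total      = as.vector(n_total),
    n_missing    = as.vector(n_missing),
    n_nonmissing = as.vector(n_total - n_missing),
    pct_missing  = round(100 * as.vector(n_missing) / as.vector(n_total), 1)
  )
}

missing_tbl <- summarise_missing(scores)
print(missing_tbl)

# Re-derive pct_missing from the reported counts: 4 of 99, 2 of 51, 0 of 42.
reported_pct <- round(100 * c(4, 2, 0) / c(99, 51, 42), 1)
print(reported_pct)
```
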

(2.1.2) Adherence IQR Table

This table displays the mean (SD), median (IQR), minimum, maximum, and sample size of Wilson adherence for the baseline population (anyone with a visit 1 survey) at visit 1, and for the final population (those who completed 3 surveys) at visits 1, 2, and 3. The sample size within the final population varies across visits because missing (N/A) Wilson adherence values are excluded from the calculations.

Summary of Adherence Scores Across Visits
Mean (SD), Median (IQR), Min–Max, and Sample Size
Visit Mean (SD) Median (IQR) Min, Max Sample Size
1 (baseline population) 92.15 (13.14) 100.00 [86.67, 100.00] 30.00, 100.00 95
1 (final population) 90.49 (13.03) 100.00 [84.44, 100.00] 53.33, 100.00 41
2 (final population) 95.01 (7.64) 100.00 [91.11, 100.00] 73.33, 100.00 41
3 (final population) 93.25 (11.81) 100.00 [87.50, 100.00] 56.67, 100.00 42
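
Each row of the summary above can be computed with a small helper that drops missing values first, which is why the sample size can differ between visits. This is a sketch with invented input values. The function name `summarise_scores` is hypothetical, and the quartiles use R's default `quantile()` method, which may or may not match the method used for the report.

```r
# Summarise a vector of adherence scores, ignoring missing values.
summarise_scores <- function(x) {
  x <- x[!is.na(x)]
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  c(
    mean   = mean(x),
    sd     = sd(x),
    median = median(x),
    q1     = q[1],
    q3     = q[2],
    min    = min(x),
    max    = max(x),
    n      = length(x)   # counts non-missing scores only
  )
}

# Toy scores with one missing value.
s <- summarise_scores(c(100, 90, NA, 80, 100))
print(round(s, 2))
```
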

(2.1.3) Adherence change over time Sankey Plots

These plots show how adherence values, binned as 100%, 90% to <100%, and <90%, change across visits in the entire data set and in the final population only.
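
The three adherence bins can be created with `cut()`. The sketch below assumes that a score of exactly 90 falls in the middle bin and that only a score of 100 falls in the top bin. The helper name `bin_adherence` and the labels are illustrative.

```r
# Bin adherence scores into <90, 90 to <100, and 100.
# right = FALSE makes each interval closed on the left: [90, 100), [100, Inf).
bin_adherence <- function(x) {
  cut(
    x,
    breaks = c(-Inf, 90, 100, Inf),
    right  = FALSE,
    labels = c("<90", "90 to <100", "100")
  )
}

binned <- bin_adherence(c(100, 95, 90, 89.9, NA))
print(table(binned, useNA = "ifany"))
```

Missing scores stay missing after binning, so they would need to be either dropped or shown as their own flow in a Sankey-style plot.
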

(2.1.4) Distributions of Adherence by visit

The following tables show the skewed nature of Wilson adherence values across the baseline population and all three instances of the final population.

(2.2.1) Linear Regression - Wilson Adherence

This section shows the results of a linear regression using the binary covariates as predictors and Wilson adherence as the outcome. This model uses the baseline population.

How to interpret:

  • Those with a non-traditional dosing regimen (“Other”) score about 10.2 points lower on Wilson adherence on average than the reference group (once daily), holding the other covariates fixed (p = 0.012).

  • Those who are not married score about 8.0 points lower on Wilson adherence on average than the reference group (married), holding the other covariates fixed (p = 0.026).

## 
## Call:
## lm(formula = wilson_adherence ~ age + sex + edu + emp + income + 
##     marital + insurance + line + months + dose, data = baseline)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.251  -4.176   3.100   7.145  15.931 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        89.7078     7.4448  12.050   <2e-16 ***
## age                 0.1084     0.1104   0.981   0.3298    
## sexFemale          -1.0969     2.8505  -0.385   0.7015    
## eduOther           -0.3217     3.4312  -0.094   0.9256    
## empOther            3.2456     3.2928   0.986   0.3277    
## income>=75k        -1.9532     3.3054  -0.591   0.5565    
## incomeMissing      -0.8120     4.7557  -0.171   0.8649    
## maritalOther       -8.0152     3.5269  -2.273   0.0261 *  
## insuranceOther     -3.4669     2.9159  -1.189   0.2385    
## lineOther           1.9704     3.0175   0.653   0.5159    
## months>=13 months  -1.0550     2.7061  -0.390   0.6978    
## monthsMissing       4.3492     5.8031   0.749   0.4561    
## doseOther         -10.2194     3.9688  -2.575   0.0121 *  
## doseTwice daily    -1.1378     5.7565  -0.198   0.8439    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.67 on 70 degrees of freedom
##   (15 observations deleted due to missingness)
## Multiple R-squared:  0.1975, Adjusted R-squared:  0.04847 
## F-statistic: 1.325 on 13 and 70 DF,  p-value: 0.2197
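
Coefficients such as `maritalOther` in the output above are differences from a reference level. The sketch below uses four invented observations and a single covariate to show that, in this simplest case, the coefficient equals the difference in group means. With several covariates, as in the model above, it is an adjusted difference instead. Treating "Married" as the reference level is an assumption inferred from the term name.

```r
# Toy data, invented for illustration only.
toy <- data.frame(
  wilson_adherence = c(90, 100, 80, 90),
  marital          = factor(c("Married", "Married", "Other", "Other"))
)

# Make the reference level explicit; "Married" is assumed to be the reference
# because the report's output contains a maritalOther term.
toy$marital <- relevel(toy$marital, ref = "Married")

fit <- lm(wilson_adherence ~ marital, data = toy)
print(coef(fit))

# With one binary covariate, the coefficient is the difference in group means.
group_means <- tapply(toy$wilson_adherence, toy$marital, mean)
mean_diff <- unname(group_means["Other"] - group_means["Married"])
print(mean_diff)
```
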

(2.2.2) Logistic Regression; Add Reference Groups

The following sections use the aforementioned covariates as predictors, this time with adherence as a binary/categorical variable using several different cutoffs. This also uses the baseline population.

Upon the suggestion of Dr. Li, a quasibinomial distribution was investigated. A quasibinomial model adds a dispersion parameter and would be warranted if the data were overdispersed. The estimated dispersion of the logistic regression model was equal to 1, so no additional parameter was added to the model.
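
One common way to carry out a dispersion check of this kind is to divide the Pearson chi-square statistic by the residual degrees of freedom, which is also what R reports for a quasibinomial fit. The sketch below runs on simulated data. The variable names and simulation settings are assumptions, not the report's data, and the report does not state which estimator it used. Note that for ungrouped 0/1 outcomes this statistic tends to sit close to 1, so it is a fairly weak diagnostic in that setting.

```r
set.seed(1)

# Simulated data, for illustration only.
toy <- data.frame(
  age     = rnorm(200, mean = 63, sd = 14),
  married = rbinom(200, size = 1, prob = 0.7)
)
toy$adherent <- rbinom(200, size = 1,
                       prob = plogis(-1 + 0.02 * toy$age + 0.5 * toy$married))

# Ordinary logistic regression.
fit <- glm(adherent ~ age + married, family = binomial, data = toy)

# Pearson-based dispersion estimate: values well above 1 suggest overdispersion.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
print(dispersion)

# The quasibinomial fit reports the same quantity as its dispersion parameter.
fit_q <- glm(adherent ~ age + married, family = quasibinomial, data = toy)
print(summary(fit_q)$dispersion)
```
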

logistic regression equation:

  • \[ \log\left(\frac{P(Y_i = 1)}{1 - P(Y_i = 1)}\right) = \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Sex}_{i,\text{Female}} + \beta_3 \text{Education}_{i,\text{Other}} + \beta_4 \text{Employment}_{i,\text{Other}} + \beta_5 \text{Income}_{i,\ge 75k} + \beta_6 \text{Income}_{i,\text{Missing}} + \beta_7 \text{Marital}_{i,\text{Other}} + \beta_8 \text{Insurance}_{i,\text{Other}} + \beta_9 \text{LineTherapy}_{i,\text{Other}} + \beta_{10} \text{MonthsTherapy}_{i,\ge 13} + \beta_{11} \text{MonthsTherapy}_{i,\text{Missing}} + \beta_{12} \text{Dose}_{i,\text{Other}} + \beta_{13} \text{Dose}_{i,\text{TwiceDaily}} \]

The null hypothesis is that all coefficients are zero:

  • \[ H_0: \beta_j = 0 \quad \text{for all } j \ge 1 \]

The alternative hypothesis is that at least one coefficient is nonzero:

  • \[ H_A: \beta_j \neq 0 \quad \text{for at least one } j \ge 1 \]

How to interpret:

  • For any statistically significant covariate, the odds of the outcome are multiplied by a factor of \(OR\) compared to the reference group: an \(OR\) above 1 means higher odds, and an \(OR\) below 1 means lower odds.
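
An odds ratio is the exponential of a logistic regression coefficient. The sketch below converts a coefficient and its standard error into an odds ratio with a Wald-type interval. The helper name `odds_ratio` and the input values are illustrative, and the Wald interval is an assumption that may differ from the interval method used for the tables in this report.

```r
# Convert a log-odds coefficient and its standard error into an odds ratio
# with a Wald confidence interval (assumed method, for illustration).
odds_ratio <- function(beta, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  c(
    OR      = exp(beta),
    OR_low  = exp(beta - z * se),
    OR_high = exp(beta + z * se)
  )
}

# A coefficient of log(2) doubles the odds relative to the reference group.
doubled <- odds_ratio(beta = log(2), se = 0.5)
print(round(doubled, 3))

# A coefficient of log(0.5) halves the odds relative to the reference group.
halved <- odds_ratio(beta = log(0.5), se = 0.5)
print(round(halved, 3))
```
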

(2.2.2.1) 100 vs <100

## [1] 1

term                         OR          OR_low       OR_high      p_label
Age (Years)                  1.0454656   1.00081017   1.0999404    0.060
Sex: Female                  0.6140329   0.20139303   1.8281298    0.381
Education: Other             1.1503679   0.29906641   4.2953449    0.835
Employment: Other            1.5648611   0.45915566   5.3783739    0.471
Income: >=75k                0.4482122   0.11050386   1.6365486    0.237
Income: Missing              0.3891570   0.05337266   2.8075829    0.339
Marital Status: Other        0.2591365   0.06148264   0.9677439    0.052
Insurance: Other             0.5453849   0.16257262   1.6758237    0.303
Line of Therapy: Other       2.0431898   0.61804811   7.5883279    0.258
Months of Therapy: >=13      0.6930642   0.24303937   1.9467856    0.486
Months of Therapy: Missing   3.0392245   0.31490362   72.5333031   0.386
Dose Regimen: Other          0.2466343   0.04656360   1.1410160    0.081
Dose Regimen: Twice daily    0.1384896   0.01242514   1.1726514    0.078
## [1] 1

(2.2.2.2) >=95 vs <95

term                         OR          OR_low       OR_high      p_label
Age (Years)                  1.0462381   1.00086116   1.1017480    0.060
Sex: Female                  0.7981758   0.26064514   2.4307700    0.689
Education: Other             1.0751006   0.27547125   4.0594957    0.915
Employment: Other            1.9003785   0.55986814   6.5866345    0.302
Income: >=75k                0.5596621   0.14045884   2.0655612    0.390
Income: Missing              0.4297634   0.05888038   3.1410598    0.394
Marital Status: Other        0.2755105   0.06551329   1.0291944    0.063
Insurance: Other             0.5475143   0.16115798   1.7072684    0.312
Line of Therapy: Other       2.3021493   0.67245449   9.1418001    0.204
Months of Therapy: >=13      0.7174947   0.25025359   2.0336739    0.531
Months of Therapy: Missing   2.7489103   0.27877818   67.0329618   0.434
Dose Regimen: Other          0.2113518   0.03972981   0.9722677    0.052
Dose Regimen: Twice daily    0.1314847   0.01181185   1.1264572    0.072

(2.2.2.3) >=90 vs <90

term                         OR          OR_low       OR_high     p_label
Age (Years)                  1.0105498   0.96446609   1.056541    0.646
Sex: Female                  0.6389413   0.20449257   1.977177    0.434
Education: Other             1.6720635   0.42599075   6.599741    0.456
Employment: Other            3.1702775   0.87114328   12.525764   0.086
Income: >=75k                0.6791506   0.16630607   2.627930    0.576
Income: Missing              2.2285736   0.25619259   50.744264   0.519
Marital Status: Other        0.3872457   0.09508918   1.433184    0.164
Insurance: Other             0.7051001   0.21072541   2.246603    0.558
Line of Therapy: Other       1.7426779   0.49302330   6.967925    0.404
Months of Therapy: >=13      0.9763625   0.33065615   2.936620    0.965
Months of Therapy: Missing   1.2020973   0.12357053   27.091981   0.883
Dose Regimen: Other          0.2525363   0.04975346   1.223640    0.086
Dose Regimen: Twice daily    1.2365734   0.11364012   30.821438   0.873

(2.2.3) Univariate analysis of baseline population

Here we performed univariate analyses between each single covariate and Wilson adherence. These tests examine whether the distribution of Wilson adherence differs between demographic groups. Non-parametric tests are used because of the non-normal distribution of adherence shown in section 2.1.4. P-values are included in the box plots.

  • Covariates with 2 categories were compared using the Wilcoxon rank-sum test.

  • Covariates with 3 categories or more were compared using the Kruskal–Wallis test.
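
The two tests named above are available in base R as `wilcox.test()` and `kruskal.test()`. The sketch below runs both on invented data. The values are made up, the column names loosely echo the report's covariates, and `exact = FALSE` is used because tied scores rule out an exact p-value.

```r
# Toy data, invented for illustration only.
toy <- data.frame(
  wilson_adherence = c(100, 100, 100, 95, 100, 98, 100, 97,
                       60, 70, 80, 75, 65, 85, 72, 68),
  marital = factor(rep(c("Married", "Other"), each = 8)),
  dose    = factor(rep(c("Once daily", "Twice daily", "Other", "Once daily"),
                       times = 4))
)

# Two categories: Wilcoxon rank-sum test.
two_group <- wilcox.test(wilson_adherence ~ marital, data = toy, exact = FALSE)
print(two_group$p.value)

# Three or more categories: Kruskal-Wallis test.
multi_group <- kruskal.test(wilson_adherence ~ dose, data = toy)
print(multi_group$p.value)
```
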

Marital Status remains the only statistically significant covariate in this portion of the analysis.

Section 3

(3.1) Missing PRO data

Below is a summary of how many times a given question was missed/missing in the dataset. The following approach was adopted for data imputation.

  • For the 2nd and 3rd surveys of a patient, a missing question is replaced with that patient's score from their first survey.

  • For 1st surveys, a missing question will be imputed with the population mode of the particular question at visit 1.
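
The two imputation rules above can be sketched as follows. The data frame `pro`, its values, and the helper names `stat_mode` and `impute_item` are hypothetical. The sketch also makes two assumptions the text does not spell out: visit 1 gaps are filled first, so that an imputed visit 1 value can then be carried to later visits, and ties for the mode are broken in favour of the smallest value.

```r
# Toy long-format PRO data: one row per patient and visit.
pro <- data.frame(
  patient = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  visit   = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  Q31     = c(2, NA, 3, NA, 1, NA, 2, 2, 1)
)

# Most frequent non-missing value; ties go to the smallest value (assumption).
stat_mode <- function(x) {
  x  <- x[!is.na(x)]
  ux <- sort(unique(x))
  ux[which.max(tabulate(match(x, ux)))]
}

impute_item <- function(df, item) {
  v1 <- df$visit == 1

  # Rule for 1st surveys: fill gaps with the visit 1 population mode.
  df[[item]][v1 & is.na(df[[item]])] <- stat_mode(df[[item]][v1])

  # Rule for 2nd and 3rd surveys: fill gaps with the patient's visit 1 score.
  baseline <- setNames(df[[item]][v1], df$patient[v1])
  gap <- !v1 & is.na(df[[item]])
  df[[item]][gap] <- baseline[as.character(df$patient[gap])]

  df
}

imputed <- impute_item(pro, "Q31")
print(imputed)
```
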

From this point onward, the aforementioned changes have been implemented in the PRO data.

Table 3.1 — Missing PRO Scores by Visit
N (%) of patients missing each PRO item
Question Visit 1 (Baseline) Visit 2 (Final Pop.) Visit 3 (Final Pop.)
Q21 – In general, would you say your health is... 0 (0%) 0 (0%) 0 (0%)
Q22 – In general, would you say your quality of life is... 0 (0%) 0 (0%) 0 (0%)
Q23 – In general, how would you rate your physical health? 0 (0%) 0 (0%) 0 (0%)
Q24 – In general, how would you rate your mental health? 0 (0%) 0 (0%) 0 (0%)
Q25 – In general, how would you rate your satisfaction with your social activities? 0 (0%) 0 (0%) 0 (0%)
Q26 – In general, how well do you carry out social activities and roles? 0 (0%) 0 (0%) 0 (0%)
Q27 – To what extent are you able to carry out everyday physical activities? 0 (0%) 0 (0%) 0 (0%)
Q28 – In the past 7 days: emotional problems such as anxiety or irritability 0 (0%) 0 (0%) 0 (0%)
Q29 – In the past 7 days: fatigue on average 0 (0%) 0 (0%) 0 (0%)
Q30 – How would you rate your pain on average? 0 (0%) 0 (0%) 0 (0%)
Q31 – Mouth or throat sores – severity 0 (0%) 0 (0%) 0 (0%)
Q32 – Mouth or throat sores – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q33 – Nausea – frequency 0 (0%) 0 (0%) 0 (0%)
Q34 – Nausea – severity at its worst 0 (0%) 0 (0%) 0 (0%)
Q35 – Constipation – severity 0 (0%) 0 (0%) 0 (0%)
Q37 – Diarrhea – frequency 0 (0%) 0 (0%) 0 (0%)
Q38 – Rash – presence 0 (0%) 0 (0%) 0 (0%)
Q39 – Hand–foot syndrome – severity 0 (0%) 0 (0%) 0 (0%)
Q40 – Numbness/tingling – severity 0 (0%) 0 (0%) 0 (0%)
Q41 – Numbness/tingling – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q42 – Blurry vision – severity 0 (0%) 0 (0%) 0 (0%)
Q43 – Blurry vision – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q44 – Problems with concentration – severity 0 (0%) 0 (0%) 0 (0%)
Q45 – Problems with concentration – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q46 – Pain – frequency 0 (0%) 0 (0%) 0 (0%)
Q47 – Pain – severity 0 (0%) 0 (0%) 0 (0%)
Q48 – Pain – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q49 – Headache – frequency 0 (0%) 0 (0%) 0 (0%)
Q50 – Headache – severity 0 (0%) 0 (0%) 0 (0%)
Q51 – Headache – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q52 – Aching muscles – frequency 0 (0%) 0 (0%) 0 (0%)
Q53 – Aching muscles – severity 0 (0%) 0 (0%) 0 (0%)
Q54 – Aching muscles – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q55 – Aching joints – frequency 0 (0%) 0 (0%) 0 (0%)
Q56 – Aching joints – severity 0 (0%) 0 (0%) 0 (0%)
Q57 – Aching joints – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q58 – Insomnia – severity 0 (0%) 0 (0%) 0 (0%)
Q59 – Insomnia – interference with daily activities 0 (0%) 0 (0%) 0 (0%)
Q60 – Fatigue – severity 0 (0%) 0 (0%) 0 (0%)
Q61 – Fatigue – interference with daily activities 0 (0%) 0 (0%) 0 (0%)

(3.2.1) Domain Separation Guide

Based on the guidance of clinician Dr. Muluneh, the following “domains” have been created, each containing relevant questions from the PRO questionnaire.

Table: PRO-CTCAE Items by Symptom Domain
Domain mapping for PRO-CTCAE Questions Q31–Q61
Domain Q# Symptom / Description
GI Q31 Mouth/throat sores – severity
GI Q32 Mouth/throat sores – interference
GI Q33 Nausea – frequency
GI Q34 Nausea – severity
GI Q35 Constipation – severity
GI Q37 Diarrhea – frequency
Dermatologic Q38 Rash – presence
Dermatologic Q39 Hand–foot syndrome – severity
Pain Q40 Numbness/tingling – severity
Pain Q41 Numbness/tingling – interference
Pain Q46 Pain – frequency
Pain Q47 Pain – severity
Pain Q48 Pain – interference
Pain Q52 Muscle aches – frequency
Pain Q53 Muscle aches – severity
Pain Q54 Muscle aches – interference
Pain Q55 Joint aches – frequency
Pain Q56 Joint aches – severity
Pain Q57 Joint aches – interference
Neurologic Q42 Blurry vision – severity
Neurologic Q43 Blurry vision – interference
Neurologic Q44 Concentration problems – severity
Neurologic Q45 Concentration problems – interference
Neurologic Q49 Headache – frequency
Neurologic Q50 Headache – severity
Neurologic Q51 Headache – interference
Constitutional Q58 Insomnia – severity
Constitutional Q59 Insomnia – interference
Constitutional Q60 Fatigue – severity
Constitutional Q61 Fatigue – interference

(3.2.2) Data Imputation

(3.2.3) Composite Score visualization by Domain

To track domain-level changes over time, a composite score has been created for each domain. A higher score corresponds to a higher symptom burden, as is also the case for the individual PRO questions. The plots below show the composite score for each domain against the time since each patient's first survey.

The presence of rash (Q38) has been recoded from no = 1, yes = 2 to no = 1, yes = 5 to match the scale of the other questions in the dataset.
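
The Q38 recode and a domain composite can be sketched as below. The report does not state how items are combined into a composite, so a simple sum of the domain's items is assumed here purely for illustration. The toy values are invented and the `domains` list is a hypothetical way of storing the mapping from section 3.2.1.

```r
# Toy responses for the two Dermatologic items (values invented).
pro <- data.frame(
  Q38 = c(1, 2, 2),   # rash presence: no = 1, yes = 2 on the original coding
  Q39 = c(1, 3, 5)    # hand-foot syndrome severity on a 1 to 5 scale
)

# Recode Q38 so that yes = 5, matching the 1 to 5 range of the other items.
pro$Q38 <- ifelse(pro$Q38 == 2, 5, 1)

# Domain mapping (only one domain shown).
domains <- list(Dermatologic = c("Q38", "Q39"))

# Composite score: assumed here to be the sum of the domain's items.
pro$Dermatologic_score <- rowSums(pro[, domains$Dermatologic])

print(pro)
```
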

Descriptive Table: Composite PRO Domain Scores
Mean (SD), Median (IQR), and Min–Max by Domain
Domain Mean (SD) Median (IQR) Min–Max
GI 8.8 (3.6) 8.0 [6.0, 10.0] 5.0–27.0
Dermatologic 5.4 (1.4) 6.0 [6.0, 6.0] 1.0–8.0
Pain 27.4 (12.3) 25.0 [17.0, 34.5] 14.0–65.0
Neurologic 5.9 (3.1) 4.0 [4.0, 6.0] 3.0–18.0
Constitutional 8.4 (3.8) 8.0 [6.0, 10.0] 4.0–20.0

(3.2.4) Composite Score Descriptive Table across visits

Below is a descriptive table of the domain composite scores by population and visit.

Descriptive Table: Composite PRO Domain Scores by Visit and Population
Mean (SD), Median (IQR), and Min–Max
Population / Visit Mean (SD) Median (IQR) Min–Max
GI
Baseline population – Visit 1 8.9 (3.7) 8.0 [6.0, 10.0] 5.0–27.0
Final population – Visit 1 9.6 (4.3) 8.5 [7.0, 10.8] 6.0–27.0
Final population – Visit 2 8.4 (3.1) 7.5 [6.0, 10.0] 6.0–18.0
Final population – Visit 3 8.7 (3.7) 8.0 [6.0, 9.8] 6.0–22.0
Dermatologic
Baseline population – Visit 1 5.5 (1.3) 6.0 [6.0, 6.0] 1.0–7.0
Final population – Visit 1 5.7 (0.9) 6.0 [6.0, 6.0] 2.0–6.0
Final population – Visit 2 5.1 (1.7) 6.0 [6.0, 6.0] 2.0–8.0
Final population – Visit 3 5.4 (1.4) 6.0 [6.0, 6.0] 2.0–6.0
Pain
Baseline population – Visit 1 26.1 (12.8) 23.0 [15.0, 34.0] 14.0–65.0
Final population – Visit 1 28.2 (12.6) 25.0 [19.0, 34.8] 14.0–65.0
Final population – Visit 2 29.4 (12.4) 27.0 [20.2, 34.8] 14.0–63.0
Final population – Visit 3 28.7 (11.0) 26.0 [20.2, 36.8] 14.0–58.0
Neurologic
Baseline population – Visit 1 5.7 (2.9) 4.0 [4.0, 6.0] 3.0–16.0
Final population – Visit 1 6.4 (3.1) 5.5 [4.0, 8.0] 4.0–16.0
Final population – Visit 2 6.3 (3.6) 4.5 [4.0, 6.0] 4.0–18.0
Final population – Visit 3 5.6 (3.0) 4.0 [4.0, 6.0] 4.0–16.0
Constitutional
Baseline population – Visit 1 8.1 (3.7) 8.0 [5.0, 10.0] 4.0–18.0
Final population – Visit 1 8.3 (3.5) 8.0 [6.0, 10.0] 4.0–18.0
Final population – Visit 2 8.4 (3.7) 8.0 [6.0, 10.0] 4.0–18.0
Final population – Visit 3 8.9 (4.0) 8.0 [6.0, 11.0] 4.0–19.0

(3.3.1) Ordinal Regression of composite scores (final population visit 1)

This model treats the composite scores as an ordinal variable, since there is a finite number of possible outcomes and larger composite scores indicate greater symptom burden. This model uses the final population at visit 1.

Because of the wide range of responses, sometimes spanning 10 to 60, the composite scores are collapsed into quartiles (four groups).
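
One way to collapse a score into four groups is a rank-based split into near-equal bins. The sketch below is an assumption about the approach, not the report's code: it places any extra observations in the lower groups, which for 42 patients gives group sizes of 11, 11, 10, and 10, the same counts printed in the quartile summary of this section. The helper name `to_quartile` is hypothetical.

```r
# Split a numeric vector into n_groups rank-based bins of near-equal size.
# Extra observations go to the lower bins; ties are broken by order of
# appearance, so tied scores can land in different bins.
to_quartile <- function(x, n_groups = 4) {
  ok    <- !is.na(x)
  n     <- sum(ok)
  sizes <- n %/% n_groups + (seq_len(n_groups) <= n %% n_groups)
  bins  <- rep(seq_len(n_groups), times = sizes)

  out <- rep(NA_integer_, length(x))
  out[ok] <- bins[rank(x[ok], ties.method = "first")]
  out
}

# 42 toy composite scores (invented), mirroring the final population size.
toy_scores <- c(14:34, 36:56)
quartile <- to_quartile(toy_scores)
print(table(quartile))
```

The resulting integer could then be converted to an ordered factor and used as the outcome of an ordinal model, for example with `MASS::polr`.
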

Ordinal Regression Results — GI Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.99 (0.92, 1.05) 0.648
sex2 2.55 (0.66, 10.50) 0.179
eduOther 1.12 (0.24, 5.22) 0.885
empOther 3.47 (0.65, 19.27) 0.143
income>=75k 1.27 (0.23, 7.41) 0.785
maritalOther 0.33 (0.06, 1.63) 0.176
insuranceOther 0.41 (0.08, 1.85) 0.247
lineOther 1.46 (0.38, 5.83) 0.583
doseOther 0.12 (0.01, 1.45) 0.103
doseTwice daily 1.39 (0.03, 51.41) 0.858
Ordinal Regression Results — Pain Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.98 (0.91, 1.05) 0.579
sex2 1.09 (0.30, 3.98) 0.891
eduOther 0.96 (0.22, 4.28) 0.956
empOther 2.41 (0.45, 13.23) 0.299
income>=75k 0.79 (0.14, 4.36) 0.789
maritalOther 0.73 (0.15, 3.49) 0.689
insuranceOther 1.62 (0.39, 6.88) 0.507
lineOther 0.58 (0.15, 2.16) 0.416
doseOther 1.08 (0.10, 12.88) 0.949
doseTwice daily 0.49 (0.01, 15.60) 0.686
Ordinal Regression Results — Neurologic Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.98 (0.91, 1.05) 0.509
sex2 0.48 (0.13, 1.72) 0.267
eduOther 0.55 (0.11, 2.64) 0.451
empOther 0.55 (0.10, 2.79) 0.476
income>=75k 0.49 (0.08, 2.74) 0.420
maritalOther 0.32 (0.06, 1.64) 0.176
insuranceOther 1.31 (0.30, 5.79) 0.721
lineOther 0.54 (0.14, 2.04) 0.366
doseOther 0.68 (0.07, 6.51) 0.729
doseTwice daily 0.07 (0.00, 2.40) 0.152
Ordinal Regression Results — Dermatologic Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 1.03 (0.96, 1.12) 0.373
sex2 0.18 (0.04, 0.72) 0.020
eduOther 0.41 (0.07, 2.00) 0.277
empOther 0.49 (0.09, 2.52) 0.401
income>=75k 0.54 (0.08, 3.55) 0.522
maritalOther 2.51 (0.48, 13.37) 0.272
insuranceOther 4.24 (1.01, 20.05) 0.056
lineOther 0.83 (0.21, 3.19) 0.791
doseOther 1.55 (0.12, 21.04) 0.731
doseTwice daily 0.31 (0.01, 17.80) 0.551
Ordinal Regression Results — Constitutional Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.96 (0.88, 1.03) 0.253
sex2 1.95 (0.53, 7.39) 0.317
eduOther 0.73 (0.13, 3.95) 0.709
empOther 5.80 (1.01, 38.01) 0.053
income>=75k 0.86 (0.16, 4.80) 0.858
maritalOther 0.42 (0.07, 2.26) 0.319
insuranceOther 0.90 (0.20, 3.91) 0.891
lineOther 0.39 (0.10, 1.46) 0.168
doseOther 0.58 (0.06, 6.67) 0.643
doseTwice daily 9.70 (0.14, 748.32) 0.277
## # A tibble: 4 × 2
##   quartile     n
##      <int> <int>
## 1        1    11
## 2        2    11
## 3        3    10
## 4        4    10

(3.3.2) continued. Univariate analysis of statistically significant covariates

Statistically significant covariates from section 3.3.1 underwent nonparametric univariate tests to check for differences in the distributions of composite scores between groups.

Sex (p = 0.020) and insurance (p = 0.056, with a 95% CI of 1.01 to 20.05) stood out in the ordinal regression model of 3.3.1 for the dermatologic domain, but they are difficult to interpret because the two groups have identical medians and IQRs. This lack of variability arises because only two questions, one of them the binary Q38, contribute to the dermatologic composite score.

Employment, while not statistically significant in the ordinal regression (p = .053), was statistically significant in this nonparametric test.

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  Dermatologic_score by sex
## W = 234, p-value = 0.3387
## alternative hypothesis: true location shift is not equal to 0

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  Constitutional_score by emp
## W = 117.5, p-value = 0.02518
## alternative hypothesis: true location shift is not equal to 0

Dermatologic Domain — Significant Covariate
Median (IQR) PRO score by sex (Final population, Visit 1)
Sex Dermatologic PRO Score
Male 6.0 (6.0–6.0)
Female 6.0 (6.0–6.0)
Constitutional Domain — Significant Covariate
Median (IQR) PRO score by employment status (Final population, Visit 1)
Employment Status Constitutional PRO Score
Employed 6.0 (4.0–9.0)
Other 9.0 (6.5–11.0)

(3.4)

(3.4.1 & 3.4.2)

In this section we converted the PRO scores for each individual question into a binary outcome to compare whether the proportion of responses changes across visits. This analysis uses the final population.

The original scale is between 1 and 5. In Table 3.4.1, for each question we compare whether the proportion of responses equal to 1 vs greater than 1 changes across the three visits.

Similarly, in Table 3.4.2, for each question we compare whether the proportion of responses less than or equal to 3 vs greater than 3 changes across the three visits.
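
Cochran's Q test compares a binary outcome across repeated measurements on the same patients. Base R has no built-in function for it, so the sketch below computes the statistic directly from its textbook formula and refers it to a chi-square distribution with k - 1 degrees of freedom. The helper name `cochran_q` and the toy responses are invented, and the test requires at least one patient whose binary response changes across visits.

```r
# Cochran's Q for a patients x visits matrix of 0/1 outcomes.
cochran_q <- function(m) {
  m       <- as.matrix(m)
  k       <- ncol(m)        # number of visits
  col_tot <- colSums(m)     # positives per visit
  row_tot <- rowSums(m)     # positives per patient
  total   <- sum(m)

  q <- (k - 1) * (k * sum(col_tot^2) - total^2) /
    (k * total - sum(row_tot^2))

  list(
    statistic = q,
    df        = k - 1,
    p.value   = pchisq(q, df = k - 1, lower.tail = FALSE)
  )
}

# Toy responses on the 1 to 5 scale: 6 patients (rows) by 3 visits (columns).
responses <- matrix(
  c(3, 2, 1,
    2, 1, 1,
    4, 3, 2,
    2, 1, 1,
    1, 1, 1,
    5, 2, 1),
  ncol = 3, byrow = TRUE
)

# Binary outcome: response greater than 1.
binary <- (responses > 1) * 1

res <- cochran_q(binary)
print(res)
```

Changing the cutoff in the binarisation step from 1 to 3 gives the second version of the comparison.
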

Table 3.4.1: PRO Change Over Time (Binary: PRO > 1)
Final population only; Cochran’s Q test across 3 visits
Question Q# Visit 1 Visit 2 Visit 3 Cochran’s Q p-value
Q21 – In general, would you say your health is... Q21 41 (97.6%) 42 (100%) 40 (95.2%) 0.2230
Q22 – In general, would you say your quality of life is... Q22 34 (81%) 37 (88.1%) 35 (83.3%) 0.5290
Q23 – In general, how would you rate your physical health? Q23 40 (97.6%) 40 (97.6%) 40 (97.6%) 1.0000
Q24 – In general, how would you rate your mental health? Q24 30 (71.4%) 37 (88.1%) 36 (85.7%) 0.0681
Q25 – In general, how would you rate your satisfaction with your social activities? Q25 34 (81%) 37 (88.1%) 35 (83.3%) 0.5290
Q26 – In general, how well do you carry out social activities and roles? Q26 32 (80%) 36 (90%) 30 (75%) 0.0970
Q27 – To what extent are you able to carry out everyday physical activities? Q27 30 (71.4%) 25 (59.5%) 23 (54.8%) 0.0617
Q28 – In the past 7 days: emotional problems such as anxiety or irritability Q28 22 (52.4%) 23 (54.8%) 24 (57.1%) 0.8290
Q29 – In the past 7 days: fatigue on average Q29 38 (92.7%) 38 (92.7%) 34 (82.9%) 0.1350
Q30 – How would you rate your pain on average? Q30 28 (66.7%) 27 (64.3%) 29 (69%) 0.7410
Q31 – Mouth or throat sores – severity Q31 10 (23.8%) 3 (7.1%) 3 (7.1%) 0.0169
Q32 – Mouth or throat sores – interference with daily activities Q32 7 (16.7%) 3 (7.1%) 3 (7.1%) 0.1690
Q33 – Nausea – frequency Q33 16 (38.1%) 12 (28.6%) 14 (33.3%) 0.3010
Q34 – Nausea – severity at its worst Q34 17 (40.5%) 12 (28.6%) 14 (33.3%) 0.1780
Q35 – Constipation – severity Q35 13 (31%) 12 (28.6%) 10 (23.8%) 0.7050
Q37 – Diarrhea – frequency Q37 20 (48.8%) 15 (36.6%) 15 (36.6%) 0.2870
Q38 – Rash – presence Q38 37 (88.1%) 32 (76.2%) 32 (76.2%) 0.2100
Q39 – Hand–foot syndrome – severity Q39 4 (9.5%) 3 (7.1%) 5 (11.9%) 0.6870
Q40 – Numbness/tingling – severity Q40 22 (52.4%) 19 (45.2%) 17 (40.5%) 0.3270
Q41 – Numbness/tingling – interference with daily activities Q41 18 (42.9%) 18 (42.9%) 14 (33.3%) 0.4490
Q42 – Blurry vision – severity Q42 11 (26.8%) 13 (31.7%) 7 (17.1%) 0.1350
Q43 – Blurry vision – interference with daily activities Q43 12 (28.6%) 13 (31%) 8 (19%) 0.2470
Q44 – Problems with concentration – severity Q44 18 (42.9%) 13 (31%) 12 (28.6%) 0.1610
Q45 – Problems with concentration – interference with daily activities Q45 17 (40.5%) 14 (33.3%) 12 (28.6%) 0.3270
Q46 – Pain – frequency Q46 29 (69%) 27 (64.3%) 33 (78.6%) 0.0970
Q47 – Pain – severity Q47 30 (71.4%) 28 (66.7%) 31 (73.8%) 0.5290
Q48 – Pain – interference with daily activities Q48 24 (57.1%) 25 (59.5%) 24 (57.1%) 0.9490
Q49 – Headache – frequency Q49 16 (38.1%) 17 (40.5%) 19 (45.2%) 0.6620
Q50 – Headache – severity Q50 16 (38.1%) 17 (40.5%) 17 (40.5%) 0.9360
Q51 – Headache – interference with daily activities Q51 12 (28.6%) 13 (31%) 11 (26.2%) 0.7940
Q52 – Aching muscles – frequency Q52 18 (43.9%) 27 (65.9%) 26 (63.4%) 0.0362
Q53 – Aching muscles – severity Q53 18 (42.9%) 28 (66.7%) 27 (64.3%) 0.0160
Q54 – Aching muscles – interference with daily activities Q54 14 (33.3%) 22 (52.4%) 23 (54.8%) 0.0260
Q55 – Aching joints – frequency Q55 20 (47.6%) 27 (64.3%) 23 (54.8%) 0.1130
Q56 – Aching joints – severity Q56 21 (50%) 27 (64.3%) 23 (54.8%) 0.2110
Q57 – Aching joints – interference with daily activities Q57 18 (42.9%) 23 (54.8%) 22 (52.4%) 0.3500
Q58 – Insomnia – severity Q58 15 (35.7%) 17 (40.5%) 18 (42.9%) 0.6920
Q59 – Insomnia – interference with daily activities Q59 15 (35.7%) 16 (38.1%) 19 (45.2%) 0.5040
Q60 – Fatigue – severity Q60 33 (78.6%) 33 (78.6%) 30 (71.4%) 0.4410
Q61 – Fatigue – interference with daily activities Q61 29 (69%) 32 (76.2%) 28 (66.7%) 0.4860
Table 3.4.2: PRO Change Over Time (Binary: PRO > 3)
Final population only; Cochran’s Q test across 3 visits
Question Q# Visit 1 Visit 2 Visit 3 Cochran’s Q p-value
Q21 – In general, would you say your health is... Q21 18 (42.9%) 18 (42.9%) 16 (38.1%) 0.6950
Q22 – In general, would you say your quality of life is... Q22 10 (23.8%) 10 (23.8%) 7 (16.7%) 0.4070
Q23 – In general, how would you rate your physical health? Q23 17 (41.5%) 16 (39%) 18 (43.9%) 0.7170
Q24 – In general, how would you rate your mental health? Q24 8 (19%) 7 (16.7%) 7 (16.7%) 0.9050
Q25 – In general, how would you rate your satisfaction with your social activities? Q25 11 (26.2%) 10 (23.8%) 9 (21.4%) 0.7790
Q26 – In general, how well do you carry out social activities and roles? Q26 13 (32.5%) 11 (27.5%) 12 (30%) 0.8070
Q27 – To what extent are you able to carry out everyday physical activities? Q27 8 (19%) 5 (11.9%) 6 (14.3%) 0.6070
Q28 – In the past 7 days: emotional problems such as anxiety or irritability Q28 2 (4.8%) 4 (9.5%) 5 (11.9%) 0.4170
Q29 – In the past 7 days: fatigue on average Q29 2 (4.9%) 8 (19.5%) 6 (14.6%) 0.0446
Q30 – How would you rate your pain on average? Q30 20 (47.6%) 20 (47.6%) 23 (54.8%) 0.5490
Q31 – Mouth or throat sores – severity Q31 4 (9.5%) 1 (2.4%) 0 (0%) 0.0743
Q32 – Mouth or throat sores – interference with daily activities Q32 3 (7.1%) 0 (0%) 0 (0%) 0.0498
Q33 – Nausea – frequency Q33 5 (11.9%) 2 (4.8%) 4 (9.5%) 0.4170
Q34 – Nausea – severity at its worst Q34 4 (9.5%) 2 (4.8%) 1 (2.4%) 0.1740
Q35 – Constipation – severity Q35 1 (2.4%) 0 (0%) 3 (7.1%) 0.1740
Q37 – Diarrhea – frequency Q37 4 (9.8%) 4 (9.8%) 4 (9.8%) 1.0000
Q38 – Rash – presence Q38 37 (88.1%) 32 (76.2%) 32 (76.2%) 0.2100
Q39 – Hand–foot syndrome – severity Q39 1 (2.4%) 0 (0%) 3 (7.1%) 0.0970
Q40 – Numbness/tingling – severity Q40 5 (11.9%) 3 (7.1%) 4 (9.5%) 0.6870
Q41 – Numbness/tingling – interference with daily activities Q41 6 (14.3%) 5 (11.9%) 4 (9.5%) 0.7410
Q42 – Blurry vision – severity Q42 1 (2.4%) 4 (9.8%) 3 (7.3%) 0.0970
Q43 – Blurry vision – interference with daily activities Q43 3 (7.1%) 5 (11.9%) 2 (4.8%) 0.2470
Q44 – Problems with concentration – severity Q44 4 (9.5%) 2 (4.8%) 1 (2.4%) 0.0970
Q45 – Problems with concentration – interference with daily activities Q45 4 (9.5%) 5 (11.9%) 2 (4.8%) 0.4590
Q46 – Pain – frequency Q46 16 (38.1%) 14 (33.3%) 14 (33.3%) 0.7660
Q47 – Pain – severity Q47 7 (16.7%) 6 (14.3%) 7 (16.7%) 0.9130
Q48 – Pain – interference with daily activities Q48 10 (23.8%) 6 (14.3%) 6 (14.3%) 0.3680
Q49 – Headache – frequency Q49 5 (11.9%) 7 (16.7%) 4 (9.5%) 0.3680
Q50 – Headache – severity Q50 4 (9.5%) 2 (4.8%) 2 (4.8%) 0.3680
Q51 – Headache – interference with daily activities Q51 4 (9.5%) 4 (9.5%) 2 (4.8%) 0.5130
Q52 – Aching muscles – frequency Q52 8 (19.5%) 14 (34.1%) 13 (31.7%) 0.1440
Q53 – Aching muscles – severity Q53 4 (9.5%) 3 (7.1%) 4 (9.5%) 0.8820
Q54 – Aching muscles – interference with daily activities Q54 5 (11.9%) 7 (16.7%) 7 (16.7%) 0.7170
Q55 – Aching joints – frequency Q55 10 (23.8%) 11 (26.2%) 8 (19%) 0.6270
Q56 – Aching joints – severity Q56 9 (21.4%) 4 (9.5%) 4 (9.5%) 0.1030
Q57 – Aching joints – interference with daily activities Q57 8 (19%) 7 (16.7%) 4 (9.5%) 0.3380
Q58 – Insomnia – severity Q58 2 (4.8%) 3 (7.1%) 5 (11.9%) 0.3110
Q59 – Insomnia – interference with daily activities Q59 5 (11.9%) 3 (7.1%) 7 (16.7%) 0.1350
Q60 – Fatigue – severity Q60 6 (14.3%) 6 (14.3%) 9 (21.4%) 0.5260
Q61 – Fatigue – interference with daily activities Q61 10 (23.8%) 12 (28.6%) 14 (33.3%) 0.4240
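
As a cross-check on the Cochran's Q p-values in Table 3.2.2, the statistic can be computed directly from a patient-by-visit matrix of 0/1 indicators. The following is a minimal base-R sketch, not the code that produced the table. `cochran_q` is a hypothetical helper, and the toy matrix assumes 42 patients with complete data, three of whom exceed the threshold at Visit 1 only (the Q32 pattern of 3, 0, 0).

```r
# Hypothetical helper: Cochran's Q for a subjects x visits matrix of 0/1 values.
cochran_q <- function(x) {
  x <- as.matrix(x)
  x <- x[stats::complete.cases(x), , drop = FALSE]  # complete cases only
  k <- ncol(x)            # number of visits
  col_tot <- colSums(x)   # positives per visit
  row_tot <- rowSums(x)   # positives per patient
  n_tot <- sum(x)         # total positives
  q <- (k - 1) * (k * sum(col_tot^2) - n_tot^2) /
    (k * n_tot - sum(row_tot^2))
  list(
    statistic = q,
    df = k - 1,
    p.value = stats::pchisq(q, df = k - 1, lower.tail = FALSE)
  )
}

# Toy data mirroring the Q32 row: 3 of 42 patients positive at Visit 1 only.
toy <- matrix(0, nrow = 42, ncol = 3,
              dimnames = list(NULL, c("visit1", "visit2", "visit3")))
toy[1:3, "visit1"] <- 1

res <- cochran_q(toy)
print(res)
```

With this input the statistic is 6 on 2 degrees of freedom, giving p ≈ 0.0498, which agrees with the Q32 row. The statistic is undefined (0/0) when every patient gives the same response at all visits.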

(3.4.3) Ordinal Regression of Individual questions (Collapse questions)

In this section we again perform ordinal regression, this time at the question level, for each of the three visits of the final population. Only questions with 5 naturally occurring ordinal categories are considered. The questions were selected by Dr. Muluneh:

  • Diarrhea (frequency)

  • Nausea (frequency and severity)

  • Fatigue (severity and interference)

  • Possibly also: Rash
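
For reading the odds-ratio tables in this section, it may help to recall the cumulative logit (proportional odds) form, assuming the question-level models are fit with `clm()` and `link = "logit"` as in section 3.4.4. The symbols below ($Y_i$ for the response of patient $i$, $\theta_j$ for the thresholds, $x_i$ for the covariates, $\beta$ for the coefficients) are introduced here for illustration only.

```latex
\operatorname{logit} P(Y_i \le j) = \theta_j - x_i^{\top}\beta,
\qquad j = 1, \dots, J - 1

\mathrm{OR}_k = \exp(\beta_k)
```

Under this parameterization an odds ratio above 1 corresponds to higher odds of a response in a higher category, and the same odds ratio is assumed to apply at every threshold $j$.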

Ordinal regression currently cannot be fit for the nausea questions Q33 and Q34, possibly because some response categories have few or no responses. To work around this, we will attempt to collapse these questions into fewer response categories.
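
One way to collapse a sparse 5-level item is sketched below using base R only. The helper `collapse_item` and the cut points (1 | 2-3 | 4-5) are illustrative assumptions, not the grouping chosen for Q33 and Q34, and the toy response vector is made up.

```r
# Hypothetical helper: collapse a 1-5 ordinal item into three ordered levels.
# The cut points (1 | 2-3 | 4-5) are an illustrative assumption.
collapse_item <- function(x) {
  x <- suppressWarnings(as.integer(as.character(x)))
  cut(
    x,
    breaks = c(0, 1, 3, 5),            # intervals (0,1], (1,3], (3,5]
    labels = c("low", "mid", "high"),
    ordered_result = TRUE
  )
}

# Made-up responses with sparse upper categories and one missing value
q_toy <- c(1, 1, 1, 2, 1, 3, 1, 5, 2, 1, 4, NA)

# Compare category counts before and after collapsing
print(table(original = q_toy, useNA = "ifany"))
q_collapsed <- collapse_item(q_toy)
print(table(collapsed = q_collapsed, useNA = "ifany"))
```

The collapsed variable is an ordered factor, so it can be passed as the outcome to an ordinal model in the same way as the original item. Missing responses stay missing rather than being absorbed into a category.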

## # A tibble: 30 × 9
##    term  estimate std.error statistic p.value conf.low conf.high coef.type Item 
##    <chr>    <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl> <chr>     <chr>
##  1 age      1.01     0.0363    0.267    0.789   0.938       1.09 location  Q37  
##  2 sex2     0.390    0.717    -1.31     0.189   0.0891      1.54 location  Q37  
##  3 eduO…    0.766    0.835    -0.319    0.750   0.141       3.95 location  Q37  
##  4 empO…    2.32     0.887     0.950    0.342   0.415      14.5  location  Q37  
##  5 inco…    1.02     0.930     0.0243   0.981   0.167       6.78 location  Q37  
##  6 mari…    1.05     0.848     0.0531   0.958   0.193       5.64 location  Q37  
##  7 insu…    0.807    0.765    -0.280    0.780   0.171       3.61 location  Q37  
##  8 line…    1.27     0.723     0.330    0.742   0.305       5.36 location  Q37  
##  9 dose…    0.829    1.38     -0.136    0.892   0.0317     10.9  location  Q37  
## 10 dose…    1.67     1.92      0.266    0.790   0.0297     72.6  location  Q37  
## # ℹ 20 more rows
Ordinal Regression Results — Q37
Final population, Visit 1
Covariate Odds Ratio (95% CI) p-value
age 1.01 (0.94, 1.09) 0.789
sex2 0.39 (0.09, 1.54) 0.189
eduOther 0.77 (0.14, 3.95) 0.750
empOther 2.32 (0.42, 14.55) 0.342
income>=75k 1.02 (0.17, 6.78) 0.981
maritalOther 1.05 (0.19, 5.64) 0.958
insuranceOther 0.81 (0.17, 3.61) 0.780
lineOther 1.27 (0.30, 5.36) 0.742
doseOther 0.83 (0.03, 10.92) 0.892
doseTwice daily 1.67 (0.03, 72.64) 0.790
Ordinal Regression Results — Q60
Final population, Visit 1
Covariate Odds Ratio (95% CI) p-value
age 0.99 (0.92, 1.06) 0.736
sex2 1.19 (0.33, 4.34) 0.795
eduOther 0.85 (0.18, 4.04) 0.834
empOther 4.33 (0.82, 24.76) 0.088
income>=75k 1.18 (0.20, 7.03) 0.852
maritalOther 0.75 (0.14, 3.75) 0.727
insuranceOther 1.46 (0.34, 6.21) 0.607
lineOther 0.86 (0.22, 3.32) 0.827
doseOther 0.79 (0.07, 8.50) 0.842
doseTwice daily 0.43 (0.01, 14.96) 0.640
Ordinal Regression Results — Q61
Final population, Visit 1
Covariate Odds Ratio (95% CI) p-value
age 0.99 (0.92, 1.06) 0.862
sex2 2.81 (0.74, 11.28) 0.134
eduOther 0.75 (0.15, 3.55) 0.711
empOther 3.17 (0.61, 18.28) 0.176
income>=75k 1.87 (0.32, 12.01) 0.492
maritalOther 0.27 (0.05, 1.41) 0.131
insuranceOther 1.08 (0.25, 4.58) 0.913
lineOther 1.01 (0.28, 3.66) 0.986
doseOther 1.10 (0.11, 10.93) 0.931
doseTwice daily 0.70 (0.01, 25.66) 0.846
## # A tibble: 20 × 9
##    term  estimate std.error statistic p.value conf.low conf.high coef.type Item 
##    <chr>    <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl> <chr>     <chr>
##  1 age      1.03     0.0345    0.727   0.467    0.957      1.10  location  Q60  
##  2 sex2     0.911    0.696    -0.134   0.893    0.229      3.61  location  Q60  
##  3 eduO…    0.620    0.823    -0.580   0.562    0.122      3.20  location  Q60  
##  4 empO…    3.88     0.910     1.49    0.136    0.660     24.7   location  Q60  
##  5 inco…    0.363    0.913    -1.11    0.266    0.0573     2.13  location  Q60  
##  6 mari…    0.242    0.875    -1.62    0.105    0.0409     1.30  location  Q60  
##  7 insu…    1.02     0.727     0.0298  0.976    0.242      4.33  location  Q60  
##  8 line…    1.33     0.682     0.419   0.675    0.349      5.18  location  Q60  
##  9 dose…    0.362    1.22     -0.835   0.404    0.0301     3.94  location  Q60  
## 10 dose…    0.945    1.82     -0.0309  0.975    0.0251    35.8   location  Q60  
## 11 age      1.02     0.0344    0.534   0.594    0.949      1.09  location  Q61  
## 12 sex2     2.61     0.732     1.31    0.190    0.631     11.5   location  Q61  
## 13 eduO…    0.764    0.812    -0.332   0.740    0.154      3.83  location  Q61  
## 14 empO…    4.65     0.920     1.67    0.0950   0.786     30.3   location  Q61  
## 15 inco…    0.270    0.936    -1.40    0.162    0.0401     1.64  location  Q61  
## 16 mari…    0.115    0.920    -2.35    0.0189   0.0175     0.663 location  Q61  
## 17 insu…    0.786    0.739    -0.327   0.744    0.179      3.34  location  Q61  
## 18 line…    0.757    0.706    -0.394   0.694    0.186      3.03  location  Q61  
## 19 dose…    0.323    1.19     -0.948   0.343    0.0294     3.49  location  Q61  
## 20 dose…    0.819    1.78     -0.112   0.911    0.0236    27.5   location  Q61
Ordinal Regression Results — Q60
Final population, Visit 2
Covariate Odds Ratio (95% CI) p-value
age 1.03 (0.96, 1.10) 0.467
sex2 0.91 (0.23, 3.61) 0.893
eduOther 0.62 (0.12, 3.20) 0.562
empOther 3.88 (0.66, 24.70) 0.136
income>=75k 0.36 (0.06, 2.13) 0.266
maritalOther 0.24 (0.04, 1.30) 0.105
insuranceOther 1.02 (0.24, 4.33) 0.976
lineOther 1.33 (0.35, 5.18) 0.675
doseOther 0.36 (0.03, 3.94) 0.404
doseTwice daily 0.95 (0.03, 35.77) 0.975
Ordinal Regression Results — Q61
Final population, Visit 2
Covariate Odds Ratio (95% CI) p-value
age 1.02 (0.95, 1.09) 0.594
sex2 2.61 (0.63, 11.46) 0.190
eduOther 0.76 (0.15, 3.83) 0.740
empOther 4.65 (0.79, 30.33) 0.095
income>=75k 0.27 (0.04, 1.64) 0.162
maritalOther 0.12 (0.02, 0.66) 0.019
insuranceOther 0.79 (0.18, 3.34) 0.744
lineOther 0.76 (0.19, 3.03) 0.694
doseOther 0.32 (0.03, 3.49) 0.343
doseTwice daily 0.82 (0.02, 27.51) 0.911
## # A tibble: 30 × 9
##    term  estimate std.error statistic p.value conf.low conf.high coef.type Item 
##    <chr>    <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl> <chr>     <chr>
##  1 age      0.983    0.0456    -0.371   0.711   0.898       1.08 location  Q37  
##  2 sex2     1.22     0.809      0.250   0.802   0.233       5.95 location  Q37  
##  3 eduO…    0.886    1.05      -0.115   0.909   0.110       7.55 location  Q37  
##  4 empO…    2.23     0.971      0.828   0.408   0.353      17.7  location  Q37  
##  5 inco…    3.02     1.08       1.02    0.306   0.394      30.5  location  Q37  
##  6 mari…    1.54     1.04       0.414   0.679   0.199      12.9  location  Q37  
##  7 insu…    4.67     0.956      1.61    0.107   0.763      35.8  location  Q37  
##  8 line…    0.517    0.834     -0.791   0.429   0.0948      2.63 location  Q37  
##  9 dose…    0.498    1.47      -0.475   0.635   0.0169      8.03 location  Q37  
## 10 dose…    3.06     2.03       0.552   0.581   0.0480    197.   location  Q37  
## # ℹ 20 more rows
Ordinal Regression Results — Q37
Final population, Visit 3
Covariate Odds Ratio (95% CI) p-value
age 0.98 (0.90, 1.08) 0.711
sex2 1.22 (0.23, 5.95) 0.802
eduOther 0.89 (0.11, 7.55) 0.909
empOther 2.23 (0.35, 17.74) 0.408
income>=75k 3.02 (0.39, 30.48) 0.306
maritalOther 1.54 (0.20, 12.87) 0.679
insuranceOther 4.67 (0.76, 35.77) 0.107
lineOther 0.52 (0.09, 2.63) 0.429
doseOther 0.50 (0.02, 8.03) 0.635
doseTwice daily 3.06 (0.05, 196.66) 0.581
Ordinal Regression Results — Q60
Final population, Visit 3
Covariate Odds Ratio (95% CI) p-value
age 0.94 (0.87, 1.01) 0.092
sex2 0.59 (0.14, 2.29) 0.452
eduOther 2.65 (0.55, 13.77) 0.230
empOther 6.13 (1.16, 35.92) 0.036
income>=75k 0.72 (0.12, 4.35) 0.713
maritalOther 0.79 (0.16, 3.89) 0.772
insuranceOther 1.12 (0.26, 4.70) 0.879
lineOther 1.76 (0.43, 7.51) 0.433
doseOther 1.04 (0.11, 9.55) 0.971
doseTwice daily 0.86 (0.02, 29.87) 0.933
Ordinal Regression Results — Q61
Final population, Visit 3
Covariate Odds Ratio (95% CI) p-value
age 0.94 (0.87, 1.01) 0.107
sex2 0.85 (0.20, 3.46) 0.821
eduOther 1.81 (0.39, 9.02) 0.456
empOther 7.28 (1.38, 46.97) 0.024
income>=75k 0.64 (0.10, 4.10) 0.629
maritalOther 0.53 (0.10, 2.91) 0.466
insuranceOther 0.74 (0.15, 3.40) 0.694
lineOther 2.31 (0.60, 9.36) 0.227
doseOther 0.68 (0.07, 6.03) 0.724
doseTwice daily 1.96 (0.04, 76.34) 0.714

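(3.4.4) Ordinal Regression of Composite Scores, Final Population, Visits 1, 2, and 3

Each domain composite score is collapsed into quartiles and modeled with ordinal regression, separately for Visits 1, 2, and 3.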

library(dplyr)
library(ordinal)
library(broom)

domain_vars <- c(
  "GI_score",
  "Dermatologic_score",
  "Pain_score",
  "Neurologic_score",
  "Constitutional_score"
)

domain_ord_results <- lapply(domain_vars, function(v) {

  df <- values_clean %>%
    filter(
      instance == 1,
      patient_id %in% final_ids,
      !is.na(.data[[v]])
    ) %>%
    mutate(
      age    = suppressWarnings(as.numeric(Q6)),
      sex    = factor(Q1),
      edu    = factor(edu_recoded),
      emp    = factor(employment_recoded),
      income = factor(income_recoded),
      marital = factor(marital_recoded),
      insurance = factor(insurance_recoded),
      line   = factor(line_therapy_recoded),
      dose   = factor(daily_freq_recoded),

      # Outcome collapsed into quartiles (ntile() ranks rows, so tied scores
      # can be split across adjacent quartiles)
      outcome = factor(ntile(.data[[v]], 4), ordered = TRUE)
    )

  fit <- clm(
    outcome ~ age + sex + edu + emp + income +
      marital + insurance + line + dose,
    data = df,
    link = "logit",
    Hess = TRUE
  )

  tidy(fit, exponentiate = TRUE, conf.int = TRUE) %>%
    filter(!grepl("\\|", term)) %>%   # drop thresholds
    mutate(Domain = gsub("_score", "", v))
}) %>%
  bind_rows()


library(gt)

domain_tables <- lapply(unique(domain_ord_results$Domain), function(d) {

  domain_ord_results %>%
    filter(Domain == d) %>%
    select(Domain, term, estimate, conf.low, conf.high, p.value) %>%
    mutate(
      OR_CI = sprintf("%.2f (%.2f, %.2f)", estimate, conf.low, conf.high),
      p_value = ifelse(p.value < 0.001, "<0.001", sprintf("%.3f", p.value))
    ) %>%
    select(term, OR_CI, p_value) %>%
    gt() %>%
    tab_header(
      title = md(paste0("**Ordinal Regression Results — ", d, " Domain**")),
      subtitle = md("Final population, Visit 1 (Odds Ratios with 95% CI)")
    ) %>%
    cols_label(
      term = "Covariate",
      OR_CI = "Odds Ratio (95% CI)",
      p_value = "p-value"
    ) %>%
    opt_row_striping()
})

names(domain_tables) <- unique(domain_ord_results$Domain)


domain_tables$GI
Ordinal Regression Results — GI Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.99 (0.92, 1.05) 0.648
sex2 2.55 (0.66, 10.50) 0.179
eduOther 1.12 (0.24, 5.22) 0.885
empOther 3.47 (0.65, 19.27) 0.143
income>=75k 1.27 (0.23, 7.41) 0.785
maritalOther 0.33 (0.06, 1.63) 0.176
insuranceOther 0.41 (0.08, 1.85) 0.247
lineOther 1.46 (0.38, 5.83) 0.583
doseOther 0.12 (0.01, 1.45) 0.103
doseTwice daily 1.39 (0.03, 51.41) 0.858
domain_tables$Pain
Ordinal Regression Results — Pain Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.98 (0.91, 1.05) 0.579
sex2 1.09 (0.30, 3.98) 0.891
eduOther 0.96 (0.22, 4.28) 0.956
empOther 2.41 (0.45, 13.23) 0.299
income>=75k 0.79 (0.14, 4.36) 0.789
maritalOther 0.73 (0.15, 3.49) 0.689
insuranceOther 1.62 (0.39, 6.88) 0.507
lineOther 0.58 (0.15, 2.16) 0.416
doseOther 1.08 (0.10, 12.88) 0.949
doseTwice daily 0.49 (0.01, 15.60) 0.686
domain_tables$Neurologic
Ordinal Regression Results — Neurologic Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.98 (0.91, 1.05) 0.509
sex2 0.48 (0.13, 1.72) 0.267
eduOther 0.55 (0.11, 2.64) 0.451
empOther 0.55 (0.10, 2.79) 0.476
income>=75k 0.49 (0.08, 2.74) 0.420
maritalOther 0.32 (0.06, 1.64) 0.176
insuranceOther 1.31 (0.30, 5.79) 0.721
lineOther 0.54 (0.14, 2.04) 0.366
doseOther 0.68 (0.07, 6.51) 0.729
doseTwice daily 0.07 (0.00, 2.40) 0.152
domain_tables$Dermatologic
Ordinal Regression Results — Dermatologic Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 1.03 (0.96, 1.12) 0.373
sex2 0.18 (0.04, 0.72) 0.020
eduOther 0.41 (0.07, 2.00) 0.277
empOther 0.49 (0.09, 2.52) 0.401
income>=75k 0.54 (0.08, 3.55) 0.522
maritalOther 2.51 (0.48, 13.37) 0.272
insuranceOther 4.24 (1.01, 20.05) 0.056
lineOther 0.83 (0.21, 3.19) 0.791
doseOther 1.55 (0.12, 21.04) 0.731
doseTwice daily 0.31 (0.01, 17.80) 0.551
domain_tables$Constitutional
Ordinal Regression Results — Constitutional Domain
Final population, Visit 1 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.96 (0.88, 1.03) 0.253
sex2 1.95 (0.53, 7.39) 0.317
eduOther 0.73 (0.13, 3.95) 0.709
empOther 5.80 (1.01, 38.01) 0.053
income>=75k 0.86 (0.16, 4.80) 0.858
maritalOther 0.42 (0.07, 2.26) 0.319
insuranceOther 0.90 (0.20, 3.91) 0.891
lineOther 0.39 (0.10, 1.46) 0.168
doseOther 0.58 (0.06, 6.67) 0.643
doseTwice daily 9.70 (0.14, 748.32) 0.277
values_clean %>%
  filter(
    instance == 1,
    patient_id %in% final_ids,
    !is.na(Pain_score)
  ) %>%
  mutate(
    quartile = ntile(Pain_score, 4)
  ) %>%
  count(quartile)
## # A tibble: 4 × 2
##   quartile     n
##      <int> <int>
## 1        1    11
## 2        2    11
## 3        3    10
## 4        4    10
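
The near-equal bin sizes above follow from how `dplyr::ntile()` works: it ranks rows and cuts the ranking into groups of near-equal size, so patients with identical scores can be assigned to different quartiles. The sketch below shows this on a made-up score vector and contrasts it with bins defined by quantile cut points. `toy_scores` and the alternative binning are illustrative assumptions, not part of the analysis.

```r
library(dplyr)

# Made-up composite scores with a heavy tie at 0
toy_scores <- c(0, 0, 0, 0, 1, 2, 3, 4)

# ntile(): rank-based, equal-sized bins; the four zeros are split
# across bins 1 and 2 even though the scores are identical
rank_bins <- ntile(toy_scores, 4)

# Alternative: bins defined by score values, so ties stay together.
# unique() guards against duplicated quantile break points.
breaks <- unique(quantile(toy_scores, probs = seq(0, 1, 0.25)))
value_bins <- cut(toy_scores, breaks = breaks, include.lowest = TRUE,
                  ordered_result = TRUE)

print(data.frame(score = toy_scores, rank_bin = rank_bins, value_bin = value_bins))
```

Value-based bins keep tied patients together but may yield fewer than four categories of unequal size, so the choice between the two is a trade-off rather than a clear win for either.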
library(ggplot2)

domains <- c(
  "GI_score",
  "Dermatologic_score",
  "Pain_score",
  "Neurologic_score",
  "Constitutional_score"
)

for (d in domains) {

  p <- ggplot(
    values_clean %>%
      filter(
        instance == 1,
        patient_id %in% final_ids,
        !is.na(.data[[d]])
      ),
    aes(x = .data[[d]])
  ) +
    geom_histogram(
      binwidth = 1,
      fill = "steelblue",
      color = "black"
    ) +
    theme_minimal(base_size = 14) +
    labs(
      title = paste("Distribution of", gsub("_score", "", d), "Domain Scores"),
      subtitle = "Baseline (Visit 1), Final Population",
      x = paste(gsub("_score", "", d), "Composite Score"),
      y = "Count"
    )

  print(p)
}

library(dplyr)
library(ordinal)
library(broom)
library(gt)

domain_vars <- c(
  "GI_score",
  "Dermatologic_score",
  "Pain_score",
  "Neurologic_score",
  "Constitutional_score"
)

domain_ord_results_v2 <- lapply(domain_vars, function(v) {

  df <- values_clean %>%
    filter(
      instance == 2,
      patient_id %in% final_ids,
      !is.na(.data[[v]])
    ) %>%
    mutate(
      age        = suppressWarnings(as.numeric(Q6)),
      sex        = factor(Q1),
      edu        = factor(edu_recoded),
      emp        = factor(employment_recoded),
      income     = factor(income_recoded),
      marital    = factor(marital_recoded),
      insurance  = factor(insurance_recoded),
      line       = factor(line_therapy_recoded),
      dose       = factor(daily_freq_recoded),

      # Outcome collapsed into quartiles
      outcome = factor(ntile(.data[[v]], 4), ordered = TRUE)
    )

  fit <- clm(
    outcome ~ age + sex + edu + emp + income +
      marital + insurance + line + dose,
    data = df,
    link = "logit",
    Hess = TRUE
  )

  tidy(fit, exponentiate = TRUE, conf.int = TRUE) %>%
    filter(!grepl("\\|", term)) %>%
    mutate(Domain = gsub("_score", "", v))
}) %>%
  bind_rows()

domain_tables_v2 <- lapply(unique(domain_ord_results_v2$Domain), function(d) {

  domain_ord_results_v2 %>%
    filter(Domain == d) %>%
    select(term, estimate, conf.low, conf.high, p.value) %>%
    mutate(
      OR_CI = sprintf("%.2f (%.2f, %.2f)", estimate, conf.low, conf.high),
      p_value = ifelse(p.value < 0.001, "<0.001",
                       sprintf("%.3f", p.value))
    ) %>%
    select(term, OR_CI, p_value) %>%
    gt() %>%
    tab_header(
      title = md(paste0("**Ordinal Regression Results — ", d, " Domain**")),
      subtitle = md("Final population, Visit 2 (Odds Ratios with 95% CI)")
    ) %>%
    cols_label(
      term = "Covariate",
      OR_CI = "Odds Ratio (95% CI)",
      p_value = "p-value"
    ) %>%
    opt_row_striping()
})

names(domain_tables_v2) <- unique(domain_ord_results_v2$Domain)

domain_tables_v2$GI
Ordinal Regression Results — GI Domain
Final population, Visit 2 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 1.00 (0.93, 1.07) 0.986
sex2 1.67 (0.43, 6.65) 0.457
eduOther 0.68 (0.13, 3.50) 0.645
empOther 1.17 (0.23, 5.91) 0.850
income>=75k 0.73 (0.14, 3.90) 0.715
maritalOther 0.70 (0.14, 3.32) 0.655
insuranceOther 1.03 (0.26, 4.13) 0.968
lineOther 0.25 (0.06, 1.00) 0.054
doseOther 0.79 (0.08, 8.81) 0.836
doseTwice daily 0.45 (0.01, 16.75) 0.668
domain_tables_v2$Pain
Ordinal Regression Results — Pain Domain
Final population, Visit 2 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.97 (0.90, 1.04) 0.376
sex2 0.64 (0.17, 2.30) 0.495
eduOther 1.23 (0.25, 6.30) 0.800
empOther 2.30 (0.45, 12.13) 0.314
income>=75k 0.53 (0.09, 2.93) 0.471
maritalOther 0.79 (0.15, 3.98) 0.775
insuranceOther 0.77 (0.16, 3.52) 0.730
lineOther 0.95 (0.24, 3.71) 0.939
doseOther 0.90 (0.09, 9.14) 0.928
doseTwice daily 0.46 (0.01, 15.80) 0.669
domain_tables_v2$Neurologic
Ordinal Regression Results — Neurologic Domain
Final population, Visit 2 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 1.01 (0.94, 1.08) 0.823
sex2 0.45 (0.11, 1.75) 0.253
eduOther 1.64 (0.29, 10.16) 0.579
empOther 1.02 (0.20, 4.88) 0.979
income>=75k 0.34 (0.05, 1.80) 0.218
maritalOther 0.76 (0.13, 4.42) 0.759
insuranceOther 2.27 (0.53, 10.41) 0.273
lineOther 1.23 (0.32, 4.81) 0.764
doseOther 0.13 (0.00, 1.70) 0.146
doseTwice daily 1.27 (0.04, 61.49) 0.897
domain_tables_v2$Dermatologic
Ordinal Regression Results — Dermatologic Domain
Final population, Visit 2 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 1.08 (1.01, 1.17) 0.034
sex2 1.40 (0.35, 6.00) 0.640
eduOther 0.09 (0.01, 0.52) 0.010
empOther 1.54 (0.28, 8.81) 0.616
income>=75k 27.90 (3.92, 261.40) 0.002
maritalOther 2.14 (0.43, 11.38) 0.360
insuranceOther 0.44 (0.09, 1.97) 0.289
lineOther 0.62 (0.15, 2.47) 0.498
doseOther 0.54 (0.03, 7.27) 0.644
doseTwice daily 2.92 (0.05, 189.89) 0.594
domain_tables_v2$Constitutional
Ordinal Regression Results — Constitutional Domain
Final population, Visit 2 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 1.04 (0.98, 1.12) 0.206
sex2 1.28 (0.32, 5.14) 0.723
eduOther 0.23 (0.04, 1.20) 0.091
empOther 1.44 (0.30, 6.99) 0.643
income>=75k 0.18 (0.03, 1.01) 0.057
maritalOther 0.11 (0.02, 0.58) 0.013
insuranceOther 1.11 (0.25, 4.93) 0.891
lineOther 0.45 (0.11, 1.83) 0.272
doseOther 0.35 (0.03, 4.54) 0.402
doseTwice daily 0.82 (0.03, 36.37) 0.911
library(dplyr)
library(ordinal)
library(broom)
library(gt)

domain_ord_results_v3 <- lapply(domain_vars, function(v) {

  df <- values_clean %>%
    filter(
      instance == 3,
      patient_id %in% final_ids,
      !is.na(.data[[v]])
    ) %>%
    mutate(
      age        = suppressWarnings(as.numeric(Q6)),
      sex        = factor(Q1),
      edu        = factor(edu_recoded),
      emp        = factor(employment_recoded),
      income     = factor(income_recoded),
      marital    = factor(marital_recoded),
      insurance  = factor(insurance_recoded),
      line       = factor(line_therapy_recoded),
      dose       = factor(daily_freq_recoded),
      outcome    = factor(ntile(.data[[v]], 4), ordered = TRUE)
    )

  fit <- clm(
    outcome ~ age + sex + edu + emp + income +
      marital + insurance + line + dose,
    data = df,
    link = "logit",
    Hess = TRUE
  )

  tidy(fit, exponentiate = TRUE, conf.int = TRUE) %>%
    filter(!grepl("\\|", term)) %>%
    mutate(Domain = gsub("_score", "", v))
}) %>%
  bind_rows()

domain_tables_v3 <- lapply(unique(domain_ord_results_v3$Domain), function(d) {

  domain_ord_results_v3 %>%
    filter(Domain == d) %>%
    select(term, estimate, conf.low, conf.high, p.value) %>%
    mutate(
      OR_CI = sprintf("%.2f (%.2f, %.2f)", estimate, conf.low, conf.high),
      p_value = ifelse(p.value < 0.001, "<0.001",
                       sprintf("%.3f", p.value))
    ) %>%
    select(term, OR_CI, p_value) %>%
    gt() %>%
    tab_header(
      title = md(paste0("**Ordinal Regression Results — ", d, " Domain**")),
      subtitle = md("Final population, Visit 3 (Odds Ratios with 95% CI)")
    ) %>%
    cols_label(
      term = "Covariate",
      OR_CI = "Odds Ratio (95% CI)",
      p_value = "p-value"
    ) %>%
    opt_row_striping()
})

names(domain_tables_v3) <- unique(domain_ord_results_v3$Domain)

domain_tables_v3$GI
Ordinal Regression Results — GI Domain
Final population, Visit 3 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 1.02 (0.96, 1.10) 0.509
sex2 2.47 (0.67, 9.55) 0.178
eduOther 1.33 (0.28, 6.67) 0.721
empOther 1.65 (0.37, 7.48) 0.510
income>=75k 0.58 (0.11, 3.00) 0.518
maritalOther 0.96 (0.20, 4.55) 0.961
insuranceOther 1.22 (0.26, 5.66) 0.800
lineOther 1.29 (0.34, 4.96) 0.711
doseOther 0.06 (0.00, 0.70) 0.037
doseTwice daily 0.17 (0.00, 5.18) 0.311
domain_tables_v3$Pain
Ordinal Regression Results — Pain Domain
Final population, Visit 3 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.93 (0.86, 1.00) 0.072
sex2 0.47 (0.11, 1.84) 0.290
eduOther 1.93 (0.37, 10.78) 0.437
empOther 3.13 (0.54, 18.80) 0.202
income>=75k 0.73 (0.13, 3.92) 0.713
maritalOther 1.66 (0.31, 8.95) 0.552
insuranceOther 2.53 (0.49, 14.59) 0.279
lineOther 0.65 (0.16, 2.61) 0.547
doseOther 0.27 (0.02, 2.81) 0.275
doseTwice daily 2.13 (0.07, 67.70) 0.660
domain_tables_v3$Neurologic
Ordinal Regression Results — Neurologic Domain
Final population, Visit 3 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.97 (0.90, 1.03) 0.283
sex2 0.46 (0.12, 1.66) 0.238
eduOther 2.26 (0.44, 13.07) 0.342
empOther 1.36 (0.30, 6.33) 0.691
income>=75k 1.13 (0.22, 5.78) 0.883
maritalOther 0.87 (0.17, 4.62) 0.872
insuranceOther 2.90 (0.71, 12.86) 0.144
lineOther 1.05 (0.26, 4.20) 0.946
doseOther 0.69 (0.06, 6.54) 0.743
doseTwice daily 0.48 (0.01, 20.21) 0.698
domain_tables_v3$Dermatologic
Ordinal Regression Results — Dermatologic Domain
Final population, Visit 3 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 1.08 (1.00, 1.18) 0.053
sex2 2.58 (0.61, 12.40) 0.214
eduOther 0.55 (0.11, 2.70) 0.458
empOther 0.74 (0.13, 4.17) 0.731
income>=75k 11.40 (1.75, 85.86) 0.013
maritalOther 7.81 (1.58, 45.02) 0.015
insuranceOther 1.24 (0.28, 5.46) 0.771
lineOther 1.08 (0.27, 4.27) 0.914
doseOther 4.71 (0.40, 64.02) 0.219
doseTwice daily 0.27 (0.01, 10.05) 0.477
domain_tables_v3$Constitutional
Ordinal Regression Results — Constitutional Domain
Final population, Visit 3 (Odds Ratios with 95% CI)
Covariate Odds Ratio (95% CI) p-value
age 0.88 (0.80, 0.96) 0.005
sex2 0.67 (0.16, 2.68) 0.578
eduOther 11.58 (1.76, 103.14) 0.017
empOther 14.52 (2.45, 108.77) 0.005
income>=75k 0.45 (0.08, 2.45) 0.355
maritalOther 0.54 (0.09, 3.20) 0.495
insuranceOther 3.04 (0.58, 17.65) 0.195
lineOther 0.63 (0.15, 2.54) 0.515
doseOther 0.04 (0.00, 0.48) 0.016
doseTwice daily 21.72 (0.33, 1573.50) 0.136

(3.4.5) Sankey plot of Individual questions

(3.4.6) Linear Regression of Pooled Composite Scores

## 
## Call:
## lm(formula = PRO_score ~ age + sex + edu + emp + income + marital + 
##     insurance + line + months + dose, data = baseline_PRO)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -25.795  -8.988  -0.753   5.585  58.487 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)   
## (Intercept)        79.3147    23.4012   3.389   0.0021 **
## age                -0.4247     0.3731  -1.138   0.2647   
## sexFemale           0.4162     7.3631   0.057   0.9553   
## eduOther            2.9437     8.6834   0.339   0.7371   
## empOther            9.4084     9.2337   1.019   0.3170   
## income>=75k        -3.8378     9.8181  -0.391   0.6988   
## maritalOther       -5.7604     9.1010  -0.633   0.5319   
## insuranceOther      2.9680     8.0090   0.371   0.7137   
## lineOther          -4.5230     7.3226  -0.618   0.5418   
## months>=13 months  -1.1630     7.4979  -0.155   0.8779   
## monthsMissing      45.5393    22.6714   2.009   0.0543 . 
## doseOther          -5.9408    13.4858  -0.441   0.6629   
## doseTwice daily    -3.0115    19.7453  -0.153   0.8799   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20 on 28 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.2656, Adjusted R-squared:  -0.04917 
## F-statistic: 0.8438 on 12 and 28 DF,  p-value: 0.6079

(3.4.7) Linear Regression of Domain Composite Scores

## 
## ==============================
## DOMAIN: GI_score 
## ==============================
## 
## Call:
## lm(formula = PRO_score ~ age + sex + edu + emp + income + marital + 
##     insurance + line + months + dose, data = baseline_domain)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -5.158 -2.027  0.000  1.619  6.894 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       12.09137    3.68061   3.285  0.00274 ** 
## age               -0.06541    0.05868  -1.115  0.27447    
## sexFemale          1.97720    1.15809   1.707  0.09884 .  
## eduOther           0.76850    1.36575   0.563  0.57812    
## empOther           2.09506    1.45231   1.443  0.16023    
## income>=75k       -0.65181    1.54421  -0.422  0.67618    
## maritalOther      -2.71098    1.43144  -1.894  0.06861 .  
## insuranceOther    -0.97184    1.25968  -0.771  0.44688    
## lineOther          0.01430    1.15172   0.012  0.99018    
## months>=13 months  0.61280    1.17929   0.520  0.60740    
## monthsMissing     17.86634    3.56582   5.010  2.7e-05 ***
## doseOther         -4.14073    2.12108  -1.952  0.06098 .  
## doseTwice daily    0.54877    3.10560   0.177  0.86101    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.146 on 28 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.6237, Adjusted R-squared:  0.4625 
## F-statistic: 3.868 on 12 and 28 DF,  p-value: 0.001546

## 
## ==============================
## DOMAIN: Dermatologic_score 
## ==============================
## 
## Call:
## lm(formula = PRO_score ~ age + sex + edu + emp + income + marital + 
##     insurance + line + months + dose, data = baseline_domain)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.7821 -0.1394  0.1829  0.3560  1.0081 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        5.587781   1.090860   5.122 1.98e-05 ***
## age                0.007704   0.017393   0.443    0.661    
## sexFemale         -0.386076   0.343234  -1.125    0.270    
## eduOther          -0.384628   0.404781  -0.950    0.350    
## empOther          -0.045117   0.430434  -0.105    0.917    
## income>=75k        0.145251   0.457673   0.317    0.753    
## maritalOther       0.545225   0.424249   1.285    0.209    
## insuranceOther    -0.204823   0.373345  -0.549    0.588    
## lineOther          0.012721   0.341346   0.037    0.971    
## months>=13 months -0.193389   0.349517  -0.553    0.584    
## monthsMissing      0.289494   1.056838   0.274    0.786    
## doseOther          0.689347   0.628645   1.097    0.282    
## doseTwice daily    0.319359   0.920438   0.347    0.731    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9324 on 28 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.2015, Adjusted R-squared:  -0.1406 
## F-statistic: 0.589 on 12 and 28 DF,  p-value: 0.8325

## 
## ==============================
## DOMAIN: Pain_score 
## ==============================
## 
## Call:
## lm(formula = PRO_score ~ age + sex + edu + emp + income + marital + 
##     insurance + line + months + dose, data = baseline_domain)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -17.462  -8.340  -0.125   3.868  39.918 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)  
## (Intercept)        38.6113    16.0918   2.399   0.0233 *
## age                -0.2360     0.2566  -0.920   0.3655  
## sexFemale          -1.2182     5.0632  -0.241   0.8116  
## eduOther            1.8431     5.9711   0.309   0.7599  
## empOther            5.0400     6.3496   0.794   0.4340  
## income>=75k        -1.0246     6.7514  -0.152   0.8805  
## maritalOther       -0.6577     6.2583  -0.105   0.9170  
## insuranceOther      3.7782     5.5074   0.686   0.4983  
## lineOther          -3.0181     5.0354  -0.599   0.5537  
## months>=13 months  -1.7985     5.1559  -0.349   0.7298  
## monthsMissing      24.6636    15.5900   1.582   0.1249  
## doseOther          -0.5294     9.2735  -0.057   0.9549  
## doseTwice daily    -2.3094    13.5778  -0.170   0.8662  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.75 on 28 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.1876, Adjusted R-squared:  -0.1605 
## F-statistic: 0.539 on 12 and 28 DF,  p-value: 0.8704

## 
## ==============================
## DOMAIN: Neurologic_score 
## ==============================
## 
## Call:
## lm(formula = PRO_score ~ age + sex + edu + emp + income + marital + 
##     insurance + line + months + dose, data = baseline_domain)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.0978 -1.9024 -0.6009  0.9250  8.4517 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)   
## (Intercept)       11.55553    4.07359   2.837  0.00838 **
## age               -0.05078    0.06495  -0.782  0.44090   
## sexFemale         -0.82513    1.28174  -0.644  0.52497   
## eduOther           0.24408    1.51157   0.161  0.87288   
## empOther          -0.48601    1.60737  -0.302  0.76461   
## income>=75k       -1.89187    1.70908  -1.107  0.27774   
## maritalOther      -1.74734    1.58427  -1.103  0.27945   
## insuranceOther     0.11692    1.39418   0.084  0.93376   
## lineOther         -0.08192    1.27469  -0.064  0.94921   
## months>=13 months -0.36592    1.30520  -0.280  0.78127   
## monthsMissing      1.47877    3.94654   0.375  0.71071   
## doseOther         -1.12278    2.34755  -0.478  0.63617   
## doseTwice daily   -3.74674    3.43718  -1.090  0.28498   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.482 on 28 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.1473, Adjusted R-squared:  -0.2182 
## F-statistic: 0.403 on 12 and 28 DF,  p-value: 0.9503

## 
## ==============================
## DOMAIN: Constitutional_score 
## ==============================
## 
## Call:
## lm(formula = PRO_score ~ age + sex + edu + emp + income + marital + 
##     insurance + line + months + dose, data = baseline_domain)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.2408 -1.8826 -0.1184  1.6216  8.7712 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)  
## (Intercept)       11.46875    4.23898   2.706   0.0115 *
## age               -0.08017    0.06759  -1.186   0.2455  
## sexFemale          0.86837    1.33378   0.651   0.5203  
## eduOther           0.47270    1.57294   0.301   0.7660  
## empOther           2.80445    1.67263   1.677   0.1047  
## income>=75k       -0.41468    1.77848  -0.233   0.8173  
## maritalOther      -1.18961    1.64859  -0.722   0.4765  
## insuranceOther     0.24963    1.45079   0.172   0.8646  
## lineOther         -1.44999    1.32644  -1.093   0.2836  
## months>=13 months  0.58205    1.35819   0.429   0.6715  
## monthsMissing      1.24103    4.10678   0.302   0.7647  
## doseOther         -0.83729    2.44286  -0.343   0.7343  
## doseTwice daily    2.17659    3.57674   0.609   0.5477  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.623 on 28 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.2214, Adjusted R-squared:  -0.1123 
## F-statistic: 0.6634 on 12 and 28 DF,  p-value: 0.7704
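
The per-domain output above has the shape of one `lm()` fit per composite score, each printed under a `DOMAIN:` banner. A minimal sketch of such a loop follows. It is not the report's actual code: the simulated `toy` data, the reduced covariate set (`age + sex` only), and the helper `fit_domain()` are all hypothetical and chosen only to keep the example self-contained.

```r
# Hypothetical sketch: fit one linear model per domain score.
# `toy` is simulated; it is NOT the study data.
set.seed(1)
n <- 40
toy <- data.frame(
  age = round(rnorm(n, mean = 63, sd = 14)),
  sex = factor(sample(c("Male", "Female"), n, replace = TRUE),
               levels = c("Male", "Female")),
  Neurologic_score     = rpois(n, lambda = 5),
  Constitutional_score = rpois(n, lambda = 6)
)

domains <- c("Neurologic_score", "Constitutional_score")

# Copy the chosen domain column into PRO_score, then fit the model.
# (Assumed helper; the covariate list is shortened for illustration.)
fit_domain <- function(data, domain) {
  baseline_domain <- data
  baseline_domain$PRO_score <- baseline_domain[[domain]]
  lm(PRO_score ~ age + sex, data = baseline_domain)
}

# One fitted model per domain, stored in a named list.
fits <- lapply(setNames(domains, domains), function(d) fit_domain(toy, d))

# Print each summary under a banner naming the domain.
for (d in names(fits)) {
  cat("\n==============================\n")
  cat("DOMAIN:", d, "\n")
  cat("==============================\n")
  print(summary(fits[[d]]))
}
```

Keeping the fits in a named list makes it straightforward to pull out coefficients or R-squared values per domain later, instead of reading them off the printed summaries.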

Section 4

(4.1) Binned plots by question (Item score vs adherence)

[Figure output: 30 binned plots (list elements [[1]] to [[30]]), item score vs adherence; images not reproduced here]

Domain Emax Curves

(4.2) Spearman correlation

(4.3) Table of Correlations

library(dplyr)
library(tidyr)
library(gt)

# -----------------------------
# Domains
# -----------------------------
domains <- c(
  "GI_score",
  "Dermatologic_score",
  "Pain_score",
  "Neurologic_score",
  "Constitutional_score"
)

# -----------------------------
# Final population
# -----------------------------
final_pop <- values_clean %>%
  filter(instance %in% c(1, 2, 3)) %>%
  group_by(patient_id) %>%
  summarise(n_visits = n_distinct(instance), .groups = "drop") %>%
  filter(n_visits == 3) %>%
  pull(patient_id)

# -----------------------------
# Collect correlations
# -----------------------------
corr_results <- list()

for (d in domains) {
  for (v in c(1, 2, 3)) {

    df <- values_clean %>%
      filter(
        patient_id %in% final_pop,
        instance == v,
        !is.na(wilson_adherence),
        !is.na(.data[[d]])
      )

    if (nrow(df) < 5) next

    ct <- cor.test(
      df$wilson_adherence,
      df[[d]],
      method = "spearman",
      exact = FALSE
    )

    corr_results[[length(corr_results) + 1]] <- tibble(
      Domain = gsub("_score", "", d),
      Visit  = paste0("Visit ", v),
      r      = round(unname(ct$estimate), 2),  # Spearman's rho; drop the "rho" name
      p      = ct$p.value
    )
  }
}

corr_results <- bind_rows(corr_results) %>%
  mutate(
    p_value = ifelse(p < 0.001, "<0.001", sprintf("%.3f", p))
  )

# -----------------------------
# Wide-format table
# -----------------------------
corr_table <- corr_results %>%
  select(Domain, Visit, r, p_value) %>%
  pivot_wider(
    names_from = Visit,
    values_from = c(r, p_value),
    names_glue = "{Visit}_{.value}"
  ) %>%
  rename(
    `Visit 1 r` = `Visit 1_r`,
    `Visit 1 p-value` = `Visit 1_p_value`,
    `Visit 2 r` = `Visit 2_r`,
    `Visit 2 p-value` = `Visit 2_p_value`,
    `Visit 3 r` = `Visit 3_r`,
    `Visit 3 p-value` = `Visit 3_p_value`
  )

# -----------------------------
# GT table
# -----------------------------
corr_gt <- corr_table %>%
  gt() %>%
  tab_header(
    title = md("**Correlation Between Adherence and Domain Scores**"),
    subtitle = md("Spearman correlation by visit (final population)")
  ) %>%
  cols_align(align = "center", -Domain) %>%
  opt_row_striping()

corr_gt
Correlation Between Adherence and Domain Scores
Spearman correlation by visit (final population)
Domain          Visit 1 r  Visit 2 r  Visit 3 r  Visit 1 p-value  Visit 2 p-value  Visit 3 p-value
GI                   0.13      -0.12       0.20            0.412            0.454            0.216
Dermatologic        -0.18       0.25       0.02            0.265            0.115            0.903
Pain                 0.14       0.14       0.18            0.380            0.389            0.248
Neurologic           0.00       0.07       0.10            0.987            0.658            0.512
Constitutional       0.14      -0.12      -0.03            0.382            0.453            0.856
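
Spearman's rho is the Pearson correlation computed on ranks, so it measures monotone rather than strictly linear association. The sketch below illustrates that equivalence with simulated values; the vectors `adherence` and `score` are made up for the example and are not study data.

```r
# Simulated values for illustration only; not study data.
set.seed(42)
adherence <- runif(30, min = 50, max = 100)
score     <- rpois(30, lambda = 5)

# Spearman's rho via cor.test(), mirroring the call used for the table.
# exact = FALSE requests the asymptotic p-value, which also avoids the
# "cannot compute exact p-value with ties" warning when scores are tied.
ct <- cor.test(adherence, score, method = "spearman", exact = FALSE)
rho_spearman <- unname(ct$estimate)

# The same quantity as a Pearson correlation of the (mid)ranks.
rho_ranks <- cor(rank(adherence), rank(score))

# TRUE when the two computations agree up to numerical tolerance.
all.equal(rho_spearman, rho_ranks)
```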

Export composite scores

# ============================================================
# EXPORT ANALYSIS DATASET WITH COMPOSITE SCORES
# ============================================================

library(dplyr)
library(writexl)

analysis_dataset <- values_clean %>%
  dplyr::select(
    # Identifiers
    patient_id,
    instance,
    relative_date,

    # Adherence
    wilson_adherence,

    # Composite scores
    AE_score,
    GI_score,
    Dermatologic_score,
    Pain_score,
    Neurologic_score,
    Constitutional_score,

    # Demographics / covariates
    age = Q6,
    sex = Q1,
    edu_recoded,
    employment_recoded,
    income_recoded,
    marital_recoded,
    insurance_recoded,
    line_therapy_recoded,
    daily_freq_recoded,
    months_therapy_recoded
  ) %>%
  arrange(patient_id, instance)

# ---- Export to CSV ----
write.csv(
  analysis_dataset,
  file = "analysis_dataset_with_composite_scores.csv",
  row.names = FALSE
)

# ---- Optional Excel version (often preferred by advisors) ----
write_xlsx(
  analysis_dataset,
  path = "analysis_dataset_with_composite_scores.xlsx"
)
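
A quick read-back check can confirm that an export round-trips as expected. The sketch below is only an illustration under stated assumptions: it uses a small made-up data frame (`toy_export`, a hypothetical name) and a temporary file rather than the real `analysis_dataset` and its output path.

```r
# Hypothetical stand-in for analysis_dataset (values are made up).
toy_export <- data.frame(
  patient_id       = c(1, 1, 2),
  instance         = c(1, 2, 1),
  wilson_adherence = c(90.5, NA, 77.8),
  GI_score         = c(3, 4, NA)
)

# Write to a temporary CSV, then read it back.
tmp <- tempfile(fileext = ".csv")
write.csv(toy_export, file = tmp, row.names = FALSE)
read_back <- read.csv(tmp)

# Same dimensions, same column names, and missing values preserved as NA.
stopifnot(
  identical(dim(read_back), dim(toy_export)),
  identical(names(read_back), names(toy_export)),
  all(is.na(read_back) == is.na(toy_export))
)

# Remove the temporary file.
unlink(tmp)
```

The same three checks could be pointed at the real CSV after export; they catch silent problems such as an extra row-name column or missing values written as empty strings.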
