Based on the manuscript “A comparative study of two-sample hypothesis tests in the presence of long-term survivors” (full_3__3__3.tex), the following methods were identified and implemented:
| Group | n | Events | Censored | Median Time | Max Time |
|---|---|---|---|---|---|
| Neoadjuvant-adjuvant | 155 | 42 | 113 | 10.90 | 36.00 |
| Adjuvant-only | 159 | 75 | 84 | 8.39 | 36.00 |
| Total | 314 | 117 | 197 | - | - |

| Method | Test Statistic | DF | P-Value | Significant |
|---|---|---|---|---|
| Log-rank test (standard) | 9.6455 | 1 | 0.001898 | ✓ |
| Early weighted log-rank (ρ=1, γ=0) | 7.1775 | 1 | 0.007382 | ✓ |
| Late weighted log-rank (ρ=0, γ=1) | 15.5901 | 1 | 0.000079 | ✓ |
| Optimal weighted log-rank (ρ=-1, γ=0) | 12.0471 | 1 | 0.000519 | ✓ |
| Yang-Prentice (YP) test | 15.7487 | 2 | 0.000380 | ✓ |
| Two-Stage (TS) test | 9.6455 | 1 | 0.001898 | ✓ |
| Mixture Cure Model LRT (Weibull) | 18.5452 | 2 | 0.000094 | ✓ |

Note: The original S1801 trial reported p = 0.004 from its planned log-rank test. Our reconstructed data yield p = 0.0019, which is somewhat smaller than, but consistent with, the published result.

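The p-values in the table can be checked against the reported statistics. The sketch below assumes each statistic is referred to a chi-square distribution with the listed degrees of freedom; the source does not state this, and the two-stage test in particular may obtain its p-value differently. `chi2_sf` is a hypothetical helper that covers only the 1- and 2-df cases appearing here, so that the standard library suffices.

```python
import math

def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail chi-square probability for df = 1 or 2 (closed forms)."""
    if df == 1:
        # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # For 2 df, the chi-square is exponential with mean 2.
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

# (statistic, df) pairs copied from the table above.
reported = {
    "Log-rank test (standard)": (9.6455, 1),
    "Yang-Prentice (YP) test": (15.7487, 2),
    "Mixture Cure Model LRT (Weibull)": (18.5452, 2),
}

p_values = {name: chi2_sf(stat, df) for name, (stat, df) in reported.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.6f}")
```

Under that assumption the recomputed values should agree with the P-Value column up to rounding of the printed statistics.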
| Parameter | Value |
|---|---|
| Cure fraction (Adjuvant-only) | 42.0% |
| Cure fraction (Neoadjuvant-adjuvant) | 68.2% |
| Weibull shape parameter | 1.4445 |
| Weibull scale (Adjuvant) | 9.14 |
| Weibull scale (Neoadjuvant) | 6.06 |

Interpretation: The neoadjuvant-adjuvant group has a substantially higher estimated cure fraction (68.2%) than the adjuvant-only group (42.0%), suggesting that pre-surgical treatment may improve long-term outcomes.

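To see what these estimates imply for the survival curves, the following sketch evaluates a mixture cure model of the form S(t) = π + (1 − π)·S_u(t). It assumes the Weibull latency is parameterized as S_u(t) = exp(−(t/scale)^shape); the source does not state the parameterization, so treat that as an assumption. `cure_survival` is a hypothetical helper name.

```python
import math

def cure_survival(t, cure_fraction, shape, scale):
    """Population survival under a Weibull mixture cure model."""
    # Survival of the uncured: assumed Weibull, S_u(t) = exp(-(t/scale)**shape).
    uncured = math.exp(-((t / scale) ** shape))
    # Cured patients never experience the event; uncured follow S_u.
    return cure_fraction + (1.0 - cure_fraction) * uncured

SHAPE = 1.4445  # shared Weibull shape from the table above
arms = {
    "Adjuvant-only": {"cure_fraction": 0.420, "scale": 9.14},
    "Neoadjuvant-adjuvant": {"cure_fraction": 0.682, "scale": 6.06},
}

for name, par in arms.items():
    for t in (0.0, 12.0, 36.0):
        s = cure_survival(t, par["cure_fraction"], SHAPE, par["scale"])
        print(f"{name}: S({t:g}) = {s:.3f}")
```

Under this form each curve starts at 1 and levels off at the arm's cure fraction as t grows, which is the plateau that motivates a cure model.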
| Model | Log-Likelihood | Parameters | AIC | Cure (Adj) | Cure (Neo) |
|---|---|---|---|---|---|
| Cure Model (Log-normal) | -481.77 | 5 | 973.54 | 42.1% | 67.8% |
| Cure Model (Weibull) | -494.81 | 5 | 999.61 | 42.0% | 68.2% |
| Cure Model (Exponential) | -504.48 | 4 | 1016.96 | 37.0% | 66.9% |

| Model | Log-Likelihood | Parameters | AIC |
|---|---|---|---|
| Log-normal | -509.20 | 3 | 1024.41 |
| Log-logistic | -515.47 | 3 | 1036.94 |
| Weibull AFT | -521.77 | 3 | 1049.54 |
| Exponential | -524.05 | 2 | 1052.11 |
| Cox PH | -617.20 | 1 | 1236.40 |

| Rank | Model | Type | AIC | ΔAIC |
|---|---|---|---|---|
| 1 | Cure Model (Log-normal) | Cure Model | 973.54 | 0.00 |
| 2 | Cure Model (Weibull) | Cure Model | 999.61 | 26.07 |
| 3 | Cure Model (Exponential) | Cure Model | 1016.96 | 43.42 |
| 4 | Log-normal | Standard Model | 1024.41 | 50.86 |
| 5 | Log-logistic | Standard Model | 1036.94 | 63.39 |
| 6 | Weibull AFT | Standard Model | 1049.54 | 76.00 |
| 7 | Exponential | Standard Model | 1052.11 | 78.56 |
| 8 | Cox PH | Standard Model | 1236.40 | 262.86 |
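
The ranking above follows from AIC = 2k − 2·logL, where k is the number of parameters. A minimal sketch that recomputes it from the tabulated log-likelihoods; because those are rounded to two decimals, the recomputed AIC and ΔAIC values can differ from the table in the last digit. The names `fits`, `aic`, and `ranking` are illustrative only.

```python
# (log-likelihood, number of parameters) copied from the tables above.
fits = {
    "Cure Model (Log-normal)": (-481.77, 5),
    "Cure Model (Weibull)": (-494.81, 5),
    "Cure Model (Exponential)": (-504.48, 4),
    "Log-normal": (-509.20, 3),
    "Log-logistic": (-515.47, 3),
    "Weibull AFT": (-521.77, 3),
    "Exponential": (-524.05, 2),
    "Cox PH": (-617.20, 1),
}

def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * loglik

# Sort models by AIC (smaller is better) and report the gap to the best one.
ranking = sorted(((name, aic(ll, k)) for name, (ll, k) in fits.items()),
                 key=lambda row: row[1])
best_aic = ranking[0][1]
for rank, (name, value) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: AIC = {value:.2f}, dAIC = {value - best_aic:.2f}")
```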

All seven hypothesis tests are statistically significant (p < 0.01), providing strong evidence for a treatment effect favoring the neoadjuvant-adjuvant arm.

The late weighted log-rank test shows the strongest evidence (p = 0.000079), suggesting that the treatment effect is most pronounced at later follow-up times, consistent with a cure model framework.

Cure models substantially outperform standard survival models based on AIC, with the log-normal cure model providing the best fit (AIC = 973.54 vs. 1024.41 for the best standard model). The Cox PH log-likelihood is presumably a partial likelihood, so its AIC is not directly comparable with those of the fully parametric models.

The estimated cure fraction is approximately 26 percentage points higher in the neoadjuvant-adjuvant arm (68.2% vs. 42.0%), indicating a meaningful increase in long-term survivors.

The best-fitting model is the log-normal cure model, which accounts for long-term survivors and allows the treatment to affect both the cure fraction and the survival distribution of uncured patients.
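
The early and late weighted log-rank tests reported above are indexed by (ρ, γ). The sketch below assumes these are Fleming-Harrington G(ρ, γ) weights, w(t) = S(t−)^ρ · (1 − S(t−))^γ, with S the pooled Kaplan-Meier estimate; the source does not define the weights, so this is an assumption. `fh_weight` is a hypothetical helper, and the two survival values are illustrative, not taken from the data.

```python
def fh_weight(surv_prev: float, rho: float, gamma: float) -> float:
    """Fleming-Harrington weight S(t-)**rho * (1 - S(t-))**gamma."""
    return (surv_prev ** rho) * ((1.0 - surv_prev) ** gamma)

# Illustrative pooled survival just before an early and a late event time
# (made-up values for demonstration only).
early_s, late_s = 0.95, 0.55

schemes = [("standard", 0, 0), ("early", 1, 0), ("late", 0, 1)]
for label, rho, gamma in schemes:
    w_early = fh_weight(early_s, rho, gamma)
    w_late = fh_weight(late_s, rho, gamma)
    print(f"{label} (rho={rho}, gamma={gamma}): "
          f"early weight = {w_early:.2f}, late weight = {w_late:.2f}")
```

With γ = 1 the weight grows as the pooled survival falls, so late differences between arms count more. Under this assumption, that is why a delayed separation of the curves would favor the late-weighted test.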

- hypothesis_test_results.csv - P-values and test statistics for all methods
- model_aic_comparison.csv - AIC comparison for cure and standard models