Simulation prompt — HIMALAYA (NCT03298451)

Simulate the operating characteristics of the HIMALAYA Phase 3 trial design. Source: AstraZeneca SAP D419CC00002, Edition 4.0, 30-JUL-2021 (kept locally as source_sap.pdf).

Trial

First-line (1L) unresectable advanced hepatocellular carcinoma. Open-label RCT with three arms after Amendment 4 closed the original Arm B:

Arm A — durvalumab monotherapy
Arm C — STRIDE: single priming dose of tremelimumab + durvalumab
Arm D — sorafenib (active control)

Allocation: 1:1:1 (A:C:D). Total N = 1,324 (~441 per arm). [Curator-supplied; redacted in SAP, taken from public results.]

Primary endpoint: overall survival (OS), randomization to death from any cause.

Hypothesis structure and α budget

Familywise α = 5% two-sided, strongly controlled.

Tag	Comparison	Test	α budget	Gating
ORR-IA1	A & C	ORR / DoR at IA1	0.001	— (out of scope)
H1	C vs D	OS superiority	0.049	Primary
H2	A vs D	OS non-inferiority (margin 1.08)	recycled from H1	After H1 success
H3	A vs D	OS superiority	recycled from H2	After H2 NI achieved
OS36	C vs D	3-yr OS rate	recycled from H3	After H1+H2+H3 all positive
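
The gating column describes a fixed-sequence procedure: a hypothesis can only be rejected if every hypothesis before it was rejected. A minimal sketch, assuming each test has already been reduced to a boolean "cleared its boundary at the recycled α" (the function name and dict layout are illustrative, not from the SAP):

```python
def apply_gating(passed):
    """Fixed-sequence gatekeeping in the order H1 -> H2 -> H3 -> OS36.

    `passed` maps each tag to True if its nominal test cleared its boundary.
    A hypothesis counts as rejected only if all earlier ones were rejected.
    """
    order = ["H1", "H2", "H3", "OS36"]
    rejected = {}
    gate_open = True
    for tag in order:
        rejected[tag] = gate_open and bool(passed.get(tag, False))
        gate_open = rejected[tag]
    return rejected


# H3 looks nominally positive here, but H2 failed, so H3 and OS36 are not claimed.
result = apply_gating({"H1": True, "H2": False, "H3": True, "OS36": True})
print(result)
```

Applying this after every replicate keeps the FWER calculation honest: a nominally significant H3 without H2 must not be counted as a rejection.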

Spending function: Lan-DeMets spending function approximating O’Brien-Fleming boundaries, applied over the two OS looks (IA2 and FA). If H1 is not rejected at IA2 but is rejected at FA, H2 is tested only at FA.

Survival assumptions

Arm D (sorafenib): exponential, median OS = 11.5 months.
Arm C (STRIDE) vs D: average HR = 0.70 with a 2-month delay in separation. Translate this into a hazard specification consistent with the SAP wording.
Arm A (durvalumab mono) vs D: HR = 0.84, proportional hazards (no delay specified in SAP).
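
One possible translation of the Arm C assumption, sketched under loudly stated assumptions: the hazard ratio is 1 for the first 2 months and constant afterwards, and "average HR" is read as an event-weighted geometric mean. The helper names and the `early_weight` value are hypothetical, and other readings of the SAP wording are equally defensible.

```python
import math
import random

LAMBDA_D = math.log(2) / 11.5   # Arm D monthly hazard implied by median OS = 11.5
DELAY = 2.0                     # months before the C and D curves separate


def late_hr(avg_hr, early_weight):
    """ASSUMPTION: average HR is an event-weighted geometric mean of HR = 1
    (before the delay) and a constant late HR (after it), so that
    log(avg_hr) = (1 - early_weight) * log(late HR)."""
    return math.exp(math.log(avg_hr) / (1.0 - early_weight))


def os_time_delayed(e, lam=LAMBDA_D, delay=DELAY, hr=1.0):
    """Invert the piecewise-exponential cumulative hazard at the value e.
    The hazard is lam before `delay` and lam * hr afterwards."""
    if e <= lam * delay:
        return e / lam
    return delay + (e - lam * delay) / (lam * hr)


rng = random.Random(2021)
hr_c = late_hr(0.70, early_weight=0.10)   # early_weight = 0.10 is a placeholder
t_c = [os_time_delayed(rng.expovariate(1.0), hr=hr_c) for _ in range(5)]   # Arm C
t_d = [rng.expovariate(LAMBDA_D) for _ in range(5)]                        # Arm D
t_a = [rng.expovariate(LAMBDA_D * 0.84) for _ in range(5)]                 # Arm A, PH
```

Passing a unit-exponential draw through the inverse cumulative hazard keeps the sampler exact for any late HR, which makes it easy to vary the post-delay slope while holding the average at 0.70.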

Trial conduct

Accrual: non-uniform over 22 months, total enrollment 1,324.
Follow-up after accrual ends: 15.5 months. Total study horizon: 37.5 months from FSR.
Dropout: none modeled (matches SAP). Censoring only at administrative cutoffs.
Stratification: SAP stratifies by etiology (HBV / HCV / other), ECOG (0 / 1), macrovascular invasion (Y / N). For this pilot, run the analysis unstratified.
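
The conduct assumptions above could be sketched as follows. The accrual shape is an assumption: the spec only says "non-uniform", so the power-law ramp and its exponent are placeholders, and the arm rotation is a stand-in for the real randomization scheme.

```python
import random

ACCRUAL_MONTHS = 22.0
N_TOTAL = 1324


def enrollment_times(n, rng, ramp=1.5):
    """Calendar enrollment times in months from FSR, with cumulative accrual
    proportional to (t / 22) ** ramp.
    ASSUMPTION: ramp = 1.5 is a placeholder; ramp = 1 gives uniform accrual."""
    return sorted(ACCRUAL_MONTHS * rng.random() ** (1.0 / ramp) for _ in range(n))


def observe(entry, os_time, cutoff):
    """Administrative censoring at a calendar cutoff, with no dropout.
    Returns (follow-up time, event indicator). Patients enrolled after the
    cutoff contribute zero follow-up and no event."""
    if entry + os_time <= cutoff:
        return os_time, 1
    return max(cutoff - entry, 0.0), 0


rng = random.Random(1)
entries = enrollment_times(N_TOTAL, rng)
# Simple 1:1:1 rotation over A, C, D (illustrative only).
arms = [("A", "C", "D")[i % 3] for i in range(N_TOTAL)]
```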

Information time and look schedule

Look	Trigger	C+D events	A+D events	Approx. calendar time	SAP-stated 2-sided α
IA2	~404 events in C+D	404	~453	~30 mo	H1: 0.0222 · H2: 0.0248
FA	~515 events in C+D	515	~560	~37.5 mo	H1: 0.0425 · H2: 0.0418
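
Because both looks are event-driven, the calendar time of each look is a random quantity in the simulation. A minimal sketch of locating it, assuming per-patient entry times, OS times, and arm labels are held in parallel lists (the function name is illustrative):

```python
def look_time(entries, os_times, arms, in_arms, n_events):
    """Calendar time (months from FSR) of the n-th death among the given arms.
    With no dropout, every death date is simply entry + OS time."""
    death_dates = sorted(
        e + t for e, t, a in zip(entries, os_times, arms) if a in in_arms
    )
    if len(death_dates) < n_events:
        raise ValueError("not enough patients to reach the event target")
    return death_dates[n_events - 1]


# Toy example: the second death among arms C and D happens at calendar month 5.
t_look = look_time([0, 1, 2, 3], [5, 1, 10, 2], ["C", "D", "A", "D"], ("C", "D"), 2)
print(t_look)
```

In the full simulation the IA2 cutoff would be `look_time(..., ("C", "D"), 404)`, and the A+D event count at that cutoff is then observed rather than fixed.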

Derive the boundaries from the observed event counts at each look using the Lan-DeMets O’Brien-Fleming spending function; treat the SAP α values as cross-checks only.
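
A sketch of that derivation for a two-look design, using only the standard library. It assumes the information fraction is the ratio of observed event counts and that the two log-rank Z statistics are bivariate normal with correlation sqrt(t) under the null. Function names are illustrative, and the result is not guaranteed to reproduce the SAP values, which is the point of the cross-check.

```python
import math
from statistics import NormalDist

ND = NormalDist()


def alpha_spent(t, alpha):
    """Lan-DeMets O'Brien-Fleming-type spending function:
    cumulative two-sided alpha spent at information fraction t."""
    return 2 * (1 - ND.cdf(ND.inv_cdf(1 - alpha / 2) / math.sqrt(t)))


def prob_cross_only_at_final(c1, c2, t, steps=400):
    """P(|Z1| < c1 and |Z2| >= c2) under H0 with corr(Z1, Z2) = sqrt(t).
    Integrates over Z1 with Simpson's rule (steps must be even)."""
    s, r = math.sqrt(t), math.sqrt(1 - t)
    h = 2 * c1 / steps
    total = 0.0
    for i in range(steps + 1):
        z = -c1 + i * h
        inside = ND.cdf((c2 - s * z) / r) - ND.cdf((-c2 - s * z) / r)
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * ND.pdf(z) * (1 - inside)
    return total * h / 3


def two_look_boundaries(events_ia, events_fa, alpha=0.049):
    """Nominal two-sided p-value thresholds (interim, final)."""
    t = events_ia / events_fa
    a1 = alpha_spent(t, alpha)
    c1 = ND.inv_cdf(1 - a1 / 2)
    lo, hi = 1.0, 5.0
    for _ in range(60):   # bisection on the final critical value
        mid = (lo + hi) / 2
        if prob_cross_only_at_final(c1, mid, t) > alpha - a1:
            lo = mid      # still spending too much, so raise the boundary
        else:
            hi = mid
    c2 = (lo + hi) / 2
    return a1, 2 * (1 - ND.cdf(c2))


p_ia2, p_fa = two_look_boundaries(404, 515)
print(f"H1 nominal alpha: IA2 = {p_ia2:.4f}, FA = {p_fa:.4f}")
```

For H2 the same call with the A+D event counts and whatever α has been recycled from H1 gives the values to set against 0.0248 / 0.0418.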

Analysis methods

OS: log-rank test (unstratified for this pilot); HR via Cox PH.
H2 NI: reject H0 if the upper limit of the two-sided α-adjusted CI for HR(A/D) is below 1.08.
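
A dependency-free sketch of both analyses. The log-rank part is the standard unstratified statistic. The HR and its CI are an assumption: they use the one-step score approximation log HR ≈ (O − E) / V with SE 1 / sqrt(V), swapped in for the Cox PH fit so the example runs without a survival library. Function names are illustrative.

```python
import math
from statistics import NormalDist


def logrank_oe_v(times, events, is_trt):
    """Return (O - E, V) for the treatment group, unstratified log-rank.
    `events` and `is_trt` are 0/1 indicators."""
    data = sorted(zip(times, events, is_trt))
    n_risk, n_trt = len(data), sum(is_trt)
    o_minus_e, var = 0.0, 0.0
    i = 0
    while i < len(data):
        t = data[i][0]
        j, d, d_trt = i, 0, 0
        while j < len(data) and data[j][0] == t:   # group ties at time t
            d += data[j][1]
            d_trt += data[j][1] * data[j][2]
            j += 1
        if d > 0 and n_risk > 1:
            p = n_trt / n_risk
            o_minus_e += d_trt - d * p
            var += d * p * (1 - p) * (n_risk - d) / (n_risk - 1)
        n_trt -= sum(row[2] for row in data[i:j])   # leave the risk set
        n_risk -= j - i
        i = j
    return o_minus_e, var


def ni_and_superiority(times, events, is_trt, alpha, margin=1.08):
    """ASSUMPTION: score-based HR approximation standing in for Cox PH."""
    oe, v = logrank_oe_v(times, events, is_trt)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    log_hr, se = oe / v, 1 / math.sqrt(v)
    upper = math.exp(log_hr + z_crit * se)
    return {
        "hr": math.exp(log_hr),
        "upper": upper,
        "ni": upper < margin,                        # H2: upper CI limit below 1.08
        "superior": oe / math.sqrt(v) < -z_crit,     # log-rank favours treatment
    }


oe, v = logrank_oe_v([1, 2, 3, 4], [1, 1, 1, 1], [0, 0, 1, 1])
res = ni_and_superiority([1, 2, 3, 4], [1, 1, 1, 1], [0, 0, 1, 1], alpha=0.05)
```

In a production run the score approximation would be replaced by an actual Cox fit; it is adequate here only because hazard ratios near 1 and large event counts keep the two close.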

Operating characteristics to compute

Use 1,000 replications per scenario.
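
With 1,000 replications, each empirical rate carries Monte Carlo error that matters when comparing against sponsor values. A small sketch of the binomial standard error (the function name is illustrative):

```python
import math


def mc_standard_error(p, n_reps=1000):
    """Binomial standard error of an empirical rejection rate."""
    return math.sqrt(p * (1 - p) / n_reps)


for label, p in [("FWER near 5%", 0.05), ("power near 85%", 0.85)]:
    se = mc_standard_error(p)
    print(f"{label}: SE = {se:.4f}, 95% half-width = {1.96 * se:.4f}")
```

A FWER estimate near 5% therefore has a 95% half-width of roughly 1.4 percentage points, so modest deviations from 5% are not evidence of inflation at this replication count.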

Planning alternative (HR(C/D) = 0.70 average with 2-month delay; HR(A/D) = 0.84):
- Empirical power, H1 at IA2 and at FA, plus cumulative.
- Empirical power, H2 NI at FA.
- Empirical power, H3 superiority at FA.
- Empirical probability of early stop at IA2 for H1.
Global null (HR = 1 for both C vs D and A vs D): empirical FWER under the alpha-recycling rule.
Boundary verification: OF-derived boundaries vs SAP’s 0.0222 / 0.0425 (H1) and 0.0248 / 0.0418 (H2/H3).
Calendar timing: distribution of months from FSR to IA2 and to FA. Compare to sponsor’s 30 / 37.5.
NPH-translation sensitivity for H1: rerun with average HR fixed at 0.70 but the post-delay slope varied to bracket the translation choice.
(Optional) MaxCombo sensitivity for H1: max{logrank, FH(0,1), FH(1,1)} per Karrison 2016 / He-Koch-Kurland 2021.

Quantity	Sponsor value
Power H1 at IA2	≥ 85%
Power H1 at FA	≥ 97%
Power H2 NI at FA	~ 84%
Time to IA2	~ 30 months
Time to FA	~ 37.5 months
Smallest detectable average HR (H1, FA)	0.84
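
The last row can be sanity-checked analytically. A minimal sketch, assuming 1:1 allocation between the two compared arms and the usual approximation Var(log HR) ≈ 4 / events (the function name is illustrative):

```python
import math
from statistics import NormalDist


def min_detectable_hr(n_events, alpha_two_sided):
    """Largest observed HR that still reaches significance, under 1:1
    allocation and Var(log HR) = 4 / n_events."""
    z = NormalDist().inv_cdf(1 - alpha_two_sided / 2)
    return math.exp(-2 * z / math.sqrt(n_events))


mdhr = min_detectable_hr(515, 0.0425)   # H1 at FA: 515 events, SAP alpha 0.0425
print(round(mdhr, 2))
```

With 515 events and the SAP-stated FA α of 0.0425 this lands close to the tabulated 0.84.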