Simulation prompt — HIMALAYA (NCT03298451)

Simulate the operating characteristics of the HIMALAYA Phase 3 trial design. Source: AstraZeneca SAP D419CC00002, Edition 4.0, 30-JUL-2021 (kept locally as source_sap.pdf).

Trial

Open-label, three-arm RCT in first-line (1L) unresectable advanced hepatocellular carcinoma (three arms remain after Amendment 4 closed the original Arm B):

Allocation: 1:1:1 (A:C:D). Total N = 1,324 (~441 per arm). [Curator-supplied; redacted in SAP, taken from public results.]

Primary endpoint: overall survival (OS), randomization to death from any cause.

Hypothesis structure and α budget

Familywise α = 5% two-sided, strongly controlled.

| Tag | Comparison | Test | α budget | Gating |
|---|---|---|---|---|
| ORR-IA1 | A & C | ORR / DoR at IA1 | 0.001 | none (out of scope) |
| H1 | C vs D | OS superiority | 0.049 | Primary |
| H2 | A vs D | OS non-inferiority (margin 1.08) | recycled from H1 | After H1 success |
| H3 | A vs D | OS superiority | recycled from H2 | After H2 NI achieved |
| OS36 | C vs D | 3-yr OS rate | recycled from H3 | After H1+H2+H3 all positive |

Spending function: Lan-DeMets approximation to O’Brien-Fleming boundaries, applied across IA2 and FA. If H1 fails at IA2 but succeeds at FA, H2 is tested at FA only.
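The gating chain can be sketched as a small decision function. This is a minimal illustration, not the SAP's algorithm: the function name `apply_gatekeeping` and the input format are hypothetical, OS36 is left out, and the assumption that H3 follows the same "no earlier than the look at which its gatekeeper succeeded" rule as H2 is ours, since the text above states that rule only for H2.

```python
LOOKS = ["IA2", "FA"]        # analysis order
CHAIN = ["H1", "H2", "H3"]   # fixed testing sequence; OS36 omitted in this sketch

def apply_gatekeeping(crossed):
    """Apply the fixed-sequence gating rule.

    crossed: set of (hypothesis, look) pairs whose nominal test crossed its
             own boundary at that look (hypothetical input format).
    Returns a dict {hypothesis: look at which it is formally rejected}.
    """
    rejected = {}
    first_open = 0  # index of the earliest look at which the next test may count
    for h in CHAIN:
        hit = next(
            (i for i in range(first_open, len(LOOKS)) if (h, LOOKS[i]) in crossed),
            None,
        )
        if hit is None:
            break  # chain stops: later hypotheses receive no recycled alpha
        rejected[h] = LOOKS[hit]
        first_open = hit  # a successor cannot be tested before its gatekeeper succeeds
    return rejected

# H1 succeeds only at FA, so a nominal H2 crossing at IA2 does not count.
print(apply_gatekeeping({("H1", "FA"), ("H2", "IA2")}))
```

In a simulation loop, `crossed` would be filled per replicate from the logrank and NI test results at each look, and the returned dict tallied across replicates to obtain empirical power and FWER.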

Survival assumptions

Trial conduct

Information time and look schedule

| Look | Trigger | C+D events | A+D events | Approx calendar | SAP-stated 2-sided α |
|---|---|---|---|---|---|
| IA2 | ~404 events in C+D | 404 | ~453 | ~30 mo | H1: 0.0222 · H2: 0.0248 |
| FA | ~515 events in C+D | 515 | ~560 | ~37.5 mo | H1: 0.0425 · H2: 0.0418 |

Derive boundaries from the observed event counts at each look using the Lan-DeMets O’Brien-Fleming spending function; the SAP α values serve as cross-checks only.
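A minimal sketch of that derivation for a two-look design, using only the Python standard library. It assumes the one-sided form of the Lan-DeMets O'Brien-Fleming-type spending function applied to half of the two-sided α, with the nominal two-sided level reported as twice the one-sided tail; the SAP's software may use a slightly different convention. The function names and the Simpson-plus-bisection solver are illustrative choices, not taken from the SAP.

```python
import math
from statistics import NormalDist

N = NormalDist()

def ld_of_spent(t, alpha_one_sided):
    """Cumulative one-sided alpha spent at information fraction t."""
    z = N.inv_cdf(1.0 - alpha_one_sided / 2.0)
    return 2.0 * (1.0 - N.cdf(z / math.sqrt(t)))

def two_look_nominal_alpha(events_ia, events_fa, alpha_two_sided):
    """Return nominal two-sided alpha at (IA, FA) from event counts."""
    a = alpha_two_sided / 2.0
    t = events_ia / events_fa             # information fraction at the interim
    a1 = ld_of_spent(t, a)                # one-sided alpha spent at the interim
    c1 = N.inv_cdf(1.0 - a1)              # interim critical value
    rho, sd = math.sqrt(t), math.sqrt(1.0 - t)

    def cross_only_at_fa(c2, n=1000, lo=-8.0):
        # P(Z1 < c1, Z2 >= c2) via Simpson's rule over z1, using
        # Z2 | Z1 = z ~ Normal(rho * z, 1 - t).
        h = (c1 - lo) / n
        total = 0.0
        for i in range(n + 1):
            z = lo + i * h
            w = 1 if i in (0, n) else (4 if i % 2 else 2)
            total += w * N.pdf(z) * (1.0 - N.cdf((c2 - rho * z) / sd))
        return total * h / 3.0

    # Bisection: the crossing probability decreases as c2 increases.
    lo_c, hi_c = 1.0, 4.0
    for _ in range(40):
        mid = 0.5 * (lo_c + hi_c)
        if cross_only_at_fa(mid) > a - a1:
            lo_c = mid
        else:
            hi_c = mid
    c2 = 0.5 * (lo_c + hi_c)
    return 2.0 * a1, 2.0 * (1.0 - N.cdf(c2))

h1_ia, h1_fa = two_look_nominal_alpha(404, 515, 0.049)   # C+D events
h2_ia, h2_fa = two_look_nominal_alpha(453, 560, 0.049)   # A+D events
print(round(h1_ia, 4), round(h1_fa, 4), round(h2_ia, 4), round(h2_fa, 4))
```

In the simulation, the planned counts 404/515 and 453/560 would be replaced by the event counts actually observed at each look in a given replicate.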

Analysis methods

Operating characteristics to compute

Use 1,000 replications per scenario.
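The replication count bounds how precisely each operating characteristic can be estimated. The sketch below computes the binomial Monte Carlo standard error for an estimated proportion; the helper name `mc_standard_error` is hypothetical, and the normal-approximation interval is an assumption that is reasonable at these rates and replication counts.

```python
import math

def mc_standard_error(p, n_rep=1000):
    """Standard error of an empirical proportion estimated from n_rep replicates."""
    return math.sqrt(p * (1.0 - p) / n_rep)

def mc_half_width(p, n_rep=1000, z=1.96):
    """Approximate 95% half-width (normal approximation)."""
    return z * mc_standard_error(p, n_rep)

for label, p in [("power near 0.85", 0.85), ("FWER near 0.05", 0.05)]:
    print(label, round(mc_standard_error(p), 4), round(mc_half_width(p), 4))
```

With 1,000 replications, an empirical FWER near 5% carries a half-width of a little over one percentage point, so modest departures from 0.05 under the global null are within simulation noise.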

  1. Planning alternative (HR(C/D) = 0.70 average with 2-month delay; HR(A/D) = 0.84):
    • Empirical power, H1 at IA2 and at FA, plus cumulative.
    • Empirical power, H2 NI at FA.
    • Empirical power, H3 superiority at FA.
    • Empirical probability of early stop at IA2 for H1.
  2. Global null (HR = 1 for both C vs D and A vs D): empirical FWER under the alpha-recycling rule.
  3. Boundary verification: OF-derived boundaries vs SAP’s 0.0222 / 0.0425 (H1) and 0.0248 / 0.0418 (H2/H3).
  4. Calendar timing: distribution of months from FSR (first subject randomized) to IA2 and to FA. Compare to sponsor’s 30 / 37.5.
  5. NPH-translation sensitivity for H1: rerun with average HR fixed at 0.70 but the post-delay slope varied to bracket the translation choice.
  6. (Optional) MaxCombo sensitivity for H1: max{logrank, FH(0,1), FH(1,1)} per Karrison 2016 / He-Koch-Kurland 2021.
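For scenarios 1 and 5, event times with a delayed treatment effect can be drawn by inverting a piecewise-constant cumulative hazard. The following is a minimal sketch, assuming exponential control-arm survival and a hazard ratio of 1 before the delay. The control median of 12 months and the post-delay hazard ratio of 0.65 are hypothetical placeholders, not values from the SAP, and how `hr_post` maps to an average HR of 0.70 is exactly the translation choice that scenario 5 is meant to bracket.

```python
import math
import random

def delayed_effect_time(e, lam, delay, hr_post):
    """Invert the cumulative hazard at a unit-exponential draw e.

    The hazard is lam up to `delay` (HR = 1) and lam * hr_post afterwards.
    """
    if e <= lam * delay:
        return e / lam
    return delay + (e - lam * delay) / (lam * hr_post)

def sample_arm(n, control_median, delay=2.0, hr_post=1.0, seed=0):
    """Draw n event times in months; hr_post=1.0 reproduces the control arm."""
    rng = random.Random(seed)
    lam = math.log(2.0) / control_median   # exponential hazard from the median
    return [
        delayed_effect_time(rng.expovariate(1.0), lam, delay, hr_post)
        for _ in range(n)
    ]

# Hypothetical inputs: 12-month control median, post-delay HR of 0.65.
control = sample_arm(441, control_median=12.0, seed=1)
treated = sample_arm(441, control_median=12.0, hr_post=0.65, seed=2)
```

Enrollment times, dropout, and administrative censoring at each look would be layered on top of these raw event times before the analysis step.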

Cross-check targets (sponsor-stated, for context — not pass/fail)

| Quantity | Sponsor value |
|---|---|
| Power H1 at IA2 | ≥ 85% |
| Power H1 at FA | ≥ 97% |
| Power H2 NI at FA | ~ 84% |
| Time to IA2 | ~ 30 months |
| Time to FA | ~ 37.5 months |
| Smallest detectable average HR (H1, FA) | 0.84 |
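Before running the full simulation, these targets can be sanity-checked analytically. The sketch below uses the Schoenfeld approximation under proportional hazards with 1:1 allocation, evaluated one look at a time with the SAP-stated nominal α values. It ignores the 2-month delay and the accumulation of power across looks, so it is a rough plausibility check and not a substitute for the simulated values; the function names are illustrative.

```python
import math
from statistics import NormalDist

N = NormalDist()

def z_crit(alpha_two_sided):
    """Critical value for a nominal two-sided alpha."""
    return N.inv_cdf(1.0 - alpha_two_sided / 2.0)

def schoenfeld_power(events, hr_true, alpha_two_sided, hr_margin=1.0):
    """Single-look power to show HR < hr_margin (margin 1.0 = superiority)."""
    drift = math.sqrt(events) / 2.0 * math.log(hr_margin / hr_true)
    return N.cdf(drift - z_crit(alpha_two_sided))

def min_detectable_hr(events, alpha_two_sided, hr_margin=1.0):
    """Largest observed HR that still crosses the boundary."""
    return hr_margin * math.exp(-2.0 * z_crit(alpha_two_sided) / math.sqrt(events))

power_h1_ia2 = schoenfeld_power(404, 0.70, 0.0222)
power_h1_fa = schoenfeld_power(515, 0.70, 0.0425)
power_h2_fa = schoenfeld_power(560, 0.84, 0.0418, hr_margin=1.08)
mdhr_h1_fa = min_detectable_hr(515, 0.0425)
print(power_h1_ia2, power_h1_fa, power_h2_fa, mdhr_h1_fa)
```

Large gaps between these approximations and the simulated operating characteristics would point to the delayed-effect translation or the censoring model rather than to the boundaries themselves.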