Code
pacman::p_load(tidyverse, dplyr, ggplot2, rmarkdown, purrr)Budget Impact Analyses (BIAs) estimate how a healthcare budget holder’s spending might change if they decide to cover a new medical intervention or modify an existing policy. These analyses assess financial effects at a population level, helping decision-makers understand the overall cost implications (Von Hein, 2025). In the pharmaceutical industry, a Budget Impact Analysis (BIA) helps evaluate the financial consequences of introducing a new drug. For example, if a manufacturer (e.g., Johnson and Johnson or Merck ) launches a higher-cost cancer therapy, the BIA may seek to answer the following questions.
Cost of the New Treatment: What’s the price of the new drug or technology?
How Many People Will Use It?: How many patients are expected to need it?
What Does It Replace?: Will it substitute an existing treatment (saving money) or be an add-on (increasing costs)?
Total Budget Impact: Will the overall spending go up, down, or stay the same?
With these guiding questions in mind, we will implement a BIA in R following the parameters and key assumptions in Mirko Von Hein’s Video. Link to the youtube
Let’s begin by loading the required packages. I like to use pacman because it does not require me to call the libraries all over again after installing the packages.
pacman::p_load(tidyverse, dplyr, ggplot2, rmarkdown, purrr)In this walk through tutorial, our aim it to assess the budget impact of introducing Drug A into the market against an existing Standard of Care (SoC). To model this scenario, one can assume a situation where Drug A is available against a situation without Drug A. The goal here is to estimate the budget impact at the national level over a period of 10 years.
Lets begin by closely following the cost inputs from Mirko’s model. We know there are two main costs: 1. Treatment Related Cost 2 Conditions Related Costs.
We assume the drug acquisition cost for A and SoC is 500 and 300 respectively. We also assume that treatment(SoC) is oral so there is not admin cost but for drug A there is an admin cost of 200 for administering injection. We also assume there is a monitoring cost for 100 and 50 for both drug A and SoC. The share of patient with adverse event due to drug A is 5% while the share of patients with rare side effects for SoC is 2%
cost_components <- data.frame(
Component = c("Drug", "Admin", "Monitor", "AE", "Hospitalization", "Outpatient"),
Cost_A = c(500, 200, 100, 5, 0, 175),
Cost_SoC = c(300, 0, 50, 60, 75, 175)
)
total_cost <- data.frame(
Treatment = c("Drug A", "SoC"),
Total_Cost = c(colSums(cost_components[, c("Cost_A", "Cost_SoC")]))
)
print(total_cost) Treatment Total_Cost
Cost_A Drug A 980
Cost_SoC SoC 660
From this, we can compute the total cost per year per patient for Drug A and SoC respectively as $960 and $660.
We also define the epidemiological landscape (Incidence and Prevalence): The U.S. population of 335 million serves as our baseline. Current cancer prevalence is approximately 6 million patients (1.8% of the population), with about 1.96 million new cases diagnosed annually. Focusing on treatment-eligible populations, we estimate 1.51 million patients are in active treatment, 904,500 receive systemic therapy, and 361,800 require second- or third-line treatments. These figures, drawn from SEER and American Cancer Society data
(Based on SEER, CDC, and ACS data)
| Cascade Step | Assumption | Population Size | Notes |
|---|---|---|---|
| Total US Population | - | 335,000,000 | (2025 estimate) |
| Cancer Prevalence | 1.8% | 6,030,000 | ~18 million survivors (5.4% lifetime) |
| Active Treatment Cases | 25% | 1,507,500 | Patients currently in therapy |
| On Systemic Therapy | 60% | 904,500 | Chemo/targeted/immunotherapy |
| On 2nd/3rd Line Therapy | 40% | 361,800 | Progressive/refractory disease |
# Cancer Cascade
cancer_cascade <- data.frame(
Step = c("Total US", "Prevalence", "Active Treatment", "Systemic Therapy", "2nd/3rd Line"),
Assumption = c(NA, "1.8%", "25%", "60%", "40%"),
Population = c(335000000, 6030000, 1507500, 904500, 361800)
)
# Eligible Patients Over 5 Years
annual_incident <- 150000
years <- 0:5
eligible_patients <- 361800 + (annual_incident * years)
# Print tables
print(cancer_cascade) Step Assumption Population
1 Total US <NA> 335000000
2 Prevalence 1.8% 6030000
3 Active Treatment 25% 1507500
4 Systemic Therapy 60% 904500
5 2nd/3rd Line 40% 361800
print(data.frame(Year = years, Eligible_Patients = eligible_patients)) Year Eligible_Patients
1 0 361800
2 1 511800
3 2 661800
4 3 811800
5 4 961800
6 5 1111800
Now that we have determined the eligible patient population, we can now build the market share for a situation with Drug A as against a situation without Drug A
No_New_Drug <- data.frame(
Year = c("Year 0", "Year 1", "Year 2", "Year 3", "Year 4", "Year 5"),
Drug_A = c(0, 0, 0, 0, 0, 0),
SoC = c(1.0, 1.0, 1.0, 1.0, 1.0, 1.0) # 100% as 1.0
)
With_New_Drug <- data.frame(
Year = c("Year 0", "Year 1", "Year 2", "Year 3", "Year 4", "Year 5"),
Drug_A = c(0, 10, 25, 40, 60, 80), # Market penetration %
SoC = c(1.0, 0.9, 0.8, 0.75, 0.6, 0.2) # 90% as 0.9, etc.
)
print(No_New_Drug) Year Drug_A SoC
1 Year 0 0 1
2 Year 1 0 1
3 Year 2 0 1
4 Year 3 0 1
5 Year 4 0 1
6 Year 5 0 1
print(With_New_Drug) Year Drug_A SoC
1 Year 0 0 1.00
2 Year 1 10 0.90
3 Year 2 25 0.80
4 Year 3 40 0.75
5 Year 4 60 0.60
6 Year 5 80 0.20
The assumption is with the introduction of the new treatment into the market, drug A will begin to take some of the market from SoC over time.
unit_cost_A <- sum(cost_components$Cost_A)
unit_cost_SoC <- sum(cost_components$Cost_SoC)
# Step 4: Loop over years to calculate total cost
total_costs <- data.frame(
Year = paste0("Year ", 0:5),
Eligible = eligible_patients,
DrugA_Share = With_New_Drug$Drug_A,
SoC_Share = With_New_Drug$SoC
)
total_costs$Cost_DrugA <- total_costs$Eligible * total_costs$DrugA_Share * unit_cost_A
total_costs$Cost_SoC <- total_costs$Eligible * total_costs$SoC_Share * unit_cost_SoC
total_costs$Total_Cost <- total_costs$Cost_DrugA + total_costs$Cost_SoC
# View result
print(total_costs[, c("Year", "Eligible", "Cost_DrugA", "Cost_SoC", "Total_Cost")]) Year Eligible Cost_DrugA Cost_SoC Total_Cost
1 Year 0 361800 0 238788000 238788000
2 Year 1 511800 5015640000 304009200 5319649200
3 Year 2 661800 16214100000 349430400 16563530400
4 Year 3 811800 31822560000 401841000 32224401000
5 Year 4 961800 56553840000 380872800 56934712800
6 Year 5 1111800 87165120000 146757600 87311877600
Similarly, we can create one for a world with the new drug
# Step 0: Define components and cost structure
cost_components <- data.frame(
Component = c("Drug", "Admin", "Monitor", "AE", "Hospitalization", "Outpatient"),
Cost_A = c(500, 200, 100, 5, 0, 175),
Cost_SoC = c(300, 0, 50, 60, 75, 175)
)
unit_cost_A <- sum(cost_components$Cost_A)
unit_cost_SoC <- sum(cost_components$Cost_SoC)
# Step 1: Define population and market shares
years <- 0:5
eligible_patients <- 361800 + 150000 * years
With_New_Drug <- data.frame(
Year = paste0("Year ", years),
Drug_A = c(0, 0.10, 0.25, 0.40, 0.60, 0.80),
SoC = c(1.0, 0.90, 0.75, 0.60, 0.40, 0.20)
)
# Step 2: Calculate patient splits
drugA_patients <- eligible_patients * With_New_Drug$Drug_A
soc_patients <- eligible_patients * With_New_Drug$SoC
# Step 3: Compute cost per component per year
calculate_component_costs <- function(patients, cost_vector) {
sapply(1:nrow(cost_components), function(i) {
patients * cost_vector[i]
})
}
component_names <- cost_components$Component
cost_A_matrix <- t(mapply(calculate_component_costs, patients = drugA_patients,
MoreArgs = list(cost_vector = cost_components$Cost_A)))
cost_SoC_matrix <- t(mapply(calculate_component_costs, patients = soc_patients,
MoreArgs = list(cost_vector = cost_components$Cost_SoC)))
# Step 4: Sum across components
total_cost_A <- rowSums(cost_A_matrix)
total_cost_SoC <- rowSums(cost_SoC_matrix)
total_cost_combined <- total_cost_A + total_cost_SoC
# Step 5: Assemble final table
final_costs <- data.frame(
Year = paste0("Year ", years),
Eligible_Patients = eligible_patients,
DrugA_Patients = drugA_patients,
SoC_Patients = soc_patients,
Cost_DrugA = round(total_cost_A),
Cost_SoC = round(total_cost_SoC),
Total_Cost = round(total_cost_combined)
)
print(final_costs) Year Eligible_Patients DrugA_Patients SoC_Patients Cost_DrugA Cost_SoC
1 Year 0 361800 0 361800 0 238788000
2 Year 1 511800 51180 460620 50156400 304009200
3 Year 2 661800 165450 496350 162141000 327591000
4 Year 3 811800 324720 487080 318225600 321472800
5 Year 4 961800 577080 384720 565538400 253915200
6 Year 5 1111800 889440 222360 871651200 146757600
Total_Cost
1 238788000
2 354165600
3 489732000
4 639698400
5 819453600
6 1018408800
# Reuse earlier values
total_eligible <- eligible_patients # 6 values: Year 0 to Year 5
total_cost_WITH <- final_costs$Total_Cost # From With_New_Drug block
# World WITHOUT: all patients on SoC
unit_cost_SoC <- sum(cost_components$Cost_SoC)
total_cost_WITHOUT <- eligible_patients * unit_cost_SoC
# Budget impact
budget_impact_total <- total_cost_WITH - total_cost_WITHOUT
# Per-patient costs
per_patient_WITH <- total_cost_WITH / eligible_patients
per_patient_WITHOUT <- total_cost_WITHOUT / eligible_patients
budget_impact_per_patient <- per_patient_WITH - per_patient_WITHOUT
# Assemble results
budget_summary <- data.frame(
Year = paste0("Year ", years),
Eligible_Patients = eligible_patients,
Cost_WITHOUT = round(total_cost_WITHOUT),
Cost_WITH = round(total_cost_WITH),
Budget_Impact = round(budget_impact_total),
CostPerPatient_WITHOUT = round(per_patient_WITHOUT, 2),
CostPerPatient_WITH = round(per_patient_WITH, 2),
BudgetImpact_PerPatient = round(budget_impact_per_patient, 2)
)
print(budget_summary) Year Eligible_Patients Cost_WITHOUT Cost_WITH Budget_Impact
1 Year 0 361800 238788000 238788000 0
2 Year 1 511800 337788000 354165600 16377600
3 Year 2 661800 436788000 489732000 52944000
4 Year 3 811800 535788000 639698400 103910400
5 Year 4 961800 634788000 819453600 184665600
6 Year 5 1111800 733788000 1018408800 284620800
CostPerPatient_WITHOUT CostPerPatient_WITH BudgetImpact_PerPatient
1 660 660 0
2 660 692 32
3 660 740 80
4 660 788 128
5 660 852 192
6 660 916 256
# Total cost by component
total_by_component <- colSums(cost_A_matrix + cost_SoC_matrix)
# Cost per patient (total over 6 years / total patients)
total_patients <- sum(eligible_patients)
per_patient_by_component <- total_by_component / total_patients
# Display
detailed_summary <- data.frame(
Component = cost_components$Component,
Total_Cost = round(total_by_component),
Per_Patient_Cost = round(per_patient_by_component, 2)
)
print(detailed_summary) Component Total_Cost Per_Patient_Cost
1 Drug 1727814000 390.84
2 Admin 401574000 90.84
3 Monitor 321433500 72.71
4 AE 154815150 35.02
5 Hospitalization 180969750 40.94
6 Outpatient 773640000 175.00
# Assume budget_summary exists from earlier step
bim_plot_data <- budget_summary %>%
select(Year, Cost_WITH, Cost_WITHOUT, Budget_Impact) %>%
pivot_longer(cols = -Year, names_to = "Scenario", values_to = "Cost")
ggplot(bim_plot_data, aes(x = Year, y = Cost / 1e6, fill = Scenario)) +
geom_col(position = "dodge") +
labs(title = "Budget Impact Results Per Year",
y = "Cost (£ millions)", x = "Year") +
scale_fill_manual(values = c("Cost_WITH" = "steelblue",
"Cost_WITHOUT" = "grey60",
"Budget_Impact" = "firebrick")) +
theme_minimal()# Assume detailed_summary from earlier exists
ggplot(detailed_summary, aes(x = Component, y = Total_Cost / 1e6, fill = Component)) +
geom_col(position = "dodge") +
labs(title = "Detailed Cost by Cost Type",
y = "Total Cost (£ millions)", x = "Cost Component") +
theme_minimal() +
theme(legend.position = "none")patient_data <- data.frame(
Year = paste0("Year ", 0:5),
Eligible_Patients = eligible_patients
)
ggplot(patient_data, aes(x = Year, y = Eligible_Patients)) +
geom_line(group = 1, color = "darkgreen", size = 1.2) +
geom_point(color = "darkgreen") +
labs(title = "Predicted Eligible Patient Population Over Time",
y = "Patients", x = "Year") +
theme_minimal()Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
market_uptake_long <- With_New_Drug %>%
pivot_longer(cols = c(Drug_A, SoC), names_to = "Treatment", values_to = "Share")
ggplot(market_uptake_long, aes(x = Year, y = Share, color = Treatment)) +
geom_line(size = 1.2) +
geom_point() +
labs(title = "Market Uptake Over Time",
y = "Market Share (%)", x = "Year") +
theme_minimal()`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
ggplot(patient_data,
aes(x = Year, y = Eligible_Patients)) +
geom_col(fill = "skyblue") +
labs(title = "Eligible Patients Over Time", y = "Eligible Patients", x = "Year") +
theme_minimal()Validate the inputs you are considering
Check standardized templates
Follow proper guidelines as issued by ISPOR
Make the model as simple and interactive as possible.