Data source: Diabetes Prevention Impact Toolkit (CDC; US-based)
Reference: https://nccd.cdc.gov/Toolkit/DiabetesImpact/
This report evaluates late-stage prediabetes prevention savings based on simulated projections from the National Diabetes Prevention Program (NDP) (https://www.cdc.gov/diabetes-prevention/)
Summary of our main assumptions:
Additional details behind default assumptions: https://nccd.cdc.gov/Toolkit/DiabetesImpact/docs/input-checklist_state.pdf
# Load required package
library(knitr)
library(tidyverse)
library(patchwork)
# Create data frame
inputs <- data.frame(
Toolkit_Inputs = c(
"<span style='color:red'><b>Population</b></span>",
"Total Adult Population",
"Eligible Persons",
"Completed Screening",
"Participate in Intervention",
"<span style='color:red'><b>Age Breakdown</b></span>",
" % Age 45–64",
" % Age 65–74",
" % Age 75+",
"<span style='color:red'><b>Sex Breakdown</b></span>",
" % Male",
" % Female",
"<span style='color:red'><b>Race Breakdown</b></span>",
" % White (Non-Hispanic)",
" % Black (Non-Hispanic)",
" % Hispanic",
" % Asian",
" % Other Race",
"<span style='color:red'><b>Body Weight Breakdown</b></span>",
" % Obese (BMI ≥ 30.0)",
" % Overweight (BMI ≥ 24.0 and < 30.0)",
"<span style='color:red'><b>Prediabetes Screening Participation</b></span>",
"Persons with Prediabetes",
"No new screenings for prediabetes",
"% Eligible previously screened",
"% Screened who participate",
"<span style='color:red'><b>Prevention Cost Assumptions </b></span>",
"Prevention Program Cost",
"Costs in Year of Diagnosis",
"Costs per Year After Diagnosis",
"Discount Rate",
"<span style='color:red'><b>Productivity Loss Assumptions </b></span>",
"Days Missed per Year",
"Daily Earnings (with diabetes)"
),
Values_Used_in_Analysis = c(
"",
"100,000",
"16,386",
"7,538",
"2,638",
"",
"100.00%",
"0.00%",
"0.00%",
"",
"53.82%",
"46.18%",
"",
"66.75%",
"10.36%",
"14.99%",
"5.10%",
"2.80%",
"",
"40.00%",
"60.00%",
"",
"6.20%",
"Checked",
"46.00%",
"35.00%",
"",
"$417.00",
"$6,425.00",
"$3,900.00",
"3.00%",
"",
"3.3",
"$276.00"
)
)
# Print nicely formatted markdown table
kable(inputs, format = "markdown", col.names = c("**Toolkit Inputs**", "**Values Used in Analysis**"))| Toolkit Inputs | Values Used in Analysis |
|---|---|
| Population | |
| Total Adult Population | 100,000 |
| Eligible Persons | 16,386 |
| Completed Screening | 7,538 |
| Participate in Intervention | 2,638 |
| Age Breakdown | |
| % Age 45–64 | 100.00% |
| % Age 65–74 | 0.00% |
| % Age 75+ | 0.00% |
| Sex Breakdown | |
| % Male | 53.82% |
| % Female | 46.18% |
| Race Breakdown | |
| % White (Non-Hispanic) | 66.75% |
| % Black (Non-Hispanic) | 10.36% |
| % Hispanic | 14.99% |
| % Asian | 5.10% |
| % Other Race | 2.80% |
| Body Weight Breakdown | |
| % Obese (BMI ≥ 30.0) | 40.00% |
| % Overweight (BMI ≥ 24.0 and < 30.0) | 60.00% |
| Prediabetes Screening Participation | |
| Persons with Prediabetes | 6.20% |
| No new screenings for prediabetes | Checked |
| % Eligible previously screened | 46.00% |
| % Screened who participate | 35.00% |
| Prevention Cost Assumptions | |
| Prevention Program Cost | $417.00 |
| Costs in Year of Diagnosis | $6,425.00 |
| Costs per Year After Diagnosis | $3,900.00 |
| Discount Rate | 3.00% |
| Productivity Loss Assumptions | |
| Days Missed per Year | 3.3 |
| Daily Earnings (with diabetes) | $276.00 |
This section analyzes projected medical costs for 2,638 individuals with late stage prediabetes under two scenarios:
See ‘📝 Notes’ section for diabetes-specific medical cost estimate calculation and reference
df_med <- data.frame(
Year = 1:10,
No_Intervention = c(6001, 12024, 18047, 24055, 30128, 36168, 42181, 48163, 54104, 60007),
With_Intervention = c(5864, 11770, 17670, 23588, 29571, 35519, 41438, 47325, 53171, 58977)
)
df_med <- df_med %>%
mutate(Difference = No_Intervention - With_Intervention)
# 2. Reshape for Plot 1
df_med_long <- df_med %>%
pivot_longer(cols = c(No_Intervention, With_Intervention),
names_to = "Type", values_to = "Cost") %>%
mutate(Type = recode(Type,
"No_Intervention" = "No Prevention",
"With_Intervention" = "With Prevention"))
df_labels <- df_med %>%
mutate(
Label = paste0("+$", format(Difference, big.mark = ",")),
Y_Position = pmax(No_Intervention, With_Intervention) + 1000,
X_Position = as.numeric(factor(Year)) + 0.2
)
# 3. Plot 1: Medical Costs per Year
plot1 <- ggplot(df_med_long, aes(x = factor(Year), y = Cost, fill = Type)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.9), width = 0.8) +
geom_label(data = df_labels,
aes(x = X_Position, y = Y_Position, label = Label),
fill = "#FDEDEC",
color = "black",
show.legend = FALSE,
size = 3.5,
label.size = 0.2) +
scale_fill_manual(
name = "",
values = c("No Prevention" = "#D55E00", "With Prevention" = "#0072B2")
) +
labs(
title = "",
subtitle = "T2D-specific medical costs over 10y (no prevention vs. prevention)",
x = "Year",
y = "Annual Medical Cost ($ USD)"
) +
theme_classic(base_size = 14) +
theme(plot.margin = margin(10, 20, 10, 20),
legend.position = "bottom")
plot1Vlisualizes annual net medical savings from prevention (derived from plot1) over 10-year projected horizon, accounting for the $ cost of prevention (NDP).
✅ Green bar: shows prevention savings* discounted back to Year 0 using a 3% annual discount rate to reflect time value of money.
⚠️ Yellow bar shows cost of prevention assumed $417/year(with 3% annual discount rate[to discuss if discounting here makes sense]
Labels above bars show difference in discounted prevention savings and prevention costs, with green (prevention savings > cost) and red (prevention cost > savings).
Diffference in savings (prevention savings vs. prevention costs) is calculated per participant.
library(tidyverse)
library(patchwork)
library(tidyverse)
library(ggplot2)
library(knitr)
library(kableExtra)
# 4. Add discounting for Plot 2 and Table
discount_rate <- 0.03
df_med_discounted <- df_med %>%
mutate(
Intervention_Cost = 417,
Discount_Factor = 1 / (1 + discount_rate) ^ Year,
PV_Savings = Difference * Discount_Factor,
PV_Intervention_Cost = Intervention_Cost * Discount_Factor,
Net_Benefit = PV_Savings - PV_Intervention_Cost
)
df_med_long_discounted <- df_med_discounted %>%
select(Year, PV_Savings, PV_Intervention_Cost) %>%
pivot_longer(cols = c(PV_Savings, PV_Intervention_Cost),
names_to = "Type", values_to = "Amount") %>%
mutate(Type = recode(Type,
"PV_Savings" = "Prevention Savings",
"PV_Intervention_Cost" = "Prevention Cost"))
# Labels: include **negative net benefit** explicitly
# Update label data with custom colors
df_labels_discounted <- df_med_discounted %>%
mutate(
Label = ifelse(Net_Benefit >= 0,
paste0("+$", round(Net_Benefit, 0)),
paste0("-$", abs(round(Net_Benefit, 0)))),
Y_Position = pmax(PV_Savings, PV_Intervention_Cost) + 100,
X_Position = as.numeric(factor(Year)) + 0.2,
Fill_Color = ifelse(Net_Benefit < 0, "#FDEDEC", "#D4EFDF"), # pink for negative
Label_Color = ifelse(Net_Benefit < 0, "red", "black")
)
# Plot 2: Avoid double fill scale
plot2 <- ggplot(df_med_long_discounted, aes(x = factor(Year), y = Amount, fill = Type)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.9), width = 0.8) +
# Labels with custom background fill and text color
geom_label(data = df_labels_discounted,
aes(x = X_Position, y = Y_Position, label = Label),
fill = df_labels_discounted$Fill_Color,
color = df_labels_discounted$Label_Color,
inherit.aes = FALSE,
show.legend = FALSE,
size = 3.5,
label.size = 0.2) +
scale_fill_manual(
name = "",
values = c("Prevention Savings" = "#009E73", "Prevention Cost" = "#E69F00")
) +
labs(
title = "",
subtitle = "",
x = "Year",
y = "Discounted $ USD"
) + theme_classic(base_size = 14) +
theme(plot.margin = margin(10, 20, 10, 20),
legend.position = "bottom")
# 6. Combine Plots Side by Side
plot2df_table <- df_med_discounted %>%
mutate(
`Year` = Year,
`PV Savings` = paste0("$", round(PV_Savings, 1)),
`PV Cost` = paste0("$", round(PV_Intervention_Cost, 1)),
`Net Benefit` = ifelse(Net_Benefit > 0,
paste0("+$", round(Net_Benefit, 1), " (savings)"),
paste0("-", abs(round(Net_Benefit, 1)), " (loss)"))
) %>%
select(Year, `PV Savings`, `PV Cost`, `Net Benefit`)
kable(df_table, align = "c",
caption = "Table 1: Medical Cost Savings & Net Benefit by Year (per individual)") %>%
kable_styling(full_width = FALSE, position = "left", font_size = 14)| Year | PV Savings | PV Cost | Net Benefit |
|---|---|---|---|
| 1 | $133 | $404.9 | -271.8 (loss) |
| 2 | $239.4 | $393.1 | -153.6 (loss) |
| 3 | $345 | $381.6 | -36.6 (loss) |
| 4 | $414.9 | $370.5 | +$44.4 (savings) |
| 5 | $480.5 | $359.7 | +$120.8 (savings) |
| 6 | $543.5 | $349.2 | +$194.3 (savings) |
| 7 | $604.1 | $339.1 | +$265.1 (savings) |
| 8 | $661.5 | $329.2 | +$332.3 (savings) |
| 9 | $715.1 | $319.6 | +$395.5 (savings) |
| 10 | $766.4 | $310.3 | +$456.1 (savings) |
# Calculate summary statistics
# If df_table exists and Net Benefit column is in string format:
library(dplyr)
library(stringr)
# Extract numeric values from Net Benefit column
net_benefits <- df_table %>%
mutate(Net_Benefit_Num = as.numeric(str_extract(`Net Benefit`, "-?\\d+\\.?\\d*"))) %>%
pull(Net_Benefit_Num)
# Compute summary statistics
summary_stats <- tibble(
Mean_Net_Benefit = mean(net_benefits),
Median_Net_Benefit = median(net_benefits),
SD_Net_Benefit = sd(net_benefits)
)
# Format with dollar signs and 2 decimal places
summary_stats_fmt <- summary_stats %>%
mutate(across(everything(), ~ sprintf("$%.2f", .)))
# Display the table
kable(summary_stats_fmt,
col.names = c("Mean Net Benefit", "Median Net Benefit", "Standard Deviation"),
align = c("r", "r", "r"),
caption = "Table 2: Aggregated Net Benefit Over 10 Years (per individual)") %>%
kable_styling(full_width = FALSE, position = "left", font_size = 14)| Mean Net Benefit | Median Net Benefit | Standard Deviation |
|---|---|---|
| $134.65 | $157.55 | $239.67 |
How is medical cost calculated?
To discuss next if it makes sense to discount prevention program costs.
Compares annual productivity loss in $ per participant across two scenarios (no T2D prevention vs. prevention) over a 10-year projected horizon.
Bar chart visualizes raw undiscounted costs of productivity loss as the cumulative difference between both scenarios (per participant).
Plot 1 does not account for cost of prevention.
CDC assumes:
# 1. Base Productivity Data
df_prod <- data.frame(
Year = 1:10,
No_Intervention = c(55, 157, 301, 480, 689, 924, 1181, 1455, 1744, 2045),
With_Intervention = c(35, 110, 222, 373, 556, 768, 1003, 1259, 1531, 1816)
)
df_prod <- df_prod %>%
mutate(Difference = No_Intervention - With_Intervention)
# 2. Reshape for Plot 1
df_prod_long <- df_prod %>%
pivot_longer(cols = c(No_Intervention, With_Intervention),
names_to = "Type", values_to = "Cost") %>%
mutate(Type = recode(Type,
"No_Intervention" = "No Prevention",
"With_Intervention" = "With Prevention"))
df_labels_prod <- df_prod %>%
mutate(
Label = paste0("+$", Difference),
Y_Position = pmax(No_Intervention, With_Intervention) + 100,
X_Position = as.numeric(factor(Year)) + 0.2
)
# === Plot 1: Raw Productivity Costs ===
plot1_prod <- ggplot(df_prod_long, aes(x = factor(Year), y = Cost, fill = Type)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.9), width = 0.8) +
geom_label(data = df_labels_prod,
aes(x = X_Position, y = Y_Position, label = Label),
fill = "#FDEDEC",
color = "black",
show.legend = FALSE,
size = 3.5,
label.size = 0.2) +
scale_fill_manual(
name = "Scenario",
values = c("No Prevention" = "#D55E00", "With Prevention" = "#0072B2")
) +
labs(
title = "",
subtitle = "",
x = "Year",
y = "Annual Productivity Cost ($ USD)"
) +
theme_classic(base_size = 14) +
theme(plot.margin = margin(10, 20, 10, 20),
legend.position = "bottom")
plot1_prodVlisualizes annual net productivity savings from prevention over 10-year projected horizon, accounting for the cost of prevention (NDP).
⚠️ Yellow bar shows cost of prevention assumed $417/year(with 3% annual discount rate)
✅ Green bar: shows productivity-specific savings* *discounted back to Year 0 using a 3% annual discount rate** to reflect time value of money.
Labels above the bars show the difference in discounted prevention savings on productivity versus prevention costs, with red (prevention cost > productivity savings).
# === Plot 2: Discounted Productivity Costs ===
discount_rate <- 0.03
df_prod_discounted <- df_prod %>%
mutate(
Intervention_Cost = 417,
Discount_Factor = 1 / (1 + discount_rate) ^ Year,
PV_Savings = Difference * Discount_Factor,
PV_Intervention_Cost = Intervention_Cost * Discount_Factor,
Net_Benefit = PV_Savings - PV_Intervention_Cost
)
df_prod_long_discounted <- df_prod_discounted %>%
select(Year, PV_Savings, PV_Intervention_Cost) %>%
pivot_longer(cols = c(PV_Savings, PV_Intervention_Cost),
names_to = "Type", values_to = "Amount") %>%
mutate(Type = recode(Type,
"PV_Savings" = "Prevention Savings",
"PV_Intervention_Cost" = "Prevention Cost"))
df_labels_discounted_prod <- df_prod_discounted %>%
mutate(
Label = ifelse(Net_Benefit >= 0,
paste0("+$", round(Net_Benefit, 0)),
paste0("-$", abs(round(Net_Benefit, 0)))),
Y_Position = pmax(PV_Savings, PV_Intervention_Cost) + 20,
X_Position = as.numeric(factor(Year)) + 0.2,
Fill_Color = ifelse(Net_Benefit < 0, "#FDEDEC", "#D4EFDF"),
Label_Color = ifelse(Net_Benefit < 0, "red", "black")
)
plot2_prod <- ggplot(df_prod_long_discounted, aes(x = factor(Year), y = Amount, fill = Type)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.9), width = 0.8) +
geom_label(data = df_labels_discounted_prod,
aes(x = X_Position, y = Y_Position, label = Label),
fill = df_labels_discounted_prod$Fill_Color,
color = df_labels_discounted_prod$Label_Color,
inherit.aes = FALSE,
show.legend = FALSE,
size = 3.5,
label.size = 0.2) +
scale_fill_manual(
name = "Discounted Components",
values = c("Prevention Savings" = "#009E73", "Prevention Cost" = "#E69F00")
) +
labs(
title = "",
subtitle = "",
x = "Year",
y = "Discounted $ USD"
) +
theme_classic(base_size = 14) +
theme(plot.margin = margin(10, 20, 10, 20),
legend.position = "bottom")
plot2_prod# Prepare productivity cost summary table
df_summary_prod <- df_prod_discounted %>%
mutate(
PV_Savings = round(PV_Savings),
PV_Intervention_Cost = round(PV_Intervention_Cost),
PV_Net_Savings = round(Net_Benefit)
) %>%
select(
Year,
`Discounted Savings` = PV_Savings,
`Discounted Cost` = PV_Intervention_Cost,
`Net Present Value` = PV_Net_Savings
) %>%
mutate(
`Discounted Savings` = paste0("$", format(`Discounted Savings`, big.mark = ",")),
`Discounted Cost` = paste0("$", format(`Discounted Cost`, big.mark = ",")),
`Net Present Value` = paste0("$", format(`Net Present Value`, big.mark = ","))
)
# Output Markdown table
knitr::kable(df_summary_prod, align = "c",
caption = "Discounted Productivity Savings vs. Intervention Cost (per participant)") %>%
kableExtra::kable_styling(full_width = FALSE, position = "left", font_size = 14)| Year | Discounted Savings | Discounted Cost | Net Present Value |
|---|---|---|---|
| 1 | $ 19 | $405 | $-385 |
| 2 | $ 44 | $393 | $-349 |
| 3 | $ 72 | $382 | $-309 |
| 4 | $ 95 | $370 | $-275 |
| 5 | $115 | $360 | $-245 |
| 6 | $131 | $349 | $-219 |
| 7 | $145 | $339 | $-194 |
| 8 | $155 | $329 | $-174 |
| 9 | $163 | $320 | $-156 |
| 10 | $170 | $310 | $-140 |
npv_values <- df_summary_prod %>%
mutate(NPV_Num = as.numeric(str_replace_all(`Net Present Value`, "[ $,]", ""))) %>%
pull(NPV_Num)
# Compute summary statistics
npv_summary <- tibble(
Mean_NPV = mean(npv_values),
Median_NPV = median(npv_values),
SD_NPV = sd(npv_values)
)
# Format with dollar signs
npv_summary_fmt <- npv_summary %>%
mutate(across(everything(), ~ sprintf("$%.2f", .)))
# Render summary table
kable(npv_summary_fmt,
col.names = c("Mean Net Present Value", "Median", "SD"),
align = c("r", "r", "r"),
caption = "Summary of Net Benefit (Productivity) Over 10 Years (per participant)") %>%
kable_styling(full_width = FALSE, position = "left", font_size = 14)| Mean Net Present Value | Median | SD |
|---|---|---|
| $-244.60 | $-232.00 | $83.45 |
Days of work missed due to T2 diabetes are calculated the excess days of work missed by someone with diabetes compared with a similar person without diabetes (e.g., similar age, sex, comorbidities).
Estimate of $276 in daily earnings is a weighted average of the estimated daily earnings reported in the current Population Survey for 45- to 64-year-old males and females. 45 to 64 age group is selected because the mean age of persons with prediabetes is about 52 years old
library(tidyverse)
# Data
df_summary <- data.frame(
Year = 1:10,
Sum_Cost_Saving = c(157, 301, 456, 575, 690, 805, 920, 1034, 1147, 1259),
Program_Cost = rep(417, 10),
Net_Cost = c(260, 116, -39, -158, -273, -388, -503, -617, -730, -842)
)
# Long format for bars
df_long_summary <- df_summary %>%
pivot_longer(cols = c(Sum_Cost_Saving, Program_Cost),
names_to = "Type", values_to = "Amount") %>%
mutate(Type = recode(Type,
"Sum_Cost_Saving" = "Cumulative Savings",
"Program_Cost" = "Program Cost")) # FIXED this line
# Label dataset with custom fill colors
df_labels <- df_summary %>%
mutate(
Label = paste0("$", Net_Cost),
Fill_Color = ifelse(Net_Cost > 0, "#FDEDEC", "#D4EFDF"), # Light red or light blue
Y_Position = pmax(Sum_Cost_Saving, Program_Cost) + 40
)
# Plot
ggplot(df_long_summary, aes(x = factor(Year), y = Amount, fill = Type)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.8), width = 0.7) +
# Label layer with background fill from Fill_Color column
geom_label(data = df_labels,
aes(x = factor(Year),
y = Y_Position,
label = Label),
fill = df_labels$Fill_Color,
color = "black",
show.legend = FALSE,
size = 3.5,
label.size = 0.2) +
# Correct bar fill colors
scale_fill_manual(
name = "Component",
values = c("Cumulative Savings" = "#009E73", # green
"Program Cost" = "#E69F00") # orange
) +
labs(
title = "",
subtitle = "",
x = "Year",
y = "Cumulative Value (USD)"
) +
theme_classic(base_size = 14)df_table <- df_summary %>%
mutate(
`Cumulative Savings` = paste0("$", format(Sum_Cost_Saving, big.mark = ",")),
`Program Cost` = paste0("$", format(Program_Cost, big.mark = ",")),
`Net Cost` = ifelse(
Net_Cost < 0,
paste0("$-", format(abs(Net_Cost), big.mark = ","), " (savings)"),
paste0("$", format(Net_Cost, big.mark = ","))
)
) %>%
select(
Year,
`Cumulative Savings`,
`Program Cost`,
`Net Cost`
)
kable(df_table, align = "c",
caption = "Cumulative Cost Savings vs. Program Cost Over 10 Years") %>%
kable_styling(full_width = FALSE, position = "left", font_size = 14)| Year | Cumulative Savings | Program Cost | Net Cost |
|---|---|---|---|
| 1 | $ 157 | $417 | $ 260 |
| 2 | $ 301 | $417 | $ 116 |
| 3 | $ 456 | $417 | $- 39 (savings) |
| 4 | $ 575 | $417 | $-158 (savings) |
| 5 | $ 690 | $417 | $-273 (savings) |
| 6 | $ 805 | $417 | $-388 (savings) |
| 7 | $ 920 | $417 | $-503 (savings) |
| 8 | $1,034 | $417 | $-617 (savings) |
| 9 | $1,147 | $417 | $-730 (savings) |
| 10 | $1,259 | $417 | $-842 (savings) |
######
# Extract numeric values
cost_summary_values <- df_summary %>%
summarise(
Cumulative_Savings = Sum_Cost_Saving,
Program_Cost = Program_Cost,
Net_Cost = Net_Cost
)
# Compute summary statistics
cost_summary_stats <- cost_summary_values %>%
summarise(
Mean_Savings = mean(Cumulative_Savings),
Median_Savings = median(Cumulative_Savings),
SD_Savings = sd(Cumulative_Savings),
Mean_Program = mean(Program_Cost),
Median_Program = median(Program_Cost),
SD_Program = sd(Program_Cost),
Mean_Net = mean(Net_Cost),
Median_Net = median(Net_Cost),
SD_Net = sd(Net_Cost)
)
# Reshape into long format for table display
cost_summary_long <- tibble(
Metric = c("Cumulative Savings"),
Mean = c(cost_summary_stats$Mean_Savings),
Median = c(cost_summary_stats$Median_Savings),
SD = c(cost_summary_stats$SD_Savings)
)
# Format values with dollar signs and commas
cost_summary_fmt <- cost_summary_long %>%
mutate(across(c(Mean, Median, SD), ~ sprintf("$%s", format(round(.x, 2), big.mark = ","))))
# Render summary table
kable(cost_summary_fmt,
col.names = c("Metric", "Mean", "Median", "SD"),
align = c("l", "r", "r", "r"),
caption = "Summary Statistics for Net Savings (medical and productivity combined) Over 10 Years") %>%
kable_styling(full_width = FALSE, position = "left", font_size = 14)| Metric | Mean | Median | SD |
|---|---|---|---|
| Cumulative Savings | $734.4 | $747.5 | $365.26 |
library(tidyverse)
library(tidyverse)
# Data
df_qaly <- data.frame(
Year = 1:10,
No_Intervention = c(2145.0, 4211.9, 6203.3, 8121.9, 9970.1, 11750.4, 13465.3, 15116.9, 16707.5, 18239.3),
With_Intervention = c(2147.3, 4217.5, 6213.1, 8135.8, 9988.1, 11772.5, 13491.3, 15146.8, 16741.2, 18276.8),
QALYs_Gained = c(2.3, 5.6, 9.8, 13.9, 18.0, 22.1, 26.0, 29.9, 33.7, 37.5)
)
# Bar format
df_long_qaly <- df_qaly %>%
pivot_longer(cols = c(No_Intervention, With_Intervention),
names_to = "Type", values_to = "QALYs") %>%
mutate(Type = recode(Type,
"No_Intervention" = "No Prevention",
"With_Intervention" = "With Prevention"))
# Labels with adjusted X position
df_qaly_labels <- df_qaly %>%
mutate(
Label = paste0("+", QALYs_Gained, " QALYs"),
Y_Position = pmax(No_Intervention, With_Intervention) + 100,
X_Position = as.numeric(factor(Year)) + 0.2 # slight right shift
)
# Plot
ggplot(df_long_qaly, aes(x = factor(Year), y = QALYs, fill = Type)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.9), width = 0.8) +
# Labels with custom x and y
geom_label(data = df_qaly_labels,
aes(x = X_Position,
y = Y_Position,
label = Label),
fill = "#D1ECF1",
color = "black",
show.legend = FALSE,
size = 3.5,
label.size = 0.2) +
scale_fill_manual(
name = "Scenario",
values = c("No Prevention" = "#999999", # grey
"With Prevention" = "#56B4E9") # blue
) +
labs(
title = "",
subtitle = "",
x = "Year",
y = "Cumulative QALYs"
) +
theme_classic(base_size = 14) +
theme(plot.margin = margin(10, 20, 10, 20)) # more spacing left/right# Data
df_qaly <- data.frame(
Year = 1:10,
No_Intervention = c(2145.0, 4211.9, 6203.3, 8121.9, 9970.1, 11750.4, 13465.3, 15116.9, 16707.5, 18239.3),
With_Intervention = c(2147.3, 4217.5, 6213.1, 8135.8, 9988.1, 11772.5, 13491.3, 15146.8, 16741.2, 18276.8),
QALYs_Gained = c(2.3, 5.6, 9.8, 13.9, 18.0, 22.1, 26.0, 29.9, 33.7, 37.5)
)
# Table formatting
df_qaly_table <- df_qaly %>%
mutate(
`No Intervention` = format(round(No_Intervention, 1), big.mark = ","),
`With Intervention` = format(round(With_Intervention, 1), big.mark = ","),
`QALYs Gained` = paste0("+", QALYs_Gained, " (gained)")
) %>%
select(
Year,
`No Intervention`,
`With Intervention`,
`QALYs Gained`
)
# Create table
kable(df_qaly_table, align = "c",
caption = "Cumulative QALYs Over 10 Years: Intervention vs. No Intervention") %>%
kable_styling(full_width = FALSE, position = "left", font_size = 14)| Year | No Intervention | With Intervention | QALYs Gained |
|---|---|---|---|
| 1 | 2,145.0 | 2,147.3 | +2.3 (gained) |
| 2 | 4,211.9 | 4,217.5 | +5.6 (gained) |
| 3 | 6,203.3 | 6,213.1 | +9.8 (gained) |
| 4 | 8,121.9 | 8,135.8 | +13.9 (gained) |
| 5 | 9,970.1 | 9,988.1 | +18 (gained) |
| 6 | 11,750.4 | 11,772.5 | +22.1 (gained) |
| 7 | 13,465.3 | 13,491.3 | +26 (gained) |
| 8 | 15,116.9 | 15,146.8 | +29.9 (gained) |
| 9 | 16,707.5 | 16,741.2 | +33.7 (gained) |
| 10 | 18,239.3 | 18,276.8 | +37.5 (gained) |
library(dplyr)
library(stringr)
library(knitr)
library(kableExtra)
# Extract numeric QALYs gained values from string (e.g., "+2.3 (gained)")
qaly_values <- df_qaly_table %>%
mutate(QALYs_Num = as.numeric(str_replace_all(`QALYs Gained`, "[^0-9.-]", ""))) %>%
pull(QALYs_Num)
# Compute summary statistics
qaly_summary <- tibble(
Mean_QALYs = mean(qaly_values),
Median_QALYs = median(qaly_values),
SD_QALYs = sd(qaly_values)
)
# Format to two decimals, with "+" prefix
qaly_summary_fmt <- qaly_summary %>%
mutate(across(everything(), ~ sprintf("+%.2f", .)))
# Render summary table
kable(qaly_summary_fmt,
col.names = c("Mean QALYs Gained", "Median", "Standard Deviation"),
align = c("r", "r", "r"),
caption = "Summary of QALYs Gained Over 10 Years") %>%
kable_styling(full_width = FALSE, position = "left", font_size = 14)| Mean QALYs Gained | Median | Standard Deviation |
|---|---|---|
| +19.88 | +20.05 | +12.01 |
Years 1–3: Not cost-effective → discounted savings < cost; Years 4–10: Cost-effective → discounted savings > cost; Breakeven occurs around Year 4
However, important to note over 10 years SD of the net benefit is 2X larger than both the mean, suggesting very high that there is high variability in the net benefit outcomes across time (standard deviation of $239.67, compared to a mean of $134.65)