MSc-Forschungsmethoden 4 - Gesundheitsökonomie Übung

Gesundheitsökonomische Evaluationen in RStudio

Author
Affiliation

Alexander Schurz

Bern University of Applied Sciences

Published

October 20, 2025

1 Gesundheitsökonomische Evaluationen in R

Ziele der Lerneinheiten:

  • Erlernen der Grundkenntnisse für gesundheitsökonomische Evaluationen in RStudio

  • Durchführung der Kostenumrechnungen

  • Berechnung von QALYs

  • Durchführung von probabilistischer und deterministischer Sensitivitätsanalysen

  • Erstellung von Cost-effectiveness/Cost-utility-planes

  • Erstellung von Cost-effectiveness-acceptability planes

  • Einführung in das Bootstrapping von Kostendaten zur Berechnung von 95% CI Values

Für diese Übung benötigen wir die folgenden Datensätze:

  • df.test.csv

  • df.coding.csv

  • IMF_cost_conversion_2025.csv

  • df.acl.csv

  • df.acl.dsa.long

Code
library(pacman)
pacman::p_load(
  tidyverse, # data manipulation
  dplyr, # data manipulation
  flextable, #to develop tables for data export
  gtsummary, # summary table
  eq5d, # eq5d5l utility calculation
  ggplot2,
  dampack)

2 Datenimport

Lese die Daten vom Datensatz df.test.csv in R ein.

Code
# rio::import("file.csv")
Code
df.raw <- rio::import("../data/df.test.csv")
df.coding <- rio::import("../data/df.coding.csv", header = TRUE)
Code
df.coding %>%
  select(-V1) %>% 
  flextable() %>%                        # convert to flextable
  align(j = 1, align = "left", part = "all") %>%  # left-align first column
  set_table_properties(width = 1, layout = "autofit") %>%  # fit table
  fontsize(size = 12, part = "all") %>%          # font size
  font(fontname = "Times New Roman", part = "all")           # font family

Table: Coding conventions

Variables

Variable.name

Beschreibung

ID

id

Study_id

Currency

cur

e.g. CHF, €

Currency year

cur_year

e.g. 2005

Direct medical costs

dir_med

e.g. costs for GP consultations

Direct non-medical costs

dir_nonmed

e.g. transportation costs

Absenteeism

abs

0 = No
1 = Yes

Absenteeism days

abs_d

Hours of absenteeism

Presenteeism

pres

0 = No
1 = Yes

Presenteeism days

pres_days

Hours of presenteeism

Presenteeism - Workload fulfillment

pres_workload

Fulfillment of workload during presenteeism days
Likert scale: 10 = normal workload, 5 = 50% workload, 0 = no work possible

EQ-5D-5L - Question 1

eq5d5l_q1

Mobility

EQ-5D-5L - Question 2

eq5d5l_q2

Self-care

EQ-5D-5L - Question 3

eq5d5l_q3

Usual activities

EQ-5D-5L - Question 4

eq5d5l_q4

Pain/Discomfort

EQ-5D-5L - Question 5

eq5d5l_q5

Anxiety/Depression

EQ-5D-5L - Question 6

eq5d5l_q6

Subjective health status, Likert scale: 0-100 (optimal health)

3 Vorbereitung der Daten

  • Erstelle die Variable bmi.
  • Berechne die Grösse in metern und speichere dies unter der Variable height_m.
  • Konvertiere alle nummerischen Variablen in eine Zahlenvariable.
Code
# library(tidyverse)
# df.x <- df.y %>%
#   mutate(
#     xy = round(xy,2)
#     xy = round(xy/z,2),
#     across(c(var1, var.2, var.3), as.numeric))
Code
df.adapted <- df.raw %>% 
  select(-V1) %>% 
  mutate(
    bmi = round(weight / ((height / 100) ^ 2), 2),
    height_m = round(height / 100,2)
  ) %>%
  mutate(
    across(
      c(age, height, weight, dir_med, dir_nonmed, abs_d, pres_days, pres_workload), as.numeric))

4 Descriptive statistics

  • Erstelle eine Table 1 mit den Variablen sex, age, height_m, weight, bmi und unterteile dies basierend auf dem Geschlecht
Code
# library(gtsummary)
# library(flextable)
# tbl <- df %>% 
#   select(var1, var2) %>%
#   gtsummary::tbl_summary(
#     by = var1,
#     statistic = list(
#       all_continuous() ~ "{mean}",
#       all_categorical() ~ "{p}%"),
#     label = list(
#       var1 = "Var",
#       var2 = "Age"),
#     missing_text = "(Missing)",
#     digits = all_continuous() ~ 2) %>% 
#   gtsummary::add_overall(last = TRUE) %>%
#   bold_labels()))
# 
# # Summary statistics table using Rpackage: Flextable
# tbl %>%
#   as_flex_table() %>%
#   align(j = 1, align = "left", part = "all") %>% 
#   flextable::set_table_properties(width = 1, layout = "autofit") %>% 
#   flextable::fontsize(size = 12, part = "all") %>% 
#   flextable::font(fontname = "Times New Roman", part = "all")
Code
tbl.demo <- df.adapted %>%
  select(
    sex, age, height_m, weight, bmi
  ) %>%
  gtsummary::tbl_summary(
    by = sex,
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{p}%"),
    label = list(
      sex = "Sex",
      age = "Age",
      height_m = "Height (m)",
      weight = "Weight (kg)",
      bmi = "Body mass index"),
    missing_text = "(Missing)",
    digits = all_continuous() ~ 2
  ) %>% 
  gtsummary::add_overall(last = TRUE) %>%
  bold_labels()


tbl.demo %>%
  as_flex_table() %>%
  align(j = 1, align = "left", part = "all") %>% 
  flextable::set_table_properties(width = 1, layout = "autofit") %>% 
  flextable::fontsize(size = 12, part = "all") %>% 
  flextable::font(fontname = "Times New Roman", part = "all")

Table 1: Demographics

Characteristic

Female
N = 281

Male
N = 201

Overall
N = 481

Age

43.11 (13.42)

47.10 (12.20)

44.77 (12.95)

Height (m)

1.76 (0.13)

1.76 (0.12)

1.76 (0.12)

Weight (kg)

87.21 (10.35)

87.25 (10.33)

87.22 (10.23)

Body mass index

28.23 (3.11)

28.14 (2.52)

28.19 (2.85)

1Mean (SD)

5 Berechnng von Absenteeismus- und Präsenteeismuskosten

5.1 Absenteeismus

Societal perspective:

Costs of absenteeism were valued using the human capital approach (van den Hout, 2010).
It “considers the patient´s hours of productivity that are lost and calculates productivity costs as the product of those total lost hours with the hourly wage. Every hour not worked is an hour lost, possibly until the patient’s retirement age(van den Hout, 2010).

The hourly wage is calculated using the following formula:

\[ \text{Hourly wage} = \frac{\text{Annual income before tax}}{\big[(52 \times 5) - (\text{paid vacation days} + \text{public holidays})\big] \times \text{hours of standard workday}} \]

The following data are used:

  • average annual income before tax in Switzerland (latest available data from 2023):

    • 2023 CHF 94’507
  • hours of standard workday: 8.4 hours

  • paid vacation days: 25 days

  • paid public holidays: 8 days (for Canton of Bern)

Bibliography:

Bibliography used to recalculate costs in other currency and price year:

  • Berechne die Absenteeismus Kosten. Falls eine Person kein Absenteeismus angibt, trage 0 ein.
Code
# average_annual_income_CHF <- xy
# workday_hours <- xy
# paid_vacation <- xy
# paid_public_holidays <- xy
# 
# # Calculation of hourly wage based on formula used in Lutz et al. 2022 + making the relevant variables numeric values
# hourly_wage_ch <- as.numeric(round((((average_annual_income_CHF)/((52*5)-(paid_vacation+paid_public_holidays)))/workday_hours),2))
# daily_wage_ch <- as.numeric(round(hourly_wage_ch*workday_hours, 2))
# 
# # Valuation of absenteeism days (ipcq_ch_q4_2) in a new column called absenteeism_costs_CHF
# df.2 <- df %>%
#   mutate(
#     Costs_abs = abs_d * daily_wage_ch,
#     Costs_abs = ifelse(abs == 0, 0, Costs_abs))
Code
# Absenteeism
# Average annual income in Switzerland in 2023 CHF (2023 US$ 83'332)
average_annual_income_CHF <- 94507
workday_hours <- 8.4
paid_vacation <- 25
paid_public_holidays <- 8

# Calculation of hourly wage based on formula used in Lutz et al. 2022 + making the relevant variables numeric values
hourly_wage_ch <- as.numeric(round((((average_annual_income_CHF)/((52*5)-(paid_vacation+paid_public_holidays)))/workday_hours),2))
daily_wage_ch <- as.numeric(round(hourly_wage_ch*workday_hours, 2))

# Valuation of absenteeism days (ipcq_ch_q4_2) in a new column called absenteeism_costs_CHF
df.valued <- df.adapted %>%
  mutate(
    Costs_abs = abs_d * daily_wage_ch,
    Costs_abs = ifelse(abs == 0, 0, Costs_abs)
  )

5.2 Presenteeismus

Presenteeism is valued using the Osterhaus method with the following formula: \[ \text{Reduced monthly presenteeism days} = (\text{DWsx}) \times (\text{PRODsx}) \times (\text{ED}) \]

  • DWsx: days worked with symptoms (ipcq_ch_q8)

  • PRODsx: productivity when working with symptoms (ipcq_ch_q9)/10

  • ED: daily earnings (calculated in valuation of absenteeism section) –> variable: daily_wage_ch

  • Berechne die Presenteeismuskosten mit Hilfe der Formel
Code
# adjusted variable: 0: no presenteeism and 10 maximal presenteeism instead of the other way around
# df$`pres_workload_adj` <- 1 - (df$`pres_workload` / 10)
# 
# # Presenteism
# df$Costs_pres <- ifelse( is.na(df$pres_days) | is.na(df$pres_workload), 0, (df$pres_days * df$pres_workload_adj) * daily_wage_ch)
Code
# adjusted variable: 0: no presenteeism and 10 maximal presenteeism instead of the other way around
df.valued$`pres_workload_adj` <- 1 - (df.valued$`pres_workload` / 10)

# Presenteism
df.valued$Costs_pres <- ifelse( is.na(df.valued$pres_days) | is.na(df.valued$pres_workload), 0, (df.valued$pres_days * df.valued$pres_workload_adj) * daily_wage_ch)

5.3 Indirekte Kosten

  • Berechne die indirekten Kosten aus der Summe von Absenteeismus- und Präsenteeismuskosten.
Code
# df <- df %>% 
#   mutate(
#     Var.neu = Var.x + Var.y)

\[ \text{Indirect costs} = \text{Absenteeism costs} + \text{Presenteeism costs} \]

Code
df.valued <- df.valued %>% 
  mutate(
    `Costs_indir` = `Costs_abs` + `Costs_pres`)

6 Umrechnung von Kosten

We use the IMF dataset with the GDP deflator index and the PPP value for the different countries for the cost recalculation (Shemilt et al. 2010).

\[ Cost2025 Euro= \frac{{\rm GDP}_2\times{\rm PPP}_2}{{\rm GDP}_1\times{\rm PPP}_1}\times\ Costoriginal \]

  • PPP2: Purchasing power parities for target year (2025)

  • GDP2: Gross domestic product values for target price year (2025) and currency (CHF or Euro)

  • PPP1: PPP for original price year

  • GDP1: GDP for original currency (CHF)

See also: CCEMG cost converter

All costs are recalculated to report them in 2025 Swiss Francs or e.g. 2025 Belgium Euros.

Code
df.IMF.conversion.raw <- rio::import("../data/IMF_cost_conversion_2025.csv", header = TRUE)

df.IMF.conversion.raw <- df.IMF.conversion.raw %>% 
  mutate(across(c("2023", "2024", "2025"), as.numeric))

# Subset for Belgium and Switzerland for years 2023, 2024, 2025
df.IMF.conversion.adapted <- df.IMF.conversion.raw %>%
  filter(
    ISO %in% c("BEL", "CHE") & 
    `WEO Subject Code` %in% c("NGDP_D", "PPPEX") & 
    (!is.na(`2023`) | !is.na(`2024`) | !is.na(`2025`))  # Keep rows where at least one year has a value
  ) %>%
  select(Country, ISO, `WEO Subject Code`, `Subject Descriptor`, `2023`, `2024`, `2025`)
Code
# Subset data for Switzerland
df.IMF.con.CH <- df.IMF.conversion.adapted %>% 
  filter(Country == "Switzerland", `WEO Subject Code`%in% c("NGDP_D", "PPPEX")) %>% 
  select(`WEO Subject Code`, `2023`, `2024`, `2025`) %>% 
  pivot_longer(cols = `2023`:`2025`, names_to = "Year", values_to = "Value") %>% 
  pivot_wider(names_from = `WEO Subject Code`, values_from = "Value") %>% 
  rename(GDP = NGDP_D, PPP = PPPEX) %>% 
  mutate(Year = as.numeric(Year))
  
# Subset data for Belgium
df.IMF.con.BE <- df.IMF.conversion.adapted %>% 
  filter(Country == "Belgium", `WEO Subject Code`%in% c("NGDP_D", "PPPEX")) %>% 
  select(`WEO Subject Code`, `2023`, `2024`, `2025`) %>% 
  pivot_longer(cols = `2023`:`2025`, names_to = "Year", values_to = "Value") %>% 
  pivot_wider(names_from = `WEO Subject Code`, values_from = "Value") %>% 
  rename(GDP = NGDP_D, PPP = PPPEX) %>% 
  mutate(Year = as.numeric(Year))

# Function for inflation (Important variables: "Year T0" und "Price (CHF)")
# Funktion zur Preisberechnung
convert_price_to_2025 <- function(price_chf, year_t0) {
  # Ensure year_t0 is numeric
  year_t0 <- as.numeric(year_t0)

  # GDP-Values for relevant years
  gdp_values <- df.IMF.con.CH %>%
    filter(Year %in% c(2025, unique(year_t0))) %>%
    select(Year, GDP) %>%
    pivot_wider(names_from = Year, values_from = GDP)

  # GDP values for 2025
  gdp_2025 <- gdp_values[[as.character(2025)]]

  # GDP values for year_t0
  gdp_year_t0 <- gdp_values[match(year_t0, colnames(gdp_values))]

  # If there are no GDP values, add NA
  gdp_ratio <- ifelse(!is.na(gdp_year_t0), gdp_2025 / gdp_year_t0, NA)

  # Adapt price
  price_2025 <- as.numeric(price_chf) * as.numeric(gdp_ratio)

  return(price_2025)
}
  • Importiere den Datensatz IMF_cost_conversion_2025.csv.
  • Lasse den Code oben laufen.
  • Berechne nun die disaggregierten Kostenwerte in 2025 CHF mit Hilfe der Formel oben und runde auf 2 Nachkommastellen
Code
# df.y <- df.x %>% 
#   mutate(
#     xy = round(formel(as.numeric(var.x), currencyyear),2)
#   )
Code
# Convert CHF from reference year to 2025 CHF - variable "Price (2025 CHF)"
df.valued <- df.valued %>%
  mutate(
    `Direct medical (2025 CHF)`      = round(convert_price_to_2025(as.numeric(dir_med), cur_year), 2),
    `Direct non-medical (2025 CHF)`  = round(convert_price_to_2025(as.numeric(dir_nonmed), cur_year), 2),
    `Absenteeism (2025 CHF)`         = round(convert_price_to_2025(as.numeric(Costs_abs), cur_year), 2),
    `Presenteeism (2025 CHF)`        = round(convert_price_to_2025(as.numeric(Costs_pres), cur_year), 2),
    `Indirect costs (2025 CHF)`      = round(convert_price_to_2025(as.numeric(Costs_indir), cur_year), 2),
    `Total costs (2025 CHF)` = 
      `Direct medical (2025 CHF)` + 
      `Direct non-medical (2025 CHF)` + 
      `Absenteeism (2025 CHF)` + 
      `Presenteeism (2025 CHF)`
  )

Extracting the PPP values for Belgium and Switzerland.

Code
# Dataset for conversion from 2025 CHF to 2025 Belgium Euro
# Extract PPP Value for 2025 for Belgium 
PPP_BE_2025 <- df.IMF.con.BE %>%
  filter(Year == 2025) %>%
  pull(PPP)

# Extract PPP Value for 2025 for Switzerland
PPP_CH_2025 <- df.IMF.con.CH %>%
  filter(Year == 2025) %>%
  pull(PPP)

# Variable to convert 2025 CHF into 2025 Euro
var.conv.CHF.to.EUR <- PPP_BE_2025/PPP_CH_2025
Code
# Convert CHF from reference year to 2025 CHF - variable "Price (2025 CHF)"
df.valued <- df.valued %>%
  mutate(
    `Direct medical (2025 EUR)`      = var.conv.CHF.to.EUR*`Direct medical (2025 CHF)`,
    `Direct non-medical (2025 EUR)`  = var.conv.CHF.to.EUR*`Direct non-medical (2025 CHF)`,
    `Absenteeism (2025 EUR)`         = var.conv.CHF.to.EUR*`Absenteeism (2025 CHF)`,
    `Presenteeism (2025 EUR)`        = var.conv.CHF.to.EUR*`Presenteeism (2025 CHF)`,
    `Indirect costs (2025 EUR)`      = var.conv.CHF.to.EUR*`Indirect costs (2025 CHF)`,
    `Total costs (2025 EUR)`         = var.conv.CHF.to.EUR*`Total costs (2025 CHF)`)

7 QALY Berechnung

  • Berechne nun die QALYs basierend auf den Daten für die 5 EQ-5D-5L Fragen und nutze dafür die health_state Funktion.
  • Berechne die QALYs mit dem deutschen Value set mit Hilfe des Packages eq5d.
Code
# Define the function to combine answers and create a 5-digit number
# health_state <- function(q1, q2, q3, q4, q5) {
#   answers <- c(q1, q2, q3, q4, q5)
#   combined_number <- as.numeric(paste(answers, collapse = ""))
#   return(combined_number)}

# df.x <- df.y %>% 
#   rowwise() %>% 
#   mutate(
#     `Health state` = health_state(q1,q2,q4,q5),
#     `Subjective health` = as.numeric(q6),
#     `utility` = eq5d(scores = `Health state`, , type = "VT", country = "Germany", version = "5L", ignore.invalid = TRUE)) %>% 
#   ungroup()
Code
# Define the function to combine answers and create a 5-digit number
health_state <- function(q1, q2, q3, q4, q5) {
  answers <- c(q1, q2, q3, q4, q5)
  combined_number <- as.numeric(paste(answers, collapse = ""))
  return(combined_number)}

# Develop health states and utility scores
df.valued <- df.valued %>%
  rowwise() %>%
  mutate(
    # Calculation of health states
    `Health state` = health_state(eq5d5l_q1, eq5d5l_q2, eq5d5l_q3, eq5d5l_q4, eq5d5l_q5),
    # Subjective health (NAS 0-100)
    `Subjective health` = as.numeric(eq5d5l_q6),
    # Utilities for the different value sets
    `Utility (German value set)` = eq5d(scores = `Health state`, type = "VT", country = "Germany", version = "5L", ignore.invalid = TRUE),
    `Utility (Belgium value set)` = eq5d(scores = `Health state`, type = "VT", country = "Belgium", version = "5L", ignore.invalid = TRUE),
    `Utility (GB value set)` = eq5d(scores = `Health state`, type = "VT", country = "England", version = "5L", ignore.invalid = TRUE)) %>% 
  ungroup()
Code
tbl.costs.qaly <- df.valued %>%
  select(
    sex, bmi,
    `Direct medical (2025 CHF)`, `Direct non-medical (2025 CHF)`,
    `Absenteeism (2025 CHF)`, `Presenteeism (2025 CHF)`,
    `Indirect costs (2025 CHF)`, `Total costs (2025 CHF)`,
    `Subjective health`,
    `Utility (German value set)`,
    `Utility (Belgium value set)`,
    `Utility (GB value set)`
  ) %>%
  gtsummary::tbl_summary(
    by = sex,
    type = list(`Absenteeism (2025 CHF)` ~ "continuous",
                `Indirect costs (2025 CHF)`~ "continuous",
                `Total costs (2025 CHF)`~ "continuous"),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    label = list(
      sex = "Sex",
      age = "Age",
      height_m = "Height (m)",
      weight = "Weight (kg)",
      bmi = "Body mass index"
    ),
    missing_text = "(Missing)",
    digits = list(all_continuous() ~ 2)
  ) %>%
  gtsummary::add_overall(last = TRUE) %>%
  bold_labels()


tbl.costs.qaly %>%
  as_flex_table() %>%
  align(j = 1, align = "left", part = "all") %>% 
  flextable::set_table_properties(width = 1, layout = "autofit") %>% 
  flextable::fontsize(size = 12, part = "all") %>% 
  flextable::font(fontname = "Times New Roman", part = "all")

Summary statistics for costs and utility

Characteristic

Female
N = 281

Male
N = 201

Overall
N = 481

Body mass index

28.23 (3.11)

28.14 (2.52)

28.19 (2.85)

Direct medical (2025 CHF)

117.85 (160.22)

137.03 (171.41)

125.84 (163.45)

Direct non-medical (2025 CHF)

22.35 (28.21)

28.63 (35.13)

24.96 (31.08)

Absenteeism (2025 CHF)

1,598.72 (4,315.08)

980.84 (3,391.35)

1,341.27 (3,929.47)

Presenteeism (2025 CHF)

2,055.45 (3,188.79)

1,441.39 (2,861.23)

1,799.59 (3,040.48)

Indirect costs (2025 CHF)

3,654.16 (5,104.41)

2,422.24 (4,188.27)

3,140.86 (4,736.64)

Total costs (2025 CHF)

3,794.35 (5,077.43)

2,587.90 (4,196.93)

3,291.66 (4,721.43)

Subjective health

73.57 (14.54)

72.85 (11.84)

73.27 (13.35)

Utility (German value set)

0.84 (0.11)

0.85 (0.15)

0.84 (0.13)

Utility (Belgium value set)

0.78 (0.11)

0.79 (0.17)

0.78 (0.13)

Utility (GB value set)

0.80 (0.10)

0.82 (0.14)

0.81 (0.12)

1Mean (SD)

8 Cost-effectiveness plane/ Cost-utility plane

Wir wechseln nun zur Analyse von longitudinalen gesundheitsökonomischen Evaluationen. Hierfür verwenden wir den Datensatz df.acl.csv. Dieser Datensatz enthält die Ergebnisse für eine model-basierte gesundheitsökonomische Evaluation zur Evaluation eines Verletzungspräventionsprogramms auf ACL-Verletzungen im Frauenhandball.

Einschluss Ausschluss
Population
  • Wettkampfsport im Frauenhandball (Premium 1, Premium 2, 1. Liga)

  • ≥3 Trainings pro Woche

  • ≥ 10 Jahre alte Spielerinnen

  • ACL-Verletzung in der Vergangenheit

  • Andere grosse Verletzungen, welche die regelmässige Teilnahme am Training verhindern

Intervention Präventionsprogram während Warm-up
Control Usual care bereits vorhandenes Präventionsprogram
Outcome QALY
Zeit 52 Wochen

Figure: Decision tree ACL study

9 Sensitivitätsanalysen

Es gibt zwei primäre Arten von Sensitivitätsanalysen bei gesundheitsökonomischen Evaluationen:

  • probabilistische SA: Ergebnisse können im Cost-effectiveness/cost-utility plane dargestellt werden (Fokus heute).

  • deterministische SA: Ergebnisse können in einem Tornado diagramm dargestellt werden.

9.1 Probabilistic senstivity analysis (PSA)

  • Importiere den Datensatz df.acl.csv.
  • Extrahiere die inkrementellen Kostenwerte und Utility-Werte und Erstelle hierfür den neuen Datensatz df.acl.base.
  • Lasse eine PSA laufen (R=5000) für eine bestehende Standardabweichung SD von 150 (costs), 0.0005 (utility). Die simulierten Daten auch in einem neuen Datensatz df.acl.sim speichern.
  • Erstelle einen Cost-utility plane mit Hilfe des R-package ggplot2.
Code
# Daten aus Datensatz extrahieren
# var <- df %>%
#   filter(Category == "Row") %>%
#   pull(Coloumn)

# Neuen Datensatz erstellen
# df.new <- data.frame(
#   Scenario = "Base Case",
#   Incr.cost = var1,
#   Incr.utility = var2)

# PSA mit R = 5000
# n_sim <- 5000
# set.seed(123)  # Für Reproduzierbarkeit
# var <- rnorm(`Anzahl von R`, mean = var1, sd = sd_var1)

# Cost-utility plane
# library(ggplot2)
# ggplot() +
#   # Simulierte Werte
#   geom_point(data = df1, aes(x = value1, y = value2), 
#              alpha = 0.2, color = "blue", size = 2) +
#   # Base Case Punkt
#   geom_point(data = df2, aes(x = value1, y = value2), 
#              color = "red", size = 4)
Code
df.acl <- rio::import("../data/df.acl.csv")

# Extrahiere incremental cost values
var.incr.cost <- df.acl %>%
  filter(Category == "Cost") %>%
  pull(Incremental)

var.incr.utility <- df.acl %>%
  filter(Category == "Utility") %>%
  pull(Incremental)

# Incrementellen Werte in Datensatz einfügen
df.acl.base <- data.frame(
  Scenario = "Base Case",
  Incr.cost = var.incr.cost,
  Incr.utility = var.incr.utility)

# Anzahl der Simulationen
n_sim <- 5000

# Unsicherheiten (z.B. Standardabweichungen) für die Monte-Carlo-Simulation
sd.cost <- 150
sd.utility <- 0.0005

# Monte-Carlo-Simulation durchführen
# Beachte: Wir gehen hier von einer Normalverteilung der Kostendaten aus.
# Im Normalfall sind diese eher Gamma verteilt.
set.seed(123)
sim.cost <- rnorm(n_sim, mean = var.incr.cost, sd = sd.cost)
sim.utility <- rnorm(n_sim, mean = var.incr.utility, sd = sd.utility)
# für Gamma Verteilung:
# var.cost <- sd.cost^2
# shape <- var.incr.cost^2 / var.cost
# scale <- var.cost / abs(var.incr.cost) # Gamma Verteilung nur möglich mit positiven Zahlen, zur Darstellung hier nun die Absoluten Werte verendet
# shape <- var.incr.cost^2/var.cost
# sim.cost <- rgamma(n_sim, shape = shape, scale = scale)

# Simulierte Daten
df.acl.sim <- data.frame(
  Incr.cost = sim.cost,
  Incr.utility = sim.utility)

# CUP
ggplot.acl.cup <- ggplot() +
  # Simulierte Werte
  geom_point(data = df.acl.sim, aes(x = Incr.utility, y = Incr.cost), 
             alpha = 0.2, color = "blue", size = 2) +
  # Base Case Punkt
  geom_point(data = df.acl.base, aes(x = Incr.utility, y = Incr.cost), 
             color = "red", size = 4) +
  # Achsenbeschriftungen
  labs(
    title = "Cost-Utility Plane: Base case with PSA",
    x = "Incremental Utility",
    y = "Incremental Costs (2025 Swiss Francs)"
  ) +
  xlim(-0.005, 0.005) +
  ylim(-800, 800) +
  geom_vline(xintercept = 0, linetype = "solid", color = "black", size = 0.3) +
  geom_hline(yintercept = 0, linetype = "solid", color = "black", size = 0.3) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5),
    legend.position = "none")

ggplot.acl.cup

9.2 Introduction to dampack

Code
library(ggplot2)
library(dampack)

# Build absolute cost and utility matrices
# Comparator = baseline (0, 0)
cost_mat <- cbind(Comparator = rep(0, n_sim),
                  Intervention = sim.cost)

effect_mat <- cbind(Comparator = rep(0, n_sim),
                    Intervention = sim.utility)

psa.acl <- make_psa_obj(
  cost = cost_mat,
  effectiveness = effect_mat,
  strategies = c("Comparator", "Intervention"))

plot(
  psa.acl,
  center = TRUE,       # show mean point
  ellipse = TRUE,      # show confidence ellipse
  alpha = 0.2,         # transparency
  col = "full",        # color mode
  xlim = c(-0.005, 0.005),
  ylim = c(-800, 800))

Code
library(ggplot2)
library(dampack)

# WTP thresholds in CHF per QALY
wtp <- seq(0, 200000, by = 5000)

# Create CEAC object
ceac.acl <- ceac(psa = psa.acl, wtp = wtp)
plot(ceac.acl, frontier = TRUE, points = TRUE)

Code
ceac(wtp, psa.acl)
      WTP     Strategy Proportion On_Frontier
1       0   Comparator     0.0568       FALSE
2       0 Intervention     0.9432        TRUE
3    5000   Comparator     0.0444       FALSE
4    5000 Intervention     0.9556        TRUE
5   10000   Comparator     0.0364       FALSE
6   10000 Intervention     0.9636        TRUE
7   15000   Comparator     0.0298       FALSE
8   15000 Intervention     0.9702        TRUE
9   20000   Comparator     0.0234       FALSE
10  20000 Intervention     0.9766        TRUE
11  25000   Comparator     0.0182       FALSE
12  25000 Intervention     0.9818        TRUE
13  30000   Comparator     0.0142       FALSE
14  30000 Intervention     0.9858        TRUE
15  35000   Comparator     0.0114       FALSE
16  35000 Intervention     0.9886        TRUE
17  40000   Comparator     0.0094       FALSE
18  40000 Intervention     0.9906        TRUE
19  45000   Comparator     0.0074       FALSE
20  45000 Intervention     0.9926        TRUE
21  50000   Comparator     0.0060       FALSE
22  50000 Intervention     0.9940        TRUE
23  55000   Comparator     0.0046       FALSE
24  55000 Intervention     0.9954        TRUE
25  60000   Comparator     0.0034       FALSE
26  60000 Intervention     0.9966        TRUE
27  65000   Comparator     0.0028       FALSE
28  65000 Intervention     0.9972        TRUE
29  70000   Comparator     0.0020       FALSE
30  70000 Intervention     0.9980        TRUE
31  75000   Comparator     0.0014       FALSE
32  75000 Intervention     0.9986        TRUE
33  80000   Comparator     0.0010       FALSE
34  80000 Intervention     0.9990        TRUE
35  85000   Comparator     0.0004       FALSE
36  85000 Intervention     0.9996        TRUE
37  90000   Comparator     0.0002       FALSE
38  90000 Intervention     0.9998        TRUE
39  95000   Comparator     0.0002       FALSE
40  95000 Intervention     0.9998        TRUE
41 100000   Comparator     0.0002       FALSE
42 100000 Intervention     0.9998        TRUE
43 105000   Comparator     0.0000       FALSE
44 105000 Intervention     1.0000        TRUE
45 110000   Comparator     0.0000       FALSE
46 110000 Intervention     1.0000        TRUE
47 115000   Comparator     0.0000       FALSE
48 115000 Intervention     1.0000        TRUE
49 120000   Comparator     0.0000       FALSE
50 120000 Intervention     1.0000        TRUE
51 125000   Comparator     0.0000       FALSE
52 125000 Intervention     1.0000        TRUE
53 130000   Comparator     0.0000       FALSE
54 130000 Intervention     1.0000        TRUE
55 135000   Comparator     0.0000       FALSE
56 135000 Intervention     1.0000        TRUE
57 140000   Comparator     0.0000       FALSE
58 140000 Intervention     1.0000        TRUE
59 145000   Comparator     0.0000       FALSE
60 145000 Intervention     1.0000        TRUE
61 150000   Comparator     0.0000       FALSE
62 150000 Intervention     1.0000        TRUE
63 155000   Comparator     0.0000       FALSE
64 155000 Intervention     1.0000        TRUE
65 160000   Comparator     0.0000       FALSE
66 160000 Intervention     1.0000        TRUE
67 165000   Comparator     0.0000       FALSE
68 165000 Intervention     1.0000        TRUE
69 170000   Comparator     0.0000       FALSE
70 170000 Intervention     1.0000        TRUE
71 175000   Comparator     0.0000       FALSE
72 175000 Intervention     1.0000        TRUE
73 180000   Comparator     0.0000       FALSE
74 180000 Intervention     1.0000        TRUE
75 185000   Comparator     0.0000       FALSE
76 185000 Intervention     1.0000        TRUE
77 190000   Comparator     0.0000       FALSE
78 190000 Intervention     1.0000        TRUE
79 195000   Comparator     0.0000       FALSE
80 195000 Intervention     1.0000        TRUE
81 200000   Comparator     0.0000       FALSE
82 200000 Intervention     1.0000        TRUE
Code
calculate_icers_psa(psa.acl, uncertainty = TRUE) # be aware of negative ICERS
      Strategy      Cost Lower_95_Cost Upper_95_Cost      Effect
1 Intervention -235.0354     -532.4012      58.35823 0.003097913
2   Comparator    0.0000        0.0000       0.00000 0.000000000
  Lower_95_Effect Upper_95_Effect Inc_Cost Inc_Effect ICER Status
1     0.002112705     0.004073618       NA         NA   NA     ND
2     0.000000000     0.000000000       NA         NA   NA      D

9.3 Szenarioanalysen

Code
# Using reported 95% CI values for incr.costs and incr.utility
df.acl.scenario <- data.frame(
  Scenario = c("Base Case", "Best Case", "Worst Case"),
  Incr.cost = c(-234.95, -307.59, 43.2069),
  Incr.utility = c(0.0031, 0.0037, 0.0011))

df.acl.scenario$Scenario <- factor(df.acl.scenario$Scenario,
                                   levels = c("Base Case", "Best Case", "Worst Case"))

# Define colours
colors <- c("Base Case" = "black", 
            "Best Case" = "green", 
            "Worst Case" = "red")

# Plot entwickeln mit definierten Achsenbereichen und dickeren Achsenlinien bei x=0 und y=0
ggplot.acl.scenario <- ggplot(df.acl.scenario, aes(x = Incr.utility, y = Incr.cost, color = Scenario)) +
  geom_point(size = 3) +
  scale_color_manual(values = colors) +
  labs(
    title = "Cost-Utility Plane: Base case, best and worst case scenario",
    x = "Incremental Utility",
    y = "Incremental Costs (2025 Swiss Francs)",
    color = "Szenario"
  ) +
  xlim(-0.001, 0.006) +
  ylim(-400, 200) +
  geom_vline(xintercept = 0, linetype = "solid", color = "black", size = 0.3) +
  geom_hline(yintercept = 0, linetype = "solid", color = "black", size = 0.3) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5),
    legend.position = "right",
    legend.title = element_text(size = 10),
    legend.text = element_text(size = 10),
    legend.key = element_blank())

print(ggplot.acl.scenario)

10 Bootstrapping Cost of illness data

  • Berechne nun für die direkten medizinischen Kosten für 2025 in CHF die 95% CI Werte via non-parametric bootstrapping (R= 5000) mit Hilfe des boot Packages in R (Beachte: Definiere davor den Seed e.g. set.seed(123))
Code
# library(boot)
# set.seed(123)

# Bootstrap function for mean direct medical costs
# fct.boot.dir.med <- function(data, i) {
#   df <- data[i, ]  # bootstrap sample
#   mean_cost <- mean(df$`Direct medical (2025 CHF)`, na.rm = TRUE)
#   return(mean_cost)
# }

# Bootstrapping with 5000 iterations
# boot.xy <- boot(df, fct.boot.dir.med, R = 5000)

# Mean estimate (bootstrap distribution mean)
# t_d <- mean(boot.xy$t)

# Confidence interval (basic bootstrap CI)
# ci_formula <- function(orig_mean, boot_samples) {
#   lower <- 2 * orig_mean - quantile(boot_samples, probs = 0.975, na.rm = TRUE)
#   upper <- 2 * orig_mean - quantile(boot_samples, probs = 0.025, na.rm = TRUE)
#   return(c(lower, upper))
# }
# ci_xy <- ci_formula(t_d, boot.xy$t)
# 
# # Results dataframe
# df.new <- data.frame(
#   Statistic = "Mean Direct medical Cost (Overall)",
#   Value     = round(t_d, 2),
#   Lower     = round(ci_xy[1], 2),
#   Upper     = round(ci_xy[2], 2)
# )
# 
# print(df.new)
Code
library(boot)
set.seed(123)

# Bootstrap function for mean direct medical costs
fct.boot.dir.med <- function(data, i) {
  df <- data[i, ]  # bootstrap sample
  mean_cost <- mean(df$`Direct medical (2025 CHF)`, na.rm = TRUE)
  return(mean_cost)
}

# Bootstrapping with 5000 iterations
boot.dir.med <- boot(df.valued, fct.boot.dir.med, R = 5000)

# Mean estimate (bootstrap distribution mean)
t_dir_med_mean <- mean(boot.dir.med$t)

# Confidence interval (basic bootstrap CI)
ci_formula <- function(orig_mean, boot_samples) {
  lower <- 2 * orig_mean - quantile(boot_samples, probs = 0.975, na.rm = TRUE)
  upper <- 2 * orig_mean - quantile(boot_samples, probs = 0.025, na.rm = TRUE)
  return(c(lower, upper))
}
ci_dir_med_mean <- ci_formula(t_dir_med_mean, boot.dir.med$t)

# Results dataframe
df.boot.direct.medical.costs <- data.frame(
  Statistic = "Mean Direct medical Cost (Overall)",
  Value     = round(t_dir_med_mean, 2),
  Lower     = round(ci_dir_med_mean[1], 2),
  Upper     = round(ci_dir_med_mean[2], 2)
)

print(df.boot.direct.medical.costs)
                               Statistic  Value Lower Upper
97.5% Mean Direct medical Cost (Overall) 126.17 78.21 170.5
Code
set.seed(123)

# Number of bootstrap replications
n.boot <- 5000

# Create vector for bootstrapped overall costs
total.costs.boot.overall <- vector("numeric", n.boot)

# Bootstrapping loop (overall, no BMI groups)
for(i in 1:n.boot){
  
  dir.med <- sample(df.valued$`Direct medical (2025 CHF)`, 
                    nrow(df.valued), replace = TRUE)
  dir.nonmed <- sample(df.valued$`Direct non-medical (2025 CHF)`, 
                       nrow(df.valued), replace = TRUE)
  abs.costs <- sample(df.valued$`Absenteeism (2025 CHF)`, 
                      nrow(df.valued), replace = TRUE)
  pres.costs <- sample(df.valued$`Presenteeism (2025 CHF)`, 
                       nrow(df.valued), replace = TRUE)
  indir.costs <- sample(df.valued$`Indirect costs (2025 CHF)`, 
                        nrow(df.valued), replace = TRUE)
  
  total.costs.boot.overall[i] <- mean(dir.med + dir.nonmed + abs.costs + pres.costs)
}

# Bootstrap CI function
ci_formula_total <- function(orig_mean, boot_samples) {
  lower <- 2 * orig_mean - quantile(boot_samples, probs = 0.975, na.rm = TRUE)
  upper <- 2 * orig_mean - quantile(boot_samples, probs = 0.025, na.rm = TRUE)
  return(c(lower = lower, upper = upper))
}

# Compute mean + CI
mean.overall <- mean(total.costs.boot.overall)
ci.total.costs.overall <- ci_formula_total(mean.overall, total.costs.boot.overall)

# Create output dataframe
df.boot.total.mean <- data.frame(
  Statistic = "Total Mean Societal Costs (Overall)",
  Value = round(mean.overall, 2),
  Lower = round(ci.total.costs.overall[1], 2),
  Upper = round(ci.total.costs.overall[2], 2)
)

print(df.boot.total.mean)
                                      Statistic   Value   Lower   Upper
lower.97.5% Total Mean Societal Costs (Overall) 3305.75 1809.94 4576.92