Years of Life Lost (YLL) and Disability-Adjusted Life Years (DALY) Analysis

A Comprehensive Burden of Disease Study — Kenya, 2019

Author

Timothy Achala

Published

May 2, 2026

1 Introduction and Conceptual Framework

1.1 Background

The Global Burden of Disease (GBD) framework provides a systematic approach to quantifying health loss across populations. Its three core metrics are:

Metric | What it Measures | Components
YLL | Premature mortality burden | Deaths × remaining life expectancy
YLD | Non-fatal health burden | Incident cases × disability weight × duration (or prevalent cases × disability weight)
DALY | Total health burden | YLL + YLD

One DALY = one lost year of healthy life. It represents the gap between current health status and an ideal situation where everyone lives in full health to an old age.

1.2 Epidemiological Justification

1.2.1 Why YLL Matters

Years of Life Lost penalises deaths at younger ages more heavily than deaths at older ages. A child who dies at age 2 contributes far more YLL than an 80-year-old dying of the same disease. This is crucial for priority-setting because it focuses attention on preventable premature mortality.

1.2.2 Why DALY Is Superior to Simple Death Counts

Death counts ignore:

  1. Non-fatal burden: Depression rarely kills but causes enormous suffering.
  2. Age at death: A death at 25 is more tragic (in terms of life-years) than at 85.
  3. Severity of disability: A 10-year episode of mild anaemia ≠ 10 years of severe HIV.

DALYs address all three limitations simultaneously.


2 Data Loading and Exploration

2.1 Load the Dataset

Code
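# ─────────────────────────────────────────────────────────────────
# PACKAGES
# The setup chunk is not shown on the rendered page; this list is
# inferred from the functions called in the document.
# theme_burden() is a custom ggplot2 theme that is not defined in
# the visible code.
# ─────────────────────────────────────────────────────────────────
library(tidyverse)   # readr, dplyr, tidyr, ggplot2, forcats
library(gt)          # gt tables
library(knitr)       # kable()
library(kableExtra)  # kable_styling()
library(scales)      # comma(), label_comma()
library(glue)        # glue()
library(ggrepel)     # geom_text_repel()

# ─────────────────────────────────────────────────────────────────
# READ DATA
# The CSV contains 128 rows covering 8 diseases × 8 age groups × 2 sexes
# in Kenya, 2019 (GBD reference year).
# ─────────────────────────────────────────────────────────────────
raw_data <- read_csv("burden_of_disease_dataset.csv",
                     show_col_types = FALSE)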

# Convert categorical variables to factors with meaningful ordering
data <- raw_data |>
  mutate(
    # Ordered age groups ensure correct axis ordering in plots
    age_group = factor(age_group, levels = c(
      "0-4", "5-14", "15-29", "30-44", "45-59", "60-69", "70-79", "80+"
    )),
    sex              = factor(sex,              levels = c("Male", "Female")),
    disease_category = factor(disease_category, levels = c(
      "Communicable", "Non-Communicable", "Cardiovascular", "Mental Health", "Injuries"
    )),
    # Flag: does this disease cause direct mortality?
    cause_of_death = as.logical(cause_of_death == "Yes"),
    # Coerce age_at_death to numeric; any non-numeric entries become NA (warnings suppressed)
    age_at_death = suppressWarnings(as.numeric(age_at_death))
  )

# Quick confirmation
cat("Dimensions:", nrow(data), "rows ×", ncol(data), "columns\n")
Dimensions: 128 rows × 22 columns
Code
cat("Diseases:", paste(unique(data$disease_name), collapse = ", "), "\n")
Diseases: Ischemic Heart Disease, Stroke, Lower Respiratory Infections, HIV/AIDS, Diabetes Mellitus, Malaria, Depressive Disorders, Road Injuries 
Code
cat("Age groups:", paste(levels(data$age_group), collapse = ", "), "\n")
Age groups: 0-4, 5-14, 15-29, 30-44, 45-59, 60-69, 70-79, 80+ 

Interpretation: The dataset captures 8 major disease causes (Ischemic Heart Disease, Stroke, Lower Respiratory Infections, HIV/AIDS, Diabetes Mellitus, Malaria, Depressive Disorders, Road Injuries) across 8 age strata for both sexes in Kenya, 2019. The 128 records (8 × 8 × 2) allow stratified analysis by age, sex, disease category, and cause type.
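The row count can be checked against the study design: a complete grid of 8 diseases, 8 age groups and 2 sexes has 8 × 8 × 2 = 128 cells. The following is a minimal sketch that rebuilds the grid from the labels printed in the console output rather than from the CSV itself; the object name design_grid is illustrative and not part of the original analysis.

```r
# Labels copied from the console output above
diseases <- c("Ischemic Heart Disease", "Stroke", "Lower Respiratory Infections",
              "HIV/AIDS", "Diabetes Mellitus", "Malaria",
              "Depressive Disorders", "Road Injuries")
age_groups <- c("0-4", "5-14", "15-29", "30-44", "45-59", "60-69", "70-79", "80+")
sexes <- c("Male", "Female")

# Full factorial design: one row per disease x age group x sex
design_grid <- expand.grid(disease_name = diseases,
                           age_group    = age_groups,
                           sex          = sexes,
                           stringsAsFactors = FALSE)

nrow(design_grid)   # 128, matching the reported dimensions
```

Comparing this grid with the loaded data on the three key columns would reveal any missing or duplicated stratum.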

2.2 Data Quality Check

Code
# ─────────────────────────────────────────────────────────────────
# MISSING VALUE AUDIT
# age_at_death is legitimately NA for non-fatal diseases
# (e.g., Depressive Disorders where cause_of_death = FALSE)
# ─────────────────────────────────────────────────────────────────
missing_summary <- data |>
  summarise(across(everything(), ~sum(is.na(.)))) |>
  pivot_longer(everything(), names_to = "Variable", values_to = "Missing_n") |>
  filter(Missing_n > 0) |>
  mutate(
    Missing_pct = round(Missing_n / nrow(data) * 100, 1),
    Explanation = case_when(
      Variable == "age_at_death" ~ "Expected: non-fatal diseases have no age at death",
      TRUE ~ "Investigate further"
    )
  )

missing_summary |>
  gt() |>
  tab_header(title = "Missing Data Audit") |>
  cols_label(Variable = "Column", Missing_n = "N Missing",
             Missing_pct = "% Missing", Explanation = "Reason") |>
  tab_style(style = cell_fill(color = "#FFF3CD"),
            locations = cells_body(rows = Missing_n > 0))
Missing Data Audit
Column N Missing % Missing Reason
age_at_death 2 1.6 Expected: non-fatal diseases have no age at death

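Interpretation: The only missing values are in age_at_death (2 of 128 rows, 1.6%). They are attributed to Depressive Disorders, which is classified as primarily non-fatal in this dataset, so no age-at-death value applies. The audit table does not show which rows are affected, and the sample table in Section 3.2 includes a Depressive Disorders row with a recorded age at death, so only some of that disease's 16 rows can be missing. No other column has missing values.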
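Because the audit reports how many values are missing but not where, a per-disease breakdown is a useful follow-up. The sketch below uses a small made-up data frame (toy and its values are hypothetical; only the column names follow the dataset) to show the pattern.

```r
# Hypothetical toy rows; only the column names mirror the real dataset
toy <- data.frame(
  disease_name = c("Depressive Disorders", "Depressive Disorders",
                   "Malaria", "Stroke"),
  age_at_death = c(NA, 65.2, 1.8, NA)
)

# Count missing age_at_death values per disease
na_by_disease <- tapply(is.na(toy$age_at_death), toy$disease_name, sum)
na_by_disease

# Diseases with at least one missing value
names(na_by_disease)[na_by_disease > 0]
```

Run on the real data object, the same tapply() call would confirm whether both missing values fall in Depressive Disorders rows.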

2.3 Descriptive Summary Table

Code
# ─────────────────────────────────────────────────────────────────
# SUMMARY BY DISEASE AND SEX
# Aggregates key burden indicators before YLL/DALY calculation
# ─────────────────────────────────────────────────────────────────
summary_tbl <- data |>
  group_by(disease_name, disease_category, sex) |>
  summarise(
    Total_Deaths      = sum(deaths, na.rm = TRUE),
    Total_Incident    = sum(incident_cases, na.rm = TRUE),
    Total_Prevalent   = sum(prevalent_cases, na.rm = TRUE),
    Mean_Dis_Weight   = round(mean(disability_weight), 3),
    .groups = "drop"
  ) |>
  arrange(disease_category, disease_name, sex)

summary_tbl |>
  gt() |>
  tab_header(
    title    = "Burden of Disease: Summary Statistics by Cause and Sex",
    subtitle = "Kenya, 2019"
  ) |>
  fmt_number(columns = c(Total_Deaths, Total_Incident, Total_Prevalent),
             use_seps = TRUE, decimals = 0) |>
  cols_label(
    disease_name     = "Disease",
    disease_category = "Category",
    sex              = "Sex",
    Total_Deaths     = "Total Deaths",
    Total_Incident   = "Incident Cases",
    Total_Prevalent  = "Prevalent Cases",
    Mean_Dis_Weight  = "Mean DW"
  ) |>
  tab_row_group(label = "Cardiovascular",  rows = disease_category == "Cardiovascular") |>
  tab_row_group(label = "Communicable",    rows = disease_category == "Communicable") |>
  tab_row_group(label = "Mental Health",   rows = disease_category == "Mental Health") |>
  tab_row_group(label = "Non-Communicable",rows = disease_category == "Non-Communicable") |>
  tab_row_group(label = "Injuries",        rows = disease_category == "Injuries") |>
  tab_style(style = cell_fill(color = "#E8F4FD"),
            locations = cells_row_groups())
Burden of Disease: Summary Statistics by Cause and Sex
Kenya, 2019
Disease Category Sex Total Deaths Incident Cases Prevalent Cases Mean DW
Injuries
Road Injuries Injuries Male 7,815 75,500 175,600 0.370
Road Injuries Injuries Female 2,320 29,750 69,450 0.370
Non-Communicable
Diabetes Mellitus Non-Communicable Male 13,212 39,443 116,540 0.049
Diabetes Mellitus Non-Communicable Female 10,730 33,157 98,287 0.049
Mental Health
Depressive Disorders Mental Health Male 311 39,770 118,350 0.145
Depressive Disorders Mental Health Female 440 59,695 178,980 0.145
Communicable
HIV/AIDS Communicable Male 14,610 42,400 202,600 0.547
HIV/AIDS Communicable Female 16,660 47,670 225,600 0.547
Lower Respiratory Infections Communicable Male 11,570 179,500 330,000 0.279
Lower Respiratory Infections Communicable Female 9,320 150,000 279,000 0.279
Malaria Communicable Male 8,090 382,000 678,500 0.186
Malaria Communicable Female 7,330 361,800 643,000 0.186
Cardiovascular
Ischemic Heart Disease Cardiovascular Male 15,132 30,445 83,300 0.432
Ischemic Heart Disease Cardiovascular Female 11,899 25,485 71,020 0.432
Stroke Cardiovascular Male 17,841 35,175 94,530 0.552
Stroke Cardiovascular Female 14,042 29,271 79,455 0.552

3 Years of Life Lost (YLL) Calculation

3.1 Theoretical Framework

YLL measures the years of life a person would have lived had they not died prematurely. The GBD uses the standard expected years of life lost formula:

\[\text{YLL}_{age,sex} = N_{deaths} \times L_{age,sex}\]

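where \(L_{age,sex}\) is the remaining life expectancy at the age of death according to a standard life table. Reference values differ between studies: the GBD 2010 reference life table has a life expectancy at birth of 86.0 years for both sexes, while the WHO Global Health Estimates use a frontier of about 91.9 years. Here we use a single 86-year frontier for both sexes.

In this analysis the remaining life expectancy for someone dying at age \(a\) is approximated as:

\[L(a) = \text{Standard Life Expectancy} - a\]

This means a 2-year-old dying contributes \((86 - 2) = 84\) YLL, while a 75-year-old dying contributes only \((86 - 75) = 11\) YLL. The subtraction is a simplification: a full standard life table assigns some remaining life expectancy at every age, so this rule understates YLL for deaths at the oldest ages.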
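The arithmetic can be written as a one-line helper. This is a sketch of the simplified rule only; the function name remaining_le and the 100-death example are illustrative assumptions, not part of the original code.

```r
STANDARD_LE <- 86   # unified frontier used in this analysis

# Remaining life expectancy under the simplified rule L(a) = 86 - a,
# floored at zero for deaths beyond the frontier
remaining_le <- function(age) pmax(STANDARD_LE - age, 0)

remaining_le(c(2, 75, 90))        # 84 11 0

# YLL for a hypothetical 100 deaths at each of those ages
100 * remaining_le(c(2, 75, 90))  # 8400 1100 0
```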

3.2 YLL Computation in R

Code
# ─────────────────────────────────────────────────────────────────
# YLL CALCULATION
#
# Formula: YLL = Deaths × (Standard LE - Age at Death)
#
# Key design decisions:
# 1. We use the GBD standard LE of 86.0 years (reference frontier).
# 2. Mid-point of the age group is used as the proxy age at death
#    when age_at_death is missing (only for non-fatal diseases).
# 3. YLL is set to 0 for non-fatal diseases (cause_of_death = FALSE).
# 4. We add a floor: if (standard_LE - age_at_death) < 0, YLL = 0
#    to handle the rare case where age_at_death exceeds the standard.
# ─────────────────────────────────────────────────────────────────
STANDARD_LE <- 86.0   # GBD universal standard life expectancy (years)

yll_data <- data |>
  mutate(
    # ── Mid-point of age group for cases where age_at_death is NA ──
    age_midpoint = (age_group_lower + age_group_upper) / 2,

    # ── Use actual recorded age at death, else fall back to midpoint ──
    effective_age_at_death = coalesce(age_at_death, age_midpoint),

    # ── Remaining life expectancy at age of death ──
    remaining_LE = pmax(STANDARD_LE - effective_age_at_death, 0),

    # ── YLL = Deaths × Remaining LE ──
    # Non-fatal diseases (cause_of_death = FALSE) get YLL = 0
    YLL = if_else(cause_of_death, deaths * remaining_LE, 0),

    # ── YLL rate per 100,000 population (for comparability) ──
    YLL_rate_100k = (YLL / population) * 100000
  )

# Show a sample of the computation
yll_data |>
  filter(deaths > 0) |>
  select(disease_name, age_group, sex, deaths, effective_age_at_death,
         remaining_LE, YLL, YLL_rate_100k) |>
  slice_sample(n = 12) |>
  arrange(disease_name, age_group) |>
  kable(
    caption  = "Sample YLL Computations (12 randomly selected rows)",
    col.names = c("Disease", "Age Group", "Sex", "Deaths",
                  "Age at Death", "Remaining LE", "YLL", "YLL/100k"),
    digits   = c(0, 0, 0, 0, 1, 1, 0, 1),
    format.args = list(big.mark = ",")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Sample YLL Computations (12 randomly selected rows)
Disease Age Group Sex Deaths Age at Death Remaining LE YLL YLL/100k
Depressive Disorders 60-69 Male 38 65.2 20.8 0 0.0
Diabetes Mellitus 5-14 Male 12 9.4 76.6 919 38.3
Diabetes Mellitus 80+ Female 3,800 84.3 1.7 6,460 3,588.9
HIV/AIDS 5-14 Female 420 10.1 75.9 31,878 1,386.0
HIV/AIDS 15-29 Male 2,800 22.5 63.5 177,800 4,678.9
HIV/AIDS 80+ Male 180 83.4 2.6 468 312.0
Lower Respiratory Infections 0-4 Male 2,800 1.8 84.2 235,760 19,646.7
Lower Respiratory Infections 5-14 Male 420 9.1 76.9 32,298 1,345.8
Lower Respiratory Infections 60-69 Male 1,400 64.8 21.2 29,680 3,491.8
Lower Respiratory Infections 80+ Male 3,200 83.2 2.8 8,960 5,973.3
Stroke 0-4 Male 18 2.5 83.5 1,503 125.2
Stroke 80+ Male 6,200 83.8 2.2 13,640 9,093.3

Interpretation of YLL computation logic:

  • remaining_LE is the core multiplier. A child dying at age 2 carries 84 remaining years; a person dying at 75 carries only 11. This is the mathematical expression of the social preference for averting premature death.
  • pmax(..., 0) ensures we never obtain negative YLL, which could theoretically occur if someone dies beyond the 86-year frontier (edge case).
  • coalesce() gracefully handles diseases with no recorded death age by substituting the age-group midpoint — a standard epidemiological imputation.
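As a check on the logic described above, three rows of the sample table can be reproduced by hand. The sketch uses base R only; the midpoint example at the end assumes a 30-44 stratum with bounds 30 and 44, which is an illustration rather than a value read from the dataset.

```r
STANDARD_LE <- 86

# Three rows copied from the sample table above
rows <- data.frame(
  disease      = c("HIV/AIDS", "Lower Respiratory Infections", "Stroke"),
  deaths       = c(2800, 2800, 6200),
  age_at_death = c(22.5, 1.8, 83.8)
)

rows$remaining_LE <- pmax(STANDARD_LE - rows$age_at_death, 0)
rows$YLL <- rows$deaths * rows$remaining_LE
rows   # YLL: 177800, 235760, 13640, as in the table

# Midpoint fallback, the base-R equivalent of dplyr::coalesce():
# a missing age in a hypothetical 30-44 stratum becomes (30 + 44) / 2 = 37
age      <- c(NA, 41)
midpoint <- c(37, 37)
ifelse(is.na(age), midpoint, age)   # 37 41
```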

3.3 YLL by Disease and Age Group

Code
# ─────────────────────────────────────────────────────────────────
# AGGREGATE YLL BY DISEASE AND AGE GROUP
# This is the primary YLL summary table for interpretation
# ─────────────────────────────────────────────────────────────────
yll_disease_age <- yll_data |>
  group_by(disease_name, disease_category, age_group) |>
  summarise(
    total_deaths = sum(deaths),
    total_YLL    = sum(YLL),
    YLL_rate     = sum(YLL) / sum(population) * 100000,
    .groups      = "drop"
  )

# Total YLL per disease (both sexes combined)
yll_disease_total <- yll_data |>
  group_by(disease_name, disease_category) |>
  summarise(
    Deaths   = sum(deaths),
    YLL      = sum(YLL),
    YLL_rate = sum(YLL) / sum(population) * 100000,
    .groups  = "drop"
  ) |>
  arrange(desc(YLL))

yll_disease_total |>
  gt() |>
  tab_header(
    title    = "Total YLL by Disease Cause",
    subtitle = "Kenya, 2019 — All ages, both sexes combined"
  ) |>
  fmt_number(columns = c(Deaths, YLL), use_seps = TRUE, decimals = 0) |>
  fmt_number(columns = YLL_rate, decimals = 1) |>
  cols_label(
    disease_name     = "Disease",
    disease_category = "Category",
    Deaths           = "Total Deaths",
    YLL              = "Total YLL",
    YLL_rate         = "YLL Rate (per 100k)"
  ) |>
  data_color(
    columns = YLL,
    palette = "Blues"
  ) |>
  tab_footnote(
    footnote  = "YLL Rate standardised to population size per disease stratum.",
    locations = cells_column_labels(columns = YLL_rate)
  )
Total YLL by Disease Cause
Kenya, 2019 — All ages, both sexes combined
Disease Category Total Deaths Total YLL YLL Rate (per 100k)1
HIV/AIDS Communicable 31,270 1,509,929 5,392.6
Malaria Communicable 15,420 1,007,966 3,599.9
Lower Respiratory Infections Communicable 20,890 709,141 2,532.6
Stroke Cardiovascular 31,883 427,099 1,525.4
Road Injuries Injuries 10,135 421,772 1,506.3
Ischemic Heart Disease Cardiovascular 27,031 354,507 1,266.1
Diabetes Mellitus Non-Communicable 23,942 352,611 1,259.3
Depressive Disorders Mental Health 751 0 0.0
1 YLL Rate standardised to population size per disease stratum.
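The rate column is a crude rate: total YLL divided by the population summed over the disease's strata. Rearranging the formula recovers the implied denominator, which is a quick way to confirm that every disease is divided by the same population. A minimal sketch using four rows of the table (the object names are illustrative):

```r
# Totals copied from the table above
yll  <- c("HIV/AIDS" = 1509929, "Malaria" = 1007966,
          "Lower Respiratory Infections" = 709141, "Stroke" = 427099)
rate <- c(5392.6, 3599.9, 2532.6, 1525.4)   # YLL per 100,000

# Rearranging rate = YLL / population * 1e5 gives the implied denominator
implied_pop <- yll / rate * 1e5
round(implied_pop / 1e6, 1)   # about 28.0 million for every disease
```

Since the denominator is constant, ranking diseases by YLL rate gives the same order as ranking by total YLL.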

3.4 Visualisation: YLL by Disease and Age

Code
# ─────────────────────────────────────────────────────────────────
# HEATMAP: YLL per disease × age group
# Log scale used because the range is very wide (child deaths create
# very high YLL despite fewer absolute numbers).
# ─────────────────────────────────────────────────────────────────
yll_heat <- yll_data |>
  group_by(disease_name, age_group) |>
  summarise(YLL = sum(YLL), .groups = "drop") |>
  filter(YLL > 0)

ggplot(yll_heat, aes(x = age_group, y = fct_reorder(disease_name, YLL, sum),
                     fill = YLL)) +
  geom_tile(colour = "white", linewidth = 0.5) +
  geom_text(aes(label = if_else(YLL >= 1000,
                                paste0(round(YLL / 1000, 0), "k"),
                                as.character(round(YLL)))),
            size = 3, colour = "white", fontface = "bold") +
  scale_fill_viridis_c(
    name   = "YLL",
    trans  = "log10",
    labels = label_comma(),
    option = "plasma"
  ) +
  labs(
    title    = "Years of Life Lost by Disease and Age Group",
    subtitle = "Kenya 2019 — Both sexes combined; colour on log₁₀ scale",
    x        = "Age Group", y = NULL,
    caption  = "Values > 1000 shown as 'Xk'. Source: Simulated GBD-style dataset."
  ) +
  theme_burden()

Figure 1: Heatmap of YLL by Disease and Age Group, Kenya 2019

Interpretation of heatmap:

  • HIV/AIDS in the 30–44 age group shows the brightest cells (the plasma palette maps high values to yellow), reflecting the catastrophic toll of HIV on working-age adults in sub-Saharan Africa. Deaths at this age carry approximately 48 remaining life years each.
  • Malaria in children aged 0–4 generates enormous YLL despite lower death counts than cardiovascular diseases, purely because each death removes ~84 years of potential life.
  • Cardiovascular diseases (IHD, Stroke) peak in the 60–79 age bands. While death counts are highest here, YLL per death is low (roughly 7–26 years remaining under the 86-year frontier), explaining their moderate YLL despite high mortality.
  • Depressive Disorders does not appear in the heatmap at all: its YLL is set to 0 because cause_of_death is FALSE, and filter(YLL > 0) drops those rows. The few recorded deaths (751 in total, described as suicide-related) therefore contribute no YLL in this analysis.

3.5 YLL by Sex — Population Pyramid Style

Code
# ─────────────────────────────────────────────────────────────────
# DIVERGING BAR CHART (POPULATION PYRAMID STYLE)
# Males plotted left (negative values), females right (positive)
# This makes sex disparities immediately visible
# ─────────────────────────────────────────────────────────────────
yll_sex <- yll_data |>
  group_by(age_group, sex) |>
  summarise(YLL = sum(YLL), .groups = "drop") |>
  mutate(YLL_plot = if_else(sex == "Male", -YLL, YLL))

ggplot(yll_sex, aes(x = YLL_plot, y = age_group, fill = sex)) +
  geom_col(alpha = 0.85, width = 0.75) +
  geom_vline(xintercept = 0, colour = "black", linewidth = 0.8) +
  scale_x_continuous(
    labels = function(x) paste0(comma(abs(x / 1e3)), "k"),
    breaks = pretty(c(-max(abs(yll_sex$YLL_plot)), max(abs(yll_sex$YLL_plot))), 6)
  ) +
  scale_fill_manual(values = c("Male" = "#2166AC", "Female" = "#D6604D")) +
  labs(
    title    = "Years of Life Lost by Age Group and Sex",
    subtitle = "Males (left) vs Females (right) — Kenya 2019",
    x        = "YLL (thousands)",
    y        = "Age Group",
    fill     = "Sex",
    caption  = "Each bar represents total YLL summed across all 8 diseases."
  ) +
  theme_burden() +
  theme(legend.position = "top")

Figure 2: YLL by Age Group and Sex — Population Pyramid

Interpretation — Sex Disparities in YLL:

  • Males bear a heavier YLL burden in almost all age groups, particularly in the 15–44 range. This is driven by (a) higher road injury mortality in males, (b) greater cardiovascular event rates in males, and (c) higher occupational HIV exposure in some contexts.
  • Females aged 15–44 show relatively elevated YLL for their sex, largely due to HIV/AIDS (females account for the majority of new HIV infections in sub-Saharan Africa through heterosexual transmission).
  • The 60+ age groups show converging YLL between sexes, reflecting greater female longevity but higher female prevalence of stroke and diabetes at older ages.

4 Disability-Adjusted Life Years (DALY) Calculation

4.1 Conceptual Components

\[\text{DALY} = \text{YLL} + \text{YLD}\]

4.1.1 YLD (Years Lived with Disability)

\[\text{YLD} = I \times DW \times L\]

where:

  • \(I\) = number of incident cases in the period
  • \(DW\) = disability weight (0 = perfect health; 1 = death), drawn from GBD
  • \(L\) = average duration of the episode (years)

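Alternatively: \(\text{YLD} = P \times DW\) where \(P\) = prevalent cases. GBD 2010 and later rounds, including GBD 2019, compute YLD this way; the incidence-based formula follows the original GBD 1990 methodology. We use the incidence-based approach here, so the YLD figures below are not directly comparable with published GBD 2019 estimates.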
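To make the difference between the two formulas concrete, the sketch below applies each to HIV/AIDS figures that appear elsewhere in this document. The variable names are illustrative, and the two results cover different scopes (one stratum versus all strata), so they show the mechanics rather than a like-for-like comparison.

```r
dw_hiv <- 0.547   # disability weight used for HIV/AIDS in this analysis

# Incidence-based: YLD = I x DW x L
# Females aged 30-44 (20,000 incident cases, 10.5-year duration),
# taken from the verification table in Section 4.2
yld_incidence <- 20000 * dw_hiv * 10.5
yld_incidence     # 114870

# Prevalence-based: YLD = P x DW
# Prevalent cases for both sexes, from the summary table in Section 2.3
prevalent_hiv  <- 202600 + 225600
yld_prevalence <- prevalent_hiv * dw_hiv
yld_prevalence    # 234225.4
```

For reference, the incidence-based total for HIV/AIDS across all strata reported in Section 4.3 is 426,640 YLD, so the choice of approach changes the estimate materially in this dataset.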

4.1.2 Disability Weights Used

Disease | DW | Interpretation
IHD | 0.432 | Moderate-severe: chest pain, fatigue, limitation
Stroke | 0.552 | Severe: paralysis, speech loss, dependency
LRI | 0.279 | Moderate: breathlessness, fever, activity limitation
HIV/AIDS | 0.547 | Severe: immunosuppression, opportunistic infections
Diabetes | 0.049 | Mild-moderate: managed disease with complications
Malaria | 0.186 | Moderate: acute fever, anaemia, prostration
Depression | 0.145 | Mild-moderate: low mood, anhedonia, functional impairment
Road Injuries | 0.370 | Moderate-severe: trauma, fractures, rehabilitation

4.2 DALY Computation

Code
# ─────────────────────────────────────────────────────────────────
# STEP 1: YLD CALCULATION
#
# YLD (incidence-based) = Incident Cases × Disability Weight × Duration
#
# Why incidence-based?
# - Follows the original GBD 1990 approach
#   (GBD 2010 onward, including GBD 2019, uses prevalence-based YLD)
# - Avoids double-counting prevalent cases from previous years
# - More sensitive to new episodes and intervention effects
# ─────────────────────────────────────────────────────────────────
daly_data <- yll_data |>
  mutate(
    # YLD: multiply incident cases by DW and average duration
    YLD = incident_cases * disability_weight * duration_years,

    # DALY = YLL + YLD
    DALY = YLL + YLD,

    # Per-capita rates for population-adjusted comparisons
    YLD_rate_100k  = (YLD  / population) * 100000,
    DALY_rate_100k = (DALY / population) * 100000,

    # Fraction of DALY attributable to YLL vs YLD
    YLL_fraction  = if_else(DALY > 0, YLL  / DALY, 0),
    YLD_fraction  = if_else(DALY > 0, YLD  / DALY, 0)
  )

# Verification: print a readable summary for one disease
daly_data |>
  filter(disease_name == "HIV/AIDS", sex == "Female") |>
  select(age_group, deaths, incident_cases, disability_weight, duration_years,
         YLL, YLD, DALY) |>
  kable(
    caption   = "DALY Computation Verification: HIV/AIDS — Female",
    col.names = c("Age Group", "Deaths", "Incid. Cases", "DW", "Duration (yr)",
                  "YLL", "YLD", "DALY"),
    digits    = c(0, 0, 0, 3, 2, 0, 0, 0),
    format.args = list(big.mark = ",")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
DALY Computation Verification: HIV/AIDS — Female
Age Group Deaths Incid. Cases DW Duration (yr) YLL YLD DALY
0-4 720 1,900 0.547 4.8 60,408 4,989 65,397
5-14 420 1,050 0.547 7.2 31,878 4,135 36,013
15-29 4,200 12,000 0.547 9.2 263,760 60,389 324,149
30-44 7,200 20,000 0.547 10.5 344,160 114,870 459,030
45-59 2,800 8,200 0.547 8.8 92,120 39,472 131,592
60-69 820 2,800 0.547 5.8 17,220 8,883 26,103
70-79 350 1,200 0.547 3.8 3,920 2,494 6,414
80+ 150 520 0.547 2.5 285 711 996

Interpretation of HIV/AIDS DALY computation:

  • In females aged 30–44, HIV generates the highest DALYs because both YLL (high deaths × ~48 remaining years) and YLD (high incidence × DW=0.547 × ~10.5 years duration) are simultaneously maximised.
  • The long duration parameter for HIV (8.8–10.5 years in the 15–59 age groups, shorter at the youngest and oldest ages) substantially amplifies YLD. This is the mathematical signature of a chronic, disabling condition: even with effective ART reducing mortality, the ongoing disability burden remains substantial.
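The verification table can itself be verified by recomputing YLD from its own columns and backing out the remaining life expectancy implied by each YLL value. A minimal sketch in base R, with all numbers copied from the table (the data frame name hiv_f is illustrative):

```r
# Columns copied from the HIV/AIDS (female) verification table above
hiv_f <- data.frame(
  age_group = c("0-4", "5-14", "15-29", "30-44", "45-59", "60-69", "70-79", "80+"),
  deaths    = c(720, 420, 4200, 7200, 2800, 820, 350, 150),
  incident  = c(1900, 1050, 12000, 20000, 8200, 2800, 1200, 520),
  duration  = c(4.8, 7.2, 9.2, 10.5, 8.8, 5.8, 3.8, 2.5),
  YLL       = c(60408, 31878, 263760, 344160, 92120, 17220, 3920, 285),
  YLD       = c(4989, 4135, 60389, 114870, 39472, 8883, 2494, 711)
)

# Recompute YLD = incident cases x DW x duration and compare with the table
hiv_f$YLD_check <- round(hiv_f$incident * 0.547 * hiv_f$duration)
all(hiv_f$YLD_check == hiv_f$YLD)   # TRUE

# Implied remaining life expectancy per death = YLL / deaths
hiv_f$years_per_death <- round(hiv_f$YLL / hiv_f$deaths, 1)
hiv_f[, c("age_group", "years_per_death")]
```

The implied value for the 30–44 group is 47.8 years, which is the source of the "~48 remaining years" quoted in the interpretation.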

4.3 Total DALY Summary

Code
# ─────────────────────────────────────────────────────────────────
# AGGREGATE DALY TABLE — by disease, combining both sexes
# ─────────────────────────────────────────────────────────────────
daly_summary <- daly_data |>
  group_by(disease_name, disease_category) |>
  summarise(
    Deaths      = sum(deaths),
    YLL         = sum(YLL),
    YLD         = sum(YLD),
    DALY        = sum(DALY),
    DALY_rate   = sum(DALY) / sum(population) * 100000,
    YLL_pct     = round(sum(YLL) / sum(DALY) * 100, 1),
    YLD_pct     = round(sum(YLD) / sum(DALY) * 100, 1),
    .groups     = "drop"
  ) |>
  arrange(desc(DALY))

daly_summary |>
  gt() |>
  tab_header(
    title    = "DALY Burden by Disease — Kenya 2019",
    subtitle = "All ages, both sexes; sorted by total DALY burden"
  ) |>
  fmt_number(columns = c(Deaths, YLL, YLD, DALY), use_seps = TRUE, decimals = 0) |>
  fmt_number(columns = DALY_rate, decimals = 1) |>
  fmt_number(columns = c(YLL_pct, YLD_pct), decimals = 1) |>
  cols_label(
    disease_name     = "Disease",
    disease_category = "Category",
    Deaths           = "Deaths",
    YLL              = "YLL",
    YLD              = "YLD",
    DALY             = "Total DALY",
    DALY_rate        = "DALY Rate/100k",
    YLL_pct          = "YLL %",
    YLD_pct          = "YLD %"
  ) |>
  data_color(columns = DALY, palette = "Reds") |>
  tab_footnote("% of total DALY attributable to each component.",
               locations = cells_column_labels(columns = YLL_pct)) |>
  tab_style(
    style     = cell_text(weight = "bold"),
    locations = cells_body(rows = 1)
  )
DALY Burden by Disease — Kenya 2019
All ages, both sexes; sorted by total DALY burden
Disease Category Deaths YLL YLD Total DALY DALY Rate/100k YLL %1 YLD %
HIV/AIDS Communicable 31,270 1,509,929 426,640 1,936,569 6,916.3 78.0 22.0
Malaria Communicable 15,420 1,007,966 646 1,008,612 3,602.2 99.9 0.1
Lower Respiratory Infections Communicable 20,890 709,141 2,392 711,533 2,541.2 99.7 0.3
Stroke Cardiovascular 31,883 427,099 25,495 452,594 1,616.4 94.4 5.6
Road Injuries Injuries 10,135 421,772 7,561 429,333 1,533.3 98.2 1.8
Diabetes Mellitus Non-Communicable 23,942 352,611 32,542 385,153 1,375.5 91.6 8.4
Ischemic Heart Disease Cardiovascular 27,031 354,507 16,794 371,301 1,326.1 95.5 4.5
Depressive Disorders Mental Health 751 0 111,408 111,408 397.9 0.0 100.0
1 % of total DALY attributable to each component.
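One way to see what the DALY adds over a death count is to rank the diseases on both metrics. The sketch below does this with the figures from the table; the object and column names are illustrative.

```r
# Deaths and total DALYs copied from the table above
burden <- data.frame(
  disease = c("HIV/AIDS", "Malaria", "Lower Respiratory Infections", "Stroke",
              "Road Injuries", "Diabetes Mellitus", "Ischemic Heart Disease",
              "Depressive Disorders"),
  deaths  = c(31270, 15420, 20890, 31883, 10135, 23942, 27031, 751),
  DALY    = c(1936569, 1008612, 711533, 452594, 429333, 385153, 371301, 111408)
)

# Rank 1 = largest burden on each metric
burden$rank_deaths <- rank(-burden$deaths)
burden$rank_DALY   <- rank(-burden$DALY)
burden[order(burden$rank_DALY), c("disease", "rank_deaths", "rank_DALY")]
```

Stroke ranks first on deaths but fourth on DALYs, while Malaria moves from sixth on deaths to second on DALYs, because its deaths occur at much younger ages.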

5 Visualisation and Interpretation

5.1 DALY Composition — YLL vs YLD Stacked Chart

Code
# ─────────────────────────────────────────────────────────────────
# STACKED BAR CHART: Proportion of DALY from YLL vs YLD
# This is the most important diagnostic chart — it tells you
# whether a disease kills (high YLL fraction) or disables (high YLD fraction)
# ─────────────────────────────────────────────────────────────────
daly_long <- daly_summary |>
  select(disease_name, YLL, YLD, DALY) |>
  pivot_longer(cols = c(YLL, YLD), names_to = "Component", values_to = "Value") |>
  mutate(
    Proportion = Value / DALY,
    Component  = factor(Component, levels = c("YLD", "YLL"))
  )

ggplot(daly_long,
       aes(x = Value / 1e3,
           y = fct_reorder(disease_name, DALY),
           fill = Component)) +
  geom_col(alpha = 0.88, width = 0.7) +
  geom_text(
    data = daly_summary,
    aes(x = DALY / 1e3 + 20, y = disease_name,
        label = paste0(comma(round(DALY / 1e3, 0)), "k")),
    inherit.aes = FALSE, size = 3.2, hjust = 0, fontface = "bold"
  ) +
  scale_fill_manual(
    values = c("YLL" = "#D73027", "YLD" = "#4575B4"),
    labels = c("YLL" = "YLL (premature death)", "YLD" = "YLD (disability)")
  ) +
  scale_x_continuous(labels = comma, expand = expansion(mult = c(0, 0.15))) +
  labs(
    title    = "DALY Burden by Disease: Premature Death vs Disability",
    subtitle = "Kenya 2019 — Sorted by total DALY; values in thousands",
    x        = "DALYs (thousands)",
    y        = NULL,
    fill     = "DALY Component",
    caption  = "YLL = Years of Life Lost (mortality); YLD = Years Lived with Disability"
  ) +
  theme_burden() +
  theme(legend.position = "top")

Figure 3: DALY Composition: YLL vs YLD contribution by disease

Interpretation — DALY Composition:

  • HIV/AIDS is dominated by YLL, meaning the primary mechanism of its burden is premature death. Even with ART, untreated or late-treated HIV kills people at working ages, generating enormous life-year losses. The YLD component reflects ongoing disability in people living with HIV.
  • Depressive Disorders are 100% YLD because the analysis sets YLL to 0 wherever cause_of_death is FALSE; the 751 recorded deaths add nothing to the total. This illustrates how mental health conditions can impose a substantial burden that is invisible to mortality statistics, which is the key argument for using DALYs over death counts in health planning.
  • Malaria is almost entirely YLL in this dataset (99.9%): childhood deaths dominate, and the YLD from non-fatal episodes totals only 646. A larger YLD contribution from chronic anaemia and cognitive impairment, as described for high-transmission settings, is not visible in these figures.
  • Diabetes is also predominantly YLL (91.6%), but its 8.4% YLD share is the second highest among the fatal causes after HIV/AIDS: the long duration of the condition (8–11 years average in this data) accumulates YLD even with a mild disability weight.
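The composition of each disease's burden can be read directly from the DALY summary table. The following sketch recomputes the YLD share and labels each cause by its dominant component; the names comp, YLD_pct and dominant are illustrative.

```r
# YLL and YLD totals copied from the DALY summary table in Section 4.3
comp <- data.frame(
  disease = c("HIV/AIDS", "Malaria", "Lower Respiratory Infections", "Stroke",
              "Road Injuries", "Diabetes Mellitus", "Ischemic Heart Disease",
              "Depressive Disorders"),
  YLL = c(1509929, 1007966, 709141, 427099, 421772, 352611, 354507, 0),
  YLD = c(426640, 646, 2392, 25495, 7561, 32542, 16794, 111408)
)

# Share of each disease's DALYs that comes from disability
comp$YLD_pct  <- round(100 * comp$YLD / (comp$YLL + comp$YLD), 1)
comp$dominant <- ifelse(comp$YLD > comp$YLL, "disability", "mortality")
comp[order(-comp$YLD_pct), c("disease", "YLD_pct", "dominant")]
```

On these figures only Depressive Disorders is disability-dominant; HIV/AIDS (22.0%) and Diabetes Mellitus (8.4%) have the largest YLD shares among the remaining causes.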

5.2 DALY Rate by Age Group and Disease

Code
# ─────────────────────────────────────────────────────────────────
# FACETED LINE CHART: DALY rate across age groups, by sex
# Shows the age-pattern of burden for each disease independently
# ─────────────────────────────────────────────────────────────────
daly_age_sex <- daly_data |>
  group_by(disease_name, age_group, sex) |>
  summarise(
    DALY_rate = sum(DALY) / sum(population) * 100000,
    .groups   = "drop"
  )

ggplot(daly_age_sex, aes(x = age_group, y = DALY_rate,
                         colour = sex, group = sex)) +
  geom_line(linewidth = 1.0, alpha = 0.9) +
  geom_point(size = 2.5, alpha = 0.9) +
  facet_wrap(~disease_name, scales = "free_y", ncol = 2) +
  scale_colour_manual(values = c("Male" = "#2166AC", "Female" = "#D6604D")) +
  scale_y_continuous(labels = comma) +
  labs(
    title    = "Age-Specific DALY Rate by Disease and Sex",
    subtitle = "Kenya 2019 — Rate per 100,000 population (y-axis free scale)",
    x        = "Age Group", y = "DALY Rate (per 100,000)",
    colour   = "Sex",
    caption  = "Note: y-axes differ across panels to show within-disease age patterns."
  ) +
  theme_burden() +
  theme(
    axis.text.x   = element_text(angle = 45, hjust = 1, size = 8),
    legend.position = "top"
  )

Figure 4: DALY Rate per 100,000 by Age Group — Faceted by Disease

Interpretation — Age-Specific DALY Patterns:

  • Malaria shows a characteristic U-shape or monotonically declining pattern — the highest DALY rates in children under 5, then falling as acquired immunity develops, with a slight uptick in the elderly due to immune senescence. This underscores why under-5 malaria prevention (bed nets, chemoprevention) is the highest-impact intervention.
  • HIV/AIDS peaks dramatically in the 15–44 age band, with females consistently higher than males in the 15–29 group. This sex reversal is a hallmark of the sub-Saharan African epidemic and reflects higher female biological susceptibility and social vulnerability to HIV.
  • Cardiovascular diseases (IHD, Stroke) show the expected J-shaped increase with age, confirming that cardiovascular risk accumulates with ageing. Male rates exceed female rates until the 70–80+ band, where female longevity results in more years of exposure.
  • Road Injuries exhibit a distinctive peak in the 15–44 male age group — the classic pattern of young male risk-taking driving transport fatalities. Female rates are substantially lower across all ages.
  • Depression shows peak DALY rates in the 15–44 age group, especially in females aged 15–29, aligning with established epidemiology of major depressive disorder peaking in early adulthood.
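When rates like these are combined across strata, the document's code divides summed DALYs by summed population rather than averaging stratum rates. The toy example below shows why that matters; the two strata and their values are hypothetical and chosen only to make the difference obvious.

```r
# Hypothetical strata: values are illustrative, not taken from the dataset
strata <- data.frame(
  DALY       = c(5000, 200),
  population = c(100000, 1000000)
)

# Stratum-specific rates per 100,000
strata$rate <- strata$DALY / strata$population * 1e5
strata$rate                                  # 5000 20

# Pooled rate: total DALYs over total population
pooled_rate <- sum(strata$DALY) / sum(strata$population) * 1e5
pooled_rate

# Unweighted mean of stratum rates: overweights the small, high-rate stratum
mean(strata$rate)                            # 2510
```

The pooled rate (about 473 per 100,000) weights each stratum by its population, which is the behaviour of sum(DALY) / sum(population) in the summary tables.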

5.3 DALY Bar Chart by Category

Code
# ─────────────────────────────────────────────────────────────────
# HORIZONTAL BAR CHART: Category-level DALY totals with percentage labels
# Useful for national health system budget allocation discussions
# ─────────────────────────────────────────────────────────────────
category_daly <- daly_data |>
  group_by(disease_category) |>
  summarise(
    DALY     = sum(DALY),
    YLL      = sum(YLL),
    YLD      = sum(YLD),
    .groups  = "drop"
  ) |>
  mutate(
    pct      = DALY / sum(DALY) * 100,
    label    = glue("{disease_category}\n{round(pct,1)}%\n({comma(round(DALY/1e3))}k)")
  ) |>
  arrange(desc(DALY))

ggplot(category_daly,
       aes(x = reorder(disease_category, DALY), y = DALY / 1e3,
           fill = disease_category)) +
  geom_col(width = 0.65, alpha = 0.88) +
  geom_text(aes(label = paste0(round(pct, 1), "%")),
            hjust = -0.2, fontface = "bold", size = 4) +
  scale_fill_brewer(palette = "Set2") +
  scale_y_continuous(labels = comma, expand = expansion(mult = c(0, 0.18))) +
  coord_flip() +
  labs(
    title    = "Total DALY Burden by Disease Category",
    subtitle = "Kenya 2019 — Percentage of total DALY shown",
    x        = NULL, y = "DALYs (thousands)",
    caption  = "Categories: GBD disease classification groupings."
  ) +
  theme_burden() +
  theme(legend.position = "none")

Figure 5: Total DALY by Disease Category — Proportional Bar Chart
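The percentages on the chart can be approximated from the disease-level totals already reported. This is a sketch that aggregates the Section 4.3 figures by category in base R; the object names are illustrative, and small differences from the plotted labels are possible because the inputs are rounded table values.

```r
# Total DALYs per disease, copied from the summary table in Section 4.3
daly_tbl <- data.frame(
  category = c("Communicable", "Communicable", "Communicable", "Cardiovascular",
               "Injuries", "Non-Communicable", "Cardiovascular", "Mental Health"),
  DALY     = c(1936569, 1008612, 711533, 452594, 429333, 385153, 371301, 111408)
)

# Sum DALYs within each category and express as a share of the grand total
cat_daly <- tapply(daly_tbl$DALY, daly_tbl$category, sum)
cat_pct  <- round(100 * cat_daly / sum(cat_daly), 1)
sort(cat_pct, decreasing = TRUE)
```

Communicable causes account for roughly two thirds (67.6%) of the DALYs across these eight causes, followed by Cardiovascular (15.2%), Injuries (7.9%), Non-Communicable (7.1%) and Mental Health (2.1%).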

5.4 YLL vs YLD Scatter — Disease Positioning

Code
# ─────────────────────────────────────────────────────────────────
# SCATTER PLOT: YLL (x) vs YLD (y)
# Diseases in the upper-left are high disability, low mortality
# Diseases in the lower-right are high mortality, low disability
# Diseases in the upper-right impose dual burden
# ─────────────────────────────────────────────────────────────────
ggplot(daly_summary,
       aes(x = YLL / 1e3, y = YLD / 1e3,
           colour = disease_category, size = DALY / 1e3)) +
  geom_point(alpha = 0.80) +
  geom_text_repel(aes(label = disease_name), size = 3.5, fontface = "bold",
                  max.overlaps = 15, box.padding = 0.5) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed",
              colour = "grey50", linewidth = 0.8) +
  annotate("text", x = max(daly_summary$YLL / 1e3) * 0.1,
           y = max(daly_summary$YLD / 1e3) * 0.9,
           label = "YLD > YLL\n(Disability-dominant)", colour = "grey40",
           size = 3.2, hjust = 0) +
  annotate("text", x = max(daly_summary$YLL / 1e3) * 0.75,
           y = max(daly_summary$YLD / 1e3) * 0.08,
           label = "YLL > YLD\n(Mortality-dominant)", colour = "grey40",
           size = 3.2, hjust = 0) +
  scale_size_continuous(name = "Total DALY (k)", range = c(3, 12)) +
  scale_colour_brewer(name = "Disease Category", palette = "Dark2") +
  scale_x_continuous(labels = comma) +
  scale_y_continuous(labels = comma) +
  labs(
    title    = "Disease Characterisation: Mortality Burden vs Disability Burden",
    subtitle = "Position relative to 45° line indicates dominant burden component",
    x        = "YLL (thousands) — Premature mortality component",
    y        = "YLD (thousands) — Disability component",
    caption  = "Dashed line = equal YLL and YLD. Point size = total DALY burden."
  ) +
  theme_burden()

Figure 6: YLL vs YLD Scatter Plot: Disease Characterisation

Interpretation — Disease Positioning:

  • Diseases above the 45° line (YLD > YLL) are primarily disabling rather than fatal. In this dataset, Depressive Disorders falls decisively above the line: it has zero YLL (it is modelled as non-fatal) but substantial YLD.
  • Diseases below the 45° line (YLL > YLD) are primarily lethal. HIV/AIDS and Road Injuries in working-age adults fall in this zone — the main pathway of their burden is premature death, not chronic disability.
  • Malaria sits near the line, reflecting its dual nature: a major killer of children (YLL) and a chronic debilitating illness in older groups (YLD).
  • This positioning diagram is a powerful policy communication tool: diseases in the upper-left require disability management services; diseases in the lower-right require mortality prevention interventions.
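
The same positioning can be expressed numerically as the YLL share of total DALY. The sketch below is a minimal base R illustration: the three diseases and their values are hypothetical, and the 0.4 / 0.6 cut-offs for "near the line" are an arbitrary assumption rather than part of the analysis above.

```r
# Hypothetical YLL/YLD totals, for illustration only (not values from this report)
toy <- data.frame(
  disease_name = c("Disease A", "Disease B", "Disease C"),
  YLL          = c(900, 0, 480),
  YLD          = c(100, 400, 520)
)

toy$DALY      <- toy$YLL + toy$YLD
toy$YLL_share <- toy$YLL / toy$DALY   # 1 = all mortality, 0 = all disability

# Assumed cut-offs: below 40% YLL = disability-dominant, above 60% = mortality-dominant
toy$position <- cut(
  toy$YLL_share,
  breaks = c(-Inf, 0.4, 0.6, Inf),
  labels = c("Disability-dominant", "Mixed (near 45-degree line)", "Mortality-dominant")
)

print(toy)
```

A YLL share of exactly 0.5 corresponds to a point on the dashed 45° line in the scatter plot.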

6 Advanced Analyses

6.1 Age Distribution of YLL (Share of YLL vs Share of Deaths)

Code
# ─────────────────────────────────────────────────────────────────
# WHICH AGE GROUPS CONTRIBUTE MOST YLL?
# This guides intervention targeting:
# If children dominate → focus on paediatric prevention
# If working ages dominate → focus on adult screening and treatment
# ─────────────────────────────────────────────────────────────────
yll_age_paf <- yll_data |>
  group_by(age_group) |>
  summarise(
    YLL       = sum(YLL),
    Deaths    = sum(deaths),
    .groups   = "drop"
  ) |>
  mutate(
    YLL_pct   = YLL / sum(YLL) * 100,
    Deaths_pct = Deaths / sum(Deaths) * 100,
    # Ratio: how much of YLL does this age contribute relative to its share of deaths?
    # Ratio > 1 means YLL is disproportionately high (young deaths)
    YLL_Death_ratio = YLL_pct / Deaths_pct
  )

yll_age_paf |>
  kable(
    caption   = "YLL and Death Attribution by Age Group — All diseases",
    col.names = c("Age Group", "YLL", "Deaths", "YLL %", "Deaths %",
                  "YLL/Death Ratio"),
    digits    = c(0, 0, 0, 1, 1, 2),
    format.args = list(big.mark = ",")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) |>
  row_spec(which(yll_age_paf$YLL_Death_ratio > 1.5), background = "#FDECEA")
YLL and Death Attribution by Age Group — All diseases
| Age Group | YLL | Deaths | YLL % | Deaths % | YLL/Death Ratio |
|---|---|---|---|---|---|
| 0-4 | 1,244,810 | 14,770 | 26.0 | 9.2 | 2.84 |
| 5-14 | 314,223 | 4,103 | 6.6 | 2.5 | 2.58 |
| 15-29 | 718,230 | 11,466 | 15.0 | 7.1 | 2.11 |
| 30-44 | 941,126 | 19,715 | 19.7 | 12.2 | 1.61 |
| 45-59 | 655,825 | 19,925 | 13.7 | 12.4 | 1.11 |
| 60-69 | 481,253 | 22,720 | 10.1 | 14.1 | 0.71 |
| 70-79 | 347,622 | 30,923 | 7.3 | 19.2 | 0.38 |
| 80+ | 79,937 | 37,700 | 1.7 | 23.4 | 0.07 |

Interpretation — YLL/Death Ratio:

  • A YLL/Death ratio > 1 means that age group is “over-represented” in YLL relative to its share of deaths. Highlighted rows (ratio > 1.5) indicate age groups where each death removes many remaining life years.
  • Children under 5 have the highest ratio: each death removes ~84 years, so even modest numbers of childhood deaths dominate total YLL.
  • Ages 60 and over have a ratio < 1: together they account for about 57% of deaths, but each death removes far fewer years (about 21 at 60–69, 11 at 70–79, and 2 at 80+), so their combined share of YLL is only about 19%.
  • This is the mathematical justification for prioritising child health interventions in YLL-based burden analyses.
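
The ratio in the table is algebraically the same as the average YLL per death in an age group divided by the all-age average YLL per death. The base R check below re-enters the YLL and death counts from the table above; the object name `age_tab` is introduced here purely for illustration.

```r
# YLL and deaths by age group, copied from the table above
age_tab <- data.frame(
  age_group = c("0-4", "5-14", "15-29", "30-44", "45-59", "60-69", "70-79", "80+"),
  YLL       = c(1244810, 314223, 718230, 941126, 655825, 481253, 347622, 79937),
  deaths    = c(14770, 4103, 11466, 19715, 19925, 22720, 30923, 37700)
)

# Average life-years lost per death, within each age group and overall
age_tab$yll_per_death <- age_tab$YLL / age_tab$deaths
overall_per_death     <- sum(age_tab$YLL) / sum(age_tab$deaths)

# Two equivalent ways of writing the YLL/Death ratio
age_tab$ratio_per_death <- age_tab$yll_per_death / overall_per_death
age_tab$ratio_shares    <- (age_tab$YLL / sum(age_tab$YLL)) /
                           (age_tab$deaths / sum(age_tab$deaths))

print(round(age_tab[, c("yll_per_death", "ratio_per_death", "ratio_shares")], 2))
```

Read this way, a ratio of 2.84 for ages 0–4 means each death in that group removes about 2.84 times as many life-years as the average death in the dataset.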

6.2 Sex-Stratified DALY Rate Comparison

Code
# ─────────────────────────────────────────────────────────────────
# DUMBBELL PLOT: Male vs Female DALY rates per disease
# Shows the magnitude and direction of sex inequality in burden
# ─────────────────────────────────────────────────────────────────
sex_daly <- daly_data |>
  group_by(disease_name, sex) |>
  summarise(
    DALY_rate = sum(DALY) / sum(population) * 100000,
    .groups   = "drop"
  ) |>
  pivot_wider(names_from = sex, values_from = DALY_rate) |>
  mutate(
    gap       = Male - Female,
    dominant  = if_else(gap >= 0, "Male higher", "Female higher")
  )

ggplot(sex_daly) +
  geom_segment(aes(x = Female, xend = Male,
                   y = fct_reorder(disease_name, Male),
                   yend = fct_reorder(disease_name, Male),
                   colour = dominant),
               linewidth = 1.8, alpha = 0.7) +
  geom_point(aes(x = Female, y = disease_name), colour = "#D6604D", size = 4) +
  geom_point(aes(x = Male,   y = disease_name), colour = "#2166AC", size = 4) +
  geom_text(aes(x = Male, y = disease_name,
                label = paste0("Δ=", round(abs(gap), 0))),
            nudge_y = 0.28, size = 3, fontface = "italic") +
  scale_colour_manual(
    values = c("Male higher" = "#2166AC", "Female higher" = "#D6604D")
  ) +
  scale_x_continuous(labels = comma) +
  labs(
    title    = "DALY Rate Disparity Between Males and Females",
    subtitle = "Dumbbell plot — Blue = Male; Red = Female; Δ = absolute difference",
    x        = "DALY Rate per 100,000 Population",
    y        = NULL,
    colour   = "Which sex has higher burden",
    caption  = "Crude rates: each sex's DALYs divided by that sex's own population (not age-standardised)."
  ) +
  theme_burden() +
  theme(legend.position = "top")

Figure 7: Sex-Specific DALY Rate Comparison by Disease

Interpretation — Sex Inequalities:

  • Road Injuries show the largest male excess, consistent with global data showing 3:1 male-to-female road mortality ratios in low- and middle-income countries. Interventions targeting young male drivers (helmet laws, breathalysing, speed enforcement) would close this gap most effectively.
  • HIV/AIDS in this dataset shows female excess in DALY rates due to the disproportionate HIV burden on young women in sub-Saharan Africa. This is a policy-critical finding — female-focused PrEP programmes, social protection, and economic empowerment are the highest-impact responses.
  • Depression shows female excess, consistent with the well-established 2:1 female-to-male ratio in major depressive disorder globally.
  • Cardiovascular diseases show male excess, particularly IHD, reflecting earlier age of onset in males and higher hypertension prevalence.
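
The dumbbell plot shows absolute gaps, while the ratios quoted above (3:1, 2:1) are relative measures. A minimal sketch of computing both side by side follows; the disease names and rates are hypothetical and do not come from this dataset.

```r
# Hypothetical sex-specific DALY rates per 100,000 (illustration only)
rates <- data.frame(
  disease_name = c("Disease A", "Disease B"),
  Male         = c(1200, 800),
  Female       = c(400, 1000)
)

rates$gap        <- rates$Male - rates$Female   # absolute difference (as plotted)
rates$rate_ratio <- rates$Male / rates$Female   # relative difference (M:F)

print(rates)
```

A large absolute gap can coexist with a modest ratio when both rates are high, so reporting the two together gives a fuller picture of sex inequality.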

6.3 Cause-Deleted Life Expectancy (Hypothetical Analysis)

Code
# ─────────────────────────────────────────────────────────────────
# CAUSE-DELETED LIFE EXPECTANCY
#
# Question: How much would life expectancy increase if we
# eliminated each disease?
#
# Approximation: ΔLE ≈ YLL / total_population
# This gives the average life-years lost per person in the reference
# year. It is a crude per-capita proxy for the gain from elimination,
# not a cause-deleted life table or an Arriaga decomposition.
# ─────────────────────────────────────────────────────────────────
total_pop <- daly_data |>
  distinct(age_group, sex, disease_name, population) |>
  # Use one disease to avoid duplicate population rows
  filter(disease_name == "Ischemic Heart Disease") |>
  summarise(total_pop = sum(population)) |>
  pull(total_pop)

cause_deleted <- yll_data |>
  group_by(disease_name) |>
  summarise(total_YLL = sum(YLL), .groups = "drop") |>
  mutate(
    delta_LE_years = total_YLL / total_pop,
    delta_LE_days  = delta_LE_years * 365.25,
    interpretation = case_when(
      delta_LE_years >= 1    ~ "Major impact (≥1 year gain)",
      delta_LE_years >= 0.25 ~ "Moderate impact (3–12 months gain)",
      TRUE                   ~ "Modest impact (<3 months gain)"
    )
  ) |>
  arrange(desc(delta_LE_years))

cause_deleted |>
  gt() |>
  tab_header(
    title    = "Hypothetical Gain in Life Expectancy if Disease Eliminated",
    subtitle = glue("Estimated based on total YLL and population of {comma(total_pop)}")
  ) |>
  fmt_number(columns = total_YLL,      use_seps = TRUE, decimals = 0) |>
  fmt_number(columns = delta_LE_years, decimals = 3) |>
  fmt_number(columns = delta_LE_days,  decimals = 1) |>
  cols_label(
    disease_name    = "Disease",
    total_YLL       = "Total YLL",
    delta_LE_years  = "ΔLE (years)",
    delta_LE_days   = "ΔLE (days)",
    interpretation  = "Impact Level"
  ) |>
  data_color(columns = delta_LE_years, palette = "YlOrRd") |>
  tab_footnote(
    footnote  = "ΔLE ≈ YLL / population. Assumes independence of causes (not strictly valid for correlated risks).",
    locations = cells_column_labels(columns = delta_LE_years)
  )
Hypothetical Gain in Life Expectancy if Disease Eliminated
Estimated based on total YLL and population of 28,000,000
| Disease | Total YLL | ΔLE (years)¹ | ΔLE (days) | Impact Level |
|---|---|---|---|---|
| HIV/AIDS | 1,509,929 | 0.054 | 19.7 | Modest impact (<3 months gain) |
| Malaria | 1,007,966 | 0.036 | 13.1 | Modest impact (<3 months gain) |
| Lower Respiratory Infections | 709,141 | 0.025 | 9.3 | Modest impact (<3 months gain) |
| Stroke | 427,099 | 0.015 | 5.6 | Modest impact (<3 months gain) |
| Road Injuries | 421,772 | 0.015 | 5.5 | Modest impact (<3 months gain) |
| Ischemic Heart Disease | 354,507 | 0.013 | 4.6 | Modest impact (<3 months gain) |
| Diabetes Mellitus | 352,611 | 0.013 | 4.6 | Modest impact (<3 months gain) |
| Depressive Disorders | 0 | 0.000 | 0.0 | Modest impact (<3 months gain) |

¹ ΔLE ≈ YLL / population. Assumes independence of causes (not strictly valid for correlated risks).

Interpretation — Cause-Deleted Analysis:

  • Eliminating HIV/AIDS would yield the largest gain (1,509,929 YLL; ΔLE ≈ 0.054 years, about 20 days), reflecting its concentration of deaths in the 15–44 age group, where each death still removes several decades of life.
  • Malaria ranks second (1,007,966 YLL; ΔLE ≈ 0.036 years), driven by child deaths that each remove a large number of life-years.
  • Cardiovascular diseases (IHD + Stroke combined: 781,606 YLL; ΔLE ≈ 0.028 years) would rank third, ahead of Lower Respiratory Infections. Cardiovascular deaths are concentrated in older ages, which reduces per-death YLL, but the sheer volume of deaths still creates a substantial aggregate impact.
  • Depression contributes zero ΔLE in this model because it is classified as non-fatal — reinforcing that using LE alone would entirely miss the burden of mental illness.
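
For comparison with the per-capita proxy, a cause-deleted life table recomputes life expectancy after subtracting the cause-specific death rates from all-cause rates. The sketch below is a simplified abridged life table in base R, under stated assumptions: deaths fall at the midpoint of each interval, the open-ended 80+ group is closed with 1/m, causes are independent, and every death rate shown is hypothetical (none is taken from this dataset). The function name `life_expectancy` is introduced here for illustration.

```r
# Life expectancy at birth from age-specific death rates (abridged life table).
# mx    = death rate per person-year in each age group
# width = length of each age group in years (NA for the open-ended last group)
life_expectancy <- function(mx, width) {
  n  <- length(mx)
  qx <- (width * mx) / (1 + (width / 2) * mx)   # probability of dying in the interval
  qx[n] <- 1                                    # everyone dies in the open interval
  lx <- cumprod(c(1, 1 - qx[-n]))               # survivors at the start of each interval
  dx <- lx * qx                                 # deaths within each interval
  Lx <- width * (lx - dx) + (width / 2) * dx    # person-years lived in each interval
  Lx[n] <- lx[n] / mx[n]                        # person-years in the open interval
  sum(Lx)                                       # radix is 1, so the total is e0
}

# Age groups used in this report: 0-4, 5-14, 15-29, 30-44, 45-59, 60-69, 70-79, 80+
width <- c(5, 10, 15, 15, 15, 10, 10, NA)

# Hypothetical all-cause and cause-specific death rates (illustration only)
all_cause  <- c(0.0100, 0.0015, 0.0030, 0.0060, 0.0120, 0.0250, 0.0600, 0.1500)
cause_only <- c(0.0005, 0.0002, 0.0012, 0.0025, 0.0015, 0.0005, 0.0002, 0.0001)

e0_all     <- life_expectancy(all_cause, width)
e0_deleted <- life_expectancy(all_cause - cause_only, width)
gain       <- e0_deleted - e0_all

cat("e0 (all causes):   ", round(e0_all, 2), "\n")
cat("e0 (cause deleted):", round(e0_deleted, 2), "\n")
cat("Gain in years:     ", round(gain, 2), "\n")
```

Unlike YLL divided by population, this approach accumulates the survival benefit over the whole life course, so its results are not directly comparable with the ΔLE column above.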

7 Policy Implications and Priority Setting

7.1 Cost-Effectiveness Framing

Code
# ─────────────────────────────────────────────────────────────────
# PRIORITY MATRIX
# Combines DALY burden with intervention availability data
# (Notional cost-effectiveness values for illustration)
# ─────────────────────────────────────────────────────────────────
priority_df <- daly_summary |>
  select(disease_name, disease_category, DALY, DALY_rate) |>
  mutate(
    # Notional cost per DALY averted (USD): illustrative values based on DCP3.
    # Values are assigned by row position, so they assume daly_summary keeps
    # a fixed disease order.
    cost_per_DALY_averted = c(185, 220, 1850, 6200, 4500, 95, 280, 750),
    # Intervention availability on a 1-5 scale (1 = poor, 5 = good)
    intervention_strength = c(3, 2, 4, 2, 3, 5, 2, 3),
    priority_score = (DALY / max(DALY)) * (1000 / cost_per_DALY_averted) *
                     (intervention_strength / 5),
    priority_tier = cut(priority_score,
                        breaks   = quantile(priority_score, c(0, 0.33, 0.67, 1)),
                        labels   = c("Low Priority", "Medium Priority", "High Priority"),
                        include.lowest = TRUE)
  )

ggplot(priority_df,
       aes(x = DALY / 1e3,
           y = cost_per_DALY_averted,
           size = DALY_rate,
           colour = priority_tier)) +
  geom_point(alpha = 0.80) +
  geom_text_repel(aes(label = disease_name), size = 3.2,
                  box.padding = 0.5, max.overlaps = 15) +
  scale_y_log10(labels = dollar) +
  scale_size_continuous(name = "DALY Rate/100k", range = c(3, 14)) +
  scale_colour_manual(
    name   = "Priority Tier",
    values = c("High Priority"   = "#D73027",
               "Medium Priority" = "#F46D43",
               "Low Priority"    = "#74ADD1")
  ) +
  labs(
    title    = "Disease Priority Matrix: DALY Burden vs Cost-Effectiveness",
    subtitle = "High burden + low cost per DALY averted = highest priority (illustrative data)",
    x        = "Total DALY (thousands)",
    y        = "Cost per DALY Averted (USD, log scale)",
    caption  = "Cost-per-DALY values are illustrative, based on DCP3 order-of-magnitude estimates."
  ) +
  theme_burden()

Interpretation — Priority Matrix:

  • Diseases in the lower-right quadrant (high DALY burden, low cost per DALY averted) represent the strongest case for immediate investment. Malaria and LRI typically occupy this space because bed nets, vaccines, and antibiotics are highly cost-effective.
  • HIV/AIDS has high DALY but moderate cost-effectiveness — ART is effective but expensive at scale, placing it in the medium-priority tier in this simplified model. Prevention (PrEP, condom promotion) would shift it toward higher priority.
  • Cardiovascular and diabetes interventions are often more expensive per DALY averted due to the need for lifelong medication and monitoring, placing them in the upper portions of the plot.
  • This matrix is a starting point for investment cases, not a final answer. Equity considerations, political feasibility, and co-benefits must also inform resource allocation.
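
Because the notional cost and intervention values in the code above are matched to diseases by row position, one hedged alternative is to hold them in a lookup table keyed by disease name and join on that key. The base R sketch below applies the same priority score formula; the disease names and all values are hypothetical.

```r
# Hypothetical burden table (stands in for daly_summary)
burden <- data.frame(
  disease_name = c("Disease A", "Disease B", "Disease C"),
  DALY         = c(900, 300, 600)
)

# Hypothetical lookup of notional inputs, deliberately in a different row order
lookup <- data.frame(
  disease_name          = c("Disease C", "Disease A", "Disease B"),
  cost_per_DALY_averted = c(2000, 100, 500),
  intervention_strength = c(2, 5, 3)            # 1 = poor, 5 = good
)

# Join by name so each disease receives its own values regardless of row order
pri <- merge(burden, lookup, by = "disease_name")

# Same score as in the priority matrix code above
pri$priority_score <- (pri$DALY / max(pri$DALY)) *
                      (1000 / pri$cost_per_DALY_averted) *
                      (pri$intervention_strength / 5)

print(pri[order(-pri$priority_score), ])
```

Keying by name means a change in sorting or filtering upstream cannot silently attach a cost to the wrong disease.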

8 Summary Statistics and Final Table

Code
# ─────────────────────────────────────────────────────────────────
# COMPREHENSIVE FINAL SUMMARY TABLE
# Suitable for a national health report or policy brief appendix
# ─────────────────────────────────────────────────────────────────
final_table <- daly_data |>
  group_by(disease_name, disease_category, sex) |>
  summarise(
    Deaths     = sum(deaths),
    YLL        = sum(YLL),
    YLD        = sum(YLD),
    DALY       = sum(DALY),
    DALY_rate  = sum(DALY) / sum(population) * 100000,
    YLL_frac   = round(sum(YLL) / sum(DALY) * 100, 1),
    .groups    = "drop"
  ) |>
  arrange(disease_category, disease_name, sex)

# Render as interactive DataTable for exploration
datatable(
  final_table |>
    mutate(across(c(Deaths, YLL, YLD, DALY), ~comma(round(.))),
           DALY_rate = round(DALY_rate, 1),
           YLL_frac  = paste0(YLL_frac, "%")),
  colnames  = c("Disease", "Category", "Sex", "Deaths", "YLL", "YLD",
                "DALY", "DALY Rate/100k", "YLL%"),
  filter    = "top",
  extensions = "Buttons",
  options   = list(
    pageLength = 16,
    dom        = "Bfrtip",
    buttons    = c("copy", "csv", "excel"),
    scrollX    = TRUE
  ),
  caption   = "Full DALY Results Table — Kenya 2019 (Interactive: search, sort, export)"
)

9 Methodological Notes

9.1 Key Analytical Assumptions

Analytical Assumptions and Justifications
| Assumption | Value Used | Justification |
|---|---|---|
| Standard Life Expectancy | 86.0 years at birth (GBD 2010 reference life table; the GBD 2019 reference value is higher) | Represents an aspirational ideal rather than current life expectancy |
| YLD Formula | Incidence × Disability Weight × Duration | Incidence-based approach, used here because incidence data are available; GBD 2010 onward reports prevalence-based YLD (Prevalence × Disability Weight) |
| Age at Death Proxy | Mid-point of age group when not recorded | Standard epidemiological convention for grouped data |
| Population Denominator | Population size per age-sex stratum | Ensures rates are comparable across strata of different sizes |
| Discounting | No discounting applied (consistent with GBD 2010+) | GBD dropped discounting in 2010 so that future life-years count equally with present ones; discounting shrinks YLL most for deaths at young ages |
| Age-weighting | No age-weighting (GBD 2010+ methodology) | GBD dropped age-weighting in 2010 to avoid systematically undervaluing years lived by children and the elderly |
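
To show what the no-discounting choice changes in practice, the sketch below compares undiscounted years lost with the continuously discounted form (1 − e^(−rL)) / r. It uses the 86-year standard from the table and a 3% rate, the value used in the early GBD studies. The function name `discounted_yll` and the two example ages are illustrative assumptions.

```r
standard_le <- 86   # standard life expectancy used in this report (years)

# Present value of a stream of L life-years at continuous discount rate r
discounted_yll <- function(years_lost, r = 0.03) {
  (1 - exp(-r * years_lost)) / r
}

age_at_death <- c(2, 80)                       # example ages (assumed)
years_lost   <- standard_le - age_at_death     # undiscounted YLL per death

comparison <- data.frame(
  age_at_death = age_at_death,
  undiscounted = years_lost,
  discounted   = discounted_yll(years_lost)
)
comparison$share_retained <- comparison$discounted / comparison$undiscounted

print(round(comparison, 2))
```

Under these assumptions a death at age 2 keeps well under half of its undiscounted YLL, while a death at age 80 keeps most of it, so removing discounting raises the relative weight of deaths at young ages.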

9.2 Limitations

  1. Simulated data: This dataset is designed for methodological illustration. Real GBD analyses use cause-of-death models, DisMod-MR for disease modelling, and systematic reviews — not point estimates.
  2. Independence assumption: The cause-deleted LE analysis assumes diseases are independent, which is not true (e.g., diabetes increases cardiovascular and infectious disease risk simultaneously).
  3. Comorbidity not modelled: Individuals with multiple conditions experience disability from each, but joint disability weights are not additive — the GBD uses a multiplicative correction.
  4. No uncertainty intervals: Real GBD estimates include 95% uncertainty intervals derived from Monte Carlo simulation of input uncertainty.
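
Limitations 3 and 4 can each be sketched in a few lines of base R. The first part applies the multiplicative combination of disability weights, 1 − ∏(1 − DW). The second draws a simple Monte Carlo uncertainty interval for YLL. The disability weights, the Poisson model for death counts, the 500 expected deaths, and the 40 years lost per death are all hypothetical assumptions chosen for illustration, not GBD inputs.

```r
# --- Comorbidity: multiplicative combination of disability weights ---
combine_dw <- function(dw) {
  1 - prod(1 - dw)
}

dw_example  <- c(0.2, 0.3)               # hypothetical disability weights
dw_additive <- sum(dw_example)           # naive sum
dw_combined <- combine_dw(dw_example)    # multiplicative correction

cat("Additive:", dw_additive, " Multiplicative:", dw_combined, "\n")

# --- Uncertainty: Monte Carlo interval for YLL ---
set.seed(2019)
n_draws        <- 1000
years_per_case <- 40                                 # hypothetical years lost per death
deaths_draws   <- rpois(n_draws, lambda = 500)       # hypothetical uncertainty in deaths
yll_draws      <- deaths_draws * years_per_case

yll_point <- 500 * years_per_case
yll_ui    <- quantile(yll_draws, probs = c(0.025, 0.975))

cat("YLL point estimate:", yll_point, "\n")
cat("95% UI:", yll_ui[1], "to", yll_ui[2], "\n")
```

The combined weight is always below the simple sum and can never exceed 1, which is the property the multiplicative correction is meant to guarantee.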

10 References and Further Reading

  • GBD 2019 Diseases and Injuries Collaborators (2020). Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019. The Lancet, 396(10258), 1204–1222.
  • Murray CJL, Lopez AD (1996). The Global Burden of Disease. Harvard School of Public Health / WHO.
  • Salomon JA et al. (2012). Disability weights for the Global Burden of Disease 2010 study. The Lancet, 380(9859), 2129–2143.
  • Institute for Health Metrics and Evaluation (IHME). GBD Compare tool. http://vizhub.healthdata.org/gbd-compare
  • Disease Control Priorities 3rd Edition (DCP3). http://dcp-3.org
Code
sessionInfo()
## R version 4.3.1 (2023-06-16 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 11 x64 (build 26100)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_Kenya.utf8  LC_CTYPE=English_Kenya.utf8   
## [3] LC_MONETARY=English_Kenya.utf8 LC_NUMERIC=C                  
## [5] LC_TIME=English_Kenya.utf8    
## 
## time zone: Africa/Nairobi
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] DT_0.34.0          glue_1.8.0         RColorBrewer_1.1-3 viridis_0.6.5     
##  [5] viridisLite_0.4.2  gt_1.3.0           janitor_2.2.1      patchwork_1.3.2   
##  [9] ggrepel_0.9.8      scales_1.4.0       kableExtra_1.4.0   knitr_1.50        
## [13] lubridate_1.9.4    forcats_1.0.1      stringr_1.5.2      dplyr_1.2.0       
## [17] purrr_1.1.0        readr_2.1.5        tidyr_1.3.1        tibble_3.3.0      
## [21] ggplot2_4.0.0      tidyverse_2.0.0   
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6      bslib_0.9.0       xfun_0.53         htmlwidgets_1.6.4
##  [5] tzdb_0.5.0        vctrs_0.7.2       tools_4.3.1       crosstalk_1.2.2  
##  [9] generics_0.1.4    parallel_4.3.1    pkgconfig_2.0.3   S7_0.2.0         
## [13] lifecycle_1.0.5   compiler_4.3.1    farver_2.1.2      textshaping_1.0.5
## [17] snakecase_0.11.1  htmltools_0.5.8.1 sass_0.4.10       yaml_2.3.10      
## [21] jquerylib_0.1.4   pillar_1.11.1     crayon_1.5.3      cachem_1.1.0     
## [25] tidyselect_1.2.1  digest_0.6.37     stringi_1.8.7     labeling_0.4.3   
## [29] fastmap_1.2.0     grid_4.3.1        cli_3.6.5         magrittr_2.0.4   
## [33] withr_3.0.2       bit64_4.6.0-1     timechange_0.3.0  rmarkdown_2.30   
## [37] bit_4.6.0         gridExtra_2.3     hms_1.1.4         evaluate_1.0.5   
## [41] rlang_1.1.7       Rcpp_1.1.1        xml2_1.4.1        svglite_2.2.2    
## [45] rstudioapi_0.18.0 vroom_1.6.6       jsonlite_2.0.0    R6_2.6.1         
## [49] systemfonts_1.3.2 fs_1.6.6