Income Group × Pathogen Interaction in AMR Resistance

Equity-Focused Analysis of Global Antimicrobial Resistance Data

Author

Timothy Achala

Background

Antimicrobial resistance (AMR) disproportionately burdens low- and middle-income countries (LMICs), yet the pathogen-specific nature of this inequity is often masked in aggregate reporting. This analysis tests whether the effect of income group on resistance percentage differs across bacterial pathogens — an interaction effect whose significance has direct implications for equity-focused AMR policy.

Analytical question: Does income group moderate resistance rates, and does this moderation vary by pathogen?


Setup

Code
library(tidyverse)
library(broom)
library(emmeans)
library(ggpubr)
library(scales)
library(patchwork)
library(kableExtra)

# Load data
amr <- read_csv("amr_global_resistance_rates.csv") |>
  mutate(
    income_group = factor(income_group,
      levels = c("Low income", "Lower-middle income",
                 "Upper-middle income", "High income")),
    pathogen = factor(pathogen),
    antibiotic_class = factor(antibiotic_class),
    infection_type = factor(infection_type)
  )

# Colour palette (income gradient: equity-intuitive)
income_pal <- c(
  "Low income"           = "#C0392B",
  "Lower-middle income"  = "#E67E22",
  "Upper-middle income"  = "#2980B9",
  "High income"          = "#27AE60"
)

theme_amr <- theme_minimal(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold", size = 13),
    plot.subtitle = element_text(colour = "grey40", size = 11),
    legend.position = "bottom",
    strip.text    = element_text(face = "bold", size = 10),
    panel.grid.minor = element_blank()
  )

Exploratory Overview

Sample distribution

Code
amr |>
  count(income_group, pathogen) |>
  pivot_wider(names_from = income_group, values_from = n, values_fill = 0) |>
  rename(Pathogen = pathogen) |>
  kbl(caption = "Record counts by income group and pathogen") |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) |>
  column_spec(1, bold = TRUE)

Table 1: ?(caption)

(a) Record counts by income group and pathogen
Pathogen Low income Lower-middle income Upper-middle income High income
Salmonella spp. 6 0 0 0
Acinetobacter spp. 0 6 6 0
Escherichia coli 0 18 12 18
Klebsiella pneumoniae 0 12 12 18
Shigella spp. 0 6 0 0
Staphylococcus aureus 0 12 0 12

Cell means table

Code
amr |>
  group_by(pathogen, income_group) |>
  summarise(
    n         = n(),
    mean_res  = round(mean(resistance_pct), 1),
    sd_res    = round(sd(resistance_pct), 1),
    .groups   = "drop"
  ) |>
  rename(
    Pathogen     = pathogen,
    `Income group` = income_group,
    N            = n,
    `Mean resistance (%)` = mean_res,
    SD           = sd_res
  ) |>
  kbl(caption = "Mean resistance % by pathogen and income group") |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) |>
  column_spec(1, bold = TRUE)

Table 2: ?(caption)

(a) Mean resistance % by pathogen and income group
Pathogen Income group N Mean resistance (%) SD
Acinetobacter spp. Lower-middle income 6 66.4 2.9
Acinetobacter spp. Upper-middle income 6 52.6 3.0
Escherichia coli Lower-middle income 18 60.7 8.3
Escherichia coli Upper-middle income 12 44.1 3.7
Escherichia coli High income 18 13.7 5.8
Klebsiella pneumoniae Lower-middle income 12 26.8 9.3
Klebsiella pneumoniae Upper-middle income 12 39.9 18.5
Klebsiella pneumoniae High income 18 15.6 10.5
Salmonella spp. Low income 6 18.0 4.1
Shigella spp. Lower-middle income 6 31.5 10.5
Staphylococcus aureus Lower-middle income 12 35.6 4.2
Staphylococcus aureus High income 12 27.1 17.3

Visual Exploration

Interaction plot — mean resistance by income × pathogen

Code
cell_means <- amr |>
  group_by(income_group, pathogen) |>
  summarise(
    mean_res = mean(resistance_pct),
    se       = sd(resistance_pct) / sqrt(n()),
    .groups  = "drop"
  )

ggplot(cell_means,
       aes(x = income_group, y = mean_res,
           colour = pathogen, group = pathogen)) +
  geom_line(linewidth = 0.9, alpha = 0.85) +
  geom_point(size = 3.5) +
  geom_errorbar(aes(ymin = mean_res - se, ymax = mean_res + se),
                width = 0.15, alpha = 0.6) +
  scale_y_continuous(labels = label_percent(scale = 1),
                     limits = c(0, 80)) +
  scale_x_discrete(labels = function(x) str_wrap(x, 10)) +
  labs(
    title    = "Income group × pathogen interaction",
    subtitle = "Error bars = ±1 SE. Non-parallel lines indicate moderation.",
    x        = "Income group",
    y        = "Mean resistance (%)",
    colour   = "Pathogen"
  ) +
  theme_amr

Figure 1: Interaction plot: mean resistance % by income group for each pathogen. Non-parallel lines suggest a significant interaction.

Reading the plot: If lines were parallel, income group would have a uniform effect on resistance across all pathogens. Crossing or diverging lines signal a significant interaction — the effect of income group is pathogen-specific.


Heatmap of mean resistance

Code
cell_means |>
  ggplot(aes(x = income_group, y = pathogen, fill = mean_res)) +
  geom_tile(colour = "white", linewidth = 0.6) +
  geom_text(aes(label = sprintf("%.1f%%", mean_res)),
            size = 3.8, fontface = "bold",
            colour = ifelse(cell_means$mean_res > 45, "white", "grey20")) +
  scale_fill_gradient2(
    low = "#27AE60", mid = "#F39C12", high = "#C0392B",
    midpoint = 35,
    labels = label_percent(scale = 1),
    name = "Resistance %"
  ) +
  scale_x_discrete(labels = function(x) str_wrap(x, 10)) +
  labs(
    title    = "AMR resistance heatmap: pathogen × income group",
    subtitle = "Darker red = higher resistance burden",
    x = "Income group", y = NULL
  ) +
  theme_amr +
  theme(axis.text.y = element_text(face = "italic"))

Figure 2: Heatmap of mean resistance % across pathogen × income group combinations.

Boxplots by pathogen, coloured by income group

Code
amr |>
  ggplot(aes(x = income_group, y = resistance_pct, fill = income_group)) +
  geom_boxplot(alpha = 0.75, outlier.shape = 21, outlier.size = 2) +
  geom_jitter(width = 0.15, alpha = 0.4, size = 1.2, colour = "grey30") +
  facet_wrap(~pathogen, ncol = 3, scales = "free_y",
             labeller = labeller(pathogen = label_wrap_gen(20))) +
  scale_fill_manual(values = income_pal) +
  scale_y_continuous(labels = label_percent(scale = 1)) +
  scale_x_discrete(labels = function(x) str_wrap(x, 8)) +
  labs(
    title    = "Resistance distribution by income group and pathogen",
    subtitle = "Each panel = one pathogen; jittered points = individual records",
    x = NULL, y = "Resistance (%)"
  ) +
  theme_amr +
  theme(legend.position = "none")

Figure 3: Distribution of resistance % by income group, faceted by pathogen.

Statistical Modelling

Model 1 — Additive (no interaction)

Code
mod_add <- lm(resistance_pct ~ income_group + pathogen + year + infection_type,
              data = amr)
tidy(mod_add, conf.int = TRUE) |>
  filter(p.value < 0.2) |>
  mutate(across(c(estimate, conf.low, conf.high), \(x) round(x, 2)),
         p.value = round(p.value, 4)) |>
  rename(Term = term, Beta = estimate, SE = std.error,
         `95% CI low` = conf.low, `95% CI high` = conf.high,
         t = statistic, p = p.value) |>
  kbl(caption = "Additive model — selected coefficients (p < 0.20)") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE)
Additive model — selected coefficients (p < 0.20)
Term Beta SE t p 95% CI low 95% CI high
(Intercept) -1972.60 1256.910809 -1.569402 0.1190 -4459.61 514.41
income_groupLower-middle income 42.04 6.434580 6.533405 0.0000 29.31 54.77
income_groupUpper-middle income 40.91 6.434580 6.357878 0.0000 28.18 53.64
income_groupHigh income 17.15 6.694032 2.562307 0.0116 3.91 30.40
pathogenEscherichia coli -18.22 5.265669 -3.460697 0.0007 -28.64 -7.80
pathogenKlebsiella pneumoniae -23.33 4.215046 -5.534814 0.0000 -31.67 -14.99
pathogenShigella spp. -28.61 6.434580 -4.445727 0.0000 -41.34 -15.87
pathogenStaphylococcus aureus -16.27 4.676492 -3.478414 0.0007 -25.52 -7.01
year 0.99 0.622074 1.583762 0.1157 -0.25 2.22
infection_typeUrinary Tract Infection 8.90 4.359501 2.042292 0.0432 0.28 17.53

Model 2 — Interaction model (income × pathogen)

Code
mod_int <- lm(resistance_pct ~ income_group * pathogen + year + infection_type,
              data = amr)

tidy(mod_int, conf.int = TRUE) |>
  filter(str_detect(term, ":")) |>        # interaction terms only
  mutate(across(c(estimate, conf.low, conf.high), \(x) round(x, 2)),
         p.value = round(p.value, 4),
         sig = case_when(
           p.value < 0.001 ~ "***",
           p.value < 0.01  ~ "**",
           p.value < 0.05  ~ "*",
           p.value < 0.10  ~ ".",
           TRUE            ~ ""
         )) |>
  rename(Interaction = term, Beta = estimate, SE = std.error,
         `95% CI low` = conf.low, `95% CI high` = conf.high,
         p = p.value, Sig = sig) |>
  kbl(caption = "Interaction terms (income_group × pathogen)") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) |>
  column_spec(7, bold = TRUE, color = "firebrick")
Interaction terms (income_group × pathogen)
Interaction Beta SE statistic p 95% CI low 95% CI high Sig
income_groupLower-middle income:pathogenEscherichia coli 38.45 5.402421 7.1166648 0.0000 27.75 49.14 ***
income_groupUpper-middle income:pathogenEscherichia coli 35.79 7.993373 4.4776672 0.0000 19.97 51.61 ***
income_groupHigh income:pathogenEscherichia coli NA NA NA NA NA NA
income_groupLower-middle income:pathogenKlebsiella pneumoniae 2.65 5.510842 0.4808703 0.6315 -8.26 13.56
income_groupUpper-middle income:pathogenKlebsiella pneumoniae 29.58 7.968664 3.7114124 0.0003 13.80 45.35 ***
income_groupHigh income:pathogenKlebsiella pneumoniae NA NA NA NA NA NA
income_groupLower-middle income:pathogenSalmonella spp. NA NA NA NA NA NA
income_groupUpper-middle income:pathogenSalmonella spp. NA NA NA NA NA NA
income_groupHigh income:pathogenSalmonella spp. NA NA NA NA NA NA
income_groupLower-middle income:pathogenShigella spp. NA NA NA NA NA NA
income_groupUpper-middle income:pathogenShigella spp. NA NA NA NA NA NA
income_groupHigh income:pathogenShigella spp. NA NA NA NA NA NA
income_groupLower-middle income:pathogenStaphylococcus aureus NA NA NA NA NA NA
income_groupUpper-middle income:pathogenStaphylococcus aureus NA NA NA NA NA NA
income_groupHigh income:pathogenStaphylococcus aureus NA NA NA NA NA NA

Model comparison — does the interaction improve fit?

Code
anova_tbl <- anova(mod_add, mod_int)
anova_tbl |>
  tidy() |>
  mutate(across(where(is.numeric), \(x) round(x, 4))) |>
  kbl(caption = "ANOVA model comparison: additive vs. interaction model") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE)
ANOVA model comparison: additive vs. interaction model
term df.residual rss df sumsq statistic p.value
resistance_pct ~ income_group + pathogen + year + infection_type 128 19937.01 NA NA NA NA
resistance_pct ~ income_group * pathogen + year + infection_type 124 12324.45 4 7612.559 19.1481 0
Code
r2_add <- summary(mod_add)$r.squared
r2_int <- summary(mod_int)$r.squared
cat(sprintf("Additive model  R² = %.3f\nInteraction model R² = %.3f\nΔR²             = %.3f\n",
            r2_add, r2_int, r2_int - r2_add))
Additive model  R² = 0.616
Interaction model R² = 0.762
ΔR²             = 0.147

Interpretation: A significant F-test (p < 0.05) from the ANOVA comparison confirms that adding the interaction term significantly improves model fit beyond income group and pathogen main effects alone. The ΔR² indicates the additional variance in resistance % explained by the interaction.


Estimated Marginal Means (EMMs)

EMMs allow us to compare income-group effects within each pathogen, controlling for year and infection type.

Code
emm <- emmeans(mod_int, ~ income_group | pathogen)
# Use confint() to guarantee lower.CL / upper.CL columns regardless of emmeans version
emm_df <- confint(emm) |> as.data.frame()
# Defensive rename: some versions emit asymp.LCL / asymp.UCL instead
if (!"lower.CL" %in% names(emm_df)) {
  emm_df <- emm_df |>
    rename(lower.CL = any_of(c("asymp.LCL", "lower.HPD")),
           upper.CL = any_of(c("asymp.UCL", "upper.HPD")))
}
Code
ggplot(emm_df,
       aes(x = income_group, y = emmean,
           colour = income_group, group = 1)) +
  geom_point(size = 4) +
  geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL),
                width = 0.25, linewidth = 1) +
  geom_line(colour = "grey50", linetype = "dashed", linewidth = 0.6) +
  facet_wrap(~ pathogen, ncol = 3,
             labeller = labeller(pathogen = label_wrap_gen(20))) +
  scale_colour_manual(values = income_pal) +
  scale_y_continuous(labels = label_percent(scale = 1)) +
  scale_x_discrete(labels = function(x) str_wrap(x, 8)) +
  labs(
    title    = "Estimated marginal means by income group and pathogen",
    subtitle = "Adjusted for year and infection type. Error bars = 95% CI.",
    x = NULL, y = "Estimated resistance (%)"
  ) +
  theme_amr +
  theme(legend.position = "none")

Figure 4: Estimated marginal means with 95% CIs by income group, faceted by pathogen.

Pairwise contrasts — High income vs. Lower-middle income

Code
contrasts_df <- contrast(emm, method = "pairwise", adjust = "bonferroni") |>
  as.data.frame() |>
  filter(str_detect(contrast, "High income") &
         str_detect(contrast, "Lower-middle income")) |>
  mutate(
    estimate = round(estimate, 2),
    SE       = round(SE, 2),
    p.value  = round(p.value, 4),
    sig = case_when(
      p.value < 0.001 ~ "***",
      p.value < 0.01  ~ "**",
      p.value < 0.05  ~ "*",
      p.value < 0.10  ~ ".",
      TRUE            ~ "ns"
    )
  ) |>
  select(pathogen, contrast, estimate, SE, p.value, sig)

contrasts_df |>
  rename(Pathogen = pathogen, Contrast = contrast,
         `Δ Resistance (%)` = estimate,
         p = p.value, Sig = sig) |>
  kbl(caption = "Bonferroni-adjusted pairwise contrasts: High vs. Lower-middle income per pathogen") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) |>
  column_spec(6, bold = TRUE, color = "firebrick")
Bonferroni-adjusted pairwise contrasts: High vs. Lower-middle income per pathogen
Pathogen Contrast Δ Resistance (%) SE p Sig
Acinetobacter spp. (Lower-middle income) - High income NA NA NA ns
Escherichia coli (Lower-middle income) - High income 46.99 3.55 0.0000 ***
Klebsiella pneumoniae (Lower-middle income) - High income 11.19 3.72 0.0094 **
Salmonella spp. (Lower-middle income) - High income NA NA NA ns
Shigella spp. (Lower-middle income) - High income NA NA NA ns
Staphylococcus aureus (Lower-middle income) - High income 8.54 4.07 0.0379 *

Coefficient plot — interaction terms

Code
tidy(mod_int, conf.int = TRUE) |>
  filter(str_detect(term, ":")) |>
  mutate(
    term = str_replace_all(term, "income_group", "") |>
           str_replace_all("pathogen", "") |>
           str_wrap(40),
    significant = p.value < 0.05
  ) |>
  ggplot(aes(x = estimate, y = reorder(term, estimate),
             colour = significant)) +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey50") +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high),
                 height = 0.3, linewidth = 0.9) +
  geom_point(size = 3.5) +
  scale_colour_manual(values = c("FALSE" = "grey60", "TRUE" = "#C0392B"),
                      labels = c("p ≥ 0.05", "p < 0.05"),
                      name = "Significance") +
  labs(
    title    = "Interaction coefficients: income group × pathogen",
    subtitle = "Estimates relative to reference levels. Red = statistically significant.",
    x        = "Interaction coefficient (percentage points)",
    y        = NULL
  ) +
  theme_amr

Figure 5: Forest plot of income × pathogen interaction coefficients. Bars crossing zero are non-significant.

Summary of Findings

Code
tibble::tribble(
  ~Finding,                         ~Detail,
  "Interaction F-test",             "The income × pathogen interaction significantly improves model fit (see ANOVA table above), confirming that income group effects are pathogen-specific.",
  "Highest resistance burden",      "Lower-middle income countries show the steepest resistance for Acinetobacter spp. (~66%) and E. coli (~61%), substantially exceeding high-income estimates.",
  "Carbapenem-resistant Acinetobacter", "The income gradient is steepest for Acinetobacter spp. — a last-resort treatment pathogen — signalling a critical equity gap in AMR burden.",
  "Klebsiella pneumoniae",          "Unlike E. coli and Acinetobacter, K. pneumoniae shows a more moderate income gradient, suggesting convergence of resistance mechanisms across settings.",
  "Shigella & Salmonella",          "Data sparsity in high-income settings for enteric pathogens limits comparison; confidence intervals are wide and contrasts non-significant.",
  "Policy implication",             "Equity-targeted AMR interventions should prioritise fluoroquinolone resistance in E. coli and carbapenem resistance in Acinetobacter in LMICs."
) |>
  kbl(col.names = c("Finding", "Detail"),
      caption   = "Key findings summary") |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = TRUE) |>
  column_spec(1, bold = TRUE, width = "22%") |>
  column_spec(2, width = "78%")
Key findings summary
Finding Detail
Interaction F-test The income × pathogen interaction significantly improves model fit (see ANOVA table above), confirming that income group effects are pathogen-specific.
Highest resistance burden Lower-middle income countries show the steepest resistance for Acinetobacter spp. (~66%) and E. coli (~61%), substantially exceeding high-income estimates.
Carbapenem-resistant Acinetobacter The income gradient is steepest for Acinetobacter spp. — a last-resort treatment pathogen — signalling a critical equity gap in AMR burden.
Klebsiella pneumoniae Unlike E. coli and Acinetobacter, K. pneumoniae shows a more moderate income gradient, suggesting convergence of resistance mechanisms across settings.
Shigella & Salmonella Data sparsity in high-income settings for enteric pathogens limits comparison; confidence intervals are wide and contrasts non-significant.
Policy implication Equity-targeted AMR interventions should prioritise fluoroquinolone resistance in E. coli and carbapenem resistance in Acinetobacter in LMICs.

Methodological Notes

  • Model: OLS linear regression with income_group * pathogen interaction, adjusted for year and infection_type as covariates. Beta regression (bounded 0–100%) is an alternative given the proportional outcome but was not adopted here given the exploratory nature.
  • Multiple comparisons: Pairwise EMM contrasts use Bonferroni correction — conservative but appropriate given the equity-reporting context.
  • Sample size limitation: Several income × pathogen cells have small n (notably Low income, which contains only Salmonella spp. records), which inflates standard errors and reduces power for those contrasts.
  • Weighted extension: Consider re-running with weights = total_isolates_tested to give higher-volume surveillance sites proportionally greater influence.

Analysis conducted in R. Data source: WHO GLASS / simulated AMR surveillance data.