Income Group × Pathogen Interaction in AMR Resistance

Equity-Focused Analysis of Global Antimicrobial Resistance Data

Author

Timothy Achala

Background

Antimicrobial resistance (AMR) disproportionately burdens low- and middle-income countries (LMICs), yet the pathogen-specific nature of this inequity is often masked in aggregate reporting. This analysis tests whether the effect of income group on resistance percentage differs across bacterial pathogens — an interaction effect whose significance has direct implications for equity-focused AMR policy.

Analytical question: Does income group moderate resistance rates, and does this moderation vary by pathogen?

Setup

Code

library(tidyverse)
library(broom)
library(emmeans)
library(ggpubr)
library(scales)
library(patchwork)
library(kableExtra)

# Load data
amr <- read_csv("amr_global_resistance_rates.csv") |>
  mutate(
    income_group = factor(income_group,
      levels = c("Low income", "Lower-middle income",
                 "Upper-middle income", "High income")),
    pathogen = factor(pathogen),
    antibiotic_class = factor(antibiotic_class),
    infection_type = factor(infection_type)
  )

# Colour palette (income gradient: equity-intuitive)
income_pal <- c(
  "Low income"           = "#C0392B",
  "Lower-middle income"  = "#E67E22",
  "Upper-middle income"  = "#2980B9",
  "High income"          = "#27AE60"
)

theme_amr <- theme_minimal(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold", size = 13),
    plot.subtitle = element_text(colour = "grey40", size = 11),
    legend.position = "bottom",
    strip.text    = element_text(face = "bold", size = 10),
    panel.grid.minor = element_blank()
  )

Exploratory Overview

Sample distribution

Code

amr |>
  count(income_group, pathogen) |>
  pivot_wider(names_from = income_group, values_from = n, values_fill = 0) |>
  rename(Pathogen = pathogen) |>
  kbl(caption = "Record counts by income group and pathogen") |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) |>
  column_spec(1, bold = TRUE)

Table 1: ?(caption)

(a) Record counts by income group and pathogen
Pathogen	Low income	Lower-middle income	Upper-middle income	High income
Salmonella spp.	6	0	0	0
Acinetobacter spp.	0	6	6	0
Escherichia coli	0	18	12	18
Klebsiella pneumoniae	0	12	12	18
Shigella spp.	0	6	0	0
Staphylococcus aureus	0	12	0	12

Cell means table

Code

amr |>
  group_by(pathogen, income_group) |>
  summarise(
    n         = n(),
    mean_res  = round(mean(resistance_pct), 1),
    sd_res    = round(sd(resistance_pct), 1),
    .groups   = "drop"
  ) |>
  rename(
    Pathogen     = pathogen,
    `Income group` = income_group,
    N            = n,
    `Mean resistance (%)` = mean_res,
    SD           = sd_res
  ) |>
  kbl(caption = "Mean resistance % by pathogen and income group") |>
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) |>
  column_spec(1, bold = TRUE)

Table 2: ?(caption)

(a) Mean resistance % by pathogen and income group
Pathogen	Income group	N	Mean resistance (%)	SD
Acinetobacter spp.	Lower-middle income	6	66.4	2.9
Acinetobacter spp.	Upper-middle income	6	52.6	3.0
Escherichia coli	Lower-middle income	18	60.7	8.3
Escherichia coli	Upper-middle income	12	44.1	3.7
Escherichia coli	High income	18	13.7	5.8
Klebsiella pneumoniae	Lower-middle income	12	26.8	9.3
Klebsiella pneumoniae	Upper-middle income	12	39.9	18.5
Klebsiella pneumoniae	High income	18	15.6	10.5
Salmonella spp.	Low income	6	18.0	4.1
Shigella spp.	Lower-middle income	6	31.5	10.5
Staphylococcus aureus	Lower-middle income	12	35.6	4.2
Staphylococcus aureus	High income	12	27.1	17.3

Visual Exploration

Interaction plot — mean resistance by income × pathogen

Code

cell_means <- amr |>
  group_by(income_group, pathogen) |>
  summarise(
    mean_res = mean(resistance_pct),
    se       = sd(resistance_pct) / sqrt(n()),
    .groups  = "drop"
  )

ggplot(cell_means,
       aes(x = income_group, y = mean_res,
           colour = pathogen, group = pathogen)) +
  geom_line(linewidth = 0.9, alpha = 0.85) +
  geom_point(size = 3.5) +
  geom_errorbar(aes(ymin = mean_res - se, ymax = mean_res + se),
                width = 0.15, alpha = 0.6) +
  scale_y_continuous(labels = label_percent(scale = 1),
                     limits = c(0, 80)) +
  scale_x_discrete(labels = function(x) str_wrap(x, 10)) +
  labs(
    title    = "Income group × pathogen interaction",
    subtitle = "Error bars = ±1 SE. Non-parallel lines indicate moderation.",
    x        = "Income group",
    y        = "Mean resistance (%)",
    colour   = "Pathogen"
  ) +
  theme_amr

Figure 1: Interaction plot: mean resistance % by income group for each pathogen. Non-parallel lines suggest a significant interaction.

Reading the plot: If lines were parallel, income group would have a uniform effect on resistance across all pathogens. Crossing or diverging lines signal a significant interaction — the effect of income group is pathogen-specific.

Heatmap of mean resistance

Code

cell_means |>
  ggplot(aes(x = income_group, y = pathogen, fill = mean_res)) +
  geom_tile(colour = "white", linewidth = 0.6) +
  geom_text(aes(label = sprintf("%.1f%%", mean_res)),
            size = 3.8, fontface = "bold",
            colour = ifelse(cell_means$mean_res > 45, "white", "grey20")) +
  scale_fill_gradient2(
    low = "#27AE60", mid = "#F39C12", high = "#C0392B",
    midpoint = 35,
    labels = label_percent(scale = 1),
    name = "Resistance %"
  ) +
  scale_x_discrete(labels = function(x) str_wrap(x, 10)) +
  labs(
    title    = "AMR resistance heatmap: pathogen × income group",
    subtitle = "Darker red = higher resistance burden",
    x = "Income group", y = NULL
  ) +
  theme_amr +
  theme(axis.text.y = element_text(face = "italic"))

Figure 2: Heatmap of mean resistance % across pathogen × income group combinations.

Boxplots by pathogen, coloured by income group

Code

amr |>
  ggplot(aes(x = income_group, y = resistance_pct, fill = income_group)) +
  geom_boxplot(alpha = 0.75, outlier.shape = 21, outlier.size = 2) +
  geom_jitter(width = 0.15, alpha = 0.4, size = 1.2, colour = "grey30") +
  facet_wrap(~pathogen, ncol = 3, scales = "free_y",
             labeller = labeller(pathogen = label_wrap_gen(20))) +
  scale_fill_manual(values = income_pal) +
  scale_y_continuous(labels = label_percent(scale = 1)) +
  scale_x_discrete(labels = function(x) str_wrap(x, 8)) +
  labs(
    title    = "Resistance distribution by income group and pathogen",
    subtitle = "Each panel = one pathogen; jittered points = individual records",
    x = NULL, y = "Resistance (%)"
  ) +
  theme_amr +
  theme(legend.position = "none")

Figure 3: Distribution of resistance % by income group, faceted by pathogen.

Statistical Modelling

Model 1 — Additive (no interaction)

Code

mod_add <- lm(resistance_pct ~ income_group + pathogen + year + infection_type,
              data = amr)
tidy(mod_add, conf.int = TRUE) |>
  filter(p.value < 0.2) |>
  mutate(across(c(estimate, conf.low, conf.high), \(x) round(x, 2)),
         p.value = round(p.value, 4)) |>
  rename(Term = term, Beta = estimate, SE = std.error,
         `95% CI low` = conf.low, `95% CI high` = conf.high,
         t = statistic, p = p.value) |>
  kbl(caption = "Additive model — selected coefficients (p < 0.20)") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE)

Additive model — selected coefficients (p < 0.20)
Term	Beta	SE	t	p	95% CI low	95% CI high
(Intercept)	-1972.60	1256.910809	-1.569402	0.1190	-4459.61	514.41
income_groupLower-middle income	42.04	6.434580	6.533405	0.0000	29.31	54.77
income_groupUpper-middle income	40.91	6.434580	6.357878	0.0000	28.18	53.64
income_groupHigh income	17.15	6.694032	2.562307	0.0116	3.91	30.40
pathogenEscherichia coli	-18.22	5.265669	-3.460697	0.0007	-28.64	-7.80
pathogenKlebsiella pneumoniae	-23.33	4.215046	-5.534814	0.0000	-31.67	-14.99
pathogenShigella spp.	-28.61	6.434580	-4.445727	0.0000	-41.34	-15.87
pathogenStaphylococcus aureus	-16.27	4.676492	-3.478414	0.0007	-25.52	-7.01
year	0.99	0.622074	1.583762	0.1157	-0.25	2.22
infection_typeUrinary Tract Infection	8.90	4.359501	2.042292	0.0432	0.28	17.53

Model 2 — Interaction model (income × pathogen)

Code

mod_int <- lm(resistance_pct ~ income_group * pathogen + year + infection_type,
              data = amr)

tidy(mod_int, conf.int = TRUE) |>
  filter(str_detect(term, ":")) |>        # interaction terms only
  mutate(across(c(estimate, conf.low, conf.high), \(x) round(x, 2)),
         p.value = round(p.value, 4),
         sig = case_when(
           p.value < 0.001 ~ "***",
           p.value < 0.01  ~ "**",
           p.value < 0.05  ~ "*",
           p.value < 0.10  ~ ".",
           TRUE            ~ ""
         )) |>
  rename(Interaction = term, Beta = estimate, SE = std.error,
         `95% CI low` = conf.low, `95% CI high` = conf.high,
         p = p.value, Sig = sig) |>
  kbl(caption = "Interaction terms (income_group × pathogen)") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) |>
  column_spec(7, bold = TRUE, color = "firebrick")

Interaction terms (income_group × pathogen)
Interaction	Beta	SE	statistic	p	95% CI low	95% CI high	Sig
income_groupLower-middle income:pathogenEscherichia coli	38.45	5.402421	7.1166648	0.0000	27.75	49.14	***
income_groupUpper-middle income:pathogenEscherichia coli	35.79	7.993373	4.4776672	0.0000	19.97	51.61	***
income_groupHigh income:pathogenEscherichia coli	NA	NA	NA	NA	NA	NA
income_groupLower-middle income:pathogenKlebsiella pneumoniae	2.65	5.510842	0.4808703	0.6315	-8.26	13.56
income_groupUpper-middle income:pathogenKlebsiella pneumoniae	29.58	7.968664	3.7114124	0.0003	13.80	45.35	***
income_groupHigh income:pathogenKlebsiella pneumoniae	NA	NA	NA	NA	NA	NA
income_groupLower-middle income:pathogenSalmonella spp.	NA	NA	NA	NA	NA	NA
income_groupUpper-middle income:pathogenSalmonella spp.	NA	NA	NA	NA	NA	NA
income_groupHigh income:pathogenSalmonella spp.	NA	NA	NA	NA	NA	NA
income_groupLower-middle income:pathogenShigella spp.	NA	NA	NA	NA	NA	NA
income_groupUpper-middle income:pathogenShigella spp.	NA	NA	NA	NA	NA	NA
income_groupHigh income:pathogenShigella spp.	NA	NA	NA	NA	NA	NA
income_groupLower-middle income:pathogenStaphylococcus aureus	NA	NA	NA	NA	NA	NA
income_groupUpper-middle income:pathogenStaphylococcus aureus	NA	NA	NA	NA	NA	NA
income_groupHigh income:pathogenStaphylococcus aureus	NA	NA	NA	NA	NA	NA

Model comparison — does the interaction improve fit?

Code

anova_tbl <- anova(mod_add, mod_int)
anova_tbl |>
  tidy() |>
  mutate(across(where(is.numeric), \(x) round(x, 4))) |>
  kbl(caption = "ANOVA model comparison: additive vs. interaction model") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE)

ANOVA model comparison: additive vs. interaction model
term	df.residual	rss	df	sumsq	statistic	p.value
resistance_pct ~ income_group + pathogen + year + infection_type	128	19937.01	NA	NA	NA	NA
resistance_pct ~ income_group * pathogen + year + infection_type	124	12324.45	4	7612.559	19.1481	0

Code

r2_add <- summary(mod_add)$r.squared
r2_int <- summary(mod_int)$r.squared
cat(sprintf("Additive model  R² = %.3f\nInteraction model R² = %.3f\nΔR²             = %.3f\n",
            r2_add, r2_int, r2_int - r2_add))

Additive model  R² = 0.616
Interaction model R² = 0.762
ΔR²             = 0.147

Interpretation: A significant F-test (p < 0.05) from the ANOVA comparison confirms that adding the interaction term significantly improves model fit beyond income group and pathogen main effects alone. The ΔR² indicates the additional variance in resistance % explained by the interaction.

Estimated Marginal Means (EMMs)

EMMs allow us to compare income-group effects within each pathogen, controlling for year and infection type.

Code

emm <- emmeans(mod_int, ~ income_group | pathogen)
# Use confint() to guarantee lower.CL / upper.CL columns regardless of emmeans version
emm_df <- confint(emm) |> as.data.frame()
# Defensive rename: some versions emit asymp.LCL / asymp.UCL instead
if (!"lower.CL" %in% names(emm_df)) {
  emm_df <- emm_df |>
    rename(lower.CL = any_of(c("asymp.LCL", "lower.HPD")),
           upper.CL = any_of(c("asymp.UCL", "upper.HPD")))
}

Code

ggplot(emm_df,
       aes(x = income_group, y = emmean,
           colour = income_group, group = 1)) +
  geom_point(size = 4) +
  geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL),
                width = 0.25, linewidth = 1) +
  geom_line(colour = "grey50", linetype = "dashed", linewidth = 0.6) +
  facet_wrap(~ pathogen, ncol = 3,
             labeller = labeller(pathogen = label_wrap_gen(20))) +
  scale_colour_manual(values = income_pal) +
  scale_y_continuous(labels = label_percent(scale = 1)) +
  scale_x_discrete(labels = function(x) str_wrap(x, 8)) +
  labs(
    title    = "Estimated marginal means by income group and pathogen",
    subtitle = "Adjusted for year and infection type. Error bars = 95% CI.",
    x = NULL, y = "Estimated resistance (%)"
  ) +
  theme_amr +
  theme(legend.position = "none")

Figure 4: Estimated marginal means with 95% CIs by income group, faceted by pathogen.

Pairwise contrasts — High income vs. Lower-middle income

Code

contrasts_df <- contrast(emm, method = "pairwise", adjust = "bonferroni") |>
  as.data.frame() |>
  filter(str_detect(contrast, "High income") &
         str_detect(contrast, "Lower-middle income")) |>
  mutate(
    estimate = round(estimate, 2),
    SE       = round(SE, 2),
    p.value  = round(p.value, 4),
    sig = case_when(
      p.value < 0.001 ~ "***",
      p.value < 0.01  ~ "**",
      p.value < 0.05  ~ "*",
      p.value < 0.10  ~ ".",
      TRUE            ~ "ns"
    )
  ) |>
  select(pathogen, contrast, estimate, SE, p.value, sig)

contrasts_df |>
  rename(Pathogen = pathogen, Contrast = contrast,
         `Δ Resistance (%)` = estimate,
         p = p.value, Sig = sig) |>
  kbl(caption = "Bonferroni-adjusted pairwise contrasts: High vs. Lower-middle income per pathogen") |>
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) |>
  column_spec(6, bold = TRUE, color = "firebrick")

Bonferroni-adjusted pairwise contrasts: High vs. Lower-middle income per pathogen
Pathogen	Contrast	Δ Resistance (%)	SE	p	Sig
Acinetobacter spp.	(Lower-middle income) - High income	NA	NA	NA	ns
Escherichia coli	(Lower-middle income) - High income	46.99	3.55	0.0000	***
Klebsiella pneumoniae	(Lower-middle income) - High income	11.19	3.72	0.0094	**
Salmonella spp.	(Lower-middle income) - High income	NA	NA	NA	ns
Shigella spp.	(Lower-middle income) - High income	NA	NA	NA	ns
Staphylococcus aureus	(Lower-middle income) - High income	8.54	4.07	0.0379	*

Coefficient plot — interaction terms

Code

tidy(mod_int, conf.int = TRUE) |>
  filter(str_detect(term, ":")) |>
  mutate(
    term = str_replace_all(term, "income_group", "") |>
           str_replace_all("pathogen", "") |>
           str_wrap(40),
    significant = p.value < 0.05
  ) |>
  ggplot(aes(x = estimate, y = reorder(term, estimate),
             colour = significant)) +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey50") +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high),
                 height = 0.3, linewidth = 0.9) +
  geom_point(size = 3.5) +
  scale_colour_manual(values = c("FALSE" = "grey60", "TRUE" = "#C0392B"),
                      labels = c("p ≥ 0.05", "p < 0.05"),
                      name = "Significance") +
  labs(
    title    = "Interaction coefficients: income group × pathogen",
    subtitle = "Estimates relative to reference levels. Red = statistically significant.",
    x        = "Interaction coefficient (percentage points)",
    y        = NULL
  ) +
  theme_amr

Figure 5: Forest plot of income × pathogen interaction coefficients. Bars crossing zero are non-significant.

Summary of Findings

Code

tibble::tribble(
  ~Finding,                         ~Detail,
  "Interaction F-test",             "The income × pathogen interaction significantly improves model fit (see ANOVA table above), confirming that income group effects are pathogen-specific.",
  "Highest resistance burden",      "Lower-middle income countries show the steepest resistance for Acinetobacter spp. (~66%) and E. coli (~61%), substantially exceeding high-income estimates.",
  "Carbapenem-resistant Acinetobacter", "The income gradient is steepest for Acinetobacter spp. — a last-resort treatment pathogen — signalling a critical equity gap in AMR burden.",
  "Klebsiella pneumoniae",          "Unlike E. coli and Acinetobacter, K. pneumoniae shows a more moderate income gradient, suggesting convergence of resistance mechanisms across settings.",
  "Shigella & Salmonella",          "Data sparsity in high-income settings for enteric pathogens limits comparison; confidence intervals are wide and contrasts non-significant.",
  "Policy implication",             "Equity-targeted AMR interventions should prioritise fluoroquinolone resistance in E. coli and carbapenem resistance in Acinetobacter in LMICs."
) |>
  kbl(col.names = c("Finding", "Detail"),
      caption   = "Key findings summary") |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = TRUE) |>
  column_spec(1, bold = TRUE, width = "22%") |>
  column_spec(2, width = "78%")

Key findings summary
Finding	Detail
Interaction F-test	The income × pathogen interaction significantly improves model fit (see ANOVA table above), confirming that income group effects are pathogen-specific.
Highest resistance burden	Lower-middle income countries show the steepest resistance for Acinetobacter spp. (~66%) and E. coli (~61%), substantially exceeding high-income estimates.
Carbapenem-resistant Acinetobacter	The income gradient is steepest for Acinetobacter spp. — a last-resort treatment pathogen — signalling a critical equity gap in AMR burden.
Klebsiella pneumoniae	Unlike E. coli and Acinetobacter, K. pneumoniae shows a more moderate income gradient, suggesting convergence of resistance mechanisms across settings.
Shigella & Salmonella	Data sparsity in high-income settings for enteric pathogens limits comparison; confidence intervals are wide and contrasts non-significant.
Policy implication	Equity-targeted AMR interventions should prioritise fluoroquinolone resistance in E. coli and carbapenem resistance in Acinetobacter in LMICs.

Methodological Notes

Model: OLS linear regression with income_group * pathogen interaction, adjusted for year and infection_type as covariates. Beta regression (bounded 0–100%) is an alternative given the proportional outcome but was not adopted here given the exploratory nature.
Multiple comparisons: Pairwise EMM contrasts use Bonferroni correction — conservative but appropriate given the equity-reporting context.
Sample size limitation: Several income × pathogen cells have small n (notably Low income, which contains only Salmonella spp. records), which inflates standard errors and reduces power for those contrasts.
Weighted extension: Consider re-running with weights = total_isolates_tested to give higher-volume surveillance sites proportionally greater influence.

Analysis conducted in R. Data source: WHO GLASS / simulated AMR surveillance data.