The metric measures that are used to describe the burden rate of AMR are:
The meaning of terminology
Attributable (to AMR): Refers to deaths or health outcomes directly caused by antimicrobial resistance, meaning they would not have occurred if the infection had been drug-susceptible.
Associated (with AMR): Refers to deaths or health outcomes that occur in the presence of AMR, regardless of whether resistance was the direct cause.
The data on these burden rates were extracted from the GBD Microbe website, which provides estimates of the global burden of antimicrobial resistance (AMR). This platform reports the mortality and DALY (Disability-Adjusted Life Years) rates caused by various drug-resistant pathogens. For this study, we used data categorized by the WHO Priority Pathogen List—which classifies pathogens into critical, high, and medium priority groups—across 204 countries and territories.
att_burden_rate <- read.csv("C:/Users/symou/Desktop/Research_AMR/Research_AMR/amr_burden/att_death_rate/microbe_download.csv")
att_burden_rate <- data.frame (head(att_burden_rate))
print(att_burden_rate[c("Location", "Age", "Metric", "Pathogen", "Antibiotic.class", "Counterfactual", "Year", "Value")])
## Location Age Metric Pathogen
## 1 Afghanistan Age-standardized Rate (per 100k) Acinetobacter baumannii
## 2 Afghanistan Age-standardized Rate (per 100k) Acinetobacter baumannii
## 3 Afghanistan Age-standardized Rate (per 100k) Acinetobacter baumannii
## 4 Afghanistan Age-standardized Rate (per 100k) Acinetobacter baumannii
## 5 Afghanistan Age-standardized Rate (per 100k) Acinetobacter baumannii
## 6 Afghanistan Age-standardized Rate (per 100k) Acinetobacter baumannii
## Antibiotic.class Counterfactual Year Value
## 1 Carbapenems Attributable 1990 2.274598
## 2 Carbapenems Attributable 1991 2.296411
## 3 Carbapenems Attributable 1992 2.470749
## 4 Carbapenems Attributable 1993 2.471484
## 5 Carbapenems Attributable 1994 2.619534
## 6 Carbapenems Attributable 1995 2.539058
\[ \text{Critical Group Burden Rate} = \frac{\sum_{i} \left( \text{BurdenRate}_i \times \text{TotalDeaths}_i \right)}{\sum_{i} \text{TotalDeaths}_i} \]
library (dplyr)
library (tibble)
critical_priority_pairs <- tibble(
Pathogen = c(
"Klebsiella pneumoniae", "Escherichia coli", "Acinetobacter baumannii",
"Mycobacterium tuberculosis", "Mycobacterium tuberculosis", "Mycobacterium tuberculosis",
"Escherichia coli", "Klebsiella pneumoniae", "Enterobacter spp.", "Citrobacter spp.",
"Proteus spp.", "Serratia spp.", "Morganella spp."
),
Antibiotic.class = c(
"Carbapenems", "Third-generation cephalosporins", "Carbapenems",
"Resistance to one or more antibiotics", "Extensive drug resistance in TB",
"Multi-drug resistance excluding extensive drug resistance in TB",
"Carbapenems", "Third-generation cephalosporins", "Carbapenems", "Third-generation cephalosporins",
"Third-generation cephalosporins", "Third-generation cephalosporins", "Third-generation cephalosporins"
)
)
# High Priority bug-drug pairs
high_priority_pairs <- tibble(
Pathogen = c(
"Salmonella enterica serovar Typhi", "Shigella spp.", "Enterococcus faecium",
"Pseudomonas aeruginosa", "Non-typhoidal Salmonella", "Neisseria gonorrhoeae",
"Staphylococcus aureus", "Neisseria gonorrhoeae"
),
Antibiotic.class = c(
"Fluoroquinolones", "Fluoroquinolones", "Vancomycin", "Carbapenems",
"Fluoroquinolones", "Fluoroquinolones", "Methicillin", "Third-generation cephalosporins"
)
)
# Medium Priority bug-drug pairs
medium_priority_pairs <- tibble(
Pathogen = c(
"Group A Streptococcus", "Streptococcus pneumoniae",
"Haemophilus influenzae", "Group B Streptococcus"
),
Antibiotic.class = c(
"Macrolides", "Macrolides", "Aminopenicillin", "Penicillin"
)
)
library(dplyr)
country_list <- read.csv("country_list.csv", fileEncoding = "Windows-1254", stringsAsFactors = FALSE, header = FALSE)
colnames(country_list) <- "location_name"
sdi_value <- read.csv("sdi_value.csv", fileEncoding = "Windows-1254", stringsAsFactors = FALSE)
quintile_sdi_value <- read.csv ("sdi_quintile_ref.csv")
df_sdi_val <- country_list %>%
inner_join(
sdi_value %>%
filter (year_id == 2021) %>%
select (location_name, year_id, mean_value),
by = "location_name"
) %>%
mutate(
sdi_category = case_when(
mean_value >= 0 & mean_value < 0.46581580319161997 ~ "low",
mean_value >= 0.46581580319161997 & mean_value < 0.6188294452454329 ~ "middle_low",
mean_value >= 0.6188294452454329 & mean_value < 0.7119746219361235 ~ "middle",
mean_value >= 0.7119746219361235 & mean_value < 0.8102959891918925 ~ "middle_high",
mean_value >= 0.8102959891918925 ~ "high"
)
)
# JOIN THE SDI CATEGORY TO THE 4 METRIC OF AMR BURDEN DATAFRAME FOR FURTHER STATISTICAL ANALYSIS
## write a function that join the sdi category to each country. Then aggregate the data
agg_with_sdi <- function(metric_df, sdi_df,
location_col_metric = "Location",
location_col_sdi = "location_name",
rate_col = "weighted_avg_rate",
group_cols = c("Location", "priority_group", "sdi_category")) {
# join SDI to metric dataframe by location columns
joined_df <- metric_df %>%
left_join(sdi_df, by = setNames(location_col_sdi, location_col_metric))
# Aggregate mean rate by group
aggregated_df <- joined_df %>%
group_by(across(all_of(group_cols))) %>%
summarise(mean_rate = mean(.data[[rate_col]], na.rm = TRUE),
mean_sdi = mean(mean_value, na.rm = TRUE),
.groups = "drop")
}
final_df <- read.csv("C:/Users/symou/Desktop/Research_AMR/Research_AMR/agg_priority_group_att_death_rate.csv")
print(head(final_df))
## Location priority_group sdi_category mean_rate mean_sdi
## 1 Afghanistan Critical low 2.2511523 0.3372000
## 2 Afghanistan High low 2.0505373 0.3372000
## 3 Afghanistan Medium low 0.4975396 0.3372000
## 4 Albania Critical middle 0.4635214 0.7068498
## 5 Albania High middle 0.7081419 0.7068498
## 6 Albania Medium middle 0.0735871 0.7068498
Since our main objective was to examine the effects of WHO priority group, SDI category, and their interaction on a continuous outcome variable (burden rate), we initially applied a two-way ANOVA. However, the assumptions of ANOVA were violated — specifically, the residuals were not normally distributed, and there was unequal variance across groups.
To address these issues while still testing the interaction between the two categorical factors, we used the Aligned Rank Transformation (ART) ANOVA. The ART method allows for nonparametric analysis of factorial designs by first aligning the data to isolate each main effect and interaction, then applying a rank transformation. This makes it possible to perform ANOVA-like tests without requiring normality or equal variance.
Additionally, because the data included repeated measures across multiple countries, and countries could differ in baseline burden levels, we used a mixed-effects model to account for the random effect of country. This helps control for variability between countries that is not explained by the fixed effects (priority group and SDI). For example, countries vary in healthcare access, antibiotic usage policies, disease surveillance systems, and public health infrastructure. These differences can influence the overall AMR burden, regardless of the WHO priority group or SDI category.
After fitting the ART ANOVA, we used a linear model on the rank-transformed data to estimate the group effects. From this model, we calculated estimated marginal means (EMMs), which represent the adjusted average rank for each combination of SDI category and WHO priority group. This step helps us interpret the group differences while accounting for the other variable and the random effect of country. The EMMs make it easier to compare groups even in the presence of non-normal data and unequal variance, as they are based on the fitted model of aligned ranks rather than raw values.
library (ARTool)
library (emmeans)
library (dplyr)
# Using mixed-effects model to account for fixed effects of priority group and SDI category,
# while modeling random variability across countries (Location) as random intercepts.
## How do WHO priority pathogen groups and SDI categories affect AMR attributable death rates across countries,
## accounting for country-level variability
# create a function to run an ART anova
run_art_analysis <- function(data) {
# ensure categorical variables are treated as factor
data <- data %>%
mutate(across(c(Location, priority_group, sdi_category), as.factor))
# fit ART model with mixed effect for location by + (1|Location)
art_model <- art(mean_rate ~ priority_group * sdi_category + (1|Location), data = data)
print(anova(art_model))
# post-hoc for priority_group variable by estimated marginal means of aligned from linear model
emm_model_priority <- artlm(art_model, "priority_group")
emm_priority <- emmeans (emm_model_priority, pairwise ~ priority_group, adjust = "bonferroni")
# post-hoc for sdi_category
emm_model_sdi <- artlm(art_model, "sdi_category")
emm_sdi <- emmeans(emm_model_sdi, pairwise ~ sdi_category, adjust = "bonferroni")
# Interaction of two variables
emm_model_interaction <- artlm(art_model, "priority_group:sdi_category")
emm_interaction <- emmeans (emm_model_interaction, ~ priority_group | sdi_category)
interaction_pairs <- pairs (emm_interaction, adjust = "bonferroni")
# return all results as a list
return (list (
priority_group = list (emmeans = emm_priority$emmeans, contrasts = emm_priority$contrasts),
sdi_group = list (emmeans = emm_sdi$emmeans, contrasts = emm_sdi$contrasts),
interaction = list (emmeans = emm_interaction, contrasts = interaction_pairs)
))
}
att_asdr_art <- run_art_analysis(final_df)
## Analysis of Variance of Aligned Rank Transformed Data
##
## Table Type: Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
## Model: Mixed Effects (lmer)
## Response: art(mean_rate)
##
## F Df Df.res Pr(>F)
## 1 priority_group 352.452 2 398 < 2.22e-16 ***
## 2 sdi_category 27.498 4 199 < 2.22e-16 ***
## 3 priority_group:sdi_category 16.588 8 398 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## NOTE: Results may be misleading due to involvement in interactions
## NOTE: Results may be misleading due to involvement in interactions
The result showed that there is a significant different of death rate among different pathogen priority group and sdi_category. There is also a significant interaction of the two variable affecting the death rate
# Post Hoc result of variable priority_group
list(
Emms_path = att_asdr_art$priority_group$emmeans,
Contrasts_path = att_asdr_art$priority_group$contrasts)
## $Emms_path
## priority_group emmean SE df lower.CL upper.CL
## Critical 370 8.99 543 353 388
## High 412 8.99 543 394 429
## Medium 136 8.99 543 118 153
##
## Results are averaged over the levels of: sdi_category
## Degrees-of-freedom method: kenward-roger
## Confidence level used: 0.95
##
## $Contrasts_path
## contrast estimate SE df t.ratio p.value
## Critical - High -41.3 11.2 398 -3.689 0.0008
## Critical - Medium 234.5 11.2 398 20.925 <.0001
## High - Medium 275.9 11.2 398 24.615 <.0001
##
## Results are averaged over the levels of: sdi_category
## Degrees-of-freedom method: kenward-roger
## P value adjustment: bonferroni method for 3 tests
# Post Hoc result of variable sdi_category
list(
Emms_sdi = att_asdr_art$sdi_group$emmeans,
Contrasts_sdi = att_asdr_art$sdi_group$contrasts)
## $Emms_sdi
## sdi_category emmean SE df lower.CL upper.CL
## high 158 16.9 199 124 191
## low 380 18.8 199 343 417
## middle 326 17.1 199 292 359
## middle_high 310 16.3 199 278 342
## middle_low 373 16.9 199 340 406
##
## Results are averaged over the levels of: priority_group
## Degrees-of-freedom method: kenward-roger
## Confidence level used: 0.95
##
## $Contrasts_sdi
## contrast estimate SE df t.ratio p.value
## high - low -222.19 25.2 199 -8.807 <.0001
## high - middle -168.07 24.0 199 -7.000 <.0001
## high - middle_high -152.43 23.5 199 -6.497 <.0001
## high - middle_low -215.45 23.9 199 -9.028 <.0001
## low - middle 54.11 25.4 199 2.133 0.3413
## low - middle_high 69.75 24.9 199 2.807 0.0550
## low - middle_low 6.73 25.2 199 0.267 1.0000
## middle - middle_high 15.64 23.6 199 0.662 1.0000
## middle - middle_low -47.38 24.0 199 -1.973 0.4984
## middle_high - middle_low -63.02 23.5 199 -2.686 0.0785
##
## Results are averaged over the levels of: priority_group
## Degrees-of-freedom method: kenward-roger
## P value adjustment: bonferroni method for 10 tests
# Post Hoc result of the interaction between two variables
list(
Emms_int = att_asdr_art$interaction$emmeans,
Contrasts_int = att_asdr_art$interaction$contrasts)
## $Emms_int
## sdi_category = high:
## priority_group emmean SE df lower.CL upper.CL
## Critical 161 25.1 534 111 210
## High 366 25.1 534 316 415
## Medium 434 25.1 534 385 483
##
## sdi_category = low:
## priority_group emmean SE df lower.CL upper.CL
## Critical 346 27.9 534 291 401
## High 179 27.9 534 125 234
## Medium 306 27.9 534 251 361
##
## sdi_category = middle:
## priority_group emmean SE df lower.CL upper.CL
## Critical 285 25.4 534 236 335
## High 360 25.4 534 310 410
## Medium 322 25.4 534 272 372
##
## sdi_category = middle_high:
## priority_group emmean SE df lower.CL upper.CL
## Critical 255 24.3 534 208 303
## High 393 24.3 534 346 441
## Medium 319 24.3 534 271 367
##
## sdi_category = middle_low:
## priority_group emmean SE df lower.CL upper.CL
## Critical 339 25.1 534 289 388
## High 222 25.1 534 173 271
## Medium 292 25.1 534 242 341
##
## Degrees-of-freedom method: kenward-roger
## Confidence level used: 0.95
##
## $Contrasts_int
## sdi_category = high:
## contrast estimate SE df t.ratio p.value
## Critical - High -205.0 30.9 398 -6.632 <.0001
## Critical - Medium -273.2 30.9 398 -8.840 <.0001
## High - Medium -68.3 30.9 398 -2.209 0.0833
##
## sdi_category = low:
## contrast estimate SE df t.ratio p.value
## Critical - High 166.6 34.4 398 4.851 <.0001
## Critical - Medium 40.2 34.4 398 1.171 0.7266
## High - Medium -126.4 34.4 398 -3.680 0.0008
##
## sdi_category = middle:
## contrast estimate SE df t.ratio p.value
## Critical - High -74.3 31.3 398 -2.376 0.0539
## Critical - Medium -36.2 31.3 398 -1.159 0.7419
## High - Medium 38.1 31.3 398 1.218 0.6720
##
## sdi_category = middle_high:
## contrast estimate SE df t.ratio p.value
## Critical - High -138.2 29.9 398 -4.630 <.0001
## Critical - Medium -63.6 29.9 398 -2.131 0.1012
## High - Medium 74.6 29.9 398 2.499 0.0386
##
## sdi_category = middle_low:
## contrast estimate SE df t.ratio p.value
## Critical - High 116.9 30.9 398 3.782 0.0005
## Critical - Medium 47.0 30.9 398 1.521 0.3874
## High - Medium -69.9 30.9 398 -2.262 0.0728
##
## Degrees-of-freedom method: kenward-roger
## P value adjustment: bonferroni method for 3 tests
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.3
interaction_emm <- att_asdr_art$interaction$emmeans
interaction_df <- as.data.frame(interaction_emm)
ggplot(interaction_df, aes(x = priority_group, y = emmean, fill = sdi_category)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.8)) +
geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL),
position = position_dodge(width = 0.8), width = 0.2) +
scale_fill_brewer(palette = "Pastel1") +
labs(title = "Interaction: Priority Group × SDI Category",
x = "Priority Group", y = "Estimated Marginal Mean (Aligned)",
fill = "SDI Category") +
theme_minimal()
- I am still learning the intuition of
Aligned Rank Transformation ANOVA, so I am not sure if I do the analysis
correctly or not.
- What is surprising me that how could the estimated marginal mean
differ so much from the raw mean. I understand that this aligned ranked
value of the data is used to account for the difference of base line for
each country then fitting the linear model on that rank to have
estimated mean. However, I am not sure if the process is 100 percent
correct.
- That is why I would like to have your opinion, if the methodology I am
using, the code that I wrote is seemingly correct.
- If you would have a time, please have a look and check for me please
professor
- Thank you in advance!