Executive Summary

This analysis addresses the methodological question raised in ANCOM-BC Issue #196: Can normalized abundance data be effectively used with differential abundance tools designed for raw counts?


1. Introduction and Motivation

Background

Many differential abundance (DA) tools were originally designed for raw read counts, where library-size differences are informative and corrected internally (e.g. via size-factor estimation, library size, or gene length). For genome-level metagenomic data, however, raw counts are dominated by genome length; common practice is therefore to normalise reads to TPM/RPKM or coverage. This raises a methodological question: does feeding such pre-normalised values into count-based DA tools change the results?

This study evaluates 48 metric × method combinations (6 metrics × 8 DA tools) using CAMISIM synthetic communities with known ground truth.

Research Questions

  1. Cross-metric performance: How do DA tools perform across different abundance metrics?
  2. Normalization impact: Does length/depth normalization improve or reduce DA detection power?
  3. Method preferences: Do some DA tools work better with specific abundance metrics?
  4. Consistency evaluation: Are DA results consistent across metrics for individual tools?

Study Design

  • Simulation: CAMISIM with 24 genomes (14 truly differentially abundant)
  • Synthetic sample generation: CAMISIM created 60 metagenomic samples (20 base samples × 3 replicates each; 30 control, 30 treatment)
  • Upstream processing: Each sample was processed with the standard MAGician workflow (repository).
  • Abundance Metrics: 6 metrics (counts, TPM, RPKM, relative abundance, breadth, coverage)
  • Coverage quantification: Genome-level abundance metrics were computed using CoverM (repository).
  • DA Methods: 8 tools (ALDEx2, ANCOM, corncob, dearseq, edgeR, limma, metagenomeSeq, NOISeq), run via Benchdamic (repository).
  • Evaluation: Binary classification and threshold-free metrics (F1-Score, AUC-PR, Precision, Recall, Specificity, NPV)
  • Statistical Tests: t-test for F1-Score differences

Definitions

Abundance-metric normalisation cheat-sheet

Metric | Length normalisation | Depth normalisation | What the number means
counts | ✗ (raw reads) | ✗ | Number of read pairs whose best hit is this genome
relative % | ✗ | ✓ (values sum to 100 %) | Share of all mapped reads that belong to the genome
breadth | ✓ (divide by length) | ✗ | Fraction of genome bases with ≥ 1× read coverage
coverage (mean depth) | ✓ (divide by length) | ✗ * | Mean per-base depth across the genome
RPKM | ✓ (/ kb) | ✓ (/ 10⁶ reads) | Reads-per-kilobase-per-million mapped reads
TPM | ✓ (/ kb) | ✓ (rescaled to 10⁶) | Transcripts-per-million; similar to RPKM but sample-wise rescaled

* Coverage still increases with sequencing depth, but it no longer carries the multiplicative genome-length bias.
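
To make the table concrete, here is a minimal R sketch of how TPM, RPKM, and relative abundance relate to raw counts and genome length. The toy data frame and its column names (reads, genome_length_bp) are hypothetical; in the actual workflow these values are produced by CoverM. Breadth and mean coverage are omitted because they require per-base depth rather than a single count per genome.

# Hypothetical single-sample example (not pipeline output)
abund <- data.frame(
  genome           = c("gA", "gB", "gC"),
  reads            = c(1000, 4000, 500),   # mapped read pairs per genome
  genome_length_bp = c(2e6, 5e6, 1e6)
)
total_reads <- sum(abund$reads)

# Relative abundance: depth-normalised only, values sum to 100 %
abund$relative <- 100 * abund$reads / total_reads

# RPKM: reads / (length in kb) / (total mapped reads in millions)
abund$rpkm <- abund$reads / (abund$genome_length_bp / 1e3) / (total_reads / 1e6)

# TPM: length-normalise first, then rescale so the sample sums to 10^6
rate <- abund$reads / (abund$genome_length_bp / 1e3)
abund$tpm <- 1e6 * rate / sum(rate)

abund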

2. Data Loading and Preprocessing

Details the data import, cleaning, and integration steps used to construct the analysis dataset.

# Load required libraries
suppressPackageStartupMessages({
  library(tidyverse)
  library(ggplot2)
  library(gridExtra)
  library(pROC)
  library(PRROC)
  library(corrplot)
  library(knitr)
  library(kableExtra)
  library(DT)
  library(plotly)
  library(RColorBrewer)
  library(viridis)
})

theme_set(theme_bw() + 
          theme(text = element_text(size = 12),
                plot.title = element_text(face = "bold", size = 14),
                axis.title = element_text(face = "bold"),
                legend.position = "bottom"))

metric_colors <- c("counts" = "#1f77b4", "tpm" = "#ff7f0e", "rpkm" = "#2ca02c", 
                   "relative" = "#d62728", "breadth" = "#9467bd", "coverage" = "#8c564b")

tool_colors <- c("ALDEx2" = "#1f77b4", "ANCOM" = "#ff7f0e", "basic" = "#2ca02c",
                "corncob" = "#d62728", "dearseq" = "#9467bd", "edgeR" = "#8c564b",
                "limma" = "#e377c2", "metagenomeSeq" = "#7f7f7f", "NOISeq" = "#bcbd22")

Genomic Sample Distribution

The genomes and their distribution across the synthetic samples were as follows:

Sample Genome Distribution Across Synthetic Samples (each genome's value is identical across all 30 control samples and across all 30 treatment samples, so the 60 sample columns are collapsed to one column per group)
genome | seq_type | control samples (C01–C10, reps 1–3) | treatment samples (T01–T10, reps 1–3)
GCA_000006785.2_ASM678v2.fa | chromosome | 10 | 40
GCA_000013285.1_ASM1328v1.fa | chromosome | 14 | 14
GCA_000013785.1_ASM1378v1.fa | chromosome | 9 | 9
GCA_000015865.1_ASM1586v1.fa | chromosome | 8 | 16
GCA_000022345.1_ASM2234v1.fa | chromosome | 15 | 120
GCA_000023785.1_ASM2378v1.fa | chromosome | 9 | 9
GCA_000025905.1_ASM2590v1.fa | chromosome | 12 | 12
GCA_000092125.1_ASM9212v1.fa | chromosome | 11 | 44
GCA_000092825.1_ASM9282v1.fa | chromosome | 15 | 60
GCA_000092985.1_ASM9298v1.fa | chromosome | 7 | 28
GCA_000143845.1_ASM14384v1.fa | chromosome | 19 | 152
GCA_000143985.1_ASM14398v1.fa | chromosome | 6 | 12
GCA_000227705.3_ASM22770v3.fa | chromosome | 12 | 48
GCA_000231385.3_ASM23138v3.fa | chromosome | 11 | 88
GCA_000233715.3_ASM23371v3.fa | chromosome | 5 | 10
GCA_000235405.3_ASM23540v3.fa | chromosome | 13 | 13
GCA_000242255.3_ASM24225v3.fa | chromosome | 8 | 8
GCA_000255115.3_ASM25511v3.fa | chromosome | 5 | 5
GCA_000265425.1_ASM26542v1.fa | chromosome | 14 | 28
GCA_000325705.1_ASM32570v1.fa | chromosome | 5 | 20
GCA_000439255.1_ASM43925v1.fa | chromosome | 5 | 10
GCA_000445015.1_ASM44501v1.fa | chromosome | 9 | 9
GCA_001692755.1_ASM169275v1.fa | chromosome | 15 | 15
GCA_001877055.1_ASM187705v1.fa | chromosome | 14 | 14

Ground Truth

# Load ground truth
truth_table <- read_csv('/home/mrahman/da_analysis/data_camisim_realistic_reps/truth_table.csv') %>%
  mutate(genome = str_replace(genome, "\\.fa$", "")) %>%
  mutate(genome = str_replace_all(genome, "[^A-Za-z0-9]", "")) %>%
  distinct(genome, .keep_all = TRUE)

cat("DA genomes:", sum(truth_table$is_DA), "\n")
## DA genomes: 14
cat("Non-DA genomes:", sum(!truth_table$is_DA), "\n")
## Non-DA genomes: 10
truth_table %>%
  group_by(is_DA) %>%
  summarise(
    n_genomes = n(),
    mean_log2FC = mean(true_log2FC, na.rm = TRUE),
    min_log2FC = min(true_log2FC, na.rm = TRUE),
    max_log2FC = max(true_log2FC, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  kable(caption = "Ground Truth Summary", digits = 3) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))
Ground Truth Summary
is_DA n_genomes mean_log2FC min_log2FC max_log2FC
FALSE 10 0.000 0 0
TRUE 14 1.857 1 3
Ground Truth Table: True Differential Abundance Status
genome is_DA true_log2FC baseline_weight enriched_weight
GCA0000067852ASM678v2 TRUE 2 10 40
GCA0000132851ASM1328v1 FALSE 0 14 14
GCA0000137851ASM1378v1 FALSE 0 9 9
GCA0000158651ASM1586v1 TRUE 1 8 16
GCA0000223451ASM2234v1 TRUE 3 15 120
GCA0000237851ASM2378v1 FALSE 0 9 9
GCA0000259051ASM2590v1 FALSE 0 12 12
GCA0000921251ASM9212v1 TRUE 2 11 44
GCA0000928251ASM9282v1 TRUE 2 15 60
GCA0000929851ASM9298v1 TRUE 2 7 28
GCA0001438451ASM14384v1 TRUE 3 19 152
GCA0001439851ASM14398v1 TRUE 1 6 12
GCA0002277053ASM22770v3 TRUE 2 12 48
GCA0002313853ASM23138v3 TRUE 3 11 88
GCA0002337153ASM23371v3 TRUE 1 5 10
GCA0002354053ASM23540v3 FALSE 0 13 13
GCA0002422553ASM24225v3 FALSE 0 8 8
GCA0002551153ASM25511v3 FALSE 0 5 5
GCA0002654251ASM26542v1 TRUE 1 14 28
GCA0003257051ASM32570v1 TRUE 2 5 20
GCA0004392551ASM43925v1 TRUE 1 5 10
GCA0004450151ASM44501v1 FALSE 0 9 9
GCA0016927551ASM169275v1 FALSE 0 15 15
GCA0018770551ASM187705v1 FALSE 0 14 14
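
As a quick consistency check (assuming, as the table suggests, that true_log2FC was derived directly from the CAMISIM weights), the fold change can be recomputed from the baseline and enriched weights:

# Recompute log2 fold change from the simulation weights and compare
truth_check <- truth_table %>%
  mutate(recomputed_log2FC = log2(enriched_weight / baseline_weight))

# TRUE if the declared ground truth is internally consistent
all(abs(truth_check$recomputed_log2FC - truth_check$true_log2FC) < 1e-9)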

Simulated read-depth per sample

The CAMISIM simulation generated samples with varying sequencing depths to approximate realistic sequencing conditions:

## Sequencing depth summary:
## - Total samples: 60
## - Range: 16,022,043 - 58,612,937 read pairs
## - Mean: 35,477,784 read pairs
## - Median: 35,149,150 read pairs
## - Coefficient of variation: 0.349
Sequencing depth (read pairs) per simulated sample (first 5 entries)
Sample Read Pairs
C01_rep01 25,527,277
C01_rep02 17,623,623
C01_rep03 27,662,275
C02_rep01 28,211,719
C02_rep02 44,786,243
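
The summary statistics above can be reproduced from a per-sample read-pair table; a minimal sketch, assuming a data frame depth_df with columns sample and read_pairs (the object and column names are hypothetical):

# Hypothetical per-sample depth table; in the real workflow these counts
# come from the CAMISIM outputs
depth_df %>%
  summarise(
    n_samples    = n(),
    min_depth    = min(read_pairs),
    max_depth    = max(read_pairs),
    mean_depth   = mean(read_pairs),
    median_depth = median(read_pairs),
    cv           = sd(read_pairs) / mean(read_pairs)  # coefficient of variation
  )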

Loading DA Results

# Define available metrics
metrics <- c("counts", "tpm", "rpkm", "relative", "breadth", "coverage")
cat("Available metrics:", paste(metrics, collapse = ", "), "\n")
## Available metrics: counts, tpm, rpkm, relative, breadth, coverage
# load DA results for a specific metric
load_metric_results <- function(metric) {
  da_dir <- paste0('/home/mrahman/da_analysis/data_camisim_realistic_reps/da/', metric)
  da_files <- list.files(da_dir, pattern = '^DA_.*\\.csv$', full.names = TRUE) %>%
            .[!grepl("DA_basic\\.csv$", .)] # basic results are invalid 0s

  read_da_file <- function(path) {
    tool <- basename(path) %>% str_replace_all('^DA_|\\.csv$', '')
    
    df <- read_csv(path, col_names = FALSE, show_col_types = FALSE) %>%
      rename(genome = X1) %>%
      filter(genome != "genome") %>% 
      mutate(tool = tool,
             metric = metric,
             genome = str_replace(genome, "\\.fa$", ""),
             genome = str_replace_all(genome, "[^A-Za-z0-9]", ""))
    
    # Handle different column structures
    if (ncol(df) >= 3) {
      df <- df %>% 
        rename(rawP = X2, adjP = X3) %>%
        mutate(rawP = as.numeric(rawP), adjP = as.numeric(adjP))
    } else {
      df <- df %>% mutate(rawP = NA_real_, adjP = NA_real_)
    }
    
    return(df)
  }
  
  result <- map_dfr(da_files, read_da_file) %>%
    distinct(tool, genome, .keep_all = TRUE) %>%
    group_by(tool, metric) %>%
    mutate(adjP = p.adjust(rawP, method = "BH")) %>% # recompute
    ungroup() %>%
    mutate(adjP = replace_na(adjP, 1))
  
  return(result)
}

all_metrics_results <- map_dfr(metrics, load_metric_results)


master_df <- all_metrics_results %>%
  left_join(truth_table, by = 'genome') %>%
  complete(genome, tool, metric, fill = list(adjP = 1)) %>%
  mutate(score = -log10(adjP),
         is_DA = replace_na(is_DA, FALSE))

cat("Total combinations:", nrow(master_df), "\n")
## Total combinations: 1152
# Show available combinations
combinations <- master_df %>% 
  count(metric, tool) %>% 
  arrange(metric, tool)

DT::datatable(combinations, 
              caption = "Available Tool-Metric Combinations",
              options = list(pageLength = 5, scrollX = TRUE))
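
As a sanity check on the assembled master table (using the objects defined above), the row count should factor into metrics × tools × genomes:

# Expect 6 metrics x 8 tools = 48 combinations, and 48 x 24 genomes = 1152 rows
n_combos  <- master_df %>% distinct(tool, metric) %>% nrow()
n_genomes <- n_distinct(master_df$genome)
c(combinations = n_combos, genomes = n_genomes,
  expected_rows = n_combos * n_genomes, actual_rows = nrow(master_df))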

3. Performance Analysis Across Metrics

Assesses tool–metric combinations using classification metrics such as F1-score and AUC-PR.

performance_metrics <- master_df %>%
  group_by(tool, metric) %>%
  summarise(
    n_genomes = n(),
    n_da_genomes = sum(is_DA, na.rm = TRUE),
    TP = sum(adjP < 0.05 & is_DA, na.rm = TRUE),
    FP = sum(adjP < 0.05 & !is_DA, na.rm = TRUE),
    TN = sum(adjP >= 0.05 & !is_DA, na.rm = TRUE),
    FN = sum(adjP >= 0.05 & is_DA, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  mutate(
    Precision = ifelse(TP + FP > 0, TP / (TP + FP), 0),
    Recall = ifelse(TP + FN > 0, TP / (TP + FN), 0),
    F1_Score = ifelse(Precision + Recall > 0, 2 * Precision * Recall / (Precision + Recall), 0),
    Specificity = ifelse(TN + FP > 0, TN / (TN + FP), 0),
    NPV = ifelse(TN + FN > 0, TN / (TN + FN), 0)
  ) %>%
  mutate(across(where(is.numeric), ~round(., 3)))
calculate_auc_metrics <- function(df, show_progress = interactive()) {
  unique_combinations <- df %>% distinct(tool, metric)
  
  auc_results <- data.frame(
    tool = character(),
    metric = character(),
    auc_roc = numeric(),
    auc_pr = numeric(),
    stringsAsFactors = FALSE
  )
  
  if (show_progress) {
    pb <- txtProgressBar(min = 0, max = nrow(unique_combinations), style = 3)
  }
  
  for(i in 1:nrow(unique_combinations)) {
    current_tool <- unique_combinations$tool[i]
    current_metric <- unique_combinations$metric[i]
    
    tool_data <- df %>% 
      filter(tool == current_tool, metric == current_metric, !is.na(score), !is.na(is_DA))
    
    auc_roc_val <- NA_real_
    auc_pr_val <- NA_real_
    
    if (nrow(tool_data) > 5 && sum(tool_data$is_DA) > 0 && sum(!tool_data$is_DA) > 0) {
      # Calculate AUC-ROC
      tryCatch({
        roc_obj <- pROC::roc(tool_data$is_DA, tool_data$score, quiet = TRUE)
        auc_roc_val <- as.numeric(pROC::auc(roc_obj))
      }, error = function(e) {
      })
      
      # Calculate AUC-PR
      da_scores <- tool_data$score[tool_data$is_DA == TRUE]
      non_da_scores <- tool_data$score[tool_data$is_DA == FALSE]
      da_scores <- da_scores[!is.na(da_scores)]
      non_da_scores <- non_da_scores[!is.na(non_da_scores)]
      
      if (length(da_scores) > 0 && length(non_da_scores) > 0) {
        tryCatch({
          pr_obj <- PRROC::pr.curve(scores.class0 = da_scores, 
                                   scores.class1 = non_da_scores, 
                                   curve = FALSE)
          auc_pr_val <- pr_obj$auc.integral
        }, error = function(e) {
          # Silent error handling
        })
      }
    }
    
    auc_results <- rbind(auc_results, data.frame(
      tool = current_tool,
      metric = current_metric,
      auc_roc = auc_roc_val,
      auc_pr = auc_pr_val,
      stringsAsFactors = FALSE
    ))
    
    if (show_progress) setTxtProgressBar(pb, i)
  }
  
  if (show_progress) close(pb)
  return(auc_results)
}

auc_metrics <- calculate_auc_metrics(master_df, show_progress = FALSE) %>%
  mutate(across(where(is.numeric), ~round(., 3)))

# Combine all performance metrics
complete_performance <- performance_metrics %>%
  left_join(auc_metrics, by = c("tool", "metric"))


# Display comprehensive performance table
DT::datatable(complete_performance %>% 
                select(tool, metric, F1_Score, auc_pr, Precision, Recall, auc_roc) %>%
                arrange(metric, desc(F1_Score)), 
              caption = "Complete Performance Matrix",
              options = list(pageLength = 15, scrollX = TRUE)) %>%
  formatRound(columns = c("F1_Score", "auc_pr", "Precision", "Recall", "auc_roc"), digits = 3)

4. Metric Performance Summary

Summarises aggregate performance statistics to rank abundance metrics across all tools.

# Metric performance summary
metric_summary <- complete_performance %>%
  group_by(metric) %>%
  summarise(
    n_tools = n(),
    mean_f1 = mean(F1_Score, na.rm = TRUE),
    max_f1 = max(F1_Score, na.rm = TRUE),
    mean_auc_pr = mean(auc_pr, na.rm = TRUE),
    max_auc_pr = max(auc_pr, na.rm = TRUE),
    mean_precision = mean(Precision, na.rm = TRUE),
    mean_recall = mean(Recall, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  arrange(desc(mean_f1)) %>%
  mutate(across(where(is.numeric), ~round(., 3)))

# Create summary table
metric_summary %>%
  kable(caption = "Metric Performance Summary (Ranked by Mean F1-Score)", 
        col.names = c("Metric", "N Tools", "Mean F1", "Max F1", "Mean AUC-PR", 
                     "Max AUC-PR", "Mean Precision", "Mean Recall")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed")) %>%
  row_spec(1:3, bold = TRUE, color = "white", background = "#2E86AB")
Metric Performance Summary (Ranked by Mean F1-Score)
Metric N Tools Mean F1 Max F1 Mean AUC-PR Max AUC-PR Mean Precision Mean Recall
counts 8 0.739 0.788 0.643 0.747 0.593 0.982
tpm 8 0.733 0.737 0.636 0.783 0.581 0.991
breadth 8 0.728 0.737 0.560 0.698 0.579 0.982
coverage 8 0.673 0.765 0.647 0.818 0.580 0.830
rpkm 8 0.664 0.722 0.649 0.833 0.577 0.812
relative 8 0.643 0.800 0.674 0.846 0.595 0.750

Observation: Counts delivers the highest mean F1-score (0.739) and, together with TPM and breadth, maintains near-perfect recall (> 0.98). Normalized metrics, especially relative abundance and coverage, achieve the best mean and maximum AUC-PR values, which means they rank true positives more effectively even though their overall F1 lags behind. Thus, raw counts still perform better on balanced accuracy, but some length/depth-normalized metrics provide stronger ranking power for differential calls.

ANCOM Performance Across Metrics

# Filter performance for ANCOM only
ancom_performance <- complete_performance %>%
  filter(tool == "ANCOM") %>%
  select(metric, F1_Score, auc_pr, Precision, Recall) %>%
  arrange(desc(F1_Score))

# Display table with existing aesthetics
ancom_performance %>%
  kable(caption = "ANCOM Performance Across Abundance Metrics", digits = 3) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
ANCOM Performance Across Abundance Metrics
metric F1_Score auc_pr Precision Recall
counts 0.737 0.588 0.583 1.000
tpm 0.737 0.620 0.583 1.000
breadth 0.703 0.465 0.565 0.929
coverage 0.703 0.585 0.565 0.929
relative 0.686 0.846 0.571 0.857
rpkm 0.686 0.760 0.571 0.857

Observation: For ANCOM, counts and TPM tie on F1, yet normalized metrics (relative, RPKM) produce the highest AUC-PR. Taken together, normalization does not degrade ANCOM's performance and often improves the quality of ranked DA calls, suggesting that normalized inputs can be as good as raw counts for MAG-level differential abundance.

Counts vs Coverage: Performance Across All Methods

# Filter for counts and coverage metrics
counts_cov_perf <- complete_performance %>%
  filter(metric %in% c("counts", "coverage")) %>%
  select(tool, metric, F1_Score, auc_pr, Precision, Recall) %>%
  arrange(tool, metric)

# Display in wide format for easier comparison
counts_cov_wide <- counts_cov_perf %>%
  pivot_wider(names_from = metric, values_from = c(F1_Score, auc_pr, Precision, Recall)) %>%
  arrange(desc(F1_Score_coverage))

counts_cov_wide %>%
  kable(caption = "Counts vs Coverage Performance (All Tools)", digits = 3) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Counts vs Coverage Performance (All Tools)
tool F1_Score_counts F1_Score_coverage auc_pr_counts auc_pr_coverage Precision_counts Precision_coverage Recall_counts Recall_coverage
metagenomeSeq 0.703 0.765 0.616 0.627 0.565 0.650 0.929 0.929
limma 0.737 0.737 0.745 0.694 0.583 0.583 1.000 1.000
dearseq 0.737 0.706 0.548 0.577 0.583 0.545 1.000 1.000
ANCOM 0.737 0.703 0.588 0.585 0.583 0.565 1.000 0.929
NOISeq 0.737 0.703 0.615 0.539 0.583 0.565 1.000 0.929
ALDEx2 0.737 0.593 0.639 0.722 0.583 0.615 1.000 0.571
edgeR 0.737 0.593 0.747 0.818 0.583 0.615 1.000 0.571
corncob 0.788 0.588 0.649 0.614 0.684 0.500 0.929 0.714
# Long format for plotting
counts_cov_perf %>%
  ggplot(aes(x = tool, y = F1_Score, fill = metric)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  scale_fill_manual(values = metric_colors[c("counts", "coverage")]) +
  labs(title = "F1-Score: Counts vs Coverage Across Tools",
       x = "DA Tool", y = "F1-Score", fill = "Metric") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Best Performing Methods for Counts and Coverage

# Determine best tool for counts and coverage based on F1_Score
best_counts_cov <- complete_performance %>%
  filter(metric %in% c("counts", "coverage")) %>%
  group_by(metric) %>%
  slice_max(order_by = F1_Score, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(metric, tool, F1_Score, auc_pr, Precision, Recall) %>%
  arrange(desc(F1_Score))

best_counts_cov %>%
  kable(caption = "Best Performing Method for Counts and Coverage", digits = 3) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Best Performing Method for Counts and Coverage
metric tool F1_Score auc_pr Precision Recall
counts corncob 0.788 0.649 0.684 0.929
coverage metagenomeSeq 0.765 0.627 0.650 0.929
# Metric comparison box plots
# Define consistent metric order
metric_order <- c("counts", "tpm", "breadth", "coverage", "rpkm", "relative")

p1 <- ggplot(complete_performance, aes(x = factor(metric, levels = metric_order), 
                                       y = F1_Score, fill = metric)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.6, size = 2) +
  scale_fill_manual(values = metric_colors) +
  scale_y_continuous(limits = c(0, 1)) +
  labs(title = "A) F1-Score Distribution by Metric",
       x = "Abundance Metric", y = "F1-Score") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "none") +
  coord_flip()

p2 <- ggplot(complete_performance %>% filter(!is.na(auc_pr)), 
             aes(x = factor(metric, levels = metric_order), y = auc_pr, fill = metric)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.6, size = 2) +
  scale_fill_manual(values = metric_colors) +
  scale_y_continuous(limits = c(0, 1)) +
  labs(title = "B) AUC-PR Distribution by Metric",
       x = "Abundance Metric", y = "AUC-PR") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "none") +
  coord_flip()

p3 <- ggplot(complete_performance, aes(x = factor(metric, levels = metric_order), 
                                       y = Precision, fill = metric)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.6, size = 2) +
  scale_fill_manual(values = metric_colors) +
  scale_y_continuous(limits = c(0, 1)) +
  labs(title = "C) Precision Distribution by Metric",
       x = "Abundance Metric", y = "Precision") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "none") +
  coord_flip()

p4 <- ggplot(complete_performance, aes(x = factor(metric, levels = metric_order), 
                                       y = Recall, fill = metric)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.6, size = 2) +
  scale_fill_manual(values = metric_colors) +
  scale_y_continuous(limits = c(0, 1)) +
  labs(title = "D) Recall Distribution by Metric",
       x = "Abundance Metric", y = "Recall") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "none") +
  coord_flip()

gridExtra::grid.arrange(p1, p2, p3, p4, ncol = 2, 
                       top = "Performance Metrics Distribution Across Abundance Metrics")

Observation: Counts give the best balanced accuracy (highest F1, near-perfect recall), while normalized metrics, especially relative abundance, have better ranking power (top AUC-PR) at the cost of recall. Normalized metrics such as relative abundance can therefore boost ranking quality, but raw counts remain the safest choice for balanced hit rates.


5. Tool Consistency Analysis

Quantifies how consistently each tool performs across metrics and ranks their stability.

# Tool consistency analysis
tool_rankings <- complete_performance %>%
  group_by(metric) %>%
  arrange(desc(F1_Score)) %>%
  mutate(f1_rank = row_number()) %>%
  arrange(desc(auc_pr)) %>%
  mutate(auc_pr_rank = row_number()) %>%
  ungroup() %>%
  mutate(combined_rank = (f1_rank + auc_pr_rank) / 2) %>%
  arrange(metric, combined_rank)

tool_consistency <- tool_rankings %>%
  group_by(tool) %>%
  summarise(
    metrics_tested = n(),
    mean_f1 = mean(F1_Score, na.rm = TRUE),
    sd_f1 = sd(F1_Score, na.rm = TRUE),
    mean_auc_pr = mean(auc_pr, na.rm = TRUE),
    sd_auc_pr = sd(auc_pr, na.rm = TRUE),
    mean_rank = mean(combined_rank, na.rm = TRUE),
    rank_consistency = sd(combined_rank, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  arrange(mean_rank) %>%
  mutate(across(where(is.numeric), ~round(., 3)))

# Display tool consistency table
tool_consistency %>%
  kable(caption = "Tool Consistency Analysis (Ranked by Mean Performance Rank)", 
        col.names = c("Tool", "Metrics Tested", "Mean F1", "SD F1", "Mean AUC-PR", 
                     "SD AUC-PR", "Mean Rank", "Rank Consistency")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed")) %>%
  row_spec(1:3, bold = TRUE, color = "white", background = "#2E86AB")
Tool Consistency Analysis (Ranked by Mean Performance Rank)
Tool Metrics Tested Mean F1 SD F1 Mean AUC-PR SD AUC-PR Mean Rank Rank Consistency
ALDEx2 6 0.645 0.109 0.706 0.047 3.250 1.541
edgeR 6 0.667 0.078 0.731 0.131 4.167 0.876
metagenomeSeq 6 0.735 0.040 0.643 0.065 4.250 1.969
limma 6 0.720 0.029 0.651 0.106 4.417 0.970
NOISeq 6 0.732 0.018 0.553 0.055 4.500 1.517
ANCOM 6 0.709 0.023 0.644 0.137 4.583 1.715
dearseq 6 0.727 0.016 0.549 0.041 5.167 0.983
corncob 6 0.641 0.126 0.600 0.066 5.667 1.889
# Best performing tools per metric
best_tools_per_metric <- tool_rankings %>%
  group_by(metric) %>%
  slice_min(combined_rank, n = 3) %>%
  select(metric, tool, F1_Score, auc_pr, combined_rank) %>%
  arrange(metric, combined_rank)

# Create interactive table
DT::datatable(best_tools_per_metric, 
              caption = "Top 3 Tools per Metric (by Combined F1 + AUC-PR Ranking)",
              options = list(pageLength = 20, scrollX = TRUE)) %>%
  formatRound(columns = c("F1_Score", "auc_pr", "combined_rank"), digits = 3)

6. Cross-Metric Performance Heatmaps

# F1-Score heatmap
f1_heatmap_data <- complete_performance %>%
  select(tool, metric, F1_Score) %>%
  pivot_wider(names_from = metric, values_from = F1_Score) %>%
  column_to_rownames("tool") %>%
  as.matrix()

# Replace NAs with 0 for visualization
f1_heatmap_data[is.na(f1_heatmap_data)] <- 0

# Create F1-Score heatmap
par(mfrow = c(1, 1))
corrplot(f1_heatmap_data, 
         method = "color", 
         is.corr = FALSE,
         col = colorRampPalette(c("#053061", "#FFFFFF", "#67001F"))(200),
         tl.col = "black", 
         tl.srt = 45,
         addCoef.col = "black",
         number.cex = 0.7,
         title = "F1-Score Heatmap: Tools vs Metrics",
         mar = c(0,0,2,0),
         cl.cex = 0.8)

# AUC-PR heatmap
auc_pr_heatmap_data <- complete_performance %>%
  select(tool, metric, auc_pr) %>%
  filter(!is.na(auc_pr)) %>%
  pivot_wider(names_from = metric, values_from = auc_pr) %>%
  column_to_rownames("tool") %>%
  as.matrix()

# Fill missing values with 0
auc_pr_complete <- matrix(0, nrow = nrow(f1_heatmap_data), ncol = ncol(f1_heatmap_data))
rownames(auc_pr_complete) <- rownames(f1_heatmap_data)
colnames(auc_pr_complete) <- colnames(f1_heatmap_data)

# Fill in available values
for(i in rownames(auc_pr_heatmap_data)) {
  for(j in colnames(auc_pr_heatmap_data)) {
    if(i %in% rownames(auc_pr_complete) & j %in% colnames(auc_pr_complete)) {
      auc_pr_complete[i, j] <- auc_pr_heatmap_data[i, j]
    }
  }
}

corrplot(auc_pr_complete, 
         method = "color", 
         is.corr = FALSE,
         col = colorRampPalette(c("#053061", "#FFFFFF", "#67001F"))(200),
         tl.col = "black", 
         tl.srt = 45,
         addCoef.col = "black",
         number.cex = 0.7,
         title = "AUC-PR Heatmap: Tools vs Metrics",
         mar = c(0,0,2,0),
         cl.cex = 0.8)


8. Key Findings and Statistical Analysis

8.1 Normalization Impact

# Detailed normalization impact analysis
normalization_analysis <- metric_summary %>%
  mutate(
    category = case_when(
      metric == "counts" ~ "Raw Counts",
      metric %in% c("tpm", "rpkm", "coverage") ~ "Length/Depth Normalized",
      metric %in% c("relative", "breadth") ~ "Other Metrics"
    )
  ) %>%
  group_by(category) %>%
  summarise(
    n_metrics = n(),
    mean_f1 = mean(mean_f1),
    mean_auc_pr = mean(mean_auc_pr, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  mutate(across(where(is.numeric), ~round(., 3)))

normalization_analysis %>%
  kable(caption = "Normalization Impact Analysis", 
        col.names = c("Category", "N Metrics", "Mean F1", "Mean AUC-PR")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed")) %>%
  row_spec(which(normalization_analysis$category == "Length/Depth Normalized"), 
           bold = TRUE, color = "white", background = "#2E86AB")
Normalization Impact Analysis
Category N Metrics Mean F1 Mean AUC-PR
Length/Depth Normalized 3 0.690 0.644
Other Metrics 2 0.685 0.617
Raw Counts 1 0.739 0.643
# Calculate improvement
counts_f1 <- normalization_analysis$mean_f1[normalization_analysis$category == "Raw Counts"]
normalized_f1 <- normalization_analysis$mean_f1[normalization_analysis$category == "Length/Depth Normalized"]
improvement <- ((normalized_f1 - counts_f1) / counts_f1) * 100

cat("\n**Key Finding**: Length/depth normalization provides a", 
    round(improvement, 1), "% improvement in F1-Score over raw counts\n")
## 
## **Key Finding**: Length/depth normalization provides a -6.6 % improvement in F1-Score over raw counts

8.2 Champion Analysis

# Identify champions for each metric
champions <- best_tools_per_metric %>% 
  group_by(metric) %>% 
  slice_head(n = 1) %>%
  ungroup() %>%
  arrange(desc(F1_Score))

champions %>%
  kable(caption = "Champion Tools for Each Metric", 
        col.names = c("Metric", "Champion Tool", "F1-Score", "AUC-PR", "Combined Rank")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed")) %>%
  row_spec(1:3, bold = TRUE, color = "white", background = "#2E86AB")
Champion Tools for Each Metric
Metric Champion Tool F1-Score AUC-PR Combined Rank
relative metagenomeSeq 0.800 0.750 1.5
counts corncob 0.788 0.649 2.0
breadth ALDEx2 0.737 0.698 1.0
coverage limma 0.737 0.694 2.5
tpm ALDEx2 0.737 0.670 2.0
rpkm NOISeq 0.722 0.573 3.0

8.3 Statistical Significance Testing

# Perform statistical tests for metric differences
library(broom)

# Test if normalized metrics significantly outperform raw counts
performance_long <- complete_performance %>%
  mutate(
    normalization = case_when(
      metric == "counts" ~ "Raw",
      metric %in% c("tpm", "rpkm", "coverage") ~ "Normalized",
      TRUE ~ "Other"
    )
  ) %>%
  filter(normalization %in% c("Raw", "Normalized"))

# t-test for F1-Score differences
f1_test <- t.test(F1_Score ~ normalization, data = performance_long)
auc_test <- t.test(auc_pr ~ normalization, data = performance_long)  # NAs dropped via na.action; t.test has no na.rm argument

# Helper to return significance notation
sig_code <- function(p) {
  if (p < 0.001) return("***")
  else if (p < 0.01) return("**")
  else if (p < 0.05) return("*")
  else return("ns")
}

# Extract means by explicit group name to avoid alphabetical ordering issues
raw_mean_f1        <- f1_test$estimate[["mean in group Raw"]]
normalized_mean_f1 <- f1_test$estimate[["mean in group Normalized"]]

raw_mean_auc        <- auc_test$estimate[["mean in group Raw"]]
normalized_mean_auc <- auc_test$estimate[["mean in group Normalized"]]

cat("**Statistical Tests**\n\n")
## **Statistical Tests**
cat("F1-Score: Raw vs Normalized\n")
## F1-Score: Raw vs Normalized
cat("p-value:", format(f1_test$p.value, scientific = TRUE), sig_code(f1_test$p.value), "\n")
## p-value: 2.219172e-03 **
cat("Raw mean:", round(raw_mean_f1, 3), "\n")
## Raw mean: 0.739
cat("Normalized mean:", round(normalized_mean_f1, 3), "\n\n")
## Normalized mean: 0.69
cat("AUC-PR: Raw vs Normalized\n")
## AUC-PR: Raw vs Normalized
cat("p-value:", format(auc_test$p.value, scientific = TRUE), sig_code(auc_test$p.value), "\n")
## p-value: 9.898109e-01 ns
cat("Raw mean:", round(raw_mean_auc, 3), "\n")
## Raw mean: 0.643
cat("Normalized mean:", round(normalized_mean_auc, 3), "\n")
## Normalized mean: 0.644

Observation: Raw counts have a higher mean F1 (0.739) than normalized metrics (0.690), and the difference is statistically significant (p ≈ 2 × 10⁻³, **). For AUC-PR the means are essentially identical (0.643 vs 0.644) with no significant difference (p ≈ 0.99, ns).

9. Discussion and Implications

9.1 Answer to ANCOM-BC Issue #196

Question: Can normalized abundance data (TPM, RPKM) be effectively used with differential abundance tools designed for raw counts?

Answer: It depends on the performance goal. Normalized abundance metrics (TPM, RPKM, coverage) work with all DA tools tested and match raw counts on overall ranking power (AUC-PR), yet raw counts still give the highest balanced accuracy (F1) and recall. Normalization is therefore a valid choice, and advantageous when ranking quality is the priority, but it does not universally outperform raw counts, which remain the safest option when maximising true-positive recovery is the goal.

9.2 Key Insights

Best Performing Combinations

# Identify the top 3 performing tool-metric combinations by F1-Score
best_combos <- complete_performance %>%
  filter(!is.na(F1_Score), !is.na(auc_pr)) %>%
  arrange(desc(F1_Score), desc(auc_pr)) %>%
  slice_head(n = 3) %>%
  mutate(label = sprintf("- **%s + %s**: F1 = %.3f, AUC-PR = %.3f", tool, metric, F1_Score, auc_pr))

cat(paste(best_combos$label, collapse = "\n"))
## - **metagenomeSeq + relative**: F1 = 0.800, AUC-PR = 0.750
## - **corncob + counts**: F1 = 0.788, AUC-PR = 0.649
## - **metagenomeSeq + coverage**: F1 = 0.765, AUC-PR = 0.627

10. Conclusions and Future Directions

Summarises overarching conclusions and suggests avenues for future research.

Main Conclusions

# Generate final summary statistics
final_stats <- list(
  total_combinations = nrow(complete_performance),
  best_metric = metric_summary$metric[1],
  best_tool = tool_consistency$tool[1],
  improvement = round(improvement, 1),
  significance = ifelse(f1_test$p.value < 0.05, "Significant", "Not significant")
)

cat("**FINAL CONCLUSIONS**\n\n")
## **FINAL CONCLUSIONS**
cat("1. **", final_stats$total_combinations, " tool-metric combinations evaluated**\n")
## 1. ** 48  tool-metric combinations evaluated**
cat("2. **Best abundance metric**: ", final_stats$best_metric, "\n")
## 2. **Best abundance metric**:  counts
cat("3. **Most consistent tool**: ", final_stats$best_tool, "\n")
## 3. **Most consistent tool**:  ALDEx2
cat("4. **Length/Depth Normalization effect**: ", final_stats$improvement, "% degradation (", final_stats$significance, ")\n")
## 4. **Length/Depth Normalization effect**:  -6.6 % degradation ( Significant )
cat("5. **ANCOM-BC Issue #196**: **Length-depth normalization provides no significant advantages over raw count metric** -")
## 5. **ANCOM-BC Issue #196**: **Length-depth normalization provides no significant advantages over raw count metric** -

11. Session Information

sessionInfo()
## R version 4.3.3 (2024-02-29)
## Platform: x86_64-conda-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.1 LTS
## 
## Matrix products: default
## BLAS/LAPACK: /home/mrahman/miniconda3/envs/r_env/lib/libopenblasp-r0.3.27.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Berlin
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] broom_1.0.8        viridis_0.6.5      viridisLite_0.4.2  RColorBrewer_1.1-3
##  [5] plotly_4.11.0      DT_0.33            kableExtra_1.4.0   knitr_1.50        
##  [9] corrplot_0.95      PRROC_1.4          rlang_1.1.6        pROC_1.18.5       
## [13] gridExtra_2.3      lubridate_1.9.4    forcats_1.0.0      stringr_1.5.1     
## [17] dplyr_1.1.4        purrr_1.0.4        readr_2.1.5        tidyr_1.3.1       
## [21] tibble_3.3.0       ggplot2_3.5.2      tidyverse_2.0.0   
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6      xfun_0.52         bslib_0.9.0       htmlwidgets_1.6.4
##  [5] tzdb_0.5.0        vctrs_0.6.5       tools_4.3.3       crosstalk_1.2.1  
##  [9] generics_0.1.4    parallel_4.3.3    pkgconfig_2.0.3   data.table_1.17.6
## [13] lifecycle_1.0.4   compiler_4.3.3    farver_2.1.2      textshaping_1.0.1
## [17] htmltools_0.5.8.1 sass_0.4.10       yaml_2.3.10       lazyeval_0.2.2   
## [21] pillar_1.10.2     crayon_1.5.3      jquerylib_0.1.4   cachem_1.1.0     
## [25] tidyselect_1.2.1  digest_0.6.37     stringi_1.8.7     labeling_0.4.3   
## [29] fastmap_1.2.0     grid_4.3.3        cli_3.6.5         magrittr_2.0.3   
## [33] dichromat_2.0-0.1 withr_3.0.2       backports_1.5.0   scales_1.4.0     
## [37] bit64_4.6.0-1     timechange_0.3.0  rmarkdown_2.29    httr_1.4.7       
## [41] bit_4.6.0         hms_1.1.3         evaluate_1.0.4    Rcpp_1.0.14      
## [45] glue_1.8.0        xml2_1.3.8        svglite_2.2.1     rstudioapi_0.17.1
## [49] vroom_1.6.5       jsonlite_2.0.0    R6_2.6.1          plyr_1.8.9       
## [53] systemfonts_1.2.3

12. Data and Code

Lists locations of data, scripts, and pipeline resources for transparent reuse.

  • MAGician Snakemake pipeline: core5:/home/mrahman/magician
  • MAGician outputs: core5:/local/mrahman/magician/result/
  • Sample-distribution design script: core5:/home/mrahman/magician/make_sample_distribution.py
  • Benchdamic execution script: core5:/home/mrahman/magician/workflow/scripts/run_benchdamic_ALL.R
  • This R Markdown file: core5:/home/mrahman/da_analysis/Rmds/complete_multi_metric_analysis_camisim_realistic_reps.Rmd