Frequentist vs Bayesian Calibration Comparison

Overview

The curveR ecosystem provides two complementary approaches to immunoassay standard curve fitting:

curveRfreq: Per-plate NLS with AIC model selection
curveRbayes: Hierarchical Bayesian with LOO-CV model selection

Both packages depend on curveRcore for model math, ensuring identical parameterisation. This vignette demonstrates fitting the same three alpha plates (curve_id = 1, 2, 3) from bead_assay_example with both approaches, then comparing the results.

Setup

library(curveRcore)
library(curveRfreq)
library(curveRbayes)

data("bead_assay_example", package = "curveRcore")

Shared Settings

Both methods use the same preprocessing pipeline, constraints, and study parameters. This ensures any differences in the output come from the estimation method, not the data preparation.

alpha_cids <- c(1, 2, 3)

ac <- new_antigen_constraints("alpha", std_curve_conc = 10000,
                               pcov_threshold = 20)
sp <- new_study_params(is_log_response = TRUE, is_log_independent = TRUE,
                        apply_prozone = TRUE, blank_option = "ignored")
fo <- new_fit_options(model_names = c("logistic4", "gompertz4"),
                       n_grid = 200, cv_x_max = 150)

Filter to Alpha Plates

alpha_data <- bead_assay_example
alpha_data$standards <- bead_assay_example$standards[
  bead_assay_example$standards$curve_id %in% alpha_cids, ]
alpha_data$blanks <- bead_assay_example$blanks[
  bead_assay_example$blanks$curve_id %in% alpha_cids, ]
alpha_data$samples <- bead_assay_example$samples[
  bead_assay_example$samples$curve_id %in% alpha_cids, ]
alpha_data$curve_id_lookup <- bead_assay_example$curve_id_lookup[
  bead_assay_example$curve_id_lookup$curve_id %in% alpha_cids, ]

cat("Plates:", paste(alpha_cids, collapse = ", "), "\n")
cat("Standards:", nrow(alpha_data$standards), "rows\n")
cat("Samples:", nrow(alpha_data$samples), "rows\n")

Frequentist Fit (All 3 Plates)

curveRfreq fits each plate independently via multi-start Levenberg-Marquardt NLS, then selects the best model by AIC.

freq_mp <- fit_calibration_freq_multiplate(
  data                = alpha_data,
  antigen_constraints = ac,
  study_params        = sp,
  fit_options         = fo,
  verbose             = TRUE
)

freq_summary <- summary_table(freq_mp)
print(freq_summary[, c("plate_label", "best_model", "aic", "a", "b", "c", "d")])

Extract the plate 1 result for head-to-head comparison:

freq_plate1 <- freq_mp$plates[[1]]
freq_plate1

Bayesian Fit (All 3 Plates, Hierarchical)

curveRbayes fits all three plates simultaneously. The hierarchical prior shares information across plates: each plate’s parameters are modelled as random deviations from population-level means.

bayes_result <- fit_calibration_bayes(
  data                = alpha_data,
  antigen_constraints = ac,
  study_params        = sp,
  fit_options         = fo,
  model_families      = c("logistic4", "gompertz4"),
  chains              = 4,
  warmup              = 1000,
  sampling            = 1000,
  adapt_delta         = 0.9,
  seed                = 42,
  n_draws_predict     = 500,
  verbose             = TRUE
)

bayes_result

Per-plate posterior summaries:

bayes_tbl <- summary_table_bayes(bayes_result)
print(bayes_tbl)

Grid Comparison (Plate 1)

compare_calibrations() merges the 200-point prediction grids from both methods, prefixing columns with the method name:

grid_comp <- compare_calibrations(freq_plate1, bayes_result)

head(grid_comp[, c("log10_concentration",
                    "frequentist_predicted_response",
                    "bayesian_predicted_response")])

par(mfrow = c(1, 2))

# Predicted response
plot(grid_comp$frequentist_predicted_response,
     grid_comp$bayesian_predicted_response,
     xlab = "Frequentist", ylab = "Bayesian",
     main = "Predicted Response", pch = 16, cex = 0.6)
abline(0, 1, col = "red", lty = 2)

# pcov (precision of back-calculated concentration)
plot(grid_comp$frequentist_pcov,
     grid_comp$bayesian_pcov,
     xlab = "Frequentist pcov", ylab = "Bayesian pcov",
     main = "Precision of Concentration (pcov)", pch = 16, cex = 0.6)
abline(0, 1, col = "red", lty = 2)

Points below the red diagonal in the pcov plot indicate regions where the Bayesian hierarchical pooling delivers tighter concentration estimates than per-plate NLS.

Parameter Comparison (Plate 1)

compare_parameters() produces a side-by-side table of the best-fit parameter estimates. Frequentist values are NLS point estimates with Wald standard errors; Bayesian values are posterior means with posterior SDs.

param_comp <- compare_parameters(freq_plate1, bayes_result)
print(param_comp[, c("term",
                      "frequentist_estimate", "frequentist_se",
                      "bayesian_estimate", "bayesian_se",
                      "diff", "rel_diff")])

For well-identified parameters (b, c, d), the two methods typically agree within the SE. The lower asymptote (a) may differ because the Bayesian model’s hierarchical prior and blank-anchoring provide regularisation that NLS lacks.

Sample Comparison

compare_samples() merges the per-sample back-calculated concentrations from both methods:

sample_comp <- compare_samples(freq_plate1, bayes_result,
                                by = c("sampleid", "curve_id"))

if (!is.null(sample_comp)) {
  head(sample_comp[, c("sampleid",
                        "frequentist_predicted_log10_concentration",
                        "bayesian_predicted_log10_concentration",
                        "conc_diff")])
}

if (!is.null(sample_comp)) {
  plot(sample_comp$frequentist_predicted_log10_concentration,
       sample_comp$bayesian_predicted_log10_concentration,
       xlab = "Frequentist log10(conc)", ylab = "Bayesian log10(conc)",
       main = "Sample Back-Calculation", pch = 16)
  abline(0, 1, col = "red", lty = 2)
}

Agreement Metrics

agreement_metrics() computes bias, MAE, RMSE, Pearson correlation, and Lin’s concordance correlation coefficient (CCC). CCC is the key metric: it penalises both lack of correlation AND systematic bias.

# Grid-level response agreement
resp_metrics <- agreement_metrics(
  grid_comp$frequentist_predicted_response,
  grid_comp$bayesian_predicted_response
)
cat(sprintf("Response: bias=%.4f, RMSE=%.4f, CCC=%.4f\n",
            resp_metrics$bias, resp_metrics$rmse, resp_metrics$ccc))

# Sample-level concentration agreement
if (!is.null(sample_comp)) {
  conc_metrics <- agreement_metrics(
    sample_comp$frequentist_predicted_log10_concentration,
    sample_comp$bayesian_predicted_log10_concentration
  )
  cat(sprintf("Concentration: bias=%.4f, RMSE=%.4f, CCC=%.4f\n",
              conc_metrics$bias, conc_metrics$rmse, conc_metrics$ccc))
}

When Do They Disagree?

The two methods tend to diverge in two regimes:

Near the asymptotes – the Bayesian hierarchical prior on a and d stabilises estimates where NLS has wide confidence intervals and the delta-method pcov explodes.
Low-signal plates – hierarchical pooling borrows strength from other plates, while NLS fits each plate independently and may fail to converge or produce unstable estimates.

The pcov column captures this directly: grid points where the Bayesian pcov is lower than the frequentist pcov are regions where hierarchical pooling helped.

pcov_diff <- grid_comp$frequentist_pcov - grid_comp$bayesian_pcov
plot(grid_comp$log10_concentration, pcov_diff,
     type = "l", lwd = 2,
     xlab = "log10(concentration)", ylab = "Freq pcov - Bayes pcov",
     main = "Bayesian Precision Advantage (positive = Bayes better)")
abline(h = 0, col = "gray50", lty = 2)

Summary

The curveR ecosystem makes cross-method comparison straightforward because both packages share the same model math, preprocessing, and output class. The typical workflow is:

Define shared settings (new_antigen_constraints, new_study_params, new_fit_options)
Fit with fit_calibration_freq() and fit_calibration_bayes()
Compare with compare_calibrations(), compare_parameters(), compare_samples(), and agreement_metrics()