STRAT-NUDGE cluster randomized trial

Author

A.Amstutz

STRAT-NUDGE cluster randomized trial (CRT)

Nudging intervention at the level of SHCS physicians at SHCS sites in Switzerland to promote the use of a novel TB risk score for individual risk-tailored TB screening (i.e. IGRA testing):

  • Control: Enhanced standard of care: Electronic health record (EHR)-based nudge consisting of a physician-targeted reminder for IGRA testing for all (no risk score)

  • Intervention: EHR-based nudge consisting of a physician-targeted reminder for IGRA testing guided by the risk score (individually risk-tailored: no testing for low-risk, testing for high-risk)

Parameters and design considerations

  • Two-arm parallel-group, multi-center, superiority, cluster-randomized trial

  • Clusters are SHCS physicians, with their SHCS cohort participants, from all major sites (7 sites across Switzerland)

  • Eligible physician-clusters are active, stable SHCS physicians; eligible participants are SHCS participants with no recorded IGRA

  • Unit of data collection will be the SHCS cohort participants, but the level of inference will be the physicians, i.e. cluster-average

  • Primary outcome, on which the sample size is powered, is “Strategy-concordant IGRA testing at the first clinical encounter”, defined as:

    • Intervention group: risk-tailored IGRA testing (IGRA performed in high-risk participants and not performed in low-risk participants), and

    • Control group: untailored IGRA testing (IGRA performed in all participants).

      Participants are classified as adherent to strategy (yes/no) based on whether testing at the first encounter aligns with their group-specific strategy.

  • We expect that:

    • across the entire cohort ca. 10% high-risk and 90% low-risk participants

    • in the control group: ca. 50% will follow their nudge

    • in the intervention group: ca. 98% will follow the nudge for the high-risk group (i.e. testing)

    • in the intervention group: ca. 66% will follow the nudge for the low-risk group (i.e. not to test)

      • That results in an intervention adherence rate of 0.10 x 0.98 + 0.90 x 0.66 ≈ 0.69, i.e. ca. 70% (see the quick check after this list)
    • => delta of ca. 20 absolute percentage points

  • Cluster size (m) of eligible overall participants (6m recruitment period):

    • on average 18 eligible participants per physician-cluster over a 6m period
  • CV (coefficient of variation), ratio of standard deviation of cluster sizes to mean of cluster sizes:

    • 0.98
  • ICC for the primary outcome: 0.20 (behavioural intervention/outcome)

  • Max. ca. 90 eligible clusters, i.e. 45 clusters per arm, due to the size of the cohort

  • Min. desired power 80%, two-sided alpha of 0.05

  • 1:1 allocation
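As a quick sanity check of the expected adherence and delta above (plain arithmetic with the assumptions listed; no packages needed):

Code
# Assumptions from the list above
p_high <- 0.10       # proportion high-risk
p_low  <- 0.90       # proportion low-risk
adh_high <- 0.98     # intervention arm: adherence to "test" nudge among high-risk
adh_low  <- 0.66     # intervention arm: adherence to "do not test" nudge among low-risk
adh_control <- 0.50  # control arm: adherence to untailored "test all" nudge

adh_intervention <- p_high * adh_high + p_low * adh_low  # 0.098 + 0.594 = 0.692
delta <- adh_intervention - adh_control                  # ~0.19, i.e. ca. 20 percentage points
round(c(adherence_intervention = adh_intervention, delta = delta), 2)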

Packages

Code
req_pkgs <- c("pwr",
              "dplyr",
              "purrr",
              "ggplot2",
              "lme4",
              "geepack", # for GEE (if needed)
              "MASS", # for GLMM PQL
              "marginaleffects", # for marginal standardization
              
              "future",
              "future.apply",
              "nlme",
              
              "tibble",
              "knitr",
              "kableExtra",
              "splines"
)
install_if_missing <- function(pkgs){
  for(p in pkgs){
    if(!requireNamespace(p, quietly=TRUE)){
      install.packages(p, repos="https://cloud.r-project.org")
    }
    library(p, character.only=TRUE)
  }
}
install_if_missing(req_pkgs)

# set global RNG seed for reproducibility
set.seed(20250809)

Corresponding individual randomized trial

Sample size for the individual randomized trial on the same question

Code
# Parameters
p_C <- 0.50
p_I <- 0.70
power <- 0.80 
ICC <- 0.20
alpha <- 0.05

# Effect size, standardized as Cohen's h
h_I_C <- ES.h(p1 = p_I, p2 = p_C)
cat("Cohen's h for Intervention vs Control:", round(h_I_C, 3), "\n")
Cohen's h for Intervention vs Control: 0.412 
Code
# Sample size pair-wise comparison
ss_I_C <- pwr.2p.test(h = h_I_C, sig.level = alpha, power = power)
n_per_arm <- ceiling(ss_I_C$n)
n_total <- n_per_arm * 2

cat("Sample size per arm:", n_per_arm, "\n")
Sample size per arm: 93 
Code
cat("Total trial sample size (2-arm trial):", n_total)
Total trial sample size (2-arm trial): 186

(1) Sample size calculation CRT: formula-based

Apply the design effect (DEFF) to the individual RCT sample size. The standard DEFF formula:

DEFF = 1 + (m − 1) × ICC, where m = cluster size

However, let’s not forget the cluster size variation. The usual conservative adjustment of the DEFF for cluster size variation is (e.g. see here: https://pmc.ncbi.nlm.nih.gov/articles/PMC7394950/#sup1):

DEFF_cv = 1 + (m(1 + CV^2) − 1) × ICC, where CV is the coefficient of variation (the ratio of the standard deviation of cluster sizes to the mean cluster size)
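Plugging in the planned values shows how strongly the cluster size variation inflates the design effect (the same quantities are computed again in the chunk below):

Code
m <- 18; CV <- 0.98; ICC <- 0.20
deff    <- 1 + (m - 1) * ICC              # standard DEFF: 4.4
deff_cv <- 1 + (m * (1 + CV^2) - 1) * ICC # with cluster size variation: ~7.86
c(DEFF = deff, DEFF_cv = deff_cv)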

Code
# Parameters
p_C <- 0.50
p_I <- 0.70 
power <- 0.80 
ICC <- 0.20
alpha <- 0.05

m <- 18 # average cluster size
CV <- 0.98 # CV

deff <- 1+(m-1)*ICC # standard DEFF
deff_cv <- 1+((m*(1+CV^2))-1)*ICC # DEFF with cluster size variation

# Effect size
h_I_C <- ES.h(p1 = p_I, p2 = p_C)

# sample size for corresponding individual RCT 
ss <- pwr.2p.test(h = h_I_C, power = power, sig.level = alpha)$n

# CRT sample size
ss_crt <- ceiling(ss * deff_cv)
n_clusters <- ceiling(ss_crt / m)
cat("Cluster sample size int arm 1:", n_clusters, "\n")
Cluster sample size int arm 1: 41 
Code
cat("Individual sample size int arm 1:", ss_crt, "\n")
Individual sample size int arm 1: 729 
Code
# Total
tot_clusters <- n_clusters * 2
tot_ind <- ss_crt * 2
cat("Total cluster sample size:", tot_clusters, "\n")
Total cluster sample size: 82 
Code
cat("Total individual sample size:", tot_ind, "\n")
Total individual sample size: 1458 

(1.1) Varying assumptions - Standard sample size calculation

(1.1.1) Varying ICC

Varying ICC, all other parameters fixed

Code
power <- 0.80
alpha <- 0.05
p_C <- 0.50 
m <- 18
CV <- 0.98
delta <- 0.20

# Range of ICC values to test
ICC_values <- seq(0.05, 0.35, by = 0.01)

results_df <- data.frame(
  ICC = numeric(),
  n_clusters_per_arm = numeric(),
  n_individuals_per_arm = numeric()
)

cohen_h <- function(p1, p2) {
  2 * (asin(sqrt(p1)) - asin(sqrt(p2)))
}

for (icc in ICC_values) {
  p_I1 <- p_C + delta
  
  deff_cv <- 1 + ((m * (1 + CV^2)) - 1) * icc
  
  h_I1_C <- cohen_h(p_I1, p_C)
  
  ss1 <- pwr.2p.test(h = h_I1_C, power = power, sig.level = alpha)$n
  
  n_per_arm_crt <- ceiling(ss1 * deff_cv)
  
  n_clusters_per_arm <- ceiling(n_per_arm_crt / m)
  
  results_df <- rbind(results_df, data.frame(
    ICC = icc,
    n_clusters_per_arm = n_clusters_per_arm,
    n_individuals_per_arm = n_per_arm_crt
  ))
}
Code
ggplot(results_df, aes(x = ICC, y = n_clusters_per_arm * 2)) +
  geom_line(color = "darkred", linewidth = 1) +
  geom_point(color = "darkred", size = 2) +
  labs(
    title = "Total clusters needed vs. ICC",
    x = "ICC",
    y = "Total clusters needed"
  ) +
  theme_minimal() +
  scale_x_continuous(breaks = seq(0.05, 0.35, by = 0.02)) +
  scale_y_continuous(breaks = seq(0, max(results_df$n_clusters_per_arm * 2), by = 5))

(2) Sample size calculation CRT: Simulation-based

(2.1) Parameters

We follow the simulation setup according to J. Thompson & C. Leyrat for binary outcomes: https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01699-2

Data-generating model (per cluster):

  • For arm i (0 = control, 1 = intervention) and cluster j: Y_ij ∼ Binomial(m_ij, p_ij), logit(p_ij) = β0 + β1·i + u_j, where u_j is a cluster random effect with mean 0 and variance σ^2_b

    • Binomial(m_ij, p_ij): Conditional on p_ij, we assume each of the m_ij individuals in that cluster is an independent Bernoulli trial with probability p_ij. So Y_ij is a binomial draw with that probability, at the cluster level.

      • A Bernoulli trial is a random event with two outcomes (success/failure), with the same, independent, probability of success every time.

      • Independence assumption (within-cluster): Whether one person gets the IGRA test doesn’t change the probability for another person in the same cluster (once p_ij is fixed). The correlation between people’s outcomes in the same cluster comes entirely from them sharing the same p_ij (via the physician)

      • => Y_ij can be any integer from 0 to m_ij. E.g. if m_ij =42 and p_ij =0.50, then Y_ij is the total number of IGRA tests in that cluster, drawn from a binomial distribution with 42 trials and 50% success probability.

    • logit(p_ij) = β0 + β1·i + u_j:

      • Using the logit link maps probability p ∈ (0,1) to the whole real line, so we can model it as a linear predictor.

      • β0 is the baseline log-odds (the logit of the control probability for a typical cluster, i.e. when u_j = 0), representing the cluster-specific (conditional) control probability on the logit scale.

      • β1 encodes the treatment effect (multiplied by the arm indicator i) and is a log-odds difference; exp(β1) is the conditional odds ratio comparing intervention vs control for the same cluster (holding u_j fixed).

      • u_j is the cluster random intercept (a cluster-level shift on the log-odds scale). It captures unobserved cluster-level factors (e.g. physician tendency) that move all individuals in the cluster up/down in log-odds. Typically, u_j has mean 0 and variance σ^2_b and is independent across clusters (see above). The random intercept does not change the conditional treatment effect, it only shifts the baseline log-odds for that whole cluster. In other words, the difference in log-odds between arms for the same cluster is always constant, but the actual probabilities shift up/down with u_j. For clusters with positive u_j both arms have higher probabilities; for negative u_j both are lower.
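To make the last point concrete, a small worked example on the logit scale, treating p0 = 0.50 and p1 = 0.70 as the probabilities for a typical cluster (u_j = 0), as assumed later in the simulations:

Code
beta0 <- qlogis(0.50)                       # control log-odds for a typical cluster (u_j = 0)
beta1 <- log((0.70 / 0.30) / (0.50 / 0.50)) # conditional log-odds ratio, ~0.847

# same cluster, three hypothetical values of the random intercept u_j
for (u in c(-0.5, 0, 0.5)) {
  p_ctrl <- plogis(beta0 + u)
  p_int  <- plogis(beta0 + beta1 + u)
  cat("u_j =", u,
      "| control p =", round(p_ctrl, 3),
      "| intervention p =", round(p_int, 3),
      "| log-odds difference =", round(qlogis(p_int) - qlogis(p_ctrl), 3), "\n")
}
# The log-odds difference is always beta1, but both probabilities shift with u_j.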

ICC on log-odds scale:

  • ICC = ρ (rho) = σ^2_b / (σ^2_b + π^2/3)

  • The total variance (on the latent scale) consists of individual-level variance (noise) and between-cluster variance; the ICC is between-cluster variance / total variance. The between-cluster variance is the cluster random-effect variance (σ^2_b)

  • In logistic models with binary outcomes, the individual-level variance is fixed at π^2/3 (≈ 3.29)

  • So, solving the formula above for the cluster random-effect SD: σ_b = sigma_b = sqrt((ICC · π^2/3)/(1−ICC)) (see the numeric check below)

  • (If there’s additional within-site variation over time, i.e. baseline period or SW-CRT, we include σ^2_b_p, typically as a fraction of σ^2_b, e.g., half the site-level variance)
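A quick numeric check of this conversion for our assumed ICC of 0.20 (the same formula is implemented in the icc_to_sigma() helper in chapter 2.2):

Code
ICC <- 0.20
sigma_b <- sqrt((ICC * pi^2 / 3) / (1 - ICC))
sigma_b # ~0.91 on the log-odds scale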

Cluster effect distributions:

  • While the ICC is the proportion of the total variance (on the latent scale) that comes from between-cluster differences (“what fraction of the total variability is due to between-cluster differences”), σ^2_b is an absolute variance (“how big the cluster intercept spread is in log-odds units” or “how much variation there is in testing tendency across clusters”) and the distribution of u_j can take different shapes.

  • The GLMM assumes a normal distribution, but reality is often skewed. We simulate three scenarios, including a realistic/skewed/conservative one, and check whether the GLMM breaks (as in the paper above):

  • a) Normal: u_j ∼ N(0, σ^2_b)

    • Symmetric, bell-shaped, skewness = 0, excess kurtosis = 0.
  • b) Gamma (skewed): generate a_j ∼ Gamma(shape = 2, scale = 1), then set u_j = σ_b · (a_j − 2)/sqrt(2)

    • A shape parameter of 2 gives a distribution with skewness 1.4 and excess kurtosis 3, i.e. positive skew (some clusters have a much higher tendency than average); see the standalone check after this list
  • c) Uniform: u_j ∼ Uniform(−sqrt(3)·σ_b, sqrt(3)·σ_b)

    • Skewness = 0 (perfectly symmetric), excess kurtosis = −6/5 (lighter tails than normal), no extreme values, overall flat: all clusters are evenly spread; this tests whether GLMMs are sensitive to the lack of tail weight, i.e. whether they rely on the normal distribution’s tails to stabilize estimates.
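A standalone check of the gamma scenario (without the helper functions defined below in 2.2): after standardizing, the draws should have SD ≈ σ_b and skewness ≈ 1.4, matching the values quoted above.

Code
set.seed(1)
sigma_b <- sqrt((0.20 * pi^2 / 3) / (1 - 0.20)) # ~0.91 for ICC = 0.20
a <- rgamma(1e5, shape = 2, scale = 1)          # mean 2, variance 2
u_gamma <- sigma_b * (a - 2) / sqrt(2)          # standardized to mean 0, sd sigma_b
skewness <- mean((u_gamma - mean(u_gamma))^3) / sd(u_gamma)^3
round(c(sd = sd(u_gamma), skewness = skewness), 2) # expect ~0.91 and ~1.4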

Cluster sizes m_ij:

  • Allow for varying cluster size, i.e. a non-zero coefficient of variation (CV) of cluster sizes, using the same approach as they did: they sampled cluster sizes as m_ij = 2 + δ_ij, with δ_ij drawn from a negative binomial distribution:

    • δ_ij ∼ NegBin(size = (m−2)^2 / (s^2 − (m−2)), prob = (m−2)/s^2)

    • where s is the SD of cluster sizes (CV = s/m).

    • This yields a minimum cluster size of 3 (enforced in the code below). (Note: they specify the number of failures and the probability of failure; the above should represent the same.)

    • δ is thus the random component added to 2 to obtain the cluster size (minimum 3); see the sanity check after this list.
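A quick sanity check of this parameterization with our planned values (m = 18, CV = 0.98): the simulated cluster sizes should have a mean of roughly 18 and a CV of roughly 0.98 (slightly off because of the bump to a minimum of 3).

Code
set.seed(1)
m <- 18; CV <- 0.98
s <- CV * m              # SD of cluster sizes
mu_delta  <- m - 2       # mean of delta
var_delta <- s^2         # variance of delta
size_nb <- mu_delta^2 / (var_delta - mu_delta)
prob_nb <- mu_delta / var_delta
m_j <- pmax(3, 2 + rnbinom(1e5, size = size_nb, prob = prob_nb))
round(c(mean = mean(m_j), cv = sd(m_j) / mean(m_j)), 2) # expect roughly 18 and 0.98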

(2.2) Create main functions and simulate one dataset

Code
# 1) compute sigma_b from ICC (on latent logit scale):
icc_to_sigma <- function(rho){
  if(rho<=0) return(0)
  sigma_b <- sqrt( (rho * (pi^2/3)) / (1 - rho) )
  return(sigma_b)
}

# 2) compute beta0 for given control prevalence p0
p_to_beta0 <- function(p0){
  qlogis(p0)
}

# 3) given p0 and p1, compute OR on the cluster-specific log-odds scale
p0_p1_to_OR <- function(p0, p1){
  odds0 <- p0 / (1 - p0)
  odds1 <- p1 / (1 - p1)
  odds1 / odds0
}

# 4) generate random cluster-level u_j for the three distributions
generate_u <- function(n_clusters, sigma_b, dist = c("normal","gamma","uniform")){
  dist <- match.arg(dist)
  if(sigma_b == 0) return(rep(0, n_clusters))
  if(dist == "normal"){
    return(rnorm(n_clusters, mean=0, sd = sigma_b))
  } else if(dist == "gamma"){
    # they used Gamma(shape=2, scale=1) then standardized to mean 0 and sd sigma_b
    a <- rgamma(n_clusters, shape=2, scale=1)
    # a has mean 2, var 2. Standardize: (a - 2)/sqrt(2) then scale to sigma_b
    return(sigma_b * (a - 2)/sqrt(2))
  } else if(dist == "uniform"){
    cut <- sqrt(3) * sigma_b
    return(runif(n_clusters, min = -cut, max = cut))
  }
}

# 5) generate cluster sizes with target mean m and CV. Implementation follows their negative-binomial based approach and enforces minimum cluster size of 3.
generate_cluster_sizes <- function(n_clusters, m, CV){
  if(CV == 0){
    return(rep(m, n_clusters))
  }
  s <- CV * m
  # We want delta = m_j - 2 to follow NegBin with mean (m-2) and variance s^2
  mu_delta <- m - 2
  var_delta <- s^2
  if(var_delta <= mu_delta){
    # Negative Binomial requires variance > mean. So, this is an impossible NB parameterization
    # If so, fall back to a discrete uniform around m
    low <- max(3, floor(m - s*1.5))
    high <- ceiling(m + s*1.5)
    out <- pmax(3, round(runif(n_clusters, low, high)))
    return(out)
  }
  size_nb <- (mu_delta^2) / (var_delta - mu_delta) # see formula above
  prob_nb <- mu_delta / var_delta # see formula above
  # rnbinom in R uses size, prob; mean = size*(1-prob)/prob, but with this param it matches
  delta <- rnbinom(n_clusters, size = size_nb, prob = prob_nb)
  m_j <- 2 + delta
  m_j[m_j < 3] <- 3 # enforce min 3 (generating 2+delta ensures >=2, we bump to 3)
  return(m_j)
}

# Parameters for single simulated dataset
n_clusters <- 82
m_mean <- 18
CV <- 0.98
p0 <- 0.50
p1 <- 0.70
OR <- p0_p1_to_OR(p0, p1) # compute OR
rho <- 0.20 # ICC
re_dist <- "gamma"

# Simulate
set.seed(20250809)
sigma_b <- icc_to_sigma(rho)
u_j <- generate_u(n_clusters, sigma_b, dist = re_dist)
sizes <- generate_cluster_sizes(n_clusters, m_mean, CV)
arm_assign <- sample(rep(0:1, length.out = n_clusters))
beta0 <- p_to_beta0(p0)
beta1 <- log(OR)
y <- integer(n_clusters)

for(j in seq_len(n_clusters)){ # iterate over each cluster
  # create the linear predictor (NOTE: beta1 turns 0 if arm0, and 1 * beta1 if arm1)
  linpred <- beta0 + beta1 * arm_assign[j] + u_j[j] 
  # apply the inverse logit (logistic function) to convert log-odds to probability
  p_j <- plogis(linpred) 
  # Simulate the number of successes in cluster j
  y[j] <- rbinom(1, size = sizes[j], prob = p_j) 
}

df_sim <- data.frame(cluster = seq_len(n_clusters),
                      arm = arm_assign,
                      size = sizes,
                      y = y)
df_sim
   cluster arm size  y
1        1   0    3  1
2        2   1    3  2
3        3   1    5  4
4        4   1    3  0
5        5   1   30 25
6        6   0   82 30
7        7   0   16 12
8        8   0   10  8
9        9   1   26 15
10      10   0   10  4
11      11   0   11  2
12      12   0    3  2
13      13   1   17 12
14      14   1   18 15
15      15   0    7  2
16      16   0   41 24
17      17   0    3  0
18      18   0   26 12
19      19   0   36  6
20      20   1    5  3
21      21   0   15 11
22      22   1   18 12
23      23   1    3  3
24      24   0   38 19
25      25   1   17  7
26      26   0   22 11
27      27   1   44 20
28      28   1    5  4
29      29   1   73 36
30      30   1    6  3
31      31   0    3  0
32      32   1   13 13
33      33   0    5  3
34      34   1    5  5
35      35   1   14 10
36      36   0    4  3
37      37   0    4  1
38      38   1   31 24
39      39   1   17  9
40      40   0    3  3
41      41   0   16  7
42      42   1    3  2
43      43   1   30 15
44      44   0    7  1
45      45   1   19 17
46      46   0   15  5
47      47   1   20 14
48      48   1   56 49
49      49   0    3  0
50      50   1   13 12
51      51   1   22 18
52      52   0   25 14
53      53   1   21  9
54      54   0    4  4
55      55   0   62 19
56      56   1    7  3
57      57   0    3  3
58      58   0    5  3
59      59   0   13  5
60      60   0    3  1
61      61   0    3  1
62      62   0    6  4
63      63   1   27 14
64      64   0    7  3
65      65   1    5  4
66      66   0   28 12
67      67   0    5  0
68      68   1   11  9
69      69   0   17 11
70      70   1   12 11
71      71   1    6  5
72      72   1   19 18
73      73   1   24 14
74      74   1   10  9
75      75   1   10 10
76      76   0   12  4
77      77   0    3  3
78      78   1    4  3
79      79   1    3  3
80      80   0   19 15
81      81   0   45 20
82      82   1   20 16
Code
mean_sizes <- df_sim %>%
  group_by(arm) %>%
  summarise(mean_size = mean(size))

ggplot(df_sim, aes(x = factor(cluster), y = size, fill = factor(arm))) +
  geom_bar(stat = "identity", color = "black") +
  geom_hline(data = mean_sizes, aes(yintercept = mean_size, color = factor(arm)),
             linetype = "dashed", linewidth = 1, show.legend = FALSE) +
  geom_text(data = mean_sizes, aes(x = Inf, y = mean_size, label = paste0("Mean = ", round(mean_size, 1))),
            hjust = 1.1, vjust = -0.5, color = c("skyblue4", "tomato3"), size = 4) +
  scale_fill_manual(values = c("skyblue", "tomato"), labels = c("Control (arm=0)", "Intervention (arm=1)")) +
  scale_color_manual(values = c("skyblue4", "tomato3")) +
  labs(x = "Cluster", y = "Cluster Size", fill = "Treatment Group") +
  theme_minimal() +
  ggtitle("Cluster size per cluster") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

size = number of individuals in a cluster

y = number of individual-level successes (binary = 1) observed in the cluster, i.e. the number of individuals in that cluster with strategy-concordant IGRA testing (intervention arm: test if high-risk, do not test if low-risk; control arm: test all)

(2.3) Simulate power, using cluster-level analysis approach

NOTES:

  • Use cluster-level analysis (unweighted t-test on cluster-level log-odds, with 0.5 continuity correction, as per guidance from Thompson & Leyrat et al.; cf. their “clan” command)

  • Use gamma distribution (most conservative), simulate 500-1000 trials

(2.3.1) Create function

Code
simulate_power <- function(n_clusters = 82, 
                           m_mean = 18, 
                           CV = 0.98,
                           p0 = 0.50, 
                           p1 = 0.70, 
                           rho = 0.20,
                           re_dist = "gamma", 
                           n_sim = 1000,
                           alpha = 0.05, 
                           seed = 20250809) {
  set.seed(seed)
  
  # Compute derived parameters
  sigma_b <- icc_to_sigma(rho)
  beta0 <- p_to_beta0(p0)
  OR <- p0_p1_to_OR(p0, p1)
  beta1 <- log(OR)
  
  p_values <- numeric(n_sim)
  
  for (i in seq_len(n_sim)) {
    u_j <- generate_u(n_clusters, sigma_b, dist = re_dist)
    sizes <- generate_cluster_sizes(n_clusters, m_mean, CV)
    arm_assign <- sample(rep(0:1, length.out = n_clusters))
    
    y <- integer(n_clusters)
    for (j in seq_len(n_clusters)) {
      linpred <- beta0 + beta1 * arm_assign[j] + u_j[j]
      p_j <- plogis(linpred)
      y[j] <- rbinom(1, size = sizes[j], prob = p_j)
    }
    
    # Cluster-level log-odds with 0.5 continuity correction
    log_odds <- log((y + 0.5) / (sizes - y + 0.5))
    
    # Unweighted t-test
    group0 <- log_odds[arm_assign == 0]
    group1 <- log_odds[arm_assign == 1]
    
    test <- try(t.test(group1, group0, var.equal = TRUE), silent = TRUE)
    p_values[i] <- if (inherits(test, "try-error")) NA else test$p.value
  }
  
  # Estimate power
  mean(p_values < alpha, na.rm = TRUE)
}

(2.3.2) Calculate baseline scenario

Code
power_estimate <- simulate_power(n_clusters = 82,
                                 m_mean = 18,
                                 CV = 0.98,
                                 p0 = 0.50,
                                 p1 = 0.70,
                                 rho = 0.20,
                                 re_dist = "gamma",
                                 n_sim = 1000)

cat("Estimated power:", round(power_estimate, 3), "\n")
Estimated power: 0.91 

(2.3.3) Vary effect sizes

Code
p0_vals <- seq(0.35, 0.55, by = 0.05)
p1_vals <- seq(0.50, 0.85, by = 0.05)

grid <- expand.grid(p0 = p0_vals, p1 = p1_vals)

results <- grid %>%
  rowwise() %>%
  mutate(power = simulate_power(n_clusters = 82,
                                m_mean = 18,
                                CV = 0.98,
                                p0 = p0,
                                p1 = p1,
                                rho = 0.2,
                                re_dist = "gamma",
                                n_sim = 1000)) %>%
  ungroup()

# Plot
ggplot(results, aes(x = p1, y = power, color = factor(p0))) +
  
  # Shaded region above 90% power
  geom_rect(aes(xmin = -Inf, xmax = Inf, ymin = 0.9, ymax = Inf),
            fill = "lightgrey", alpha = 0.3, inherit.aes = FALSE) +
  
  # Power curves
  geom_line(linewidth = 1.2) +
  geom_point() +
  
  # Labels and scales
  labs(title = "Power curves by p0 and p1",
       x = "Intervention group probability (p1)",
       y = "Estimated power",
       color = "Control group probability (p0)") +
  scale_y_continuous(breaks = seq(0, 1, by = 0.1),
                     limits = c(0, 1),
                     labels = scales::percent_format(accuracy = 1)) +
  scale_x_continuous(breaks = seq(0, 1, by = 0.05)) +
  theme_minimal(base_size = 14)

(2.3.4) Vary ICC

Code
# Vector of ICC values to test
icc_values <- seq(0.05, 0.45, by = 0.02)

# Run power simulations for each ICC
power_results <- sapply(icc_values, function(rho) {
  simulate_power(n_clusters = 82,
                 m_mean = 18,
                 CV = 0.98,
                 p0 = 0.50,
                 p1 = 0.70,
                 rho = rho,
                 re_dist = "gamma",
                 n_sim = 1000,
                 alpha = 0.05,
                 seed = 20250809)
})

# Create data frame for plotting
df_power_icc <- data.frame(ICC = icc_values, Power = power_results)

# Plot
ggplot(df_power_icc, aes(x = ICC, y = Power)) +
  geom_line(color = "darkred", size = 1.2) +
  geom_point(color = "firebrick") +
  labs(title = "Power Curve by ICC",
       x = "Intraclass Correlation (ICC)",
       y = "Estimated Power") +
  scale_y_continuous(breaks = seq(0.70, 1, by = 0.1),
                     limits = c(0.70, 1),
                     labels = scales::percent_format(accuracy = 1)) +
  theme_minimal()
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_point()`).

(2.3.5) Vary number of clusters

Code
# Vector of cluster counts to test
n_clusters_vec <- seq(30, 110, by = 1)

# Run power simulations for each cluster count
power_results <- sapply(n_clusters_vec, function(nc) {
  simulate_power(n_clusters = nc,
                 m_mean = 18,
                 CV = 0.98,
                 p0 = 0.50,
                 p1 = 0.70,
                 rho = 0.20,
                 re_dist = "gamma",
                 n_sim = 1000,
                 alpha = 0.05,
                 seed = 20250809)
})

# Create data frame for plotting
df_power_css <- data.frame(Cluster_ss = n_clusters_vec, Power = power_results)

# Plot
ggplot(df_power_css, aes(x = Cluster_ss, y = Power)) +
  geom_line(color = "darkgreen", size = 1.2) +
  geom_point(color = "forestgreen") +
  labs(title = "Power vs Number of total clusters",
       x = "Total number of clusters",
       y = "Estimated power") +
  scale_y_continuous(breaks = seq(0.50, 1, by = 0.1),
                     limits = c(0.50, 1),
                     labels = scales::percent_format(accuracy = 1)) +
  scale_x_continuous(breaks = seq(30, 110, by = 5)) +
  theme_minimal()

(2.3.6) Vary number of individuals per cluster (mean cluster size)

Code
m_mean_vec <- seq(2, 40, by = 2)

# Run power simulations for each cluster count
power_results <- sapply(m_mean_vec, function(n) {
  simulate_power(n_clusters = 82,
                 m_mean = n,
                 CV = 0.98,
                 p0 = 0.50,
                 p1 = 0.70,
                 rho = 0.20,
                 re_dist = "gamma",
                 n_sim = 1000,
                 alpha = 0.05,
                 seed = 20250809)
})

# Create data frame for plotting
df_power_iss <- data.frame(Individual_ss = m_mean_vec, Power = power_results)

# Plot
ggplot(df_power_iss, aes(x = Individual_ss, y = Power)) +
  geom_line(color = "darkblue", size = 1.2) +
  geom_point(color = "skyblue") +
  labs(title = "Power vs Average number of individuals per cluster",
       x = "Average number of individuals per cluster",
       y = "Estimated power") +
  scale_y_continuous(breaks = seq(0.70, 1, by = 0.1),
                     limits = c(0.70, 1),
                     labels = scales::percent_format(accuracy = 1)) +
  scale_x_continuous(breaks = seq(2, 40, by = 2)) +
  theme_minimal()

(2.4) Simulate power, using GLMM analysis approach

NOTES:

  • As per trial protocol

  • Ca. 80 clusters in total => the number of clusters is large enough to use a standard GLMM (i.e. without restricted pseudo-likelihood estimation and reduced degrees of freedom, as per guidance from Thompson & Leyrat et al.)

  • Keep gamma distribution throughout

(2.4.1) Create function

Code
simulate_power_glmer <- function(n_clusters = 82, 
                                   m_mean = 18, 
                                   CV = 0.98,
                                   p0 = 0.50, 
                                   p1 = 0.70, 
                                   rho = 0.20,
                                   re_dist = "gamma", 
                                   n_sim = 1000,
                                   alpha = 0.05, 
                                   seed = 20250809) {
  set.seed(seed)
  
  sigma_b <- icc_to_sigma(rho)
  beta0 <- p_to_beta0(p0)
  OR <- p0_p1_to_OR(p0, p1)
  beta1 <- log(OR)
  
  p_values <- numeric(n_sim)
  
  for (i in seq_len(n_sim)) {
    u_j <- generate_u(n_clusters, sigma_b, dist = re_dist)
    sizes <- generate_cluster_sizes(n_clusters, m_mean, CV)
    arm_assign <- sample(rep(0:1, length.out = n_clusters))
    
    y <- integer(n_clusters)
    for (j in seq_len(n_clusters)) {
      linpred <- beta0 + beta1 * arm_assign[j] + u_j[j]
      p_j <- plogis(linpred)
      y[j] <- rbinom(1, size = sizes[j], prob = p_j)
    }
    
    dd_sim <- data.frame(
      y = y,
      size = sizes,
      arm = factor(arm_assign),
      cluster = factor(seq_len(n_clusters))
    )
    
    # Fit GLMM using glmer
    fit <- try(
      lme4::glmer(
        cbind(y, size - y) ~ arm + (1 | cluster),
        family = binomial(link = "logit"),
        data = dd_sim,
        control = lme4::glmerControl(optimizer = "bobyqa") # see Bates et al. (lme4 developers)
      ),
      silent = TRUE
    )

    if (!inherits(fit, "try-error")) {
      sm <- summary(fit)
      p_values[i] <- sm$coefficients["arm1", "Pr(>|z|)"]
    } else {
      p_values[i] <- NA
    }
  }

  mean(p_values < alpha, na.rm = TRUE)
}

(2.4.2) Calculate baseline scenario

Code
power_estimate <- simulate_power_glmer(n_clusters = 82,
                                         m_mean = 18,
                                         CV = 0.98,
                                         p0 = 0.50,
                                         p1 = 0.70,
                                         rho = 0.20,
                                         re_dist = "gamma",
                                         n_sim = 1000)

cat("Estimated power (GLMM):", round(power_estimate, 3), "\n")
Estimated power (GLMM): 0.947 

Currently, the cluster size variation (CV = 0.98) is high, so estimation is imprecise. We need to refine the eligibility criteria for the clusters (physicians).

(2.4.3) Vary effect sizes

Code
# p0_vals <- seq(0.35, 0.55, by = 0.05)
# p1_vals <- seq(0.50, 0.85, by = 0.05)
# 
# grid_glmm <- expand.grid(p0 = p0_vals, p1 = p1_vals)
# 
# # Use map2 to apply the function to each p0/p1 pair, bit more efficient
# grid_glmm$power <- map2_dbl(grid_glmm$p0, grid_glmm$p1, ~ simulate_power_glmer(
#   n_clusters = 82,
#   m_mean = 18,
#   CV = 0.98,
#   p0 = .x,
#   p1 = .y,
#   rho = 0.2,
#   re_dist = "gamma",
#   n_sim = 300 # reduced for speed
# ))
# 
# ggplot(grid_glmm, aes(x = p1, y = power, color = factor(p0))) +
#   geom_rect(aes(xmin = -Inf, xmax = Inf, ymin = 0.9, ymax = Inf),
#             fill = "lightgrey", alpha = 0.3, inherit.aes = FALSE) +
#   geom_line(size = 1.2) +
#   geom_point() +
#   labs(title = "Power Curves by p0 and p1 (GLMM)",
#        x = "Intervention group probability (p1)",
#        y = "Estimated power",
#        color = "Control group probability (p0)") +
#   scale_y_continuous(breaks = seq(0, 1, by = 0.1),
#                      limits = c(0, 1),
#                      labels = scales::percent_format(accuracy = 1)) +
#   theme_minimal(base_size = 14)

(2.4.4) Vary ICC

Code
# icc_values <- seq(0.05, 0.35, by = 0.02)
# 
# power_results_glmm <- sapply(icc_values, function(rho) {
#   simulate_power_glmer(n_clusters = 82,
#                  m_mean = 18,
#                  CV = 0.98,
#                  p0 = 0.50,
#                  p1 = 0.70,
#                  rho = rho,
#                  re_dist = "gamma",
#                  n_sim = 300, # reduced for speed
#                  alpha = 0.05,
#                  seed = 20250809)
# })
# 
# df_power_icc_glmm <- data.frame(ICC = icc_values, Power = power_results_glmm)
# 
# ggplot(df_power_icc_glmm, aes(x = ICC, y = Power)) +
#   geom_line(color = "darkred", size = 1.2) +
#   geom_point(color = "firebrick") +
#   labs(title = "Power curve by ICC (GLMM)",
#        x = "Intra-cluster correlation (ICC)",
#        y = "Estimated Power") +
#   scale_y_continuous(breaks = seq(0.70, 1, by = 0.1),
#                      limits = c(0.70, 1),
#                      labels = scales::percent_format(accuracy = 1)) +
#   theme_minimal()

(2.4.5) Vary number of clusters

Code
# n_clusters_vec <- seq(80, 120, by = 1)
# 
# power_results_glmm <- sapply(n_clusters_vec, function(nc) {
#   simulate_power_glmer(n_clusters = nc,
#                  m_mean = 18,
#                  CV = 0.98,
#                  p0 = 0.50,
#                  p1 = 0.70,
#                  rho = 0.20,
#                  re_dist = "gamma",
#                  n_sim = 300, # reduced for speed
#                  alpha = 0.05,
#                  seed = 20250809)
# })
# 
# df_power_css_glmm <- data.frame(Cluster_ss = n_clusters_vec, Power = power_results_glmm)
# 
# ggplot(df_power_css_glmm, aes(x = Cluster_ss, y = Power)) +
#   geom_line(color = "darkgreen", size = 1.2) +
#   geom_point(color = "forestgreen") +
#   labs(title = "Power vs Number of total clusters (GLMM)",
#        x = "Total number of clusters",
#        y = "Estimated power") +
#   scale_y_continuous(breaks = seq(0.70, 1, by = 0.1),
#                      limits = c(0.70, 1),
#                      labels = scales::percent_format(accuracy = 1)) +
#   scale_x_continuous(breaks = seq(80, 120, by = 5)) +
#   theme_minimal()

(3) Simulate the full dataset and implement the main analysis strategy

The main analysis strategy:

  • Generalized linear mixed model (GLMM) with restricted pseudo-likelihood estimation and small-sample correction for degrees of freedom (clusters minus cluster-level parameters), as suggested by Thompson and colleagues (could be dropped here with 50+ clusters per arm)

  • We will adjust the model for these a priori defined covariates (as fixed effects):

    • Cluster-level covariates (cluster mean): baseline testing rate (as a continuous covariate assuming linearity) and site (7 levels)

    • Individual-level covariates:

      • Self-reported sex/gender (for now, as binary covariate: male, female)

      • Age (as a continuous covariate, with a non-linear association modelled using restricted cubic splines with 3 knots at the 10th, 50th and 90th percentiles of the observed age distribution)

      • Region of origin (xxx levels)

  • We will report the resulting treatment effect (beta_1), which is the log-odds difference between intervention and control or, when exponentiated, the adjusted odds ratio, with its 95% confidence interval. This represents a relative cluster-specific effect, conditional on all included covariates.

  • In addition, we will use marginal standardization and report the resulting population-average marginal relative risk and risk difference with their 95% confidence intervals (this is primarily an individual-average estimate, but equals a cluster-average estimate in case of no informative cluster size); see the sketch after this list.
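A minimal sketch of the marginal-standardization step for the risk difference, assuming a fitted GLMM object model_primary as produced in Section 3.2 below; re.form = NA gives population-level predictions (random effect set to zero):

Code
library(marginaleffects)

# Marginal (standardized) risk difference, intervention vs control
rd <- avg_comparisons(model_primary,
                      variables = "arm",
                      type = "response",
                      comparison = "difference",
                      re.form = NA)
rd[, c("estimate", "conf.low", "conf.high")]

# The marginal relative risk uses the same call with comparison = "ratio" (as in 3.2)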

Notes on simulating a realistic dataset:

  • Reuse the helper functions from chapter 2.2, incl. conservative gamma distribution for u_j

  • Causal structure:

    • Cluster latent effect (u_j) influences baseline IGRA testing rate (through alpha, the correlation strength between baseline and u_j) and directly affects the outcome via the cluster random effect

    • Baseline IGRA testing rate directly affects the outcome (via beta_baseline), representing residual correlation beyond the shared cluster effect alpha

    • => baseline_rate directly pushes the outcome (via beta_baseline) and shares correlation with u_j (i.e., indirectly pushes the outcome), in the sense of: “Clusters with higher IGRA testing rate at baseline -> higher individually risk-tailored testing rate at endline”

    • Treatment (arm 1) directly affects the outcome (through beta_1), but the correlations above and the noise below partially mask it

  • Add some baseline noise (e.g. tau = 0.45) ensuring that even clusters with the same u_j will show some variability in their observed baseline_rate

  • alpha (correlation of baseline_rate with u_j): e.g. a value of 0.3 means roughly 30% of the variation in the baseline logit is driven by u_j (i.e. by the true cluster tendency or “cluster-to-cluster variation” at baseline, which also affects the outcome), while the rest comes from independent measurement noise (see the quick check after this list)

  • Produce an individual-level dataset, not only a cluster-level one, since this is what the real-life dataset will look like and in case we later want to add individual-level correlations
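A quick check of the baseline-logit variance share implied by alpha = 0.3 and tau = 0.45 (with sigma_b derived from ICC = 0.20):

Code
sigma_b <- sqrt((0.20 * pi^2 / 3) / (1 - 0.20)) # ~0.91, from ICC = 0.20
alpha <- 0.3
tau <- 0.45
var_share_u <- (alpha * sigma_b)^2 / ((alpha * sigma_b)^2 + tau^2)
round(var_share_u, 2) # ~0.27, i.e. roughly 30% of the baseline-logit variance comes from u_j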

(3.1) Simulate one dataset and check some diagnostics

Code
## We use the helper functions from chapter 2.2:

# icc_to_sigma
# generate_u
# generate_cluster_sizes
# p_to_beta0
# p0_p1_to_OR

## General parameters
set.seed(20250809)
n_clusters <- 82
m_mean <- 18
CV <- 0.98
p0 <- 0.50
p1 <- 0.70
OR <- p0_p1_to_OR(p0, p1)
rho <- 0.20 # ICC (on latent logit scale)
re_dist <- "gamma" # distribution for u_j, keep it conservative

# Individual-level covariates
age_mean <- 35
age_sd <- 12
sex_prob <- 0.40

## Generate cluster structure
sizes <- generate_cluster_sizes(n_clusters, m_mean, CV)
sigma_b <- icc_to_sigma(rho)
u_j <- generate_u(n_clusters, sigma_b, dist = re_dist)
arm_assign <- sample(rep(0:1, length.out = n_clusters))

# First important thing to mimic: IGRA testing rate at baseline
# alpha controls how much the baseline rate depends on the same latent cluster effect
# The bigger alpha, the more high-baseline clusters will also tend to have high endline outcomes indirectly, because u_j is reused in the outcome model => indirect correlation
# Baseline IGRA testing rate is explained by u_j + random noise eps + global average level gamma0.
gamma0 <- qlogis(p0) # the average cluster-level log-odds of baseline IGRA testing rate
alpha <- 0.3 # how much baseline (logit) depends on u_j (i.e. the latent cluster effect); 0 would be no correlation
tau <- 0.45 # the residual variation (SD) in baseline log-odds not explained by u_j, i.e. the random measurement noise
eps <- rnorm(n_clusters, 0, tau)
logit_b <- gamma0 + alpha * u_j + eps # putting it all together
baseline_rate <- plogis(logit_b) # map back to probability scale

# Second, the sites: uncorrelated 7-level
site <- sample(1:7, n_clusters, replace = TRUE,
               prob = c(0.05, 0.15, 0.20, 0.10, 0.15, 0.25, 0.10))

## Fixed effects on outcome, direct correlations on outcome
beta0 <- p_to_beta0(p0) # intercept
beta1 <- log(OR) # intervention effect
beta_baseline <- 0.5 # how strongly the baseline rate predicts the endline outcome, independent of u_j
beta_site <- 0.0 # no correlation
beta0_adj <- beta0 - 1.0 # after including u_j, baseline_rate, etc., the overall mean outcome probability can drift because of the nonlinear logistic function. Stabilize.

## Simulate individual-level data
ind_list <- vector("list", length = n_clusters)
for(j in seq_len(n_clusters)){
  nj <- sizes[j]
  age_j <- rnorm(nj, mean = age_mean, sd = age_sd) # draw from normal
  sex_j <- rbinom(nj, 1, prob = sex_prob) # draw from bernoulli
  
  logit_baseline_j <- qlogis(baseline_rate[j]) # back to logit
  # the log-odds of IGRA testing rate for all individuals in cluster j (same cluster-level predictors for all)
  linpred_j <- beta0 +
               beta1 * arm_assign[j] +
               beta_baseline * logit_baseline_j +
               beta_site * site[j] +
               u_j[j] # latent cluster random effect
  
  p_ij <- plogis(linpred_j) # Predicted probability of receiving an IGRA testing for each individual in cluster j. Since all individuals in a cluster share the same cluster-level covariates, p_ij is identical for everyone in the cluster (unless we later include individual-level predictors...)
  y_ij <- rbinom(nj, 1, p_ij) # the outcome; bernoulli with probability p_ij
  
  # save data for this one cluster
  ind_list[[j]] <- data.frame(
    cluster = j,
    arm = arm_assign[j],
    age = age_j,
    sex = sex_j,
    site = site[j],
    baseline_rate = baseline_rate[j],
    u_j = u_j[j],
    p = p_ij,
    y = y_ij
  )
}
df_ind <- do.call(rbind, ind_list)

## Cluster-level summary, aggregate at cluster-level
df_cluster <- aggregate(y ~ cluster + arm, data = df_ind, sum) # aggregate number of outcomes per cluster
cluster_sizes <- aggregate(y ~ cluster, data = df_ind, length) # count number of ind => cluster size
names(cluster_sizes)[2] <- "size"
df_cluster <- merge(df_cluster, cluster_sizes, by = "cluster") # merge by cluster id (the two aggregates have different row orders)
cluster_meta <- data.frame(
  cluster = seq_len(n_clusters),
  arm = arm_assign,
  site = site,
  baseline_rate = baseline_rate,
  u_j = u_j
)
df_sim <- merge(df_cluster, cluster_meta, by = c("cluster","arm"))
df_sim <- df_sim[order(df_sim$cluster),
                 c("cluster","arm","size","y","baseline_rate",
                   "site","u_j")]

## Diagnostics
cat("Mean baseline_rate =", round(mean(baseline_rate),3), "\n")
Mean baseline_rate = 0.494 
Code
cat("First few cluster-level rows:\n")
First few cluster-level rows:
Code
print(head(df_sim, 10))
   cluster arm size  y baseline_rate site        u_j
1        1   1    3 13     0.5574604    3  1.6757329
12       2   0   15 17     0.6486764    7  1.2961684
23       3   0   20  6     0.3237354    2 -1.0068194
34       4   1    3  3     0.3708562    6 -0.3483302
45       5   0   33  3     0.7243767    6 -0.4398316
56       6   1    5  8     0.6799559    7  0.3351121
67       7   1    3 35     0.4334983    7  1.2158472
78       8   0    5  4     0.6554908    4  1.7820900
82       9   1   30 12     0.4461980    3 -1.1302940
2       10   1   82  5     0.2767682    7 -1.0755483
Code
cat("First few individual-level rows:\n")
First few individual-level rows:
Code
print(head(df_ind, 50))
   cluster arm       age sex site baseline_rate       u_j         p y
1        1   1 62.929683   1    3     0.5574604  1.675733 0.9332963 1
2        1   1 25.098711   0    3     0.5574604  1.675733 0.9332963 0
3        1   1 23.034774   0    3     0.5574604  1.675733 0.9332963 1
4        1   1 43.487902   0    3     0.5574604  1.675733 0.9332963 1
5        1   1 36.093478   1    3     0.5574604  1.675733 0.9332963 1
6        1   1 35.830547   1    3     0.5574604  1.675733 0.9332963 0
7        1   1 61.808917   1    3     0.5574604  1.675733 0.9332963 1
8        1   1 33.756109   0    3     0.5574604  1.675733 0.9332963 1
9        1   1 21.832337   0    3     0.5574604  1.675733 0.9332963 1
10       1   1 41.505355   0    3     0.5574604  1.675733 0.9332963 1
11       1   1 26.195464   0    3     0.5574604  1.675733 0.9332963 1
12       1   1 41.345142   0    3     0.5574604  1.675733 0.9332963 1
13       1   1 48.646471   1    3     0.5574604  1.675733 0.9332963 1
14       1   1 25.924216   0    3     0.5574604  1.675733 0.9332963 1
15       1   1 25.958232   1    3     0.5574604  1.675733 0.9332963 1
16       2   0 38.362368   0    7     0.6486764  1.296168 0.8324068 1
17       2   0 31.537976   1    7     0.6486764  1.296168 0.8324068 1
18       2   0 39.650086   1    7     0.6486764  1.296168 0.8324068 1
19       2   0 19.820009   0    7     0.6486764  1.296168 0.8324068 1
20       2   0 49.845996   1    7     0.6486764  1.296168 0.8324068 1
21       2   0 25.980348   0    7     0.6486764  1.296168 0.8324068 0
22       2   0 34.500022   0    7     0.6486764  1.296168 0.8324068 1
23       2   0 56.066420   0    7     0.6486764  1.296168 0.8324068 0
24       2   0  8.990755   0    7     0.6486764  1.296168 0.8324068 1
25       2   0 44.342623   0    7     0.6486764  1.296168 0.8324068 1
26       2   0 52.066237   1    7     0.6486764  1.296168 0.8324068 1
27       2   0 52.103608   0    7     0.6486764  1.296168 0.8324068 1
28       2   0 23.644091   1    7     0.6486764  1.296168 0.8324068 1
29       2   0 40.964825   0    7     0.6486764  1.296168 0.8324068 1
30       2   0 43.050042   0    7     0.6486764  1.296168 0.8324068 1
31       2   0 36.919039   1    7     0.6486764  1.296168 0.8324068 1
32       2   0 25.352254   1    7     0.6486764  1.296168 0.8324068 1
33       2   0 27.091611   0    7     0.6486764  1.296168 0.8324068 0
34       2   0 33.633571   0    7     0.6486764  1.296168 0.8324068 1
35       2   0 39.662819   0    7     0.6486764  1.296168 0.8324068 1
36       3   0 14.381811   0    2     0.3237354 -1.006819 0.2017893 0
37       3   0 48.466743   0    2     0.3237354 -1.006819 0.2017893 1
38       3   0 42.521137   1    2     0.3237354 -1.006819 0.2017893 0
39       3   0 26.055303   0    2     0.3237354 -1.006819 0.2017893 0
40       3   0 34.919207   0    2     0.3237354 -1.006819 0.2017893 0
41       3   0 39.721004   1    2     0.3237354 -1.006819 0.2017893 0
42       3   0 25.793194   0    2     0.3237354 -1.006819 0.2017893 1
43       3   0 23.434322   0    2     0.3237354 -1.006819 0.2017893 0
44       3   0 22.620895   0    2     0.3237354 -1.006819 0.2017893 0
45       3   0 31.902922   0    2     0.3237354 -1.006819 0.2017893 0
46       3   0 37.658602   0    2     0.3237354 -1.006819 0.2017893 1
47       3   0 45.585714   0    2     0.3237354 -1.006819 0.2017893 0
48       3   0 18.930409   0    2     0.3237354 -1.006819 0.2017893 0
49       3   0 45.909040   0    2     0.3237354 -1.006819 0.2017893 0
50       3   0 42.280878   0    2     0.3237354 -1.006819 0.2017893 0
Code
cat("\nOverall N =", sum(df_sim$size), "individuals across", n_clusters, "clusters\n")

Overall N = 1461 individuals across 82 clusters
Code
# Compute mean observed IGRA testing rate per arm
arm_rates <- aggregate(y ~ arm, data = df_ind, mean)
arm_rates$y <- round(arm_rates$y, 3)
for(i in seq_len(nrow(arm_rates))){
  cat("Arm", arm_rates$arm[i], "observed IGRA testing rate:", arm_rates$y[i], "\n")
}
Arm 0 observed IGRA testing rate: 0.447 
Arm 1 observed IGRA testing rate: 0.624 
Code
invisible(list(individual = df_ind, cluster = df_sim)) # prevents automatic printing to console

(3.2) The primary analysis approach (GLMM, fully adjusted)

Code
## Precompute spline basis for age and convert to numeric
age_spline <- as.data.frame(ns(df_ind$age, knots = quantile(df_ind$age, probs=c(0.1,0.5,0.9))))
colnames(age_spline) <- paste0("age_spline", seq_len(ncol(age_spline)))
age_spline[] <- lapply(age_spline, as.numeric)
df_ind <- cbind(df_ind, age_spline)

## Ensure factor levels
df_ind$arm <- factor(df_ind$arm, levels = c(0,1)) # 0 = control, 1 = intervention
df_ind$sex <- factor(df_ind$sex, levels = c(0,1)) # 0 = male, 1 = female
df_ind$site <- factor(df_ind$site, levels = c(1,2,3,4,5,6,7))

## Fit GLMM using glmer (fully adjusted, age spline) 
spline_cols <- colnames(df_ind)[grepl("^age_spline", colnames(df_ind))]

form_primary <- as.formula(
  paste("y ~ arm + baseline_rate + site + sex +",
        paste(spline_cols, collapse = " + "),
        "+ (1 | cluster)")
)
model_primary <- lme4::glmer(
  formula = form_primary,
  family = binomial(link = "logit"),
  data = df_ind,
  control = lme4::glmerControl(optimizer = "bobyqa") 
)

### Now, let's make a few comparisons
## 1. Unadjusted OR
form_unadj <- as.formula(paste("y ~ arm + (1 | cluster)"))
model_unadj <- lme4::glmer(
  formula = form_unadj,
  family = binomial(link = "logit"),
  data = df_ind,
  control = lme4::glmerControl(optimizer = "bobyqa") 
)

coef_name_unadj <- grep("^arm", names(fixef(model_unadj)), value=TRUE)
coef_arm_unadj <- fixef(model_unadj)[coef_name_unadj]
se_arm_unadj <- summary(model_unadj)$coefficients[coef_name_unadj,"Std. Error"]
p_val_unadj <- summary(model_unadj)$coefficients[coef_name_unadj,"Pr(>|z|)"]
OR_unadj <- exp(coef_arm_unadj)
CI_unadj <- exp(confint(model_unadj, method="Wald")[coef_name_unadj, ])

## 2. Adjusted for stratification variables only
form_strata <- as.formula(paste("y ~ arm + baseline_rate + site + (1 | cluster)"))
model_strata <- lme4::glmer(
  formula = form_strata,
  family = binomial(link = "logit"),
  data = df_ind,
  control = lme4::glmerControl(optimizer = "bobyqa") 
)

coef_name_strata <- grep("^arm", names(fixef(model_strata)), value=TRUE)
coef_arm_strata <- fixef(model_strata)[coef_name_strata]
se_arm_strata <- summary(model_strata)$coefficients[coef_name_strata,"Std. Error"]
p_val_strata <- summary(model_strata)$coefficients[coef_name_strata,"Pr(>|z|)"]
OR_strata <- exp(coef_arm_strata)
CI_strata <- exp(confint(model_strata, method="Wald")[coef_name_strata, ])

## 3. Fully adjusted, age as spline (see main model above)
coef_name_full <- grep("^arm", names(fixef(model_primary)), value=TRUE)
coef_arm_full <- fixef(model_primary)[coef_name_full]
se_arm_full <- summary(model_primary)$coefficients[coef_name_full,"Std. Error"]
p_val_full <- summary(model_primary)$coefficients[coef_name_full,"Pr(>|z|)"]
OR_full <- exp(coef_arm_full)
CI_full <- exp(confint(model_primary, method="Wald")[coef_name_full, ])

## 4. And finally, calculate RR for the main model, using marginal standardization
RR_model <- tryCatch({
  avg_comparisons(model_primary, variables="arm", type="response", comparison="ratio", re.form = NA)
}, error=function(e) NULL)

if(!is.null(RR_model)){
  rr <- RR_model$estimate[1]
  rr_cl <- RR_model$conf.low[1]
  rr_ch <- RR_model$conf.high[1]
} else {
  rr <- rr_cl <- rr_ch <- NA_real_
}

## Combine it all into a table
results_table <- data.frame(
  Metric = c("Unadjusted", "Adjusted for strat only", "Fully adjusted; age spline"),
  OR = c(sprintf("%.3f", OR_unadj),
         sprintf("%.3f", OR_strata),
         sprintf("%.3f", OR_full)),
  CI_lower = c(sprintf("%.3f", CI_unadj[1]),
               sprintf("%.3f", CI_strata[1]),
               sprintf("%.3f", CI_full[1])),
  CI_upper = c(sprintf("%.3f", CI_unadj[2]),
               sprintf("%.3f", CI_strata[2]),
               sprintf("%.3f", CI_full[2])),
  wald_based_p_value = c(sprintf("%.3f", p_val_unadj), sprintf("%.3f", p_val_strata), sprintf("%.3f", p_val_full)),
  RR = c(NA, NA, sprintf("%.3f", rr)),
  RR_CI_lower = c(NA, NA, sprintf("%.3f", rr_cl)),
  RR_CI_upper = c(NA, NA, sprintf("%.3f", rr_ch))
)

results_table %>%
  kable("html", caption="Intervention effect: OR and RR with 95% CI (single simulation)") %>%
  kable_styling(bootstrap_options="striped", full_width=FALSE)
Intervention effect: OR and RR with 95% CI (single simulation)
Metric OR CI_lower CI_upper wald_based_p_value RR RR_CI_lower RR_CI_upper
Unadjusted 2.742 1.602 4.693 0.000 NA NA NA
Adjusted for strat only 2.796 1.750 4.466 0.000 NA NA NA
Fully adjusted; age spline 2.785 1.738 4.462 0.000 1.510 1.217 1.804

CAVE:

  1. This is 1 randomly simulated dataset.
  2. Add a third individual-level covariate (region of origin); a sketch follows below.

Due to the correlation structure, the adjustment for the baseline outcome rate (part of the stratification factors) increases power and precision. The further adjustment for individual-level covariates does not change much, since no correlation is simulated at that level.

RR is only constructed for the primary (fully adjusted) model.
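Re CAVE point 2, a hedged sketch of how region of origin could be added as a third individual-level covariate; the levels and frequencies below are placeholders for illustration, not SHCS figures:

Code
# Hypothetical region-of-origin covariate (placeholder levels and frequencies)
region_levels <- c("Europe", "Sub-Saharan Africa", "Asia", "Other")
df_ind$region <- factor(sample(region_levels, nrow(df_ind), replace = TRUE,
                               prob = c(0.55, 0.20, 0.15, 0.10)))

# Extend the primary model formula (same structure as in 3.2) and refit
form_primary_ext <- as.formula(
  paste("y ~ arm + baseline_rate + site + sex + region +",
        paste(spline_cols, collapse = " + "),
        "+ (1 | cluster)")
)
model_primary_ext <- lme4::glmer(
  formula = form_primary_ext,
  family = binomial(link = "logit"),
  data = df_ind,
  control = lme4::glmerControl(optimizer = "bobyqa")
)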

(3.3) Put all together and simulate the power

Simulate trials based on the dataset simulation (Chapter 3.1) and the primary analysis model (Chapter 3.2); n_sims = 100 below for speed (target 1000 for the final calculation)

Code
simulate_crt <- function(
  n_clusters = 82,
  m_mean = 18,
  CV = 0.98,
  p0 = 0.50,
  p1 = 0.70,
  rho = 0.20,
  re_dist = "gamma",
  alpha = 0.3, # weak-moderate correlation between u_j and baseline outcome rate
  tau = 0.45, # SD of baseline noise
  beta_baseline = 0.2, # weak-moderate pos correlation: baseline outcome rate -> outcome, independent of u_j
  beta_site = 0.0, # no direct site effect on the outcome (as in 3.1)
  age_mean = 35,
  age_sd = 12,
  sex_prob = 0.40
){

  # (1) Compute OR and intercept
  OR <- p0_p1_to_OR(p0, p1)
  beta0 <- p_to_beta0(p0)
  beta1 <- log(OR)
  beta0_adj <- beta0 - 1.0
  
  # (2) Generate clusters
  sizes <- generate_cluster_sizes(n_clusters, m_mean, CV)
  sigma_b <- icc_to_sigma(rho)
  u_j <- generate_u(n_clusters, sigma_b, dist = re_dist)
  arm_assign <- sample(rep(0:1, length.out = n_clusters))
  
  # (3) Baseline IGRA testing rate
  eps <- rnorm(n_clusters, 0, tau)
  logit_b <- qlogis(p0) + alpha * u_j + eps
  baseline_rate <- plogis(logit_b)
  
  # (4) site
  site <- sample(1:7, n_clusters, replace = TRUE,
               prob = c(0.05, 0.15, 0.20, 0.10, 0.15, 0.25, 0.10))
  
  # (5) Individual-level simulation
  ind_list <- vector("list", length = n_clusters)
  for(j in seq_len(n_clusters)){
    nj <- sizes[j]
    age_j <- rnorm(nj, mean = age_mean, sd = age_sd)
    sex_j <- rbinom(nj, 1, sex_prob)
    logit_baseline_j <- qlogis(baseline_rate[j])
    
    linpred_j <- beta0_adj +
                 beta1 * arm_assign[j] +
                 beta_baseline * logit_baseline_j +
                 beta_site * site[j] + 
                 u_j[j]
    
    p_ij <- plogis(linpred_j)
    y_ij <- rbinom(nj, 1, p_ij)
    
    ind_list[[j]] <- data.frame(
      cluster = j,
      arm = arm_assign[j],
      age = age_j,
      sex = sex_j,
      site = site[j],
      baseline_rate = baseline_rate[j],
      u_j = u_j[j],
      p = p_ij,
      y = y_ij
    )
  }
  
  df_ind <- do.call(rbind, ind_list)
  
  # (6) Cluster-level summary
  df_cluster <- aggregate(y ~ cluster + arm, data = df_ind, sum)
  cluster_sizes <- aggregate(y ~ cluster, data = df_ind, length)
  names(cluster_sizes)[2] <- "size"
  df_cluster <- merge(df_cluster, cluster_sizes, by = "cluster") # merge by cluster id (row orders of the two aggregates differ)
  cluster_meta <- data.frame(
    cluster = seq_len(n_clusters),
    arm = arm_assign,
    site = site,
    baseline_rate = baseline_rate,
    u_j = u_j
  )
  df_sim <- merge(df_cluster, cluster_meta, by = c("cluster","arm"))
  df_sim <- df_sim[order(df_sim$cluster),
                   c("cluster","arm","size","y","baseline_rate",
                     "site","u_j")]
  
  return(list(
    individual = df_ind,
    cluster = df_sim
  ))
}

# Default simulation
# sim_data <- simulate_crt()
# df_ind <- sim_data$individual
# df_cluster <- sim_data$cluster

# Number of simulations
n_sims <- 100
set.seed(20250809)

# Storage
results <- data.frame(
  sim = 1:n_sims,
  unadj_signif = NA,
  adj_signif = NA,
  beta_unadj = NA,
  beta_adj = NA,
  OR_unadj = NA,
  OR_unadj_lower = NA,
  OR_unadj_upper = NA,
  OR_adj = NA,
  OR_adj_lower = NA,
  OR_adj_upper = NA
)

for(i in seq_len(n_sims)){
  
  # (1) Simulate trial
  sim_data <- simulate_crt()
  df_ind <- sim_data$individual
  
  # (2) Prepare age spline
  age_spline <- as.data.frame(ns(df_ind$age, 
                                 knots = quantile(df_ind$age, probs=c(0.1,0.5,0.9))))
  colnames(age_spline) <- paste0("age_spline", seq_len(ncol(age_spline)))
  df_ind <- cbind(df_ind, age_spline)
  
  # (3) Ensure factors
  df_ind$arm <- factor(df_ind$arm, levels = c(0,1))
  df_ind$sex <- factor(df_ind$sex, levels = c(0,1))
  df_ind$site <- factor(df_ind$site, levels = c(1,2,3,4,5,6,7))
  
  # (4) Unadjusted model
  form_unadj <- as.formula(paste("y ~ arm + (1 | cluster)"))
  model_unadj <- lme4::glmer(
  formula = form_unadj,
  family = binomial(link = "logit"),
  data = df_ind,
  control = lme4::glmerControl(optimizer = "bobyqa")
  )
  coef_name_unadj <- grep("^arm", names(fixef(model_unadj)), value=TRUE)
  beta1_unadj <- fixef(model_unadj)[coef_name_unadj]
  se1_unadj <- summary(model_unadj)$coefficients[coef_name_unadj,"Std. Error"]
  pval_unadj <- summary(model_unadj)$coefficients[coef_name_unadj,"Pr(>|z|)"]
  
  # (5) Fully adjusted model
  spline_cols <- colnames(df_ind)[grepl("^age_spline", colnames(df_ind))]
  form_adj <- as.formula(
  paste("y ~ arm + baseline_rate + site + sex +",
        paste(spline_cols, collapse = " + "),
        "+ (1 | cluster)")
  )
  model_adj <- lme4::glmer(
  formula = form_adj,
  family = binomial(link = "logit"),
  data = df_ind,
  control = lme4::glmerControl(optimizer = "bobyqa") 
  )
  coef_name_adj <- grep("^arm", names(fixef(model_adj)), value=TRUE)
  beta1_adj <- fixef(model_adj)[coef_name_adj]
  se1_adj <- summary(model_adj)$coefficients[coef_name_adj,"Std. Error"]
  pval_adj <- summary(model_adj)$coefficients[coef_name_adj,"Pr(>|z|)"]
  
  # (6) Save results including OR and CI
  CI_unadj <- exp(confint(model_unadj, method="Wald")[coef_name_unadj, ])
  results$OR_unadj[i] <- exp(beta1_unadj)
  results$OR_unadj_lower[i] <- CI_unadj[1]
  results$OR_unadj_upper[i] <- CI_unadj[2]
  
  CI_adj <- exp(confint(model_adj, method="Wald")[coef_name_adj, ])
  results$OR_adj[i] <- exp(beta1_adj)
  results$OR_adj_lower[i] <- CI_adj[1]
  results$OR_adj_upper[i] <- CI_adj[2]
  
  results$unadj_signif[i] <- (pval_unadj < 0.05)
  results$adj_signif[i]   <- (pval_adj < 0.05)
}

# (7) Compute estimated power
power_unadj <- mean(results$unadj_signif)
power_adj <- mean(results$adj_signif)

cat("Estimated power (unadjusted)  =", round(power_unadj,4), "\n")
Estimated power (unadjusted)  = 0.86 
Code
cat("Estimated power (fully adjusted) =", round(power_adj,4), "\n")
Estimated power (fully adjusted) = 0.97 
Code
# Summary of ORs
summary(results[,c("OR_unadj","OR_unadj_lower","OR_unadj_upper",
                   "OR_adj","OR_adj_lower","OR_adj_upper")])
    OR_unadj     OR_unadj_lower   OR_unadj_upper      OR_adj     
 Min.   :1.147   Min.   :0.6807   Min.   :1.931   Min.   :1.418  
 1st Qu.:1.893   1st Qu.:1.1332   1st Qu.:3.116   1st Qu.:2.020  
 Median :2.252   Median :1.4282   Median :3.783   Median :2.347  
 Mean   :2.440   Mean   :1.4774   Mean   :4.041   Mean   :2.435  
 3rd Qu.:2.825   3rd Qu.:1.7251   3rd Qu.:4.522   3rd Qu.:2.717  
 Max.   :5.455   Max.   :3.2861   Max.   :9.056   Max.   :4.521  
  OR_adj_lower     OR_adj_upper  
 Min.   :0.9161   Min.   :2.175  
 1st Qu.:1.2806   1st Qu.:3.127  
 Median :1.5198   Median :3.667  
 Mean   :1.5689   Mean   :3.791  
 3rd Qu.:1.8132   3rd Qu.:4.330  
 Max.   :2.9192   Max.   :7.296  

Currently, the cluster size variation (CV = 0.98) is high, so estimation is imprecise (and the SEs possibly too narrow). We need to refine the eligibility criteria for the clusters (physicians).

(4) Stratified randomization algorithm

(4.1) Covariate-constrained randomization

If we use batch randomization (all clusters randomized at once, with no new clusters entering later), a relatively simple covariate-constrained randomization can be used: https://rethinkingclinicaltrials.org/chapters/design/experimental-designs-and-randomization-schemes/covariate-constrained-randomization/

  • Exact 1:1 overall allocation:

    • 41 Control / 41 Intervention
  • Soft site stratification:

    • Each site gets roughly half clusters in each arm

    • Small deviations allowed for odd-sized sites

  • Global numeric balance re baseline outcome rate:

    • Minimises the mean difference in baseline_rate between arms
  • Random selection among best allocations:

    • Preserves randomness while enforcing optimal balance
Code
set.seed(20250820)

n_clusters <- 82
site <- factor(sample(
  1:7, n_clusters, replace = TRUE,
  prob = c(0.05, 0.15, 0.20, 0.10, 0.15, 0.25, 0.10)
))

cluster_data <- data.frame(
  cluster_id = 1:n_clusters,
  baseline_rate = runif(n_clusters, 0.35, 0.65),
  site = site
)

treatments <- c("Control", "Intervention")

### CCR function for global randomisation
run_global_ccr <- function(df, n_sims = 10000, top_pct = 0.10) {
  n <- nrow(df)
  n_ctrl <- n_int <- n / 2  # exact 1:1 allocation
  
  scores <- numeric(n_sims)
  allocs <- matrix(NA, nrow = n_sims, ncol = n)
  
  for (i in 1:n_sims) {
    # Random 1:1 allocation
    arm_assign <- sample(c(rep("Control", n_ctrl),
                           rep("Intervention", n_int)))
    allocs[i, ] <- arm_assign
    temp <- df
    temp$arm <- arm_assign
    
    # Numeric covariate imbalance
    means_diff <- abs(tapply(temp$baseline_rate, temp$arm, mean)["Control"] -
                      tapply(temp$baseline_rate, temp$arm, mean)["Intervention"])
    
    # Soft site stratification: imbalance = sum of squared differences between observed vs ideal allocation per site
    site_table <- table(temp$site, temp$arm)
    ideal_site <- table(temp$site) / 2  # target per arm per site (soft)
    site_diff <- sum((site_table[, "Control"] - ideal_site)^2 +
                     (site_table[, "Intervention"] - ideal_site)^2)
    
    # Total score: numeric imbalance + small weight for site imbalance
    scores[i] <- means_diff + 0.01 * site_diff
  }
  
  # Select best allocations
  threshold <- quantile(scores, top_pct)
  best_idx <- which(scores <= threshold)
  chosen <- sample(best_idx, 1)
  
  df$final_arm <- allocs[chosen, ]
  return(df)
}

### Run it
final_result <- run_global_ccr(cluster_data)
print(final_result)
   cluster_id baseline_rate site    final_arm
1           1     0.5106078    4      Control
2           2     0.3758802    6 Intervention
3           3     0.5055127    1 Intervention
4           4     0.4308069    7 Intervention
5           5     0.6309318    6      Control
6           6     0.5829530    3      Control
7           7     0.5397852    2 Intervention
8           8     0.4321546    6 Intervention
9           9     0.5446242    6      Control
10         10     0.3965107    1      Control
11         11     0.4836322    2      Control
12         12     0.6162682    6      Control
13         13     0.6090239    6      Control
14         14     0.5005644    5 Intervention
15         15     0.6034779    6 Intervention
16         16     0.5563255    6      Control
17         17     0.4810453    4 Intervention
18         18     0.6341021    5 Intervention
19         19     0.5228288    6 Intervention
20         20     0.4573054    3 Intervention
21         21     0.4577337    6 Intervention
22         22     0.4572299    5      Control
23         23     0.6190049    1 Intervention
24         24     0.6238735    3      Control
25         25     0.3792085    6 Intervention
26         26     0.3979843    1 Intervention
27         27     0.5827194    7 Intervention
28         28     0.3678664    3      Control
29         29     0.4648712    4 Intervention
30         30     0.5082518    6      Control
31         31     0.5782608    3 Intervention
32         32     0.5159968    4 Intervention
33         33     0.5251654    7      Control
34         34     0.3843489    6      Control
35         35     0.6089731    6 Intervention
36         36     0.5031032    5      Control
37         37     0.4041871    7      Control
38         38     0.6092656    5      Control
39         39     0.4701395    2      Control
40         40     0.4326763    3 Intervention
41         41     0.5573211    1 Intervention
42         42     0.3511044    5      Control
43         43     0.4021090    6      Control
44         44     0.4842421    3 Intervention
45         45     0.3615362    6      Control
46         46     0.5424458    6 Intervention
47         47     0.5349092    4 Intervention
48         48     0.4250874    2 Intervention
49         49     0.6187297    2 Intervention
50         50     0.5905047    3      Control
51         51     0.4687591    3      Control
52         52     0.4065764    4 Intervention
53         53     0.6257611    1 Intervention
54         54     0.4012752    6 Intervention
55         55     0.6443693    4      Control
56         56     0.5372378    6      Control
57         57     0.5869562    6      Control
58         58     0.4314816    2      Control
59         59     0.5850968    5      Control
60         60     0.4714507    4      Control
61         61     0.4283994    3      Control
62         62     0.4214659    1      Control
63         63     0.4622420    3 Intervention
64         64     0.4819583    6      Control
65         65     0.4594293    3 Intervention
66         66     0.4733779    3 Intervention
67         67     0.6452266    4      Control
68         68     0.4311654    2      Control
69         69     0.5038346    1      Control
70         70     0.4152730    6 Intervention
71         71     0.5302194    3 Intervention
72         72     0.3952863    4      Control
73         73     0.6158604    5 Intervention
74         74     0.5933622    5 Intervention
75         75     0.5612075    3 Intervention
76         76     0.5715312    2      Control
77         77     0.4923413    7 Intervention
78         78     0.4417027    4      Control
79         79     0.4798992    7 Intervention
80         80     0.6474203    1      Control
81         81     0.4415370    4 Intervention
82         82     0.4459414    5      Control
Code
### Checks
cat("\nOverall treatment counts:\n")

Overall treatment counts:
Code
print(table(final_result$final_arm))

     Control Intervention 
          41           41 
Code
cat("\nBalance within each site (aim: approximate 1:1):\n")

Balance within each site (aim: approximate 1:1):
Code
print(table(final_result$site, final_result$final_arm))
   
    Control Intervention
  1       4            5
  2       5            3
  3       6            9
  4       6            6
  5       6            4
  6      12           10
  7       2            4
Code
cat("\nBalance by mean baseline_rate by arm (aim: approximate 1:1):\n")

Balance by mean baseline_rate by arm (aim: approximate 1:1):
Code
print(tapply(final_result$baseline_rate, final_result$final_arm, mean))
     Control Intervention 
   0.5031426    0.5034632