Nuding intervention on the level of SHCS physicians at SHCS sites in Switzerland to promote the use of a novel TB risk score to do individual risk-tailored TB screening (i.e. IGRA testing):
Control: Enhanced standard of care: Electronic health records (EHR)-based based nudge consisting of a physician-targeted reminder for IGRA testing (no risk score)
Intervention: EHR-based nudge consisting of a physician-targeted reminder for IGRA testing guided by the risk score (individually risk-tailored: no testing for low-risk, and testing for high-risk))
Clusters are SHCS physicians, with their SHCS cohort participants, from all major sites (7 sites across Switzerland)
Eligible physician-clusters are active stable SHCS physician and eligible participants are SHCS participants with no recorded IGRA
Primary outcome, powered sample size on, is “individually risk-tailored IGRA testing rate”, i.e. testing among high-risk and no testing among low-risk participants (according to the TB score)
Unit of data collection with be the SHCS cohort participants, but level of inference will be the physicians, i.e. cluster-average
Binary primary outcome: Proportion of patients successfully risk-tailored screened at first presentation
Baseline primary outcome rate (control clusters):
We estimate across the entire cohort that 10% are at high-risk, 90% at low-risk for developing TB (according to score)
The high-risk should be screened, the low-risk should NOT be screened
We estimate that:
We estimate an IGRA testing rate of 45% on average across all physician-clusters, higher among the 10% high-risk and lower among the 90% low-risk, e.g.:
That means 90% (of the 10% high-risk) are correctly tested and 60% (of the 90% low-risk) are correctly NOT tested at the moment => Overall: 0.1*(0.9) + 0.9*(0.60) = 63%
=> we estimate a baseline outcome rate of 63%
Expected outcome in intervention, and delta:
We estimate that we can increase the strata-tailored success outcome rate to:
95% (of the 10% high-risk) are correctly tested and 80% (of the 90% low-risk) are correctly NOT tested => Overall: 0.1*(0.95) + 0.9*(0.8) = 81.5%
=> we estimate an outcome rate of 81% in intervention
=> delta of 81-63 = 18 absolute percentage points
Cluster size (m) of eligible overall participants (6m recruitment period):
on average 18 eligible participants per physician-cluster over a 6m period
CV (coefficient of variation), ratio of standard deviation of cluster sizes to mean of cluster sizes:
0.98
ICC for the primary outcome: 0.20 (behavioural intervention/outcome)
Max. ca. 90 eligible clusters, i.e. 45 clusters per arm, due to cohort
Min. desired power 80%, two-sided alpha of 0.05
1:1 allocation
Packages
Code
req_pkgs <-c("pwr","dplyr","purrr","ggplot2","lme4","geepack", # for GEE (if needed)"MASS", # for GLMM PQL"marginaleffects", # for marginal standardization"future","future.apply","nlme","tibble","knitr","kableExtra","splines")install_if_missing <-function(pkgs){for(p in pkgs){if(!requireNamespace(p, quietly=TRUE)){install.packages(p, repos="https://cloud.r-project.org") }library(p, character.only=TRUE) }}install_if_missing(req_pkgs)# set global RNG seed for reproducibilityset.seed(20250809)
Corresponding individual randomized trial
Sample size for the individual randomized trial on the same question
Code
# Parametersp_C <-0.63p_I <-0.81power <-0.80ICC <-0.20alpha <-0.05# Effect size, standardized as Cohen's hh_I_C <-ES.h(p1 = p_I, p2 = p_C)cat("Cohen's h for Intervention vs Control:", round(h_I_C, 3), "\n")
For arm i (0=control, 1=intervention) and cluster j: Y_ij ∼ Binomial(m_ij, p_ij), logit(p_ij) = β0 + β1_i + u_j , where u_j is a cluster random effect with mean 0 and variance σ^2_b
Binomial(m_ij, p_ij): Conditional on p_ij, we assume each of the m_ij individuals in that cluster are independent Bernoulli trials with probability p_ij. So Y_ij is a binomial draw with that probability, for the cluster-level.
A Bernoulli trial is a random event with two outcomes (success/failure), with the same, independent, probability of success every time.
Independence assumption (within-cluster): Whether one person gets the IGRA test doesn’t change the probability for another person in the same cluster (once p_ij is fixed). The correlation between people’s outcomes in the same cluster comes entirely from them sharing the same p_ij (via the physician)
=> Y_ij can be any integer from 0 to m_ij. E.g. if m_ij =42 and p_ij =0.50, then Y_ij is the total number of IGRA tests in that cluster, drawn from a binomial distribution with 42 trials and 50% success probability.
logit(p_ij) = β0 + β1_i + u_j:
Using the logit link maps probability p ∈ (0,1) to the whole real line, so we can model it as a linear predictor.
β0 is the baseline log-odds (the logit of the control probability for a typical cluster, i.e. when u_j = 0), representing the the marginal cluster-specific probability.
β1_i encodes the treatment effect and is a log-odds difference; exp(β1) is the conditional odds ratio comparing treatment vs control for the same cluster (holding u_j fixed).
u_j is the cluster random intercept (a cluster-level shift on the log-odds scale). It captures unobserved cluster-level factors (e.g. physician tendency) that move all individuals in the cluster up/down in log-odds. Typically, u_j has mean 0 and variance σ^2_b and is independent across clusters (see above). The random intercept does not change the conditional treatment effect, it only shifts the baseline log-odds for that whole cluster. In other words, the difference in log-odds between arms for the same cluster is always constant, but the actual probabilities shift up/down with u_j. For clusters with positive u_j both arms have higher probabilities; for negative u_j both are lower.
ICC on log-odds scale:
ICC = p = rho = σ^2_b / (σ^2_b+(π^2/3))
The ICC consists of individual-level variance (noise) and between-cluster variance (noise), in the sense of: between-cluster variance / total variance. The between-cluster variance approximates the cluster random effect variance (σ^2_b)
In logistic models with binary outcomes, the individual level variation is usually fixed at π^2/3 (3.29)
So, focusing on the cluster random effect variance (σ^2_b), we can derive it from the formula above as: σ_b = sigma_b = sqrt((ICC(π^2/3))/(1−ICC))
(If there’s additional within-site variation over time, i.e. baseline period or SW-CRT, we include σ^2_b_p, typically as a fraction of σ^2_b, e.g., half the site-level variance)
Cluster effect distributions:
While ICC is the proportion of the total variance (in the latent scale) that comes from between-cluster differences (“what fraction of the total variability is due to between-cluster differences”), the σ^2_b is an absolute variance (“How big the cluster intercept spread is in log-odds units” or “how much variation there is in prescription tendency across clusters”) and can have different shapes.
GLMM assumes normal distribution, but reality is often skewed. Simulate three scenarios including a realistic/skewed/conservative scenario and see if GLMM breaks (as in paper above):
b) Gamma (skewed): generate a_j ∼ Gamma(shape=2,scale=1), then set u_j = σ_b((a_j−2)/sqrt(2))
A shape parameter of 2 give a distribution with skew 1.4 and kurtosis 3, i.e., positive skew (some clusters much higher tendency than average)
c) Uniform: u_j ∼ Uniform(−sqrt(3)σ_b, sqrt(3)σ_b)
Skewness = 0 (perfectly symmetric), Kurtosis = −6/5 (lighter tails than normal), no extreme values, overall flat, all clusters are evenly spread; to test if GLMMs are sensitive to lack of tail weight, i.e., whether they rely on the normal distribution’s tails to stabilize estimates.
Cluster sizes m_ij:
Allow for varying cluster size, i.e. varying coefficient of variation (CV) of cluster sizes, using same approach as they did: They sampled cluster sizes so that m_ij = 2 + δ_ij, drawn from a Negative Binomial:
δ_ij ∼ NegBin(size = (m-2)^2/(s^2-(m-2)), p = m-2/s^2)
where s is the SD of cluster sizes (CV = s/m).
This yields a minimum cluster size of 3. (note: they note no.offails and prob.offail; but the above should represent the same).
δ is in a way the random component added to 2 to get the cluster size (of min 3).
(2.2) Create main functions and simulate one dataset
Code
# 1) compute sigma_b from ICC (on latent logit scale):icc_to_sigma <-function(rho){if(rho<=0) return(0) sigma_b <-sqrt( (rho * (pi^2/3)) / (1- rho) )return(sigma_b)}# 2) compute beta0 for given control prevalence p0p_to_beta0 <-function(p0){qlogis(p0)}# 3) given p0 and p1, compute OR on the cluster-specific log-odds scalep0_p1_to_OR <-function(p0, p1){ odds0 <- p0 / (1- p0) odds1 <- p1 / (1- p1) odds1 / odds0}# 4) generate random cluster-level u_j for the three distributionsgenerate_u <-function(n_clusters, sigma_b, dist =c("normal","gamma","uniform")){ dist <-match.arg(dist)if(sigma_b ==0) return(rep(0, n_clusters))if(dist =="normal"){return(rnorm(n_clusters, mean=0, sd = sigma_b)) } elseif(dist =="gamma"){# they used Gamma(shape=2, scale=1) then standardized to mean 0 and sd sigma_b a <-rgamma(n_clusters, shape=2, scale=1)# a has mean 2, var 2. Standardize: (a - 2)/sqrt(2) then scale to sigma_breturn(sigma_b * (a -2)/sqrt(2)) } elseif(dist =="uniform"){ cut <-sqrt(3) * sigma_breturn(runif(n_clusters, min =-cut, max = cut)) }}# 5) generate cluster sizes with target mean m and CV. Implementation follows their negative-binomial based approach and enforces minimum cluster size of 3.generate_cluster_sizes <-function(n_clusters, m, CV){if(CV ==0){return(rep(m, n_clusters)) } s <- CV * m# We want delta = m_j - 2 to follow NegBin with mean (m-2) and variance s^2 mu_delta <- m -2 var_delta <- s^2if(var_delta <= mu_delta){# Negative Binomial requires variance > mean. So, this is an impossible NB parameterization# If so, fall back to a discrete uniform around m low <-max(3, floor(m - s*1.5)) high <-ceiling(m + s*1.5) out <-pmax(3, round(runif(n_clusters, low, high)))return(out) } size_nb <- (mu_delta^2) / (var_delta - mu_delta) # see formula above prob_nb <- mu_delta / var_delta # see formula above# rnbinom in R uses size, prob; mean = size*(1-prob)/prob, but with this param it matches delta <-rnbinom(n_clusters, size = size_nb, prob = prob_nb) m_j <-2+ delta m_j[m_j <3] <-3# enforce min 3 (generating 2+delta ensures >=2, we bump to 3)return(m_j)}# Parameters for single simulated datasetn_clusters <-84m_mean <-18CV <-0.98p0 <-0.63p1 <-0.82OR <-p0_p1_to_OR(p0, p1) # compute ORrho <-0.20# ICCre_dist <-"gamma"# Simulateset.seed(20250809)sigma_b <-icc_to_sigma(rho)u_j <-generate_u(n_clusters, sigma_b, dist = re_dist)sizes <-generate_cluster_sizes(n_clusters, m_mean, CV)arm_assign <-sample(rep(0:1, length.out = n_clusters))beta0 <-p_to_beta0(p0)beta1 <-log(OR)y <-integer(n_clusters)for(j inseq_len(n_clusters)){ # iterate over each cluster# create the linear predictor (NOTE: beta1 turns 0 if arm0, and 1 * beta1 if arm1) linpred <- beta0 + beta1 * arm_assign[j] + u_j[j] # apply the inverse logit (logistic function) to convert log-odds to probability p_j <-plogis(linpred) # Simulate the number of successes in cluster j y[j] <-rbinom(1, size = sizes[j], prob = p_j) }df_sim <-data.frame(cluster =seq_len(n_clusters),arm = arm_assign,size = sizes,y = y)df_sim
mean_sizes <- df_sim %>%group_by(arm) %>%summarise(mean_size =mean(size))ggplot(df_sim, aes(x =factor(cluster), y = size, fill =factor(arm))) +geom_bar(stat ="identity", color ="black") +geom_hline(data = mean_sizes, aes(yintercept = mean_size, color =factor(arm)),linetype ="dashed", size =1, show.legend =FALSE) +geom_text(data = mean_sizes, aes(x =Inf, y = mean_size, label =paste0("Mean = ", round(mean_size, 1))),hjust =1.1, vjust =-0.5, color =c("skyblue4", "tomato3"), size =4) +scale_fill_manual(values =c("skyblue", "tomato"), labels =c("Control (arm=0)", "Intervention (arm=1)")) +scale_color_manual(values =c("skyblue4", "tomato3")) +labs(x ="Cluster", y ="Cluster Size", fill ="Treatment Group") +theme_minimal() +ggtitle("Cluster size per cluster") +theme(axis.text.x =element_text(angle =45, hjust =1))
size = number of individuals in a cluster
y = number of individual-level successes (binary=1) observed in the cluster, i.e., represents the number of individuals in that cluster who correctly received an IGRA test (no if low-risk, yes if high-risk)
(2.3) Simulate power, using cluster-level analysis approach
NOTES:
Use cluster-level analysis (unweighted t-test on log-odds, with 0.5 continuity correction, as per guidance according to Thompson & Leyrat & al -> “clan” command)
Use gamma distribution (most conservative), simulate 500-1000 trials
p0_vals <-seq(0.35, 0.55, by =0.05)p1_vals <-seq(0.50, 0.85, by =0.05)grid <-expand.grid(p0 = p0_vals, p1 = p1_vals)results <- grid %>%rowwise() %>%mutate(power =simulate_power(n_clusters =84,m_mean =18,CV =0.98,p0 = p0,p1 = p1,rho =0.2,re_dist ="gamma",n_sim =1000)) %>%ungroup()# Plotggplot(results, aes(x = p1, y = power, color =factor(p0))) +# Shaded region above 90% powergeom_rect(aes(xmin =-Inf, xmax =Inf, ymin =0.9, ymax =Inf),fill ="lightgrey", alpha =0.3, inherit.aes =FALSE) +# Power curvesgeom_line(size =1.2) +geom_point() +# Labels and scaleslabs(title ="Power curves by p0 and p1",x ="Intervention group probability (p1)",y ="Estimated power",color ="Control group probability (p0)") +scale_y_continuous(breaks =seq(0, 1, by =0.1),limits =c(0, 1),labels = scales::percent_format(accuracy =1)) +scale_x_continuous(breaks =seq(0, 1, by =0.05)) +theme_minimal(base_size =14)
(2.3.4) Vary ICC
Code
# Vector of ICC values to testicc_values <-seq(0.05, 0.45, by =0.02)# Run power simulations for each ICCpower_results <-sapply(icc_values, function(rho) {simulate_power(n_clusters =84,m_mean =18,CV =0.98,p0 =0.63,p1 =0.81,rho = rho,re_dist ="gamma",n_sim =1000,alpha =0.05,seed =20250809)})# Create data frame for plottingdf_power_icc <-data.frame(ICC = icc_values, Power = power_results)# Plotggplot(df_power_icc, aes(x = ICC, y = Power)) +geom_line(color ="darkred", size =1.2) +geom_point(color ="firebrick") +labs(title ="Power Curve by ICC",x ="Intraclass Correlation (ICC)",y ="Estimated Power") +scale_y_continuous(breaks =seq(0.70, 1, by =0.1),limits =c(0.70, 1),labels = scales::percent_format(accuracy =1)) +theme_minimal()
(2.3.5) Vary number of clusters
Code
# Vector of cluster counts to testn_clusters_vec <-seq(30, 110, by =1)# Run power simulations for each cluster countpower_results <-sapply(n_clusters_vec, function(nc) {simulate_power(n_clusters = nc,m_mean =18,CV =0.98,p0 =0.63,p1 =0.81,rho =0.20,re_dist ="gamma",n_sim =1000,alpha =0.05,seed =20250809)})# Create data frame for plottingdf_power_css <-data.frame(Cluster_ss = n_clusters_vec, Power = power_results)# Plotggplot(df_power_css, aes(x = Cluster_ss, y = Power)) +geom_line(color ="darkgreen", size =1.2) +geom_point(color ="forestgreen") +labs(title ="Power vs Number of total clusters",x ="Total number of clusters",y ="Estimated power") +scale_y_continuous(breaks =seq(0.50, 1, by =0.1),limits =c(0.50, 1),labels = scales::percent_format(accuracy =1)) +scale_x_continuous(breaks =seq(30, 110, by =5)) +theme_minimal()
(2.3.6) Vary number of individuals per cluster (mean cluster size)
Code
m_mean_vec <-seq(2, 40, by =2)# Run power simulations for each cluster countpower_results <-sapply(m_mean_vec, function(n) {simulate_power(n_clusters =84,m_mean = n,CV =0.98,p0 =0.63,p1 =0.81,rho =0.20,re_dist ="gamma",n_sim =1000,alpha =0.05,seed =20250809)})# Create data frame for plottingdf_power_iss <-data.frame(Individual_ss = m_mean_vec, Power = power_results)# Plotggplot(df_power_iss, aes(x = Individual_ss, y = Power)) +geom_line(color ="darkblue", size =1.2) +geom_point(color ="skyblue") +labs(title ="Power vs Average number of individuals per cluster",x ="Average number of individuals per cluster",y ="Estimated power") +scale_y_continuous(breaks =seq(0.70, 1, by =0.1),limits =c(0.70, 1),labels = scales::percent_format(accuracy =1)) +scale_x_continuous(breaks =seq(2, 40, by =2)) +theme_minimal()
(2.4) Simulate power, using GLMM analysis approach
NOTES:
As per trial protocol
84 clusters in total, 42 per arm => cluster number large enough to use normal GLMM (not with restricted pseudo-likelihood and reduced degree of freedom as per guidance according to Thompson & Leyrat & al)
Currently, the cluster size variation (CV = 0.98) is too high, estimation is imprecise. We need to work on refining the eligibility criteria for the clusters (physicians)
(2.4.3) Vary effect sizes
Code
# p0_vals <- seq(0.35, 0.55, by = 0.05)# p1_vals <- seq(0.50, 0.85, by = 0.05)# # grid_glmm <- expand.grid(p0 = p0_vals, p1 = p1_vals)# # # Use map2 to apply the function to each p0/p1 pair, bit more efficient# grid_glmm$power <- map2_dbl(grid_glmm$p0, grid_glmm$p1, ~ simulate_power_glmmPQL(# n_clusters = 110,# m_mean = 100,# CV = 0.4,# p0 = .x,# p1 = .y,# rho = 0.2,# re_dist = "gamma",# n_sim = 300 # reduced for speed# ))# # ggplot(grid_glmm, aes(x = p1, y = power, color = factor(p0))) +# geom_rect(aes(xmin = -Inf, xmax = Inf, ymin = 0.9, ymax = Inf),# fill = "lightgrey", alpha = 0.3, inherit.aes = FALSE) +# geom_line(size = 1.2) +# geom_point() +# labs(title = "Power Curves by p0 and p1 (GLMM)",# x = "Intervention group probability (p1)",# y = "Estimated power",# color = "Control group probability (p0)") +# scale_y_continuous(breaks = seq(0, 1, by = 0.1),# limits = c(0, 1),# labels = scales::percent_format(accuracy = 1)) +# theme_minimal(base_size = 14)
# n_clusters_vec <- seq(80, 120, by = 1)# # power_results_glmm <- sapply(n_clusters_vec, function(nc) {# simulate_power_glmmPQL(n_clusters = nc,# m_mean = 100,# CV = 0.40,# p0 = 0.48,# p1 = 0.64,# rho = 0.20,# re_dist = "gamma",# n_sim = 300, # reduced for speed# alpha = 0.05,# seed = 20250809)# })# # df_power_css_glmm <- data.frame(Cluster_ss = n_clusters_vec, Power = power_results_glmm)# # ggplot(df_power_css_glmm, aes(x = Cluster_ss, y = Power)) +# geom_line(color = "darkgreen", size = 1.2) +# geom_point(color = "forestgreen") +# labs(title = "Power vs Number of total clusters (GLMM)",# x = "Total number of clusters",# y = "Estimated power") +# scale_y_continuous(breaks = seq(0.70, 1, by = 0.1),# limits = c(0.70, 1),# labels = scales::percent_format(accuracy = 1)) +# scale_x_continuous(breaks = seq(22, 36, by = 1)) +# theme_minimal()
(3) Simulate the full dataset and implement the main analysis strategy
The main analysis strategy:
Generalized linear mixed model (GLMM) with restricted pseudo-likelihood estimation and small-sample correction for degrees of freedom (clusters minus cluster-level parameters), as suggested by Thompson and colleagues (could be dropped here with 50+ clusters per arm)
We will adjust the model for these a priori defined covariates (as fixed effects):
Cluster-level covariates (cluster mean): baseline testing rate (as continuous covariates assuming linearity) and site (7 levels)
Individual-level covariates:
Self-reported sex/gender (for now, as binary covariate: male, female)
Age (as continuous covariates assuming non-linear association modelled using restricted cubic splines with 3 knots at 10th, 50th and 90th percentiles of the observed age distribution)
Region of origin (xxx levels)
We will report the resulting treatment-effect (beta-1), which is the log-odds difference between intervention and control or – when exponentiated – the adjusted odds ratio, with its 95% confidence interval. This represents a relative cluster-specific effect, conditional on all included covariates.
In addition, we will use marginal standardization and report the resulting population-average marginal relative risk and risk difference with their 95% confidence intervals (is primarily and individual-average estimate, but equals a cluster-average estimate in case of no informative cluster size)
Notes on simulating a realistic dataset:
Reuse the helper functions from chapter 2.2, incl. conservative gamma distribution for u_j
Causal structure:
Cluster latent effect (u_j) influences baseline IGRA testing rate (through alpha, the correlation strength between baseline and u_j) and directly affects the outcome via the cluster random effect
Baseline IGRA testing rate directly affects the outcome (via beta_baseline), representing residual correlation beyond the shared cluster effect alpha
=> baseline_rate directly pushes the outcome (via beta_baseline) and shares correlation with u_j (i.e., indirectly pushes the outcome), in the sense of: “Clusters with higher IGRA testing rate at baseline -> higher individually risk-tailored testing rate at endline”
Treatment (arm 1) directly affects the outcome (through beta_1), but correlation above and noise below masking it
Add some baseline noise (e.g. tau = 0.45) ensuring that even clusters with the same u_j will show some variability in their observed baseline_rate
alpha (correlation baseline_rate with u_j): e.g. a value of 0.3 means 30% of the variation in the baseline logit is driven by u_j (i.e. drives true cluster tendency or the “cluster-to-cluster variation” at baseline, which also has an impact on the outcome), while the remaining comes from independent measurement noise.
Produce an individual-level dataset, not cluster-level only - as the real-life dataset will look like and in case we also want to add individual-level correlations
(3.1) Simulate one dataset and check some diagnostics
Code
## We use the helper functions from chapter 2.2:# icc_to_sigma# generate_u# generate_cluster_sizes# p_to_beta0# p0_p1_to_OR## General parametersset.seed(20250809)n_clusters <-84m_mean <-18CV <-0.98p0 <-0.63p1 <-0.81OR <-p0_p1_to_OR(p0, p1)rho <-0.20# ICC (on latent logit scale)re_dist <-"gamma"# distribution for u_j, keep it conservative# Individual-level covariatesage_mean <-35age_sd <-12sex_prob <-0.40## Generate cluster structuresizes <-generate_cluster_sizes(n_clusters, m_mean, CV)sigma_b <-icc_to_sigma(rho)u_j <-generate_u(n_clusters, sigma_b, dist = re_dist)arm_assign <-sample(rep(0:1, length.out = n_clusters))# First important thing to mimic: IGRA testing rate at baseline# alpha controls how much the baseline rate depends on the same latent cluster effect# The bigger alpha, the more high-baseline clusters will also tend to have high endline outcomes indirectly, because u_j is reused in the outcome model => indirect correlation# Baseline IGRA testing rate is explained by u_j + random noise eps + global average level gamma0.gamma0 <-qlogis(p0) # the average cluster-level log-odds of baseline IGRA testing ratealpha <-0.3# how much baseline (logit) depends on u_j (i.e. the latent cluster effect); 0 would be no correlationtau <-0.45# the residual variation (SD) in baseline log-odds not explained by u_j, i.e. the random measurement noiseeps <-rnorm(n_clusters, 0, tau)logit_b <- gamma0 + alpha * u_j + eps # putting it all togetherbaseline_rate <-plogis(logit_b) # map back to probability scale# Second, the sites: uncorrelated 7-levelsite <-sample(1:7, n_clusters, replace =TRUE,prob =c(0.05, 0.15, 0.20, 0.10, 0.15, 0.25, 0.10))## Fixed effects on outcome, direct correlations on outcomebeta0 <-p_to_beta0(p0) # interceptbeta1 <-log(OR) # intervention effectbeta_baseline <-0.5# how strongly the baseline rate predicts the endline outcome, independent of u_jbeta_site <-0.0# no correlationbeta0_adj <- beta0 -1.0# after including u_j, baseline_rate, etc., the overall mean outcome probability can drift because of the nonlinear logistic function. Stabilize.## Simulate individual-level dataind_list <-vector("list", length = n_clusters)for(j inseq_len(n_clusters)){ nj <- sizes[j] age_j <-rnorm(nj, mean = age_mean, sd = age_sd) # draw from normal sex_j <-rbinom(nj, 1, prob = sex_prob) # draw from bernoulli logit_baseline_j <-qlogis(baseline_rate[j]) # back to logit# the log-odds of IGRA testing rate for all individuals in cluster j (same cluster-level predictors for all) linpred_j <- beta0 + beta1 * arm_assign[j] + beta_baseline * logit_baseline_j + beta_site * site[j] + u_j[j] # latent cluster random effect p_ij <-plogis(linpred_j) # Predicted probability of receiving an IGRA testing for each individual in cluster j. Since all individuals in a cluster share the same cluster-level covariates, p_ij is identical for everyone in the cluster (unless we later include individual-level predictors...) y_ij <-rbinom(nj, 1, p_ij) # the outcome; bernoulli with probability p_ij# save data for this one cluster ind_list[[j]] <-data.frame(cluster = j,arm = arm_assign[j],age = age_j,sex = sex_j,site = site[j],baseline_rate = baseline_rate[j],u_j = u_j[j],p = p_ij,y = y_ij )}df_ind <-do.call(rbind, ind_list)## Cluster-level summary, aggregate at cluster-leveldf_cluster <-aggregate(y ~ cluster + arm, data = df_ind, sum) # aggregate number of outcomesdf_cluster$size <-aggregate(y ~ cluster, data = df_ind, length)$y # count number of ind => cluster sizecluster_meta <-data.frame(cluster =seq_len(n_clusters),arm = arm_assign,site = site,baseline_rate = baseline_rate,u_j = u_j)df_sim <-merge(df_cluster, cluster_meta, by =c("cluster","arm"))df_sim <- df_sim[order(df_sim$cluster),c("cluster","arm","size","y","baseline_rate","site","u_j")]## Diagnosticscat("Mean baseline_rate =", round(mean(baseline_rate),3), "\n")
(3.2) The primary analysis approach (GLMM, fully adjusted)
Code
## Precompute spline basis for age and convert to numericage_spline <-as.data.frame(ns(df_ind$age, knots =quantile(df_ind$age, probs=c(0.1,0.5,0.9))))colnames(age_spline) <-paste0("age_spline", seq_len(ncol(age_spline)))age_spline[] <-lapply(age_spline, as.numeric)df_ind <-cbind(df_ind, age_spline)## Ensure factor levelsdf_ind$arm <-factor(df_ind$arm, levels =c(0,1)) # 0 = control, 1 = interventiondf_ind$sex <-factor(df_ind$sex, levels =c(0,1)) # 0 = male, 1 = femaledf_ind$site <-factor(df_ind$site, levels =c(1,2,3,4,5,6,7))## Fit GLMM using glmer (fully adjusted, age spline) spline_cols <-colnames(df_ind)[grepl("^age_spline", colnames(df_ind))]form_primary <-as.formula(paste("y ~ arm + baseline_rate + site + sex +",paste(spline_cols, collapse =" + "),"+ (1 | cluster)"))model_primary <- lme4::glmer(formula = form_primary,family =binomial(link ="logit"),data = df_ind,control = lme4::glmerControl(optimizer ="bobyqa") )### Now, let's make a few comparisons## 1. Unadjusted ORform_unadj <-as.formula(paste("y ~ arm + (1 | cluster)"))model_unadj <- lme4::glmer(formula = form_unadj,family =binomial(link ="logit"),data = df_ind,control = lme4::glmerControl(optimizer ="bobyqa") )coef_name_unadj <-grep("^arm", names(fixef(model_unadj)), value=TRUE)coef_arm_unadj <-fixef(model_unadj)[coef_name_unadj]se_arm_unadj <-summary(model_unadj)$coefficients[coef_name_unadj,"Std. Error"]p_val_unadj <-summary(model_unadj)$coefficients[coef_name_unadj,"Pr(>|z|)"]OR_unadj <-exp(coef_arm_unadj)CI_unadj <-exp(confint(model_unadj, method="Wald")[coef_name_unadj, ])## 2. Adjusted for stratification variables onlyform_strata <-as.formula(paste("y ~ arm + baseline_rate + site + (1 | cluster)"))model_strata <- lme4::glmer(formula = form_strata,family =binomial(link ="logit"),data = df_ind,control = lme4::glmerControl(optimizer ="bobyqa") )coef_name_strata <-grep("^arm", names(fixef(model_strata)), value=TRUE)coef_arm_strata <-fixef(model_strata)[coef_name_strata]se_arm_strata <-summary(model_strata)$coefficients[coef_name_strata,"Std. Error"]p_val_strata <-summary(model_strata)$coefficients[coef_name_strata,"Pr(>|z|)"]OR_strata <-exp(coef_arm_strata)CI_strata <-exp(confint(model_strata, method="Wald")[coef_name_strata, ])## 3. Fully adjusted, age as spline (see main model above)coef_name_full <-grep("^arm", names(fixef(model_primary)), value=TRUE)coef_arm_full <-fixef(model_primary)[coef_name_full]se_arm_full <-summary(model_primary)$coefficients[coef_name_full,"Std. Error"]p_val_full <-summary(model_primary)$coefficients[coef_name_full,"Pr(>|z|)"]OR_full <-exp(coef_arm_full)CI_full <-exp(confint(model_primary, method="Wald")[coef_name_full, ])## 4. And finally, calculate RR for the main model, using marginal standardizationRR_model <-tryCatch({avg_comparisons(model_primary, variables="arm", type="response", comparison="ratio", re.form =NA)}, error=function(e) NULL)if(!is.null(RR_model)){ rr <- RR_model$estimate[1] rr_cl <- RR_model$conf.low[1] rr_ch <- RR_model$conf.high[1]} else { rr <- rr_cl <- rr_ch <-NA_real_}## Combine it all into a tableresults_table <-data.frame(Metric =c("Unadjusted", "Adjusted for strat only", "Fully adjusted; age spline"),OR =c(sprintf("%.3f", OR_unadj),sprintf("%.3f", OR_strata),sprintf("%.3f", OR_full)),CI_lower =c(sprintf("%.3f", CI_unadj[1]),sprintf("%.3f", CI_strata[1]),sprintf("%.3f", CI_full[1])),CI_upper =c(sprintf("%.3f", CI_unadj[2]),sprintf("%.3f", CI_strata[2]),sprintf("%.3f", CI_full[2])),wald_based_p_value =c(sprintf("%.3f", p_val_unadj), sprintf("%.3f", p_val_strata), sprintf("%.3f", p_val_full)),RR =c(NA, NA, sprintf("%.3f", rr)),RR_CI_lower =c(NA, NA, sprintf("%.3f", rr_cl)),RR_CI_upper =c(NA, NA, sprintf("%.3f", rr_ch)))results_table %>%kable("html", caption="Intervention effect: OR and RR with 95% CI (single simulation)") %>%kable_styling(bootstrap_options="striped", full_width=FALSE)
Intervention effect: OR and RR with 95% CI (single simulation)
Metric
OR
CI_lower
CI_upper
wald_based_p_value
RR
RR_CI_lower
RR_CI_upper
Unadjusted
1.853
1.188
2.888
0.007
NA
NA
NA
Adjusted for strat only
2.266
1.615
3.179
0.000
NA
NA
NA
Fully adjusted; age spline
2.281
1.627
3.199
0.000
1.223
1.121
1.325
CAVE: This is 1 randomly simulated dataset.
Also: Add third individual-level covariate (region of origin)
Due to correlation structure the adjustment for baseline outcome rate (part of the stratification factors) increases power and precision. The further adjustment for individual-level covariates does not change much, since there is no simulated correlation at that level.
RR only constructed for primary model (fully adjusted model)
(3.3) Put all together and simulate the power
1000 simulations, based on dataset simulation (Chapter 3.1) and primary analysis model (Chapter 3.2)
cat("Estimated power (fully adjusted) =", round(power_adj,4), "\n")
Estimated power (fully adjusted) = 0.99
Code
# Summary of ORssummary(results[,c("OR_unadj","OR_unadj_lower","OR_unadj_upper","OR_adj","OR_adj_lower","OR_adj_upper")])
OR_unadj OR_unadj_lower OR_unadj_upper OR_adj
Min. :1.421 Min. :0.8232 Min. :2.360 Min. :1.480
1st Qu.:2.075 1st Qu.:1.2656 1st Qu.:3.387 1st Qu.:2.061
Median :2.354 Median :1.4809 Median :3.733 Median :2.449
Mean :2.460 Mean :1.5329 Mean :3.957 Mean :2.557
3rd Qu.:2.820 3rd Qu.:1.7779 3rd Qu.:4.479 3rd Qu.:2.799
Max. :4.223 Max. :2.4439 Max. :7.297 Max. :5.040
OR_adj_lower OR_adj_upper
Min. :0.9115 Min. :2.310
1st Qu.:1.3575 1st Qu.:3.129
Median :1.6095 Median :3.685
Mean :1.6821 Mean :3.896
3rd Qu.:1.8696 3rd Qu.:4.197
Max. :3.1847 Max. :7.976