Background

The PROSPECT Study is a three-arm pragmatic individually-randomised trial to investigate the effect of optimised TB/HIV screening and linkage on case detection, treatment initiation and mortality.


Figure 1: Trial Design


Research Assistants stationed at the clinic registration desk will systematically record the age, sex and presence of TB symptoms (any of cough, fever, weight loss, night sweats) of all adult clinic attenders, and screen for study eligibility.

Following completion of informed consent procedures, participants will be randomly allocated by computer-generated code to one of three trial groups:

  1. Standard of care clinician-directed HIV screening/linkage, and TB screening (smear) (conducted by routine facility clinicians in standard clinical areas), or
  2. Universal HIV testing/linkage (conducted by research staff in a separate study area), and clinician-directed TB screening (smear), or
  3. Universal HIV testing/linkage, and optimised TB screening (chest x-ray, and if classified as “any abnormality”, Xpert MTB/Rif ULTRA, culture and smear) (conducted by research staff in a separate study area)


The primary trial outcome will compare between groups the proportion of randomised participants with microbiologically-confirmed undiagnosed/untreated pulmonary TB at 8 weeks.


The secondary outcomes will compare between groups:


We wish to make pairwise comparisons between groups, i.e. we will compare the prevalence of undiagnosed/untreated pulmonary TB at 8-week assessment between:

  1. Group 2 vs. Group 1
  2. Group 3 vs. Group 2
  3. Group 3 vs. Group 1


For analysis of the primary outcome, we will compare between groups the proportion of randomised participants with untreated microbiologically-confirmed TB (defined as either sputum smear positive, or sputum Xpert Ultra positive, or sputum TB culture positive for M. tuberculosis) at 8-week assessment.

As the trial has sufficient power for k=3 pairwise comparisons, we will report risk ratios and 95% confidence intervals for each pair of groups (i.e. Group 2 vs. Group 1; Group 3 vs. Group 2; and Group 3 vs. Group 1), and will define a statistically significant relative difference as p<0.05. We will additionally estimate absolute risk differences between each pair of groups, with 95% confidence intervals.
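As a minimal sketch of how one such pairwise comparison could be computed in base R, the example below uses Wald-type intervals. The counts are hypothetical and for illustration only, and the choice of interval method is our assumption rather than part of the analysis plan.

```r
# Hypothetical counts for illustration only (not trial data)
x1 <- 160; n1 <- 1000   # events / participants in the reference group
x2 <- 120; n2 <- 1000   # events / participants in the comparison group

p1 <- x1 / n1
p2 <- x2 / n2
z  <- qnorm(0.975)      # critical value for a 95% confidence interval

# Risk ratio with a Wald interval calculated on the log scale
risk_ratio <- p2 / p1
se_log_rr  <- sqrt(1 / x2 - 1 / n2 + 1 / x1 - 1 / n1)
rr_ci      <- exp(log(risk_ratio) + c(-1, 1) * z * se_log_rr)

# Absolute risk difference with a Wald interval
risk_diff <- p2 - p1
se_rd     <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
rd_ci     <- risk_diff + c(-1, 1) * z * se_rd

# Two-sided p-value for the difference in proportions
p_value <- prop.test(c(x2, x1), c(n2, n1), correct = FALSE)$p.value

risk_ratio; rr_ci
risk_diff;  rd_ci
p_value
```

The same calculation would be repeated for each of the three pairs of groups.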


Set up

setwd("~/Dropbox/Projects/PROSPECT Study 20160610/Documentation/Samplesize/20161019_v3")
library(tidyverse)
library(DT)
library(plotly)
sessionInfo()
## R version 3.3.1 (2016-06-21)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.12 (Sierra)
## 
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] plotly_4.5.6    DT_0.2          dplyr_0.5.0     purrr_0.2.2    
## [5] readr_1.0.0     tidyr_0.6.0     tibble_1.2      ggplot2_2.2.0  
## [9] tidyverse_1.0.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.8       knitr_1.15.1      magrittr_1.5     
##  [4] munsell_0.4.3     viridisLite_0.1.3 colorspace_1.3-1 
##  [7] R6_2.2.0          httr_1.2.1        stringr_1.1.0    
## [10] plyr_1.8.4        tools_3.3.1       grid_3.3.1       
## [13] gtable_0.2.0      DBI_0.5-1         htmltools_0.3.5  
## [16] yaml_2.1.14       lazyeval_0.2.0    rprojroot_1.1    
## [19] digest_0.6.10     assertthat_0.1    base64enc_0.1-3  
## [22] htmlwidgets_0.8   evaluate_0.10     rmarkdown_1.2    
## [25] stringi_1.1.2     scales_0.4.1      backports_1.0.4  
## [28] jsonlite_1.1
Sys.time()
## [1] "2016-12-02 08:53:09 GMT"

Sample size estimates

We set up some parameters for sample size estimations:


We additionally assume that:

pA=seq(0.14,0.18,0.01) #Vary pA around 16% best estimate
rr=seq(0.700,0.900,0.0025)
tau=3
alpha=0.05
beta=0.20

#expand the grid for all possible combinations
df<-expand.grid(beta=beta,
                 alpha=alpha,
                 pA=pA,
                 rr=rr)

#now mutate to add columns for the sample size required per group
df <- df %>%
  mutate(pB=pA*rr) %>%
  mutate(n=(pA*(1-pA)+pB*(1-pB))*((qnorm(1-alpha/2/tau)+qnorm(1-beta))/(pA-pB))^2) %>%
  mutate(n=ceiling(n)) %>%
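  mutate(abs.Risk.Reduct=round(pA-pB,2)) %>%
  #use the unrounded difference for the number needed to screen
  #(the round(…,6) only guards against floating-point noise before ceiling)
  mutate(Num.needed.screen=ceiling(round(1/(pA-pB),6)))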
df$pA<-as.factor(df$pA)


#make a table of possible scenarios
#(Note, n is number required per group)
rrs<-seq(0.70,0.90,0.05)
df_p<- df %>%
  filter(round(rr,4) %in% round(rrs,4)) #round both sides: exact %in% matching on seq() doubles can miss values

#table of scenarios
datatable(df_p,
          extensions='Scroller',
          options = list(
            scrollY=200,
            scroller=TRUE))
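Written out, the per-group sample size computed in the `mutate()` step above corresponds to the normal-approximation formula for comparing two proportions. Here $p_A$ is the proportion in the comparison group, $p_B = p_A \times RR$ is the proportion in the intervention group, and $z_q$ is the $q$-th quantile of the standard normal distribution. Dividing $\alpha$ by $\tau = 3$ reads as a Bonferroni-type adjustment for the three pairwise comparisons; that interpretation is ours, as the code does not name it.

$$
n = \left[\,p_A(1-p_A) + p_B(1-p_B)\,\right]\left(\frac{z_{1-\alpha/(2\tau)} + z_{1-\beta}}{p_A - p_B}\right)^{2}, \qquad p_B = p_A \times RR
$$

The result is rounded up to the next whole participant.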


Figure 2: Sample size estimates (Group 2 vs. Group 1)

rrs_x<-seq(0.70,0.90,0.01)
df_x <- df %>%
  filter(round(rr,4) %in% round(rrs_x,4)) #round both sides: exact %in% matching on seq() doubles can miss values


df_x %>% plot_ly(x=~rr, y=~n, color=~pA) %>%
  layout(yaxis=list(title="Number per group"),
         xaxis=list(title="Relative risk"))


Therefore, assuming that 16% of Group 1 participants have undiagnosed/untreated PTB, a sample size of 1571 per group gives 80% power to detect a 25% relative reduction (to 12%), comparing Group 2 with Group 1.
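This figure can be checked directly for the single scenario of interest, without building the full grid. The sketch below wraps the same formula in a small helper; the name `n_per_group` is illustrative and not part of the original analysis code.

```r
# Per-group sample size for comparing two proportions (normal approximation),
# with alpha divided by tau exactly as in the grid calculation.
# n_per_group is an illustrative helper name, not from the original code.
n_per_group <- function(pA, rr, alpha = 0.05, beta = 0.20, tau = 3) {
  pB <- pA * rr                                       # proportion in the intervention group
  z  <- qnorm(1 - alpha / 2 / tau) + qnorm(1 - beta)  # combined critical values
  ceiling((pA * (1 - pA) + pB * (1 - pB)) * (z / (pA - pB))^2)
}

n_per_group(pA = 0.16, rr = 0.75)  # Group 2 vs. Group 1
```

Varying `pA` or `rr` in the call shows how sensitive the requirement is to the assumed prevalence and effect size.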


Now consider the sample size required for the comparison between Group 3 and Group 2.

Assuming a 25% relative reduction in Group 2 compared with Group 1, the estimated proportion of participants in Group 2 with undiagnosed/untreated PTB will be 12% (0.16 × 0.75).

pA=seq(0.10,0.14,0.01) #Vary pA around 12% best estimate
rr=seq(0.600,0.850,0.0025)
tau=3
alpha=0.05
beta=0.20

#expand the grid for all possible combinations
df2<-expand.grid(beta=beta,
                 alpha=alpha,
                 pA=pA,
                 rr=rr)

#now mutate to add columns for the sample size required per group
df2 <- df2 %>%
  mutate(pB=pA*rr) %>%
  mutate(n=(pA*(1-pA)+pB*(1-pB))*((qnorm(1-alpha/2/tau)+qnorm(1-beta))/(pA-pB))^2) %>%
  mutate(n=ceiling(n)) %>%
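  mutate(abs.Risk.Reduct=round(pA-pB,2)) %>%
  #use the unrounded difference for the number needed to screen
  #(the round(…,6) only guards against floating-point noise before ceiling)
  mutate(Num.needed.screen=ceiling(round(1/(pA-pB),6)))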
df2$pA<-as.factor(df2$pA)


#make a table of possible scenarios
#(Note, n is number required per group)
rrs2<-seq(0.60,0.85,0.05)
df2_p<- df2 %>%
  filter(round(rr,4) %in% round(rrs2,4)) #round both sides: exact %in% matching on seq() doubles can miss values

#table of scenarios
datatable(df2_p,          
          extensions='Scroller',
          options = list(
            scrollY=200,
            scroller=TRUE))


Figure 3: Sample size estimates (Group 3 vs. Group 2)

df2_p %>% plot_ly(x=~rr, y=~n, color=~pA) %>%
  layout(yaxis=list(title="Number per group"),
         xaxis=list(title="Relative risk"))


Therefore, a sample size of 1475 per group gives 80% power to detect a 30% relative reduction in undiagnosed/untreated PTB, comparing Group 3 with Group 2.

This additionally means that the study will have 80% power to detect a 47.5% relative reduction when comparing Group 3 with Group 1.
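The 47.5% figure follows from multiplying the two assumed relative risks (0.75 × 0.70 = 0.525). The sketch below shows that arithmetic and, by rearranging the sample size formula, gives an approximate power for the Group 3 vs. Group 1 comparison at 1571 per group. The helper name `power_two_prop` is illustrative, and the calculation rests on the same normal approximation and alpha/tau adjustment as the code above.

```r
rr_21 <- 0.75            # assumed relative risk, Group 2 vs. Group 1
rr_32 <- 0.70            # assumed relative risk, Group 3 vs. Group 2
rr_31 <- rr_21 * rr_32   # implied relative risk, Group 3 vs. Group 1
1 - rr_31                # implied relative reduction

# Approximate power for a two-group comparison of proportions,
# obtained by solving the sample size formula for the power term.
# power_two_prop is an illustrative helper name.
power_two_prop <- function(pA, pB, n, alpha = 0.05, tau = 3) {
  se <- sqrt((pA * (1 - pA) + pB * (1 - pB)) / n)
  pnorm(abs(pA - pB) / se - qnorm(1 - alpha / 2 / tau))
}

power_31 <- power_two_prop(pA = 0.16, pB = 0.16 * rr_31, n = 1571)
power_31
```

Because the Group 3 vs. Group 1 difference is the largest of the three, this comparison is the least demanding in terms of sample size.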

The larger of the two estimates (1571 per group) determines the requirement. Inflating by 5% to account for loss to follow-up, the sample size required is therefore 1650 per group, or 4950 overall.
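The arithmetic behind these totals can be laid out as below. This is a sketch that assumes the 5% inflation is applied multiplicatively to the larger of the two per-group estimates, which is what reproduces the figures quoted.

```r
n_21 <- 1571   # per group, Group 2 vs. Group 1
n_32 <- 1475   # per group, Group 3 vs. Group 2

n_required <- max(n_21, n_32)             # the larger estimate drives the design
n_inflated <- ceiling(n_required * 1.05)  # allow 5% for loss to follow-up
n_total    <- 3 * n_inflated              # three trial groups

c(per_group = n_inflated, overall = n_total)
```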


Feasibility

Our previous work shows that approximately 5836 adults per month attend Ndirande Health Centre, 10% of whom have TB symptoms.

Therefore, assuming that 70% of these meet eligibility criteria and agree to participate, it would take approximately:

ceiling(4950/(5836*0.1*0.70))
## [1] 13

13 months to recruit the required sample size.
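Recruitment time depends heavily on the assumed 70% eligibility and consent rate. As a rough sensitivity check, the sketch below repeats the calculation over a range of values; the range itself is hypothetical and chosen only for illustration.

```r
attend_per_month <- 5836   # adult attenders per month
prop_symptomatic <- 0.10   # proportion with TB symptoms
n_total          <- 4950   # overall sample size required

# Hypothetical range around the 70% eligibility/consent assumption
uptake <- seq(0.50, 0.90, 0.10)

months <- ceiling(n_total / (attend_per_month * prop_symptomatic * uptake))
data.frame(uptake, months)
```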