Contamination adjusted intention-to-treat analytical approach for RCTs

Understanding intention-to-treat analysis

Randomised controlled trials (RCTs) are often referred to as the “gold standard” for evaluating the impact of an intervention. However, RCTs sometimes suffer from non-compliance.

Intention-to-treat (ITT) is a method of analysis used in RCTs in which all participants are analysed in the group they were originally assigned to, regardless of whether they complied with the treatment. It differs from per-protocol analysis, in which non-compliers are removed from the analysis; this shrinks the sample and therefore reduces statistical power.

However, participants sometimes cross over from the control group to the treatment group, which leads to contamination. A standard ITT analysis ignores contamination altogether, which tends to dilute the estimated effect. In such cases, contamination-adjusted intention-to-treat (CA ITT) analysis can be used. CA ITT adjusts the treatment effect on an outcome for the percentage of participants in each group who actually received the treatment, thereby providing a more accurate assessment of treatment efficacy in the presence of contamination.

In a nutshell, CA ITT uses assignment to treatment as an instrumental variable. In the first stage, the variable that indicates receipt of treatment is regressed on the assignment variable. In the second stage, the outcome is regressed on the receipt of treatment predicted by the first stage. This two-stage regression is fitted with ivreg in R.
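The two stages can be sketched by hand in base R. This is a minimal illustration on simulated data, not the study data: the sample size, the 20% crossover rate, the assumed true effect of 1.5 and all variable names are assumptions made for the example.

```r
# Simulated data for illustration only (not the study data).
set.seed(42)
n <- 2000
assignment <- rbinom(n, 1, 0.5)                      # randomised group
# One-sided contamination: everyone assigned to treatment is treated,
# and roughly 20% of controls cross over.
treated <- ifelse(assignment == 1, 1, rbinom(n, 1, 0.2))
outcome <- 4.5 + 1.5 * treated + rnorm(n, sd = 2)     # assumed true effect = 1.5
sim <- data.frame(assignment, treated, outcome)

# Stage 1: regress treatment received on random assignment.
stage1 <- lm(treated ~ assignment, data = sim)
sim$treated_hat <- fitted(stage1)

# Stage 2: regress the outcome on the predicted treatment.
stage2 <- lm(outcome ~ treated_hat, data = sim)
two_stage_est <- unname(coef(stage2)["treated_hat"])

# Equivalent ratio form with a single binary instrument:
# ITT effect divided by the difference in treatment uptake.
itt <- unname(coef(lm(outcome ~ assignment, data = sim))["assignment"])
uptake_diff <- unname(coef(stage1)["assignment"])
wald_est <- itt / uptake_diff

print(c(two_stage = two_stage_est, wald = wald_est))
```

The point estimate from the manual second stage matches the ratio form, but the standard errors printed by the second `lm()` are not valid because they ignore the uncertainty in the first stage. A dedicated two-stage routine such as ivreg corrects for this, which is why it is preferred in practice.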

Research study

An organisation developed an intervention to tackle nutritional challenges faced by families with pregnant mothers in rural communities of Kenya. The intervention’s goal was to enhance dietary diversity, with improvement in the Household Dietary Diversity Score (HDDS) as the primary outcome.

To enable the organisation to measure the impact, the intervention was implemented in randomly selected clusters, while other clusters served as control areas. At endline, a cluster randomised controlled trial (cRCT) study was conducted, with a key focus on the HDDS outcome. The analysis, however, showed that some participants from control areas had crossed over and received the intervention. What was the effect of the programme?

library(haven)       # read_dta() for Stata files
library(knitr)       # kable()
library(kableExtra)  # kable_styling()
library(ivreg)       # ivreg() for two-stage least squares
library(DT)
library(sjPlot)      # tab_xtab()
library(expss)       # tab_cells(), tab_cols(), tab_stat_fun()

mydata <- read_dta("C:/Users/Dell/OneDrive - Triggerise/Statistical/2sls/mydata.dta")
# Preview the first 10 rows as a styled HTML table
head(mydata, 10) %>%
  kable("html") %>%
  kable_styling(font_size = 12,
                bootstrap_options = c("striped", "hover", "condensed", "responsive"))

participant	assignment	treated	HDDS
1	0	0	2
2	1	1	3
3	0	0	3
4	1	1	2
5	1	1	3
6	1	1	4
7	1	1	3
8	1	1	2
9	0	0	3
10	1	1	3

Proportion contaminated

The table below shows the number of participants who did or did not receive the treatment, by assigned group. Of the 992 respondents assigned to control areas, 213 (21.5%) had received the intervention. All 1,176 respondents assigned to treatment areas received it.

sjPlot::tab_xtab(var.row = mydata$assignment, var.col = mydata$treated, title = "Compliance to treatment", show.row.prc = TRUE)

Compliance to treatment
assignment	treated: No	treated: Yes	Total
Control	779 (78.5 %)	213 (21.5 %)	992 (100 %)
Treatment	0 (0 %)	1176 (100 %)	1176 (100 %)
Total	779 (35.9 %)	1389 (64.1 %)	2168 (100 %)
χ²=1438.009 · df=1 · φ=0.815 · p=0.000
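The contamination rate, and the difference in treatment uptake between the two groups, can be recomputed from the counts in the cross-tabulation. The sketch below re-enters those four counts by hand in base R; the object names are illustrative and are not part of the original analysis.

```r
# Counts copied from the cross-tabulation above.
xtab <- matrix(c(779, 213,
                 0, 1176),
               nrow = 2, byrow = TRUE,
               dimnames = list(assignment = c("Control", "Treatment"),
                               treated = c("No", "Yes")))

# Row proportions: share of each assigned group that received treatment.
row_props <- prop.table(xtab, margin = 1)

# Crossover among participants assigned to control.
contamination_rate <- row_props["Control", "Yes"]

# Difference in uptake between groups (the first-stage effect of assignment).
uptake_diff <- row_props["Treatment", "Yes"] - contamination_rate

print(round(100 * row_props, 1))
print(round(c(contamination = contamination_rate,
              uptake_difference = uptake_diff), 3))
```

The uptake difference is the quantity by which CA ITT scales the ITT effect, so the larger the contamination, the larger the adjustment.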

Descriptive analysis

At endline, the average Household Dietary Diversity Score (HDDS) was 5.4 across all households. The treatment group had a higher mean HDDS than the control group (5.9 vs 4.8).

mydata %>% 
    tab_cells(HDDS) %>%
    tab_cols(total(label = "#Total| |"), assignment) %>% 
    tab_stat_fun(Mean = w_mean, "Std. dev." = w_sd, "Valid N" = w_n, method = list) %>%
    tab_pivot()

	#Total			assignment
				Control			Treatment
	Mean	Std. dev.	Valid N	Mean	Std. dev.	Valid N	Mean	Std. dev.	Valid N
Dietary Diversity	5.4	2.5	2168	4.8	2.3	992	5.9	2.5	1176

CA ITT model (impact)

After adjusting for contamination, the analysis indicates that receiving the intervention increased the Household Dietary Diversity Score (HDDS) by 1.51 units, a statistically significant effect (p-value < 0.0001).

# Formula: outcome ~ treatment received | instrument (random assignment)
iv_model <- ivreg(HDDS ~ treated | assignment, data = mydata)
summary(iv_model)

## 
## Call:
## ivreg(formula = HDDS ~ treated | assignment, data = mydata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -4.94388 -1.94388  0.05612  1.56874  6.56874 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.43126    0.09757   45.41   <2e-16 ***
## treated      1.51261    0.12998   11.64   <2e-16 ***
## 
## Diagnostic tests:
##                   df1  df2 statistic p-value    
## Weak instruments    1 2166  4296.990  <2e-16 ***
## Wu-Hausman          1 2165     0.026   0.872    
## Sargan              0   NA        NA      NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.368 on 2166 degrees of freedom
## Multiple R-Squared: 0.0847,  Adjusted R-squared: 0.08427 
## Wald test: 135.4 on 1 and 2166 DF,  p-value: < 2.2e-16
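With a single binary instrument, the coefficient on treated can be cross-checked by hand as the ITT difference in mean HDDS divided by the difference in treatment uptake. The worked arithmetic below uses only figures reported above. The symbols are notation introduced here for illustration: Z is assignment, D is treatment received and Y is HDDS.

```latex
\[
\hat{\beta}_{\text{CA ITT}}
  = \frac{\bar{Y}_{Z=1} - \bar{Y}_{Z=0}}{\bar{D}_{Z=1} - \bar{D}_{Z=0}},
\qquad
\bar{D}_{Z=1} - \bar{D}_{Z=0}
  = \frac{1176}{1176} - \frac{213}{992}
  \approx 1 - 0.215 = 0.785
\]

% Using the group means as rounded in the descriptive table:
\[
\frac{5.9 - 4.8}{0.785} \approx 1.40
\]

% Working backwards from the fitted coefficient instead:
\[
1.51261 \times 0.785 \approx 1.19
\]
```

The gap between 1.40 and the fitted 1.51 comes from rounding the group means to one decimal place. The fitted coefficient implies an unrounded ITT difference of about 1.19, which is consistent with group means that round to 5.9 and 4.8.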