Responder_Fallacy

Stephen senn has written an excellent article about hype of precision medicine and how the responder-fallacy has led to false dawn of precisionomics. In a highly instructive article - he shows how use of term “response” is often abused and can mislead us. We will try to simulate this example in R

Let us consider a counter-factual experiment in which 1000 patients with mean FEV1 ( a measure of pulmonary function ) of 2.3 L and standard deviation of 0.26 L are given an inhaler which increases each value on average by 0.2 L with sd of 0.03L

pre = rnorm(1000,2.3,0.26)
post= pre+ rnorm(1000,0.5,0.03)

Let’s look at histogram of FEV1 at baseline

library(tidyverse)

## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr

## Conflicts with tidy packages ----------------------------------------------

## filter(): dplyr, stats
## lag():    dplyr, stats

library(gridExtra)

## 
## Attaching package: 'gridExtra'

## The following object is masked from 'package:dplyr':
## 
##     combine

inhale = data.frame(pre,post)

inhale %>% ggplot(aes(pre))+geom_histogram(fill="blue")+ geom_vline(xintercept = mean(pre))

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Let’s look at histogram of FEV1 in post(2nd reading)

inhale %>% ggplot(aes(post))+geom_histogram(fill="blue")+ geom_vline(xintercept = mean(pre))

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Now let’s sample 24 patients from this population

data_frame(placebo= pre[1:24], active= post[1:24]) %>% 
  mutate(id=row_number()) %>%  # add row_number as id
  gather(group,FEV1,-id) %>%  # note - sign to keep out id, key value neednt be explicit
  
  ggplot( aes(x=as.factor(id), y=FEV1)) + geom_point(aes(color=group)) + geom_line()+ labs( title= "Counter Factual trial",
                                                                                                             x = "Patient",
                                                                                                             y = "FEV1")+ coord_flip()

Here we can see individual response. Now of course such a counterfactual trial is not possible , we have parallel group trial where each member belongs to either placebo or active treatment group

data_frame(placebo= pre[1:24], active= post[1:24]) %>% 
  mutate(id=row_number()) %>%  # add row_number as id
  gather(group,FEV1,-id) %>%  # note - sign to keep out id, key value neednt be explicit
  
  filter((group=="placebo"&id%%2==0)|(group=="active"&id%%2!=0)) %>% 
  ggplot( aes(x=as.factor(id), y=FEV1)) + geom_point(aes(color=group)) +  labs( title= "parallel trial",
                                                                                                             x = "Patient",
                                                                                                             y = "FEV1")+ coord_flip()

Thus in this real world parallel trial it is very difficult to determine response as was seen in counterfactual trial

So How do we determine response by crossover trials ..

Let’s consider a crossover trial with first crossover with active-placebo difference of 0.5 with sd of 0.2 L .

Now suppose we arbitrarily classify responder as those having more than 0.3 L increase in FEV1

set.seed(7)
crossover1 = rnorm(1000,0.5,0.2)

kil = ecdf(crossover1)(0.3)*100
data_frame(crossover1 = rnorm(1000,0.5,0.2)) %>% mutate(x=if_else(crossover1<0.3,"non-responder","responder")) %>% 
  ggplot(aes(crossover1,fill=factor(x)))+geom_histogram()+geom_vline(xintercept = 0.3)+ # change color of histogram
  annotate("text",x=1,y=90,label=paste(kil," % non-responders"))+
  theme(legend.title=element_blank()) # remove legend

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Thus we see that while every patient has some increase in FEV1, almost 14.8% will be labelled non-responder due to arbitrary dichotomization , one of the reasons to avoid dichomotization which can lead to erroneous conclusions.

So how do we judge response then let’s see the second crossover trial in which there is an element of response i.e. the placebo-active difference is highly correlated(r=0.9) with first crossover trial

simcor = function(x, r,m,s){ r2 = r**2 
ve = 1-r2 
SD = sqrt(ve) 
e = rnorm(length(x), mean=0, sd=SD) 
y = r*x + e  
y1 = m+y*s

return(y1) } 


x=scale(crossover1)


crossover2 = simcor(x=x,r=0.9,m=0.5,s=0.2)

kil1 = ecdf(crossover2)(0.3)*100

data_frame(crossover2 = as.vector(crossover2)) %>% mutate(x=if_else(crossover2<0.3,"non-responder","responder")) %>% # as.vector for crossover2
  ggplot(aes(crossover2,fill=factor(x)))+geom_histogram()+geom_vline(xintercept = 0.3)+ # change color of histogram
  annotate("text",x=1,y=90,label=paste(kil1," % non-responders"))+
  theme(legend.title=element_blank()) # remove legend

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

So now we see 16.2 % in second crossover trial( which are highly correlated with first cross-over trial) are non-responders( response being defined as greater than 0.3 L increase)

Let us see a table of responder vs non responder in cross over 1 vs crossover 2

data_frame(crossover1 = as.vector(crossover1),crossover2 = as.vector(crossover2)) %>% 
  mutate( response1 = case_when(crossover1<0.3 ~ "non-responder",
                                TRUE~"responder"),response2 = case_when(crossover2<0.3 ~ "non-responder",
                                TRUE~"responder")) %>% select(response1,response2) %>% group_by(response1,response2) %>% summarise(n=n(),percent = 100*( n() / nrow(.)) )

## # A tibble: 4 x 4
## # Groups:   response1 [?]
##       response1     response2     n percent
##           <chr>         <chr> <int>   <dbl>
## 1 non-responder non-responder   112    11.2
## 2 non-responder     responder    36     3.6
## 3     responder non-responder    50     5.0
## 4     responder     responder   802    80.2

Note : absolute numbers might vary due to simulation Thus we see that out of ~850 who responded in first 800 responded in second(~95%), while out of ~150 who didnt respond in first around ~32 did respond it in second trial (~20%) . ~8.5 % have inconsistent response in this crossover trial with an element of personal response . Thus marginal probability is around ~85%

Let’s visualise it

p = data_frame(crossover1 = as.vector(crossover1),crossover2 = as.vector(crossover2)) %>% 
  mutate(group = case_when(
  crossover1 >0.3 & crossover2 >0.3 ~ "responders",
   crossover1 >0.3 & crossover2 <0.3 ~ "inconsistent",
  crossover1 <0.3 & crossover2 >0.3 ~"inconsistent",
  crossover1 <0.3 & crossover2 <0.3 ~ "non-responders"
  
)) %>% ggplot(aes(crossover1,crossover2,color=group))+geom_point()+geom_hline(yintercept=0.3,linetype="dotted")+geom_vline(xintercept=0.3,linetype="dotted")+geom_hline(yintercept=0.5)+geom_vline(xintercept=0.5) 

p
ggExtra::ggMarginal(p,type = "histogram")

Now consider a situation with no element of personal response

crossover2 = simcor(x=x,r=0.02,m=0.5,s=0.2)


data_frame(crossover1 = as.vector(crossover1),crossover2 = as.vector(crossover2)) %>% mutate( response1 = case_when(crossover1<0.3 ~ "non-responder",
                                TRUE~"responder"),response2 = case_when(crossover2<0.3 ~ "non-responder",
                                TRUE~"responder")) %>% select(response1,response2) %>% group_by(response1,response2) %>% summarise(n=n(),percent = 100*( n() / nrow(.)) )

## # A tibble: 4 x 4
## # Groups:   response1 [?]
##       response1     response2     n percent
##           <chr>         <chr> <int>   <dbl>
## 1 non-responder non-responder    23     2.3
## 2 non-responder     responder   125    12.5
## 3     responder non-responder   138    13.8
## 4     responder     responder   714    71.4

Note : absolute numbers might vary due to simulation Compare this situation to original simulation with no personal response.

Here also marginal probability of response is ~85% Thus we see that out of ~850 who responded in first 714 responded in second(~85%), while out of ~150 who didnt respond in first around ~23 did respond it in second trial (~20%)

Thus we see almost there is almost ~25% inconsistent response in these crossover trial as opposed to the case in which there was element of personal response (~8% inconsistency )

“The important point to note is that had we only run one cross-over trial, that is to say only using one period we could not have drawn a scatter plot of response (if this is defined as the difference between treatment with an active treatment and treatment with placebo). All we could have drawn is a marginal distribution. However, the marginal distributions on the X axis in scatter diagrams are indistinguishable or at least, given either you could not tell which of the two cases it repre- sented” ~ Senn

Thus key to identification is adequate replication.

q = data_frame(crossover1 = as.vector(crossover1),crossover2 = as.vector(crossover2)) %>% mutate(group = case_when(
  crossover1 >0.3 & crossover2 >0.3 ~ "responders",
   crossover1 >0.3 & crossover2 <0.3 ~ "inconsistent",
  crossover1 <0.3 & crossover2 >0.3 ~"inconsistent",
  crossover1 <0.3 & crossover2 <0.3 ~ "non-responders"
  
)) %>% ggplot(aes(crossover1,crossover2,color=group))+geom_point()+geom_hline(yintercept=0.3,linetype="dotted")+geom_vline(xintercept=0.3,linetype="dotted")+geom_hline(yintercept=0.5)+geom_vline(xintercept=0.5) 

q

ggExtra::ggMarginal(q, type = "histogram")

Key points

Avoid using terms like response
It is diffcult to predict “response” by looking at marginal distributions of cross-over trials
Avoid dichotomization

References

Mastering variation

Responder_Fallacy

Anupam kumar Singh, MD

5 December 2017

Key points

References