How to run MIMOSA for ICS data analysis

Background

MIMOSA (Mixture models for single-cell assays) is an R package used to make response calls for ICS (Intracellular Cytokine Staining) assay data, a type of flow cytometric assay.

We’ve received some inquiries on how to format ICS data for MIMOSA and how to use the package to analyze data, so I thought it would be useful to prepare a simple example demonstrating the analysis pipeline.

Data

An ICS assay measures the number of antigen-specific T-cells in a sample (that sample may come from a subject participating in a vaccine trial) by stimulating PBMCs with a specific antigen or antigen pool, and then counting the number of CD4 or CD8 T-cells that produce one or a combination of cytokines in response to stimulation.
We are generally interested in measuring how this response changes over time (e.g. before and after vaccination).
However, just measuring the stimulated sample is usually insufficient as there may be some background response even in the absence of stimulation. That is to say, there may be some T-cells that produce elevated cytokines even in the absence of stimulation. There is, therefore, a need to account for this background.
To do so, we generally have two samples per subject, a stimulated sample and a non-stimulated sample. The output of the assay for a single subject can be summarized in the following table:

	Negative	Positive
Non-stimulated	56000	8
Stimulated	46000	27

The above is telling us that for the subject in question, we measured 27 cytokine-producing T-cells in the stimulated sample and 8 in the non-stimulated sample, while 56000 T-cells in the non-stimulated sample did not produce the measured cytokines, and 46000 T-cells did not produce cytokines in the stimulated sample. In total we measured 56008 T-cells in the non-stimulated sample and 46017 T-cells in the stimulated sample.

Responders or non-responders?

The question we want to answer is: did the subject’s T-cells respond to antigen stimulation?

Since the counts are different, we would need to look at the proportion of T-cells producing cytokines.

knitr::kable(prop.table(m,1)*100)

	Negative	Positive
Non-stimulated	99.98572	0.0142837
Stimulated	99.94134	0.0586612

In the non-stimulated sample, 0.014% of T-cells produce cytokines, while in the stimulated sample 0.058% of T-cells produce cytokines. Is this a significant differnce? That depends on the total cell counts (i.e. an increase of 1% of 100 cells is less believabl than an increase of 1% of 10,000 cells). The typical way these data are analyzed would be to use Fisher’s Exact test to test for a significant increase.

## 
##  Fisher's Exact Test for Count Data
## 
## data:  m
## p-value = 0.0001155
## alternative hypothesis: true odds ratio is greater than 1
## 95 percent confidence interval:
##  2.030757      Inf
## sample estimates:
## odds ratio 
##   4.108751

Fisher’s test tells us the odds ratio is 4.1 and the increase is significant. In a typical assay, however, we have multiple subjects (tens to 100’s) so there is an opportunity to “share” information across those subjects and learn something about the distribution of responses in the stimulated and non-stimulated samples.

MIMOSA

This is what MIMOSA does, in fact. MIMOSA models the cell counts across all subjects as arising from a mixture of two processes, either:
1. The observations are generated from the non-responder distribution.
2. They are generated from the responder distribution.

The non-responder distribution should be indistinguishable from the distribution of background responses. The expected proportion of positive cells arising from the responder distribution should be greater than that arising from the background / non responder distribution.

The model fit by MIMOSA is a two-component Beta-Binomial mixture, using MCMC to estimate the model parameters and mixture weights.

Example

As an example, let’s assume that the average response magnitude is 0.06%, while the average background magnitude is 0.015%. These numbers are in the ballpark of low-level responses typically seen in HIV vaccine ICS assays. We’d need to collect on the order of 50,000 T-cells to reliably see cell-level responses at these rates. Each subject won’t have a respnose magnitude of exactly 0.06%, but there will be some variability around that number across subjects. Similarly for the background response magnitudes.

We will simulate some data from 100 subjects that meets these criteria, with a 25% response rate, such that 25 subjects are responders and 75 subjects are non-responders.

The plot shows the observed proportions of cytokine positive cells in the stimulated samples (red) and the non-stimulated samples (blue). The distribution of stimulated proportions from the 100 subjects is a mixture of 25% responders (dotted line) and 75% non-responders (solid line).

Note that there is some overlap between the responders and non-responders, so it is not possible to perfectly discriminate between them. This is where MIMOSA helps, since it borrows information across the full set of subjects to shrink the posterior estimates towards the group mean.

Formatting

TCELLSUB	NSUB	CYTNUM	NSUB_REF	CYTNUM_REF	Stim	SubjectID	Cytokine	visit
CD4	49720	29	49828	6	CMV	S1	IL2orIFNg	1
CD4	50031	35	50675	7	CMV	S2	IL2orIFNg	1
CD4	49924	37	49757	8	CMV	S3	IL2orIFNg	1
CD4	50411	32	50410	11	CMV	S4	IL2orIFNg	1
CD4	50026	32	49262	9	CMV	S5	IL2orIFNg	1
CD4	50114	45	49796	4	CMV	S6	IL2orIFNg	1

This is typical ICS data, the columns are:

TCELLSUB: the T-cell subset, in this case CD4.
NSUB: The number of CD4 T-cells negative for the measured cytokine combination in the stimulated sample.
CYTNUM: The number CD4 T-cells positive for the measured cytokine combination in the stimulated sample.
NSUB_REF: The number of CD4 T-cells negative for the measured cytokine combination in the non-stimulated sample.
CYTNUM_REF: The number of CD4 T-cells positive for the measured cytokine combination in the non-stimulated sample.
Stim: The antigen stimulation for this row, in this case “CMV”.
SubjectID: A subject-specific identifier for this row
Cytokine: The cytokine combination measured. The number in CYTNUM and CYTNUM_REF correspond to the number of cells producing this cytokine combination.
visit: The visit or timepoint corresponding to the time this sample was collected.

In a large study, one would have multiple rows per subject, potentially from multiple timepoints, and almost certainly measuring multiple T-cell subsets, and multiple antigen stimulations.

The data must contain columns to uniquely identify an observation (one row). In this case, visit, SubjectID, Stim, TCELLSUB and Cytokine uniquely and unambiguously identify one observation. The observation consists of a 4-dimensional measurment of cells positive and negative for the cytokine combination in the stimulated and non-stimulated sample.

Data should be formatted as above and the columns holding the cell counts should have the column names shown above.

Construct an Expression Set

Once the data are formatted as above, we construct an expression set object for MIMOSA:

Eset<-ConstructMIMOSAExpressionSet(thisdata=dat,
      reference=NULL,
      measure.columns = c("NSUB","CYTNUM"),
      default.cast.formula = component ~ SubjectID+Stim+TCELLSUB+visit+Cytokine,
      .variables = .(SubjectID,Stim,TCELLSUB,visit,Cytokine),
      ref.append.replace = "_NEG"
      )
Eset

## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 2 features, 200 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: S1_CMV_CD4_1_IL2orIFNg_Treatment
##     S1_CMV_CD4_1_IL2orIFNg_Reference ...
##     S99_CMV_CD4_1_IL2orIFNg_Reference (200 total)
##   varLabels: SubjectID Stim ... RefTreat (6 total)
##   varMetadata: labelDescription
## featureData
##   featureNames: CYTNUM NSUB
##   fvarLabels: component
##   fvarMetadata: labelDescription
## experimentData: use 'experimentData(object)'
## Annotation:

The reference=NULL tells MIMOSA that the unstimulated sample counts are already on the same row as the stimualted sample counts. Most of the rest of the arguments have to do with identifying how the data should be reshaped if this is not the case.
To simplify things, it is best to get your data in the correct format and construct the expression set object as shown above.

The measure.columns argument specifies the base name for the columns that contain the cell counts, while the ref.append.replace argument specifies the suffix for the columns that contain the unstimualted sample counts. Again, it’s simples to set up the data as shown above.
It is critical that the variables passed to the default.cast.formula and .variables arguments uniquely identify a row.

Running MIMOSA

Next we fit a MIMOSA model.

cache(result_mcmc<-MIMOSA(data = Eset, 
                          formula = NSUB+CYTNUM ~ SubjectID+visit+TCELLSUB+Stim+Cytokine+RefTreat,
                          ref=RefTreat%in%"Reference",
                          subset=RefTreat%in%"Treatment",
                          method="mcmc",EXPRATE=1e-6))

The formula argument again specifies variables that identify unique rows of the data. In this case, we have only one Cytokine combination, one visit, one Stim, and one TCELLSUB. Consequently, all these variables appear on the right hand side unconditionally. MIMOSA will fit one model to all the data.

However, if we had observations from CD4 and CD8 T-cell subsets, we would write the model as:
NSUB+CYTNUM ~ SubjectID+visit+Stim+Cytokine+RefTreat | TCELLSUB Here TCELLSUB appears on the right side of the | operator. This tells MIMOSA to fit one model for each unique TCELLSUB (i.e. one model for CD4 and one model for CD8).

Similarly, if we had multiple Stim, we would write ... | TCELLSUB + Stim, and fit a separate model for each combination of T-cell subset and stimulation. In general, we want to condition on T-cell subset, stimulation, and cytokine, but group together different visits. We certainly don’t want to condition on something like cohort.

Additional arugments to MIMOSA

The method argument specifies how the model should be fit. We’ve implemented mcmc (Markov-Chain Monte-Carlo) and EM (Expectation Maximization) algorithms. EM is faster, MCMC is a bit more stable. EM may be a resonable starting point, and if it fails to converge, will fall-back to MCMC.

Summarizing results

We have only one fitted model. If we print it:

result_mcmc[[1]]

## A MIMOSA Model with 100 observations.
## Response rate (w) of 25.95 percent.

We see that the estimated response rate is 26.4% (true response rate is 25%.).

We can extract the probability of response for each subject and plot it as a function of the effect size (the difference between the stimulated and non-stimulated sample).

cnts<-countsTable(result_mcmc,proportion = FALSE)
colnames(cnts)<-c("NSUB.1","CYTNUM.1","NSUB_REF.1","CYTNUM_REF.1")
ggplot(data.frame(countsTable(result_mcmc,proportion = TRUE),cnts,getZ(result_mcmc),fdr(result_mcmc)))+
  aes(x=CYTNUM-CYTNUM_REF,y=Pr.response,col=fdr<0.05,size=CYTNUM.1+CYTNUM_REF.1)+
  geom_point()+theme_bw()+
  scale_color_discrete("q-value < 5%)",label=c("No","Yes"))+
  scale_size_continuous("Sum of Positive Counts")

The points are colored according to whether that subject is a responder at a q-value of 5%. In this case 26 subjects are called responders at this threshold. Importantly, when the observed proportion of cytokine positive cells in the stimulated sample is less than the observed proportion in the non-stimulated sample, the probability of response is 0. As the effect size increases, the probabiliy of response increases.
The jitter in the probabilities around medium effect sizes is due to differences in the number of observations, as evidenced by the size of the points, which are proportional to the total positive cell counts. This is why it is important to model the cell counts rather than just considering the proportion of positive cells. As more positive cells are observed, our belief in the veracity of the effect increases. This is lost if modeling only the proportions.

Performance measures curves comparing MIMOSA to Fisher’s exact test:

o<-rank(as.numeric(gsub("S(\\d+).*","\\1",rownames(countsTable(result_mcmc))))) #ordering of ground truth

# Compute false postive and true positive rates for MIMOSA and Fisher's using false discovery rate cutoffs
toplot=rbind(data.frame(threshold=seq(0,1,l=1000),
                        method="MIMOSA",
                        MIMOSA:::roc(fdr(result_mcmc),rep(c("TRUE","FALSE"),c(25,75))[o])),
  data.frame(threshold=seq(0,1,l=1000),
           method="Fisher",
           MIMOSA:::roc(p.adjust(apply(countsTable(result_mcmc),1,function(x){
             1-fisher.test(matrix(x,byrow=TRUE,ncol=2),alternative = "greater")$p
             }),"fdr"),
             rep(c("TRUE","FALSE"),c(25,75))[o])))

#Nominal and observed false discovery rates
d<-rbind(data.table(method="MIMOSA",
                    fdr(result_mcmc),
                    truth=rep(c(1,0),c(25,75))[o]),
    data.table(method="Fisher",
               fdr=p.adjust(apply(countsTable(result_mcmc),1,function(x){
                 1-fisher.test(matrix(x,byrow=TRUE,ncol=2),alternative = "greater")$p
                 }),"fdr"),
               truth=rep(c(1,0),c(25,75))[o]))

#Compute nominal and observed FDR for each method
nom_obs_fdr<-d[order(method,fdr)][,list(nominal=fdr,observed=cumsum(!truth)/ifelse(.I>100,(.I-100)%%101,.I)),method]

#plot true positive rate vs FDR cutoff
plt2<-ggplot(toplot)+
  aes(x=threshold,y=TPR,col=method)+
  geom_line()+theme_bw()+theme(aspect.ratio=1)+
  scale_x_continuous("Nominal FDR threshold")+
  scale_y_continuous("True Positive Rate")+coord_cartesian(xlim=c(0,0.1))

#plot false positive rate vs FDR cutoff
plt3<-ggplot(toplot)+
  aes(x=threshold,y=FPR,col=method)+
  geom_line()+theme_bw()+theme(aspect.ratio=1)+
  scale_y_continuous("False Positive Rate")+
  scale_x_continuous("Nominal FDR threshold")+
  geom_abline(1,lty=2)

#plot nominal vs observed FDR
plt4<-qplot(data=nom_obs_fdr,
            x=nominal,
            y=observed,
            col=method,geom = "line")+
  theme_bw()+
  geom_abline(1,lty=2)+
  theme(aspect.ratio=1)+
  scale_x_continuous("Nominal FDR threshold")+
  scale_y_continuous("Observed FDR")+
  scale_color_discrete(limits=c("MIMOSA","Fisher"))

The figure shows A) MIMOSA is more sensitive than Fisher’s test, particularly at FDR thresholds where we would be interested in setting a threshold for making positivity calls, B) The false positive rate is better controlled by MIMOSA, as is the C) observed false discovery rate.

Boxplots of response magnitudes

#Grab all the results
cnts<-countsTable(result_mcmc)
colnames(cnts)<-c("NSUB","CYTNUM","NSUB_REF","CYTNUM_REF")
results<-data.table(cnts,
                    pData(result_mcmc),
                    getZ(result_mcmc),
                    fdr(result_mcmc))
#Make boxplots of the background corrected magnitudes for the responders and non-responders.
ggplot(results)+
  aes(y=CYTNUM/(NSUB+CYTNUM)-CYTNUM_REF/(NSUB_REF+CYTNUM_REF),x=factor(visit),col=fdr<0.05,group=fdr<0.05)+
  geom_boxplot(position="identity",outlier.colour = NA)+
  geom_jitter(aes(size=CYTNUM+CYTNUM_REF))+
  theme_bw()+
  theme(aspect.ratio=1)+
  scale_y_continuous("Background Subtracted Proportion")+
  scale_x_discrete("Visit")+
  scale_color_discrete("Responder (q<0.05)",limits=c("TRUE","FALSE"))