Using openCyto to gate multiple files based on a negative control.

A typical application in flow cytometry is to use a negative control, or FMO (fluorescence-minus-one) control, to set a positivity threshold that is then applied to other sample files.

When using computational tools to run an automated analysis, this can get tricky, since you need to track groups of samples and controls and ensure that they are processed appropriately.
However, the Bioconductor openCyto framework makes this straightforward. openCyto allows you to process samples based on metadata variables defined as attributes of the data set. It also allows you to write custom gating and preprocessing routines to implement this type of gating.

We have generated some fake data from a fake study where we have treatment and control samples from multiple “subjects”. We want to define gate thresholds for each subject based on the control samples for that subject.

The gating template

Let’s take a look at a gating template that we would use to do this:

knitr::kable(data.table::fread("template.csv"))

alias	pop	parent	dims	gating_method	gating_args	collapseDataForGating	groupBy	preprocessing_method	preprocessing_args
test	test	root	FSC-A	myGate	NA	TRUE	ptid	ppmyGate	NA

There are a couple of things to note here. First, the groupBy column names a variable called ptid. ptid is defined in the sample metadata for this experiment. It is a subject identifier used to group treatment and control samples from the same subject. openCyto will process FCS files, grouping them by ptid. Second, the gating_method and preprocessing_method columns contain the names of the routines myGate and ppmyGate, which we’ll write momentarily.
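For readers who want to recreate the template file, here is a minimal sketch in base R. It assumes the file is an ordinary comma-separated CSV with exactly the columns and values shown in the table above; the temporary path is only for illustration, and the original file may have been written differently.

```r
# Hypothetical reconstruction of the one-row template shown above.
# Column names and values are copied from the rendered table.
template <- data.frame(
  alias = "test",
  pop = "test",
  parent = "root",
  dims = "FSC-A",
  gating_method = "myGate",
  gating_args = NA,
  collapseDataForGating = TRUE,
  groupBy = "ptid",
  preprocessing_method = "ppmyGate",
  preprocessing_args = NA,
  stringsAsFactors = FALSE
)

# Write to a temporary location (illustrative path, not the author's).
template_path <- file.path(tempdir(), "template.csv")
write.csv(template, template_path, row.names = FALSE)

# Read it back to confirm the columns survive the round trip.
roundtrip <- read.csv(template_path, stringsAsFactors = FALSE)
print(roundtrip)
```

In a real analysis you would save the file next to your script as template.csv so that the calls in this post can find it.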

Custom gating and preprocessing functions

Next we define two new routines: one to perform custom preprocessing, the other to perform custom gating. The preprocessing routine will identify events in each subset of data that correspond to the control samples and pass those on to the gating routine.

The gating routine will extract the events corresponding to the control sample, gate it, and pass that gate on to the openCyto framework. That gate will then be applied to all samples in the current group.

Credit goes to Jacob Frelinger from the RGLab for putting this example together.

We start by loading the openCyto library. It can be installed from Bioconductor.

library(openCyto)

Preprocessing routine

The preprocessing routine has to match a specific signature, and then it’s registered so that it can be used by openCyto.

.ppmyGate <- function(fs, gs, gm, channels=NA, groupBy=NA, isCollapse=NA, ...) {
    # Build one logical per event: TRUE if the event belongs to a control sample.
    d <- c()
    for(i in seq_len(length(fs))) {
        # Repeat the sample-level control flag once for every event in sample i.
        d <- c(d, rep.int(pData(fs[i])$control, nrow(exprs(fs[[i]]))))
    }
    return(as.logical(d))
}

registerPlugins(fun=.ppmyGate, methodName='ppmyGate', dep=NA, "preprocessing")

## Registered ppmyGate

fs is the flowSet of data passed in by openCyto, and contains treatment and control samples for one ptid. There is also a control variable defined in the sample metadata, accessible via the pData Bioconductor generic. The preprocessing function extracts the control variable from the pData and creates an event-level logical vector that is TRUE for events from control samples and FALSE for all other events. This vector is passed on to the gating routine.
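The core of the preprocessing step can be seen without any FCS files. The sketch below uses made-up event counts for one hypothetical ptid group (one control sample, one treatment sample) and expands the sample-level flag to one value per event, which is the same idea as the loop in .ppmyGate.

```r
# Hypothetical sample-level metadata for one ptid group.
control_flag <- c(TRUE, FALSE)  # first sample is the control
n_events <- c(3L, 2L)           # made-up event counts per sample

# Repeat each sample's flag once per event, in sample order.
event_is_control <- rep(control_flag, times = n_events)
print(event_is_control)
```

The resulting vector has one entry per event in the collapsed data, so it can be used directly as a row index.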

Gating routine

The gating routine is just a wrapper around the tailgate gating method built into openCyto (you could substitute any other gating routine). It uses the pp_res variable that is passed from the preprocessing routine, which identifies events from the control sample. The gate is then defined based solely on the control. That gate is returned and applied to all samples in the group.

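.myGate <- function(fr, pp_res, channels=NA, filterId="ppgate", ...){
    # Keep only the control-sample events (rows where pp_res is TRUE),
    # then let tailgate compute the gate from those events alone.
    my_gate <- tailgate(fr[pp_res,], channel=channels, filter_id=filterId, ...)
    return(my_gate)
}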
registerPlugins(fun=.myGate,methodName='myGate',dep=NA)

## Registered myGate
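To make the control-only logic concrete, here is a small base R sketch on simulated intensities. It does not call tailgate; as a stand-in, it uses the 99th percentile of the control events as the threshold, which is an assumption chosen purely for illustration. All values and variable names are hypothetical.

```r
set.seed(1)

# Simulated intensities for one collapsed group:
# 500 control events, then 500 treatment events (half of them shifted up).
control_events <- rnorm(500, mean = 5, sd = 1)
treatment_events <- c(rnorm(250, mean = 5, sd = 1),
                      rnorm(250, mean = 15, sd = 1))
intensity <- c(control_events, treatment_events)

# Event-level flag, playing the role of pp_res.
pp_res <- rep(c(TRUE, FALSE), times = c(500, 500))

# Stand-in for tailgate: threshold taken from the control events only.
threshold <- quantile(intensity[pp_res], probs = 0.99, names = FALSE)

# The same threshold is then applied to every event in the group.
is_positive <- intensity > threshold
frac_pos_control <- mean(is_positive[pp_res])
frac_pos_treatment <- mean(is_positive[!pp_res])
print(c(control = frac_pos_control, treatment = frac_pos_treatment))
```

Because the threshold is estimated only from rows where pp_res is TRUE, the treatment events cannot influence where the gate is placed.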

Synthetic data example

Next we read our synthetic data and define the relevant metadata variables:

files <- Sys.glob('data/*.fcs')
fs  <- read.ncdfFlowSet(files)
neg <- grepl('neg', pData(fs)$name)
ptid <- as.integer(gsub('[^0-9]','', pData(fs)$name))

pData(fs)$control <- neg
pData(fs)$ptid <- ptid

knitr::kable(pData(fs))

	name	control	ptid
neg_01.fcs	neg_01.fcs	TRUE	1
neg_02.fcs	neg_02.fcs	TRUE	2
neg_03.fcs	neg_03.fcs	TRUE	3
pos_01.fcs	pos_01.fcs	FALSE	1
pos_02.fcs	pos_02.fcs	FALSE	2
pos_03.fcs	pos_03.fcs	FALSE	3

We see there are six files covering three “subjects”, with one treatment sample and one control sample per subject. The ptid variable is used for grouping in the template, and the control variable is used in the preprocessing to identify the control sample events.
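The metadata derivation can be checked without reading any FCS files. The sketch below applies the same grepl and gsub calls to the six file names from the table and then verifies that every ptid has exactly one control. The check itself is an addition for illustration, not part of the original workflow.

```r
# File names as shown in the table above.
sample_names <- c("neg_01.fcs", "neg_02.fcs", "neg_03.fcs",
                  "pos_01.fcs", "pos_02.fcs", "pos_03.fcs")

# Same logic as in the post: controls contain "neg", ptid is the digits.
is_control <- grepl("neg", sample_names)
ptid <- as.integer(gsub("[^0-9]", "", sample_names))

meta <- data.frame(name = sample_names, control = is_control, ptid = ptid)
print(meta)

# Illustrative sanity check: each subject should have exactly one control.
controls_per_ptid <- tapply(meta$control, meta$ptid, sum)
stopifnot(all(controls_per_ptid == 1))
```

Note that the gsub pattern keeps every digit in the name, so file names with other numbers in them (a date or a plate number, say) would need a more specific pattern.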

We construct a GatingSet object, which is the input to openCyto, and read in the gating template.

gs <- GatingSet(fs)
gt <- gatingTemplate("template.csv")

Gating

Finally, we run the gating.

gating(gt, gs)

## Loading required package: parallel
## Preprocessing for 'test'
## Gating for 'test'
## done.
## finished.

Results

The output shows that each “subject” has been gated based on its own negative control, and that the resulting gate has been applied to the treatment sample.

plotGate(gs,"test", default.y='SSC-A', xlim=c(0,25), ylim=c(0,25),margin=FALSE,xbin=128)

This general process can be adapted to your particular use case.