Flow cytometry for doctors
Fluorescence Spectrometer | Flow Cytometer |
---|---|
Molecules, nanoparticles (\(\le 500\,nm\)) | Cells (\(\mu m\)) |
No statistical information | Population behavior |
Single-molecule data unavailable | Single-cell data |
Flow cytometers can now measure dozens of parameters. A typical experiment generates about 10,000 rows (events) of 8 to 24 columns (channels) of data.

Software packages:
- FlowJo
- FCS Express

Public example datasets are available from FlowRepository:
https://flowrepository.org/public_experiment_representations?top=10
fname='./Compensation Controls/Patient_17_Unstained.fcs'
ff <- flowCore::read.FCS(fname)
The object returned above is not a simple atomic variable, an array, or a data frame; it is an S4 object.
An S4 class definition is quite a bit different from a class definition in other programming languages such as C# and Python.
An S4 class encapsulates data fields inside a setClass function, but doesn’t encapsulate class methods.
Instead, S4 class methods are defined by pairs of special R functions named setMethod and setGeneric.
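To make the split between data and methods concrete, here is a minimal sketch using only base R's `methods` package. The class `Sample` and the generic `nEvents` are hypothetical names invented for this illustration; they are not part of flowCore.

```r
library(methods)  # S4 machinery (attached by default in interactive sessions)

# Hypothetical toy class: the data fields (slots) are declared in setClass().
setClass("Sample", slots = c(name = "character", events = "matrix"))

# Methods live outside the class: declare a generic, then attach a method to it.
setGeneric("nEvents", function(object) standardGeneric("nEvents"))
setMethod("nEvents", "Sample", function(object) nrow(object@events))

s <- new("Sample", name = "demo", events = matrix(0, nrow = 5, ncol = 2))

typeof(s)    # "S4", the same answer typeof() gives for a flowFrame
nEvents(s)   # 5
s@name       # slots are read with @, just like ff@exprs
```

The `@` operator used on `ff@exprs` elsewhere in these notes is the same slot access shown on the last line.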
library(flowCore)
fname='./Compensation Controls/Patient_17_Unstained.fcs'
ff <- flowCore::read.FCS(fname)
typeof(ff)
[1] "S4"
# Converting S4 to dataframe
df=data.frame(ff@exprs[,1:6])
# typical structure of the data frame
summary(df)
FSC.A FSC.H FSC.W
Min. : 2909 Min. : 5009 Min. : 36588
1st Qu.: 21116 1st Qu.: 19694 1st Qu.: 66662
Median : 46014 Median : 43265 Median : 71521
Mean : 72403 Mean : 64801 Mean : 72133
3rd Qu.:128617 3rd Qu.:117858 3rd Qu.: 76619
Max. :262143 Max. :257923 Max. :180780
SSC.A SSC.H SSC.W
Min. : 569.5 Min. : 600 Min. : 42355
1st Qu.: 7990.4 1st Qu.: 7897 1st Qu.: 65152
Median : 19952.3 Median : 19438 Median : 66967
Mean : 40328.2 Mean : 37823 Mean : 67712
3rd Qu.: 52786.3 3rd Qu.: 50038 3rd Qu.: 69002
Max. :262143.0 Max. :257327 Max. :198544
Target applications:
- Cell cycle analysis
- Immunophenotyping
- Cell counting and sorting
- Biomarker detection

In the above case we have used unstained samples of PBMC (peripheral blood mononuclear cells).
plot(df$FSC.H, df$SSC.H, pch=16, cex=0.2)
The scatter plot above shows a typical scattering cluster pattern for an unlabelled (no fluorescence label) PBMC suspension. A major computational challenge is to identify the cell types and associate each of them with a flow cytometric cluster.
# Using the flowClust code to remove outliers
library(flowClust)
fname='./Compensation Controls/Patient_17_Unstained.fcs'
ff <- flowCore::read.FCS(fname)
df=data.frame(ff@exprs[,1:6])
res1 <- flowClust(df,
varNames=c("FSC.H", "SSC.H"), K=1)
df2=Subset(df,res1)
res2<-flowClust(df2, varNames=c("FSC.H", "SSC.H"), K=1:6, B=100)
criterion(res2,"BIC")
[1] -1392214 -1378289 -1361787 -1354436 -1353362 -1350507
# flowClust reports BIC on a larger-is-better scale; it is highest at K = 6 (the largest K tried), so we use 6 clusters
plot(res2[[6]], data=df2, level=0.8, z.cutoff=0)
Rule of identifying outliers: 80% quantile
Now suppose we want to gate out one cluster and view the other five.
plot(FSC.H[-ik4],SSC.H[-ik4],pch=16,cex=0.3)
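As a self-contained illustration of dropping one cluster before plotting, the sketch below uses simulated two-parameter data and base R's `kmeans` purely as a stand-in clusterer (it is not flowClust, and the data are not from an FCS file). The names `xy`, `gated` and `keep` are made up for the example.

```r
set.seed(42)

# Simulated events in three well-separated groups (placeholder data only).
centers <- rbind(c(2, 2), c(8, 2), c(5, 8))
xy <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(200, centers[i, 1], 0.4), rnorm(200, centers[i, 2], 0.4))
}))
colnames(xy) <- c("FSC.H", "SSC.H")

# Stand-in clustering step: k-means started from the known centres,
# so cluster i corresponds to row i of `centers`.
km <- kmeans(xy, centers = centers)

# Gate out cluster 2 and keep everything else.
gated <- 2
keep <- km$cluster != gated
remaining <- xy[keep, , drop = FALSE]

plot(remaining, pch = 16, cex = 0.3)
```

A logical mask such as `keep` is used instead of a negative index because `x[-idx]` returns nothing when `idx` happens to be empty, whereas the mask simply keeps every row.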
flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding, 2012, Bioinformatics 28(15):2052-8
library(flowPeaks)
fp<-flowPeaks(asinh(ff@exprs[,c(1,2)]))
plot(fp) #an alternative of using summary(fp)
A doublet is a single event that actually consists of 2 independent particles.
The cytometer classified these particles as a single event because they passed through the interrogation point very close to one another.
In other words, the particles were so close together when they passed through this laser spot, that the instrument was incapable of distinguishing them as individual events or particles.
Conventional clustering methods fail at doublet elimination, because here the gate depends on whether an event is a singlet or a doublet rather than on which population the cell belongs to. The rule of thumb used instead is that events with a higher area-to-height ratio are more likely to be doublets. This holds for both forward and side scattering.
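The area-to-height rule can be sketched in a few lines of base R. Everything below is an assumption made for illustration: the events are simulated rather than read from an FCS file, and the cutoff (median plus three median absolute deviations of the ratio) is just one plausible choice, not a standard.

```r
set.seed(1)
n_single <- 900
n_double <- 100

# Simulated events: singlets have area roughly proportional to height,
# doublets have an inflated area for the same height.
height <- runif(n_single + n_double, 2e4, 1e5)
ratio  <- c(rnorm(n_single, mean = 1.0, sd = 0.03),
            rnorm(n_double, mean = 1.6, sd = 0.05))
events <- data.frame(FSC.H = height, FSC.A = height * ratio)

# Area-to-height ratio per event.
ah <- events$FSC.A / events$FSC.H

# Hypothetical cutoff: flag events whose ratio lies far above the bulk.
cutoff   <- median(ah) + 3 * mad(ah)
singlets <- events[ah <= cutoff, ]

nrow(singlets)  # close to the 900 simulated singlets
```

On real data the same ratio would be computed from `FSC.A` and `FSC.H` (and likewise `SSC.A` and `SSC.H`), and the cutoff would be chosen by inspecting the area-versus-height plot.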
library(easyGgplot2)
# Load the unstained control for one patient
fname='./Compensation Controls/Patient_17_Unstained.fcs'
ff1 <- flowCore::read.FCS(fname)
df=data.frame(ff1@exprs[,1:6])
# Histogram of forward-scatter height with a density curve and a mean line
ggplot2.histogram(data=df, xName="FSC.H", xlab="Forward Scattering", fill="white", color="black", addDensityCurve=TRUE, addMeanLine=TRUE, densityFill='#FF6666')
In this hypothetical experiment, cells were stained with FITC CD3 and with a PE isotype control, and collected at different compensation values to correct for the FITC spillover into the PE channel (panel 1 is uncompensated; panels 2-5 use increasing compensation values).
## The questions
- Which panel represents proper compensation?
- On what basis did you make this determination?
http://www.drmr.com/compensation/

Let's consider an experiment where we stain human peripheral blood lymphocytes with:
- FITC CD3 \(\rightarrow ^{Stains}\) CD4 and CD8 T cells
- PE CD8 \(\rightarrow ^{Stains}\) CD8 T cells and, less brightly, NK cells
Let us assume that the spillover constants, measured from the compensation samples, are 15% (fluorescein into FL2) and 1% (PE into FL1).
For B cells, it will be zero in both FL1 and FL2 (remember, we are ignoring autofluorescence for the time being).
CD4 T cells will have 100 in FL1, and 15 in FL2 (i.e., 15% of the 100 units of fluorescein signal will appear in FL2).
NK cells will have 50 units in FL2, and 0.5 units in FL1 (i.e., 1% of the 50 units of PE signal).
CD8 T cells will have 104 units in FL1 (100 units of fluorescein CD3 signal plus 1% of the 400 units of PE CD8 signal), and 415 units FL2 (400 units of PE signal plus 15% of the 100 units of fluorescein signal).
\[ \begin{aligned} FL1_{measured} &= FITC_{true}+0.01\cdot PE_{true}\\ FL2_{measured} &= 0.15\cdot FITC_{true} +PE_{true} \end{aligned} \]
## Finding the true value of fluorescence
Let’s apply these equations to our measured values for CD8 T cells (FL1 = 104, FL2 = 415). We may find FITC true (x1) and PE true (x2) from:
1*x1 + 0.01*x2 = 104
0.15*x1 + 1*x2 = 415
[,1]
[1,] 100
[2,] 400
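The two measurement equations, FL1 = FITC + 0.01 PE and FL2 = 0.15 FITC + PE, form a 2 x 2 linear system that base R's `solve()` can invert directly. This is a minimal sketch; the names `S`, `measured` and `true_signal` are chosen for the example.

```r
# Rows are detectors (FL1, FL2); columns are dyes (FITC, PE).
# Entry [detector, dye] is the fraction of that dye's signal seen by the detector.
S <- matrix(c(1.00, 0.01,
              0.15, 1.00),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("FL1", "FL2"), c("FITC", "PE")))

# Measured values for the CD8 T cells.
measured <- c(FL1 = 104, FL2 = 415)

# Solve S %*% true_signal = measured.
true_signal <- solve(S, measured)
true_signal   # FITC = 100 and PE = 400, up to floating-point rounding
```

Note the orientation of the matrix: the 1% coefficient multiplies PE in the FL1 row and the 15% coefficient multiplies FITC in the FL2 row. Transposing it gives a different, incorrect answer.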
Multi-color compensation is a simple extension of two-color compensation:
\[ \begin{aligned} M_1 &= A_{11}F_1 + A_{21}F_2 + \dots + A_{n1}F_n\\ M_2 &= A_{12}F_1 + A_{22}F_2 + \dots + A_{n2}F_n\\ &\;\;\vdots\\ M_n &= A_{1n}F_1 + A_{2n}F_2 + \dots + A_{nn}F_n \end{aligned} \]
where \(M_i\) is the measured fluorescence in channel \(i\), \(F_i\) is the amount of fluorescent molecule \(i\) present on the cell of interest, and \(A_{ij}\) is the ratio of the fluorescence of molecule \(i\) in channel \(j\) to its fluorescence in channel \(i\) (so \(A_{ii} = 1\)).
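In matrix form, if each row of a matrix holds one cell and each column one dye, the measured signals are the true signals multiplied by the spillover matrix, and compensation multiplies by its inverse. The sketch below assumes a three-colour panel; the third dye (PerCP) and all coefficients involving it are invented for illustration, as are the two example cells.

```r
dyes     <- c("FITC", "PE", "PerCP")
channels <- c("FL1", "FL2", "FL3")

# A[i, j]: fraction of dye i's signal that appears in channel j (diagonal = 1).
# The FITC/PE entries reuse the 15% and 1% constants; the rest are hypothetical.
A <- matrix(c(1.00, 0.15, 0.02,
              0.01, 1.00, 0.10,
              0.00, 0.03, 1.00),
            nrow = 3, byrow = TRUE, dimnames = list(dyes, channels))

# True dye amounts for two made-up cells (rows = cells, columns = dyes).
F_true <- rbind(c(100, 400,   0),
                c(  0,  50, 200))
colnames(F_true) <- dyes

# Forward model: every channel sums the contributions of all dyes.
M <- F_true %*% A

# Compensation: undo the mixing by multiplying with the inverse matrix.
F_rec <- M %*% solve(A)
round(F_rec, 6)
```

The first cell reproduces the two-colour case: its FL1 and FL2 readings are 104 and 415, and compensation recovers 100 and 400.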
# Immunophenotyping
A good link to start with: https://www.abcam.com/protocols/flow-cytometry-immunophenotyping
Flow cytometric characterization is more quantitative than microscopic detection.
# Immunophenotyping the leukemia cells
Blood cells are labelled with fluorescent antibodies during sample preparation, and the sample tube is then placed in the flow cytometer.
Gate on the singlets only.
Plot SSC-H vs CD45 to gate the blast population.
Next, make a CD19 vs CD10 scatterplot (CD19 is a B cell-specific antigen expressed on chronic lymphocytic leukemia (CLL) cells; the human CD10 antigen is present in common acute lymphoblastic leukemia as a cancer-specific antigen).
From these plots, the propensity for and type of leukemia, and the efficacy of chemotherapy over time, can be determined.
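The gating sequence above can be expressed as a chain of logical filters. The sketch below only shows the mechanics: the values are uniform random placeholders with no biological meaning, the column names are assumed, and every threshold is hypothetical and would in practice be set by inspecting the plots.

```r
set.seed(7)
n <- 2000

# Placeholder events (random numbers, not real measurements).
fsc_h <- runif(n, 2e4, 1e5)
cells <- data.frame(
  FSC.H = fsc_h,
  FSC.A = fsc_h * runif(n, 0.9, 1.8),
  SSC.H = runif(n, 0, 1e5),
  CD45  = runif(n, 0, 1e4),
  CD19  = runif(n, 0, 1e4),
  CD10  = runif(n, 0, 1e4)
)

# Step 1: singlets (low area-to-height ratio; hypothetical cutoff).
singlet <- cells$FSC.A / cells$FSC.H < 1.3

# Step 2: blast gate on SSC-H vs CD45 (hypothetical rectangle), within singlets.
blast <- singlet & cells$SSC.H < 3e4 & cells$CD45 < 3e3

# Step 3: CD19 vs CD10 quadrant within the blast gate (hypothetical cutoffs).
double_pos <- blast & cells$CD19 > 5e3 & cells$CD10 > 5e3

# Each gate is nested in the previous one, so the counts can only shrink.
counts <- c(all = n, singlets = sum(singlet),
            blasts = sum(blast), CD19pos_CD10pos = sum(double_pos))
counts
```

Keeping each gate as a logical vector over the full table makes it easy to report the fraction of events surviving each step, for example `counts / n`.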
Time FSC.A FSC.H
Min. : 53.6 Min. : 10144 Min. : 10024
1st Qu.: 565.5 1st Qu.: 60168 1st Qu.: 43537
Median :1085.2 Median : 79984 Median : 57508
Mean :1083.5 Mean : 93186 Mean : 61014
3rd Qu.:1595.3 3rd Qu.:109668 3rd Qu.: 73634
Max. :2111.9 Max. :262143 Max. :258709
SSC.A SSC.H FITC.A
Min. : 155.4 Min. : 156 Min. : -51.45
1st Qu.: 1407.0 1st Qu.: 1101 1st Qu.: 138.60
Median : 1939.3 Median : 1443 Median : 213.15
Mean : 3243.5 Mean : 1914 Mean : 722.33
3rd Qu.: 2998.8 3rd Qu.: 1943 3rd Qu.: 348.60
Max. :262143.0 Max. :224332 Max. :262143.00
PE.A PerCP.Cy5.5.A PE.Cy7.A
Min. : -81.9 Min. : -105.0 Min. : -90.3
1st Qu.: 444.1 1st Qu.: 536.5 1st Qu.: 1098.3
Median : 828.5 Median : 1029.0 Median : 2794.6
Mean : 1237.3 Mean : 1814.5 Mean : 4506.2
3rd Qu.: 1395.5 3rd Qu.: 1857.5 3rd Qu.: 6019.6
Max. :239286.6 Max. :262143.0 Max. :262143.0
APC.A APC.H7.A V450.A
Min. : -89.28 Min. : -224.1 Min. : -439.3
1st Qu.: 70.68 1st Qu.: 306.0 1st Qu.: 213.9
Median : 169.26 Median : 580.3 Median : 378.4
Mean : 1781.22 Mean : 1516.5 Mean : 1291.5
3rd Qu.: 778.41 3rd Qu.: 1208.1 3rd Qu.: 670.5
Max. :262143.00 Max. :262143.0 Max. :262143.0
V500c.A
Min. : -284.1
1st Qu.: 159.8
Median : 316.2
Mean : 1187.2
3rd Qu.: 621.0
Max. :262143.0
[1] -988502.2 -986779.2 -986474.7 -986135.8
Rule of identifying outliers: 50% quantile
[1] -1141117 -1123068 -1119284 -1117204
Rule of identifying outliers: 50% quantile
flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding, 2012, Bioinformatics 28(15):2052-8
step 0, set the intial seeds, tot.wss=1040.56
step 1, do the rough EM, tot.wss=712.991 at 0.281295 sec
step 2, do the fine transfer of Hartigan-Wong Algorithm
tot.wss=703.557 at 0.495917 sec
**The Beauty of SOM**