This vignette demonstrates how to analyze single-cell data from a morphological profiling experiment.
The images were analyzed using CellProfiler.
This vignette assumes that:
- the single-cell data 110000106771.sqlite have been created using ingest and stored in ~/Downloads.
- the metadata metadata.csv has the columns WELLTYPE_CODE, BARCODE, and WELL, and is stored in ~/Downloads.
library(dplyr)
library(ggplot2)
library(magrittr)
library(stringr)
First, load the data. The data is contained in 4 tables named Image, Cytoplasm, Cells, and Nuclei. The code below joins these tables to create a single table named object.
backend <- file.path(Sys.getenv("HOME"), "Downloads", "110000106771.sqlite")
db <- src_sqlite(path = backend)
image <- tbl(src = db, "image")
object <-
tbl(src = db, "Cells") %>%
inner_join(tbl(src = db, "Cytoplasm"),
by = c("TableNumber", "ImageNumber", "ObjectNumber")) %>%
inner_join(tbl(src = db, "Nuclei"),
by = c("TableNumber", "ImageNumber", "ObjectNumber"))
object %<>% inner_join(image, by = c("TableNumber", "ImageNumber"))
In this table, the measurement columns start with Nuclei_, Cells_, or Cytoplasm_.
variables <-
colnames(object) %>%
stringr::str_subset("^Nuclei_|^Cells_|^Cytoplasm_")
How many variables?
print(length(variables))
## [1] 854
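The prefix-based selection above can be tried on a small, made-up vector of column names. This is only a sketch: the names in toy_columns are invented for illustration, and base R's grep() stands in for stringr::str_subset().

```r
# Toy column names (invented for illustration); only the ones that start
# with a compartment prefix count as measurements.
toy_columns <- c(
  "TableNumber", "ImageNumber", "ObjectNumber",
  "Image_Metadata_Well",
  "Cells_AreaShape_Area",
  "Cytoplasm_Intensity_MeanIntensity_Hoechst",
  "Nuclei_AreaShape_Area"
)

# grep(value = TRUE) returns the matching strings, much like
# stringr::str_subset() does in the vignette code.
toy_variables <- grep("^Nuclei_|^Cells_|^Cytoplasm_", toy_columns, value = TRUE)

print(toy_variables)
```

Because each alternative in the pattern is anchored with ^, a column such as Image_Metadata_Well is not picked up even though it contains an underscore-separated prefix.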
How many cells?
object %>%
count() %>%
knitr::kable(caption = "No. of cells")
| n |
|---|
| 526555 |
Let’s join the metadata:
metadata <-
readr::read_csv("~/Downloads/metadata.csv") %>%
select(BARCODE, WELL, WELLTYPE_CODE) %>%
rename(Image_Metadata_Barcode = BARCODE,
Image_Metadata_Well = WELL,
Image_Metadata_Type = WELLTYPE_CODE)
head(metadata)
## # A tibble: 6 × 3
## Image_Metadata_Barcode Image_Metadata_Well Image_Metadata_Type
## <dbl> <chr> <chr>
## 1 110000106890 A01 EMPTY
## 2 110000106890 A02 EMPTY
## 3 110000106890 A03 EMPTY
## 4 110000106890 A04 EMPTY
## 5 110000106890 A05 EMPTY
## 6 110000106890 A06 EMPTY
metadata <-
dplyr::copy_to(db,
metadata,
indexes = list("Image_Metadata_Type")
)
object %<>%
inner_join(metadata)
Let’s filter the data down to a few wells and plot the distribution of a single feature:
object %>%
filter(Image_Metadata_Well %in% c("A01", "A24", "A23")) %>%
select(Image_Metadata_Type,
Image_Metadata_Well,
Nuclei_Intensity_IntegratedIntensity_Hoechst) %>%
collect() %>% {
ggplot(., aes(Nuclei_Intensity_IntegratedIntensity_Hoechst,
fill=interaction(Image_Metadata_Type, Image_Metadata_Well))) +
scale_x_log10() +
geom_density(alpha = 0.5) +
guides(fill = guide_legend(title = "Well"))
}
[Figure: density of Nuclei_Intensity_IntegratedIntensity_Hoechst (log scale) for wells A01, A23, and A24]
Next, let’s filter the set of features based on various measures of quality.
Remove features that have near-zero variance.
futile.logger::flog.info("start")
## INFO [2017-03-12 23:08:58] start
object <-
cytominer::select(
population = object,
variables = variables,
sample = object %>% filter(Image_Metadata_Well == "A01") %>% collect(),
operation = "variance_threshold"
)
## INFO [2017-03-12 23:09:10] excluded:
## INFO [2017-03-12 23:09:10] Cells_AreaShape_EulerNumber
## INFO [2017-03-12 23:09:10] Cells_Children_Cytoplasm_Count
## INFO [2017-03-12 23:09:10] Cytoplasm_AreaShape_EulerNumber
## INFO [2017-03-12 23:09:10] Nuclei_AreaShape_EulerNumber
## INFO [2017-03-12 23:09:10] Nuclei_Children_Cells_Count
## INFO [2017-03-12 23:09:10] Nuclei_Children_Cytoplasm_Count
variables <-
colnames(object) %>%
str_subset("^Nuclei_|^Cells_|^Cytoplasm_")
futile.logger::flog.info("end")
## INFO [2017-03-12 23:09:11] end
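As a rough illustration of what a variance-based filter does, the sketch below flags features whose variance in a sample is essentially zero. It is a simplified stand-in and not necessarily the rule that the variance_threshold operation applies; the helper low_variance_columns and its tol parameter are hypothetical, and the data are toy values.

```r
# Toy sample: three features, one of which is constant across cells.
toy_sample <- data.frame(
  Cells_AreaShape_Area        = c(100, 120, 95, 130),
  Cells_AreaShape_EulerNumber = c(1, 1, 1, 1),
  Nuclei_AreaShape_Area       = c(40, 42, 38, 45)
)

# Hypothetical helper: names of the columns whose variance is at or below tol.
low_variance_columns <- function(data, tol = 1e-8) {
  variances <- vapply(data, stats::var, numeric(1))
  names(variances)[variances <= tol]
}

excluded <- low_variance_columns(toy_sample)
kept <- toy_sample[, setdiff(names(toy_sample), excluded), drop = FALSE]

print(excluded)
print(names(kept))
```

Estimating the filter on a sample (a single well in the vignette) and then applying it to the whole population keeps the computation cheap when the population lives in a database.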
Filter based on correlation between features. The extracted morphological features contain several groups of highly correlated features. We want to prune the set of features, retaining only one feature from each of these highly correlated groups. The function correlation_threshold provides an approximate (greedy) solution to this problem. After excluding the features, no pair of features has a correlation greater than the cutoff indicated below.
futile.logger::flog.info("start")
## INFO [2017-03-12 23:09:11] start
object <-
cytominer::select(
population = object,
variables = variables,
sample = object %>% filter(Image_Metadata_Well == "A01") %>% collect(),
operation = "correlation_threshold",
cutoff = 0.95)
## INFO [2017-03-12 23:09:17] excluded:
## INFO [2017-03-12 23:09:17] Cells_AreaShape_MaxFeretDiameter
## INFO [2017-03-12 23:09:17] Cells_AreaShape_MeanRadius
## INFO [2017-03-12 23:09:17] Cells_Intensity_MedianIntensity_Hoechst
## INFO [2017-03-12 23:09:17] Cells_Intensity_MinIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Intensity_StdIntensityEdge_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Intensity_StdIntensityEdge_CellMask
## INFO [2017-03-12 23:09:17] Cells_Intensity_StdIntensityEdge_Hoechst
## INFO [2017-03-12 23:09:17] Cells_Intensity_UpperQuartileIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Intensity_UpperQuartileIntensity_CellMask
## INFO [2017-03-12 23:09:17] Cells_Location_CenterMassIntensity_X_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Location_CenterMassIntensity_X_CellMask
## INFO [2017-03-12 23:09:17] Cells_Location_CenterMassIntensity_X_Hoechst
## INFO [2017-03-12 23:09:17] Cells_Location_CenterMassIntensity_Y_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Location_CenterMassIntensity_Y_CellMask
## INFO [2017-03-12 23:09:17] Cells_Location_Center_X
## INFO [2017-03-12 23:09:17] Cells_Location_Center_Y
## INFO [2017-03-12 23:09:17] Cells_Location_MaxIntensity_X_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Location_MaxIntensity_X_CellMask
## INFO [2017-03-12 23:09:17] Cells_Location_MaxIntensity_X_Hoechst
## INFO [2017-03-12 23:09:17] Cells_Location_MaxIntensity_Y_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Location_MaxIntensity_Y_CellMask
## INFO [2017-03-12 23:09:17] Cells_Neighbors_FirstClosestObjectNumber_5
## INFO [2017-03-12 23:09:17] Cells_Neighbors_FirstClosestObjectNumber_Adjacent
## INFO [2017-03-12 23:09:17] Cells_Number_Object_Number
## INFO [2017-03-12 23:09:17] Cells_Parent_AllNuclei
## INFO [2017-03-12 23:09:17] Cells_Parent_Nuclei
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_FracAtD_CellMask_1of4
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_FracAtD_CellMask_2of4
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_FracAtD_CellMask_3of4
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_FracAtD_CellMask_4of4
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_MeanFrac_CellMask_1of4
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_MeanFrac_CellMask_2of4
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_MeanFrac_CellMask_3of4
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_MeanFrac_CellMask_4of4
## INFO [2017-03-12 23:09:17] Cells_Texture_AngularSecondMoment_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Contrast_Alexa568_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Contrast_CellMask_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Correlation_Alexa568_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Correlation_CellMask_10_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Correlation_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Correlation_CellMask_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Correlation_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_DifferenceEntropy_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Entropy_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Entropy_Alexa568_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Entropy_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Entropy_CellMask_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_InfoMeas1_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_InverseDifferenceMoment_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_InverseDifferenceMoment_Hoechst_10_0
## INFO [2017-03-12 23:09:17] Cells_Texture_InverseDifferenceMoment_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_InverseDifferenceMoment_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumAverage_Alexa568_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumAverage_CellMask_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumAverage_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumEntropy_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumEntropy_Alexa568_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumEntropy_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumEntropy_CellMask_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumEntropy_Hoechst_10_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumEntropy_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Variance_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Variance_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_Center_Y
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_MaxFeretDiameter
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_MinFeretDiameter
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_Perimeter
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_Zernike_3_3
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_Zernike_5_5
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_Zernike_7_7
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_Zernike_8_8
## INFO [2017-03-12 23:09:17] Cytoplasm_AreaShape_Zernike_9_9
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_MeanIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_MeanIntensity_CellMask
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_MedianIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_MinIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_StdIntensityEdge_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_UpperQuartileIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_UpperQuartileIntensity_CellMask
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_CenterMassIntensity_X_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_CenterMassIntensity_X_CellMask
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_CenterMassIntensity_X_Hoechst
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_CenterMassIntensity_Y_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_CenterMassIntensity_Y_CellMask
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_CenterMassIntensity_Y_Hoechst
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_Center_X
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_Center_Y
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_MaxIntensity_X_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_MaxIntensity_X_CellMask
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_MaxIntensity_X_Hoechst
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_MaxIntensity_Y_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_MaxIntensity_Y_CellMask
## INFO [2017-03-12 23:09:17] Cytoplasm_Location_MaxIntensity_Y_Hoechst
## INFO [2017-03-12 23:09:17] Cytoplasm_Number_Object_Number
## INFO [2017-03-12 23:09:17] Cytoplasm_Parent_Cells
## INFO [2017-03-12 23:09:17] Cytoplasm_Parent_Nuclei
## INFO [2017-03-12 23:09:17] Cytoplasm_RadialDistribution_MeanFrac_CellMask_2of4
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_AngularSecondMoment_Alexa568_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_AngularSecondMoment_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_DifferenceEntropy_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_Entropy_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_InverseDifferenceMoment_Hoechst_10_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_InverseDifferenceMoment_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_InverseDifferenceMoment_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumAverage_Alexa568_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumAverage_CellMask_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumAverage_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumEntropy_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumEntropy_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumEntropy_CellMask_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumEntropy_Hoechst_10_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumEntropy_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_SumEntropy_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Nuclei_AreaShape_Center_X
## INFO [2017-03-12 23:09:17] Nuclei_AreaShape_Center_Y
## INFO [2017-03-12 23:09:17] Nuclei_AreaShape_MaxFeretDiameter
## INFO [2017-03-12 23:09:17] Nuclei_AreaShape_MeanRadius
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_MaxIntensityEdge_Alexa568
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_MaxIntensityEdge_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_MaxIntensity_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_MeanIntensityEdge_Alexa568
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_MeanIntensityEdge_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_MedianIntensity_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_MinIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_MinIntensity_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_StdIntensity_Hoechst
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_UpperQuartileIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Nuclei_Intensity_UpperQuartileIntensity_Hoechst
## INFO [2017-03-12 23:09:17] Nuclei_Location_CenterMassIntensity_X_Alexa568
## INFO [2017-03-12 23:09:17] Nuclei_Location_CenterMassIntensity_X_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Location_CenterMassIntensity_X_Hoechst
## INFO [2017-03-12 23:09:17] Nuclei_Location_CenterMassIntensity_Y_Alexa568
## INFO [2017-03-12 23:09:17] Nuclei_Location_CenterMassIntensity_Y_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Location_CenterMassIntensity_Y_Hoechst
## INFO [2017-03-12 23:09:17] Nuclei_Location_Center_X
## INFO [2017-03-12 23:09:17] Nuclei_Location_Center_Y
## INFO [2017-03-12 23:09:17] Nuclei_Location_MaxIntensity_X_Alexa568
## INFO [2017-03-12 23:09:17] Nuclei_Location_MaxIntensity_X_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Location_MaxIntensity_X_Hoechst
## INFO [2017-03-12 23:09:17] Nuclei_Location_MaxIntensity_Y_Alexa568
## INFO [2017-03-12 23:09:17] Nuclei_Location_MaxIntensity_Y_CellMask
## INFO [2017-03-12 23:09:17] Nuclei_Location_MaxIntensity_Y_Hoechst
## INFO [2017-03-12 23:09:17] Nuclei_Neighbors_FirstClosestObjectNumber_1
## INFO [2017-03-12 23:09:17] Nuclei_Neighbors_SecondClosestObjectNumber_1
## INFO [2017-03-12 23:09:17] Nuclei_Number_Object_Number
## INFO [2017-03-12 23:09:17] Nuclei_Parent_AllNuclei
## INFO [2017-03-12 23:09:17] Nuclei_Texture_Entropy_CellMask_10_0
## INFO [2017-03-12 23:09:17] Nuclei_Texture_Entropy_Hoechst_10_0
## INFO [2017-03-12 23:09:17] Nuclei_Texture_InfoMeas2_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_AreaShape_MinFeretDiameter
## INFO [2017-03-12 23:09:17] Cells_Correlation_Correlation_Alexa568_CellMask
## INFO [2017-03-12 23:09:17] Cells_Intensity_MassDisplacement_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Intensity_MeanIntensityEdge_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Intensity_MADIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Center_Y
## INFO [2017-03-12 23:09:17] Cells_Location_CenterMassIntensity_Y_Hoechst
## INFO [2017-03-12 23:09:17] Cells_Neighbors_AngleBetweenNeighbors_5
## INFO [2017-03-12 23:09:17] Cells_Neighbors_FirstClosestDistance_5
## INFO [2017-03-12 23:09:17] Cells_Neighbors_SecondClosestDistance_5
## INFO [2017-03-12 23:09:17] Cells_Location_MaxIntensity_Y_Hoechst
## INFO [2017-03-12 23:09:17] Cells_Neighbors_SecondClosestObjectNumber_5
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_FracAtD_Alexa568_4of4
## INFO [2017-03-12 23:09:17] Cells_RadialDistribution_FracAtD_Hoechst_4of4
## INFO [2017-03-12 23:09:17] Cells_Texture_AngularSecondMoment_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_AngularSecondMoment_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Contrast_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_DifferenceEntropy_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_DifferenceEntropy_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_AngularSecondMoment_Hoechst_10_0
## INFO [2017-03-12 23:09:17] Cells_Texture_DifferenceEntropy_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_DifferenceVariance_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_DifferenceVariance_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_AngularSecondMoment_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Entropy_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_InverseDifferenceMoment_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_InverseDifferenceMoment_Alexa568_10_0
## INFO [2017-03-12 23:09:17] Cells_Texture_InverseDifferenceMoment_Alexa568_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Entropy_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumVariance_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_SumVariance_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cells_Texture_Variance_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Area
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Center_X
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Eccentricity
## INFO [2017-03-12 23:09:17] Cells_AreaShape_MajorAxisLength
## INFO [2017-03-12 23:09:17] Cells_AreaShape_MinorAxisLength
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Zernike_0_0
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Zernike_2_2
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Zernike_4_4
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Zernike_6_6
## INFO [2017-03-12 23:09:17] Cells_AreaShape_Zernike_9_7
## INFO [2017-03-12 23:09:17] Cytoplasm_Correlation_Correlation_Alexa568_CellMask
## INFO [2017-03-12 23:09:17] Cells_Intensity_LowerQuartileIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Intensity_LowerQuartileIntensity_CellMask
## INFO [2017-03-12 23:09:17] Cells_Intensity_MedianIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Intensity_MADIntensity_CellMask
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_MaxIntensityEdge_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_Intensity_MaxIntensityEdge_CellMask
## INFO [2017-03-12 23:09:17] Cells_Intensity_MedianIntensity_CellMask
## INFO [2017-03-12 23:09:17] Cells_Intensity_MinIntensityEdge_Alexa568
## INFO [2017-03-12 23:09:17] Cells_Intensity_MinIntensityEdge_CellMask
## INFO [2017-03-12 23:09:17] Cells_Intensity_MinIntensityEdge_Hoechst
## INFO [2017-03-12 23:09:17] Cells_Intensity_MinIntensity_CellMask
## INFO [2017-03-12 23:09:17] Cells_Intensity_MinIntensity_Hoechst
## INFO [2017-03-12 23:09:17] Cells_Intensity_MeanIntensity_Alexa568
## INFO [2017-03-12 23:09:17] Cytoplasm_RadialDistribution_FracAtD_Alexa568_1of4
## INFO [2017-03-12 23:09:17] Cytoplasm_RadialDistribution_FracAtD_Alexa568_2of4
## INFO [2017-03-12 23:09:17] Cytoplasm_RadialDistribution_FracAtD_Alexa568_4of4
## INFO [2017-03-12 23:09:17] Cytoplasm_RadialDistribution_MeanFrac_Alexa568_2of4
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_AngularSecondMoment_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_AngularSecondMoment_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_DifferenceEntropy_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_DifferenceEntropy_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_AngularSecondMoment_Hoechst_10_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_AngularSecondMoment_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_Contrast_Hoechst_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_Contrast_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_Entropy_Alexa568_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_Entropy_CellMask_3_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_DifferenceEntropy_Hoechst_10_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_DifferenceEntropy_Hoechst_5_0
## INFO [2017-03-12 23:09:17] Cytoplasm_Texture_InverseDifferenceMoment_Alexa568_3_0
## INFO [2017-03-12 23:09:18] Cytoplasm_Texture_InverseDifferenceMoment_CellMask_3_0
## INFO [2017-03-12 23:09:18] Cytoplasm_Texture_Entropy_Hoechst_10_0
## INFO [2017-03-12 23:09:18] Cytoplasm_Texture_Entropy_Hoechst_3_0
## INFO [2017-03-12 23:09:18] Cytoplasm_Texture_Entropy_Alexa568_5_0
## INFO [2017-03-12 23:09:18] Cytoplasm_Texture_SumVariance_Alexa568_3_0
## INFO [2017-03-12 23:09:18] Cytoplasm_Texture_SumVariance_CellMask_3_0
## INFO [2017-03-12 23:09:18] Cytoplasm_Texture_Variance_Alexa568_3_0
## INFO [2017-03-12 23:09:18] Cytoplasm_Texture_Variance_CellMask_3_0
## INFO [2017-03-12 23:09:18] Nuclei_AreaShape_MinFeretDiameter
## INFO [2017-03-12 23:09:18] Nuclei_Correlation_Correlation_Alexa568_CellMask
## INFO [2017-03-12 23:09:18] Cytoplasm_Intensity_IntegratedIntensityEdge_Hoechst
## INFO [2017-03-12 23:09:18] Cells_Intensity_MaxIntensity_Alexa568
## INFO [2017-03-12 23:09:18] Cytoplasm_Intensity_StdIntensityEdge_CellMask
## INFO [2017-03-12 23:09:18] Cells_Intensity_StdIntensity_Alexa568
## INFO [2017-03-12 23:09:18] Nuclei_Intensity_LowerQuartileIntensity_Alexa568
## INFO [2017-03-12 23:09:18] Cells_Intensity_StdIntensity_CellMask
## INFO [2017-03-12 23:09:18] Nuclei_Intensity_LowerQuartileIntensity_CellMask
## INFO [2017-03-12 23:09:18] Nuclei_Intensity_MeanIntensity_Alexa568
## INFO [2017-03-12 23:09:18] Nuclei_Intensity_MinIntensityEdge_Hoechst
## INFO [2017-03-12 23:09:18] Nuclei_Intensity_MedianIntensity_Alexa568
## INFO [2017-03-12 23:09:18] Nuclei_Intensity_MeanIntensity_CellMask
## INFO [2017-03-12 23:09:18] Nuclei_Intensity_MADIntensity_Hoechst
## INFO [2017-03-12 23:09:18] Nuclei_Texture_InfoMeas1_Alexa568_3_0
## INFO [2017-03-12 23:09:18] Nuclei_Texture_InfoMeas1_Hoechst_5_0
## INFO [2017-03-12 23:09:18] Nuclei_Texture_SumAverage_CellMask_3_0
## INFO [2017-03-12 23:09:18] Nuclei_Texture_Entropy_Alexa568_10_0
variables <-
colnames(object) %>%
str_subset("^Nuclei_|^Cells_|^Cytoplasm_")
futile.logger::flog.info("end")
## INFO [2017-03-12 23:09:18] end
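The idea behind greedy correlation pruning can be sketched in base R on toy data. This is one possible greedy strategy written for illustration, not the implementation behind correlation_threshold; the helper greedy_correlation_prune and the feature names are hypothetical.

```r
# Toy data: feature_b is almost a multiple of feature_a; feature_c is not.
toy <- data.frame(
  feature_a = c(1, 2, 3, 4, 5, 6),
  feature_b = c(2.1, 3.9, 6.05, 7.95, 10.1, 11.9),
  feature_c = c(3, 1, 4, 1, 5, 9)
)

# Hypothetical helper: repeatedly find the most correlated pair and drop the
# member that is, on average, more correlated with the remaining features.
# Assumes no constant columns (their correlations would be NA).
greedy_correlation_prune <- function(data, cutoff = 0.95) {
  kept <- names(data)
  repeat {
    if (length(kept) < 2) break
    corr <- abs(stats::cor(data[, kept, drop = FALSE]))
    diag(corr) <- 0
    if (max(corr) <= cutoff) break
    pair <- which(corr == max(corr), arr.ind = TRUE)[1, ]
    mean_corr <- rowMeans(corr)
    candidates <- kept[pair]
    to_drop <- candidates[which.max(mean_corr[candidates])]
    kept <- setdiff(kept, to_drop)
  }
  kept
}

kept_features <- greedy_correlation_prune(toy, cutoff = 0.95)
print(kept_features)
```

The result is approximate in the sense that a greedy pass does not guarantee the largest possible set of retained features, only that no retained pair exceeds the cutoff.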
We need to normalize the data so that:
- features are on the same scale
- plate-to-plate variation is reduced
The default for doing this is standardization. Here, we take the cells from control wells in the experiment (in the code below, well A01), compute normalization parameters from them (in this case, just the mean and s.d.), and then apply these to the whole dataset (i.e. the population).
futile.logger::flog.info("start")
## INFO [2017-03-12 23:09:18] start
object %<>% collect(n = Inf)
object <-
cytominer::normalize(
population = object,
variables = variables,
strata = c("Image_Metadata_Barcode"),
sample = object %>% filter(Image_Metadata_Well == "A01")
)
futile.logger::flog.info("end")
## INFO [2017-03-12 23:13:16] end
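To make the standardization step concrete, here is a minimal base R sketch on toy data: the mean and s.d. are computed from the control well of each plate and then applied to every cell on that plate. The helper standardize_by_plate is hypothetical and the values are invented; it only mimics the idea of normalizing within strata.

```r
# Toy population: two plates, where plate P2 is simply 10x brighter than P1.
toy_population <- data.frame(
  Image_Metadata_Barcode = c("P1", "P1", "P1", "P1", "P2", "P2", "P2", "P2"),
  Image_Metadata_Well    = c("A01", "A01", "B02", "B02", "A01", "A01", "B02", "B02"),
  Cells_AreaShape_Area   = c(10, 14, 20, 8, 100, 140, 200, 80),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per plate, estimate mean and s.d. from the control
# well, then standardize all cells on that plate with those parameters.
standardize_by_plate <- function(population, variable, control_well = "A01") {
  out <- population
  for (plate in unique(population$Image_Metadata_Barcode)) {
    on_plate   <- population$Image_Metadata_Barcode == plate
    is_control <- on_plate & population$Image_Metadata_Well == control_well
    center <- mean(population[[variable]][is_control])
    spread <- stats::sd(population[[variable]][is_control])
    out[[variable]][on_plate] <- (population[[variable]][on_plate] - center) / spread
  }
  out
}

normalized <- standardize_by_plate(toy_population, "Cells_AreaShape_Area")
print(normalized)
```

In this toy case the two plates end up with identical normalized values, which is the plate-to-plate variation being removed.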
In some cases, we may have features that have no variance at all (e.g. Euler number). If these features have not already been removed by this stage, the standardization step will result in all values for that feature being NA (because s.d. = 0). Let’s remove them:
futile.logger::flog.info("start")
## INFO [2017-03-12 23:13:16] start
object <-
cytominer::select(
population = object,
variables = variables,
operation = "drop_na_columns"
)
variables <-
colnames(object) %>%
str_subset("^Nuclei_|^Cells_|^Cytoplasm_")
futile.logger::flog.info("end")
## INFO [2017-03-12 23:13:21] end
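The effect of a zero s.d. is easy to reproduce on toy data. The sketch below standardizes a constant feature, which yields 0/0, and then drops columns that are entirely NA. It is a simplification for illustration; the drop_na_columns operation may use a different criterion for deciding which columns to drop.

```r
# Toy data: the Euler number feature is constant, so its s.d. is 0.
toy <- data.frame(
  Cells_AreaShape_Area        = c(10, 14, 20, 8),
  Cells_AreaShape_EulerNumber = c(1, 1, 1, 1)
)

# Standardize every column; the constant column becomes 0 / 0 = NaN.
standardized <- as.data.frame(
  lapply(toy, function(x) (x - mean(x)) / stats::sd(x))
)

# Keep only columns that are not entirely NA (is.na() is TRUE for NaN too).
all_na <- vapply(standardized, function(x) all(is.na(x)), logical(1))
cleaned <- standardized[, !all_na, drop = FALSE]

print(names(cleaned))
```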
We may want to transform the data so that assumptions we may later make about the data distribution are satisfied (e.g. Gaussianity). The default here is generalized_log.
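To give a feel for this kind of transformation, the sketch below implements one common form of the generalized logarithm, log((x + sqrt(x^2 + offset^2)) / 2). The helper name glog_sketch and the offset of 1 are assumptions made for illustration; the exact formula and defaults used by the package should be checked in its documentation.

```r
# Hypothetical helper: one common form of the generalized log.
# Unlike log(), it is defined for zero and negative inputs.
glog_sketch <- function(x, offset = 1) {
  log((x + sqrt(x^2 + offset^2)) / 2)
}

x <- c(-10, -1, 0, 1, 10, 1000)
y <- glog_sketch(x)

print(round(y, 3))
```

This matters after standardization, because standardized features are centered on zero and therefore contain negative values that a plain logarithm cannot handle. For large positive inputs the function behaves almost exactly like log(x).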
futile.logger::flog.info("start")
object <-
cytominer::transform(
population = object,
variables = variables
)
futile.logger::flog.info("end")
Now let’s summarize the data by grouping by well and computing averages.
futile.logger::flog.info("start")
## INFO [2017-03-12 23:13:21] start
profiles <-
cytominer::aggregate(
population = object,
variables = variables,
strata = c("Image_Metadata_Barcode", "Image_Metadata_Well"),
operation = "mean"
)
profiles %<>%
collect()
futile.logger::flog.info("end")
## INFO [2017-03-12 23:13:23] end
How many wells?
profiles %>%
count() %>%
knitr::kable(caption = "No. of wells")
| n |
|---|
| 384 |
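The per-well averaging can be illustrated with base R on toy data. This sketch uses stats::aggregate() rather than the package function, and the cell values are invented; it only shows how single-cell rows collapse into one profile per plate and well.

```r
# Toy single-cell data: two cells in well A01, three in well A02.
toy_cells <- data.frame(
  Image_Metadata_Barcode = rep("P1", 5),
  Image_Metadata_Well    = c("A01", "A01", "A02", "A02", "A02"),
  Cells_AreaShape_Area   = c(10, 14, 20, 8, 11),
  Nuclei_AreaShape_Area  = c(4, 6, 9, 3, 6),
  stringsAsFactors = FALSE
)

# One row per (barcode, well), holding the mean of each feature.
toy_profiles <- stats::aggregate(
  cbind(Cells_AreaShape_Area, Nuclei_AreaShape_Area) ~
    Image_Metadata_Barcode + Image_Metadata_Well,
  data = toy_cells,
  FUN = mean
)

print(toy_profiles)
```

The number of rows in the result equals the number of distinct plate and well combinations, which is why the real dataset collapses to one row per well.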
Let’s plot the relationship between a pair of variables from this summarized data:
p <-
ggplot(profiles, aes(Cells_Intensity_IntegratedIntensity_Hoechst, Nuclei_AreaShape_Area)) +
geom_point()
print(p)
[Figure: scatter plot of Cells_Intensity_IntegratedIntensity_Hoechst against Nuclei_AreaShape_Area, one point per well]