After the first round of QC, we visualise the data again with the same plots, this time using the filtered object.
library(here) # for reproducible paths
library(SingleCellExperiment)
library(scater) # for QC metrics and plots
library(org.Mm.eg.db) # to annotate the gene names
library(ggplot2) # for the 2D bin density plots
library(pals) # viridis colour palette
project <- "fire-mice"
sce <- readRDS(here("processed", project, "sce_QC_01.RDS"))
The object has 23012 genes and 30172 cells after filtering.
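The counts in the sentence above can be read directly from the object's dimensions (genes in rows, cells in columns). The following is a minimal sketch; `describe_dims` is a hypothetical helper that is not part of the original analysis, and a toy matrix stands in for `sce` so that the example runs on its own.

```r
# Hypothetical helper (not part of the original analysis): summarise the
# dimensions of a genes-by-cells object as a sentence.
describe_dims <- function(x) {
  sprintf(
    "The object has %d genes and %d cells after filtering",
    nrow(x), ncol(x)
  )
}

# Toy stand-in for `sce`: a 5-gene by 3-cell count matrix
toy_counts <- matrix(0L, nrow = 5, ncol = 3)
dims_sentence <- describe_dims(toy_counts)
print(dims_sentence)
```

Because `nrow()` and `ncol()` also work on a SingleCellExperiment, calling `describe_dims(sce)` in the report should give the sentence with the real numbers.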
plotColData(sce, x = "Sample", y = "sum", colour_by = "genotype") +
ggtitle("Total count") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
plotColData(sce, x = "Sample", y = "sum", colour_by = "genotype") +
scale_y_log10() + ggtitle("Total count log scale") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
plotColData(sce, x = "Sample", y = "detected", colour_by = "genotype") +
scale_y_log10() + ggtitle("Detected Genes") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
plotColData(sce, x = "Sample", y = "sum", colour_by = "chip") +
scale_y_log10() + ggtitle("Total count by batch") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
plotColData(sce, x = "Sample", y = "subsets_mt_percent", colour_by = "genotype") +
ggtitle("Mitochondrial percentage") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
The histograms below show, on the x axis, the total number of UMIs (library size) per cell, the number of detected genes per cell and the mitochondrial percentage per cell, with the number of cells for each value on the y axis.
hist(
sce$total,
breaks = 100
)
This object had already been filtered with the cell-calling algorithm from CellRanger, which is meant to remove empty droplets. A skewed distribution of total UMI counts, as in the plot above, is therefore expected.
hist(
sce$detected,
breaks = 100
)
hist(
sce$subsets_mt_percent,
breaks = 100
)
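To make explicit what these histograms display, the sketch below computes one without drawing it and tabulates the bins. It uses simulated, right-skewed toy values rather than the real `sce` columns; all the `toy_` names and the distribution parameters are assumptions chosen purely for illustration.

```r
# Toy data only: simulated right-skewed "library sizes" for 1000 cells
set.seed(1)
toy_sum <- round(rlnorm(1000, meanlog = 8, sdlog = 0.5))

# Compute the histogram without drawing it
h <- hist(toy_sum, breaks = 100, plot = FALSE)

# x axis: bin boundaries of the metric; y axis: number of cells per bin
bin_table <- data.frame(
  from  = head(h$breaks, -1),
  to    = h$breaks[-1],
  cells = h$counts
)
head(bin_table)

# Every cell is counted in exactly one bin
all_cells_binned <- sum(h$counts) == length(toy_sum)
print(all_cells_binned)
```

Note that `breaks = 100` is treated by `hist()` as a suggestion, so the actual number of bins may differ slightly from 100.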
plotColData(sce, x = "sum", y = "subsets_mt_percent", colour_by = "outlier")
plotColData(sce, x = "sum", y = "detected", colour_by = "outlier")
plotColData(sce, x = "sum", y = "detected", colour_by = "Sample")
Colour fill by density:
plotColData(sce, x = "sum", y = "subsets_mt_percent") +
geom_bin_2d(bins = c(100, 100)) +
scale_fill_gradientn(colours = viridis(200))
plotColData(sce, x = "sum", y = "detected") +
geom_bin_2d(bins = c(100, 100)) +
scale_fill_gradientn(colours = viridis(200))
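The density fill in the plots above comes from counting how many cells fall into each rectangle of a grid laid over the two axes. The sketch below reproduces that idea in base R with `cut()` and `table()`. It is only an illustration on simulated toy data, not the computation that `geom_bin_2d()` performs internally, and it uses 10 bins per axis instead of 100 to keep the result readable.

```r
# Toy data only: simulated per-cell metrics, not taken from the dataset
set.seed(1)
toy_sum      <- rlnorm(500, meanlog = 8, sdlog = 0.5)
toy_detected <- toy_sum^0.8 * exp(rnorm(500, sd = 0.1))

# Cut each axis into equal-width intervals and cross-tabulate:
# each entry is the number of cells falling in that rectangle
n_bins <- 10
bin_counts <- table(
  sum_bin      = cut(toy_sum, breaks = n_bins),
  detected_bin = cut(toy_detected, breaks = n_bins)
)

dim(bin_counts) # one row and one column per bin
sum(bin_counts) # equals the number of cells
```

Rectangles with high counts correspond to the brightest tiles in the density plots, which helps to see where most cells lie when points overlap heavily.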
sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19043)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United Kingdom.1252
## [2] LC_CTYPE=English_United Kingdom.1252
## [3] LC_MONETARY=English_United Kingdom.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United Kingdom.1252
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] pals_1.7 org.Mm.eg.db_3.14.0
## [3] AnnotationDbi_1.56.2 scater_1.23.5
## [5] ggplot2_3.3.5 scuttle_1.4.0
## [7] SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0
## [9] Biobase_2.54.0 GenomicRanges_1.46.1
## [11] GenomeInfoDb_1.30.1 IRanges_2.28.0
## [13] S4Vectors_0.32.3 BiocGenerics_0.40.0
## [15] MatrixGenerics_1.6.0 matrixStats_0.61.0
## [17] here_1.0.1
##
## loaded via a namespace (and not attached):
## [1] bitops_1.0-7 bit64_4.0.5
## [3] httr_1.4.2 rprojroot_2.0.2
## [5] tools_4.1.1 bslib_0.3.1
## [7] utf8_1.2.2 R6_2.5.1
## [9] irlba_2.3.5 vipor_0.4.5
## [11] DBI_1.1.2 colorspace_2.0-2
## [13] withr_2.4.3 tidyselect_1.1.1
## [15] gridExtra_2.3 bit_4.0.4
## [17] compiler_4.1.1 cli_3.2.0
## [19] BiocNeighbors_1.12.0 DelayedArray_0.20.0
## [21] labeling_0.4.2 sass_0.4.0
## [23] scales_1.1.1 stringr_1.4.0
## [25] digest_0.6.29 rmarkdown_2.11
## [27] XVector_0.34.0 dichromat_2.0-0
## [29] pkgconfig_2.0.3 htmltools_0.5.2
## [31] sparseMatrixStats_1.6.0 highr_0.9
## [33] maps_3.4.0 fastmap_1.1.0
## [35] rlang_1.0.1 rstudioapi_0.13
## [37] RSQLite_2.2.9 DelayedMatrixStats_1.16.0
## [39] farver_2.1.0 jquerylib_0.1.4
## [41] generics_0.1.2 jsonlite_1.7.3
## [43] BiocParallel_1.28.3 dplyr_1.0.8
## [45] RCurl_1.98-1.6 magrittr_2.0.2
## [47] BiocSingular_1.10.0 GenomeInfoDbData_1.2.7
## [49] Matrix_1.4-0 Rcpp_1.0.8
## [51] ggbeeswarm_0.6.0 munsell_0.5.0
## [53] fansi_1.0.2 viridis_0.6.2
## [55] lifecycle_1.0.1 stringi_1.7.6
## [57] yaml_2.2.2 zlibbioc_1.40.0
## [59] grid_4.1.1 blob_1.2.2
## [61] parallel_4.1.1 ggrepel_0.9.1
## [63] crayon_1.5.0 lattice_0.20-45
## [65] cowplot_1.1.1 Biostrings_2.62.0
## [67] beachmat_2.10.0 mapproj_1.2.8
## [69] KEGGREST_1.34.0 knitr_1.37
## [71] pillar_1.7.0 ScaledMatrix_1.2.0
## [73] glue_1.6.1 evaluate_0.14
## [75] vctrs_0.3.8 png_0.1-7
## [77] gtable_0.3.0 purrr_0.3.4
## [79] assertthat_0.2.1 cachem_1.0.6
## [81] xfun_0.29 rsvd_1.0.5
## [83] viridisLite_0.4.0 tibble_3.1.6
## [85] beeswarm_0.4.0 memoise_2.0.1
## [87] ellipsis_0.3.2