Datasets
The following datasets are included in this notebook. Each dataset was sub-sampled to a maximum of 200,000 cells.
| Individual Single-Cell RNA-seq PBMC Data from Arunachalam et al. | 49,139 | Aronow | Link |
| Individual Single-Cell RNA-seq PBMC Data from Schulte-Schrepping et al. | 90,957 | Aronow | Link |
| Individual Single-Cell RNA-seq PBMC Data from Lee et al. | 43,512 | Aronow | Link |
| Individual Single-Cell RNA-seq PBMC Data from Wilk et al. | 41,305 | Aronow | Link |
| Individual Single-Cell RNA-seq PBMC Data from Guo et al. | 14,783 | Aronow | Link |
| Azimuth meta-analysis from Adams et al. | 312,928 | Satija | Link |
| Azimuth meta-analysis from Delorey et al. | 106,043 | Satija | Link |
| Azimuth meta-analysis from Habermann et al. | 114,396 | Satija | Link |
| Large-scale single-cell analysis reveals critical immune characteristics of COVID-19 patients | 1,462,702 | Zemin Zhang | Link |
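Sub-sampling to a fixed cap can be sketched as follows. This is only an illustration: the notebook does not show how the sub-sampling was performed, so the `subsample_cells` helper, the seed, and the assumption that cells are stored as rows of a data frame are all hypothetical.

```r
# Hypothetical helper (not from the notebook): keep at most `max_cells`
# rows (cells), chosen uniformly at random.
subsample_cells <- function(cells, max_cells = 200000, seed = 1) {
  if (nrow(cells) <= max_cells) {
    return(cells)  # already under the cap, nothing to do
  }
  set.seed(seed)  # fixed seed so the same cells are drawn on every run
  keep <- sort(sample.int(nrow(cells), max_cells))
  cells[keep, , drop = FALSE]
}

# Toy example: 10 cells capped at 4
toy <- data.frame(cell_id = paste0("cell", 1:10), value = 1:10)
capped <- subsample_cells(toy, max_cells = 4)
untouched <- subsample_cells(toy, max_cells = 20)
```

Datasets below the cap are returned unchanged, so only the larger datasets would be affected.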
Read data
This loads all data from the datasets above. Run it only once per session.
all_data <- readRDS("all_data.rds")
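One way to make the "once per session" rule harder to break is to guard the read so that re-running the chunk does nothing when the object already exists. The sketch below assumes nothing about the contents of `all_data.rds`; the `load_once` helper is hypothetical and a temporary file stands in for the real data.

```r
# Hypothetical helper (not from the notebook): read an .rds file only if the
# named object is not already present in the given environment.
load_once <- function(name, path, envir = globalenv()) {
  if (!exists(name, envir = envir, inherits = FALSE)) {
    assign(name, readRDS(path), envir = envir)
  }
  invisible(get(name, envir = envir))
}

# Toy demonstration with a temporary file standing in for "all_data.rds"
tmp <- tempfile(fileext = ".rds")
saveRDS(list(a = 1), tmp)
demo_env <- new.env()
load_once("all_data", tmp, envir = demo_env)  # reads the file
saveRDS(list(a = 2), tmp)
load_once("all_data", tmp, envir = demo_env)  # skipped: object already exists
```

In the notebook itself the equivalent call would be `load_once("all_data", "all_data.rds")`.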
Parameters
Input the genes and cell types of interest.
Gene set 1 - minimal
#genes <- c("MALAT1", "CD68", "CD79A")
#cell_types <- c("B cell", "T cell", "neutrophil")
Gene set 2 - marker genes
genes <- c("MALAT1", "CD68", "CD79A", "CD8A", "FOXJ1", "GNLY", "CD4", "JCHAIN", "MNDA", "MUC5B", "KRT18")
cell_types <- c("B cell", "T cell", "neutrophil", "natural killer cell", "plasma cell", "macrophage", "ciliated cell")
Gene set 3 - constitutive genes
#genes <- c("LDHA", "PGK1", "ENO1", "SKP1", "TGFB1", "CDKN1C", "PKM", "CCND3", "ZBTB17", "MALAT1", "GAPDH", "ACTB", "RP9", "MT3", "MTR")
#cell_types <- c("B cell", "T cell", "neutrophil", "natural killer cell", "plasma cell", "macrophage", "ciliated cell", "epithelial cell")
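Hand-edited gene vectors can easily pick up repeated entries. A small check like the one below, run after choosing a gene set, removes them and reports what was dropped. The `dedupe_genes` helper is hypothetical and is not part of the notebook.

```r
# Hypothetical helper (not from the notebook): drop repeated entries and
# report which ones were removed, keeping the original order.
dedupe_genes <- function(genes) {
  repeated <- unique(genes[duplicated(genes)])
  if (length(repeated) > 0) {
    message("Removed repeated genes: ", paste(repeated, collapse = ", "))
  }
  unique(genes)
}

example_genes <- dedupe_genes(c("MALAT1", "CD68", "CD79A", "CD68"))
```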
Visual prototypes
Rerun this code whenever the genes or cell types are updated.
# Summarizing into median, mean, and percent cells
current_data <- get_expression(all_data, genes, cell_types)
Processing dataset 1 / 10
Processing dataset 2 / 10
Processing dataset 3 / 10
Processing dataset 4 / 10
Processing dataset 5 / 10
Processing dataset 6 / 10
Processing dataset 7 / 10
Processing dataset 8 / 10
Processing dataset 9 / 10
Processing dataset 10 / 10
dot_data <- current_data %>%
  group_by(Gene, cell_type) %>%
  summarise(
    mean = mean(Expression[Expression != 0]),
    median = median(Expression[Expression != 0]),
    percent_cells = sum(Expression != 0) / n()
  ) %>%
  ungroup()
# Appending summary to all data
current_data <- current_data %>%
  dplyr::filter(Expression != 0)
ids <- str_c(current_data$Gene, current_data$cell_type)
mapping <- setNames(dot_data$median, str_c(dot_data$Gene, dot_data$cell_type))
current_data$median <- mapping[ids]
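To see what the summary step produces, the same calculation can be run on a small made-up table. The toy values below are invented for illustration, and `get_expression` is not used because its definition is not shown here. Note that `mean` and `median` are taken over non-zero values only, and that `percent_cells` is a fraction between 0 and 1 rather than a percentage. The lookup key in this sketch uses a `|` separator, which is an assumption added to avoid accidental collisions between concatenated names.

```r
library(dplyr)

# Made-up expression values: one row per cell
toy <- data.frame(
  Gene = rep(c("CD79A", "CD68"), each = 4),
  cell_type = rep(c("B cell", "macrophage"), each = 4),
  Expression = c(0, 2, 4, 0, 1, 0, 5, 6)
)

# Same summary as above: statistics over non-zero cells, plus the
# fraction of cells with non-zero expression
toy_summary <- toy %>%
  group_by(Gene, cell_type) %>%
  summarise(
    mean = mean(Expression[Expression != 0]),
    median = median(Expression[Expression != 0]),
    percent_cells = sum(Expression != 0) / n(),
    .groups = "drop"
  )

# Attach each group's median to every row through a named-vector lookup
key <- paste(toy$Gene, toy$cell_type, sep = "|")
lookup <- setNames(
  toy_summary$median,
  paste(toy_summary$Gene, toy_summary$cell_type, sep = "|")
)
toy$median <- unname(lookup[key])
```

For CD79A in B cells the non-zero values are 2 and 4, so the mean and median are both 3 and half of the cells express the gene.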
Dot plot: 2-color gradient
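The plotting code for this prototype is not included here. As a rough sketch of what a 2-color dot plot over a table shaped like `dot_data` could look like, the example below uses ggplot2 with made-up values. The colors, size range, and axis assignment are assumptions, not the notebook's actual settings.

```r
library(ggplot2)

# Made-up summary table with the same columns as dot_data
toy_dot <- data.frame(
  Gene = c("CD79A", "CD79A", "CD68", "CD68"),
  cell_type = c("B cell", "macrophage", "B cell", "macrophage"),
  mean = c(3.0, 0.5, 0.4, 4.0),
  percent_cells = c(0.80, 0.05, 0.10, 0.75)
)

# Dot size encodes the fraction of expressing cells, colour the mean expression
dot_plot <- ggplot(toy_dot, aes(x = Gene, y = cell_type)) +
  geom_point(aes(size = percent_cells, colour = mean)) +
  scale_colour_gradient(low = "grey85", high = "darkblue") +
  scale_size(range = c(1, 8), limits = c(0, 1)) +
  labs(
    x = "Gene", y = "Cell type",
    colour = "Mean expression", size = "Fraction of cells"
  ) +
  theme_minimal()
```

Swapping `scale_colour_gradient()` for `scale_colour_gradient2()` with `low`, `mid`, `high`, and `midpoint` arguments would give a 3-color variant of the same plot.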
Dot plot: 3-color gradient
Dot plot: cellxgene colors
Density plot: 2-color gradient
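The plotting code for this prototype is likewise not included. The sketch below shows one possible reading of it, assuming the per-group `median` column appended earlier is what drives the fill color. The data are simulated, and the layout and colors are assumptions.

```r
library(ggplot2)

# Simulated non-zero expression values for two cell types
set.seed(1)
toy_expr <- data.frame(
  cell_type = rep(c("B cell", "T cell"), each = 50),
  Expression = c(rnorm(50, mean = 3, sd = 0.5), rnorm(50, mean = 1.5, sd = 0.5))
)

# Per-group median repeated on every row, mirroring current_data$median
toy_expr$median <- ave(toy_expr$Expression, toy_expr$cell_type, FUN = median)

# One density per cell type, filled by that group's median expression
density_plot <- ggplot(
  toy_expr,
  aes(x = Expression, fill = median, group = cell_type)
) +
  geom_density(alpha = 0.8) +
  facet_grid(cell_type ~ .) +
  scale_fill_gradient(low = "grey85", high = "darkblue") +
  labs(x = "Expression", y = "Density", fill = "Median expression") +
  theme_minimal()
```

Because the fill value is constant within each group, every density is drawn in a single color taken from the gradient.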
Density plot: 3-color gradient
