Clustering

Set-up

The workflow and explanations below are from OSCA (Orchestrating Single-Cell Analysis with Bioconductor).

library(SingleCellExperiment)
library(here) # reproducible paths
library(scater) # plot dimensionality reductions
library(clustree) # show relationships between clusterings
library(Seurat) # clustering

project <- "fire-mice"
source(here("src/colours.R"))

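# load the clustered object if it already exists, otherwise start from the
# dimensionality-reduced object
if (!file.exists(here("processed", project, "sce_clusters_02.RDS"))) {
  sce <- readRDS(here("processed", project, "sce_dimred_02.RDS"))
} else {
  sce <- readRDS(here("processed", project, "sce_clusters_02.RDS"))
}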
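The chunks in this workflow repeat the same idea: reuse a saved RDS file when it exists and only compute when it does not. A minimal sketch of that pattern as a helper is shown here. The name `read_or_compute` and its arguments are hypothetical (not part of the original workflow), and a temporary file stands in for the project paths.

```r
# Hypothetical helper (not in the original workflow): read a cached RDS file
# if it exists, otherwise run `compute()`, save the result and return it.
read_or_compute <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

# Toy usage with a temporary file instead of here("processed", project, ...)
cache_path <- tempfile(fileext = ".RDS")
first <- read_or_compute(cache_path, function() seq_len(5))
# The second call reads the cache, so the compute function is never run
second <- read_or_compute(cache_path, function() stop("should not be recomputed"))
```

A helper like this keeps the existence check and the save step in one place, at the cost of hiding which file is being read at each step.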

Motivation

Clustering is an unsupervised learning procedure that is used in scRNA-seq data analysis to empirically define groups of cells with similar expression profiles. It is worth stressing the distinction between clusters and cell types. The former is an empirical construct while the latter is a biological truth (albeit a vaguely defined one). For this reason, questions like “what is the true number of clusters?” are usually meaningless. We can define as many clusters as we like, with whatever algorithm we like - each clustering will represent its own partitioning of the high-dimensional expression space, and is as “real” as any other clustering. It is helpful to realize that clustering, like a microscope, is simply a tool to explore the data. We can zoom in and out by changing the resolution of the clustering parameters, and we can experiment with different clustering algorithms to obtain alternative perspectives of the data.

With Seurat

# only run the first time; afterwards the saved clusterings are reloaded
if (!file.exists(here("processed", project, "srt_clusters_02.RDS"))) {
  srt <- as.Seurat(sce)
  # delete old clusterings so that stale results are not mixed with the new ones
  srt_cluster_names <-
    grep("originalexp_snn_res", names(srt@meta.data), value = TRUE)
  for (cluster in srt_cluster_names) {
    srt[[cluster]] <- NULL
  }
  # build the shared nearest-neighbour graph on the first 25 PCs
  srt <- FindNeighbors(srt, reduction = "PCA", dims = 1:25)
  # cluster at a range of resolutions
  srt <- FindClusters(srt,
                      resolution = c(0.01, 0.04, 0.05,
                                     seq(from = 0.1, to = 1, by = 0.1)))
  # keep only the clustering columns of the metadata
  srt_cluster_names <-
    grep("originalexp_snn_res", names(srt@meta.data), value = TRUE)
  srt_clusters_metadata <- srt[[srt_cluster_names]]

  saveRDS(srt_clusters_metadata,
          here("processed", project, "srt_clusters_02.RDS"))

  plot_list_func(srt,
                 col_pattern = "originalexp_snn_res",
                 plot_cols = cols,
                 reduction = "TSNE",
                 label_size = 4)

  DimPlot(srt, reduction = "TSNE",
          group.by = "originalexp_snn_res.1", label = TRUE)
} else {
  srt_clusters_metadata <-
    readRDS(here("processed", project, "srt_clusters_02.RDS"))
}

clustree(srt_clusters_metadata, prefix = "originalexp_snn_res.", edge_arrow = FALSE)
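The tree drawn by clustree can be complemented with a quick numeric summary of how the clusterings change with resolution. The sketch below uses a small made-up data frame in place of `srt_clusters_metadata` (the cell names and cluster assignments are invented for illustration) and relies only on base R.

```r
# Toy stand-in for srt_clusters_metadata: one column per resolution,
# one row per cell (all values are invented for illustration)
toy_metadata <- data.frame(
  originalexp_snn_res.0.1 = factor(c(0, 0, 0, 1, 1, 1)),
  originalexp_snn_res.0.5 = factor(c(0, 0, 1, 2, 2, 3)),
  originalexp_snn_res.1   = factor(c(0, 1, 2, 3, 4, 5)),
  row.names = paste0("cell", 1:6)
)

# Number of distinct clusters at each resolution
n_clusters <- vapply(toy_metadata, function(x) length(unique(x)), integer(1))
n_clusters

# Cross-tabulate two resolutions to see how low-resolution clusters split
split_table <- table(low = toy_metadata$originalexp_snn_res.0.1,
                     high = toy_metadata$originalexp_snn_res.0.5)
split_table
```

With the real metadata, the same two calls would show how many clusters each resolution produces and which clusters at one resolution are subdivided at the next.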

Save clusterings

if (!file.exists(here("processed", project, "sce_clusters_02.RDS"))) {
  # stop if the cell names differ or are in a different order
  stopifnot(identical(row.names(colData(sce)),
                      row.names(srt_clusters_metadata)))

  # delete old clusterings
  srt_cluster_names <-
    grep("originalexp_snn_res", names(srt_clusters_metadata), value = TRUE)
  for (cluster in srt_cluster_names) {
    colData(sce)[[cluster]] <- NULL
  }

  # add the new clusterings
  colData(sce) <- cbind(colData(sce), srt_clusters_metadata)

  # save
  saveRDS(sce, here("processed", project, "sce_clusters_02.RDS"))
}
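Binding columns with `cbind()` relies on both tables listing the cells in the same order. If the names match but the order differs, the clustering table can be reordered by cell name before binding. The sketch below illustrates this with plain data frames standing in for `colData(sce)` and `srt_clusters_metadata`. The toy values are invented, and this reordering step is an assumption about how one might handle a mismatch, not part of the original workflow.

```r
# Toy stand-ins (invented values): per-cell metadata and clusterings whose
# rows are in a different order
cell_metadata <- data.frame(
  sample = c("a", "a", "b"),
  row.names = c("cell1", "cell2", "cell3")
)
cluster_metadata <- data.frame(
  originalexp_snn_res.0.1 = factor(c(1, 0, 0)),
  row.names = c("cell3", "cell1", "cell2")
)

# Both tables must describe exactly the same set of cells
stopifnot(setequal(rownames(cell_metadata), rownames(cluster_metadata)))

# Reorder the clusterings to match the cell metadata, then verify
cluster_metadata <- cluster_metadata[rownames(cell_metadata), , drop = FALSE]
stopifnot(identical(rownames(cell_metadata), rownames(cluster_metadata)))

combined <- cbind(cell_metadata, cluster_metadata)
combined
```

Indexing by row name rather than by position means each cell keeps its own cluster label even when the two tables were saved in different orders.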

# plot Seurat clusterings
plot_list_func(sce, 
               col_pattern="originalexp_snn_res", 
               plot_cols = cols,
               reduction = "TSNE", 
               label_size = 0.5 
               )

## Scale for 'colour' is already present. Adding another scale for 'colour',
## which will replace the existing scale.

NadineBestard

2022-12-22
