Bear in mind that this dataset is a subset of a larger dataset from which outliers have already been excluded.
## Scale for 'colour' is already present. Adding another scale for 'colour',
## which will replace the existing scale.
## Warning: Using size for a discrete variable is not advised.
Sample S1826 has more Oligo cells with low counts than the other samples. Requiring a minimum of 5000 UMI counts per cell will remove them. For the OPCs, a minimum of 2500 UMI counts is sufficient.
We can also apply a cutoff of 10% mitochondrial gene content to both cell types.
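The thresholds above can be sketched as a simple per-cell filter. This is an illustrative Python sketch, not the code used in this analysis: the `Cell` fields, the toy cells, and the choice to treat both cutoffs as inclusive are assumptions.

```python
from dataclasses import dataclass

# Thresholds taken from the text; names and layout are illustrative assumptions.
MIN_UMI = {"Oligo": 5000, "OPCs": 2500}
MAX_MT_PERCENT = 10.0


@dataclass
class Cell:
    barcode: str
    celltype: str      # "Oligo" or "OPCs"
    umi_total: int     # total UMI counts in the cell
    mt_percent: float  # percentage of counts from mitochondrial genes


def passes_qc(cell: Cell) -> bool:
    """Keep a cell that meets its cell type's UMI minimum and the shared mt cutoff."""
    return (cell.umi_total >= MIN_UMI[cell.celltype]
            and cell.mt_percent <= MAX_MT_PERCENT)


# Toy cells (hypothetical values) covering each outcome.
cells = [
    Cell("c1", "Oligo", 7200, 3.1),   # kept
    Cell("c2", "Oligo", 3100, 2.0),   # dropped: below the Oligo minimum of 5000
    Cell("c3", "OPCs", 3100, 2.0),    # kept: the OPC minimum is 2500
    Cell("c4", "OPCs", 9000, 14.5),   # dropped: above 10% mitochondrial content
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

Note that the same UMI total (3100) is rejected for an Oligo cell and accepted for an OPC, which is the point of using cell-type-specific thresholds.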
This is the dimensional reduction computed on the whole dataset. It is less accurate than the one we will compute later using only the oligos and OPCs.
##        OPCs Oligo
## S1823    94   250
## S1824    18   445
## S1825    30   366
## S1826    61  1318
## S1827    20   362
## S1828    58   341

##     OPCs Oligo
## WT   142  1061
## KO   139  2021
## [1] "before filtering"
## [1] 18827 3363
## [1] "after filtering"
## [1] 17234 3061
Result:
We quantify per-gene variation by computing the variance of the log-normalized expression values (referred to as “log-counts” for simplicity) for each gene across all cells in the population (A. T. L. Lun, McCarthy, and Marioni 2016). We use modelGeneVar(), which also corrects for the abundance of each gene by fitting a trend to the variance as a function of mean expression.
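To make the quantity concrete, here is a minimal Python sketch of the per-gene mean and variance of log-counts. It is not what modelGeneVar() does internally: the trend fit that corrects for abundance is deliberately omitted, and the toy counts, the size factors, and the log2(x + 1) transformation are assumptions made for illustration.

```python
import math
from statistics import mean, variance


def log_normalize(counts, size_factors):
    """Return log2(count / size_factor + 1) for each cell (assumed transformation)."""
    return [math.log2(c / s + 1) for c, s in zip(counts, size_factors)]


# Toy count matrix: one row per gene, one column per cell (hypothetical values).
counts = {
    "geneA": [0, 10, 0, 12],   # expression varies strongly between cells
    "geneB": [5, 5, 5, 5],     # constant expression
}
size_factors = [1.0, 1.0, 1.0, 1.0]  # assumed per-cell scaling factors

stats = {}
for gene, row in counts.items():
    logc = log_normalize(row, size_factors)
    # (mean log-count, variance of log-counts) across all cells
    stats[gene] = (mean(logc), variance(logc))

for gene, (m, v) in stats.items():
    print(f"{gene}: mean={m:.3f} var={v:.3f}")
```

A gene with constant expression has zero variance, while a gene expressed in only some cells has a large one. The raw variance still depends on the mean, which is why a trend-based correction is needed before ranking genes.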
The next step is to select the subset of highly variable genes (HVGs) to use in downstream analyses. The simplest HVG selection strategy is to take the top X genes with the largest values for the relevant variance metric. Here we select the top 15% of genes.
This leaves us with 1255 highly variable genes.
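The "top fraction" strategy can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function name `top_fraction` and the toy scores are hypothetical, and the fraction is taken over all scored genes, which may differ from how the original pipeline decides which genes are eligible.

```python
def top_fraction(scores, prop=0.15):
    """Return the names of the top `prop` fraction of genes, highest score first."""
    n = max(1, round(prop * len(scores)))          # number of genes to keep
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


# Toy variance metric for 20 genes (hypothetical): gene19 scores highest.
scores = {f"gene{i}": float(i) for i in range(20)}
hvgs = top_fraction(scores, prop=0.15)
print(hvgs)  # ['gene19', 'gene18', 'gene17']
```

Choosing a proportion rather than a fixed number keeps the selection comparable across datasets with different numbers of detected genes.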
Here we recompute the dimensional reduction so that it better fits the subsetted oligo data. This discards the batch-corrected reduced dimensions computed earlier, which will be recomputed.