This is a complementary repository where I (Nadine Bestard) will work on bits and pieces of the HCA project from Luise Seeker. First objective: integrating/merging the tissue datasets, clustering, and cell type identification and annotation.
The analysis is summarised in main.R, a script that calls all the following scripts in the correct order.
Samples that were flagged as very poor quality in a previous analysis are excluded from the dataset in filter_bad_samples.R.
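As a rough illustration of this kind of filtering, the sketch below drops cells from flagged samples using only a per-cell metadata table. The column name `sample_id`, the sample labels and the `flagged_samples` vector are all hypothetical; the actual script may do this differently.

```r
# Hypothetical per-cell metadata: one row per cell.
meta <- data.frame(
  cell      = paste0("cell", 1:6),
  sample_id = c("S1", "S1", "S2", "S3", "S3", "S3"),
  stringsAsFactors = FALSE
)

# Hypothetical list of samples flagged as poor quality in an earlier analysis.
flagged_samples <- c("S2")

# Keep only the cells that do not come from a flagged sample.
keep <- !(meta$sample_id %in% flagged_samples)
meta_filtered <- meta[keep, , drop = FALSE]

# Cells remaining per sample.
print(table(meta_filtered$sample_id))
```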
After cells are deleted, the pipeline needs to be recomputed from feature selection onwards, because the deleted cells might have influenced the choice of variable genes and PCs. This is done in srt_dim_reduction.R, from feature selection to UMAP reduction.
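For orientation, a recomputation of this kind could look roughly like the sketch below. It assumes the Seurat package is installed and that `srt` is a Seurat object whose data have already been normalised. The function name `recompute_reduction`, the number of PCs and the clustering resolution are placeholders chosen for illustration, not the values used in this project.

```r
# Sketch only: re-runs the steps from feature selection to UMAP on a
# filtered Seurat object. `n_pcs` and `resolution` are placeholder values.
recompute_reduction <- function(srt, n_pcs = 30, resolution = 0.8) {
  if (!requireNamespace("Seurat", quietly = TRUE)) {
    stop("This sketch needs the Seurat package.")
  }
  dims <- seq_len(n_pcs)
  srt <- Seurat::FindVariableFeatures(srt)                  # feature selection
  srt <- Seurat::ScaleData(srt)                             # scale the variable genes
  srt <- Seurat::RunPCA(srt, npcs = n_pcs)                  # PCA on the variable genes
  srt <- Seurat::FindNeighbors(srt, dims = dims)            # neighbour graph on the PCs
  srt <- Seurat::FindClusters(srt, resolution = resolution) # clustering
  srt <- Seurat::RunUMAP(srt, dims = dims)                  # UMAP reduction
  srt
}
```

It would be called as `srt <- recompute_reduction(srt)` on the object produced by the filtering step.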
The next step is a “cluster quality control”. Small clusters formed by very few individuals are not of interest for this study; they are therefore removed in filter_bad_clusters.Rmd, which analyses the cluster composition. This file is written in R Markdown, which is easier to show as a report, and is stored in the “docs” directory.
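The idea behind this check can be sketched in base R on a per-cell metadata table. The column names `cluster` and `donor`, the toy data and the `min_donors` threshold are hypothetical; the real criteria are the ones described in the report.

```r
# Hypothetical per-cell metadata with a cluster label and a donor label.
meta <- data.frame(
  cluster = c("0", "0", "0", "0", "1", "1", "1", "2", "2", "2"),
  donor   = c("A", "B", "C", "A", "A", "B", "C", "D", "D", "D"),
  stringsAsFactors = FALSE
)

# Number of cells per cluster (rows) and donor (columns).
composition <- table(meta$cluster, meta$donor)

# Number of donors contributing at least one cell to each cluster.
n_donors <- rowSums(composition > 0)

# Largest fraction of each cluster that comes from a single donor.
max_donor_fraction <- apply(composition, 1, max) / rowSums(composition)

# Hypothetical rule: flag clusters made up of fewer than `min_donors` donors.
min_donors <- 2
bad_clusters <- names(n_donors)[n_donors < min_donors]

# Remove the cells that belong to the flagged clusters.
meta_filtered <- meta[!(meta$cluster %in% bad_clusters), , drop = FALSE]
```

In this toy table, cluster "2" is made up of cells from a single donor, so it is the only one flagged.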
Once these cells are deleted, the feature selection and the steps that follow it need to be recomputed, as the large change in the number of cells might affect the choice of variable genes and the clustering. srt_dim_reduction_after_clusterQC.R is a copy of the previous srt_dim_reduction.R that takes the filtered object as input. The cluster QC is then repeated in filter_bad_clusters_02.Rmd; only one small cluster is deleted, so we carry on without recomputing the variable genes again.
Then, to get a better understanding of the clusters before the first annotation, the object is plotted with clusters at other resolutions in srt_cluster_extra_resolution.Rmd, and the cluster QC is checked again in check_cluster_qc_resolution_1.3. Nothing is deleted this time.
The annotation is performed in annotation_01.Rmd. It also includes plots of differential expression at the cell type level.
From here the oligos and OPCs are separated so they can be analysed in depth in a separate project (subset_oligos_and_opcs.R).
sex_age_tissue_diff.Rmd is a document where we explore the differences between sex, age and tissue. The input for this document is the object obtained from the annotation in annotation_01.Rmd.
sex_age_tissue_diff_biology.Rmd looks at differential expression and markers for the clusters. IN PROGRESS