Multifaceted impact of specialized neuropeptide-intensive neurons on selective vulnerability in Alzheimer’s disease

Grubman et al. 2019 QC Workbook/Supplemental Methods

Author

Nicole Flack, DVM, PhD

Modified

April 26, 2024

Background

Original Publication of Dataset

Grubman et al. (2019) A single-cell atlas of entorhinal cortex from individuals with Alzheimer’s disease reveals cell-type-specific gene expression regulation, Nature Neuroscience

Exploratory case-control comparison to detect cell-type-specific gene expression patterns in human entorhinal cortex, a brain region that shows tau and amyloid pathology in the early stages of Alzheimer’s disease (AD). Associating transcriptional changes in specific cell subpopulations with disease.

Context of Current Project

Expression of neuropeptides (NPs) has been shown to be perturbed in AD (Li and Larsen 2023).

Specifically:

Cells (mostly GABAergic neurons) expressing a large number and variety of NPs (HNP cells) are lost in AD
Expression of certain AD-associated NPs (ADNPs) is reduced in disease
HNP cells coexpress ADNPs and genes associated with tau misfolding; have molecular signatures of protein misfolding

Working hypothesis: higher energy demand in HNP cells leads to greater metabolic stress during aging; coexpression of ADNPs contributes to protein misfolding cascade and selective vulnerability in early AD

Gaps in Knowledge

Functional characteristics of HNP neurons
Relationship between NP coexpression and other pathological features of AD
Spatiotemporal NP dynamics in course of disease severity

Preprint: Li, Flack, and Larsen (2023), Multifaceted impact of specialized neuropeptide-intensive neurons on the selective vulnerability in Alzheimer’s disease (bioRxiv)

Goals of Analysis

Interrogate function profiles of HNP neurons, NP expression dynamics in AD, ADNP trajectories in disease course, and the neuroanatomical distribution of HNP neurons.

Library and Raw Data Information

Samples

Biobanked human entorhinal cortex (EC) from age-matched individuals diagnosed with Alzheimer’s disease (AD; n=3 samples, 2 individuals pooled per sample) or control (ct; n = 3 samples, also pooled) based on tau & amyloid pathology.

Libraries

DroNc-Seq (Drop-seq + snRNA-seq), 10x Chromium platform

Method reference: Habib et al. (2017)

Library prep: Chromium Single Cell 3′ Library & Gel Bead Kit v2 (10x Genomics, PN-120237)

Sequencing

Illumina NextSeq 500

Raw Data

Raw data were processed with CellRanger. The resulting expression matrix (cellranger count) contained 33,694 features and 14,876 cells.

Genes with at least 2 transcripts in at least 10 cells were retained. Counts for one hundred postmortem interval-associated genes identified in cerebral cortex by Zhu et al. (2017) were removed. The resulting matrix contained 24,165 features across 14,853 cells and was used as input for this analysis.

Load Counts

Code

# load into merged Seurat object and add metadata
seurat_big <- Read10X_Multi_Directory(
  "data_grubman",
  default_10X_path = FALSE,
  secondary_path = "",
  merge = T,
) %>%
  # read in filters
  # only features found in at least 3 cells
  # only nuclei with at least 200 features
  CreateSeuratObject(.,
                     min.cells = 3,
                     min.features = 200) %>%
  # Disease Group
  AddMetaData(., str_extract(.@meta.data$orig.ident, "^(AD|ct)"),
              col.name = "type") %>%
  # Hemoglobin gene content
  AddMetaData(., PercentageFeatureSet(., "^HB[^(P)]"),
              col.name = "percent_hb") %>%
  # MT, ribo genes; 
  Add_Cell_QC_Metrics(seurat_object = ., species = "human")

Project(seurat_big) <- "AD_TCX"
seurat_big@meta.data$type <- factor(seurat_big@meta.data$type,
                                    levels = c("ct", "AD"))

seurat_big

An object of class Seurat 
24165 features across 14853 samples within 1 assay 
Active assay: RNA (24165 features, 0 variable features)
 2 layers present: counts, data

Initial QC

Preliminary Plots

Nuclei Per Sample/Group

Number of nuclei per sample varied between ~1,200 and 3,500. Nuclei per group (Alzheimer’s and control) were similar despite larger variation in nuclei per sample.

Code

# per sample
seurat_big@meta.data |>
  dplyr::rename(Group = type) |>
  ggplot(aes(x = orig.ident, fill = Group)) +
  geom_bar(position = "stack") +
  ggtitle("Nuclei Per Sample") +
  labs(x = "Sample", y = "Count") +
  theme_bw() +
  ggpubr::labs_pubr()  +
  scale_fill_paletteer_d("ggsci::default_nejm")

Code

# Stacked samples to show per group
seurat_big@meta.data |>
  dplyr::rename(Group = type,
                Sample = orig.ident) |>
  ggplot(aes(x = Group, fill = Sample)) +
  geom_bar(position = "stack",) +
  ggtitle("Nuclei Per Group") +
  labs(x = "Group", y = "Count") +
  theme_bw() +
  labs_pubr() +
  scale_fill_paletteer_d("ggsci::default_nejm")

Features Per Nucleus

Minimum feature count of 200 set at data read in

Code

# histograms
seurat_big@meta.data |> 
  ggplot(aes(x = nFeature_RNA)) +
  geom_histogram(binwidth = 100) +
  facet_wrap(~ orig.ident) +
  ggtitle("NGenes per Nucleus") +
  labs(x = "NGenes", y = "Count") +
  theme_bw() +
  labs_pubr()

Code

# violin plots
QC_Plots_Genes(seurat_big, group.by = "orig.ident")

UMIs Per Nucleus

Code

QC_Plots_UMIs(seurat_big, group.by = "orig.ident")

Code

quantile(seurat_big@meta.data$nCount_RNA,
         probs = seq(0,1,0.1))

     0%     10%     20%     30%     40%     50%     60%     70%     80%     90% 
  232.0   417.0   587.0   735.6   861.0   969.0  1101.0  1282.0  1589.0  2171.0 
   100% 
26674.0

Percent Mitochondrial, Ribosomal, Immediate-Early, Hemoglobin

High percentages of mitochondrial genes is less expected for nuclei; indicator of incomplete removal of cytoplasmic material. Kit was Nuclei EZ Prep (Sigma, NUC101).

Code

VlnPlot(seurat_big, features = c("percent_mito", "percent_ribo"))

Code

VlnPlot(seurat_big, features = c("percent_ieg", "percent_hb"))

Novelty/Complexity

Code

QC_Plots_Complexity(seurat_big)

Scatter Plots

Nuclei with high percentage of mitochondrial transcripts are not the same nuclei on the high end of genes/UMIs.

Code

QC_Plot_GenevsFeature(seurat_big, feature1 = "percent_mito")

Code

QC_Plot_UMIvsFeature(seurat_big, feature1 = "percent_mito")

Code

QC_Plot_UMIvsGene(seurat_big)

Filtering and Normalization

Based on plots, filter counts, then normalize, find variable features, scale, find neighbors, and cluster.

200 < Features < 2500
UMIs < 20000
Percent mitochondrial < 5
Percent ribosomal < 10
Percent immediate early genes < 5

Code

all_genes <- rownames(seurat_big)

seurat_filter <- seurat_big %>%
  subset(., subset = nFeature_RNA < 2500 &
           nFeature_RNA > 200 & # redundant in case read in option changed
           nCount_RNA < 20000 &
           percent_mito < 5 &
           percent_ribo < 10 &
           percent_ieg < 5
           ) %>%
  NormalizeData(.) %>%
  FindVariableFeatures(., 
                       selection.method = "vst",
                       nfeatures = 2000
                       # x.low.cutoff = 0.0125, 
                       # x.high.cutoff = 3,
                       # y.cutof = 0.5
                       ) %>%
  ScaleData(., features = all_genes) %>% 
  RunPCA(., features = VariableFeatures(object = .))

ElbowPlot(seurat_filter)

Code

seurat_filter <- seurat_filter %>%
  FindNeighbors(., dims = 1:9) %>%
  FindClusters(., resolution = 0.2) %>%
  RunUMAP(., dims = 1:9, seed.use = 100) %>%
  RunTSNE(., dims = 1:9, seed.use = 100)

Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 14268
Number of edges: 449460

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9520
Number of communities: 9
Elapsed time: 1 seconds

Code

seurat_filter

An object of class Seurat 
24165 features across 14268 samples within 1 assay 
Active assay: RNA (24165 features, 2000 variable features)
 3 layers present: counts, data, scale.data
 3 dimensional reductions calculated: pca, umap, tsne

Recheck Plots

Code

# combo QC
QC_Plots_Combined_Vln(seurat_filter)

Code

# nuclei per group
seurat_filter@meta.data |>
  dplyr::rename(Group = type,
                Sample = orig.ident) |>
  ggplot(aes(x = Group, fill = Sample)) +
  geom_bar(position = "stack",) +
  ggtitle("Nuclei Per Group") +
  labs(x = "Group", y = "Count") +
  theme_bw() +
  labs_pubr() +
  scale_fill_paletteer_d("ggsci::default_nejm")

Code

# UMI vs. gene
QC_Plot_UMIvsGene(seurat_filter,group.by = "orig.ident",shuffle_seed = T)

Clustering Visualization

UMAP

Code

# default appearance
UMAPPlot(seurat_filter, 
         label = TRUE)

Code

# split by sample
UMAPPlot(seurat_filter,
         label = TRUE,
         split.by = "orig.ident",
         ncol = 3)

Code

# split by group
UMAPPlot(seurat_filter,
         label = TRUE,
         split.by = "type")

tSNE

Code

# default appearance
TSNEPlot(seurat_filter, 
         label = TRUE)

Code

# split by sample
TSNEPlot(seurat_filter,
         label = TRUE,
         split.by = "orig.ident",
         ncol = 3)

Code

# split by group
TSNEPlot(seurat_filter,
         label = TRUE,
         split.by = "type")

Marker Genes

Code

markers <- FindAllMarkers(seurat_filter)

markers_top <- markers |> 
  group_by(cluster) |> 
  slice_head(n = 10) 

markers_neuronal <-
  inner_join(markers_df_brain, 
             markers_top, 
             by = c("markers" = "gene"),
                    relationship = "many-to-many") |> 
  group_by(cluster, cell) |> 
  mutate(n = n()) |> 
  ungroup() |> 
  group_by(cell) |> 
  arrange(desc(n)) |> 
  slice_head(n=1)

FeaturePlot(seurat_filter, features = markers_neuronal$markers,
            reduction = "umap")

References

Grubman, Alexandra, Gabriel Chew, John F. Ouyang, Guizhi Sun, Xin Yi Choo, Catriona McLean, Rebecca K. Simmons, et al. 2019. “A Single-Cell Atlas of Entorhinal Cortex from Individuals with Alzheimer’s Disease Reveals Cell-Type-Specific Gene Expression Regulation.” Nature Neuroscience 22 (12): 2087–97. https://doi.org/10.1038/s41593-019-0539-4.

Habib, Naomi, Inbal Avraham-Davidi, Anindita Basu, Tyler Burks, Karthik Shekhar, Matan Hofree, Sourav R Choudhury, et al. 2017. “Massively Parallel Single-Nucleus RNA-Seq with DroNc-Seq.” Nature Methods 14 (10): 955–58. https://doi.org/10.1038/nmeth.4407.

Li, Manci, Nicole Flack, and Peter A Larsen. 2023. “Multifaceted Impact of Specialized Neuropeptide-Intensive Neurons on the Selective Vulnerability in Alzheimer’s Disease.” http://dx.doi.org/10.1101/2023.11.13.566905.

Li, Manci, and Peter A. Larsen. 2023. “Single-Cell Sequencing of Entorhinal Cortex Reveals Widespread Disruption of Neuropeptide Networks in Alzheimer’s Disease.” Alzheimer’s & Dementia 19 (8): 3575–92. https://doi.org/10.1002/alz.12979.

Zhu, Yizhang, Likun Wang, Yuxin Yin, and Ence Yang. 2017. “Systematic Analysis of Gene Expression Patterns Associated with Postmortem Interval in Human Tissues.” Scientific Reports 7 (1). https://doi.org/10.1038/s41598-017-05882-0.