Summary

I have compared the genes reported as differentially expressed between old and young OPCs in published rat and mouse DE analyses.

The rat dataset was obtained from “Metformin Restores CNS Remyelination Capacity by Rejuvenating Aged Stem Cells”, Neumann et al. The supplementary data provide the young-old DE analysis for a list of OPC-related genes. The publication also mentions other genes found to be differentially expressed; these have been added manually. In total, this gives 53 reported genes.

The mouse dataset was extracted from “Single-cell transcriptomic profiling of the aging mouse brain”, Ximerakis et al. They provide the full young-old DE analysis for different brain cell types; after filtering, 328 genes are differentially expressed in OPCs. Comparing both datasets, only 8 genes are differentially expressed between young and old in both mouse and rat.

Library

library(here)

Prepare Rat data

Import DE expression analysis from “Metformin Restores CNS Remyelination Capacity by Rejuvenating Aged Stem Cells”, Neumann et al., doi: 10.1016/j.stem.2019.08.015. Supplementary data Table S1.
This table contains Log2-Transformed RNA-Seq Expression Data for Characteristic OPC Genes between Young and Aged OPCs.

# Import; careful, the first line of the file is empty.
rat_opc <- read.csv(here("Data", "Rat_bulkRNAseq", "DE_OPCs_rat_TableS1.csv"), skip = 1, header = TRUE)

# Remove empty columns
rat_opc <- rat_opc[, 1:12]

Filter the genes that are differentially expressed (FDR < 0.05 and FC > 10% change).

# FDR < 0.05
rat_opc_de <- rat_opc[rat_opc$adj.p.value < 0.05, ]

# FC > 10% change, i.e. FC under 0.9 or above 1.1
# The FC is on a log2 scale, so the cut-offs are -0.152 and +0.138, rounded to abs(log2FC) > 0.14
rat_opc_de <- rat_opc_de[abs(rat_opc_de$log2FC) > 0.14, ]
 
# Check if there are 34 genes from the 48 ( corresponding to the 70.33% de indicated in the publication)
dim(rat_opc_de)[1]
## [1] 31
# There are only 31, even being less strict and only filtering by FDR.
# Genes differentially expressed ( only taking into account FDR)
dim(rat_opc[rat_opc$adj.p.value < 0.05, ])[1]
## [1] 31
# Not differentially expressed
dim(rat_opc[rat_opc$adj.p.value > 0.05, ])[1]
## [1] 17

The numbers are sensible but do not exactly match Figure 2A of the original paper: only 31 OPC genes are differentially expressed between young and old rats (instead of 34).
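
The log-scale cut-offs used for the 10% fold-change filter can be recomputed rather than typed in by hand. The following is a minimal sketch; `fc_cutoffs` is a hypothetical helper name, not part of the original analysis.

```r
# Convert a symmetric fractional change into log-scale cut-offs.
# pct:  fractional change (0.10 means 10%)
# base: base of the logarithm used by the fold-change column
fc_cutoffs <- function(pct, base = 2) {
  c(lower = log(1 - pct, base = base),
    upper = log(1 + pct, base = base))
}

cut_log2 <- fc_cutoffs(0.10, base = 2)       # log2 scale, as in the rat table
cut_ln   <- fc_cutoffs(0.10, base = exp(1))  # natural-log scale

round(cut_log2, 3)
round(cut_ln, 3)
```

Because the two cut-offs are not symmetric around zero, a single `abs()` threshold is an approximation: rounding to 0.14 on the log2 scale is slightly more permissive than 10% for decreases.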

#Save these 31 gene names 
rat_opc_de_genes <- as.vector(rat_opc_de$gene.name)

This dataset only includes characteristic OPC genes. In the paper they also validate by qRT-PCR differentially expressed genes that are not OPC specific (Ascl1, Enpp6, Cnp1, Sirt2), and they point out that “several genes” associated with cell senescence were also differentially expressed, listing 18 of them. Include those genes manually:

# Add the qRT-PCR validated genes non OPC specific
rat_de_genes <- c(rat_opc_de_genes, "Ascl1","Enpp6", "Cnp1", "Sirt2")

# Add the senescence genes
rat_de_genes <- c(rat_de_genes, "Srebf1", "Cdkn2a", "Il1b", "Aurka", "Tlr3",
                  "Slc16a7", "Matk", "Pik3r5", "Ckb", "Runx1", "Irf5", "Syk",
                  "Src", "Pkm", "Glb1", "Sgk1", "Tnfsf15", "Nek6")

# How many genes in total?
length(rat_de_genes)
## [1] 53
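
Since part of the list is added by hand, it may be worth checking that no gene is counted twice before relying on the total. A minimal sketch on a toy vector (the `genes` vector below is illustrative and stands in for `rat_de_genes`):

```r
# Toy vector standing in for rat_de_genes; "Sirt2" is repeated on purpose.
genes <- c("Apoe", "Sox11", "Sirt2", "Sirt2")

# Names that occur more than once
dup <- genes[duplicated(genes)]
dup

# De-duplicated list and its length
genes_unique <- unique(genes)
length(genes_unique)
```

If `dup` is empty for the real vector, the reported total is a count of distinct genes.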

Prepare Mouse data

Import data from “Single-cell transcriptomic profiling of the aging mouse brain”, Ximerakis et al. doi: 10.1038/s41593-019-0491-3. Supplementary Table 6: Differential gene expression data between young and old cell types, extracting only OPCs tab.

mouse <- read.csv(here("Data", "Mouse_scRNAseq", "mouse_2018_DE_TableS6_OPCs.csv"), header = TRUE)

Filter the genes that are differentially expressed (FDR < 0.05 and FC > 10% change).

#FDR < 0.05
mouse_de <- mouse[mouse$padj < 0.05, ]

# FC > 10 % change, this is FC under 0.9 or above 1.1
# The FC is in ln scale, then under -0.105 or above 0.095, rounded to abs(0.1)
mouse_de <- mouse_de[abs(mouse_de$logFC_Young_to_Old) > 0.1, ]

Check there are 321 genes, as specified in the publication summary (Table 8)

dim(mouse_de)[1]
## [1] 328
# 328 genes pass the filter, 7 more than the 321 expected from the publication.
# Save the gene names in a vector
mouse_de_genes <- as.vector(mouse_de$Gene)
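
The same two-step filter is applied to both species, so it could be wrapped in one function. The sketch below uses a hypothetical helper, `filter_de`, and a toy table; the use of `which()` is a design choice that drops rows with missing adjusted p-values instead of returning all-`NA` rows, which plain logical subsetting would do.

```r
# Keep rows passing an FDR cut-off and a minimum absolute log fold change.
# df:         data frame with one row per gene
# p_col:      name of the adjusted p-value column
# fc_col:     name of the log fold-change column
filter_de <- function(df, p_col, fc_col, fdr = 0.05, min_abs_fc = 0) {
  keep <- df[[p_col]] < fdr & abs(df[[fc_col]]) > min_abs_fc
  df[which(keep), , drop = FALSE]  # which() silently discards NA
}

# Toy data: B fails the FDR, C has a missing p-value, D has a small change.
toy <- data.frame(Gene  = c("A", "B", "C", "D"),
                  padj  = c(0.01, 0.20, NA, 0.03),
                  logFC = c(0.5, 0.9, 1.2, 0.05),
                  stringsAsFactors = FALSE)

toy_de <- filter_de(toy, "padj", "logFC", fdr = 0.05, min_abs_fc = 0.1)
toy_de
```

Under the column names used in this document, the calls would be along the lines of `filter_de(rat_opc, "adj.p.value", "log2FC", min_abs_fc = 0.14)` and `filter_de(mouse, "padj", "logFC_Young_to_Old", min_abs_fc = 0.1)`.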

Compare both species

Find the genes differentially expressed in both species.

# Save the common genes
both <- mouse_de_genes[mouse_de_genes %in% rat_de_genes]
# How many are there?
length(both) 
## [1] 8
# The 8 common genes are:
paste(both, collapse = ", ")
## [1] "Ptprz1, Atp1a2, Apoe, Fabp7, Vcan, Sox11, Ntm, Sirt2"

There are at least 8 genes differentially expressed between young and old in both rats and mice: Ptprz1, Atp1a2, Apoe, Fabp7, Vcan, Sox11, Ntm, Sirt2.
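
The overlap relies on exact, case-sensitive matching of gene symbols, which is why the count is a lower bound. A minimal sketch of how a case-insensitive comparison could be checked, using toy vectors (the capitalisation difference below is invented for illustration):

```r
# Toy vectors: "SOX11" differs from "Sox11" only by case.
rat_toy   <- c("Apoe", "SOX11", "Cnp1")
mouse_toy <- c("Apoe", "Sox11", "Vcan")

# Exact matching, equivalent to the %in% approach used above
common_exact <- intersect(mouse_toy, rat_toy)

# Case-insensitive matching, reported with the mouse spelling
common_ci <- mouse_toy[tolower(mouse_toy) %in% tolower(rat_toy)]

common_exact
common_ci
```

If both approaches return the same genes on the real vectors, capitalisation is not the reason for the small overlap.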

Are these differentially expressed in the same direction in both species?

In the mouse table the fold change is labelled young_to_old: a positive value means higher expression in old OPCs, and a negative value means higher expression in young OPCs (e.g. Apoe is shown in the interactive app as upregulated in old OPCs, with a lnFC of +0.28). The rat table uses the opposite sign convention (Apoe is around 18 units in old and around 17 in young, with a log2FC of -0.87).
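
To compare directions, both values can be put on the same scale and sign convention. The sketch below assumes the conventions described in the paragraph above (mouse: natural log, positive means higher in old; rat: log2, positive means higher in young) and uses the Apoe values quoted there; the helper names are hypothetical.

```r
# Convert a natural-log fold change to log2.
ln_to_log2 <- function(x) x / log(2)

# Flip the sign so that positive means "higher in old".
flip_direction <- function(x) -x

mouse_apoe_lnFC <- 0.28   # value quoted in the text, assumed ln scale
rat_apoe_log2FC <- -0.87  # value quoted in the text, assumed log2 scale

mouse_apoe <- ln_to_log2(mouse_apoe_lnFC)     # log2, positive = higher in old
rat_apoe   <- flip_direction(rat_apoe_log2FC) # log2, positive = higher in old

# Same direction in both species?
same_direction <- sign(mouse_apoe) == sign(rat_apoe)
same_direction
```

With these assumptions, Apoe is higher in old OPCs in both species, although the magnitudes differ.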

# Rat opcs
both_rat_de <- rat_opc_de[rat_opc_de$gene.name %in% both, c( "gene.name", "log2FC") ]
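# Sirt2 does not appear because it was added manually later. It is more expressed in rat aged OPCs.
# Append a placeholder row with rbind. The "0" is not a measured value, and
# because it is a string it turns the log2FC column into character.
sirt2 <- data.frame(gene.name=as.factor("Sirt2"), log2FC="0")
both_rat_de <- rbind(both_rat_de, sirt2)
# Sort by gene name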

both_rat_de <- both_rat_de[order(both_rat_de$gene.name), ]
both_rat_de
##    gene.name       log2FC
## 1       Apoe -0.870445627
## 27    Atp1a2  0.926158846
## 47     Fabp7  2.331338605
## 31       Ntm  0.988096997
## 38    Ptprz1  1.243937692
## 11     Sirt2            0
## 48     Sox11   3.59338157
## 35      Vcan   1.18060429

Sirt2 is not in Table S1 because it was added manually later, so the log2FC of 0 shown for it is a placeholder rather than a measured value. It is more expressed in rat aged OPCs.
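
Binding a row whose `log2FC` is the string `"0"` converts the whole column to character, which is visible in the printed table. One alternative, sketched here on a toy table, is to use `NA_real_` as the placeholder so the column stays numeric and the missing value is explicit. The toy values are rounded and purely illustrative.

```r
# Toy table standing in for both_rat_de
rat_toy <- data.frame(gene.name = c("Apoe", "Sox11"),
                      log2FC    = c(-0.87, 3.59),
                      stringsAsFactors = FALSE)

# Placeholder row with a missing, but numeric, fold change
sirt2_row <- data.frame(gene.name = "Sirt2",
                        log2FC    = NA_real_,
                        stringsAsFactors = FALSE)

rat_toy <- rbind(rat_toy, sirt2_row)

# The column is still numeric, so it can be sorted or compared by sign
is.numeric(rat_toy$log2FC)
rat_toy
```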

# Mouse opcs
both_mouse_de <- mouse_de[mouse_de$Gene %in% both, c( "Gene", "logFC_Young_to_Old") ]
both_mouse_de <- both_mouse_de[order(both_mouse_de$Gene), ]
both_mouse_de
##       Gene logFC_Young_to_Old
## 68    Apoe          0.2006911
## 42  Atp1a2          0.2280063
## 86   Fabp7         -0.2252232
## 202    Ntm          0.1448372
## 22  Ptprz1          0.2226761
## 346  Sirt2         -0.1329073
## 180  Sox11         -0.1409136
## 110   Vcan          0.2213516
# Both in a single table
both_de <- data.frame(Gene=both_mouse_de$Gene, mouse_logFC = both_mouse_de$logFC_Young_to_Old, rat_log2FC = both_rat_de$log2FC)
both_de 
##     Gene mouse_logFC   rat_log2FC
## 1   Apoe   0.2006911 -0.870445627
## 2 Atp1a2   0.2280063  0.926158846
## 3  Fabp7  -0.2252232  2.331338605
## 4    Ntm   0.1448372  0.988096997
## 5 Ptprz1   0.2226761  1.243937692
## 6  Sirt2  -0.1329073            0
## 7  Sox11  -0.1409136   3.59338157
## 8   Vcan   0.2213516   1.18060429
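
The combined table above pairs the two species by row position, which only works because both tables were sorted the same way. A keyed join avoids that dependency. This is a minimal sketch with three of the genes, using `merge()` and toy data frames whose names are illustrative:

```r
# Toy tables in deliberately different row order
mouse_toy <- data.frame(Gene        = c("Vcan", "Apoe", "Sirt2"),
                        mouse_logFC = c(0.2213516, 0.2006911, -0.1329073),
                        stringsAsFactors = FALSE)
rat_toy <- data.frame(gene.name  = c("Apoe", "Vcan"),
                      rat_log2FC = c(-0.870445627, 1.18060429),
                      stringsAsFactors = FALSE)

# Join on the gene symbol; all.x = TRUE keeps mouse genes without a rat value
merged <- merge(mouse_toy, rat_toy,
                by.x = "Gene", by.y = "gene.name", all.x = TRUE)
merged
```

Genes missing from the rat table, such as Sirt2 here, come out with `NA` instead of needing a manual placeholder row.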

Genes in rat not in mouse

# Save rat genes that are not differentially expressed in the mouse dataset
rat_only <- rat_de_genes[!(rat_de_genes %in% mouse_de_genes)]
# How many are there? (should be 53 - 8)
length(rat_only)
## [1] 45

Are some of these genes not present in the first mouse dataset?

# Check how many of these genes are simply not present in the initial mouse dataset
rat_not_bigmouse <- rat_de_genes[!(rat_de_genes %in% mouse$Gene)]
# How many are there?
length(rat_not_bigmouse)
## [1] 7
# Which ones?
rat_not_bigmouse
## [1] "Cnp1"    "Il1b"    "Pik3r5"  "Runx1"   "Irf5"    "Syk"     "Tnfsf15"

There are 45 genes that do not appear as differentially expressed in the mouse dataset even though they were highlighted in the rat one. Of these, 7 are not present at all in the full mouse table. All 7 are among the genes I added manually. This seemed too coincidental, but I checked the spelling a couple of times and could not find any misspelling.
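
Beyond spelling, a gene can be missing because the other table uses a different symbol or alias. One way to look for near matches is to search the mouse symbols for the stem of each missing name. The sketch below uses toy vectors; the mouse symbols in it are hypothetical and do not reflect the real contents of the table.

```r
# Hypothetical mouse symbols and two of the missing rat names
mouse_genes_toy <- c("Cnp", "Il1b", "Runx1t1", "Apoe")
missing_toy     <- c("Cnp1", "Runx1")

# Strip trailing digits to get a stem, e.g. "Cnp1" -> "Cnp"
stems <- sub("[0-9]+$", "", missing_toy)

# For each stem, list mouse symbols that start with it (case-insensitive)
candidates <- lapply(stems, function(s) {
  grep(paste0("^", s), mouse_genes_toy, ignore.case = TRUE, value = TRUE)
})
names(candidates) <- missing_toy
candidates
```

Any candidate found this way still needs to be checked by hand, since a shared prefix does not guarantee that two symbols refer to the same gene.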

Genes present in mouse, not in rat

 

Keep in mind that the rat DE gene list is not exhaustive. Is it worth obtaining or recomputing the full DE analysis for rat?

mouse_only <- mouse_de_genes[!(mouse_de_genes %in% rat_de_genes)]
length(mouse_only)
## [1] 320
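
As a consistency check, the shared and species-specific sets should add up to the size of each list (here 8 + 320 = 328 for mouse and 8 + 45 = 53 for rat). A minimal sketch of that check with `intersect()` and `setdiff()` on toy vectors:

```r
# Toy vectors standing in for rat_de_genes and mouse_de_genes
rat_toy   <- c("Apoe", "Sox11", "Cnp1")
mouse_toy <- c("Apoe", "Sox11", "Vcan", "Ntm")

common         <- intersect(mouse_toy, rat_toy)
mouse_only_toy <- setdiff(mouse_toy, rat_toy)
rat_only_toy   <- setdiff(rat_toy, mouse_toy)

# Each list splits exactly into "shared" plus "specific".
# unique() is needed because intersect() and setdiff() drop duplicates.
mouse_ok <- length(common) + length(mouse_only_toy) == length(unique(mouse_toy))
rat_ok   <- length(common) + length(rat_only_toy) == length(unique(rat_toy))
c(mouse_ok, rat_ok)
```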