About

This analysis shows the gene enrichment analysis of selected genes from PTEE, carried out with the ABAEnrichment package in R. The package includes gene expression data for five developmental stages of the human brain: prenatal, infant (0-2 yrs), child (3-11 yrs), adolescent (12-19 yrs), and adult (>19 yrs).

Required R packages

library(ABAEnrichment)
library(reshape2)
library(ggplot2)

Input genes

The gene list can be imported from the PTEE Gene analysis. Genes may be given in any of the following formats: Entrez-ID, Ensembl-ID, or gene-symbol.

Load input gene list into R

Genes <- read.delim2('genes.txt', header = FALSE, stringsAsFactors = FALSE)
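As a minimal sketch of the expected input, assuming a file with one identifier per line and no header row: the temporary file below is only a stand-in for genes.txt, and the three gene symbols are borrowed from the warning message shown later in this analysis purely for illustration.

```r
# Write a small stand-in gene list: one gene symbol per line, no header.
tmp <- tempfile(fileext = ".txt")
writeLines(c("ASPM", "KIF14", "FANCD2"), tmp)

# Read it the same way as genes.txt. Without a header, R names the
# single column V1, which is why the analysis refers to Genes$V1.
example_genes <- read.delim2(tmp, header = FALSE, stringsAsFactors = FALSE)
example_genes$V1
```

Setting stringsAsFactors = FALSE keeps the identifiers as plain character strings on R versions older than 4.0, where factors were the default.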

Get all brain regions that have direct or indirect expression data

Extract all brain structures from the Allen Brain Atlas (ABA) data by searching with an empty string.

all_regions <- get_id('')

Get expression data for the input genes in each age category

Expression_out <- get_expression(structure_ids=all_regions$structure_id,gene_ids=Genes$V1, dataset='5_stages')
## dataset_5_stages already exists in package environment: FALSE
##  Load dataset_5_stages...
##  Done.
## Warning in get_expression(structure_ids = all_regions$structure_id, gene_ids =
## Genes$V1, : No expression data for genes: SCN10A, RNU4ATAC, RMRP, ASPM, KIF14,
## FANCD2, OCLN, MIR17HG.
## Returning data from: 5_stages (RPKM from RNA-seq).

Get the maximum expression of each gene in each age category

# For each age category, take the maximum over brain regions (rows) for every gene (columns)
Expression_all <- cbind(
  apply(Expression_out$age_category_1, 2, max, na.rm = TRUE),
  apply(Expression_out$age_category_2, 2, max, na.rm = TRUE),
  apply(Expression_out$age_category_3, 2, max, na.rm = TRUE),
  apply(Expression_out$age_category_4, 2, max, na.rm = TRUE),
  apply(Expression_out$age_category_5, 2, max, na.rm = TRUE)
)
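To illustrate what the column-wise maximum does, here is a small sketch on a made-up matrix. The region names, gene names, and values are hypothetical and are not taken from the ABA data. It assumes the same layout as each age category matrix: brain regions in rows and genes in columns.

```r
# Toy matrix: rows are brain regions, columns are genes (values are made up).
# matrix() fills column by column, so gene_x = (1.0, 4.0, NA).
toy_expr <- matrix(c(1.0, 4.0, NA,
                     2.5, 0.5, 3.0),
                   nrow = 3,
                   dimnames = list(c("region_a", "region_b", "region_c"),
                                   c("gene_x", "gene_y")))

# MARGIN = 2 applies max() to each column, i.e. once per gene;
# na.rm = TRUE ignores regions with missing expression values.
toy_max <- apply(toy_expr, 2, max, na.rm = TRUE)
toy_max
```

Note that a gene whose column is entirely NA would produce -Inf with a warning, so it may be worth checking for such columns before plotting or testing.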

Reshape the data

colnames(Expression_all) <- c("Stage1-prenatal", "Stage2-infant","Stage3-child","Stage4-adolescent","Stage5-adult")
melt_data<-melt(Expression_all,varnames=c('Gene_Name', 'Stages'))

Box plot of expression values

ggplot(melt_data, aes(x=Stages, y=value,fill=Stages)) + geom_boxplot()

Compute one-way ANOVA test

The R function aov() is used to fit the model, and summary.aov() is used to summarize the analysis.

# Compute the analysis of variance
res.aov <- aov(value ~ Stages, data = melt_data)
# Summary of the analysis
summary(res.aov)
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## Stages        4  356.8   89.19   11.35 2.92e-08 ***
## Residuals   185 1453.3    7.86                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
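The F value in the table is the ratio of the two mean squares (here 89.19 / 7.86, approximately 11.35), and each mean square is the sum of squares divided by its degrees of freedom. The sketch below checks this relationship on simulated data and shows one way to pull the p-value out of the summary object. The data frame toy_aov_data and its group labels are hypothetical.

```r
# Simulated data: three groups of 10 values each (hypothetical, not ABA data).
set.seed(1)
toy_aov_data <- data.frame(
  Stages = factor(rep(c("A", "B", "C"), each = 10)),
  value  = c(rnorm(10, mean = 5), rnorm(10, mean = 3), rnorm(10, mean = 3))
)

toy_fit <- aov(value ~ Stages, data = toy_aov_data)

# summary() of an aov fit returns a list; its first element is the ANOVA table.
toy_tab <- summary(toy_fit)[[1]]

# F is the between-group mean square divided by the residual mean square.
mean_sq  <- toy_tab[["Mean Sq"]]
f_manual <- mean_sq[1] / mean_sq[2]
all.equal(f_manual, toy_tab[["F value"]][1])

# The p-value for the Stages term.
p_value <- toy_tab[["Pr(>F)"]][1]
p_value
```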

Tukey multiple pairwise-comparisons

As the ANOVA test is significant, we can compute Tukey HSD (Tukey Honest Significant Differences, R function: TukeyHSD()) to perform pairwise comparisons between the group means.

The function TukeyHSD() takes the fitted ANOVA as an argument.

TukeyHSD(res.aov)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = value ~ Stages, data = melt_data)
## 
## $Stages
##                                          diff       lwr       upr     p adj
## Stage2-infant-Stage1-prenatal     -3.11940848 -4.890794 -1.348023 0.0000254
## Stage3-child-Stage1-prenatal      -3.52405791 -5.295443 -1.752672 0.0000014
## Stage4-adolescent-Stage1-prenatal -3.35485282 -5.126238 -1.583467 0.0000048
## Stage5-adult-Stage1-prenatal      -3.60379663 -5.375182 -1.832411 0.0000007
## Stage3-child-Stage2-infant        -0.40464943 -2.176035  1.366736 0.9701717
## Stage4-adolescent-Stage2-infant   -0.23544435 -2.006830  1.535941 0.9961368
## Stage5-adult-Stage2-infant        -0.48438815 -2.255774  1.286997 0.9433816
## Stage4-adolescent-Stage3-child     0.16920508 -1.602180  1.940591 0.9989374
## Stage5-adult-Stage3-child         -0.07973872 -1.851124  1.691647 0.9999462
## Stage5-adult-Stage4-adolescent    -0.24894381 -2.020329  1.522442 0.9952075

-> diff: Difference between means of the two groups.

-> lwr, upr: The lower and the upper end point of the confidence interval at 95% (default).

-> p adj: p-value after adjustment for the multiple comparisons.
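In the table above, only the four comparisons involving Stage1-prenatal have an adjusted p-value below 0.05. With many groups it can help to filter such rows programmatically. The following is a minimal sketch on simulated data; the data frame toy_tukey_data, its group labels, and the 0.05 cutoff are assumptions made for illustration.

```r
# Simulated data: three groups of 10 values each (hypothetical, not ABA data).
set.seed(1)
toy_tukey_data <- data.frame(
  Stages = factor(rep(c("A", "B", "C"), each = 10)),
  value  = c(rnorm(10, mean = 5), rnorm(10, mean = 3), rnorm(10, mean = 3))
)

toy_tukey <- TukeyHSD(aov(value ~ Stages, data = toy_tukey_data))

# The result holds one matrix per factor, with columns diff, lwr, upr and "p adj".
pairs <- toy_tukey$Stages

# Keep only the comparisons whose adjusted p-value is below the cutoff;
# drop = FALSE keeps the result a matrix even if a single row remains.
significant <- pairs[pairs[, "p adj"] < 0.05, , drop = FALSE]
significant
```

Applied to the real fit, the same indexing would be used on TukeyHSD(res.aov)$Stages.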