Identifying Protein Abnormalities in Down Syndrome Mice

Introduction

Approximately 1 in 700 babies are born with Down Syndrome, a prevalence that has been steadily increasing according to the CDC (Facts About Down Syndrome, 2020). Down Syndrome is a chromosomal disorder caused by an extra copy of chromosome 21 (Hsa21). It changes how the body and brain develop, often resulting in mental and physical challenges. These challenges can include “limitations in language and communication skills, cognition, and non-verbal problem solving” (Munir, n.d.), and children with Down Syndrome generally learn and progress more slowly than other children. Hsa21 encodes more than 500 genes, and it is unknown how many of those genes contribute to learning disabilities; as of 2015, functional information was available for fewer than half of the Hsa21 genes (Higuera et al., 2015). Rather than investigate the function and effects of each individual gene, previous research has investigated disruptions in pathways that are essential to learning and memory, and considered treatments that correct those disruptions and boost learning capabilities (Sturgeon et al., 2012). One such treatment currently under investigation is Memantine, a drug often prescribed to Alzheimer’s patients. To learn about Memantine’s effects on essential learning pathways, researchers identified 77 different proteins that produced detectable signals in the cerebral cortex of mice exposed to a behavioral test that assesses their ability to learn. These protein expression levels were sampled in a 2014 study of regular mice (control), mice with Down Syndrome, and mice with Down Syndrome given Memantine treatment.

The goal of this analysis is to implement multiple testing to determine statistically significant differences between protein expression levels in the three identified groups of mice using the 2014 study data. Results should identify the effects of Memantine on protein expression levels and provide recommendations for proteins to target in future treatments. Results will also be compared with the published results from the original study, which used different analysis methods to identify differences between groups.

Methods

Data Description and Cleaning

The full data set consists of 77 protein expression level measurements and learning outcomes on 72 mice, 38 of which were control mice and 34 of which were Down Syndrome (DS) mice. Each mouse was sampled 15 times. Of the 72 mice, 37 (19 control, 18 DS) were not given the behavioral test (Shock Context treatment), and learning outcomes did not differ between control and DS mice in that subset. This indicates that the Shock Context treatment is required to elicit learning outcomes in both control and DS mice; all samples that did not receive the Shock Context were discarded. Of the remaining 35 mice, three groups were of interest: control mice injected with saline (9), DS mice injected with saline (7), and DS mice injected with Memantine (9). The remaining group, control mice injected with Memantine (10), was not of interest because its learning outcome did not differ from that of the control mice injected with saline. Learning outcomes were not included in the original data published on the UCI Machine Learning Repository, which was intended for unsupervised learning exercises; however, they were published in the 2015 paper, Self-Organizing Feature Maps Identify Proteins Critical to Learning in a Mouse Model of Down Syndrome (Higuera et al., 2015). There are three different Learning Outcomes of interest: Normal, Failed, and Rescued. A Rescued Learning Outcome implies that the mouse was unable to learn before treatment and able to learn after treatment. There was no difference in learning outcome within groups of mice.

Table 1: Data of interest for analysis. One mouse is considered equal to one sample.

	Mouse Condition	Number of Mice	Treatment	Learning Outcome
Group 1	Control	9	Saline	Normal
Group 2	Down Syndrome	7	Saline	Failed
Group 3	Down Syndrome	9	Memantine	Rescued

Each mouse sample contains the protein expression levels for 77 different proteins that produced detectable signals in the cerebral cortex. There were approximately 15 missing values across 375 samples (each mouse was sampled 15 times). For further analysis, missing values were replaced by the average protein expression level for mice in the same treatment group. For example, missing values for protein P38 for Down Syndrome mice that received Memantine were replaced with the average P38 expression level for DS mice that received Memantine.
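
As a minimal sketch of this group-mean imputation, the base R snippet below fills gaps in a single protein column. The toy values, the group labels, and the helper name impute_group_mean are illustrative assumptions and are not taken from the study data.

```r
# Toy data: two hypothetical treatment groups, one protein column with gaps.
toy <- data.frame(
  group = c("DS_memantine", "DS_memantine", "DS_memantine", "control", "control"),
  P38   = c(0.30, NA, 0.34, 0.40, NA)
)

# Replace each NA with the mean of the observed values in the same group.
impute_group_mean <- function(x, g) {
  ave(x, g, FUN = function(v) {
    v[is.na(v)] <- mean(v, na.rm = TRUE)
    v
  })
}

toy$P38_filled <- impute_group_mean(toy$P38, toy$group)
print(toy)
```

The appendix uses na.aggregate from the zoo package on each group separately, which should give the same result; the base R version is shown only to make the logic explicit.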

Statistical Methods

A two-part hypothesis testing procedure using logistic regression models will be implemented to determine (1) which protein expression levels are significantly different between DS and control mice given saline, and (2) which protein expression levels are significantly different between DS mice given Memantine and control mice given saline. In both testing scenarios, the learning outcome will be treated as the response, and each protein expression level will be tested individually to see if it contributes to that response. Logistic regression, a generalized linear model, was chosen because the response variable for each hypothesis is binary and thus is not suitable for a linear model. To meet the independence assumption associated with regression models, each mouse will be represented by the average of all 15 samples taken for that mouse.
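
The averaging step can be sketched as follows, assuming a long-format table with one row per measurement. The mouse IDs and values here are made up for illustration, and only two measurements per mouse are shown instead of 15.

```r
# Toy long-format data: 3 hypothetical mice, 2 repeated measurements each.
long <- data.frame(
  ID   = rep(c("m1", "m2", "m3"), each = 2),
  NR2A = c(4.0, 4.4, 3.4, 3.6, 4.1, 4.3),
  APP  = c(0.38, 0.40, 0.45, 0.43, 0.39, 0.41)
)

# Collapse the repeated measurements to one row per mouse (column means).
per_mouse <- aggregate(. ~ ID, data = long, FUN = mean)
print(per_mouse)
```

After this step each row is a single mouse, so the rows passed to the regression models are no longer repeated measurements of the same animal.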

The two separate testing scenarios allow direct comparison between the protein expression levels that are significant when DS mice were not given Memantine vs. the control and those that are significant when the DS mice were given Memantine vs. the control. In each test, the goal is to determine whether each protein expression level is associated with the learning outcome. This will be implemented by testing each protein against a null model, which assumes that the Learning Outcome response is not determined by any protein expression level and contains only an intercept (written as ~ 1 in R). The null model will be tested against a model with a single protein expression level included, using the anova function in R, which for logistic regression models compares the change in deviance between the two models. The calculated chi-squared test statistic will be used to determine if the single-protein model is significantly different from the null model at the \(\alpha = 0.05\) significance level. A significant test indicates that the single-protein model is necessary to predict the learning outcome and that the protein should be included in the model. This process will be repeated in both hypothesis testing scenarios: first with data from DS and control mice both given saline, and second with data from DS mice given Memantine and control mice given saline.
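
To make the comparison concrete, here is a small sketch of one null vs. single-protein test on made-up data. The outcome and protein values are hypothetical; the snippet only illustrates that the p-value reported by anova is the upper tail of a chi-squared distribution (1 degree of freedom) evaluated at the drop in deviance.

```r
# Hypothetical per-mouse data: 8 mice with outcome 1 and 8 with outcome 0.
toy <- data.frame(
  outcome = rep(c(1, 0), each = 8),
  protein = c(4.5, 4.1, 4.6, 3.9, 4.3, 3.6, 4.4, 4.0,
              3.5, 3.9, 3.3, 4.2, 3.7, 3.4, 3.8, 3.6)
)

# Null model: intercept only. Alternative: intercept plus one protein.
fit_null    <- glm(outcome ~ 1, family = binomial, data = toy)
fit_protein <- glm(outcome ~ protein, family = binomial, data = toy)

# anova() on two nested glm fits reports the drop in deviance and,
# with test = "Chisq", the corresponding chi-squared p-value.
tab     <- anova(fit_null, fit_protein, test = "Chisq")
p_anova <- tab$`Pr(>Chi)`[2]

# The same p-value computed by hand from the deviance difference (1 df).
lr_stat  <- deviance(fit_null) - deviance(fit_protein)
p_manual <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

print(c(anova = p_anova, manual = p_manual))
```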

Table 2. Testing protocol, to be repeated for all 77 proteins for each hypothesis

	Hypothesis	Data	Significance Testing Method
	1	Control Mice (saline) & DS Mice (saline)	anova(null, single-protein model)
	2	Control Mice (saline) & DS Mice (Memantine)	anova(null, single-protein model)

For each hypothesis, the p-values from testing all 77 proteins against the null model will be stored, and the Holm procedure will be implemented to control the Family-Wise Error Rate across these tests. Should the analysis fail to find significant differences in protein expression levels using the Holm adjustment, the less conservative Benjamini-Hochberg procedure (which controls the False Discovery Rate instead of the Family-Wise Error Rate) will be implemented to determine which p-values should remain significant. The significance level \(\alpha = 0.05\) will be applied to the adjusted p-values to determine which proteins still significantly contribute to the Learning Outcome.
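
The difference between the two adjustments can be seen on a handful of made-up p-values. The protein labels and p-values below are hypothetical and chosen only so that the two procedures disagree; p.adjust is the same base R function used in the appendix.

```r
# Five hypothetical unadjusted p-values, one per protein.
p_raw <- c(protA = 0.001, protB = 0.012, protC = 0.020,
           protD = 0.030, protE = 0.600)

# Holm controls the Family-Wise Error Rate: the i-th smallest p-value is
# multiplied by (m - i + 1), then made monotone with a running maximum.
p_holm <- p.adjust(p_raw, method = "holm")

# Benjamini-Hochberg controls the False Discovery Rate: the i-th smallest
# p-value is multiplied by m / i, then made monotone from the largest down.
p_bh <- p.adjust(p_raw, method = "BH")

print(round(rbind(raw = p_raw, holm = p_holm, BH = p_bh), 4))
names(p_raw)[p_holm < 0.05]  # proteins kept by Holm
names(p_raw)[p_bh < 0.05]    # proteins kept by Benjamini-Hochberg
```

With these values Holm keeps two proteins (protA, protB) and Benjamini-Hochberg keeps four (protA through protD), which illustrates why the second procedure is described as less conservative.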

The mean values of significant proteins will be used to interpret differences between the two data groups for each hypothesis; I am mainly interested in whether the mean value of the Control Mice for a significant protein is higher or lower than the mean value of the DS mice in each test, and the actual value is of much less importance than the direction of the difference. Additionally, resulting significant proteins in this analysis will be compared with the significant proteins found in the original 2015 paper.

Results

The multiple testing procedure using the Holm adjustment to correct for Family-Wise Error Rate identified only one protein expression level as significantly different for Hypothesis 1, and no protein expression levels that were significantly different for Hypothesis 2.

Table 3. Testing Results Using Holm Adjustment

	Hypothesis	Response (Learning Outcome)	Significant Proteins
	1	Normal (= 1), Failed (= 0)	NR2A
	2	Normal (= 1), Rescued (= 0)	none

These results suggest that the Holm adjustment is likely too conservative to identify significant differences between the test groups. In the original 2015 study, 15 proteins were found to be significantly different between Control Mice given saline and DS Mice given saline (Hypothesis 1), though admittedly using different statistical methods (Higuera et al., 2015). Methods that control the Family-Wise Error Rate are generally stringent and can lack power, so the null hypothesis may not be rejected even when the alternative is actually true. Because the above results failed to yield multiple significant differences (and the original study suggests that such differences should be found), the less conservative Benjamini-Hochberg (BH) procedure will be applied.

Table 4. Testing Results using Logistic Regression with Benjamini-Hochberg adjustment

	Hypothesis	Response (Learning Outcome)	Significant Proteins
	1	Normal (= 1), Failed (= 0)	NR2A, pNR1, APP, P38, AMPKA, S6
	2	Normal (= 1), Rescued (= 0)	ERK, AMPKA

Results from using the Benjamini-Hochberg adjustment show that 6 proteins have significantly different expression levels between Control Mice given saline and DS Mice given saline, and 2 proteins have significantly different expression levels between Control Mice given saline and DS Mice given Memantine. These results are consistent with the learning outcomes: because DS Mice given Memantine recover their ability to learn, their protein expression levels should logically be more similar to those of Control Mice.

Table 5. Interpreting results: Hypothesis 1. The majority of the 6 significant proteins have higher expression levels in Control Mice than in DS Mice, with the exceptions of proteins APP and S6.

	Control Mice	DS Mice	Difference
NR2A	4.28	3.51	+0.77
pNR1	0.856	0.765	+0.091
APP	0.39	0.442	-0.052
P38	0.366	0.309	+0.057
AMPKA	0.394	0.332	+0.062
S6	0.438	0.585	-0.147

Table 6. Interpreting results: Hypothesis 2. Both significant proteins have higher expression levels in Control Mice than in DS Mice given Memantine. Note that AMPKA expression levels in DS Mice given saline are also lower than levels in Control Mice, indicating that Memantine does not significantly affect the AMPKA protein.

	Average Protein Expression Level:	Control Mice	DS Mice w/Memantine	Difference
ERK		2.86	2.29	+0.57
AMPKA		0.394	0.329	+0.065

Results indicate that Memantine successfully changes the expression levels of 5 of the 6 proteins identified as significant in Hypothesis 1; only AMPKA remains significantly different from Control Mice. These findings will now be compared with the findings from the 2015 study.

First, the results from Hypothesis 1 (which protein expression levels are significantly different between DS and Control mice both given saline) are compared. Of the 6 proteins identified using the methods from this analysis, 5 overlap with proteins identified in the 2015 study. NR2A is the only protein that does not overlap.

Table 7. Hypothesis 1 comparison: proteins with significant expression levels using Logistic Regression and Benjamini-Hochberg adjustment compared with results from the 2015 study using Self Organizing Maps (SOMs) and the Wilcoxon rank-sum test.

Logistic Regression with BH Adjustment	SOMs with Wilcoxon rank-sum test	Overlap
NR2A	BDNF	pNR1
pNR1	P38	APP
APP	AMPKA	P38
P38	S6	AMPKA
AMPKA	MTOR	S6
S6	NR2B
	RAPTOR
	TAU
	Ubiquitin
	EGR1
	pCAMKII
	PKCA
	pNR1
	APP
	GluR3
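
The overlap column in Table 7 can be reproduced with base R set operations. The two vectors below are copied from the first two columns of the table; the variable names are illustrative.

```r
# Proteins significant in this analysis (logistic regression with BH).
ours <- c("NR2A", "pNR1", "APP", "P38", "AMPKA", "S6")

# Proteins reported for the same comparison in the 2015 study (Table 7).
study_2015 <- c("BDNF", "P38", "AMPKA", "S6", "MTOR", "NR2B", "RAPTOR",
                "TAU", "Ubiquitin", "EGR1", "pCAMKII", "PKCA", "pNR1",
                "APP", "GluR3")

overlap   <- intersect(ours, study_2015)  # found by both analyses
only_ours <- setdiff(ours, study_2015)    # found only in this analysis

print(overlap)
print(only_ours)
```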

For Hypothesis 2, which identifies protein expression levels significantly different between DS mice given Memantine and control mice given saline, two proteins were identified in this analysis: ERK and AMPKA. In the 2015 study, no protein expression levels were significantly different between the two groups, with the authors reporting that “Memantine treatment induces changes…that not only result in successful (rescued) learning, but also a protein profile that is not distinguished from those of normal successful learning” (Higuera et al., 2015).

Table 8. Hypothesis 2 comparison.

	Logistic Regression with BH Adjustment	SOMs with Wilcoxon rank-sum test	Overlap
	ERK	None	None
	AMPKA

Discussion

Results of this analysis indicate that the application of Memantine to test mice results in protein expression levels within the cerebral cortex that are more similar to those of normal, non-DS mice. Because Down Syndrome occurs as the result of an extra chromosome, it has often been considered too complex to be responsive to drug treatments, as the extent of the effects the additional chromosome imposes on a mouse (or human) is still unknown (Sturgeon et al., 2012). The results of this analysis and of the 2015 study indicate that drug treatments can be effective, to some extent, in altering protein levels in the brains of mice and increasing their capability to learn.

In Hypothesis 1, 6 proteins were identified as having expression levels that significantly contributed to the learning outcome using the Benjamini-Hochberg adjustment to control the False Discovery Rate. Of those 6, proteins NR2A, pNR1, P38, and AMPKA all have higher expression levels in the control mice and lower expression levels in DS Mice. The remaining two proteins, APP and S6, had higher levels in DS Mice. Only one of these proteins, NR2A, was found to be significant using the more stringent Holm adjustment. Interestingly, NR2A is the only protein that was not also identified in the 2015 study. One large difference between this analysis and the original study is that the original study treated each of the 15 measurements taken on each mouse as an independent sample and used the data in its original form when constructing the Self Organizing Maps. For this analysis, because it was essential that each sample be independent to meet the assumptions of regression models, the average of the 15 measurements was used to represent a single mouse. It is possible that the different testing scenarios led to NR2A not being identified as significant in the original study. Additionally, the mean is not very robust to outlying values; perhaps taking the median of all 15 measurements would have yielded different results. In the Hypothesis 2 results, NR2A was no longer significant.
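
The sensitivity of the mean to outlying values can be illustrated with a made-up set of 15 repeated measurements for one mouse. The numbers are hypothetical and exaggerated for clarity.

```r
# 15 hypothetical measurements of one protein for one mouse;
# the last value is an outlier.
x <- c(rep(0.40, 14), 1.90)

mean(x)    # pulled toward the outlier
median(x)  # unchanged by the single extreme value
```

Here the mean is 0.50 while the median stays at 0.40. In the appendix code, switching the summary would only require passing median instead of mean to aggregate, although whether that changes any conclusions was not tested in this analysis.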

Using the Benjamini-Hochberg adjustment was essential to the majority of the findings. This was surprising: I expected the differences in values between Control Mice and DS Mice to be so large that the more stringent Holm adjustment would still identify multiple proteins. This was not the case, however, and even with the Benjamini-Hochberg adjustment, fewer than half of the findings from the 2015 study were identified. It appears that the differences in individual protein expression levels between Control and DS Mice are not very large, even in proteins that are significant. Finding statistically significant differences becomes more challenging when differences are small to begin with, even if a small difference can have a huge impact on ability to learn. Further, as sample size increases, standard error tends to decrease, and smaller differences between populations are more easily detected. This analysis had a very small sample size, with each test group containing between 7 and 9 mice. Notably, because the original study used all 15 measurements for each mouse when constructing the SOMs and performing population tests, that study had a larger (albeit not entirely independent) sample size. It is possible that the larger sample size contributed to the larger number of significant findings.
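
As a rough illustration of the sample size effect, the standard error of a mean is \(s / \sqrt{n}\) when observations are independent. The sketch below assumes a hypothetical standard deviation and treats every measurement as independent, which the repeated measurements on a single mouse are not, so it overstates the real gain.

```r
# Hypothetical standard deviation of one protein's expression level.
s <- 0.05

# Group sizes: per-mouse averages (7 or 9 mice) vs. all 15 measurements
# per mouse treated as separate samples (105 or 135).
n  <- c(mice_7 = 7, mice_9 = 9, samples_105 = 7 * 15, samples_135 = 9 * 15)
se <- s / sqrt(n)

print(round(se, 4))
se["mice_7"] / se["samples_105"]  # ratio equals sqrt(15), about 3.87
```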

In Hypothesis 2, only 2 proteins were identified as having expression levels that contributed to learning outcome: AMPKA and ERK. These findings support the assumption that if DS mice can achieve protein expression levels that are more similar to those in regular mice, their ability to learn can be rescued. The AMPKA protein was also significant in Hypothesis 1, and expression levels appear to be the same for DS Mice with and without the Memantine treatment. This indicates either that AMPKA is not essential to learning outcomes or that a drug would be even more effective should it alter the AMPKA profile. All that is known is that Memantine can rescue learning abilities in DS Mice without changing AMPKA levels. It is unknown why the ERK protein was also significant; it may just be a side effect of the Memantine treatment. This analysis did not examine additional data such as Control Mice given Memantine; this supplementary data may provide a clue as to whether Memantine affects other protein expression levels not related to learning outcomes. Surprisingly, the original 2015 study did not identify any significant proteins when comparing Control Mice with DS Mice given Memantine (Higuera et al., 2015). The study only applied the Wilcoxon rank-sum test to proteins identified in the SOM clustering as differentiating between groups. It is possible that, had the test been applied to them, differences in AMPKA and ERK levels would have been identified; however, because neither protein caused differences in clustering assignments, the authors had no reason to test them.

Conclusion

While this analysis was successful at identifying protein expression level differences between Control Mice, Down Syndrome Mice, and Down Syndrome Mice given Memantine, future study and analysis are needed to understand the effects of Memantine on both Down Syndrome Mice and Control Mice and to determine which changes in protein expression levels directly influenced the learning outcome. Memantine changes the expression levels in DS Mice to more closely resemble those of Control Mice, but are all those changes contributing to the outcome? Further, it is unclear how long the effects of Memantine treatment will remain. When do the protein expression levels in DS Mice revert to their pre-treatment state? Previous work has identified multiple other drug treatments that rescue learning in DS Mice (Gardiner, 2014). Future research could compare the effects of these drugs on protein expression levels with the effects of Memantine to determine which drugs are the most promising candidates for future studies.

References

Facts about Down syndrome. (2020, December 28). Retrieved February 19, 2021, from https://www.cdc.gov/ncbddd/birthdefects/downsyndrome.html

Gardiner, K. (2014). Pharmacological approaches to improving cognitive function in Down syndrome: Current status and considerations. Drug Design, Development and Therapy, 103. doi:10.2147/dddt.s51476

Higuera, C., Gardiner, K. J., & Cios, K. J. (2015). Self-organizing feature maps identify proteins critical to learning in a mouse model of Down syndrome. PLoS ONE, 10(6), e0129126. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0129126

Munir, K. (n.d.). Mental health issues & Down syndrome. Retrieved February 19, 2021, from https://www.ndss.org/resources/mental-health-issues-syndrome/

Sturgeon, X., Le, T., Ahmed, M. M., & Gardiner, K. J. (2012). Pathways to cognitive deficits in Down syndrome. Progress in Brain Research, 197, 73–100. pmid:22541289 https://www.sciencedirect.com/science/article/pii/B9780444542991000054?via%3Dihub

Appendix: R Code

Data Cleaning and Processing

#download csv file from https://archive.ics.uci.edu/ml/datasets/Mice+Protein+Expression

#data cleaning
mice <- read.csv("Data_Cortex_DSMice.csv")
names(mice)[1] <- "MouseID" #change id column to something more easily identified

#strip the sample suffix (everything from the first underscore on) so repeated samples share one mouse id
mice$MouseID <- sub("_.*", "", as.character(mice$MouseID))

#keep only the three groups of interest (25 mice x 15 samples = 375 rows) so the labels below line up;
#this is a no-op if the csv was already subset to these groups
#assumption: the file lists c-CS-s mice first, then t-CS-m, then t-CS-s, matching the a/b/c suffixes
mice <- mice[mice$class %in% c("c-CS-s", "t-CS-m", "t-CS-s"), ]

#label all 15 samples for each mouse: mouse1a-mouse9a, mouse10b-mouse18b, mouse19c-mouse25c
mouse_labels <- paste0("mouse", 1:25, rep(c("a", "b", "c"), times = c(9, 9, 7)))
mice$ID <- rep(mouse_labels, each = 15)

library(dplyr)

#clean up NA values by replacing them with the column average within each group:
mice_c_CS_s <- filter(mice, class == "c-CS-s") #separate all three mouse types
mice_t_CS_s <- filter(mice, class == "t-CS-s")
mice_t_CS_m <- filter(mice, class == "t-CS-m")

library(zoo)

mice_cCSs <- na.aggregate(mice_c_CS_s[,2:78]) #use na.aggregate to replace na with column average
mice_tCSs <- na.aggregate(mice_t_CS_s[,2:78])
mice_tCSm <- na.aggregate(mice_t_CS_m[,2:78])

#Add back in mouse class and ID
mice_cCSs$class <- mice_c_CS_s$class
mice_cCSs$ID <- mice_c_CS_s$ID

mice_tCSs$class <- mice_t_CS_s$class
mice_tCSs$ID <- mice_t_CS_s$ID

mice_tCSm$class <- mice_t_CS_m$class
mice_tCSm$ID <- mice_t_CS_m$ID

mice_clean <- rbind(mice_cCSs, mice_tCSs, mice_tCSm) #recombine all three separate frames

mice_test1 <- mice_clean[,1:77]
mice_test1$ID <- mice_clean$ID

mice_single <- aggregate(.~ ID, mice_test1, mean) #aggregate all samples by the mean value for each mouse 
#this step was done because we cannot assume that each of 15 samples is independent 

mice_single$treatment <- "c_CS_s"

mice_single$treatment[grepl("b", mice_single$ID)] <- "t_CS_m"
mice_single$treatment[grepl("c", mice_single$ID)] <- "t_CS_s"

#Add Learning Outcomes:
mice_single$C_CS_saline <- ifelse(mice_single$treatment == "c_CS_s", 1,0)
mice_single$T_CS_m <- ifelse(mice_single$treatment == "t_CS_m", 1,0)

write.csv(mice_single, "mice_clean.csv") #write csv

Analysis

mice <- read.csv("mice_clean.csv")
library(dplyr) #needed for filter() if the analysis is run in a fresh session
mice_hyp1 <- filter(mice, treatment != "t_CS_m") #keep only c_CS_s and t_CS_s mice (mice that received saline)
mice_hyp2 <- filter(mice, treatment != "t_CS_s") #for second hypothesis, filter out all DS mice who received saline

mice_protein1 <- mice_hyp1[,3:79] #create data frames of just predictors
mice_protein2 <- mice_hyp2[,3:79]

#define null model for hypothesis 1
glm.null <- glm(C_CS_saline ~1, family = binomial, data = mice_hyp1)

#create container to save p-values
p.values = rep(0, ncol(mice_protein1))
n <- ncol(mice_protein1)

#loop that tests each protein against the null model
for(i in 1:n)
{
  glm.protein <- glm(C_CS_saline ~ mice_protein1[,i], family = binomial, data = mice_hyp1)
  
  anova.glm <- anova(glm.null, glm.protein, test = "Chisq")
  
  p.values[i] <- anova.glm$`Pr(>Chi)`[2]
}

#p-value adjustments for hypothesis 1
p.holm <- p.adjust(p.values, method = "holm") #holm adjustment
names(mice_protein1)[which(p.holm < 0.05)] #print the names of significant proteins

## [1] "NR2A_N"

p.bh <- p.adjust(p.values, method = "BH") #Benjamini-Hochberg adjustment
names(mice_protein1)[which(p.bh < 0.05)] #names of significant proteins

## [1] "NR2A_N"  "pNR1_N"  "APP_N"   "P38_N"   "AMPKA_N" "S6_N"

#Same process for hypothesis 2: define null model, loop through each protein
glm.null2 <- glm(C_CS_saline ~1, family = binomial, data = mice_hyp2)

p.values2 = rep(0, ncol(mice_protein2))
n <- ncol(mice_protein2)

for(i in 1:n)
{
  glm.protein2 <- glm(C_CS_saline ~ mice_protein2[,i], family = binomial, data = mice_hyp2)
  
  anova.glm2 <- anova(glm.null2, glm.protein2, test = "Chisq")
  
  p.values2[i] <- anova.glm2$`Pr(>Chi)`[2]
}

p.holm2 <- p.adjust(p.values2, method = "holm") #holm adjustment
names(mice_protein2)[which(p.holm2 < 0.05)]

## character(0)

p.bh2 <- p.adjust(p.values2, method = "BH") #benjamini-Hochberg
names(mice_protein2)[which(p.bh2 < 0.05)]

## [1] "ERK_N"   "AMPKA_N"
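
The two loops above follow the same pattern, so they could be wrapped in a single helper. The sketch below is one possible refactoring and is not part of the original analysis: the function name screen_proteins, the simulated data, and the column names protA and protB are all assumptions made for illustration.

```r
# Return one unadjusted p-value per protein column by comparing an
# intercept-only logistic model with a single-protein model.
screen_proteins <- function(data, response, proteins) {
  null_fit <- glm(as.formula(paste(response, "~ 1")),
                  family = binomial, data = data)
  vapply(proteins, function(p) {
    fit <- glm(as.formula(paste(response, "~", p)),
               family = binomial, data = data)
    anova(null_fit, fit, test = "Chisq")$`Pr(>Chi)`[2]
  }, numeric(1))
}

# Simulated stand-in for the per-mouse data (hypothetical values).
set.seed(1)
sim <- data.frame(outcome = rep(c(1, 0), each = 20))
sim$protA <- rnorm(40, mean = ifelse(sim$outcome == 1, 1.0, 0.6), sd = 0.5)
sim$protB <- rnorm(40, mean = 0.8, sd = 0.5)

p_raw <- screen_proteins(sim, "outcome", c("protA", "protB"))
p_bh  <- p.adjust(p_raw, method = "BH")
print(round(cbind(raw = p_raw, BH = p_bh), 4))
```

With the real data, a call such as screen_proteins(mice_hyp1, "C_CS_saline", names(mice_protein1)) would be expected to reproduce p.values, assuming the column names are valid R names as they appear to be in the output above.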