SEMINAR04: Two Group Testing and Data Aggregation

1. Load and preview dataset.

library(lattice)
prDat <- read.table("/home/eacton/R/stat540-2014-ACTON-ERICA/GSE4051_data.tsv")
str(prDat, max.level = 0)
## 'data.frame':    29949 obs. of  39 variables:
prDES <- readRDS("/home/eacton/R/stat540-2014-ACTON-ERICA/GSE4051_design.rds")
str(prDES)
## 'data.frame':    39 obs. of  4 variables:
##  $ sidChar : chr  "Sample_20" "Sample_21" "Sample_22" "Sample_23" ...
##  $ sidNum  : num  20 21 22 23 16 17 6 24 25 26 ...
##  $ devStage: Factor w/ 5 levels "E16","P2","P6",..: 1 1 1 1 1 1 1 2 2 2 ...
##  $ gType   : Factor w/ 2 levels "wt","NrlKO": 1 1 1 1 2 2 2 1 1 1 ...

2. Select random gene under investigation.

set.seed(20)
(theGene <- sample(1:nrow(prDat), 1))
## [1] 26281
pDAT <- data.frame(prDES, gExp = unlist(prDat[theGene, ]))
str(pDAT)
## 'data.frame':    39 obs. of  5 variables:
##  $ sidChar : chr  "Sample_20" "Sample_21" "Sample_22" "Sample_23" ...
##  $ sidNum  : num  20 21 22 23 16 17 6 24 25 26 ...
##  $ devStage: Factor w/ 5 levels "E16","P2","P6",..: 1 1 1 1 1 1 1 2 2 2 ...
##  $ gType   : Factor w/ 2 levels "wt","NrlKO": 1 1 1 1 2 2 2 1 1 1 ...
##  $ gExp    : num  7.16 6.64 6.97 6.74 7.8 ...

3. Aggregate data - wildtype and knock-out means.

aggregate(gExp ~ gType, pDAT, FUN = mean)
##   gType  gExp
## 1    wt 7.154
## 2 NrlKO 7.152

4. Plotting is fun!

stripplot(gType ~ gExp, pDAT)

plot of chunk plot

5. Perform t-test between wildtype and knockout for gene of interest. Obtain p-value.

t.test(gExp ~ gType, pDAT)
## 
##  Welch Two Sample t-test
## 
## data:  gExp by gType 
## t = 0.0235, df = 37, p-value = 0.9814
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -0.1782  0.1824 
## sample estimates:
##    mean in group wt mean in group NrlKO 
##               7.154               7.152
ttRes <- t.test(gExp ~ gType, pDAT)
str(ttRes)
## List of 9
##  $ statistic  : Named num 0.0235
##   ..- attr(*, "names")= chr "t"
##  $ parameter  : Named num 37
##   ..- attr(*, "names")= chr "df"
##  $ p.value    : num 0.981
##  $ conf.int   : atomic [1:2] -0.178 0.182
##   ..- attr(*, "conf.level")= num 0.95
##  $ estimate   : Named num [1:2] 7.15 7.15
##   ..- attr(*, "names")= chr [1:2] "mean in group wt" "mean in group NrlKO"
##  $ null.value : Named num 0
##   ..- attr(*, "names")= chr "difference in means"
##  $ alternative: chr "two.sided"
##  $ method     : chr "Welch Two Sample t-test"
##  $ data.name  : chr "gExp by gType"
##  - attr(*, "class")= chr "htest"
ttRes$statistic
##       t 
## 0.02351
ttRes$p.value
## [1] 0.9814

Be amazed at R prowess! Subsequently cry into coffee at not being able to figure out KS? To be continued…