Download the R Markdown file here: R-IntroExercise.Rmd

Q1: Create a matrix (call it transcriptome) with the values below. The experiments are column names and genes are the row names.

Note: use a matrix object for now, don’t worry about trying to create a data frame with factors. We will go over this again next week.

.      Control  Nitrogen  Phosphate  Potassium
GeneA       89        78         77         56
GeneB       90        99         85         97
GeneC       78        94         99         87
GeneD       81        83         80         79
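# Values are listed column by column (all of Control first, then Nitrogen, ...),
# because matrix() fills columns first when byrow = FALSE
results <- c(89, 90, 78, 81, 78, 99, 94, 83, 77, 85, 99, 80, 56, 97, 87, 79)
genes <- c("GeneA", "GeneB", "GeneC", "GeneD")
experiments <- c("Control", "Nitrogen", "Phosphate", "Potassium")
transcriptome <- matrix(results, nrow = 4, ncol = 4, byrow = FALSE, dimnames = list(genes, experiments))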
print(transcriptome)
##       Control Nitrogen Phosphate Potassium
## GeneA      89       78        77        56
## GeneB      90       99        85        97
## GeneC      78       94        99        87
## GeneD      81       83        80        79
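
Because matrix() fills column by column by default, the values for Q1 have to be listed one experiment at a time. As a sketch of the alternative, the same table can be typed row by row, exactly as it is printed, by setting byrow = TRUE. The names values_by_row and transcriptome_by_row are only illustrative and are not part of the exercise.

```r
# Values typed gene by gene, as they read across the table
values_by_row <- c(89, 78, 77, 56,   # GeneA
                   90, 99, 85, 97,   # GeneB
                   78, 94, 99, 87,   # GeneC
                   81, 83, 80, 79)   # GeneD

# byrow = TRUE makes matrix() fill one row at a time instead of one column
transcriptome_by_row <- matrix(
  values_by_row,
  nrow = 4, ncol = 4, byrow = TRUE,
  dimnames = list(
    c("GeneA", "GeneB", "GeneC", "GeneD"),
    c("Control", "Nitrogen", "Phosphate", "Potassium")
  )
)
print(transcriptome_by_row)
```

Either layout gives the same matrix; the choice only affects the order in which the numbers are typed.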

Q2: Use an R function to calculate the average expression of each gene across all experiments and save the result in a vector called expression_average.

Hint: Remember that you can always get help with commands by typing ?commandname, apropos("commandname"), and example("commandname").
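
As a small sketch of how these help tools can be used: apropos() searches the names of attached objects, so it is handy when only part of a function name is known. The search term "Means" is just an example pattern.

```r
# List every attached function or object whose name contains "Means"
matches <- apropos("Means")
print(matches)

# In an interactive session, these open a help page and run its examples:
# ?rowMeans
# example("rowMeans")
```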

# using a for loop over the rows
expression_average <- numeric(nrow(transcriptome))  # preallocate one slot per gene
for (i in seq_len(nrow(transcriptome))) {
  expression_average[i] <- mean(transcriptome[i, ])
}
print(expression_average)
## [1] 75.00 92.75 89.50 80.75
# or using the built-in rowMeans() function
expression_average <- rowMeans(transcriptome)
print(expression_average)
## GeneA GeneB GeneC GeneD 
## 75.00 92.75 89.50 80.75
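
rowMeans() is a shortcut for one specific summary. A more general pattern, sketched below, is apply() with MARGIN = 1, which runs any function over each row. The matrix is rebuilt here so that the snippet runs by itself; demo and averages_apply are illustrative names, not part of the exercise.

```r
# Rebuild the Q1 matrix so the snippet is self-contained
demo <- matrix(
  c(89, 90, 78, 81, 78, 99, 94, 83, 77, 85, 99, 80, 56, 97, 87, 79),
  nrow = 4, ncol = 4,
  dimnames = list(
    c("GeneA", "GeneB", "GeneC", "GeneD"),
    c("Control", "Nitrogen", "Phosphate", "Potassium")
  )
)

# MARGIN = 1 means "apply the function to each row"
averages_apply <- apply(demo, 1, mean)
print(averages_apply)

# Compare against the rowMeans() shortcut
print(all.equal(averages_apply, rowMeans(demo)))
```

Swapping mean for another function such as median or max gives a different per-gene summary with the same call.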

Q3: Add the expression_average vector as another column to the transcriptome matrix.

Hint: You can use cbind() to combine a matrix and a vector column-wise.

transcriptome <- cbind(transcriptome, expression_average)
print(transcriptome)
##       Control Nitrogen Phosphate Potassium expression_average
## GeneA      89       78        77        56              75.00
## GeneB      90       99        85        97              92.75
## GeneC      78       94        99        87              89.50
## GeneD      81       83        80        79              80.75
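
cbind() takes the new column's label from the name of the variable passed in, which is why the column above is called expression_average. A minimal sketch, on a small made-up matrix, of choosing the label explicitly by naming the argument (small, row_avg, and labelled are illustrative names):

```r
# Small made-up matrix, only for illustration
small <- matrix(1:4, nrow = 2, dimnames = list(c("g1", "g2"), c("a", "b")))
row_avg <- rowMeans(small)

# Naming the argument sets the column label, here "Average"
labelled <- cbind(small, Average = row_avg)
print(colnames(labelled))
```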

Q4: Sort the matrix such that the gene with the highest average gene expression is on top.

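Hint: the sort() function returns the sorted values themselves, whereas the order() function returns the positions (indices) that would put the values in sorted order. Those indices can be used to rearrange the rows of a matrix.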
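
The difference between the two functions can be sketched on a plain named vector. The values are the per-gene averages from Q2; avg, sorted_values, and positions are illustrative names.

```r
# Average expression per gene, taken from the Q2 output
avg <- c(GeneA = 75.00, GeneB = 92.75, GeneC = 89.50, GeneD = 80.75)

# sort() returns the values themselves, rearranged
sorted_values <- sort(avg, decreasing = TRUE)
print(sorted_values)

# order() returns the positions that would sort the vector
positions <- order(avg, decreasing = TRUE)
print(positions)

# Indexing with those positions rearranges the vector in the same way
print(avg[positions])
```

Because order() returns positions rather than values, its result can index the rows of a whole matrix, which sort() cannot do.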

# Rebuild the matrix from Q1 to Q3 so that this chunk starts from a clean state
# Q1 again
results <- c(89, 90, 78, 81, 78, 99, 94, 83, 77, 85, 99, 80, 56, 97, 87, 79)
rows <- c("GeneA", "GeneB", "GeneC", "GeneD")
columns <- c("Control", "Nitrogen", "Phosphate", "Potassium")
transcriptome <- matrix(results, nrow = 4, ncol = 4, dimnames = list(rows, columns))
# Q2 again
expression_average <- rowMeans(transcriptome)
# Q3 again
transcriptome <- cbind(transcriptome, expression_average)
# Q4: order() gives the row indices from highest to lowest average expression
highest_to_lowest <- order(transcriptome[, "expression_average"], decreasing = TRUE)
transcriptome <- transcriptome[highest_to_lowest, ]
print(transcriptome)
##       Control Nitrogen Phosphate Potassium expression_average
## GeneB      90       99        85        97              92.75
## GeneC      78       94        99        87              89.50
## GeneD      81       83        80        79              80.75
## GeneA      89       78        77        56              75.00