In this analysis, we will use the molecular.subtyping
function from the genefu package to apply the PAM50
classifier to RNA-seq data.
# Load the required packages
library(genefu)
library(dplyr)
library(pheatmap)
# Load the PAM50 classifier and its robust parameters
data(pam50)
data(pam50.robust)
# Read the expression data
exp <- read.table("tpm_counts.txt", sep='\t', header = TRUE)
dim(exp)
## [1] 19790 41
# View the first few rows of the data
head(exp, 2)
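# Keep the gene symbols in a separate gene_info data frame
# (dplyr:: is spelled out in case another loaded package masks select)
gene_info <- dplyr::select(exp, Genes)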
head(gene_info)
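The next step uses the first column as row names, which only works when every gene name is unique; R stops with a "duplicate row.names" error otherwise. The sketch below shows one way to detect and collapse duplicates on a toy table. The names `toy` and `collapse_duplicates` are hypothetical, and keeping the row with the highest mean expression is just one common convention, not something this analysis prescribes.

```r
# Toy stand-in for the expression table: the first column holds gene names.
toy <- data.frame(
  Genes = c("ESR1", "ERBB2", "ESR1"),
  S1 = c(10, 5, 30),
  S2 = c(12, 7, 28),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each gene, keep the row with the highest mean.
# Note that the rows come back ordered by decreasing mean expression.
collapse_duplicates <- function(df, gene_col = "Genes") {
  sample_cols <- setdiff(names(df), gene_col)
  row_means <- rowMeans(df[, sample_cols, drop = FALSE])
  ordered <- df[order(row_means, decreasing = TRUE), , drop = FALSE]
  ordered[!duplicated(ordered[[gene_col]]), , drop = FALSE]
}

# Gene names that occur more than once
dup_genes <- unique(toy$Genes[duplicated(toy$Genes)])

# Table with one row per gene, safe to pass to data.frame(..., row.names = 1)
deduped <- collapse_duplicates(toy)
```

If `dup_genes` is empty for the real table, no collapsing is needed and the original data can be used unchanged.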
# Use the first column for row names
exp <- data.frame(exp, row.names = 1)
# View the first few rows after setting row names
head(exp, 2)
# Transpose the expression matrix
texp <- t(exp)
# View the first few rows of the transposed matrix
head(as.data.frame(texp), 2)
# Rename the columns with official gene symbols
colnames(texp) <- gene_info$Genes
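Because the classifier is called with do.mapping = FALSE, the column names of the expression matrix have to match the gene identifiers used by the model directly. The following sketch, run on a toy matrix, reports which required genes are absent. `check_gene_coverage` and `toy_expr` are hypothetical names, and taking the required genes from `rownames(pam50$centroids)` is an assumption about how the genefu model object is laid out.

```r
# Hypothetical helper: report which classifier genes are absent from the data.
check_gene_coverage <- function(expr, required_genes) {
  missing <- setdiff(required_genes, colnames(expr))
  list(
    n_required = length(required_genes),
    n_found = length(required_genes) - length(missing),
    missing = missing
  )
}

# Toy samples-by-genes matrix standing in for texp.
toy_expr <- matrix(
  c(1, 2, 3, 4, 5, 6), nrow = 2,
  dimnames = list(c("S1", "S2"), c("ESR1", "ERBB2", "MKI67"))
)

# With genefu loaded, the required genes could be taken from the model,
# e.g. rownames(pam50$centroids) (assumed to hold the PAM50 gene symbols).
coverage <- check_gene_coverage(toy_expr, c("ESR1", "MKI67", "FOXA1"))
coverage$missing
```

On the real data, the call would presumably look like `check_gene_coverage(texp, rownames(pam50$centroids))`. A long `missing` vector would suggest that the gene symbols need harmonising before classification.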
# Apply the PAM50 classifier using molecular.subtyping function
pam50_predictions <- molecular.subtyping(
  sbt.model = "pam50",
  data = texp,
  annot = gene_info,
  do.mapping = FALSE
)
# Display the PAM50 subtypes
as.data.frame(pam50_predictions$subtype)
# Display the subtype probabilities
as.data.frame(pam50_predictions$subtype.proba)
# Display the crisp subtype assignments
as.data.frame(pam50_predictions$subtype.crisp)
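The per-sample outputs above can be condensed into a single table listing each sample's top subtype and how strongly it is supported. This is a minimal sketch on a toy probability matrix; `summarise_calls` and `toy_proba` are hypothetical names, and the 0.5 cutoff for flagging low-confidence calls is an arbitrary illustrative value, not a recommendation from genefu.

```r
# Toy probability matrix standing in for pam50_predictions$subtype.proba.
toy_proba <- matrix(
  c(0.80, 0.10, 0.10,
    0.40, 0.35, 0.25),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("S1", "S2"), c("Basal", "LumA", "LumB"))
)

# Hypothetical helper: top subtype, its probability, and a low-confidence flag.
summarise_calls <- function(proba, min_proba = 0.5) {
  # Column index of the largest value in each row
  top <- max.col(proba, ties.method = "first")
  # Pick the matching probability for each sample
  best <- proba[cbind(seq_len(nrow(proba)), top)]
  data.frame(
    sample = rownames(proba),
    subtype = colnames(proba)[top],
    proba = best,
    low_confidence = best < min_proba,
    stringsAsFactors = FALSE
  )
}

calls <- summarise_calls(toy_proba)

# Number of samples assigned to each subtype
table(calls$subtype)
```

Applied to the real results, the input would be `pam50_predictions$subtype.proba`; samples flagged as low confidence may deserve a closer look before their subtype is reported.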
# Plot a heatmap of the subtype probabilities
m <- as.data.frame(pam50_predictions$subtype.proba)
pheatmap(
  m,
  scale = "row",
  color = colorRampPalette(c("navy", "white", "#FF1493"))(75)
)
In this analysis, we loaded the PAM50 classifier and its robust
parameters, read the RNA-seq expression data, and reshaped it into a
samples-by-genes matrix labelled with gene symbols. We then applied the
PAM50 classifier using the molecular.subtyping function, displayed the
predicted molecular subtypes, subtype probabilities, and crisp subtype
assignments, and plotted the subtype probabilities as a heatmap.