Export/Import one sheet in a Workbook
Export/Import multiple sheets in a Workbook
Create workbook with multiple type data and figures
Create workbook with multiple formatting styles
Add comments adaptively
02 mai, 2023
Export/Import one sheet in a Workbook
Export/Import multiple sheets in a Workbook
Create workbook with multiple type data and figures
Create workbook with multiple formatting styles
Add comments adaptively
Other packages : XLConnect, readxl, xlsx, etc.
It’s not the fastest in reading Excel Data, but has high number of features.
82 R functions read, write, style and edit Excel spreadsheets.
Does not depend on Java.
#-- Create 1e6 x 10 table.
M <- data.frame(matrix(rnorm(1e7), ncol = 10))
#-- Write speed toward Excel
a <- Sys.time() ; openxlsx::write.xlsx(M, 'M.xlsx') ; b <- Sys.time()
b-a # Time difference of 1.0 mins
a<- Sys.time() ; writexl::write_xlsx(M , "M_writexl.xlsx") ; b<- Sys.time()
b-a # Time difference of 30.6 secs
#-- Reading speed from Excel
a <- Sys.time() ; openxlsx::read.xlsx('M.xlsx') ; b <- Sys.time()
b-a # Time difference of 48.2 secs
a <- Sys.time() ; readxl::read_xlsx('M.xlsx') ; b <- Sys.time()
b-a # Time difference of 20.3 secs
#-- Fix your own working directory
# Example :
setwd("C:\\Users\\oallatif\\CIRI\\workshop_R_Excel\\")
#-- Attach needed packages
library(openxlsx) # R <=> Excel connection.
library(tidyverse) # Data processing tasks.
library(biomaRt) # Interface to BioMart databases.
If you want, you can download the complete R code used in this Workshop as a .R file.
2 main functions : write.xlsx & read.xlsx
write.xlsx(x = trees, file = "arbres.xlsx")
arbre <- read.xlsx(xlsxFile = "arbres.xlsx") arbre %>% head()
## Girth Height Volume ## 1 8.3 70 10.3 ## 2 8.6 65 10.3 ## 3 8.8 63 10.2 ## 4 10.5 72 16.4 ## 5 10.7 81 18.8 ## 6 10.8 83 19.7
The list is a flexible and versatile structure of the R language. It is a special object whose elements can be of any kind : strings, numbers, vectors … even other lists.
Example :
#-- list contents
x <- c("January", "February", "March", "April") ; y <- c(8,4,2,7,9,6,1,10,5,3) ; z <- mpg
ma.list <- list(x,y,z)
R structures
#-- List of 3 objects - Same type.
species <- iris %>% split(iris$Species)
species %>% write.xlsx("Species.xlsx" , overwrite = TRUE)
#-- List of 6 objects - Same type. wool.tension <- split(warpbreaks, list(warpbreaks$wool , warpbreaks$tension)) write.xlsx(wool.tension , "wool.tension.xlsx" , overwrite = TRUE)
#-- List of 4 objects - different types.
myObjs <- list(billboard = billboard ,
rand.norm = rnorm(500) ,
rand.pois = rpois(50,2) ,
rand.binom = rbinom(50,10,0.3) ,
people = c("Pierre" , "Marie" , "Kacou") ,
emptySheet = " "
)
write.xlsx(myObjs , "myObjs.xlsx" , overwrite = T)
#The argument "colWidths = 'auto'" Works on Windows!!
2 main functions : loadWorkbook & readWorkbook
#-- Import the whole workbook into R.
MyWB <- loadWorkbook("myObjs.xlsx")
#-- Get sheet names from the imported workbook.
sheets(MyWB)
## [1] "billboard" "rand.norm" "rand.pois" "rand.binom" "people" ## [6] "emptySheet"
#-- Get the contents of a specific worksheet into an R data.frame. norm.distr <- readWorkbook(MyWB , sheet = "rand.norm", colNames = F) head(norm.distr)
## X1 ## 1 -0.7677441 ## 2 -0.8682696 ## 3 1.0245288 ## 4 2.7748300 ## 5 1.0699806 ## 6 0.2597911
#-- Create, modify and delete data. mutate(norm.distr , Random = X1 , X1 = NULL , Density = dnorm(Random,0,1)) -> norm.prob ggplot(norm.prob , aes(Random, Density))+ #Probability density. geom_point(alpha = 0.06, size = 4 , col = "magenta" , shape = 19)
To import multiple sheets at once, we need a for loop. Also We will use the assign function.
Here is an Example of a for loop :
#-- Put all sheet names in a vector.
sheetNamesInMyWB <- sheets(MyWB)
#-- 'for' loop that displays all sheet names.
for(s in sheetNamesInMyWB){print(s)}
## [1] "billboard" ## [1] "rand.norm" ## [1] "rand.pois" ## [1] "rand.binom" ## [1] "people" ## [1] "emptySheet"
How does the ‘assign’ function work ? Here are 2 examples :
#-- assign to the variable named 'lili' the value 25.
assign("lili" , 25)
lili
## [1] 25
#-- Assign to the variable named "salut" the character string "Bonjour Martin !"
(assign("salut" , "Bonjour Martin !"))
## [1] "Bonjour Martin !"
And now, we are ready !
#-- Loop over the Workbook sheets to assign to each sheet name its contents.
for(s in sheetNamesInMyWB){assign(s , readWorkbook(MyWB , sheet = s))}
#-- Remove a sheet from a Workbook.
#Example : remove, in 2 steps, sheets whose name contains the character string 'rand' :
#-- First, put the sheets whose name contains the character string 'rand' in a vector.
toDelete <- sheets(MyWB)[grep("rand", sheets(MyWB))]
#-- Second, use a 'for' loop to remove the sheets in the 'toDelete' vector.
for(r in toDelete){removeWorksheet(MyWB , r)}
#-- Check the names of the remaining sheets
sheets(MyWB)
## [1] "billboard" "people" "emptySheet"
#-- Renames Workbook sheets.
names(MyWB) <- c("table" , "family" , "noThing")
#-- Send 'MyWB' now to Excel as a physical '.xlsx' file.
saveWorkbook(MyWB , "myObjs_NoRand.xlsx" , overwrite = T)
#-- Open an Excel Workbook from within R.
#openXL("myObjs_NoRand.xlsx")
4 main functions : createWorkbook, addWorksheet, writeData & saveWorkbook
Multiple objects
#-- Generate data : table, model & graph.
Data <- iris[, c("Sepal.Length" , "Species")]
result <- summary(lm(Sepal.Length ~ Species , data = Data))$coeff
gg <- ggplot(Data , aes(y = Sepal.Length, x = Species))+
geom_boxplot(aes(fill = Species), show.legend = FALSE)+
geom_jitter(width = 0.1, shape = 20 , size = 6 , col = "blue", alpha = 0.5 )
print(gg)
#-- Create a blank workbook, named wb3. wb3 <- createWorkbook() #-- Insert in wb3 a blank sheet, named "Sepal.Length". addWorksheet(wb3, "Sepal.Length") #-- Start filling the sheet "Sepal.Length" with the desired R objects to export to Excel. writeData (wb3, "Sepal.Length" , "Data" , startRow = 1 , startCol = 1) writeData (wb3, "Sepal.Length" , Data , startRow = 2 , startCol = 1) writeData (wb3, "Sepal.Length" , "Model Coeff." , startRow = 1 , startCol = 5) writeData (wb3, "Sepal.Length" , result , startRow = 2 , startCol = 5) print(gg)
insertPlot (wb3, "Sepal.Length" , startRow = 10, startCol = 5)
#-- Stop filling, send the Workbook 'wb3' to Excel now.
saveWorkbook(wb3, "wb3.xlsx", overwrite = TRUE)
#openXL("wb3.xlsx")
wb4 <- createWorkbook()
for(i in 1:4){
Data <- iris[, c(i , 5)]
quanti <- names(Data)[1] ; quali <- names(Data)[2]
result <- summary(lm(Data[,quanti] ~ Data[,quali]))$coeff
g1 <- ggplot(Data , aes(x = get(quali) , y = get(quanti) ))+
geom_boxplot(aes(fill = get(quali)), show.legend = FALSE)+
geom_jitter(width = 0.1, shape = 20 , size = 6 , col = "blue", alpha = 0.5)+
labs(x = quali, y = quanti)
#ggsave(paste0("lm.",quanti,".png"))
addWorksheet(wb4, quanti )
writeData (wb4, quanti , "Data" , startRow = 1 , startCol = 1)
writeData (wb4, quanti , Data , startRow = 2 , startCol = 1)
writeData (wb4, quanti , "Model Coeff." , startRow = 1 , startCol = 5)
writeData (wb4, quanti , result , startRow = 2 , startCol = 5)
print(g1)
insertPlot (wb4, quanti , startRow = 10, startCol = 5)
insertImage (wb4, quanti , "iris2.png" , startRow = 30, startCol = 5)
}
#link to iris flower : https://tinyurl.com/fafxecrz
saveWorkbook(wb4, "wb4.xlsx", overwrite = TRUE)
##openXL("wb4.xlsx")
2 main functions : createStyle & addStyle
Add style to sheet
#-- Import the Workbook created in the previous step.
wb5 <- loadWorkbook("wb4.xlsx")
#-- Create a style named 'CoeffStyle'.
CoeffStyle <- createStyle(fgFill = "green" , fontColour = "red" , borderColour = "blue",
border = c("top", "bottom", "left", "right"))
#-- Apply the created style 'CoeffStyle' to specific worksheet range.
addStyle(wb5, "Sepal.Width" , style = CoeffStyle ,
rows = 2:5 , cols = 5:8 , gridExpand = TRUE)
saveWorkbook(wb5, "wb5.xlsx", overwrite = TRUE)
#openXL("wb5.xlsx")
Create styles
#-- Import an RNA-Seq differential expression results table (Copy & Paste
# the code below in your R script).
# rna.seq <- read.table("https://t.ly/V5Lb" , header = T , sep = '\t' , dec = ',')
# write.xlsx(rna.seq , "rna.seq.xlsx" , colWidths = "auto" , overwrite = TRUE)
rna.seq <- read.xlsx("rna.seq.xlsx")
#-- Create 2 styles
logfc.pos <- createStyle(fgFill = "#FFD715")
logfc.neg <- createStyle(fgFill = "#88CEEB")
wb6 <- createWorkbook()
addWorksheet(wb6, "rna-seq")
writeData(wb6 , "rna-seq" , rna.seq)
#-- Apply the style according to the LogFC value.
for (i in 1:nrow(rna.seq)) {
if (rna.seq[i, "logFC"] >= 0) {
addStyle(wb6 , "rna-seq" , style = logfc.pos , rows = i + 1 , cols = 1:6)
} else{
addStyle(wb6 , "rna-seq" , style = logfc.neg , rows = i + 1 , cols = 1:6)
}
}
saveWorkbook(wb6 , "wb6.xlsx" , overwrite = T)
#openXL("wb6.xlsx")
rna.seq <- read.xlsx("rna.seq.xlsx")
#-- 2 styles
logfc.pos <- createStyle(fgFill = "#FFD715")
logfc.neg <- createStyle(fgFill = "#88CEEB")
wb7 <- createWorkbook()
addWorksheet(wb7, "rna-seq")
writeData(wb7 , "rna-seq" , rna.seq)
for (i in 1:nrow(rna.seq)) {
if (grepl("interferon", rna.seq[i, "description"])) {
if (rna.seq[i, "logFC"] >= 0) {
addStyle( wb7 , "rna-seq" , style = logfc.pos , rows = i + 1 , cols = 1:6)
} else{
addStyle(wb7 , "rna-seq", style = logfc.neg, rows = i + 1 , cols = 1:6)
}
}
}
saveWorkbook(wb7 , "wb7.xlsx" , overwrite = T)
#openXL("wb7.xlsx")
wb8 <- createWorkbook()
addWorksheet(wb8, "rna-seq")
writeData(wb8 , "rna-seq" , rna.seq)
#-- 2 styles
for(i in 1:nrow(rna.seq)) {
if (rna.seq[i, "logFC"] > 0) {
logfc.pos <-
createStyle(fgFill = "#FFD715" ,
fontSize = 3 * pmax(1, abs(rna.seq[i, "logFC"])))
addStyle(wb8 , "rna-seq" , style = logfc.pos , rows = i + 1 , cols = 1:6)
} else{
logfc.neg <- createStyle(fgFill = "#88CEEB",
fontSize = 3 * pmax(1, abs(rna.seq[i, "logFC"])))
addStyle(wb8 , "rna-seq", style = logfc.neg, rows = i + 1 , cols = 1:6)
}
}
saveWorkbook(wb8 , "wb8.xlsx" , overwrite = T)
#openXL("wb8.xlsx")
2 main functions createComment & writeComment
Create comments
#-- Straightforward use
wb9 <- createWorkbook()
addWorksheet(wb9 , "rna-seq" )
writeData (wb9 , "rna-seq" , rna.seq)
#-- Create your comment.
myFirstComment <- createComment(comment = "Comment to be inserted in the RNA-Seq file.",
visible = FALSE
)
#-- Insert your comment in the desired Excel cell.
writeComment(wb9 , "rna-seq" , row = 1 , col = 1 , comment = myFirstComment)
saveWorkbook(wb9 , "wb9.xlsx" , overwrite = T)
#openXL("wb9.xlsx")
#-- Create styles.
Sty1 <- createStyle(fontSize = 12, fontColour = "red", textDecoration = c("BOLD"))
Sty2 <- createStyle(fontSize = 9, fontColour = "blue")
wb10 <- createWorkbook()
addWorksheet(wb10 , "rna-seq" )
writeData (wb10 , "rna-seq" , rna.seq)
#-- Create comment and apply styles on it.
coloredComment <-
createComment(
comment = c(paste("Cette table contient", nrow(rna.seq), "genes significatifs dont" ,
sum(rna.seq$logFC>0),"Up regulated et " ),
paste(sum(rna.seq$logFC<0) , "Down Regulated")
),
style = c(Sty1 , Sty2) ,
author = "BIBS Workshop", visible = FALSE , width = 4 , height = 3
)
writeComment(wb10 , "rna-seq" , row = 1 , col = 1 , comment = coloredComment)
saveWorkbook(wb10 , "wb10.xlsx" , overwrite = T)
#openXL("wb10.xlsx")
Example with a biomaRt query
#-- Connect to a BioMart database and dataset. Default host = "https://www.ensembl.org" mart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL" , dataset = "hsapiens_gene_ensembl")
listMarts() # display available databases in a host.
db <- useMart("ENSEMBL_MART_ENSEMBL")
listDatasets(db) # display available datasets in a given database.
mart <- useMart("ENSEMBL_MART_ENSEMBL" , dataset = "hsapiens_gene_ensembl")
listAttributes(mart) # display available attributes for a given dataset.
#-- Extract specified attributes from a database according to a filter features.
getBM(mart = mart ,
attributes = c("ensembl_gene_id",
"protein_id" ,
"entrezgene_id" ,
"chromosome_name",
"strand"),
filters = "ensembl_gene_id",
values = c( "ENSG00000088827","ENSG00000184347","ENSG00000128335")
)
Write as a comment the proteins Id coded by specific genes
wb11 <- createWorkbook()
addWorksheet(wb11 , "rna-seq" )
writeData (wb11 , "rna-seq" , rna.seq)
for(g in 1:nrow(rna.seq)){
if(grepl("binding protein" , rna.seq$description[g])){
results <- getBM(attributes = "protein_id" ,
filters = "ensembl_gene_id" ,
values = rna.seq$ensembl_gene_id[g] ,
mart = mart
)
myDynamicComment <- createComment(
comment = paste(results$protein_id , collapse = " " , sep = "\n") , visible = F
)
writeComment(wb11 , "rna-seq" , row = g+1 , col = 1 , comment = myDynamicComment)
}
}
saveWorkbook(wb11 , "wb11.xlsx" , overwrite = T)
#openXL("wb11.xlsx")