Automating excel outputs
R with Excel
Introduction
In this tutorial i will review some features available to send output from R (dataframes) to Excel. This might be crucial because we feel more comfortable filtering or sorting data in excel, and also because we need to provide the output in excel to end users or as an input to other processes. There is not so much material available, compared to reading excel files. The overview is not exhaustive but restricted to some practical examples which we have handled in Regacc.
Creating a temporary file
For day to day use we can create an un-named temporary file. The best is to create a function that we can add to our rstudio snippets and call it when needed instead of typing it every time we need to do it.
<- function(.data){
show_in_excel <- paste0(tempfile(), ".xlsx")
tmp ::write_xlsx(.data,tmp)
writexlbrowseURL(tmp)
}
show_in_excel(df)
Exporting as an excel table
If we want to avoid having to manually filter the first row in Excel after opening the file for sorting/filtering we can export directly the dataframe as an Excel table.
library(openxlsx)
library(tidyverse)
<-readr::read_csv("loan_data_cleaned.csv") |>
dfmutate(default=ifelse(loan_status==1,"defaulted","non-defaulted"))
<- createStyle(
hs textDecoration = "BOLD", fontColour = "#BBBBBB", fontSize = 12,
fontName = "Arial Narrow", fgFill = "blue"
)
write.xlsx(df, "loandata.xlsx", overwrite = TRUE, asTable = TRUE, headerStyle = hs)
output
Further customisation options
If we want to customise more the output, add a name to the worksheet, table style, font, number format, colours, etc. there are endless possibilities (https://cran.r-project.org/web/packages/openxlsx/vignettes/Formatting.html).
<- createWorkbook()
wb modifyBaseFont(wb, fontSize = 12, fontName = "Calibri Light")
options(openxlsx.numFmt = "#,##0.00")
addWorksheet(wb, "loan_ES")
writeDataTable(wb, "loan_ES", df, tableStyle = "TableStyleMedium13")
saveWorkbook(wb,"loandata.xlsx", overwrite = TRUE)
Modifying an existing file
library(tidyverse)
<-readr::read_csv("loan_data_cleaned.csv") |>
dfmutate(default=ifelse(loan_status==1,"defaulted","non-defaulted"))
<- df %>%
tempgroup_by(default,grade,home_ownership) %>%
summarise(avg = round(mean(annual_inc, na.rm = TRUE)))
<- ggplot()+
ggkgeom_col(data=temp,aes(avg, reorder(default,avg), fill= home_ownership))+
geom_point(data=df, aes(annual_inc,default))+
facet_wrap(~grade, scales = "free_x")+
scale_x_continuous(scales::pretty_breaks(n=4), labels = scales::label_number())+
::theme_fivethirtyeight()+
ggthemes::scale_fill_tableau()+
ggthemestheme(legend.position = "none")
<- df |>
iv_rates select(home_ownership, loan_status) |>
mutate(home_ownership=as.factor(home_ownership)) |>
group_by(home_ownership) |>
summarize(avg_div_rate = mean(loan_status, na.rm=TRUE)) |>
ungroup() |>
mutate(
regions = home_ownership |>
fct_reorder(avg_div_rate)
)
<-iv_rates |>
gghggplot(aes(x=regions, y=avg_div_rate, fill=regions)) + geom_col(color="black") + theme_minimal() +
::scale_fill_tableau() +
ggthemestheme(legend.position="bottom") + geom_label(aes(label=scales::percent(avg_div_rate)), color="white") +
labs(
title = "loan status by home ownership",
y = "loan status",
x = "Regions"
+ scale_y_continuous(labels = scales::percent)
)
<- loadWorkbook("loandata.xlsx")
wb addWorksheet(wb, "Summary")
writeDataTable(wb, sheet="Summary", temp, startCol = 3, startRow = 3,tableStyle = "TableStyleLight13")
print(ggk)
print(ggh)
addWorksheet(wb, "Chart2")
%>% insertPlot(sheet="Chart2",startCol =1, startRow=1,width=6,height =5, dpi=600)
wb saveWorkbook(wb, "loandata_new.xlsx", overwrite = TRUE)
output
output
Split the dataframe and write each group to a different worksheet
Sometimes we may want to split a big dataframe into several worksheets using the groups of a specific column. This is helpful to organise the output and avoid the number of rows limit of 1.000.000.
<- split(df,df$default)
df_unit write.xlsx(df_unit, file = "loan_default.xlsx", overwrite = TRUE)