Goal: export dataframe to excel and replace only a specific excel worksheet in an existing file

Option 1: package openxls

dtaraw<-dtaraw%>%mutate(test=Sys.time())

wb <- loadWorkbook(paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"))

writeData(wb, sheet = "Facility-level raw data", dtaraw)

saveWorkbook(wb,
             paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"),
             overwrite = TRUE)

Problem:
1. Tab colors disappear - which is i think okay, I can live with that.
2. When I execute the code repeatedly, from the second time the workbook format is very unstable. Sometimes it works, sometimes all worksheeets lose their content, or sometimes only the data frame worksheet remains.

Option 2: package openxls

dtaraw<-dtaraw%>%mutate(test=Sys.time())

wb <- loadWorkbook(paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"))

removeWorksheet(wb, "Facility-level raw data")
addWorksheet(wb, "Facility-level raw data")
writeData(wb, "Facility-level raw data", dtaraw)
saveWorkbook(wb, 
             paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"), 
             overwrite = TRUE)

Problem: same as above. Unstable after the first run…

Option 3: package xlsx

dtaraw<-dtaraw%>%mutate(test=Sys.time())%>%
    mutate(
        testsuccess="yes",
        timeoftesting=time)

write.xlsx(dtaraw, 
           paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"), 
           sheet = "Facility-level raw data", 
           append = FALSE)
           

Problem: This one does not require openxlsx (which may cause issues because of its Java dependency - see below) uses xlsx(http://www.sthda.com/english/wiki/r-xlsx-package-a-quick-start-guide-to-manipulate-excel-files-in-r ). BUT, this one doesn’t work when the worksheet already exists. It replaces the entire workbook/file with only one worksheet, deleting all other existing tabs in a chartbook. others have the same problem

NOTE:
* openxls - Java dependent
* xlsx - NOT Java dependent. BUT, it depends on rJava package. What??!!!
https://stackoverflow.com/questions/62012760/r-xlsx-package-can-not-be-used/62012844

(Last updated: 2021-09-23 11:25:18)


A. SETTING

#SET WORKING DIRECTORIES where this markdown file and subfolders are located

setwd("~/Dropbox/0 iSquared/iSquared_WHO/ACTA/3.AnalysisPlan/R_debugging/")

chartbookdir<-("~/Dropbox/0 iSquared/iSquared_WHO/ACTA/3.AnalysisPlan/R_debugging/")

limesurveydir<-("~/Dropbox/0 iSquared/iSquared_WHO/ACTA/3.AnalysisPlan/ExportedCSV_FromLimeSurvey/")

# Define local macro for the survey 
country<-"Cameroon"     #country name 
round<-1                #round      
year<-2021              #year of the mid point in data collection   
month<-4                #month of the mid point in data collection              
surveyid<-716722        #LimeSurvey survey ID for CAMEROON  
# local macro for analysis: no change needed  

A.1. Check the existing chartbook first

# Check the list of sheets 
sheets<-excel_sheets(paste0(chartbookdir,
                            "CMR_CEHS_Chartbook_TroubleShooting.xlsx"))
sheets 

[1] “INSTRUCTIONS” “Weight”
[3] “Variables” “Facility-level raw data” [5] “Old Cover”

# Check cells are not empty  
for(i in as.vector(sheets)){
    print(i)

    dta<-read_excel(path=paste0(chartbookdir,
                               "CMR_CEHS_Chartbook_TroubleShooting.xlsx"), 
                   sheet = i, 
                   range = "A1:G20", 
                   col_names = FALSE)
    str(dta)
}

[1] “INSTRUCTIONS” tibble [20 × 7] (S3: tbl_df/tbl/data.frame) $ …1: chr [1:20] NA NA NA NA … $ …2: chr [1:20] NA NA NA NA … $ …3: logi [1:20] NA NA NA NA NA NA … $ …4: logi [1:20] NA NA NA NA NA NA … $ …5: logi [1:20] NA NA NA NA NA NA … $ …6: logi [1:20] NA NA NA NA NA NA … $ …7: logi [1:20] NA NA NA NA NA NA … [1] “Weight” tibble [20 × 7] (S3: tbl_df/tbl/data.frame) $ …1: chr [1:20] “facilitycode” “1014242” “2021034” “3023481” … $ …2: chr [1:20] “weight” “1.5” “0.5” “1.1000000000000001” … $ …3: logi [1:20] NA NA NA NA NA NA … $ …4: logi [1:20] NA NA NA NA NA NA … $ …5: logi [1:20] NA NA NA NA NA NA … $ …6: logi [1:20] NA NA NA NA NA NA … $ …7: logi [1:20] NA NA NA NA NA NA … [1] “Variables” tibble [20 × 7] (S3: tbl_df/tbl/data.frame) $ …1: num [1:20] 1 2 3 4 5 6 7 8 9 10 … $ …2: chr [1:20] “variable” “id” “submitdate” “lastpage” … $ …3: chr [1:20] “number in the questionnaire” “metadata” “metadata” “metadata” … $ …4: chr [1:20] “question” NA NA NA … $ …5: logi [1:20] NA NA NA NA NA NA … $ …6: logi [1:20] NA NA NA NA NA NA … $ …7: logi [1:20] NA NA NA NA NA NA … [1] “Facility-level raw data” tibble [0 × 0] (S3: tbl_df/tbl/data.frame) Named list() [1] “Old Cover” tibble [20 × 7] (S3: tbl_df/tbl/data.frame) $ …1: logi [1:20] NA NA NA NA NA NA … $ …2: chr [1:20] NA NA NA NA … $ …3: chr [1:20] NA NA NA NA … $ …4: logi [1:20] NA NA NA NA NA NA … $ …5: logi [1:20] NA NA NA NA NA NA … $ …6: logi [1:20] NA NA NA NA NA NA … $ …7: logi [1:20] NA NA NA NA NA NA …


B. Import and drop duplicate cases

B.1. Import raw data from LimeSurvey
#Import directly from LimeSurvey
dtaraw<-read.csv(paste0("https://extranet.who.int/dataformv3/index.php/plugins/direct?plugin=CountryOverview&docType=1&sid=",surveyid,"&language=en&function=createExport"))

obsraw<-nrow(dtaraw)
cols<-ncol(dtaraw)

As of 2021-09-23 11:25:18, the downloaded raw data has 34 observations and 166 variables.


B.2. Export/save the data daily in CSV form with date
write.csv(dtaraw, paste0(limesurveydir,"LimeSurvey_CEHS_", country, "_R", round, "_", date, ".csv"))

B.3. Export the data to chartbook

Then, export the raw data into the chartbook (green tab: “Facility-level raw data”) - as is.

# OPTION 1
# this option is ideal. But, it requires the openxls package. 
# https://www.rdocumentation.org/packages/openxlsx/versions/4.2.3
# Some people may have issues with the Java requirement 
# https://stackoverflow.com/questions/62012760/r-xlsx-package-can-not-be-used/62012844
#=========> Q1 to David, please see if you can install and load the library 

#BUT, now, I have new problems. 
#1. tab colors disappear - which is i think okay, I can live with that. 
#2. When I execute the code repeatedly, from the second time the workbook format is very unstable. Sometimes it works, sometimes all worksheeets lose their content, or sometimes only the data frame worksheet remains. I don't know what is worse between copy/pasting the worksheet and constantly having to save the initial chartbook

#=========> Q2 to David, please see if you have the same problem 

dtaraw<-dtaraw%>%mutate(test=Sys.time())

wb <- loadWorkbook(paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"))

writeData(wb, sheet = "Facility-level raw data", dtaraw)

saveWorkbook(wb,
             paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"),
             overwrite = TRUE)
# OPTION 2

dtaraw<-dtaraw%>%mutate(test=Sys.time())

wb <- loadWorkbook(paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"))

removeWorksheet(wb, "Facility-level raw data")
addWorksheet(wb, "Facility-level raw data")
writeData(wb, "Facility-level raw data", dtaraw)
saveWorkbook(wb, 
             paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"), 
             overwrite = TRUE)
# OPTION 3
dtaraw<-dtaraw%>%mutate(test=Sys.time())%>%
    mutate(
        testsuccess="yes",
        timeoftesting=time)

write.xlsx(dtaraw, 
           paste0(chartbookdir, "CMR_CEHS_Chartbook_TroubleShooting.xlsx"), 
           sheet = "Facility-level raw data", 
           append = FALSE)

THEN CHECK IF IT WORKED

sheets<-c("Weight",
          "Variables",
          "Facility-level raw data")  

# Check cells are not empty  
for(i in as.vector(sheets)){
    print(i)
    str(read_excel(path=paste0(chartbookdir,
                               "CMR_CEHS_Chartbook_TroubleShooting.xlsx"), 
                   sheet = i, 
                   range = "A1:G20", 
                   col_names = FALSE)) 
}

[1] “Weight” tibble [20 × 7] (S3: tbl_df/tbl/data.frame) $ …1: chr [1:20] “facilitycode” “1014242” “2021034” “3023481” … $ …2: chr [1:20] “weight” “1.5” “0.5” “1.1000000000000001” … $ …3: logi [1:20] NA NA NA NA NA NA … $ …4: logi [1:20] NA NA NA NA NA NA … $ …5: logi [1:20] NA NA NA NA NA NA … $ …6: logi [1:20] NA NA NA NA NA NA … $ …7: logi [1:20] NA NA NA NA NA NA … [1] “Variables” tibble [20 × 7] (S3: tbl_df/tbl/data.frame) $ …1: num [1:20] 1 2 3 4 5 6 7 8 9 10 … $ …2: chr [1:20] “variable” “id” “submitdate” “lastpage” … $ …3: chr [1:20] “number in the questionnaire” “metadata” “metadata” “metadata” … $ …4: chr [1:20] “question” NA NA NA … $ …5: logi [1:20] NA NA NA NA NA NA … $ …6: logi [1:20] NA NA NA NA NA NA … $ …7: logi [1:20] NA NA NA NA NA NA … [1] “Facility-level raw data” tibble [20 × 7] (S3: tbl_df/tbl/data.frame) $ …1: chr [1:20] “id” “32” “33” “36” … $ …2: chr [1:20] “submitdate” “2021-04-06 16:43:06” “2021-04-06 16:14:15” “2021-04-06 19:33:50” … $ …3: chr [1:20] “lastpage” “7” “7” “4” … $ …4: chr [1:20] “startlanguage” “fr” “fr” “fr” … $ …5: chr [1:20] “seed” “439831178” “1982172382” “1823890142” … $ …6: chr [1:20] “token” “GIHUv3EtSJ” “ZBnc6zBNcz” “efnA0FaOhy” … $ …7: chr [1:20] “startdate” “2021-04-06 12:47:43” “2021-04-06 13:36:27” “2021-04-06 19:15:15” …