A common use case for scheduling on RStudio Connect is to automate recurring data updates as part of an R-based process. This example report outputs a CSV file that other assets hosted on RStudio Connect can consume.

R Markdown Output Metadata and Output Files

The purpose of this R Markdown document is to make an output data file (updated on a schedule) available over HTTP on my RStudio Connect server.

Extract/Transform Data

df <- data.frame(a=rnorm(50), b=rnorm(50), c=rnorm(50), d=rnorm(50), e=rnorm(50))

Every time this report is executed, it creates a new random data frame. Creating dummy data is not representative of a typical ETL process. You’ll likely want to replace this section with code that pulls data from a database or API.

  • Best practices for working with databases can be found at db.rstudio.com
  • The httr package is a good place to start when working with REST APIs and the HTTP protocol
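
As a rough illustration of swapping the dummy data for a database query, the sketch below uses DBI with an in-memory SQLite database so that it runs without any external service. The table name `measurements`, its columns, and the query are hypothetical, and both the DBI and RSQLite packages are assumed to be installed; a real report would connect to its own data source instead.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for a real data source.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Seed a hypothetical table so the example is self-contained.
dbWriteTable(con, "measurements", data.frame(
  id = 1:5,
  value = c(10.5, 3.2, 8.8, 1.1, 7.4)
))

# Extract: pull only the rows the report needs.
df <- dbGetQuery(con, "SELECT id, value FROM measurements WHERE value > 5")

# Transform: add a derived column.
df$value_scaled <- df$value / max(df$value)

# Always release the connection when the extract is done.
dbDisconnect(con)
```

The resulting `df` can then flow into the preview and CSV steps unchanged.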

Show a nice table preview (optional)

library(gt)
library(dplyr)
df %>%
  sample_n(6) %>%
  gt() %>%
  tab_header(
    title = "Current Data Sample"
  )

Current Data Sample
         a           b           c            d           e
 1.5194148  -1.7494129  -0.1947018  -0.22157239  -0.3494667
-1.4540169  -2.7715115  -0.2101590   0.81436839  -0.9930990
 0.8373867   1.2545409  -0.2724137  -1.36341272  -1.1471346
 1.2488834   0.5995953   0.3375848  -0.09124117   1.5337757
-1.3291554   0.2626559  -0.6465437  -1.00045850  -1.7296960
 1.2489406  -1.6223056  -0.2524069  -0.59752428  -0.9644789

Write data (CSV file): Important!

write.csv(df, "data.csv", row.names=FALSE)

This is the step that creates the data.csv output file. There are two ways to tell RStudio Connect which files are output files:

  • List the file names in the R Markdown YAML header, in the rsc_output_files field nested under rmd_output_metadata (done above)
  • Set the list of output files from within an R code chunk

Reference: How to work with output files
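
The second option can be sketched as follows. This assumes the rmarkdown package is installed; `rmarkdown::output_metadata$set()` registers the file from inside a chunk rather than in the YAML header. The small data frame is only a stand-in for the report's `df`.

```r
# Stand-in for the data frame built earlier in the report.
df <- data.frame(a = rnorm(5), b = rnorm(5))

# Write the output file next to the rendered report.
write.csv(df, "data.csv", row.names = FALSE)

# Register the file as an output file from within the code chunk,
# instead of listing it in the YAML header.
rmarkdown::output_metadata$set(rsc_output_files = list("data.csv"))
```

Setting the list from code is likely the more convenient choice when the file names are only known at run time.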


Download data

Here is the data generated from this report: data.csv
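
Another asset can then read the file over HTTP. The sketch below is an assumption about how a consumer might do this: `read_report_csv()` is a hypothetical helper, the URL in the comment is a placeholder, and content that requires authentication would need credentials that are not shown here. So that the example runs without a server, it is exercised against a local `file://` URL on a Unix-like system.

```r
# Hypothetical helper: build the file URL from the report's content URL
# and read the CSV into a data frame.
read_report_csv <- function(content_url, file_name = "data.csv") {
  # Drop any trailing slashes before appending the file name.
  csv_url <- paste0(sub("/+$", "", content_url), "/", file_name)
  read.csv(csv_url)
}

# Against a server this might look like (placeholder URL):
# latest <- read_report_csv("https://connect.example.com/content/123")

# Local demonstration: a temporary directory stands in for the server.
demo_dir <- tempfile("report")
dir.create(demo_dir)
write.csv(data.frame(a = 1:3, b = 4:6), file.path(demo_dir, "data.csv"),
          row.names = FALSE)

demo_url <- paste0("file://", normalizePath(demo_dir, winslash = "/"))
latest <- read_report_csv(demo_url)
```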