A common use case for scheduling on RStudio Connect is automating data updates as part of an R-based process. This example report outputs a CSV file that other assets hosted on RStudio Connect can consume.
The purpose of this R Markdown document is to make an output data file (updated on a schedule) available over HTTP on my RStudio Connect server.
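As a rough sketch of how another asset might consume the file: as far as I understand it, Connect serves an output file at the report's content URL followed by the file name. The server address, the content ID, and the `output_file_url()` helper below are hypothetical placeholders, not values taken from this report.

```r
# Hypothetical helper: build the URL of an output file from a content URL.
output_file_url <- function(content_url, file = "data.csv") {
  # Drop any trailing slashes so the pieces join with exactly one "/"
  paste0(sub("/+$", "", content_url), "/", file)
}

# Placeholder address; substitute your own server and content ID.
csv_url <- output_file_url("https://connect.example.com/content/42/")
print(csv_url)

# A consuming report or app could then read the data directly:
# df <- read.csv(csv_url)
```

If the report is access-restricted, the request will likely need to be authenticated, which a plain `read.csv()` call on a URL cannot do.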
df <- data.frame(a=rnorm(50), b=rnorm(50), c=rnorm(50), d=rnorm(50), e=rnorm(50))
Every time this report is executed, it creates a new random data frame. Creating dummy data is not representative of a typical ETL process. You’ll likely want to replace this section with code that pulls data from a database or API.
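For instance, the dummy-data step could be swapped for a call to a REST API. This is only a sketch: the endpoint URL is a placeholder, `fetch_data()` and `parse_payload()` are hypothetical helper names, and it assumes the API returns CSV text.

```r
# Hypothetical helper: turn CSV text returned by an API into a data frame.
parse_payload <- function(text) {
  read.csv(text = text, stringsAsFactors = FALSE)
}

# Hypothetical helper: download and parse data from an endpoint
# (assumed here to return CSV).
fetch_data <- function(url) {
  resp <- httr::GET(url)
  httr::stop_for_status(resp)  # fail the scheduled run on an HTTP error
  parse_payload(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Placeholder URL; replace with a real endpoint.
# df <- fetch_data("https://api.example.com/measurements.csv")

# The parsing step can be tried offline with a literal payload:
sample_df <- parse_payload("a,b\n1,2\n3,4")
print(sample_df)
```

Keeping the download and the parsing in separate functions lets the parsing be checked without network access, which helps when debugging a scheduled run.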
The httr package is a good place to start when working with REST APIs and the HTTP protocol.
library(gt)
library(dplyr)
df %>%
  sample_n(6) %>%
  gt() %>%
  tab_header(
    title = "Current Data Sample"
  )
Current Data Sample
| a | b | c | d | e |
|---|---|---|---|---|
| 1.5194148 | -1.7494129 | -0.1947018 | -0.22157239 | -0.3494667 |
| -1.4540169 | -2.7715115 | -0.2101590 | 0.81436839 | -0.9930990 |
| 0.8373867 | 1.2545409 | -0.2724137 | -1.36341272 | -1.1471346 |
| 1.2488834 | 0.5995953 | 0.3375848 | -0.09124117 | 1.5337757 |
| -1.3291554 | 0.2626559 | -0.6465437 | -1.00045850 | -1.7296960 |
| 1.2489406 | -1.6223056 | -0.2524069 | -0.59752428 | -0.9644789 |
write.csv(df, "data.csv", row.names=FALSE)
This is the step that creates the data.csv output file. There are two ways to specify output files:
rmd_output_metadata and rsc_output_files (done above)
Reference: How to work with output files
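To make the output-file step concrete, here is a minimal sketch of registering the file from R code instead of the document header. It assumes the rmarkdown package is installed and that `rmarkdown::output_metadata$set()` is the in-chunk counterpart of the `rmd_output_metadata` field, as I understand the Connect documentation. The small data frame is a stand-in for real data.

```r
# Stand-in data; in the report this would be the freshly pulled data frame.
df <- data.frame(a = rnorm(5), b = rnorm(5))

# Write the file next to the rendered report.
write.csv(df, "data.csv", row.names = FALSE)

# Register the file as an output file from inside an R chunk
# (assumption: this mirrors the rmd_output_metadata header field).
if (requireNamespace("rmarkdown", quietly = TRUE)) {
  rmarkdown::output_metadata$set(rsc_output_files = list("data.csv"))
}
```

Setting the metadata in code can be handy when the file names are only known at run time, for example when a date is part of the name.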