Saving data in internal R data formats

save() and saveRDS()

Alan Brown

10-Dec-2016

Introduction

Saving data to a file in R is very useful when you work with large datasets and have run long calculations whose results you need for downstream analytics. Preserving intermediate steps can save a lot of time. It also helps with memory usage: once a large object is saved to disk, it can be removed from memory and reloaded when needed.
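As a minimal sketch of that workflow, the snippet below saves an intermediate result, removes it from memory, and restores it later. The object name sim_result and the temporary file are hypothetical stand-ins, not part of the original analysis.

```r
# Hypothetical stand-in for the result of a long calculation
sim_result <- data.frame(id = 1:1000, value = sqrt(1:1000))

# Write it to a temporary .rda file (any writable path would do)
tmp <- tempfile(fileext = ".rda")
save(sim_result, file = tmp)

# Remove the object and ask R to release the memory
rm(sim_result)
invisible(gc())

# Later in the workflow: restore it under its original name
load(tmp)
nrow(sim_result)
```

In a real script the file would live in a project directory rather than a temporary location, so that it survives between R sessions.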

Save and load operations can be very efficient. You can control the compression of the file with the ‘compress’ and ‘compression_level’ arguments of save().
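The effect of those two arguments can be checked by comparing file sizes. This is only a sketch: the vector x is made-up, highly repetitive data, so the savings will look better than on typical real datasets.

```r
# Hypothetical, highly compressible example data
x <- rep(c(1.5, 2.5, 3.5), times = 100000)

f_none <- tempfile(fileext = ".rda")
f_gzip <- tempfile(fileext = ".rda")
f_xz   <- tempfile(fileext = ".rda")

# Same object, three compression settings
save(x, file = f_none, compress = FALSE)
save(x, file = f_gzip, compress = "gzip", compression_level = 6)
save(x, file = f_xz,   compress = "xz",   compression_level = 9)

# Compare the resulting file sizes in bytes
file.size(f_none, f_gzip, f_xz)
```

Stronger compression usually costs more time when saving, so the best setting depends on whether disk space or speed matters more.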

The save() and load() functions

The save() and load() functions write and read R's standard .rda file type. An .rda file can hold one or more R data structures, such as vectors, matrices, and data frames, together with their names. These files are compressed by default, with user options for stronger compression.

The save() function will save the data frame df into the file df.rda, in the current working directory.

save(df, file="df.rda")

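The load() function restores the saved objects under their original names, by default into the environment it is called from (the global environment at the top level). Existing objects with the same names are overwritten: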

load("df.rda")
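Because load() restores objects under their saved names, it can silently replace an object you already have. One way to guard against that, sketched here with hypothetical example data, is to load into a separate environment and inspect what came back.

```r
# Hypothetical example data saved to a temporary file
df <- data.frame(a = 1:3)
f <- tempfile(fileext = ".rda")
save(df, file = f)

# The name df is later reused for something else
df <- "a different object with the same name"

# Load into a fresh environment instead of overwriting df
e <- new.env()
loaded <- load(f, envir = e)

loaded               # names of the restored objects
is.data.frame(e$df)  # the saved data frame lives in e
is.character(df)     # the workspace copy is untouched
```

load() returns the names of the restored objects invisibly, which is a convenient way to find out what an unfamiliar .rda file contains.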

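The saveRDS() function saves a single object without preserving the object name. This is useful when you want to choose the name of the object when you read it back. The .rds extension is the usual convention for these files, and it avoids a clash with df.rda on case-insensitive file systems.

saveRDS(df, file="df.rds")

The readRDS() function reads the supplied file and returns the object, which you assign to a name of your choice. The original name is not preserved.

df2 <- readRDS(file="df.rds")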
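A quick round trip confirms that the object comes back unchanged under its new name. The data frame and temporary file below are hypothetical examples.

```r
# Hypothetical example data
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Save a single object, then read it back under a new name
f <- tempfile(fileext = ".rds")
saveRDS(df, file = f)
df2 <- readRDS(f)

# Check that contents and structure survived the round trip
all.equal(df, df2)
```

Since saveRDS() handles one object per file, several related objects can be bundled into a list first if they need to travel together.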

Dataframe example

knitr::kable(
  zwin1[1:7, ], caption = 'A subset of data.'
)

A subset of data.

Open     High     Low      Close
177.831  177.831  177.756  177.762
177.761  177.784  177.756  177.778
177.776  177.776  177.713  177.713
177.715  177.736  177.715  177.728
177.728  177.777  177.728  177.770
177.770  177.797  177.767  177.789
177.789  177.809  177.788  177.805

Saving data objects to a JSON file

Often I need to transfer data to other software systems, or a client wishes to provide data to other departments whose systems use their own data formats. JSON is one format that is commonly required. The following code exports an R object to a JSON file.

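library(readr)     # write_lines()
library(jsonlite)  # toJSON()
library(magrittr)  # %>% pipe operator

# path must hold the name of the output file (a character string)
df %>% 
    toJSON() %>%
    write_lines(path)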

Here the toJSON() function from the jsonlite package converts an R data frame to a JSON string. The write_lines() function from the readr package then writes that string to a file on disk.
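To check an export, the file can be read back with fromJSON() from the same package. This sketch uses base R's writeLines() rather than readr to keep dependencies down, and a small made-up data frame with column names of my own choosing. Setting digits = NA is my addition: it asks toJSON() for full numeric precision instead of its default rounding.

```r
library(jsonlite)

# Hypothetical example data with two numeric columns
df <- data.frame(open  = c(177.831, 177.761),
                 close = c(177.762, 177.778))

path <- tempfile(fileext = ".json")

# Convert to JSON and write the string to disk
json <- toJSON(df, digits = NA)
writeLines(json, path)

# Read the file back into a data frame
df_back <- fromJSON(path)
all.equal(df$open, df_back$open)
```

JSON carries less type information than R's own formats, so factors, dates and attributes may not survive the round trip the way they do with saveRDS().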

Summary

When you have large, memory-hungry objects such as data frames or matrices that you would like to keep for later, save() and load() are useful, as are saveRDS() and readRDS() for single objects. They also help in long, complex R workflows and scripts where you need to preserve intermediate steps, either for your own analyses or for distribution to client departments.

Alan Brown, CTO, Tendron Systems Ltd, London, UK

