Latest Versions & Updates: This markdown document was built using the following versions of R and RStudio:

R v. 4.0.0
RStudio v. 1.3.959
Document v. 1.0.1
Last Updated: 2020-04-24

Introduction

This code through explores how to format time-based data using the lubridate package. Dates and times come in a variety of formats, which makes it difficult for base R to interpret time-based data accurately. That is where lubridate comes in: it smooths the rough edges of working with time-based data in R. Lubridate helps users wrangle date and time objects so that R recognizes them for what they are. It is also part of the tidyverse and works well with other tidyverse packages.



Lubridate Within the Tidyverse of Usefulness

Source: tidyverse


Content Overview

In this vignette, we’ll explain and demonstrate how to parse data, get and set components of dates and times, accommodate time zones, and extract time-based data. Lubridate can do far more than what is covered here, including rounding, stamping, calculating, and vectorization, to name a few. See the Further Resources section to learn more.


Why You Should Care

This topic is valuable because R often does not easily or intuitively understand time-based data in datasets. Lubridate is also more forgiving than base R: it consistently accommodates formatting differences and provides a suite of tools that make wrangling time-based data simple and straightforward while it does the heavy lifting.


Learning Objectives

Specifically, you’ll learn how to:

  • Parse data
  • Get and set components of date and time
  • Accommodate time zones
  • Extract time-based data


Installing and Loading

You can install lubridate with the function ‘install.packages()’ and load it with ‘library()’.

install.packages("lubridate")

library(lubridate)


Unlubridated Data: An Example from Jamison Crawford

my_datetime <- "12/21/18 08:30:00 AM"
print(my_datetime)
## [1] "12/21/18 08:30:00 AM"

as.POSIXlt(my_datetime)
## Error in as.POSIXlt.character(my_datetime): character string is not in a standard unambiguous format
## Here, we get an error, since the input isn’t in a “standard unambiguous format”.

## To parse it in base R, we have to tell R exactly how the date and time are formatted.

my_datetime <- "12/21/18 08:30:00 AM"
print(my_datetime)
## [1] "12/21/18 08:30:00 AM"

# %I (12-hour clock) pairs with %p (AM/PM); %H is meant for the 24-hour clock
as.POSIXlt(x = my_datetime, 
           format = "%m/%d/%y %I:%M:%S %p")
## [1] "2018-12-21 08:30:00 EST"
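
For comparison, the same string can be handed to lubridate without writing a format string. This is a minimal sketch, assuming the string is in month/day/year order. Note that mdy_hms() returns the result in UTC unless a tz argument is supplied, so the time zone will differ from the base R output above.

```r
library(lubridate)

my_datetime <- "12/21/18 08:30:00 AM"

# mdy_hms() works out the separators and the AM/PM marker on its own;
# only the ordering (month-day-year, hour-minute-second) comes from the function name
parsed <- mdy_hms(my_datetime)

# The result is a POSIXct date-time, in UTC by default
class(parsed)
format(parsed, "%Y-%m-%d %H:%M:%S")
```

If the local time zone matters, a tz argument such as tz = "America/New_York" can be passed to mdy_hms().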

Parse Data

A basic example shows how easy it is to parse a variety of date-time orderings with functions such as ymd(), ymd_hms(), mdy(), and mdy_hms(), to name a few. One benefit of lubridate is that it is forgiving of inconsistent separators and of month and day names. You can also parse multiple formats at once. Below are some examples:

ymd(20151202)
## [1] "2015-12-02"
mdy("5/14/18")
## [1] "2018-05-14"
# Need additional example of month spelled out and days with abbreviated endings like rd, st, and th. 
# need additional example of how it can handle parsing multiple time units at once. 
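
The following sketch illustrates the parsing cases described in this section: a spelled-out month with an ordinal day suffix, an abbreviated month name, a vector with mixed separators, and a date and time parsed together. The input strings are illustrative only, and month names are assumed to be in an English locale.

```r
library(lubridate)

# Month spelled out, day written with an ordinal suffix ("31st")
# (month names are matched against the current locale; English is assumed)
spelled_out <- mdy("January 31st, 2017")
spelled_out

# Abbreviated month name with the day first
abbreviated <- dmy("31-Jan-2017")
abbreviated

# One vector with mixed separators and unpadded digits, parsed in a single call
mixed <- ymd(c("2009-01-02", "2009 01 03", "2009-1-4"))
mixed

# Date and time units parsed together
with_time <- ymd_hms("2017-01-31 20:11:59")
with_time
```

In each case the function name states the order of the components, and lubridate handles the rest of the formatting.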


Get and Set Components of Date and Time

Lubridate can also get and set individual components of a date-time with functions such as year(), month(), mday(), hour(), minute(), and second(), as the tidyverse documentation notes:

bday <- dmy("14/10/1979")
month(bday)
## [1] 10
wday(bday, label = TRUE)
## [1] Sun
## Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat

year(bday) <- 2016
wday(bday, label = TRUE)
## [1] Fri
## Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat
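
The example above only touches month(), wday(), and year(). As a minimal sketch, the remaining accessors named in this section can be applied to a full date-time in the same way. The timestamp below is an arbitrary illustration, not data from this code through.

```r
library(lubridate)

datetime <- ymd_hms("2016-07-08 12:34:56")

# Read each component with its accessor function
parts <- c(
  year   = year(datetime),
  month  = month(datetime),
  mday   = mday(datetime),
  hour   = hour(datetime),
  minute = minute(datetime),
  second = second(datetime)
)
parts

# Assigning to an accessor changes that component in place
hour(datetime) <- 20
datetime
```

Each accessor doubles as a setter, so the same function name is used for reading and for updating a component.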


Accommodates Time Zones

What’s more, lubridate provides helper functions for data wrangling tasks that involve time zones:

time <- ymd_hms("2010-12-13 15:30:30")
time
## [1] "2010-12-13 15:30:30 UTC"

# Changes printing
with_tz(time, "America/Chicago")
## [1] "2010-12-13 09:30:30 CST"

# Changes time
force_tz(time, "America/Chicago")
## [1] "2010-12-13 15:30:30 CST"
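
The difference between "changes printing" and "changes time" can be checked directly. This sketch reuses the timestamp from the example above and compares the underlying instants; the variable names are illustrative.

```r
library(lubridate)

time <- ymd_hms("2010-12-13 15:30:30")

shown_in_chicago  <- with_tz(time, "America/Chicago")
forced_to_chicago <- force_tz(time, "America/Chicago")

# with_tz() keeps the same instant, so the two values compare equal
shown_in_chicago == time

# force_tz() keeps the clock reading, so the instant shifts by the UTC offset
difftime(forced_to_chicago, time, units = "hours")

# tz() reports the time zone attached to each object
tz(time)
tz(forced_to_chicago)
```

Use with_tz() when the moment in time is correct and only the display should change, and force_tz() when the clock reading is correct but the time zone label is wrong.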


Extracting Time-Based Data

Lubridate is also valuable for its ability to extract time-based data. The package documentation on CRAN provides the example shown below:

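# Define arrive first, as in the lubridate vignette, so the example runs on its own
arrive <- ymd_hms("2011-06-04 12:00:00", tz = "Pacific/Auckland")
second(arrive)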
#> [1] 0
second(arrive) <- 25
arrive
#> [1] "2011-06-04 12:00:25 NZST"
second(arrive) <- 0

wday(arrive)
#> [1] 7
wday(arrive, label = TRUE)
#> [1] Sat
#> Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat
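
The same extraction functions also work on a whole vector of dates, which is how they are typically used on a column in a dataset. This is a minimal sketch with made-up dates; the variable names are illustrative.

```r
library(lubridate)

dates <- ymd(c("2020-01-15", "2020-04-24", "2020-12-31"))

# Each function returns one value per element of the input vector
month_num   <- month(dates)
weekday_num <- wday(dates)   # 1 = Sunday under the default week start
day_of_year <- yday(dates)

data.frame(dates, month_num, weekday_num, day_of_year)
```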



Further Resources

Lubridate is not limited to the functions described here; it includes a number of other useful tools, such as the rounding and stamping features noted in the Content Overview. Learn more about lubridate with the following:




Works Cited

This code through references and cites the following sources: