Date-time data can be frustrating to work with in R. Base R commands for date-times are often unintuitive and change depending on the type of date-time object being used. Moreover, the methods we use with date-times must be robust to time zones, leap days, daylight saving time, and other time-related quirks, and base R lacks these capabilities in some situations. Lubridate makes it easier to do the things R does with date-times and possible to do the things R does not.
Let’s explore some of the functions within the lubridate package.
# The easiest way to get lubridate is to install the whole tidyverse:
install.packages("tidyverse")
# Alternatively, install just lubridate:
install.packages("lubridate")
# Or the development version from GitHub:
# install.packages("devtools")
devtools::install_github("tidyverse/lubridate")
Some of these tasks can be done using strptime() and strftime() in base R, but we will focus on lubridate.
library(lubridate)
POSI….what? There are two internal implementations of date/time: POSIXct, which stores seconds since UNIX epoch (+some other data), and POSIXlt, which stores a list of day, month, year, hour, minute, second, etc.
today = today()
today
[1] "2018-09-12"
now = now()
now
[1] "2018-09-12 18:49:39 SAST"
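To make the POSIXct/POSIXlt distinction concrete, here is a minimal base R sketch. The timestamp is an arbitrary example value chosen for illustration, not something taken from the session above.

```r
# Base R only; the timestamp is an arbitrary example value.
x_ct <- as.POSIXct("2018-08-09 12:30:30", tz = "UTC")
x_lt <- as.POSIXlt(x_ct)

# POSIXct: a single number, the seconds elapsed since 1970-01-01 00:00:00 UTC
as.numeric(x_ct)

# POSIXlt: a list of named components
x_lt$year + 1900  # years are stored as an offset from 1900
x_lt$mon + 1      # months are stored as 0-11
x_lt$mday         # day of the month
x_lt$hour         # hour of the day
```

POSIXct is usually the more convenient of the two for storage in data frames, while POSIXlt makes individual components easy to reach.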
To create Date or POSIXct objects from strings, lubridate has a number of aptly named functions: ymd(), ymd_hms(), dmy(), dmy_hms(), mdy(), …
# When is Women's day?
ymd("20180809")
[1] "2018-08-09"
mdy("08-09-2018")
[1] "2018-08-09"
dmy("09/08/2018")
[1] "2018-08-09"
# You try
# Parse as date
___("17 Sep 2018")
____("March 6, 1957")
# Parse as date and time (with no seconds!)
_____("July 15, 2012 12:56")
# Parse as date-time (with seconds)
______('2011-04-06 08:00:10')
# Parse all as dates
x <- c("2009-01-01", "2009-01-02", "2009-01-03")
___(x)
# Format to display date as "2018-01-01"
___(180101, 180102)
# Format to display "2010-02-01"
___(010210)
# Format to display "2010-01-02"
___(010210)
As you can see, each function's name spells out the order of the components in the string. The parsers also accept a variety of separators (e.g. /, : or -). Use the tz argument to set the time zone.
womensday <- ymd_hms("2018-08-09-12-30-30", tz = "GMT")
womensday
[1] "2018-08-09 12:30:30 GMT"
# Extract only the date
date(womensday)
[1] "2018-08-09"
# Is it a leap year?
leap_year(womensday)
[1] FALSE
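The separator and time zone behaviour described above can be checked with a short sketch. The time zone name "Africa/Johannesburg" and the example strings are assumptions chosen for illustration; with_tz() is lubridate's function for showing the same instant in another zone.

```r
library(lubridate)

# The same date written with three different separators
d1 <- ymd("2018-08-09")
d2 <- ymd("2018/08/09")
d3 <- ymd("2018.08.09")
d1 == d2
d2 == d3

# Date-only parsers return Date objects
class(d1)

# Date-time parsers return POSIXct objects; tz sets the time zone
dt <- ymd_hms("2018-08-09 12:30:30", tz = "Africa/Johannesburg")
class(dt)
tz(dt)

# Same instant, displayed in UTC (Johannesburg is two hours ahead of UTC)
with_tz(dt, "UTC")
```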
Extract components with functions such as year(), month(), mday(), hour(), minute() and second():
year(womensday)
[1] 2018
month(womensday)
[1] 8
week(womensday)
[1] 32
wday(womensday)
[1] 5
wday(womensday, label=TRUE)
[1] Thu
Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat
yday(womensday)
[1] 221
hour(womensday)
[1] 12
# You try
# Save your birthday as a date-time object (make up a time if you don't know!)
bday <- _______()
# What number week in the year were you born?
_____(bday)
# What number day in the year were you born?
____(bday)
# Were you born in a leap year?
____(bday)
# What day of the week were you born? Print out the full, unabbreviated weekday
____(bday)
Lubridate can be used not only to extract but also to change parts of date-time objects:
month(womensday) <- 12
womensday
[1] "2018-12-09 12:30:30 GMT"
second(womensday) <- 99
womensday
[1] "2018-12-09 12:31:39 GMT"
# Use update to change multiple values
youthday <- update(womensday, year = 1976, day = 16, month = 6, hour = 8)
# You try
# Change your birth date to your birthday this year
bday_2018 <- ____
bday_2018
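Intervals
An interval is a span of time between two specific instants. Create one with the interval() function (new_interval() is the older name and is deprecated in recent lubridate releases).
# Create an interval in time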
my_int <- interval(womensday, now)
my_int
[1] 2018-12-09 12:31:39 GMT--2018-09-12 16:49:39 GMT
# Access the start of an interval
start <- int_start(my_int)
start
[1] "2018-12-09 12:31:39 GMT"
# Access the end point of an interval
end <- int_end(my_int)
end
[1] "2018-09-12 16:49:39 GMT"
# Flip an interval
tni_ym <- int_flip(my_int)
tni_ym
[1] 2018-09-12 16:49:39 GMT--2018-12-09 12:31:39 GMT
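Lubridate also provides two classes for general time spans: durations and periods.
Durations
Durations measure exact elapsed time in seconds. Create them with helper functions prefixed with d, such as dhours() and dseconds().
# Create durations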
d <- ddays(14)
d
[1] "1209600s (~2 weeks)"
w <- dweeks(104)
w
[1] "62899200s (~1.99 years)"
# Create duration from numeric
as.duration(100)
[1] "100s (~1.67 minutes)"
# Create duration from a time interval
my_int_d <- as.duration(my_int)
my_int_d
[1] "7587719.47534204s (~12.55 weeks)"
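Periods
Periods measure time in calendar units. Create them with helper functions such as years() and months(). Note that these are not the same as the accessor functions year() and month().
# Create periods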
p <- months(3) + days(12)
p
[1] "3m 12d 0H 0M 0S"
# Create a period from a time interval
my_int_p <- as.period(my_int)
my_int_p
[1] "-2m -26d -19H -41M -59.4753420352936S"
Why two classes?
minutes(2) # period
[1] "2M 0S"
## 2 minutes
dminutes(2) # duration
[1] "120s (~2 minutes)"
## 120s (~2 minutes)
my_int_d == my_int_p
[1] FALSE
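One way to see why both classes exist is to add "one day" across a daylight saving transition. The sketch below assumes the America/Chicago time zone, where clocks moved forward on 2018-03-11; the zone and date are illustrative choices, not part of the session above.

```r
library(lubridate)

# Noon on the day before clocks spring forward in America/Chicago
before <- ymd_hms("2018-03-10 12:00:00", tz = "America/Chicago")

# Period: same clock time on the next calendar day (only 23 hours later)
before + days(1)

# Duration: exactly 86400 seconds later, so the clock reads one hour later
before + ddays(1)
```

Periods follow the calendar and the clock on the wall; durations follow a stopwatch. Which one is right depends on the question being asked.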
Comparing the “timeline” and the “number line”
leap_year(2011)
[1] FALSE
## FALSE
ymd(20110101) + dyears(1)
[1] "2012-01-01"
## "2012-01-01 UTC"
ymd(20110101) + years(1)
[1] "2012-01-01"
## "2012-01-01 UTC"
leap_year(2012)
[1] TRUE
## TRUE
ymd(20120101) + dyears(1)
[1] "2012-12-31"
## "2012-12-31 UTC"
ymd(20120101) + years(1)
[1] "2013-01-01"
## "2013-01-01 UTC"
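The output above was recorded in 2018. As far as I know, newer lubridate releases define dyears(1) as an average year of 365.25 days, so the dyears() results may print differently on a current installation. The same leap-year contrast can be sketched in a version-independent way with ddays(365), under the assumption that a "duration year" means exactly 365 days:

```r
library(lubridate)

start <- ymd("2012-01-01")  # 2012 is a leap year (366 days)

# Duration: exactly 365 * 86400 seconds, one day short of the full leap year
as_date(start + ddays(365))

# Period: one calendar year, whatever its length
start + years(1)
```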
# Is my date within an interval?
a = ymd(20170101)
a
[1] "2017-01-01"
a %within% my_int
[1] FALSE
b = a + years(1) + months(7) + days(10)
b
[1] "2018-08-11"
b %within% my_int
[1] FALSE
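Both checks above return FALSE, so a case that returns TRUE may help. This sketch uses a hypothetical interval named year_2018 covering the calendar year 2018; the name and dates are illustrative assumptions.

```r
library(lubridate)

# Hypothetical interval covering calendar year 2018
year_2018 <- interval(ymd("2018-01-01"), ymd("2018-12-31"))

# A date inside the interval
ymd("2018-08-09") %within% year_2018

# A date outside the interval
ymd("2017-01-01") %within% year_2018
```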
# You try
# How many seconds in 6 months?
# Create a period of 2 seconds and 34 milliseconds
# Add four hours onto now and save as bedtime
# What interval of time lies between your birthday and now?
# How old are you in seconds at this point in time?
# Create an interval of time from beginning of this year to now
# Has your birthday this year occurred within this interval?
# How long until you turn 100 years old?!