Lubridate Package:

The lubridate package makes working with date-time data easier. Base R commands for date-times are generally unintuitive and not robust. According to Hadley Wickham, “lubridate has a consistent, memorable syntax, that makes working with dates fun instead of frustrating.” For additional information, use help(package = lubridate) to bring up an overview of the package and its functions.
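To illustrate the contrast with base R, here is a minimal sketch that parses one month-day-year string both ways. The sample string and the variable names (raw_date, base_date, lub_date) are my own and are used only for illustration.

```r
library(lubridate)

raw_date <- "1-13-2000"  # hypothetical sample string in month-day-year order

# Base R: the format string has to be spelled out by hand.
base_date <- as.Date(raw_date, format = "%m-%d-%Y")

# lubridate: the function name states the order of the components.
lub_date <- mdy(raw_date)

base_date == lub_date  # TRUE: both are the Date 2000-01-13
```

Both calls give the same Date; the lubridate version simply does not require remembering the strptime-style format codes.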

Load Packages:

library(readr)
library(lubridate)
library(dplyr)
library(knitr)

Load data from the FiveThirtyEight article “Some People Are Too Superstitious To Have A Baby On Friday The 13th,” keep only the births from 2000, and build raw date strings in several formats to parse later:

US_births_2000 <- read_csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/births/US_births_1994-2003_CDC_NCHS.csv") %>%
  filter(year == 2000) %>%
  # paste() converts the numeric columns to character, so no as.character() step is needed
  mutate(ymd.raw = paste(year, month, date_of_month, sep = "-"),
         mdy.raw = paste(month, date_of_month, year, sep = "-"),
         dmy.raw = paste(date_of_month, month, year, sep = "-"),
         # append the current hour, minute, and second to get a date-time string
         ymd.hms.raw = paste(year, month, date_of_month,
                             hour(now()), minute(now()), second(now()), sep = "-")) %>%
  # drop the original components so only the raw strings remain
  select(-c(year, month, date_of_month))

kable(head(US_births_2000,10),  caption = "2000 Birth Data")
2000 Birth Data
day_of_week  births  ymd.raw    mdy.raw    dmy.raw    ymd.hms.raw
6            8843    2000-1-1   1-1-2000   1-1-2000   2000-1-1-12-49-10.9784529209137
7            7816    2000-1-2   1-2-2000   2-1-2000   2000-1-2-12-49-10.9784529209137
1            11123   2000-1-3   1-3-2000   3-1-2000   2000-1-3-12-49-10.9784529209137
2            12703   2000-1-4   1-4-2000   4-1-2000   2000-1-4-12-49-10.9784529209137
3            12240   2000-1-5   1-5-2000   5-1-2000   2000-1-5-12-49-10.9784529209137
4            12260   2000-1-6   1-6-2000   6-1-2000   2000-1-6-12-49-10.9784529209137
5            12280   2000-1-7   1-7-2000   7-1-2000   2000-1-7-12-49-10.9784529209137
6            8750    2000-1-8   1-8-2000   8-1-2000   2000-1-8-12-49-10.9784529209137
7            7736    2000-1-9   1-9-2000   9-1-2000   2000-1-9-12-49-10.9784529209137
1            11418   2000-1-10  1-10-2000  10-1-2000  2000-1-10-12-49-10.9784529209137

Functionality Overview:

Parsing Date Information:

  • ymd(): parses strings in the order year, month, day. I apply it to the “ymd.raw” column and save the results as “ymd.clean”; the “ymd.type” column records the class of the result.
cleaned_date_df <- US_births_2000%>%
                  mutate(ymd.clean = ymd(ymd.raw),
                         ymd.type = class(ymd.clean))%>%
                  select(-c(ymd.raw))
kable(head(cleaned_date_df,10),  caption = "ymd() results")
ymd() results
day_of_week  births  mdy.raw    dmy.raw    ymd.hms.raw                       ymd.clean   ymd.type
6            8843    1-1-2000   1-1-2000   2000-1-1-12-49-10.9784529209137   2000-01-01  Date
7            7816    1-2-2000   2-1-2000   2000-1-2-12-49-10.9784529209137   2000-01-02  Date
1            11123   1-3-2000   3-1-2000   2000-1-3-12-49-10.9784529209137   2000-01-03  Date
2            12703   1-4-2000   4-1-2000   2000-1-4-12-49-10.9784529209137   2000-01-04  Date
3            12240   1-5-2000   5-1-2000   2000-1-5-12-49-10.9784529209137   2000-01-05  Date
4            12260   1-6-2000   6-1-2000   2000-1-6-12-49-10.9784529209137   2000-01-06  Date
5            12280   1-7-2000   7-1-2000   2000-1-7-12-49-10.9784529209137   2000-01-07  Date
6            8750    1-8-2000   8-1-2000   2000-1-8-12-49-10.9784529209137   2000-01-08  Date
7            7736    1-9-2000   9-1-2000   2000-1-9-12-49-10.9784529209137   2000-01-09  Date
1            11418   1-10-2000  10-1-2000  2000-1-10-12-49-10.9784529209137  2000-01-10  Date
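As a small aside on how forgiving ymd() is, the sketch below feeds it a few differently separated strings and one string that is not a date at all. The sample strings and the names flexible and bad are hypothetical and not part of the birth data.

```r
library(lubridate)

# ymd() accepts different separators (or none) as long as the order is year, month, day.
flexible <- ymd(c("2000-1-1", "2000/01/01", "20000101"))
flexible

# A string that cannot be parsed becomes NA (with a warning) rather than an error.
bad <- suppressWarnings(ymd("not a date"))
is.na(bad)  # TRUE
```

Because failures come back as NA, it is worth checking a freshly parsed column with something like sum(is.na(...)) before relying on it.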
  • mdy(): parses strings in the order month, day, year. I apply it to the “mdy.raw” column and save the results as “mdy.clean.” The mdy() function returns a Date, which prints as year-month-day regardless of the input order. This is helpful when normalizing data. As the table below shows, “ymd.clean” equals “mdy.clean.”
cleaned_date_df <- cleaned_date_df%>%
                  mutate(mdy.clean = mdy(mdy.raw))%>%
                  select(-c(mdy.raw))
kable(head(cleaned_date_df,10),  caption = "mdy() results")
mdy() results
day_of_week  births  dmy.raw    ymd.hms.raw                       ymd.clean   ymd.type  mdy.clean
6            8843    1-1-2000   2000-1-1-12-49-10.9784529209137   2000-01-01  Date      2000-01-01
7            7816    2-1-2000   2000-1-2-12-49-10.9784529209137   2000-01-02  Date      2000-01-02
1            11123   3-1-2000   2000-1-3-12-49-10.9784529209137   2000-01-03  Date      2000-01-03
2            12703   4-1-2000   2000-1-4-12-49-10.9784529209137   2000-01-04  Date      2000-01-04
3            12240   5-1-2000   2000-1-5-12-49-10.9784529209137   2000-01-05  Date      2000-01-05
4            12260   6-1-2000   2000-1-6-12-49-10.9784529209137   2000-01-06  Date      2000-01-06
5            12280   7-1-2000   2000-1-7-12-49-10.9784529209137   2000-01-07  Date      2000-01-07
6            8750    8-1-2000   2000-1-8-12-49-10.9784529209137   2000-01-08  Date      2000-01-08
7            7736    9-1-2000   2000-1-9-12-49-10.9784529209137   2000-01-09  Date      2000-01-09
1            11418   10-1-2000  2000-1-10-12-49-10.9784529209137  2000-01-10  Date      2000-01-10
  • dmy(): parses strings in the order day, month, year. I apply it to the “dmy.raw” column and save the results as “dmy.clean.” Like mdy(), dmy() returns a Date that prints as year-month-day, which is helpful when normalizing data. As the table below shows, “ymd.clean” equals “dmy.clean.”
cleaned_date_df <- cleaned_date_df%>%
                  mutate(dmy.clean = dmy(dmy.raw))%>%
                  select(-c(dmy.raw))
kable(head(cleaned_date_df,10),  caption = "dmy() results")
dmy() results
day_of_week  births  ymd.hms.raw                       ymd.clean   ymd.type  mdy.clean   dmy.clean
6            8843    2000-1-1-12-49-10.9784529209137   2000-01-01  Date      2000-01-01  2000-01-01
7            7816    2000-1-2-12-49-10.9784529209137   2000-01-02  Date      2000-01-02  2000-01-02
1            11123   2000-1-3-12-49-10.9784529209137   2000-01-03  Date      2000-01-03  2000-01-03
2            12703   2000-1-4-12-49-10.9784529209137   2000-01-04  Date      2000-01-04  2000-01-04
3            12240   2000-1-5-12-49-10.9784529209137   2000-01-05  Date      2000-01-05  2000-01-05
4            12260   2000-1-6-12-49-10.9784529209137   2000-01-06  Date      2000-01-06  2000-01-06
5            12280   2000-1-7-12-49-10.9784529209137   2000-01-07  Date      2000-01-07  2000-01-07
6            8750    2000-1-8-12-49-10.9784529209137   2000-01-08  Date      2000-01-08  2000-01-08
7            7736    2000-1-9-12-49-10.9784529209137   2000-01-09  Date      2000-01-09  2000-01-09
1            11418   2000-1-10-12-49-10.9784529209137  2000-01-10  Date      2000-01-10  2000-01-10
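The normalization shown in the tables can also be checked on single values. This is a minimal sketch using the January 10 strings from the tables above plus one deliberately ambiguous string; the name same_day is my own.

```r
library(lubridate)

# The same calendar day written in three different orders.
same_day <- c(ymd("2000-1-10"), mdy("1-10-2000"), dmy("10-1-2000"))
same_day

# All three parse to one and the same Date.
length(unique(same_day)) == 1  # TRUE

# The parser has to match the data: the same string means different days.
mdy("1-2-2000")  # 2000-01-02 (January 2)
dmy("1-2-2000")  # 2000-02-01 (February 1)
```

The last two lines are the reason to pick the parser from what is known about the source rather than by trial and error: for days 1 through 12, a wrong choice still parses without any warning.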
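  • ymd_hms(): parses strings in the order year, month, day, hour, minute, second and returns a date-time (POSIXct) rather than a Date. The time zone defaults to UTC unless the tz argument is supplied. I apply it to the “ymd.hms.raw” column and save the results as “ymd.hms.clean.” The fractional seconds in the raw strings are not shown in the printed table.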
cleaned_date_df <- cleaned_date_df%>%
                  mutate(ymd.hms.clean = ymd_hms(ymd.hms.raw))%>%
                  select(-c(ymd.hms.raw))
kable(head(cleaned_date_df,10),  caption = "ymd_hms() results")
ymd_hms() results
day_of_week  births  ymd.clean   ymd.type  mdy.clean   dmy.clean   ymd.hms.clean
6            8843    2000-01-01  Date      2000-01-01  2000-01-01  2000-01-01 12:49:10
7            7816    2000-01-02  Date      2000-01-02  2000-01-02  2000-01-02 12:49:10
1            11123   2000-01-03  Date      2000-01-03  2000-01-03  2000-01-03 12:49:10
2            12703   2000-01-04  Date      2000-01-04  2000-01-04  2000-01-04 12:49:10
3            12240   2000-01-05  Date      2000-01-05  2000-01-05  2000-01-05 12:49:10
4            12260   2000-01-06  Date      2000-01-06  2000-01-06  2000-01-06 12:49:10
5            12280   2000-01-07  Date      2000-01-07  2000-01-07  2000-01-07 12:49:10
6            8750    2000-01-08  Date      2000-01-08  2000-01-08  2000-01-08 12:49:10
7            7736    2000-01-09  Date      2000-01-09  2000-01-09  2000-01-09 12:49:10
1            11418   2000-01-10  Date      2000-01-10  2000-01-10  2000-01-10 12:49:10
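Unlike the date parsers, ymd_hms() produces a date-time, so the time zone matters. The sketch below parses the same clock time twice; the "America/New_York" zone and the names stamp_utc and stamp_ny are assumptions for illustration, not something taken from the birth data.

```r
library(lubridate)

# Without a tz argument the result is a POSIXct in UTC.
stamp_utc <- ymd_hms("2000-01-01 12:49:10")
class(stamp_utc)  # "POSIXct" "POSIXt"
tz(stamp_utc)     # "UTC"

# The same clock time read as New York time is a different instant.
stamp_ny <- ymd_hms("2000-01-01 12:49:10", tz = "America/New_York")
difftime(stamp_ny, stamp_utc, units = "hours")  # Time difference of 5 hours
```

If the raw strings were recorded in local time, passing tz at parse time is usually simpler than correcting the values afterwards.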
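  • wday(): returns the day of the week as a number. By default Sunday is 1, so the values differ from the original “day_of_week” column, which counts Monday as 1. I apply it to the “ymd.clean” column and save the results as “wday.clean”
cleaned_date_df <- cleaned_date_df%>%
                  mutate(wday.clean = wday(ymd.clean))%>%
                  select(-c(day_of_week))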
kable(head(cleaned_date_df,10),  caption = "wday() results")
wday() results
births  ymd.clean   ymd.type  mdy.clean   dmy.clean   ymd.hms.clean        wday.clean
8843    2000-01-01  Date      2000-01-01  2000-01-01  2000-01-01 12:49:10  7
7816    2000-01-02  Date      2000-01-02  2000-01-02  2000-01-02 12:49:10  1
11123   2000-01-03  Date      2000-01-03  2000-01-03  2000-01-03 12:49:10  2
12703   2000-01-04  Date      2000-01-04  2000-01-04  2000-01-04 12:49:10  3
12240   2000-01-05  Date      2000-01-05  2000-01-05  2000-01-05 12:49:10  4
12260   2000-01-06  Date      2000-01-06  2000-01-06  2000-01-06 12:49:10  5
12280   2000-01-07  Date      2000-01-07  2000-01-07  2000-01-07 12:49:10  6
8750    2000-01-08  Date      2000-01-08  2000-01-08  2000-01-08 12:49:10  7
7736    2000-01-09  Date      2000-01-09  2000-01-09  2000-01-09 12:49:10  1
11418   2000-01-10  Date      2000-01-10  2000-01-10  2000-01-10 12:49:10  2
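The numbering that wday() uses can be adjusted. Here is a minimal sketch on the first date in the table, assuming a lubridate version that supports the week_start argument; the name first_day is my own.

```r
library(lubridate)

first_day <- ymd("2000-01-01")  # a Saturday

# week_start = 7 (the usual default): Sunday is day 1, so Saturday is 7.
wday(first_day, week_start = 7)  # 7

# week_start = 1: Monday is day 1, so Saturday is 6, as in the day_of_week column.
wday(first_day, week_start = 1)  # 6

# label = TRUE returns an ordered factor of day names; the text depends on the locale.
wday(first_day, label = TRUE)
```

Stating week_start explicitly guards against a session where the lubridate.week.start option has been changed.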
Additional Lubridate Functions:

Extract Information From Dates:

  • today(): returns today’s date

  • year(): returns the year component of a date

  • month(): returns the month component of a date

Sys.getlocale()
## [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
this_day <- today()
this_day
## [1] "2018-05-09"
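For the two extractors listed above, a minimal sketch on a fixed date (the one printed above, so the results do not change from day to day). The name fixed_day is my own.

```r
library(lubridate)

fixed_day <- ymd("2018-05-09")

year(fixed_day)   # 2018
month(fixed_day)  # 5

# label = TRUE returns the month name as an ordered factor.
# The text follows the session locale, which is why Sys.getlocale() is worth checking.
month(fixed_day, label = TRUE)
month(fixed_day, label = TRUE, abbr = FALSE)
```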

Date-Time Information:

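Several of these (now(), hour(), minute(), and second()) were used to create the base data frame.

  • day(): returns the day of the month

  • wday(): returns the day of the week

  • now(): returns the current date-time

  • hour(): returns the hour component of a date-time

  • minute(): returns the minute component of a date-time

  • second(): returns the second component of a date-time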
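To tie the extractors back to the article’s theme, the sketch below pulls the components out of one fixed date-time and then uses day() and wday() together to find every Friday the 13th in 2000. The names stamp, dates_2000, and friday_13 are my own, and the approach is only one possible way to do this.

```r
library(lubridate)

stamp <- ymd_hms("2000-10-13 12:49:10")

day(stamp)     # 13
hour(stamp)    # 12
minute(stamp)  # 49
second(stamp)  # 10

# Every calendar day of 2000.
dates_2000 <- seq(ymd("2000-01-01"), ymd("2000-12-31"), by = "day")

# With Sunday as day 1 (week_start = 7), Friday is day 6.
friday_13 <- dates_2000[day(dates_2000) == 13 &
                          wday(dates_2000, week_start = 7) == 6]
friday_13  # "2000-10-13"
```

The same condition could be used inside filter() on the “ymd.clean” column to isolate the births recorded on that day.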