lubridate makes working with date-time data easier. Base R commands for date-times are generally unintuitive and not robust. According to Hadley Wickham, “lubridate has a consistent, memorable syntax, that makes working with dates fun instead of frustrating.” For additional information, use help(package = lubridate) to bring up an overview of the package and its functions.
Load Packages:
library(readr)
library(lubridate)
library(dplyr)
library(knitr)
Load data from the FiveThirtyEight article “Some People Are Too Superstitious To Have A Baby On Friday The 13th”
Since the data is already fairly clean, I will create additional raw-text columns to showcase lubridate functionality below.
I will use the now(), hour(), minute() and second() functions to add additional metadata to the dataframe. Please see their definitions below.
Once the new columns are added, I will remove the original clean columns.
US_births_2000 <- read_csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/births/US_births_1994-2003_CDC_NCHS.csv") %>%
  filter(year == 2000) %>%
  mutate(year = as.character(year),
         month = as.character(month),
         date_of_month = as.character(date_of_month)) %>%
  mutate(ymd.raw = paste(year, month, date_of_month, sep = '-')) %>%
  mutate(mdy.raw = paste(month, date_of_month, year, sep = '-')) %>%
  mutate(dmy.raw = paste(date_of_month, month, year, sep = '-')) %>%
  mutate(ymd.hms.raw = paste(year, month, date_of_month, hour(now()), minute(now()), second(now()), sep = '-')) %>%
  select(-c(year, month, date_of_month))
kable(head(US_births_2000,10), caption = "2000 Birth Data")
| day_of_week | births | ymd.raw | mdy.raw | dmy.raw | ymd.hms.raw |
|---|---|---|---|---|---|
| 6 | 8843 | 2000-1-1 | 1-1-2000 | 1-1-2000 | 2000-1-1-12-49-10.9784529209137 |
| 7 | 7816 | 2000-1-2 | 1-2-2000 | 2-1-2000 | 2000-1-2-12-49-10.9784529209137 |
| 1 | 11123 | 2000-1-3 | 1-3-2000 | 3-1-2000 | 2000-1-3-12-49-10.9784529209137 |
| 2 | 12703 | 2000-1-4 | 1-4-2000 | 4-1-2000 | 2000-1-4-12-49-10.9784529209137 |
| 3 | 12240 | 2000-1-5 | 1-5-2000 | 5-1-2000 | 2000-1-5-12-49-10.9784529209137 |
| 4 | 12260 | 2000-1-6 | 1-6-2000 | 6-1-2000 | 2000-1-6-12-49-10.9784529209137 |
| 5 | 12280 | 2000-1-7 | 1-7-2000 | 7-1-2000 | 2000-1-7-12-49-10.9784529209137 |
| 6 | 8750 | 2000-1-8 | 1-8-2000 | 8-1-2000 | 2000-1-8-12-49-10.9784529209137 |
| 7 | 7736 | 2000-1-9 | 1-9-2000 | 9-1-2000 | 2000-1-9-12-49-10.9784529209137 |
| 1 | 11418 | 2000-1-10 | 1-10-2000 | 10-1-2000 | 2000-1-10-12-49-10.9784529209137 |
cleaned_date_df <- US_births_2000 %>%
  mutate(ymd.clean = ymd(ymd.raw),
         ymd.type = class(ymd.clean)) %>%
  select(-c(ymd.raw))
kable(head(cleaned_date_df,10), caption = "ymd() results")
| day_of_week | births | mdy.raw | dmy.raw | ymd.hms.raw | ymd.clean | ymd.type |
|---|---|---|---|---|---|---|
| 6 | 8843 | 1-1-2000 | 1-1-2000 | 2000-1-1-12-49-10.9784529209137 | 2000-01-01 | Date |
| 7 | 7816 | 1-2-2000 | 2-1-2000 | 2000-1-2-12-49-10.9784529209137 | 2000-01-02 | Date |
| 1 | 11123 | 1-3-2000 | 3-1-2000 | 2000-1-3-12-49-10.9784529209137 | 2000-01-03 | Date |
| 2 | 12703 | 1-4-2000 | 4-1-2000 | 2000-1-4-12-49-10.9784529209137 | 2000-01-04 | Date |
| 3 | 12240 | 1-5-2000 | 5-1-2000 | 2000-1-5-12-49-10.9784529209137 | 2000-01-05 | Date |
| 4 | 12260 | 1-6-2000 | 6-1-2000 | 2000-1-6-12-49-10.9784529209137 | 2000-01-06 | Date |
| 5 | 12280 | 1-7-2000 | 7-1-2000 | 2000-1-7-12-49-10.9784529209137 | 2000-01-07 | Date |
| 6 | 8750 | 1-8-2000 | 8-1-2000 | 2000-1-8-12-49-10.9784529209137 | 2000-01-08 | Date |
| 7 | 7736 | 1-9-2000 | 9-1-2000 | 2000-1-9-12-49-10.9784529209137 | 2000-01-09 | Date |
| 1 | 11418 | 1-10-2000 | 10-1-2000 | 2000-1-10-12-49-10.9784529209137 | 2000-01-10 | Date |
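The table above shows ymd() turning unpadded strings such as 2000-1-1 into proper Date values. As a minimal sketch of how tolerant the parser is, the example below feeds it the same day written three ways. The input vector is made up for illustration and is not part of the births data.

```r
library(lubridate)

# Hypothetical inputs: the same calendar day written three different ways
raw <- c("2000-1-1", "2000/01/01", "20000101")

# ymd() only needs to know the order of the components (year, month, day);
# it works out the separators and zero-padding on its own
parsed <- ymd(raw)

parsed         # every element is the Date 2000-01-01
class(parsed)  # "Date"
```

Because the result is a real Date rather than text, it can be sorted, differenced and passed to the accessor functions used later on.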
cleaned_date_df <- cleaned_date_df %>%
  mutate(mdy.clean = mdy(mdy.raw)) %>%
  select(-c(mdy.raw))
kable(head(cleaned_date_df,10), caption = "mdy() results")
| day_of_week | births | dmy.raw | ymd.hms.raw | ymd.clean | ymd.type | mdy.clean |
|---|---|---|---|---|---|---|
| 6 | 8843 | 1-1-2000 | 2000-1-1-12-49-10.9784529209137 | 2000-01-01 | Date | 2000-01-01 |
| 7 | 7816 | 2-1-2000 | 2000-1-2-12-49-10.9784529209137 | 2000-01-02 | Date | 2000-01-02 |
| 1 | 11123 | 3-1-2000 | 2000-1-3-12-49-10.9784529209137 | 2000-01-03 | Date | 2000-01-03 |
| 2 | 12703 | 4-1-2000 | 2000-1-4-12-49-10.9784529209137 | 2000-01-04 | Date | 2000-01-04 |
| 3 | 12240 | 5-1-2000 | 2000-1-5-12-49-10.9784529209137 | 2000-01-05 | Date | 2000-01-05 |
| 4 | 12260 | 6-1-2000 | 2000-1-6-12-49-10.9784529209137 | 2000-01-06 | Date | 2000-01-06 |
| 5 | 12280 | 7-1-2000 | 2000-1-7-12-49-10.9784529209137 | 2000-01-07 | Date | 2000-01-07 |
| 6 | 8750 | 8-1-2000 | 2000-1-8-12-49-10.9784529209137 | 2000-01-08 | Date | 2000-01-08 |
| 7 | 7736 | 9-1-2000 | 2000-1-9-12-49-10.9784529209137 | 2000-01-09 | Date | 2000-01-09 |
| 1 | 11418 | 10-1-2000 | 2000-1-10-12-49-10.9784529209137 | 2000-01-10 | Date | 2000-01-10 |
cleaned_date_df <- cleaned_date_df %>%
  mutate(dmy.clean = dmy(dmy.raw)) %>%
  select(-c(dmy.raw))
kable(head(cleaned_date_df,10), caption = "dmy() results")
| day_of_week | births | ymd.hms.raw | ymd.clean | ymd.type | mdy.clean | dmy.clean |
|---|---|---|---|---|---|---|
| 6 | 8843 | 2000-1-1-12-49-10.9784529209137 | 2000-01-01 | Date | 2000-01-01 | 2000-01-01 |
| 7 | 7816 | 2000-1-2-12-49-10.9784529209137 | 2000-01-02 | Date | 2000-01-02 | 2000-01-02 |
| 1 | 11123 | 2000-1-3-12-49-10.9784529209137 | 2000-01-03 | Date | 2000-01-03 | 2000-01-03 |
| 2 | 12703 | 2000-1-4-12-49-10.9784529209137 | 2000-01-04 | Date | 2000-01-04 | 2000-01-04 |
| 3 | 12240 | 2000-1-5-12-49-10.9784529209137 | 2000-01-05 | Date | 2000-01-05 | 2000-01-05 |
| 4 | 12260 | 2000-1-6-12-49-10.9784529209137 | 2000-01-06 | Date | 2000-01-06 | 2000-01-06 |
| 5 | 12280 | 2000-1-7-12-49-10.9784529209137 | 2000-01-07 | Date | 2000-01-07 | 2000-01-07 |
| 6 | 8750 | 2000-1-8-12-49-10.9784529209137 | 2000-01-08 | Date | 2000-01-08 | 2000-01-08 |
| 7 | 7736 | 2000-1-9-12-49-10.9784529209137 | 2000-01-09 | Date | 2000-01-09 | 2000-01-09 |
| 1 | 11418 | 2000-1-10-12-49-10.9784529209137 | 2000-01-10 | Date | 2000-01-10 | 2000-01-10 |
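The mdy.raw and dmy.raw columns hold strings that look alike, which is why choosing the right parser matters. The sketch below, using a made-up string rather than a row from the data, shows the same text producing two different dates, and what happens when an ordering is impossible.

```r
library(lubridate)

# Hypothetical ambiguous string: is it 2 January or 1 February?
ambiguous <- "1-2-2000"

as_mdy <- mdy(ambiguous)  # month-day-year: 2000-01-02
as_dmy <- dmy(ambiguous)  # day-month-year: 2000-02-01

as_mdy == as_dmy          # FALSE

# "13-1-2000" has no valid month-day-year reading (there is no month 13),
# so mdy() gives NA with a warning, while dmy() parses it without trouble
bad_mdy  <- suppressWarnings(mdy("13-1-2000"))  # NA
good_dmy <- dmy("13-1-2000")                    # 2000-01-13
```

lubridate does not guess the component order from the data, so it is worth checking the source format before picking between mdy() and dmy().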
cleaned_date_df <- cleaned_date_df %>%
  mutate(ymd.hms.clean = ymd_hms(ymd.hms.raw)) %>%
  select(-c(ymd.hms.raw))
kable(head(cleaned_date_df,10), caption = "ymd_hms() results")
| day_of_week | births | ymd.clean | ymd.type | mdy.clean | dmy.clean | ymd.hms.clean |
|---|---|---|---|---|---|---|
| 6 | 8843 | 2000-01-01 | Date | 2000-01-01 | 2000-01-01 | 2000-01-01 12:49:10 |
| 7 | 7816 | 2000-01-02 | Date | 2000-01-02 | 2000-01-02 | 2000-01-02 12:49:10 |
| 1 | 11123 | 2000-01-03 | Date | 2000-01-03 | 2000-01-03 | 2000-01-03 12:49:10 |
| 2 | 12703 | 2000-01-04 | Date | 2000-01-04 | 2000-01-04 | 2000-01-04 12:49:10 |
| 3 | 12240 | 2000-01-05 | Date | 2000-01-05 | 2000-01-05 | 2000-01-05 12:49:10 |
| 4 | 12260 | 2000-01-06 | Date | 2000-01-06 | 2000-01-06 | 2000-01-06 12:49:10 |
| 5 | 12280 | 2000-01-07 | Date | 2000-01-07 | 2000-01-07 | 2000-01-07 12:49:10 |
| 6 | 8750 | 2000-01-08 | Date | 2000-01-08 | 2000-01-08 | 2000-01-08 12:49:10 |
| 7 | 7736 | 2000-01-09 | Date | 2000-01-09 | 2000-01-09 | 2000-01-09 12:49:10 |
| 1 | 11418 | 2000-01-10 | Date | 2000-01-10 | 2000-01-10 | 2000-01-10 12:49:10 |
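Unlike ymd(), ymd_hms() returns a date-time rather than a Date. The sketch below uses made-up timestamps to show the resulting class, the default time zone, and that fractional seconds survive parsing even though the printed table above shows whole seconds only. The "America/New_York" zone is just an illustrative choice.

```r
library(lubridate)

# Hypothetical timestamp, parsed with the default time zone
stamp <- ymd_hms("2000-01-01 12:49:10")
class(stamp)  # "POSIXct" "POSIXt"
tz(stamp)     # "UTC"

# The same clock time read in another zone is a different instant
stamp_ny <- ymd_hms("2000-01-01 12:49:10", tz = "America/New_York")
gap_hours <- as.numeric(difftime(stamp_ny, stamp, units = "hours"))
gap_hours     # 5: New York is five hours behind UTC in January

# Fractional seconds are kept in the value; default printing hides them
frac <- ymd_hms("2000-01-01 12:49:10.5")
second(frac)  # 10.5
```

If the source timestamps were recorded in local time, passing tz explicitly avoids silently treating them as UTC.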
cleaned_date_df <- cleaned_date_df %>%
  mutate(wday.clean = wday(ymd.clean)) %>%
  select(-c(day_of_week))
kable(head(cleaned_date_df,10), caption = "wday() results")
| births | ymd.clean | ymd.type | mdy.clean | dmy.clean | ymd.hms.clean | wday.clean |
|---|---|---|---|---|---|---|
| 8843 | 2000-01-01 | Date | 2000-01-01 | 2000-01-01 | 2000-01-01 12:49:10 | 7 |
| 7816 | 2000-01-02 | Date | 2000-01-02 | 2000-01-02 | 2000-01-02 12:49:10 | 1 |
| 11123 | 2000-01-03 | Date | 2000-01-03 | 2000-01-03 | 2000-01-03 12:49:10 | 2 |
| 12703 | 2000-01-04 | Date | 2000-01-04 | 2000-01-04 | 2000-01-04 12:49:10 | 3 |
| 12240 | 2000-01-05 | Date | 2000-01-05 | 2000-01-05 | 2000-01-05 12:49:10 | 4 |
| 12260 | 2000-01-06 | Date | 2000-01-06 | 2000-01-06 | 2000-01-06 12:49:10 | 5 |
| 12280 | 2000-01-07 | Date | 2000-01-07 | 2000-01-07 | 2000-01-07 12:49:10 | 6 |
| 8750 | 2000-01-08 | Date | 2000-01-08 | 2000-01-08 | 2000-01-08 12:49:10 | 7 |
| 7736 | 2000-01-09 | Date | 2000-01-09 | 2000-01-09 | 2000-01-09 12:49:10 | 1 |
| 11418 | 2000-01-10 | Date | 2000-01-10 | 2000-01-10 | 2000-01-10 12:49:10 | 2 |
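Note that wday.clean does not match the original day_of_week column: 1 January 2000 is 7 here but was 6 in the source data. The difference appears to come from which day the week starts on. The sketch below passes week_start explicitly to show both numberings. That the source data counts Monday as day 1 is an inference from the tables above, not something the dataset documentation is quoted as saying here.

```r
library(lubridate)

d <- ymd("2000-01-01")  # a Saturday

# Sunday counted as day 1 (lubridate's usual default)
sunday_first <- wday(d, week_start = 7)
sunday_first  # 7

# Monday counted as day 1, which reproduces the day_of_week value
# shown for this date in the original births data
monday_first <- wday(d, week_start = 1)
monday_first  # 6
```

wday() also accepts label = TRUE to return day names instead of numbers; the names it produces depend on the session locale.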
### Additional Lubridate Functions:
today(): returns today’s date
year(): returns the year
month(): returns the month
Sys.getlocale()
## [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
this_day <- today()
this_day
## [1] "2018-05-09"
Several of the functions below were used to create the base dataframe:
day(): returns the day of the month
wday(): returns the day of the week
now(): returns the current date-time
hour(): returns the hour component of a date-time
minute(): returns the minute component of a date-time
second(): returns the second component of a date-time
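The accessor functions listed above can be sketched on a single value. The example uses a fixed, made-up timestamp instead of now() so that the results are the same on every run.

```r
library(lubridate)

# Hypothetical fixed timestamp standing in for now()
stamp <- ymd_hms("2018-05-09 12:49:10")

year(stamp)    # 2018
month(stamp)   # 5
day(stamp)     # 9
hour(stamp)    # 12
minute(stamp)  # 49
second(stamp)  # 10

# Rebuilding a raw string from the pieces, in the spirit of the
# ymd.hms.raw column created at the start
pieces <- paste(year(stamp), month(stamp), day(stamp),
                hour(stamp), minute(stamp), second(stamp), sep = "-")
pieces         # "2018-5-9-12-49-10"
```

Each accessor returns a plain number, which is why paste() yields unpadded components such as 5 and 9 rather than 05 and 09.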