Let’s explore functions that let you get and set individual components of date and time. You can extract individual parts of the date with the accessor functions in {lubridate}. Here is the list of available functions:
lubridate_accessor_functions <- data.frame(
  `Accessor Function` = c("year()", "month()", "mday()", "yday()",
                          "wday()", "hour()", "minute()", "second()"),
  Extracts = c("year", "month", "day of the month", "day of the year",
               "day of the week", "hour", "minute", "second"),
  check.names = FALSE  # keep the space in the "Accessor Function" column name
)
Let’s explore some of these functions using the {nycflights13} package.
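Before turning to the flights data, a quick sketch on a single date-time shows what each accessor returns. The value of x is made up for illustration, and the wday() result assumes lubridate's default week start (Sunday = 1):

```r
library(lubridate)

# Illustrative date-time (an assumption, not taken from the flights data)
x <- ymd_hms("2013-01-01 05:15:30", tz = "UTC")

year(x)    # 2013
month(x)   # 1
mday(x)    # 1  (day of the month)
yday(x)    # 1  (day of the year)
wday(x)    # 3  (2013-01-01 was a Tuesday; Sunday counts as 1 by default)
hour(x)    # 5
minute(x)  # 15
second(x)  # 30
```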
Step 1: Load the flights data:
library(nycflights13)
head(flights)
## # A tibble: 6 x 19
## year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
## <int> <int> <int> <int> <int> <dbl> <int> <int>
## 1 2013 1 1 517 515 2 830 819
## 2 2013 1 1 533 529 4 850 830
## 3 2013 1 1 542 540 2 923 850
## 4 2013 1 1 544 545 -1 1004 1022
## 5 2013 1 1 554 600 -6 812 837
## 6 2013 1 1 554 558 -4 740 728
## # … with 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
## # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
## # hour <dbl>, minute <dbl>, time_hour <dttm>
This data frame includes 19 variables. For date manipulations, you will use only the year, month, day, hour and minute columns.
Step 2: Create a data frame using only the year, month, day, hour and minute columns shown above.
library(dplyr)
flights_new <- flights %>%
  select(year, month, day, hour, minute)
flights_new %>% head()
## # A tibble: 6 x 5
## year month day hour minute
## <int> <int> <int> <dbl> <dbl>
## 1 2013 1 1 5 15
## 2 2013 1 1 5 29
## 3 2013 1 1 5 40
## 4 2013 1 1 5 45
## 5 2013 1 1 6 0
## 6 2013 1 1 5 58
Step 3: To create a date or date-time from this sort of input, you can use make_date() for dates and make_datetime() for date-times.
flights_new <- flights_new %>%
  mutate(departure = make_datetime(year, month, day, hour, minute))
head(flights_new)
## # A tibble: 6 x 6
## year month day hour minute departure
## <int> <int> <int> <dbl> <dbl> <dttm>
## 1 2013 1 1 5 15 2013-01-01 05:15:00
## 2 2013 1 1 5 29 2013-01-01 05:29:00
## 3 2013 1 1 5 40 2013-01-01 05:40:00
## 4 2013 1 1 5 45 2013-01-01 05:45:00
## 5 2013 1 1 6 0 2013-01-01 06:00:00
## 6 2013 1 1 5 58 2013-01-01 05:58:00
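The step above uses only make_datetime(). As a minimal sketch of the difference between the two constructors, using made-up component values rather than the flights columns:

```r
library(lubridate)

# make_date() takes year, month and day and returns a Date (no time of day)
d <- make_date(year = 2013, month = 1, day = 1)
d
class(d)   # "Date"

# make_datetime() also takes hour and minute (and optionally seconds);
# its tz argument defaults to "UTC"
dt <- make_datetime(2013, 1, 1, 5, 15)
dt
class(dt)  # "POSIXct" "POSIXt"
```

Because the time zone defaults to UTC, it may be worth passing tz explicitly when the local time zone matters for later calculations.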
Step 4: To extract the year from the flights_new$departure column, you can use the following command:
flights_new$departure %>%
year() %>%
head()
## [1] 2013 2013 2013 2013 2013 2013
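The accessor functions can also be used on the left-hand side of an assignment to set a component. A minimal sketch, using a made-up single date-time rather than the flights data:

```r
library(lubridate)

# Hypothetical departure time (an assumption for illustration)
departure <- ymd_hms("2013-01-01 05:15:00", tz = "UTC")

# Assigning to an accessor replaces only that component
year(departure) <- 2014
hour(departure) <- 8

departure  # now 2014-01-01 08:15:00 UTC
```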
Step 5: For month() and wday(), you can set the label = TRUE argument to return the abbreviated name of the month or day of the week. You can also set abbr = FALSE to return the full name:
flights_new$departure %>%
month(label = TRUE, abbr = TRUE) %>%
head()
## [1] Jan Jan Jan Jan Jan Jan
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
flights_new$departure %>%
month(label = TRUE, abbr = FALSE) %>%
head()
## [1] January January January January January January
## 12 Levels: January < February < March < April < May < June < ... < December
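The same arguments apply to wday(). A short sketch on a single illustrative date follows. The names returned depend on the session locale, so English names are an assumption here:

```r
library(lubridate)

x <- ymd("2013-01-01")  # illustrative date; it fell on a Tuesday

wday(x)                              # weekday number (Sunday = 1 by default)
wday(x, label = TRUE)                # abbreviated name, as an ordered factor
wday(x, label = TRUE, abbr = FALSE)  # full name
```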
Base R handles dates slightly differently from lubridate. The base R approach is covered in "Module 6.3.1: Convert strings to dates", which looks at converting a character string to a date using the as.Date() function and changing the date format using base R. The examples below give a brief preview.
dates <- c("2020-01-01", "2019-06-30", "2012-04-27", "2008-08-08", "2010-11-19")
dates
## [1] "2020-01-01" "2019-06-30" "2012-04-27" "2008-08-08" "2010-11-19"
class(dates)
## [1] "character"
The as.Date() function converts character strings to the Date class.
as.Date(dates)
## [1] "2020-01-01" "2019-06-30" "2012-04-27" "2008-08-08" "2010-11-19"
class(as.Date(dates))
## [1] "Date"
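It can also handle dates whose components are in a different order. The strings below are written year-day-month (note that this is not the usual US convention, which is month/day/year, despite the variable name us_dates):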
us_dates <- c("2020-01-01", "2019-30-06", "2012-27-04", "2008-08-08", "2010-19-11")
us_dates
## [1] "2020-01-01" "2019-30-06" "2012-27-04" "2008-08-08" "2010-19-11"
class(us_dates)
## [1] "character"
as.Date(us_dates)
## [1] "2020-01-01" NA NA "2008-08-08" NA
# Note the missing values: as.Date() assumed ISO 8601 order (year-month-day),
# so any string whose "month" is above 12 could not be parsed.
# To change this, use the format argument:
as.Date(us_dates, format = "%Y-%d-%m")
## [1] "2020-01-01" "2019-06-30" "2012-04-27" "2008-08-08" "2010-11-19"
# Note that it has parsed the year-day-month strings and printed them in ISO 8601 format
class(as.Date(us_dates, format = "%Y-%d-%m"))
## [1] "Date"
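For comparison, the month/day/year ordering commonly written in the US can be parsed the same way by rearranging the format string. A minimal sketch, with strings made up to mirror some of the dates above:

```r
# Illustrative month/day/year strings (hypothetical, not from the original data)
mdy_dates <- c("01/01/2020", "06/30/2019", "04/27/2012")

# %m = month, %d = day, %Y = four-digit year; the "/" separators must match the input
parsed <- as.Date(mdy_dates, format = "%m/%d/%Y")
parsed         # "2020-01-01" "2019-06-30" "2012-04-27"
class(parsed)  # "Date"
```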
We can also convert to other formats using the format() function. However, this changes the date back into a character string:
dates2 <- format(as.Date(dates), format = "%d/%m/%Y")
dates2
## [1] "01/01/2020" "30/06/2019" "27/04/2012" "08/08/2008" "19/11/2010"
class(dates2)
## [1] "character"
Because R can parse dates in other formats, we can change from "dd/mm/YYYY" to "dd.mm.YY" quite easily. The input must be parsed with %Y (four-digit year), while %y (two-digit year) is used only for the output:
dates3 <- format(as.Date(dates2, format = "%d/%m/%Y"), format = "%d.%m.%y")
dates3
## [1] "01.01.20" "30.06.19" "27.04.12" "08.08.08" "19.11.10"
class(dates3)
## [1] "character"
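One detail worth checking when converting between formats is the difference between %Y (four-digit year) and %y (two-digit year). When a four-digit year is parsed with %y, only the first two digits are read and the rest of the string is ignored, so no error or warning is raised. A small sketch with a single illustrative string:

```r
x <- "30/06/2019"

# %Y reads the full four-digit year
as.Date(x, format = "%d/%m/%Y")  # "2019-06-30"

# %y reads only two digits ("20", interpreted as 2020); the trailing "19" is ignored
as.Date(x, format = "%d/%m/%y")  # "2020-06-30"
```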
The base R approach to dates will be addressed more fully in Module 6.3.1.