The Challenge: turning “2017-03-10T00:00:00.000Z” into a date

Dates come in many different forms. At first it can be challenging to learn how to clean them up. Luckily, the tidyverse comes with a package called lubridate that makes it easy.

This post addresses a specific question brought up by a student: how do we turn a date formatted like 2017-03-10T00:00:00.000Z into a date object?

If you want more resources, take a look at the Dates and Times chapter in R for Data Science, or the lubridate cheat sheet.

Load Necessary Packages

library(tidyverse) # provides the %>% operator and dplyr
library(lubridate) # date and date-time tools. Only tidyverse 2.0.0 and later attach it automatically, so loading it explicitly is the safe choice.

Dates and Date-Times

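2017-03-10T00:00:00.000Z is a date-time written in ISO 8601 format: the T separates the date from the time, and the trailing Z means the time is in UTC.

As the name implies, a date-time includes both the date and the time. You can get the current date-time with the function lubridate::now():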

now()
## [1] "2022-11-08 14:14:13 EST"

Dates include only the date. You can get the current date with the function lubridate::today():

today()
## [1] "2022-11-08"
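
To make the difference concrete, here is a small sketch that checks the class of each object. The variable names are just for illustration.

```r
library(lubridate)

current_date_time <- now()   # a date-time
current_date <- today()      # a date

class(current_date_time)
## [1] "POSIXct" "POSIXt"
class(current_date)
## [1] "Date"
```

R stores date-times as POSIXct objects and dates as Date objects, which is why the two behave differently when printed.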

Parsing Dates and Date-Times from Character Strings

Often when you load a dataset, the dates will load in as character strings. lubridate has a family of functions that parse these strings into dates and date-times. Each function is named after the order in which the components appear in the string. Here are two simple examples:

  • ymd(): parses dates that are written in year-month-day format like 2022-11-08
  • ymd_hms(): parses date-times written in year-month-day hour-minute-second format, like 2022-11-08 13:42:53. The result is in UTC unless you pass a tz argument.
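
As a quick sketch of how these parsers behave, the snippet below runs them on a few literal strings. mdy() and dmy() are sibling functions in lubridate for other component orders. The variable names are made up for this example.

```r
library(lubridate)

# Year-month-day string to a Date
parsed_date <- ymd("2022-11-08")
parsed_date
## [1] "2022-11-08"

# Year-month-day hour-minute-second string to a date-time (UTC by default)
parsed_date_time <- ymd_hms("2022-11-08 13:42:53")
parsed_date_time
## [1] "2022-11-08 13:42:53 UTC"

# Same calendar day written in other orders
from_mdy <- mdy("11/08/2022")
from_dmy <- dmy("08/11/2022")
```

All three date parsers return the same Date here, because the function name tells lubridate which component comes first.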

Solving our Problem

First, we use ymd_hms() to parse the string into a date-time object.

"2017-03-10T00:00:00.000Z" %>% ymd_hms()
## [1] "2017-03-10 UTC"

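And then we use as_date() to convert the date-time object into a date object. ymd() is meant for parsing character strings, so as_date() is the safer choice for an object that is already a date-time.

date_object <- "2017-03-10T00:00:00.000Z" %>%
  ymd_hms() %>%
  as_date()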

date_object
## [1] "2017-03-10"
date_object %>% class()
## [1] "Date"

And that’s it. Now we can turn this into a function:

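# Convert a year-month-day hour-minute-second string into a Date
ymd_hms_string_to_ymd_date <- function(ymd_hms_string) {
  ymd_hms_string %>%
    lubridate::ymd_hms() %>%  # parse the string into a date-time (UTC)
    lubridate::as_date()      # drop the time, keeping only the date
}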

Let’s try it out.

"2017-03-10T00:00:00.000Z" %>% ymd_hms_string_to_ymd_date()
## [1] "2017-03-10"
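
Real datasets usually hold a whole column of timestamps, some with a non-midnight time and some malformed. The sketch below shows how the same two steps handle that case. The vector and its values are invented for illustration, and quiet = TRUE only suppresses the parsing warning.

```r
library(lubridate)

# Hypothetical input: a midnight timestamp, an evening timestamp, and a bad value
timestamps <- c("2017-03-10T00:00:00.000Z",
                "2017-03-10T18:45:30.000Z",
                "not a date")

# Strings that cannot be parsed become NA
parsed <- ymd_hms(timestamps, quiet = TRUE)

# as_date() drops the time of day and keeps the NA
dates <- as_date(parsed)
dates
## [1] "2017-03-10" "2017-03-10" NA
```

Checking for NA values after parsing is a simple way to spot rows that need manual cleanup.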

Bonus: How To Extract the Year or Month From a Date

When you are creating datasets, it is often helpful to create features that you can group by during analysis. For example, perhaps you want to group all deals done in a specific year or month.

There is another set of functions in lubridate that helps with this.

Let’s start with some data.

date_time_tibble <- tibble(date_time_char = c("2017-03-10T00:00:00.000Z", "2022-03-10T00:00:00.000Z", "2015-11-05T00:00:00.000Z"))

date_time_tibble

First, we test that our function cleans the dates in our dataset:

date_time_tibble %>%
  mutate(date = ymd_hms_string_to_ymd_date(date_time_char))

Next, we extract date features, such as the year, the month, and the week of the year:

date_time_tibble %>%
  mutate(date = ymd_hms_string_to_ymd_date(date_time_char),
         year = year(date),
         month = month(date, label = TRUE),
         week_of_year = week(date))
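
Once a year column exists, grouping is one more step. Here is a minimal sketch that counts rows per year using dplyr::count(). It rebuilds the same small tibble so it can run on its own, and the name yearly_counts is just for illustration.

```r
library(dplyr)
library(lubridate)

date_time_tibble <- tibble(
  date_time_char = c("2017-03-10T00:00:00.000Z",
                     "2022-03-10T00:00:00.000Z",
                     "2015-11-05T00:00:00.000Z")
)

yearly_counts <- date_time_tibble %>%
  mutate(date = as_date(ymd_hms(date_time_char)),  # string -> date-time -> date
         year = year(date)) %>%                    # extract the year as a number
  count(year)                                      # one row per year, with n rows in each

yearly_counts
```

With this sample data each year appears once, so every count is 1. On a real dataset the same pattern works with group_by() and summarise() for other aggregates.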

There is plenty more you can do.