A date + time is represented in R as an object of class POSIXct. A date is represented as an object of class Date.
# current time
now <- Sys.time()
class(now)
[1] "POSIXct" "POSIXt"
# current date
today <- Sys.Date()
class(today)
[1] "Date"
These objects store the number of seconds (for POSIXct) or days (for Date) since January 1st 1970 at midnight UTC. When printed on the console, they are displayed in a human-readable format following the ISO 8601 standard.
# display it in ISO 8601 format
print(now)
[1] "2014-09-10 14:22:00 CEST"
# internally, that is just a number of seconds
as.numeric(now)
[1] 1.41e+09
# same for dates
print(today)
[1] "2014-09-10"
# except it stores a number of days
as.numeric(today)
[1] 16323
Usually we get date/time information either as text or as a number of seconds/days since a given reference (also called “Julian” date/time). The first step is therefore to convert those representations into an R object of class POSIXct or Date.
If the date/time is in ISO 8601 format already (i.e. YYYY-MM-DD HH:MM:SS), R can parse it directly. By default, it assumes the time zone of your computer. So the easiest way to get your date/time data into R is to format it this way from the start.
x <- as.POSIXct("2014-09-24 15:23:10")
class(x)
[1] "POSIXct" "POSIXt"
print(x)
[1] "2014-09-24 15:23:10 CEST"
x <- as.Date("2014-09-24")
class(x)
[1] "Date"
print(x)
[1] "2014-09-24"
If it is in a different format, as.POSIXct fails and you need to parse it manually.
as.POSIXct("09/24/2014 15-23-10")
Error: character string is not in a standard unambiguous format
The function strptime can do it, but you need to specify precisely the format the date/time is written in. We will use the lubridate package instead, which makes things much easier. In lubridate you just have to specify the order in which the date/time elements (year, month, day, hour, minute, second) appear and the function figures out the rest (spacing, separators, whether the year has 2 or 4 digits, character or numeric representation of the month, etc.).
library("lubridate")
parse_date_time("2014-09-24 15:23:10", orders="ymd hms")
[1] "2014-09-24 15:23:10 UTC"
parse_date_time("09/24/2014 15-23-10", orders="mdy hms")
[1] "2014-09-24 15:23:10 UTC"
parse_date_time("24 09 2014 15 23 10", orders="dmy hms")
[1] "2014-09-24 15:23:10 UTC"
parse_date_time("24-09-14 15-23-10", orders="dmy hms")
[1] "2014-09-24 15:23:10 UTC"
parse_date_time("Sep 24, 2014 15:23:10", orders="mdy hms")
[1] "2014-09-24 15:23:10 UTC"
Note that parse_date_time assigns the “UTC” time zone by default, which makes it more consistent than strptime or as.POSIXct, which default to the time zone of your computer. “UTC” stands for “Coordinated Universal Time” and is the time at the 0 meridian; it is similar to Greenwich Mean Time (“GMT”). See below regarding how to deal with time zones.
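For comparison, here is a minimal sketch of the base R route with strptime, applied to the string that as.POSIXct rejected above. The format string is written by hand to match that particular input, and the "UTC" time zone is an assumption made here to mirror what parse_date_time does.

```r
# base R only: the format string must match the input exactly
txt <- "09/24/2014 15-23-10"
lt <- strptime(txt, format = "%m/%d/%Y %H-%M-%S", tz = "UTC")
class(lt)            # strptime returns a POSIXlt object
# convert to POSIXct, the class used in the rest of this text
ct <- as.POSIXct(lt)
class(ct)
format(ct, "%Y-%m-%d %H:%M:%S")
```

Any change in the separators or in the order of the elements requires a new format string, which is the bookkeeping that lubridate avoids.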
parse_date_time also works with dates, of course.
parse_date_time("09/24/2014", orders="mdy")
[1] "2014-09-24 UTC"
parse_date_time("24 09 2014", orders="dmy")
[1] "2014-09-24 UTC"
But it still creates a POSIXct object (notice the “UTC” time zone added above), which you have to force into a Date object if you want to make sure it behaves as a date.
x <- parse_date_time("09/24/2014", orders="mdy")
print(x)
[1] "2014-09-24 UTC"
class(x)
[1] "POSIXct" "POSIXt"
x <- as.Date(x)
print(x)
[1] "2014-09-24"
class(x)
[1] "Date"
lubridate even has some shortcut functions for common orders in which dates and times are specified. These functions are sometimes even cleverer than their parse_date_time counterparts (they deal with AM/PM directly for example).
x <- "2014-09-24 15:23:10"
parse_date_time(x, orders="ymd hms")
[1] "2014-09-24 15:23:10 UTC"
ymd_hms(x) # ISO
[1] "2014-09-24 15:23:10 UTC"
mdy_hms("09/14/2014 3:23:10 PM") # USA
[1] "2014-09-14 15:23:10 UTC"
dmy_hms("14-09-2014 15:23:10") # most of the rest of the world
[1] "2014-09-14 15:23:10 UTC"
ymd("2014-09-24")
[1] "2014-09-24 UTC"
mdy("09/14/2014")
[1] "2014-09-14 UTC"
dmy("14-09-2014")
[1] "2014-09-14 UTC"
For more complex formats, see ?parse_date_time.
Since POSIXct objects are, internally, a number of seconds, it is possible to add or subtract seconds from them.
x <- ymd_hms("2014-09-24 15:23:10")
x + 1
[1] "2014-09-24 15:23:11 UTC"
x - 1
[1] "2014-09-24 15:23:09 UTC"
# add an hour
x + 3600
[1] "2014-09-24 16:23:10 UTC"
# add a day
x + 3600 * 24
[1] "2014-09-25 15:23:10 UTC"
Dates are similar except the computation is done in days.
x <- as.Date("2014-09-24")
x + 1
[1] "2014-09-25"
x - 1
[1] "2014-09-23"
# add a year (365 days)
x + 365
[1] "2015-09-24"
# add two years: the result is one day short because 2016 is a leap year
x + 365 * 2
[1] "2016-09-23"
For more advanced computation, see the concept of “periods” in lubridate (?"Period-class", ?lubridate).
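As a rough illustration of what periods bring, the sketch below compares adding a fixed number of days with adding calendar years. It uses years(), months() and the %m+% operator from lubridate; the variable names are only illustrative.

```r
library("lubridate")

x <- as.Date("2014-09-24")

# a fixed number of days ignores leap years
two_years_days <- x + 365 * 2
# a period of two calendar years keeps the same day and month
two_years_period <- x + years(2)
two_years_days
two_years_period

# %m+% adds months without running past the end of a shorter month
end_feb <- as.Date("2014-01-31") %m+% months(1)
end_feb
```

The two results differ by one day because 2016 has a 29th of February, and the last line lands on the last day of February rather than on a date that does not exist.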
A “julian” date or time is a number of days or hours or seconds elapsed since a given reference. The ability to compute with dates, as shown above, makes it trivial to import those as true POSIXct or Date objects.
Let us consider that we are given a vector of dates in the form of the number of days since the start of 2014. Here is how to convert it into dates:
days_passed <- c(10, 22, 45, 68, 85, 120, 145)
# we need to know the origin, and make it into a Date object
origin <- as.Date("2014-01-01")
# and the actual dates are
origin + days_passed
[1] "2014-01-11" "2014-01-23" "2014-02-15" "2014-03-10" "2014-03-27"
[6] "2014-05-01" "2014-05-26"
Similarly with a number of seconds since the start of an event:
seconds_elapsed <- c(477, 2135, 2474, 2546, 2891, 3846, 7284)
start <- ymd_hms("2014-09-24 15:23:10")
start + seconds_elapsed
[1] "2014-09-24 15:31:07 UTC" "2014-09-24 15:58:45 UTC"
[3] "2014-09-24 16:04:24 UTC" "2014-09-24 16:05:36 UTC"
[5] "2014-09-24 16:11:21 UTC" "2014-09-24 16:27:16 UTC"
[7] "2014-09-24 17:24:34 UTC"
Sometimes, the julian date is in the form of decimal days and therefore represents a date but also a time during the day. In that case, decimal days need to be converted into seconds before being added to the date+time of origin:
dec_days_passed <- c(4.1356, 167.8187, 168.11034, 181.02103, 189.93808)
origin <- ymd_hms("2014-01-01 00:00:00")
# NB: the origin is treated as a date and time here, since the decimal days also hold information about time
origin + dec_days_passed * 3600 * 24
[1] "2014-01-05 03:15:15 UTC" "2014-06-17 19:38:55 UTC"
[3] "2014-06-18 02:38:53 UTC" "2014-07-01 00:30:16 UTC"
[5] "2014-07-09 22:30:50 UTC"
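The same conversions can also be sketched in base R alone, by passing the numbers directly to as.Date or as.POSIXct together with an origin argument. The values below are a subset of the ones used above, and the "UTC" time zone is an assumption chosen to match the lubridate examples.

```r
# base R only: numbers are interpreted as days (Date) or seconds (POSIXct)
# counted from the given origin
days_passed <- c(10, 22, 45)
d <- as.Date(days_passed, origin = "2014-01-01")
d

seconds_elapsed <- c(477, 2135)
s <- as.POSIXct(seconds_elapsed, origin = "2014-09-24 15:23:10", tz = "UTC")
s
```

Which form to use is mostly a matter of taste; adding to an explicit origin object, as above, makes the reference point more visible in the code.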
Once your date/time data is in POSIXct format, it is easy to extract parts of it (e.g. the hour of the day to find out whether it is day or night, the month to find out the season, etc.). Again, the format function in the base package can do this but lubridate provides easier to use alternatives.
x <- ymd_hms("2014-09-24 15:23:10")
# base package version (always returns a character string)
format(x, "%Y") # year
[1] "2014"
format(x, "%m") # month
[1] "09"
format(x, "%Y%m%d")
[1] "20140924"
# lubridate version (returns numbers when appropriate)
year(x)
[1] 2014
month(x)
[1] 9
day(x)
[1] 24
yday(x) # day of the year (January 1st is day 1)
[1] 267
weekdays(x) # base function; lubridate also provides wday()
[1] "Wednesday"
week(x)
[1] 39
hour(x)
[1] 15
minute(x)
[1] 23
second(x)
[1] 10
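As an example of the day/night use mentioned above, here is a minimal sketch that labels a few times from their hour. The 07:00 to 19:00 limits for "day" and the example times are arbitrary assumptions; real limits depend on the location and season.

```r
library("lubridate")

times <- ymd_hms(c("2014-09-24 03:10:00",
                   "2014-09-24 15:23:10",
                   "2014-09-24 22:45:00"))

# hypothetical definition: day runs from 07:00 (included) to 19:00 (excluded)
h <- hour(times)
period <- ifelse(h >= 7 & h < 19, "day", "night")
period
```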
If your time is recorded in local time and you are not in the UTC time zone, you should specify the time zone when importing the date+time. Converting it to another time zone is then done with with_tz.
x <- ymd_hms("2014-09-24 15:23:10", tz="Europe/Paris")
x
[1] "2014-09-24 15:23:10 CEST"
# what time was it in New York
with_tz(x, tz="America/New_York")
[1] "2014-09-24 09:23:10 EDT"
# or Tokyo
with_tz(x, tz="Asia/Tokyo")
[1] "2014-09-24 22:23:10 JST"
A common task is to convert from local to UTC time, to synchronise observations in various time zones or compute celestial patterns for example. This is of course possible with with_tz:
with_tz(x, tz="UTC")
[1] "2014-09-24 13:23:10 UTC"
See the list of human-readable time zone names with OlsonNames() (or olson_time_zones() in older versions of lubridate).
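If the time zone was not specified at import, the data is labelled as UTC while the clock time is actually local. A possible fix, sketched below, is lubridate's force_tz, which keeps the clock time and changes the time zone, whereas with_tz keeps the instant and changes the clock time. The variable names are only illustrative.

```r
library("lubridate")

# parsed without a time zone, so labelled as UTC by default
x <- ymd_hms("2014-09-24 15:23:10")

# same clock time, now declared as Paris local time: a different instant
paris <- force_tz(x, tzone = "Europe/Paris")
paris
# difference between the two instants, in seconds (CEST is two hours ahead of UTC)
as.numeric(x) - as.numeric(paris)

# with_tz only changes how the instant is displayed
back <- with_tz(paris, tzone = "UTC")
back
```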
Objects of class POSIXct can store and compute with fractional seconds but do not print them by default:
x <- ymd_hms("2014-09-24 15:23:10")
y <- x + 0.5
y
[1] "2014-09-24 15:23:10 UTC"
# the additional half second is not printed but is there since adding another 0.6 seconds rolls over to the next second
y + 0.6
[1] "2014-09-24 15:23:11 UTC"
To show them, set the appropriate slot in options (digits.secs):
x <- ymd_hms("2014-09-24 15:23:10")
x + 0.457
[1] "2014-09-24 15:23:10 UTC"
options(digits.secs=3)
x + 0.457
[1] "2014-09-24 15:23:10.457 UTC"
options(digits.secs=5)
x + 0.45756
[1] "2014-09-24 15:23:10.45756 UTC"
options(digits.secs=0)
Jean-Olivier Irisson
Last edited on 2014-09-10
http://www.obs-vlfr.fr/~irisson/