Los Angeles R User's Group Meet-up
November 14, 2013
Yasmin Lucero
Senior Statistician, Gravity.com
yasmin.lucero@gmail.com
"Mon, Nov 5, 2013"
1381505475000
2013-01-31 12:00:15.45 UTC
lm(y ~ day)
plot(y ~ date)
time1 < time2
time2 = time1 + lag
julian = yday(time)
time$hour
weekdays(time)
?DateTimeClasses
Date vs. POSIX datetime
Sys.Date()
[1] "2013-11-14"
Sys.time()
[1] "2013-11-14 10:58:53 PST"
Best practices: if you only need a date, use a Date
now = Sys.time()
now = as.numeric(now)
format(now, scientific=FALSE)
[1] "1384455533"
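The conversion also runs in reverse. A minimal sketch, assuming the numbers are seconds (or milliseconds) since 1970-01-01 UTC; the variable names are illustrative only:

```r
# Epoch seconds back to a POSIXct; state the origin and time zone explicitly.
secs = 1384455533
t1 = as.POSIXct(secs, origin = '1970-01-01', tz = 'UTC')
format(t1, tz = 'America/Los_Angeles', usetz = TRUE)
# [1] "2013-11-14 10:58:53 PST"

# Epoch milliseconds (common in web logs): divide by 1000 first.
ms = 1381505475000
t2 = as.POSIXct(ms / 1000, origin = '1970-01-01', tz = 'UTC')
format(t2, usetz = TRUE)
# [1] "2013-10-11 15:31:15 UTC"

# A Date is stored as days, not seconds, since the same origin.
as.numeric(as.Date('2013-11-14'))
# [1] 16023
```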
lt = as.POSIXlt(Sys.time())
attr(lt, 'names')
[1] "sec" "min" "hour" "mday" "mon" "year" "wday" "yday" "isdst"
lt$hour
[1] 10
lt$wday
[1] 4
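The POSIXlt fields follow C conventions, so several are offset from what you would write by hand. A short sketch using a fixed timestamp in UTC (chosen here so the result does not depend on the local time zone):

```r
lt = as.POSIXlt('2013-11-14 10:58:53', tz = 'UTC')

lt$year + 1900   # years are counted from 1900      -> 2013
lt$mon + 1       # months run 0 to 11               -> 11
lt$mday          # day of month is 1-based          -> 14
lt$wday          # 0 is Sunday, so Thursday is 4    -> 4
lt$yday + 1      # day of year is 0-based           -> 318
```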
ISOdatetime(2013, 10, 2, 8, 30, 15)
[1] "2013-10-02 08:30:15 PDT"
as.POSIXct('2013-10-02 08:30:15')
[1] "2013-10-02 08:30:15 PDT"
as.Date('10/2/13', format='%m/%d/%y')
[1] "2013-10-02"
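The outputs above print in PDT because that was the local time zone of the session. A sketch of pinning the zone with the tz argument, assuming the Olson name 'America/Los_Angeles' is available on your system:

```r
utc = as.POSIXct('2013-10-02 08:30:15', tz = 'UTC')
la = as.POSIXct('2013-10-02 08:30:15', tz = 'America/Los_Angeles')

# Same clock reading, different instants: Los Angeles is 7 hours behind UTC in October.
difftime(la, utc, units = 'hours')
# Time difference of 7 hours

# The same instant rendered in another zone.
format(utc, tz = 'America/Los_Angeles', usetz = TRUE)
# [1] "2013-10-02 01:30:15 PDT"
```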
Datetimes come to you in many forms
as.Date('2013-10-02')
[1] "2013-10-02"
as.Date('02/10/13')
[1] "0002-10-13"
print(try(as.Date('October 2, 2013')))
[1] "Error in charToDate(x) : \n character string is not in a standard unambiguous format\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in charToDate(x): character string is not in a standard unambiguous format>
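In base R, the way around both failures above is to state the format rather than let as.Date guess. A minimal sketch that tries several known formats in order; parse_date_any is a hypothetical helper written for this example, not a base R function:

```r
# Try each format in turn, filling in only the elements that are still NA.
parse_date_any = function(x, formats) {
  out = rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo = is.na(out)
    if (!any(todo)) break
    out[todo] = as.Date(x[todo], format = fmt)
  }
  out
}

raw = c('2013-10-02', '10/2/13', 'October 2, 2013')
parsed = parse_date_any(raw, c('%Y-%m-%d', '%m/%d/%y'))
parsed
# [1] "2013-10-02" "2013-10-02" NA
```

Anything that matches none of the listed formats stays NA instead of being silently misread, and the order of the formats matters: putting '%Y/%m/%d' ahead of '%m/%d/%y' would bring back the year-0002 result.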
The most important advice in the whole presentation.
require(lubridate)
mdy('10/2/13')
[1] "2013-10-02 UTC"
Reason #1: A family of convenient parsing functions:
ymd, dmy, dym, mdy, dmy_h, dmy_hms...
origin
[1] "1970-01-01 UTC"
today()
[1] "2013-11-14"
now()
[1] "2013-11-14 10:58:53 PST"
Base
weekdays(today())
[1] "Thursday"
Lubridate
wday(today())
[1] 5
yday(today())
mday(today())
month(today())
quarter(today())
am(now())
tz(now())
dst(now())
days_in_month(today())
Nov
30
decimal_date(now())
[1] 2013.8697
ceiling_date(now(), unit='hour')
[1] "2013-11-14 11:00:00 PST"
floor_date(now(), unit='day')
[1] "2013-11-14 PST"
Base R has POSIXt methods for seq, cut, trunc, round
week = seq(from=today(), to=today() + 7, by='day')
print(week)
[1] "2013-11-14" "2013-11-15" "2013-11-16" "2013-11-17" "2013-11-18"
[6] "2013-11-19" "2013-11-20" "2013-11-21"
cut(week, breaks=2)
[1] 2013-11-14 2013-11-14 2013-11-14 2013-11-14 2013-11-18 2013-11-18
[7] 2013-11-18 2013-11-18
Levels: 2013-11-14 2013-11-18
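These methods also accept calendar units as strings. A sketch with fixed dates so the results are reproducible; the variable names are illustrative:

```r
start = as.Date('2013-11-14')

# seq() can step by calendar months, not only by days.
monthly = seq(start, by = 'month', length.out = 3)
monthly
# [1] "2013-11-14" "2013-12-14" "2014-01-14"

# cut() can bin by calendar month; each level is labelled with the first of the month.
daily = seq(start, by = 'day', length.out = 30)
counts = table(cut(daily, breaks = 'month'))
counts   # 17 days fall in November, 13 in December

# trunc() and round() work on date-times.
stamp = as.POSIXct('2013-11-14 10:58:53', tz = 'UTC')
trunc(stamp, 'hours')   # 10:00:00
round(stamp, 'mins')    # 10:59:00
```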
getOption('digits.secs')
The default is zero; it can be set to any integer from 0 to 6.
Sys.time()
[1] "2013-11-14 10:58:53 PST"
options(digits.secs=2)
Sys.time()
[1] "2013-11-14 10:58:53.53 PST"
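The option is global. For a one-off, the %OSn format code prints n fractional digits without touching options(). A sketch with a fixed timestamp; note that R truncates rather than rounds the extra digits:

```r
# Half a second past 12:00:15 UTC.
stamp = as.POSIXct('2013-01-31 12:00:15', tz = 'UTC') + 0.5

format(stamp, '%H:%M:%OS1')   # one fractional digit
# [1] "12:00:15.5"

format(stamp, '%H:%M:%S')     # whole seconds only
# [1] "12:00:15"
```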
The operators available in the base package are addition, subtraction, and the comparison operators: ==, !=, <, >, <=, >=
now = Sys.time()
later = now + 100
later - now
Time difference of 1.6666667 mins
later > now
[1] TRUE
Difftime sets the time units based on the size of the time difference!
difftime(now, now + 150)
Time difference of -2.5 mins
difftime(now, now + 10)
Time difference of -10 secs
When you use the subtraction operator on date-time objects, it calls difftime. As a best practice, don't use the subtraction operator in code; instead call difftime and always set the units parameter explicitly.
dt.test = difftime(now, later)
dt.test
Time difference of -1.6666667 mins
attributes(dt.test)
$units
[1] "mins"
$class
[1] "difftime"
as.numeric(dt.test)
[1] -1.6666667
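With the units fixed, the numeric value means the same thing no matter how large the gap is. A minimal sketch using a fixed start time; the variable names are illustrative:

```r
start = as.POSIXct('2013-11-14 10:58:53', tz = 'UTC')

short = difftime(start + 10, start, units = 'secs')
long = difftime(start + 150, start, units = 'secs')

as.numeric(short)   # 10
as.numeric(long)    # 150, not 2.5

# Convert on the way out if another unit is wanted.
as.numeric(long, units = 'mins')   # 2.5
```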
now() - ddays(14) # duration
[1] "2013-10-31 11:58:53.72 PDT"
now() - days(14) # period
[1] "2013-10-31 10:58:53.72 PDT"
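The one-hour gap between the two results comes from the end of daylight saving time on November 3, 2013. A reproducible sketch with a fixed timestamp, assuming lubridate is installed and the 'America/Los_Angeles' zone is available:

```r
library(lubridate)

stamp = ymd_hms('2013-11-14 10:58:53', tz = 'America/Los_Angeles')

# Duration: exactly 14 * 86400 seconds earlier, so the clock reading shifts by an hour.
by_duration = stamp - ddays(14)
by_duration
# [1] "2013-10-31 11:58:53 PDT"

# Period: 14 calendar days earlier at the same clock reading.
by_period = stamp - days(14)
by_period
# [1] "2013-10-31 10:58:53 PDT"
```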
today() %within% interval(ymd('2013-10-01'), ymd('2013-12-01'))
[1] TRUE
today() %m+% months(3)
[1] "2014-02-14"
today() %m-% years(2)
[1] "2011-11-14"
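The %m+% and %m-% operators matter at the end of a month, where plain addition of a period can land on a day that does not exist. A short sketch, assuming lubridate is installed:

```r
library(lubridate)

jan31 = ymd('2013-01-31')

# February 31 does not exist, so plain addition gives NA.
plain = jan31 + months(1)
plain
# [1] NA

# %m+% rolls back to the last real day of the target month.
rolled = jan31 %m+% months(1)
rolled
# [1] "2013-02-28"
```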
require(timeDate)
USThanksgivingDay(year=2014)
GMT
[1] [2014-11-27]
CAThanksgivingDay(year=2014)
GMT
[1] [2014-10-13]
Some of my favorite function names…
And for some reason, they have skewness and kurtosis methods for POSIX objects.
Days when NY stock exchange is closed:
holidayNYSE(year=2014)
NewYork
[1] [2014-01-01] [2014-01-20] [2014-02-17] [2014-04-18] [2014-05-26]
[6] [2014-07-04] [2014-09-01] [2014-11-27] [2014-12-25]
DST rules for a large number of world cities:
head(Phoenix())
              Phoenix offSet isdst TimeZone     numeric
1 1901-12-14 20:45:52 -25200     0      MST -2147397248
2 1918-03-31 09:00:00 -21600     1      MDT -1633273200
3 1918-10-27 08:00:00 -25200     0      MST -1615132800
4 1919-03-30 09:00:00 -21600     1      MDT -1601823600
5 1919-10-26 08:00:00 -25200     0      MST -1583683200
6 1942-02-09 09:00:00 -21600     1      MWT  -880210800
alignQuarterly(today())
GMT
[1] [2013-12-31]
tC = timeCalendar()
align(tC, by='2w', offset='3d')
GMT
[1] [2013-01-04] [2013-01-18] [2013-02-01] [2013-02-15] [2013-03-01]
[6] [2013-03-15] [2013-03-29] [2013-04-12] [2013-04-26] [2013-05-10]
[11] [2013-05-24] [2013-06-07] [2013-06-21] [2013-07-05] [2013-07-19]
[16] [2013-08-02] [2013-08-16] [2013-08-30] [2013-09-13] [2013-09-27]
[21] [2013-10-11] [2013-10-25] [2013-11-08] [2013-11-22]
Base R comes with a time-series class, ts. This is a vector indexed by time (instead of the usual integer position). A ts index must be regularly spaced; several other time series classes facilitate more complex time series needs.
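A minimal sketch of a monthly ts object; the sales numbers are made up for illustration:

```r
# Six monthly observations starting in January 2013.
sales = ts(c(12, 15, 11, 18, 20, 17), start = c(2013, 1), frequency = 12)

start(sales)       # 2013 1
end(sales)         # 2013 6
frequency(sales)   # 12

# window() subsets by time rather than by position: March through May.
spring = window(sales, start = c(2013, 3), end = c(2013, 5))
as.numeric(spring)
# [1] 11 18 20
```

The index is implied by start and frequency rather than stored per observation, which is why ts cannot represent irregularly spaced data.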