Chinmay Patil
14 May 2014
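xts is an extended zoo class.
The 3 main components of xts objects are:
- Index: a time-based index of class Date, POSIXct, chron, yearmon, yearqtr or timeDate
- Data: the underlying matrix of observations
- Attributes: hidden attributes plus any user-supplied ones
require(xts)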
data(sample_matrix)
class(sample_matrix)
[1] "matrix"
str(sample_matrix)
num [1:180, 1:4] 50 50.2 50.4 50.4 50.2 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:180] "2007-01-02" "2007-01-03" "2007-01-04" "2007-01-05" ...
..$ : chr [1:4] "Open" "High" "Low" "Close"
head(sample_matrix, 3)
Open High Low Close
2007-01-02 50.04 50.12 49.95 50.12
2007-01-03 50.23 50.42 50.23 50.40
2007-01-04 50.42 50.42 50.26 50.33
tail(sample_matrix, 3)
Open High Low Close
2007-06-28 47.68 47.70 47.57 47.61
2007-06-29 47.64 47.78 47.62 47.66
2007-06-30 47.67 47.94 47.67 47.77
Most common usage.
as.xts(x, order.by)
x : a numeric vector, matrix or a factor.
order.by : an index vector with unique entries by which the observations in x are ordered.
matrix_xts <- as.xts(sample_matrix,dateFormat='Date')
str(matrix_xts)
An 'xts' object on 2007-01-02/2007-06-30 containing:
Data: num [1:180, 1:4] 50 50.2 50.4 50.4 50.2 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:4] "Open" "High" "Low" "Close"
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
NULL
df_xts <- as.xts(as.data.frame(sample_matrix),
important='very important info!')
str(df_xts)
An 'xts' object on 2007-01-02/2007-06-30 containing:
Data: num [1:180, 1:4] 50 50.2 50.4 50.4 50.2 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:4] "Open" "High" "Low" "Close"
Indexed by objects of class: [POSIXct,POSIXt] TZ:
xts Attributes:
List of 1
$ important: chr "very important info!"
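The three parts visible in the str() output above (data, index, attributes) can each be pulled out with an accessor. The following is a minimal sketch, assuming the xts package and its bundled sample_matrix; the attribute name `important` simply mirrors the example above, and it is set here with the `xtsAttributes<-` replacement function rather than through as.xts.

```r
library(xts)
data(sample_matrix)

x <- as.xts(sample_matrix, dateFormat = "Date")

# Attach a user attribute after construction (illustrative name and value).
xtsAttributes(x) <- list(important = "very important info!")

idx   <- index(x)          # the time index (class Date here)
core  <- coredata(x)       # the plain numeric matrix, without the index
attrs <- xtsAttributes(x)  # user-supplied attributes as a named list

class(idx)
dim(core)
attrs$important
```

Keeping the index separate from the data is what lets the same matrix be re-indexed or subset by time without touching the values.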
xts(x = NULL,
order.by = index(x),
frequency = NULL,
unique = TRUE,
tzone = Sys.getenv("TZ"),
...)
xts(1:5, Sys.Date()+1:5)
[,1]
2014-05-22 1
2014-05-23 2
2014-05-24 3
2014-05-25 4
2014-05-26 5
DF <- as.data.frame(sample_matrix)
row.names(DF) <- NULL
DF$index <- as.Date(row.names(sample_matrix))
head(DF)
Open High Low Close index
1 50.04 50.12 49.95 50.12 2007-01-02
2 50.23 50.42 50.23 50.40 2007-01-03
3 50.42 50.42 50.26 50.33 2007-01-04
4 50.37 50.37 50.22 50.33 2007-01-05
5 50.24 50.24 50.11 50.18 2007-01-06
6 50.13 50.22 49.99 49.99 2007-01-07
res <- DF[DF$index >= as.Date('2007-03-01') & DF$index <= as.Date('2007-03-31') , ]
head(res, 3)
Open High Low Close index
59 50.82 50.82 50.56 50.57 2007-03-01
60 50.61 50.72 50.51 50.62 2007-03-02
61 50.73 50.73 50.41 50.41 2007-03-03
tail(res, 3)
Open High Low Close index
87 48.59 48.7 48.57 48.70 2007-03-29
88 48.75 49.0 48.75 48.94 2007-03-30
89 48.96 49.1 48.96 48.97 2007-03-31
res2 <- matrix_xts['2007-03']
head(res2, 3)
Open High Low Close
2007-03-01 50.82 50.82 50.56 50.57
2007-03-02 50.61 50.72 50.51 50.62
2007-03-03 50.73 50.73 50.41 50.41
tail(res2, 3)
Open High Low Close
2007-03-29 48.59 48.7 48.57 48.70
2007-03-30 48.75 49.0 48.75 48.94
2007-03-31 48.96 49.1 48.96 48.97
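To confirm that the string subset and the explicit date comparison select the same rows, the two approaches can be compared directly on the xts object. This is a small sketch under the assumption that sample_matrix is converted with a Date index as above; the variable names are illustrative.

```r
library(xts)
data(sample_matrix)

x <- as.xts(sample_matrix, dateFormat = "Date")

# ISO 8601 style string: everything in March 2007
by_string <- x["2007-03"]

# Equivalent explicit logical test on the index
keep <- index(x) >= as.Date("2007-03-01") & index(x) <= as.Date("2007-03-31")
by_logical <- x[keep, ]

nrow(by_string)                                   # 31 daily rows
identical(index(by_string), index(by_logical))    # TRUE
```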
Let's create some sample data:
DF <- data.frame(x = rnorm(1e6), index = as.POSIXct('1970-01-01') + 1:1e6)
str(DF)
'data.frame': 1000000 obs. of 2 variables:
$ x : num 0.257 -1.454 -0.314 1.379 0.869 ...
$ index: POSIXct, format: "1970-01-01 00:00:01" "1970-01-01 00:00:02" ...
And the equivalent xts data:
XTS <- as.xts(DF$x, order.by=DF$index)
str(XTS)
An 'xts' object on 1970-01-01 00:00:01/1970-01-12 13:46:40 containing:
Data: num [1:1000000, 1] 0.257 -1.454 -0.314 1.379 0.869 ...
Indexed by objects of class: [POSIXct,POSIXt] TZ:
Original class: 'double'
xts Attributes:
NULL
require(microbenchmark)
microbenchmark(dfres <- DF[DF$index >= as.POSIXct('1970-01-10 00:00:00') & DF$index < as.POSIXct('1970-01-11 00:00:00'),],
times = 1)
Unit: milliseconds
expr
dfres <- DF[DF$index >= as.POSIXct("1970-01-10 00:00:00") & DF$index < as.POSIXct("1970-01-11 00:00:00"), ]
min lq median uq max neval
111.6 111.6 111.6 111.6 111.6 1
microbenchmark(xtsres <- XTS['1970-01-10'],
times = 1)
Unit: milliseconds
expr min lq median uq max neval
xtsres <- XTS["1970-01-10"] 2.743 2.743 2.743 2.743 2.743 1
head(dfres, 3)
x index
777600 -1.173 1970-01-10 00:00:00
777601 -1.017 1970-01-10 00:00:01
777602 1.700 1970-01-10 00:00:02
head(xtsres, 3)
[,1]
1970-01-10 00:00:00 -1.173
1970-01-10 00:00:01 -1.017
1970-01-10 00:00:02 1.700
tail(dfres, 3)
x index
863997 -0.357323 1970-01-10 23:59:57
863998 -0.006125 1970-01-10 23:59:58
863999 -0.347745 1970-01-10 23:59:59
tail(xtsres,3)
[,1]
1970-01-10 23:59:57 -0.357323
1970-01-10 23:59:58 -0.006125
1970-01-10 23:59:59 -0.347745
E.g. Extract all the data from the beginning through January 7, 2007.
matrix_xts['/2007-01-07']
Open High Low Close
2007-01-02 50.04 50.12 49.95 50.12
2007-01-03 50.23 50.42 50.23 50.40
2007-01-04 50.42 50.42 50.26 50.33
2007-01-05 50.37 50.37 50.22 50.33
2007-01-06 50.24 50.24 50.11 50.18
2007-01-07 50.13 50.22 49.99 49.99
E.g. Extract all the data between two specific dates (both ends inclusive).
matrix_xts['2007-01-07/2007-01-18']
Open High Low Close
2007-01-07 50.13 50.22 49.99 49.99
2007-01-08 50.04 50.10 49.97 49.99
2007-01-09 49.99 49.99 49.80 49.91
2007-01-10 49.91 50.13 49.91 49.97
2007-01-11 49.89 50.24 49.89 50.24
2007-01-12 50.21 50.36 50.17 50.29
2007-01-13 50.32 50.48 50.32 50.41
2007-01-14 50.46 50.62 50.46 50.60
2007-01-15 50.62 50.69 50.47 50.49
2007-01-16 50.62 50.74 50.57 50.68
2007-01-17 50.74 50.77 50.45 50.49
2007-01-18 50.48 50.61 50.40 50.58
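The range strings above follow a "from/to" pattern in which either side may be left empty and either side may be given at coarser resolution (a month or a year instead of a day). A short sketch of the variants, assuming the same Date-indexed conversion of sample_matrix:

```r
library(xts)
data(sample_matrix)

x <- as.xts(sample_matrix, dateFormat = "Date")

from_start   <- x["/2007-01-07"]            # beginning through 7 January
closed       <- x["2007-01-07/2007-01-18"]  # both ends inclusive
to_end       <- x["2007-06-25/"]            # 25 June through the end
whole_months <- x["2007-02/2007-03"]        # all of February and March

c(nrow(from_start), nrow(closed), nrow(to_end), nrow(whole_months))
# 6 12 6 59
```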
## first week of the data
first(matrix_xts, '1 week')
Open High Low Close
2007-01-02 50.04 50.12 49.95 50.12
2007-01-03 50.23 50.42 50.23 50.40
2007-01-04 50.42 50.42 50.26 50.33
2007-01-05 50.37 50.37 50.22 50.33
2007-01-06 50.24 50.24 50.11 50.18
2007-01-07 50.13 50.22 49.99 49.99
## last 2 weeks of the data
last(matrix_xts, '2 weeks')
Open High Low Close
2007-06-18 47.43 47.56 47.36 47.36
2007-06-19 47.46 47.73 47.46 47.67
2007-06-20 47.71 47.82 47.67 47.67
2007-06-21 47.71 47.71 47.61 47.63
2007-06-22 47.57 47.59 47.33 47.33
2007-06-23 47.23 47.25 47.09 47.25
2007-06-24 47.24 47.30 47.21 47.23
2007-06-25 47.20 47.43 47.13 47.43
2007-06-26 47.44 47.62 47.44 47.62
2007-06-27 47.62 47.72 47.60 47.63
2007-06-28 47.68 47.70 47.57 47.61
2007-06-29 47.64 47.78 47.62 47.66
2007-06-30 47.67 47.94 47.67 47.77
## first 3 days of the last week of the data.
first(last(matrix_xts,'1 week'),'3 days')
Open High Low Close
2007-06-25 47.20 47.43 47.13 47.43
2007-06-26 47.44 47.62 47.44 47.62
2007-06-27 47.62 47.72 47.60 47.63
data1 <- xts(1:5,
order.by=Sys.Date()+( 1:5) )
names(data1) <- 'x'
data2 <- xts(101:105,
order.by=Sys.Date()+(-2:2))
names(data2) <- 'y'
data1
x
2014-05-22 1
2014-05-23 2
2014-05-24 3
2014-05-25 4
2014-05-26 5
data2
y
2014-05-19 101
2014-05-20 102
2014-05-21 103
2014-05-22 104
2014-05-23 105
merge(data1,data2)
x y
2014-05-19 NA 101
2014-05-20 NA 102
2014-05-21 NA 103
2014-05-22 1 104
2014-05-23 2 105
2014-05-24 3 NA
2014-05-25 4 NA
2014-05-26 5 NA
cbind(data1,data2)
x y
2014-05-19 NA 101
2014-05-20 NA 102
2014-05-21 NA 103
2014-05-22 1 104
2014-05-23 2 105
2014-05-24 3 NA
2014-05-25 4 NA
2014-05-26 5 NA
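The merge() and cbind() results above are outer joins: every date from either series appears, with NA where a series has no observation. merge() also accepts `join` and `fill` arguments for other behaviours. The sketch below uses a fixed base date (an assumption made here so the result is reproducible, instead of Sys.Date()) that yields the same dates as the output above.

```r
library(xts)

d0 <- as.Date("2014-05-21")  # fixed stand-in for Sys.Date()

data1 <- xts(1:5, order.by = d0 + 1:5)
colnames(data1) <- "x"
data2 <- xts(101:105, order.by = d0 + (-2:2))
colnames(data2) <- "y"

inner  <- merge(data1, data2, join = "inner")  # only dates present in both
left   <- merge(data1, data2, join = "left")   # all dates of data1
filled <- merge(data1, data2, fill = 0)        # outer join, gaps filled with 0

inner
```

The inner join keeps only 2014-05-22 and 2014-05-23, the two dates the series share.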
The periodicity function provides a quick summary of the underlying periodicity of most time-series-like objects.
periodicity(matrix_xts)
Daily periodicity from 2007-01-02 to 2007-06-30
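Besides the printed summary, the value returned by periodicity() can be inspected programmatically. The sketch below reads its `scale` element for the daily series and for a monthly aggregate; it assumes the same conversion of sample_matrix as above.

```r
library(xts)
data(sample_matrix)

x <- as.xts(sample_matrix, dateFormat = "Date")

p_daily   <- periodicity(x)
p_monthly <- periodicity(to.period(x, "months"))

p_daily$scale
p_monthly$scale
```

This is handy as a guard in scripts, for example to refuse to aggregate data that is already coarser than the requested period.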
A common use case with time-series data is to identify the endpoints of each period, i.e. the row number of the last observation in each week, month, and so on.
endpoints(matrix_xts,on='months')
[1] 0 30 58 89 119 150 180
endpoints(matrix_xts,on='months', k = 2)
[1] 0 58 119 180
matrix_xts[endpoints(matrix_xts,on='months')]
Open High Low Close
2007-01-31 50.07 50.23 50.07 50.23
2007-02-28 50.69 50.77 50.60 50.77
2007-03-31 48.96 49.10 48.96 48.97
2007-04-30 49.14 49.34 49.11 49.34
2007-05-31 47.83 47.84 47.74 47.74
2007-06-30 47.67 47.94 47.67 47.77
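The vector returned by endpoints() always starts with 0, so consecutive pairs delimit the periods. A brief sketch of two things that follow from this, assuming the Date-indexed sample data: differencing gives the number of observations per period, and dropping the leading 0 gives the row numbers of the period ends.

```r
library(xts)
data(sample_matrix)

x <- as.xts(sample_matrix, dateFormat = "Date")

ep <- endpoints(x, on = "months")

obs_per_month <- diff(ep)     # 30 28 31 30 31 30
month_ends    <- x[ep[-1], ]  # last row of each month (leading 0 dropped)

obs_per_month
index(month_ends)
```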
One of the most ubiquitous types of data in finance is OHLC data (Open-High-Low-Close). Often it is necessary to change the periodicity of this data to something coarser, e.g. take daily data and aggregate it to weekly or monthly.
to.period(matrix_xts,'months', name ="")
.Open .High .Low .Close
2007-01-31 50.04 50.77 49.76 50.23
2007-02-28 50.22 51.32 50.19 50.77
2007-03-31 50.82 50.82 48.24 48.97
2007-04-30 48.94 50.34 48.81 49.34
2007-05-31 49.35 49.69 47.52 47.74
2007-06-30 47.74 47.94 47.09 47.77
The to.monthly wrapper changes the index to something more appropriate (a yearmon index):
to.monthly(matrix_xts, name ="")
.Open .High .Low .Close
Jan 2007 50.04 50.77 49.76 50.23
Feb 2007 50.22 51.32 50.19 50.77
Mar 2007 50.82 50.82 48.24 48.97
Apr 2007 48.94 50.34 48.81 49.34
May 2007 49.35 49.69 47.52 47.74
Jun 2007 47.74 47.94 47.09 47.77
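To make the aggregation rule concrete: each coarser bar takes the first Open, the highest High, the lowest Low and the last Close of its period. The sketch below rebuilds the monthly bars by hand from endpoints() and compares them with to.period(); the helper names `first_rows`, `last_rows` and `manual` are illustrative only.

```r
library(xts)
data(sample_matrix)

x <- as.xts(sample_matrix, dateFormat = "Date")

ep <- endpoints(x, on = "months")
first_rows <- ep[-length(ep)] + 1  # first row of each month
last_rows  <- ep[-1]               # last row of each month

manual <- cbind(
  Open  = as.numeric(x[first_rows, "Open"]),
  High  = as.numeric(period.apply(x[, "High"], ep, max)),
  Low   = as.numeric(period.apply(x[, "Low"],  ep, min)),
  Close = as.numeric(x[last_rows, "Close"])
)

monthly <- to.period(x, "months")

# Compare values only; the column names differ between the two.
all.equal(unname(coredata(monthly)), unname(manual))
```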
Often it is desirable to be able to calculate a particular statistic, or evaluate a function, over a set of non-overlapping time periods.
period.apply(matrix_xts[,4],
INDEX=endpoints(matrix_xts),FUN=max)
Close
2007-01-31 50.68
2007-02-28 51.18
2007-03-31 50.62
2007-04-30 50.33
2007-05-31 49.59
2007-06-30 47.77
The same result can be achieved using apply.monthly:
apply.monthly(matrix_xts[,4],FUN=max)
Close
2007-01-31 50.68
2007-02-28 51.18
2007-03-31 50.62
2007-04-30 50.33
2007-05-31 49.59
2007-06-30 47.77
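FUN does not have to be a built-in summary; any function of the values in one period works. As a sketch, the hypothetical helper `close_range` below computes the spread between the highest and lowest Close in each period, applied once through period.apply() and once through the apply.monthly() wrapper, with apply.weekly() shown for a different period.

```r
library(xts)
data(sample_matrix)

x <- as.xts(sample_matrix, dateFormat = "Date")

# Hypothetical helper: spread between highest and lowest value in a period
close_range <- function(v) max(v) - min(v)

via_period_apply <- period.apply(x[, "Close"],
                                 INDEX = endpoints(x, on = "months"),
                                 FUN = close_range)
via_wrapper <- apply.monthly(x[, "Close"], FUN = close_range)

# Same idea at weekly resolution
weekly_mean <- apply.weekly(x[, "Close"], FUN = mean)

via_wrapper
```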
Credits go to the authors of the xts package:
Jeff Ryan, Joshua Ulrich
My contact details : chinmay.patil@gmail.com