The basic assumption for a time series is that the data points are random variables with some dependence between points close together in time but little or no dependence between points far apart in time. These assumptions can be formalized through the concepts of stationarity and ergodicity.
In a stationary stochastic process, the joint distribution of the data points is time invariant; in particular, the mean and variance do not change over time. The autocovariances and autocorrelations are measures of linear temporal dependence in a covariance stationary stochastic process; viewed as a function of the lag, the autocorrelations are known as the autocorrelation function (ACF). The ACF reveals the interrelationships within a time series, that is, the correlation between all pairs of data points that are exactly the same number of steps apart.
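To make the definition concrete, here is a minimal sketch of the sample ACF in plain Python, independent of whatever software produced the plots in this text. The function name `sample_acf` and the toy series are illustrative assumptions; the estimator divides every autocovariance by n, which is one common convention.

```python
def sample_acf(x, max_lag):
    """Return sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    # Lag-0 autocovariance, i.e. the (biased) sample variance.
    c0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        # Covariance between points that are exactly k steps apart.
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf

# Hypothetical toy series, used only to exercise the function.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0, 2.0]
r = sample_acf(series, 3)
print([round(v, 3) for v in r])
```

The value at lag 0 is always 1, and every other value lies between -1 and 1.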
Strict stationarity and covariance stationarity make no assumption about the strength of dependence between random variables in the sequence. In many contexts, however, it is reasonable to assume that the strength of dependence between points in a stochastic process diminishes the farther apart they become. This diminishing dependence assumption is captured by the concept of ergodicity. Intuitively, a stochastic process is ergodic if any two collections of random variables partitioned far apart in the sequence are essentially independent.
An important class of linear time series models is the family of Autoregressive Integrated Moving Average (ARIMA) models, proposed by Box and Jenkins (1976). These models assume that the current value depends only on past values of the time series itself or on past values of an error term.
Moving average models are simple covariance stationary and ergodic time series models that can capture a wide variety of autocorrelation patterns. Suppose we want to create a covariance stationary and ergodic stochastic process in which $y_t$ and $y_{t-1}$ are correlated but $y_t$ and $y_{t-j}$ are not correlated for $j > 1$, so that the time dependence in the process lasts for only one period. Such a process can be created using the first-order moving average, MA(1), model. The moving average parameter $\theta$ determines the sign and magnitude of the correlation between $y_t$ and $y_{t-1}$. Clearly, if $\theta = 0$ then $y_t$ exhibits no time dependence.
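The one-period memory of an MA(1) process can be checked by simulation. The sketch below assumes the standard parameterization $y_t = \mu + \varepsilon_t + \theta \varepsilon_{t-1}$ with Gaussian white noise, for which the lag-1 autocorrelation is $\theta / (1 + \theta^2)$ and all higher-lag autocorrelations are zero. The function names, the seed, and the choice $\theta = 0.5$ are illustrative assumptions.

```python
import random

def simulate_ma1(n, theta, mu=0.0, sigma=1.0, seed=0):
    """Simulate n observations of y_t = mu + eps_t + theta * eps_{t-1}."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(n + 1)]
    return [mu + eps[t] + theta * eps[t - 1] for t in range(1, n + 1)]

def lag_corr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    ck = sum((x[t] - m) * (x[t + k] - m) for t in range(n - k))
    return ck / c0

theta = 0.5
y = simulate_ma1(20000, theta)
rho1_theory = theta / (1 + theta ** 2)  # theoretical lag-1 autocorrelation
print("theory lag 1:", rho1_theory)
print("sample lag 1:", round(lag_corr(y, 1), 3))
print("sample lag 2:", round(lag_corr(y, 2), 3))  # should be close to zero
```

With a long simulated series, the lag-1 sample value should sit near the theoretical one, while the lag-2 value should be near zero.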
The presence of autocorrelation is one indication that an ARIMA model could be used to model the time series. From the ACF plot, one can count the number of significant autocorrelations, which is a useful estimate of the number of moving average (MA) coefficients in the model. The plot for wfc shows that only one MA coefficient will be required.
Partial autocorrelation is another tool for understanding interrelationships in a time series. It is the correlation between data points that are exactly n steps apart, after accounting for their correlation with the data points in between. It helps to identify the number of autoregressive (AR) coefficients in an ARIMA model. For wfc, no significant partial autocorrelation was found.
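One standard way to obtain partial autocorrelations from autocorrelations is the Durbin-Levinson recursion. The sketch below is a plain Python illustration, not necessarily the routine behind the plots discussed here. The function name `pacf_from_acf` is an assumption, and the input is the theoretical ACF of an AR(1) process with coefficient 0.6, chosen because its partial autocorrelation should vanish after lag 1.

```python
def pacf_from_acf(r, max_lag):
    """Durbin-Levinson recursion.

    r[k] is the autocorrelation at lag k (with r[0] == 1).
    Returns the partial autocorrelations at lags 1..max_lag.
    """
    pacf = []
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
            phi = [phi_kk]
        else:
            num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            # Update the lower-order coefficients, then append phi_kk.
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf

# Theoretical ACF of an AR(1) process with coefficient 0.6: r[k] = 0.6 ** k.
acf_ar1 = [0.6 ** k for k in range(4)]
p = pacf_from_acf(acf_ar1, 3)
print([round(v, 6) for v in p])  # lag 1 equals 0.6, later lags are zero up to rounding
```

A single nonzero partial autocorrelation followed by zeros is the signature that suggests one AR coefficient.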
The cross correlation function helps to discover lagged correlations between two time series. Correlation at lag 0 is the simple correlation between the variables.
[1] 0.5059
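A lagged cross-correlation can be sketched in a few lines of plain Python. The sign convention assumed here matches R's `ccf`, where lag k refers to the correlation between x[t+k] and y[t]. The function name `cross_corr` and the synthetic data are illustrative assumptions and are unrelated to the series that produced the value above.

```python
import random

def cross_corr(x, y, lag):
    """Sample correlation between x[t + lag] and y[t]; lag may be negative."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = (sum((v - mx) ** 2 for v in x) / n) ** 0.5
    sy = (sum((v - my) ** 2 for v in y) / n) ** 0.5
    if lag >= 0:
        pairs = [(x[t + lag], y[t]) for t in range(n - lag)]
    else:
        pairs = [(x[t], y[t - lag]) for t in range(n + lag)]
    cov = sum((a - mx) * (b - my) for a, b in pairs) / n
    return cov / (sx * sy)

# Synthetic example: y is x delayed by two steps, so x leads y.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.0, 0.0] + x[:-2]

for k in (-3, -2, -1, 0, 1):
    print(k, round(cross_corr(x, y, k), 3))
```

In this construction the strongest correlation appears at lag -2, while the lag-0 value (the simple correlation) is close to zero.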
Building an ARIMA model consists of three steps: 1. Model identification (determining the order, that is, the number of past values and the number of past error terms to incorporate in a tentative model); 2. Model estimation (the parameters of the model are estimated, generally using either least squares or maximum likelihood); and 3. Diagnostic checking (e.g., verifying that the model residuals behave as white noise). The model order is usually denoted by three integers, (p,d,q), where p = number of autoregressive (AR) coefficients; d = degree of differencing (I); q = number of moving average (MA) coefficients.
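The role of d can be illustrated with a small sketch of differencing: applying the first difference d times before fitting the AR and MA parts. The helper name `difference` and the linear-trend toy series are assumptions made for illustration.

```python
def difference(x, d=1):
    """Apply the first-difference operator d times."""
    for _ in range(d):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x

# A series with a deterministic linear trend (slope 0.5).
trend = [2.0 + 0.5 * t for t in range(8)]
print(difference(trend, 1))  # [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
print(difference(trend, 2))  # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

One difference removes the linear trend and leaves a constant, which is the role played by the drift term when d = 1. Each round of differencing also shortens the series by one observation.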
Series: returns[, "sp500"]
ARIMA(1,1,0) with drift
Coefficients:
        ar1  drift
      0.082  0.006
s.e.  0.063  0.003
sigma^2 estimated as 0.00188: log likelihood=436.8
AIC=-867.7 AICc=-867.6 BIC=-857.1
Series: returns[, "aapl"]
ARIMA(0,1,0) with drift
Coefficients:
      drift
      0.018
s.e.  0.009
sigma^2 estimated as 0.0184: log likelihood=146.8
AIC=-289.5 AICc=-289.4 BIC=-282.4
Series: returns[, "vbltx"]
ARIMA(0,1,2) with drift
Coefficients:
        ma1     ma2  drift
      0.065  -0.220  0.006
s.e.  0.062   0.064  0.001
sigma^2 estimated as 0.000616: log likelihood=578.3
AIC=-1149 AICc=-1148 BIC=-1134
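The AIC values in the three summaries can be checked by hand with the usual definition, AIC = -2 log L + 2k. The sketch below assumes that k counts the reported coefficients plus one for the innovation variance. That assumption is consistent with the printed numbers, but it is inferred from them rather than stated in the output. Because the log likelihoods are rounded, agreement is only expected to within rounding.

```python
def aic(loglik, n_params):
    """Akaike information criterion: -2 * log-likelihood + 2 * k."""
    return -2.0 * loglik + 2.0 * n_params

# (log likelihood, assumed k = coefficients + 1 for sigma^2, reported AIC)
fits = {
    "sp500": (436.8, 3, -867.7),   # ar1, drift, sigma^2
    "aapl":  (146.8, 2, -289.5),   # drift, sigma^2
    "vbltx": (578.3, 4, -1149.0),  # ma1, ma2, drift, sigma^2
}

for name, (loglik, k, reported) in fits.items():
    print(name, "computed:", round(aic(loglik, k), 1), "reported:", reported)
```

Lower AIC indicates a better trade-off between fit and complexity, but only among models fitted to the same series, so the three values above are not comparable with one another.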
The tsdiag function plots the standardized residuals, the autocorrelation function of the residuals, and the p-values of a portmanteau test for all lags up to a chosen maximum.
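A common portmanteau statistic is the Ljung-Box Q, computed as Q = n(n+2) times the sum over lags k = 1..m of r_k^2 / (n - k), where r_k is the residual autocorrelation at lag k. The sketch below computes only the statistic in plain Python; turning it into a p-value requires a chi-square distribution function, which is left out. The function name `ljung_box_q` and the synthetic series are illustrative assumptions, not the residuals of the models above.

```python
import random

def ljung_box_q(resid, max_lag):
    """Ljung-Box Q statistic using autocorrelations at lags 1..max_lag."""
    n = len(resid)
    m = sum(resid) / n
    dev = [v - m for v in resid]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rk = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        total += rk * rk / (n - k)
    return n * (n + 2) * total

rng = random.Random(2)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]  # behaves like white noise

walk = []                                          # strongly autocorrelated
level = 0.0
for e in white:
    level += e
    walk.append(level)

q_white = ljung_box_q(white, 10)
q_walk = ljung_box_q(walk, 10)
print("white noise Q:", round(q_white, 2))
print("random walk Q:", round(q_walk, 2))
```

For residuals that behave like white noise, Q stays small relative to a chi-square reference distribution. Leftover autocorrelation inflates it and produces small p-values.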
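The predict function returns both the predicted value of the next observation(s) and the standard error of the prediction according to the model. The first output below is a one-step-ahead forecast; the second covers ten steps ahead.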
$pred
Time Series:
Start = 256
End = 256
Frequency = 1
[1] 4.002
$se
Time Series:
Start = 256
End = 256
Frequency = 1
[1] 0.0827
$pred
Time Series:
Start = 256
End = 265
Frequency = 1
[1] 4.002 4.002 4.002 4.002 4.002 4.002 4.002 4.002 4.002 4.002
$se
Time Series:
Start = 256
End = 265
Frequency = 1
[1] 0.0827 0.1169 0.1432 0.1654 0.1849 0.2026 0.2188 0.2339 0.2481
[10] 0.2615
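The standard errors above grow like the square root of the forecast horizon. This is the pattern expected when forecasting a random walk, where the h-step standard error is sigma times sqrt(h). The sketch below checks that numerically, taking sigma to be the printed one-step standard error, 0.0827. The random-walk reading is an inference from the printed numbers, not something the output states.

```python
import math

sigma = 0.0827  # one-step-ahead standard error from the output above
reported = [0.0827, 0.1169, 0.1432, 0.1654, 0.1849,
            0.2026, 0.2188, 0.2339, 0.2481, 0.2615]

# Under a random walk, the h-step forecast error is the sum of h
# independent shocks, so its standard deviation is sigma * sqrt(h).
implied = [sigma * math.sqrt(h) for h in range(1, len(reported) + 1)]

for h, (calc, rep) in enumerate(zip(implied, reported), start=1):
    print(h, round(calc, 4), rep)
```

The computed and reported values agree to within rounding of sigma. The flat point forecast of 4.002 at every horizon fits the same pattern.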