Time Series Analysis 3

Visit my website for more like this! I would love to hear your feedback (seriously).

library(astsa, quietly=TRUE, warn.conflicts=FALSE)
require(knitr)
library(ggplot2)

Data Sources:

Heavily borrowed from:

1.0 Introduction

So far we have introduced autocorrelation and cross-correlation (ACF and CCF) as techniques for clarifying relations that may occur within and between time series at various lags. We also know how to build linear models based on classical regression techniques, exploiting the large lagged correlations identified by the ACF or CCF to construct useful predictor variables. This section deals with data that may be nonstationary, with an emphasis on forecasting future values. Now that we have covered the basics, we can focus on time-series-specific regression techniques, since classical regression is often not suitable for explaining all aspects of a time series.

Using correlation we can identify lagged linear relationships, which give rise to autoregressive (AR) and autoregressive moving average (ARMA) models. Next we explore a class of nonstationary models, the autoregressive integrated moving average (ARIMA) models. Eventually, we demonstrate the Box-Jenkins method for identifying an ARIMA model, as well as techniques for parameter estimation and forecasting. ARIMA models may include AR terms, MA terms, and differencing operations.

Most software specifies the model attributes in the order parameter (AR order, differencing, MA order).

One of the biggest challenges in time series analysis is choosing the best model for the data at hand. Here we also introduce the partial autocorrelation function (PACF), which is the final diagnostic tool (paired with the ACF, CCF, and lagged scatter plots) to help us choose the most suitable model.

2.0 Autoregressive Moving Average Models

Classical regression is a static, global model. With time series, we can allow the dependent variable to be influenced not only by the current values of the independent variables, but by past values as well. We can even include past values of the dependent variable itself. If the present can be modeled well using only past values, we can start to make useful forecasts.

Autoregressive models are based on the idea that the current value of the series, \(x_t\), can be explained as a function of the past p values, where p is the number of steps into the past needed to forecast the current value. The extent to which we can forecast a real data series from its own past values can be investigated by interpreting the autocorrelation function and/or a lagged scatter plot matrix. For example, the scatter plots may show that lags 4 and 5 are linearly associated with the current value, while the ACF could indicate significant positive autocorrelations at lags 1, 2, 12, and 36. The AR(1) model takes the form

\[x_t = \delta + \phi_1 x_{t-1} + w_t\]

where the correlation between observations h time periods apart is:

\[\rho_h = \phi_1^h\]

In an AR(1) model, the value of x at time t is a linear function of the value of x at time t − 1. The model assumes that the errors \(w_t\) are independent and normally distributed with mean 0 and constant variance, that each \(w_t\) is independent of past values of the series, and that the series is weakly stationary, which requires \(|\phi_1| < 1\).

An example of a sample AR(1) process with \(φ = 0.9\) and one with \(φ = -0.9\) are plotted below. This parameter defines how closely values across time are correlated with each other. \(φ = 0.9\) means that observations contiguous in time will tend to be close in value. The second example implies the opposite: values contiguous in time are negatively correlated, while observations two points in time apart are positively correlated.

In general, if \(\phi_1\) is positive the ACF exponentially decreases to 0 as the lag h increases; for negative \(\phi_1\), the ACF also decays exponentially to 0 as the lag increases, but the algebraic signs of the autocorrelations alternate between positive and negative.
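
As a quick check of the \(\rho_h = \phi_1^h\) relationship (and of the sign-alternating behaviour), the theoretical ACF can be computed with the base-R function ARMAacf(). A minimal sketch:

phi <- 0.9
ARMAacf(ar = phi, lag.max = 5)    # 1, 0.9, 0.81, 0.729, ... matches phi^h
phi^(0:5)                         # same values computed directly
ARMAacf(ar = -phi, lag.max = 5)   # same magnitudes, but the signs alternate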

par(mfrow=c(4,1))
set.seed(101)   # make the simulations reproducible
ar1.pos <- arima.sim(list(order=c(1,0,0), ar=0.9), n=100)    # AR(1) with phi = +0.9
ar1.neg <- arima.sim(list(order=c(1,0,0), ar=-0.9), n=100)   # AR(1) with phi = -0.9
plot(ar1.pos, ylab='x', main=(expression(AR(1)~~~phi==+0.9)))
acf(ar1.pos, lag.max=11, main='ACF')
plot(ar1.neg, ylab='x', main=(expression(AR(1)~~~phi==-0.9)))
acf(ar1.neg, lag.max=11, main='ACF')

(Figure: simulated AR(1) series with φ = +0.9 and φ = −0.9, each with its sample ACF)

AR(1) models with \(|\phi_1| > 1\) are explosive because the time series can quickly become large in magnitude.
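
A minimal sketch of this explosive behaviour (arima.sim() refuses non-stationary models, so the recursion is written out directly; the value \(\phi = 1.1\) is an arbitrary choice for illustration):

set.seed(1)
n <- 50; phi <- 1.1
x <- numeric(n); w <- rnorm(n)
for (t in 2:n) x[t] <- phi * x[t - 1] + w[t]   # x_t = phi * x_{t-1} + w_t with |phi| > 1
plot(x, type='l', main=expression(Explosive~AR(1)~~~phi==1.1))   # magnitude grows rapidly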

2.1 Moving Average Models

Moving average models of order q (MA(q)) are analogous to autoregressive models: where an AR model uses lagged values of \(x_t\), each multiplied by a coefficient, to predict \(x_t\), an MA(q) model uses lagged error terms, each multiplied by a coefficient.

\[MA(1): x_t = \mu + w_t + \theta_1 w_{t-1}\] \[MA(2): x_t = \mu + w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2}\] \[MA(q): x_t = \mu + w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \cdots + \theta_q w_{t-q}\]

where \(\mu\) is the mean of the series and \(w_t\) are normally distributed error terms with mean 0.

Some general ACF properties of MA(q) models:

In general, MA(q) models have non-zero autocorrelations for the first q lags and autocorrelations = 0 for all lags > q.
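
This cut-off can also be seen in the theoretical ACF; a minimal sketch using ARMAacf() for an MA(2) with arbitrarily chosen coefficients:

round(ARMAacf(ma = c(0.5, 0.5), lag.max = 5), 3)   # non-zero at lags 1 and 2, exactly 0 beyond lag q = 2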

A sample ACF with significant autocorrelation at only lag 1 is an indicator of a possible MA(1) model.

set.seed(1234)
ma1 <- arima.sim(list(order=c(0,0,1), ma=c(-0.5)), n=100) # simulate 100 values from an MA(1) with theta = -0.5
par(mfrow=c(2,1))
plot(ma1, type='b', main='Simulated MA(1)', ylab='y')
acf(ma1, main='ACF for simulated sample data')

(Figure: simulated MA(1) series and its sample ACF)

A sample ACF with significant autocorrelations at lags 1 and 2, but non-significant autocorrelations at higher lags, indicates a possible MA(2) model.

set.seed(1234)
ma2 <- arima.sim(list(order=c(0,0,2), ma=c(0.5, 0.5)), n=100) # simulate 100 values from an MA(2)
par(mfrow=c(2,1))
plot(ma2, type='b', main='Simulated MA(2)', ylab='y')
acf(ma2, main='ACF for simulated sample data')

(Figure: simulated MA(2) series and its sample ACF)

2.11 Non-uniqueness of MA models and Invertibility

We note that an MA(1) model produces the same lag-1 autocorrelation for \(\theta\) as for \(1/\theta\). To remove this ambiguity we impose a theoretical restriction called invertibility: MA(1) models are restricted to coefficients with absolute value less than 1.

An MA model is said to be invertible if it is algebraically equivalent to a convergent infinite-order AR model, meaning the implied AR coefficients decrease to 0 as we move back in time. Invertibility is enforced by our software, so it is not something we usually need to worry about during data analysis.
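
A minimal sketch of the non-uniqueness: \(\theta = 0.5\) and \(1/\theta = 2\) give exactly the same lag-1 autocorrelation, and only the first is invertible:

ARMAacf(ma = 0.5, lag.max = 1)   # rho_1 = 0.5 / (1 + 0.5^2) = 0.4
ARMAacf(ma = 2.0, lag.max = 1)   # rho_1 = 2 / (1 + 2^2) = 0.4, the same correlation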

2.2 Partial Autocorrelation Function

We have seen that for MA(q) models, the ACF will be insignificant for lags > q. Therefore, the ACF provides considerable information for specifying an MA(q) process. Unfortunately, if the process is ARMA or AR, the ACF alone yields little information about the orders of dependence (p, q). There is, however, another function that does for AR models what the ACF does for MA models: the partial autocorrelation function (PACF).

An easy example of a partial correlation can be given using a linear regression where we predict y from x1, x2, and x3. The partial correlation between y and x3 is the correlation between the "parts" of y and x3 that are not predicted by x1 and x2. An ordinary (non-partial) correlation between y and x3, by contrast, leaves in whatever dependence both variables share with x1 and x2.

Similarly, for a time series, the PACF is the conditional autocorrelation between \(x_s\) and \(x_t\), with the linear effect of everything in between those two points removed. Consider an AR(1) model: the correlation between \(x_t\) and \(x_{t-2}\) is not zero, as it would be for an MA(1), because \(x_t\) depends on \(x_{t-2}\) through \(x_{t-1}\). The PACF breaks this chain of dependence by literally subtracting out (partialling out) the effect of \(x_{t-1}\).

For an AR model, the theoretical PACF essentially "shuts off" past the order of the model. Thus, exactly as with MA(q) model selection using the ACF, we identify the order of the model by the number of non-zero partial autocorrelations. An AR(1) model will have a single significant (non-zero) partial autocorrelation, at lag 1, in the PACF.

Note: the PACF of an MA(1) model will have a pattern that gradually tapers to zero.
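
A minimal sketch of both facts, using the theoretical PACFs from ARMAacf() (the coefficients of 0.6 are arbitrary choices for illustration):

round(ARMAacf(ar = 0.6, lag.max = 6, pacf = TRUE), 3)   # AR(1): single non-zero value at lag 1
round(ARMAacf(ma = 0.6, lag.max = 6, pacf = TRUE), 3)   # MA(1): values taper gradually toward zero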

3.0 Specifying the elements of the model.

Recall that most software specifies an ARIMA model with the order parameter (AR order, differencing, MA order); specifying a model is a matter of identifying the right combination of these three for a particular time series.
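
As a small illustration (the simulated series and the parameter values here are arbitrary), an ARIMA(1,1,1) model is requested by setting all three elements of the order argument:

set.seed(42)
x <- arima.sim(list(order = c(1, 1, 1), ar = 0.5, ma = 0.3), n = 100)  # simulated ARIMA(1,1,1) series
arima(x, order = c(1, 1, 1))   # order = c(AR order, differencing, MA order)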

Identifying a possible model

Three items should be considered to determine a first guess at the form of an ARIMA model:

  1. Time series plot
  2. ACF
  3. PACF

3.1 Time series plot

What to look for: possible trends, seasonality, outliers, and constant/non-constant variance.

Note: non-constant variance in a series with no trend may indicate something like an ARCH model, which is designed for modelling changing variance over time; we will cover this later.

3.2 ACF and PACF

The ACF and PACF should be considered together, as a few combining patterns will stand out, especially with experience.
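
A rough rule of thumb for reading the two plots together: for an AR(p) the ACF tails off and the PACF cuts off after lag p; for an MA(q) the ACF cuts off after lag q and the PACF tails off; for an ARMA(p, q) both tail off. A minimal sketch on simulated data (the AR(2) coefficients are arbitrary; acf2() from astsa prints the ACF and PACF together):

set.seed(123)
ar2 <- arima.sim(list(order = c(2, 0, 0), ar = c(0.6, 0.3)), n = 200)  # simulated AR(2)
acf2(ar2)   # expect: ACF tails off, PACF cuts off after lag 2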

3.3 Estimating and Diagnosing a Possible Model.

After we have made a few guesses at the possible model, we use software to estimate the coefficients. Usually this is done using maximum likelihood estimation. Once the model is estimated…
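
A minimal sketch of the estimation step on simulated data (base R's arima() is used here; the sarima() function from astsa, used in the examples below, calls arima() internally and adds diagnostic plots):

set.seed(321)
y <- arima.sim(list(order = c(1, 0, 0), ar = 0.7), n = 150)  # simulated AR(1)
arima(y, order = c(1, 0, 0), method = "ML")                  # maximum likelihood estimates of phi and the mean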

What if more than one model looks OK?

Sometimes, more than one model can appear to work for the same time series.

4.0 Examples

Lake Erie water level in October. There is possibly some overall trend, but we can't be sure yet. We will not worry about a trend for now. Let's plot the data, then the ACF and PACF.

data <- c(14.3, 14.6, 13.5, 14.2, 12.1, 14.19, 14.55, 13.6, 14.59, 16.6, 15.4,
          12.89, 12.2, 12.15, 10.89, 11.11, 11.98, 13.44, 14.05, 13.91, 14.1,
          12.66, 14.6, 16.7, 15.4, 17.1, 15.45, 16.76, 15.7, 14.1, 15.3, 16.4,
          17.09, 16.8, 16.98, 16.39, 15.8, 15.1, 13.7, 13.79, 15.7)
# load built in commands for TSA diagnostics
source(url("http://lib.stat.cmu.edu/general/tsa2/Rcode/itall.R")) 
##   itall has been installed
plot(data, type='b', ylab='water level')

(Figure: time series plot of the October water levels)

acf2(data)

(Figure: sample ACF and PACF of the water level series)

##         ACF  PACF
##  [1,]  0.72  0.72
##  [2,]  0.52  0.02
##  [3,]  0.39  0.01
##  [4,]  0.24 -0.09
##  [5,]  0.14 -0.02
##  [6,]  0.08  0.00
##  [7,]  0.15  0.23
##  [8,]  0.17  0.02
##  [9,]  0.13 -0.11
## [10,]  0.03 -0.18
## [11,] -0.12 -0.24
## [12,] -0.16  0.11
## [13,] -0.15  0.16
## [14,] -0.12  0.06
## [15,] -0.14 -0.22
## [16,] -0.19 -0.32
## [17,] -0.14  0.05

The PACF shows a single spike at the first lag and the ACF shows a tapering pattern, so right away we can see that an AR(1) model might be appropriate.

Estimating the model

sarima(data, 1, 0, 0) # the AR(1) model, estimated with the tools loaded above
## initial  value 0.519565 
## iter   2 value 0.153108
## iter   3 value 0.153003
## iter   4 value 0.152866
## iter   5 value 0.152842
## iter   6 value 0.152841
## iter   6 value 0.152841
## final  value 0.152841 
## converged
## initial  value 0.150083 
## iter   2 value 0.149811
## iter   3 value 0.149705
## iter   4 value 0.149704
## iter   5 value 0.149704
## iter   6 value 0.149704
## iter   6 value 0.149704
## final  value 0.149704 
## converged

(Figure: sarima diagnostic plots for the AR(1) fit: standardized residuals, residual ACF, normal Q-Q plot, and Ljung-Box p-values)

## $fit
## 
## Call:
## arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S), 
##     xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, REPORT = 1, 
##         reltol = tol))
## 
## Coefficients:
##         ar1   xmean
##       0.709  14.583
## s.e.  0.105   0.584
## 
## sigma^2 estimated as 1.33:  log likelihood = -64.31,  aic = 134.6
## 
## $AIC
## [1] 1.38
## 
## $AICc
## [1] 1.445
## 
## $BIC
## [1] 0.4636

Interpreting the diagnostics

The AR coefficient is statistically significant (z = 0.709/0.105 ≈ 6.75). It's not necessary to test the mean coefficient; we know that it's not 0.
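
A quick sketch of that calculation, using the estimate and standard error from the coefficient table above:

ar1.est <- 0.709; ar1.se <- 0.105   # values from the printed output
z <- ar1.est / ar1.se               # approximately 6.75
2 * pnorm(-abs(z))                  # two-sided p-value, essentially 0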

The time series plot of the standardized residuals seems to indicate there is no trend in the residuals, no outliers, and no obvious change in variance over time.

The ACF of the residuals shows no significant autocorrelations, great!

The Q-Q plot is a normal probability plot; small deviations at the tails are not abnormal, and usually nothing to worry about.

The last plot on the bottom gives p-values for the Ljung-Box-Pierce statistic for each lag up to 20. These tests consider the accumulated residual autocorrelation from lag 1. The dashed blue line is at 0.05, and all p-values are above it, which is a good result for this test.
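
The same kind of test can be run by hand on the residuals with base R's Box.test(); a minimal sketch for a single lag (fitdf = 1 accounts for the one estimated AR coefficient):

fit <- arima(data, order = c(1, 0, 0))   # refit the AR(1) from above with base R
Box.test(residuals(fit), lag = 20, type = "Ljung-Box", fitdf = 1)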

What a poorly specified model will look like.

What happens if we misinterpreted the ACF and PACF and used an MA(1) model instead of an AR(1) model?

sarima(data, 0, 0, 1)
## initial  value 0.507459 
## iter   2 value 0.271938
## iter   3 value 0.270110
## iter   4 value 0.259999
## iter   5 value 0.259329
## iter   6 value 0.259211
## iter   7 value 0.259210
## iter   8 value 0.259209
## iter   9 value 0.259209
## iter  10 value 0.259209
## iter  10 value 0.259209
## iter  10 value 0.259209
## final  value 0.259209 
## converged
## initial  value 0.261526 
## iter   2 value 0.261466
## iter   3 value 0.261392
## iter   4 value 0.261391
## iter   5 value 0.261391
## iter   5 value 0.261391
## iter   5 value 0.261391
## final  value 0.261391 
## converged

(Figure: sarima diagnostic plots for the MA(1) fit)

## $fit
## 
## Call:
## arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S), 
##     xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, REPORT = 1, 
##         reltol = tol))
## 
## Coefficients:
##         ma1   xmean
##       0.628  14.539
## s.e.  0.116   0.325
## 
## sigma^2 estimated as 1.67:  log likelihood = -68.89,  aic = 143.8
## 
## $AIC
## [1] 1.608
## 
## $AICc
## [1] 1.673
## 
## $BIC
## [1] 0.6917

The MA(1) coefficient is still significant, but overall the fit statistics are worse than for our AR(1) model.

The diagnostic plots are quite informative as well. The residual ACF has a significant spike at lag 2, and several Ljung-Box-Pierce p-values are below 0.05.

Another poorly specified model.

Next we try fitting an ARMA(1, 1) model to the same data.

sarima(data, 1, 0, 1) # over parameterized model
## initial  value 0.519565 
## iter   2 value 0.445004
## iter   3 value 0.166926
## iter   4 value 0.159131
## iter   5 value 0.155006
## iter   6 value 0.152952
## iter   7 value 0.152844
## iter   8 value 0.152840
## iter   9 value 0.152839
## iter  10 value 0.152838
## iter  11 value 0.152838
## iter  12 value 0.152838
## iter  12 value 0.152838
## iter  12 value 0.152838
## final  value 0.152838 
## converged
## initial  value 0.150077 
## iter   2 value 0.149790
## iter   3 value 0.149741
## iter   4 value 0.149713
## iter   5 value 0.149708
## iter   6 value 0.149704
## iter   7 value 0.149704
## iter   8 value 0.149704
## iter   8 value 0.149704
## final  value 0.149704 
## converged

(Figure: sarima diagnostic plots for the ARMA(1,1) fit)

## $fit
## 
## Call:
## arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = S), 
##     xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, REPORT = 1, 
##         reltol = tol))
## 
## Coefficients:
##         ar1    ma1   xmean
##       0.708  0.001  14.583
## s.e.  0.149  0.224   0.584
## 
## sigma^2 estimated as 1.33:  log likelihood = -64.31,  aic = 136.6
## 
## $AIC
## [1] 1.429
## 
## $AICc
## [1] 1.505
## 
## $BIC
## [1] 0.5541

Right away we see that the MA(1) coefficient is not significant, so we could drop that term and be left with the correct AR(1) model. The AIC, BIC, and standard errors are also higher than for the correct AR(1) model.

The next lesson will cover forecasting with ARIMA models.