Overview

Today in class, we learned how to test ARIMA models and perform diagnoses on them. We also learned how to model seasonal ARIMAs.

Tests

We can perform hypothesis tests on both the autoregression coefficients and the moving average coefficients (\(\theta\) and \(\phi\), respectively). The test statistic for each of these hypothesis tests is the coefficient over the standard error of the coefficient. These test statistics follow a T distribution, with the degrees of freedom equal to n -(#\(\theta\)s, #\(\phi\)’s).

Diagnostics

We can perform residual analysis on our ARIMA model. If ARIMA is doing a good job, then our residuals, and consequently our test stat, will be small. Our null hypothesis is that the model adequately models the data, while the alternative is that the model inadequately models the data.

Two test statistics that we can use are the BoxPierce stat, or the Ljung-Box stat. the BoxPierce stat is the original, while the Ljung-Box is an improved test statistic, so we will use that one primarily. To reiterate, a large p-value means the model is adequate (which might seem a bit counterintuitive).

Seasonal ARIMA

Intro

Seasonal ARIMA is modeled with the following form: (p,d,q)x(P,D,Q)S. The first part, (p,d,q), is the orignal ARIMA model, while the second part, (P,D,Q)S is the seasonal component. S is the time span of seasonal repeating, like 12 for monthly or 4 for quarterly.

If you remember from the original ARIMA model, lowercase d helps us to remove upward or downward trend. With seasonal D, this helps us to remove seasonal trend.

General Seasonal ARIMA Steps

We will now go through an example to demonstrate the general steps that go along with seasonal ARIMA.

To demonstrate, I will use data from souvenir sales at a fancy gift shop.

souvenir <- scan("http://robjhyndman.com/tsdldata/data/fancy.dat")
souvenir <- ts(souvenir, frequency=12, start=c(1987,1))
library(forecast)
## Warning: package 'forecast' was built under R version 3.4.4
library(quantmod)
## Warning: package 'quantmod' was built under R version 3.4.4
## Loading required package: xts
## Warning: package 'xts' was built under R version 3.4.4
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.4.3
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: TTR
## Warning: package 'TTR' was built under R version 3.4.3
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(tseries)
## Warning: package 'tseries' was built under R version 3.4.4
library(timeSeries)
## Warning: package 'timeSeries' was built under R version 3.4.4
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 3.4.3
## 
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
## 
##     time<-
library(forecast)
library(xts)
head(souvenir)
##          Jan     Feb     Mar     Apr     May     Jun
## 1987 1664.81 2397.53 2840.71 3547.29 3752.96 3714.74

Step 1

First, we need to plot the data to see if we have seasonality.

plot.ts(souvenir)

We can see a spike each year, which means we definitely have seasonality. Because this spike gets bigger every year, we will perform a log transformation on the data set.

logsouvenir<-log(souvenir)
plot.ts(logsouvenir)

This seems to have corrected the issue.

We can also use decomposition to see if there is seasonality.

plot(decompose(logsouvenir))

We can see from the third row down that there is for sure seasonal trend occurring.

Step 2

The second step is to do differences to identify what we want to use for D and d. If there is seasonality and no trend, we’ll have a number other than 0 for D but not d. If there is trend and no seasonality, we’ll have a number other than 0 for d but not D. If there is both seasonality and trend, we start with D being greater than 0, then check to see if the resulting differences still have a trend. If yes, we would try d greater than 0. Sometimes taking care of seasonality gets rid of the trend as well!

We first perform diffs of lag 12 (S=12 for monthly) to try and take care of seasonality, since we deemed that our data set has seasonality in step 1.

Diff1<-diff(logsouvenir, lag=12)
plot.ts(Diff1)

adf.test(Diff1)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  Diff1
## Dickey-Fuller = -2.6959, Lag order = 4, p-value = 0.292
## alternative hypothesis: stationary

We can’t really tell from the plot whether or not D=1 got rid of the trend for us. By doing a hypothesis test however, we see that there is a large p-value. This means we fail to reject the null, and we cannot say that the plot is stationary. In other words, there is still trend we need to take care of. Let’s attempt nonseasonal differences now to see if that helps to remove some of the upward/downward trend.

We use the difference data we created above, but this time with lag 1 since nonseasonal differences are calculated between each point.

diff1<-diff(Diff1, lag=1)
plot.ts(diff1)

adf.test(diff1)
## Warning in adf.test(diff1): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  diff1
## Dickey-Fuller = -5.3558, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary

We have a small p-value now, which means we can reject the null in favor of the alternative. Our data is now stationary (no more trend to take care of). We will proceed with building the model, with D=1 and d=1.

Step 3

The third step is to examine the ACF and the PACF to choose preliminary values for p, q, P, and Q.

ACF is the autocorrelation function. It tells us the correlation of 2 points separated by a specified lag. We determine q and Q with ACF.

PACF is the partial autocorrelation function. It’s like the ACF but controls for values of shorter lag. We determine p and P with ACF.

Acf(diff1)

To determine q, we look at spikes from low lags. Since there is one big spike over the blue line, we will choose q=1. We might even want to choose q=2 since the next spike is pretty close to the blue line. We will try q=1 first, but we might have to go back and tweak it.

To determine Q, we look at spikes from lags that are multiples of 12. We could maybe choose Q=1, but since it’s not really hitting the blue line, we’ll choose Q=0.

Pacf(diff1)

Again we look at low lags to determine p. It is pretty obvious that we should choose p=1.

We look at multiples of 12 to determine P. We will start with P=0 but might need to try P=1 since again it is close to the blue dotted line.

Step 4

We can now estimate our model. We ended up with preliminary estimates of ARIMA(1,1,1)x(0,1,0)12.

mod1 <- arima(logsouvenir, order = c(1,1,1), season = list( order = c( 0,1,0), period=12))

Step 5

The last step is to examine our models, use diagnostics, and perform hypothesis tests. We might have to make some more models here in order to improve accuracy. After creating more models, we could use AIC to compare.

summary(mod1) 
## 
## Call:
## arima(x = logsouvenir, order = c(1, 1, 1), seasonal = list(order = c(0, 1, 0), 
##     period = 12))
## 
## Coefficients:
##          ar1      ma1
##       0.2020  -0.7403
## s.e.  0.2614   0.2064
## 
## sigma^2 estimated as 0.03695:  log likelihood = 16.09,  aic = -26.17
## 
## Training set error measures:
##                        ME      RMSE       MAE        MPE     MAPE
## Training set -0.007853116 0.1767525 0.1334457 -0.1052367 1.455206
##                   MASE        ACF1
## Training set 0.3309956 -0.03525604
Box.test(mod1$residuals, type = "Ljung")
## 
##  Box-Ljung test
## 
## data:  mod1$residuals
## X-squared = 0.10818, df = 1, p-value = 0.7422

From the test, we see that we have a large p-value, so we have no evidence that the model is inadequate. Our residuals seem to be small enough!

We can test each coefficient by calc-ing the test stat. If it’s less than 2, we might want to try removing that coefficient because it would be close to zero. Since ar1’s test stat is less than 2, let’s try removing.

ARIMA (0,1,1) x (0,1,0) 12

mod2 <- arima(logsouvenir, order = c(0,1,1), season = list( order = c( 0,1,0), period=12))
Box.test(mod2$residuals, type = "Ljung")
## 
##  Box-Ljung test
## 
## data:  mod2$residuals
## X-squared = 0.064518, df = 1, p-value = 0.7995
summary(mod2)
## 
## Call:
## arima(x = logsouvenir, order = c(0, 1, 1), seasonal = list(order = c(0, 1, 0), 
##     period = 12))
## 
## Coefficients:
##           ma1
##       -0.5653
## s.e.   0.1227
## 
## sigma^2 estimated as 0.03722:  log likelihood = 15.89,  aic = -27.79
## 
## Training set error measures:
##                        ME     RMSE      MAE         MPE     MAPE      MASE
## Training set -0.006290552 0.177391 0.132399 -0.08573452 1.443332 0.3283994
##                   ACF1
## Training set 0.0272264

We get a large p-value again, which is a good sign, and ma1’s test stat is still greater than 2!

Let’s see if we can improve our model. We will try by increasing q by 1 since we said it might be greater than 1.

ARIMA (0,1,2) x (0, 1, 0) 12

mod3 <- arima(logsouvenir, order = c(0,1,2), season = list( order = c( 0,1,0), period=12))
mod3
## 
## Call:
## arima(x = logsouvenir, order = c(0, 1, 2), seasonal = list(order = c(0, 1, 0), 
##     period = 12))
## 
## Coefficients:
##           ma1      ma2
##       -0.5637  -0.0350
## s.e.   0.1156   0.1197
## 
## sigma^2 estimated as 0.03716:  log likelihood = 15.93,  aic = -25.87

ma2 test stat is not significantly different from 0 so we will stick with the previous model we created.

Just to check, we can look at their AIC’s.

AIC(mod3);AIC(mod2);AIC(mod1)
## [1] -25.86905
## [1] -27.78538
## [1] -26.17089

We obviously don’t want our first model since its AIC is so much higher than the other two models. mod2 has the lowest AIC!

R’s turn!

Let’s compare what we got to auto.arima.

automod<-auto.arima(logsouvenir)
automod
## Series: logsouvenir 
## ARIMA(2,0,0)(1,1,0)[12] with drift 
## 
## Coefficients:
##          ar1     ar2     sar1   drift
##       0.3493  0.3602  -0.3278  0.0247
## s.e.  0.1086  0.1159   0.1334  0.0044
## 
## sigma^2 estimated as 0.03182:  log likelihood=23.04
## AIC=-36.09   AICc=-35.18   BIC=-24.71

We were a little bit off. We didn’t need diffs for nonseasonality!

AIC(automod); AIC(mod3)
## [1] -36.08965
## [1] -25.86905
Box.test(automod$residuals, type="Ljung")
## 
##  Box-Ljung test
## 
## data:  automod$residuals
## X-squared = 0.11635, df = 1, p-value = 0.733

We can see the automod has the lowest AIC by far.