Chapter 8 - ARIMA models
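
The code below assumes the usual session setup for these exercises (a sketch; the original setup chunk is not shown): fpp2 for the data sets and forecasting functions, tseries for kpss.test(), and readxl for reading the retail spreadsheet.

library(fpp2)     # loads forecast, ggplot2 and the exercise data sets (ibmclose, usnetelec, ..., wmurders)
library(tseries)  # kpss.test()
library(readxl)   # read_excel() for the retail spreadsheet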



1. Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.
   a. Explain the differences among these figures. Do they all indicate that the data are white noise?

Figure 8.31: Left: ACF for a white noise series of 36 numbers. Middle: ACF for a white noise series of 360 numbers. Right: ACF for a white noise series of 1,000 numbers.

   b. Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise? (A quick numerical check follows below.)
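
All three figures are consistent with white noise: the sample autocorrelations stay within the critical bounds. For white noise those 95% bounds are approximately \(\pm 1.96/\sqrt{T}\), so they shrink as the series length \(T\) grows, and the autocorrelations differ across panels simply because each is computed from a different random sample. A quick check of the bounds (a computation added to the original solution):

# 95% critical values for the ACF of white noise: +/- 1.96/sqrt(T)
1.96 / sqrt(c(36, 360, 1000))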


2. A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
ggtsdisplay(ibmclose)
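
The time plot wanders with no stable mean, the ACF decays very slowly from values near 1, and the PACF has a single large spike at lag 1 — all signatures of a non-stationary, random-walk-like series. As a follow-up check (an addition to the original solution), the differenced series should look close to white noise:

ggtsdisplay(diff(ibmclose))  # differenced series: ACF/PACF should show little remaining structure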



3. For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
   a. usnetelec
   b. usgdp
   c. mcopper
   d. enplanements
   e. visitors
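
Before the series-by-series checks, ndiffs() gives a quick suggested (non-seasonal) differencing order for each series (a compact summary added here; the detailed tests follow):

sapply(list(usnetelec = usnetelec, usgdp = usgdp, mcopper = mcopper,
            enplanements = enplanements, visitors = visitors), ndiffs)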
autoplot(usnetelec)

Box.test(diff(usnetelec), type = "Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  diff(usnetelec)
## X-squared = 0.8508, df = 1, p-value = 0.3563
kpss.test(diff(usnetelec))
## Warning in kpss.test(diff(usnetelec)): p-value greater than printed p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(usnetelec)
## KPSS Level = 0.15848, Truncation lag parameter = 3, p-value = 0.1
autoplot(usgdp)

Box.test(diff(usgdp), type = "Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  diff(usgdp)
## X-squared = 39.187, df = 1, p-value = 3.85e-10
autoplot(diff(usgdp))

ndiffs(usgdp)
## [1] 2
autoplot(diff(diff(usgdp)))

Box.test(diff(diff(usgdp)), type = "Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  diff(diff(usgdp))
## X-squared = 53.294, df = 1, p-value = 2.872e-13
ggAcf(diff(diff(usgdp)))

kpss.test(diff(diff(usnetelec)))
## Warning in kpss.test(diff(diff(usnetelec))): p-value greater than printed
## p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(diff(usnetelec))
## KPSS Level = 0.098532, Truncation lag parameter = 3, p-value = 0.1
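
Note that the KPSS call above re-tests the twice-differenced usnetelec; the series under discussion at this point is usgdp, so the analogous check (presumably what was intended) would be:

kpss.test(diff(diff(usgdp)))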
autoplot(mcopper)

lambda_mcopper <- BoxCox.lambda(mcopper)
autoplot(diff(BoxCox(mcopper, lambda_mcopper)))

Box.test(diff(BoxCox(mcopper, lambda_mcopper)), type = "Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  diff(BoxCox(mcopper, lambda_mcopper))
## X-squared = 57.517, df = 1, p-value = 3.353e-14
ggAcf(diff(BoxCox(mcopper, lambda_mcopper)))

kpss.test(diff(BoxCox(mcopper, lambda_mcopper)))
## Warning in kpss.test(diff(BoxCox(mcopper, lambda_mcopper))): p-value
## greater than printed p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(BoxCox(mcopper, lambda_mcopper))
## KPSS Level = 0.057275, Truncation lag parameter = 6, p-value = 0.1
autoplot(enplanements)

lambda_enplanements <- BoxCox.lambda(enplanements)
ndiffs(enplanements)
## [1] 1
nsdiffs(enplanements)
## [1] 1
autoplot(diff(diff(BoxCox(enplanements, lambda_enplanements), lag = 12)))

Box.test(diff(diff(BoxCox(enplanements, lambda_enplanements), lag = 12)), type = "Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  diff(diff(BoxCox(enplanements, lambda_enplanements), lag = 12))
## X-squared = 29.562, df = 1, p-value = 5.417e-08
ggAcf(diff(diff(BoxCox(enplanements, lambda_enplanements),lag = 12)))

kpss.test(diff(diff(BoxCox(enplanements, lambda_enplanements),lag = 12)))
## Warning in kpss.test(diff(diff(BoxCox(enplanements, lambda_enplanements), :
## p-value greater than printed p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(diff(BoxCox(enplanements, lambda_enplanements), lag = 12))
## KPSS Level = 0.042424, Truncation lag parameter = 5, p-value = 0.1
autoplot(visitors)

lambda_visitors <- BoxCox.lambda(visitors)
ndiffs(visitors)
## [1] 1
nsdiffs(visitors)
## [1] 1
autoplot(diff(diff(BoxCox(visitors, lambda_visitors),lag = 12)))

Box.test(diff(diff(BoxCox(visitors, lambda_visitors),lag = 12)),type = "Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  diff(diff(BoxCox(visitors, lambda_visitors), lag = 12))
## X-squared = 21.804, df = 1, p-value = 3.02e-06
ggAcf(diff(diff(BoxCox(visitors, lambda_visitors),lag = 12)))

kpss.test(diff(diff(BoxCox(visitors, lambda_visitors),lag = 12)))
## Warning in kpss.test(diff(diff(BoxCox(visitors, lambda_visitors), lag =
## 12))): p-value greater than printed p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(diff(BoxCox(visitors, lambda_visitors), lag = 12))
## KPSS Level = 0.015833, Truncation lag parameter = 4, p-value = 0.1


4. For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
retail <- read_excel("/Users/hovig/Downloads/retail.xlsx", skip=1)
retail.ts <- ts(retail[,"A3349873A"], frequency=12, start=c(1982,4))
autoplot(retail.ts)

ndiffs(retail.ts)
## [1] 1
nsdiffs(retail.ts)
## [1] 1
kpss.test(diff(diff(BoxCox(retail.ts, BoxCox.lambda(retail.ts)),lag = 12)))
## Warning in kpss.test(diff(diff(BoxCox(retail.ts,
## BoxCox.lambda(retail.ts)), : p-value greater than printed p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(diff(BoxCox(retail.ts, BoxCox.lambda(retail.ts)), lag = 12))
## KPSS Level = 0.013817, Truncation lag parameter = 5, p-value = 0.1
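
As with the earlier series, it is worth plotting the transformed and differenced data as a visual check (an extra step beyond the original output):

lambda_retail <- BoxCox.lambda(retail.ts)
autoplot(diff(diff(BoxCox(retail.ts, lambda_retail), lag = 12)))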


5. Use R to simulate and plot some data from simple ARIMA models.
   a. Use the following R code to generate data from an AR(1) model with \(\phi_1 = 0.6\) and \(\sigma^2 = 1\). The process starts with \(y_1 = 0\).
   b. Produce a time plot for the series. How does the plot change as you change \(\phi_1\)?
   c. Write your own code to generate data from an MA(1) model with \(\theta_1 = 0.6\) and \(\sigma^2 = 1\).
   d. Produce a time plot for the series. How does the plot change as you change \(\theta_1\)?
   e. Generate data from an ARMA(1,1) model with \(\phi_1 = 0.6\), \(\theta_1 = 0.6\) and \(\sigma^2 = 1\).
   f. Generate data from an AR(2) model with \(\phi_1 = -0.8\), \(\phi_2 = 0.3\) and \(\sigma^2 = 1\). (Note that these parameters will give a non-stationary series.)
   g. Graph the latter two series and compare them.
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100){
   y[i] <- 0.6*y[i-1] + e[i]
}
ar1generator <- function(phi1){
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100){
    y[i] <- phi1*y[i-1] + e[i]
  }
  return(y)
}
autoplot(ar1generator(0.3), series = "0.3") +
  geom_line(size = 1, colour = "red") +
  autolayer(y, series = "0.6", size = 1) +
  autolayer(ar1generator(0.9), size = 1, series = "0.9") +
  ylab("AR(1) models") +
  guides(colour = guide_legend(title = "Phi1"))

ma1generator <- function(theta1){
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100){
    y[i] <- theta1*e[i-1] + e[i]
  }
  return(y)
}
autoplot(ma1generator(0.3), series = "0.3") +
  geom_line(size = 1, colour = "red") +
  autolayer(ma1generator(0.6), series = "0.6", size = 1) +
  autolayer(ma1generator(0.9), size = 1, series = "0.9") +
  ylab("MA(1) models") +
  guides(colour = guide_legend(title = "Theta1"))

y_arima.1.0.1 <- ts(numeric(50))
e <- rnorm(50)
for(i in 2:50){
   y_arima.1.0.1[i] <- 0.6*y_arima.1.0.1[i-1] + 0.6*e[i-1] + e[i]
}
y_arima.2.0.0 <- ts(numeric(50))
e <- rnorm(50)
for(i in 3:50){
   y_arima.2.0.0[i] <- -0.8*y_arima.2.0.0[i-1] + 0.3*y_arima.2.0.0[i-2] + e[i]
}
autoplot(y_arima.1.0.1, series = "ARMA(1, 1)") +
  autolayer(y_arima.2.0.0, series = "AR(2)") +
  ylab("y") +
  guides(colour = guide_legend(title = "Models"))

autoplot(y_arima.1.0.1)
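
The stationary models above can also be simulated with stats::arima.sim as a cross-check (a sketch added here; arima.sim rejects non-stationary AR coefficients, so the AR(2) case still needs the manual loop):

set.seed(42)
y_ar1  <- arima.sim(model = list(ar = 0.6), n = 100)            # AR(1)
y_ma1  <- arima.sim(model = list(ma = 0.6), n = 100)            # MA(1)
y_arma <- arima.sim(model = list(ar = 0.6, ma = 0.6), n = 100)  # ARMA(1,1)
autoplot(y_arma, series = "ARMA(1,1)") + autolayer(y_ar1, series = "AR(1)")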



6. Consider the number of women murdered each year (per 100,000 standard population) in the United States (data set wmurders).
   a. By studying appropriate graphs of the series in R, find an appropriate ARIMA(p,d,q) model for these data.
   b. Should you include a constant in the model? Explain.
   c. Write this model in terms of the backshift operator.
   d. Fit the model using R and examine the residuals. Is the model satisfactory?
   e. Forecast three times ahead. Check your forecasts by hand to make sure that you know how they have been calculated.
   f. Create a plot of the series with forecasts and prediction intervals for the next three periods shown.
   g. Does auto.arima give the same model you have chosen? If not, which model do you think is better?
autoplot(wmurders)

autoplot(diff(wmurders))

ndiffs(wmurders)
## [1] 2
autoplot(diff(wmurders, differences = 2))

kpss.test(diff(wmurders, differences = 2))
## Warning in kpss.test(diff(wmurders, differences = 2)): p-value greater than
## printed p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  diff(wmurders, differences = 2)
## KPSS Level = 0.045793, Truncation lag parameter = 3, p-value = 0.1
diff(wmurders, differences = 2) %>% ggtsdisplay()

wmurders_arima.0.2.2 <- Arima(wmurders, order = c(0, 2, 2))
checkresiduals(wmurders_arima.0.2.2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,2,2)
## Q* = 11.764, df = 8, p-value = 0.1621
## 
## Model df: 2.   Total lags used: 10
fc_wmurders_arima.0.2.2 <- forecast(wmurders_arima.0.2.2, h = 3)
fc_wmurders_arima.0.2.2$mean
## Time Series:
## Start = 2005 
## End = 2007 
## Frequency = 1 
## [1] 2.480525 2.374890 2.269256
fc_wmurders_arima.0.2.2$model
## Series: wmurders 
## ARIMA(0,2,2) 
## 
## Coefficients:
##           ma1     ma2
##       -1.0181  0.1470
## s.e.   0.1220  0.1156
## 
## sigma^2 estimated as 0.04702:  log likelihood=6.03
## AIC=-6.06   AICc=-5.57   BIC=-0.15

In terms of the backshift operator, the fitted ARIMA(0,2,2) model is

\((1 - B)^2 y_t = (1 - 1.0181B + 0.1470B^2)\varepsilon_t,\)

which expands to

\(y_t = 2y_{t-1} - y_{t-2} + \varepsilon_t - 1.0181\varepsilon_{t-1} + 0.1470\varepsilon_{t-2}.\)

years <- length(wmurders)
e <- fc_wmurders_arima.0.2.2$residuals
# One step ahead: future errors are set to 0, so only e_T and e_{T-1} remain
fc1 <- 2*wmurders[years] - wmurders[years - 1] - 1.0181*e[years] + 0.1470*e[years - 1]
# Two steps ahead: e_{T+1} = 0, but the theta_2 term still involves e_T
fc2 <- 2*fc1 - wmurders[years] + 0.1470*e[years]
# Three steps ahead: all remaining error terms are future errors, hence 0
fc3 <- 2*fc2 - fc1
c(fc1, fc2, fc3)
## [1] 2.480523 2.374887 2.269252
autoplot(fc_wmurders_arima.0.2.2)

fc_wmurders_autoarima <- forecast(auto.arima(wmurders), h = 3)
accuracy(fc_wmurders_arima.0.2.2)
##                      ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.0113461 0.2088162 0.1525773 -0.2403396 4.331729 0.9382785
##                     ACF1
## Training set -0.05094066
accuracy(fc_wmurders_autoarima)
##                       ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.01065956 0.2072523 0.1528734 -0.2149476 4.335214 0.9400996
##                    ACF1
## Training set 0.02176343
fc_wmurders_autoarima2 <- forecast(auto.arima(wmurders, stepwise = FALSE, approximation = FALSE), h = 3)
accuracy(fc_wmurders_autoarima2)
##                       ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.01336585 0.2016929 0.1531053 -0.3332051 4.387024 0.9415259
##                     ACF1
## Training set -0.03193856
Finally, compare the residual diagnostics of the chosen ARIMA(0,2,2) model with those of the model found by auto.arima:

checkresiduals(fc_wmurders_arima.0.2.2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,2,2)
## Q* = 11.764, df = 8, p-value = 0.1621
## 
## Model df: 2.   Total lags used: 10
checkresiduals(fc_wmurders_autoarima2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,2,3)
## Q* = 10.706, df = 7, p-value = 0.152
## 
## Model df: 3.   Total lags used: 10