Exercise 8.1: ACF plots

a. White noise

Based on the ACF plots, the first and third series appear to be white noise, as all the autocorrelations fall within the critical value thresholds. However, the second series appears to be borderline, as the autocorrelations at lags 2 and 6 fall slightly outside the critical value thresholds.

The main differences in the ACF plots relate to the position of the critical value thresholds and the magnitude of the autocorrelations. As the length of the time series increases from 36 to 360 to 1,000, the autocorrelations appear to diminish and the critical values appear to shrink.
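
Although the observations above are based on the figures in the text, a quick simulation (arbitrary seed; not the textbook series) illustrates the same qualitative behaviour at the three series lengths:

# simulated white noise of lengths 36, 360, and 1,000
set.seed(123)
ggAcf(ts(rnorm(36))) + ggtitle("White noise, T = 36")
ggAcf(ts(rnorm(360))) + ggtitle("White noise, T = 360")
ggAcf(ts(rnorm(1000))) + ggtitle("White noise, T = 1000")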

b. Critical values

From the text, the critical value thresholds are defined as \(\pm 1.96 / \sqrt{T}\), where \(T\) is the length of the time series. So the critical values shrink in proportion to \(1 / \sqrt{T}\) as \(T\) increases, which explains why the thresholds in the ACF plots lie closer to the x-axis for longer time series.

For a white noise time series, we would expect all autocorrelations to converge to zero for a large enough dataset, as over the long run independent observations in a random process should appear uncorrelated. However, over the short run, there will be fluctuations in the autocorrelations arising from random patterns in a small dataset, which explains why the autocorrelations appear larger for the shorter time series.
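
A quick numeric check of this scaling, using the three series lengths from the exercise:

# critical-value thresholds +/- 1.96 / sqrt(T)
1.96 / sqrt(c(36, 360, 1000))
# roughly 0.33, 0.10, and 0.06, matching the narrowing bands in the ACF plots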

Exercise 8.2: ibmclose time series

We use the ggtsdisplay function to plot the time series together with its ACF and PACF.

ggtsdisplay(ibmclose, 
            main = "IBM Closing Price", 
            xlab = "Day") 

We note several observations from the plots:

  • The time series has changing trend and cyclic patterns, so that the local properties of the time series (e.g., the mean or variance over a time window) depend on where they are measured. In other words, the time series is non-stationary.

  • The ACF plot demonstrates that the time series has strong autocorrelations for lags going out past 25 days, although the magnitude of the autocorrelations declines as the lag increases.

  • Finally, the PACF plot indicates that after accounting for lower-lagged autocorrelations, no autocorrelations after lag 1 are statistically significant.

All of this suggests that we should apply first-order differencing to make the time series stationary. We could also use the ndiffs function to determine the required order of differencing, which again indicates 1.

cat("Number of differences needed: ", ndiffs(ibmclose)) 
## Number of differences needed:  1

After applying first-order differencing, we again plot the time series along with its ACF and PACF plots. The differenced series now appears to be stationary.

ggtsdisplay(diff(ibmclose), 
            main = "IBM Closing Price", 
            xlab = "Day")

In fact, the time series appears to be white noise, as we can confirm by using the Ljung-Box test.

Box.test(diff(ibmclose), lag = 10, type = "Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  diff(ibmclose)
## X-squared = 14.064, df = 10, p-value = 0.1701

Exercise 8.3: stationary data

For each of the time series, we (a) decide whether the Box-Cox transformation is needed to stabilize the variance, and (b) use the nsdiffs and ndiffs functions to determine the required order of differencing. Then we confirm that the transformed / differenced time series appear to be stationary.
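
As a sketch of this workflow (a hypothetical helper of my own, not required by the exercise), the per-series steps below could be wrapped in a small function:

# hypothetical helper wrapping the steps used for each series below:
# pick a Box-Cox lambda, apply seasonal differencing if nsdiffs() calls
# for it, then ordinary differencing per ndiffs(), and display the result
make_stationary <- function(x) {
  lambda <- BoxCox.lambda(x)
  cat("Box-Cox lambda:", lambda, "\n")
  z <- BoxCox(x, lambda)
  D <- if (frequency(x) > 1) nsdiffs(z) else 0
  if (D > 0) z <- diff(z, lag = frequency(x), differences = D)
  d <- ndiffs(z)
  if (d > 0) z <- diff(z, differences = d)
  cat("Seasonal differences:", D, "- ordinary differences:", d, "\n")
  ggtsdisplay(z)
  invisible(z)
}
# e.g., make_stationary(usnetelec)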

a. usnetelec

Annual time series.

  • No Box-Cox transformation needed
  • No seasonal differencing needed
  • First-order differencing
autoplot(usnetelec)

cat("Number of differences needed: " , ndiffs(usnetelec)) 
## Number of differences needed:  1
diff(usnetelec) %>% ggtsdisplay()

b. usgdp

Quarterly time series.

  • Box-Cox transformation: \(\lambda = 0.37\)
  • No seasonal differencing needed
  • First-order differencing
autoplot(usgdp)

cat("Box-Cox transformation: lambda =", BoxCox.lambda(usgdp))
## Box-Cox transformation: lambda = 0.366352
tstransf <- BoxCox(usgdp, lambda = "auto")
cat("Number of seasonal differences needed: ", nsdiffs(tstransf))
## Number of seasonal differences needed:  0
cat("Number of differences needed: " , ndiffs(tstransf)) 
## Number of differences needed:  1
tstransf %>% diff() %>% ggtsdisplay()

c. mcopper

Monthly time series.

  • Box-Cox transformation: \(\lambda = 0.19\)
  • No seasonal differencing needed
  • First-order differencing
autoplot(mcopper)

cat("Box-Cox transformation: lambda =", BoxCox.lambda(mcopper))
## Box-Cox transformation: lambda = 0.1919047
tstransf <- BoxCox(mcopper, lambda = "auto")
cat("Number of seasonal differences needed: ", nsdiffs(tstransf))
## Number of seasonal differences needed:  0
cat("Number of differences needed: " , ndiffs(tstransf)) 
## Number of differences needed:  1
tstransf %>% diff() %>% ggtsdisplay()

d. enplanements

Monthly time series.

  • Box-Cox transformation: \(\lambda = -0.23\)
  • Seasonal differencing with 12-month lag
  • First-order differencing
autoplot(enplanements)

cat("Box-Cox transformation: lambda =", BoxCox.lambda(enplanements))
## Box-Cox transformation: lambda = -0.2269461
tstransf <- BoxCox(enplanements, lambda = "auto")
cat("Number of seasonal differences needed: ", nsdiffs(tstransf))
## Number of seasonal differences needed:  1
cat("Number of differences needed: " , ndiffs(diff(tstransf, lag = 12))) 
## Number of differences needed:  1
tstransf %>% diff(lag = 12) %>% diff() %>%
  ggtsdisplay()

e. visitors

Monthly time series.

  • Box-Cox transformation: \(\lambda = 0.28\)
  • Seasonal differencing with 12-month lag
  • First-order differencing
autoplot(visitors)

cat("Box-Cox transformation: lambda =", BoxCox.lambda(visitors))
## Box-Cox transformation: lambda = 0.2775249
tstransf <- BoxCox(visitors, lambda = "auto")
cat("Number of seasonal differences needed: ", nsdiffs(tstransf))
## Number of seasonal differences needed:  1
cat("Number of differences needed: " , ndiffs(diff(tstransf, lag = 12))) 
## Number of differences needed:  1
tstransf %>% diff(lag = 12) %>% diff() %>%
  ggtsdisplay()

Exercise 8.5: Australian retail data

For the Australian retail data, I chose the 7th variable column in the retail dataset, which relates to “Turnover - New South Wales - Hardware, building and garden supplies retailing”.

# local disk
path <- "../Week2/retail.xlsx"
# github
#path <- "https://github.com/kecbenson/DATA624/blob/master/retail.xlsx"
retaildata <- readxl::read_excel(path, skip=1)
# column 8 corresponds to the 7th data series (column 1 is the date)
colID <- colnames(retaildata)[8]
myts <- ts(retaildata[ , colID], frequency=12, start=c(1982,4))

autoplot(myts) + 
  labs(title = "Turnover - NSW - Hardware, building & garden supplies retailing", 
       subtitle = "Monthly 1982/04 - 2013/12", 
       x = "Year", y = "")

We apply the following steps to the monthly time series in order to obtain stationary data:

  • Box-Cox transformation: \(\lambda = 0.92\) (although this is close enough to 1 that it’s likely unnecessary)
  • Seasonal differencing with 12-month lag
  • First-order differencing

Note that although the ndiffs function indicates that no further differencing is needed after the seasonal differencing, the plot of the seasonally differenced series still suggests applying first-order differencing; a cross-check with alternative unit-root tests is sketched after the output below.

cat("Box-Cox transformation: lambda =", BoxCox.lambda(myts))
## Box-Cox transformation: lambda = 0.9165544
tstransf <- BoxCox(myts, lambda = "auto")
cat("Number of seasonal differences needed: ", nsdiffs(tstransf))
## Number of seasonal differences needed:  1
cat("Number of differences needed: " , ndiffs(diff(tstransf, lag = 12))) 
## Number of differences needed:  0
tstransf %>% diff(lag = 12) %>% diff() %>%
  ggtsdisplay()
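
As a cross-check on this conflict (my own addition; output not shown), we can ask the ndiffs function to use a different unit-root test, since it defaults to the KPSS test and the ADF or PP tests can disagree on borderline series:

# compare unit-root tests on the seasonally differenced series
seas_diff <- diff(tstransf, lag = 12)
ndiffs(seas_diff, test = "adf")
ndiffs(seas_diff, test = "pp")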

Exercise 8.6: simulated ARIMA time series

Note that for comparability across the time series, we will use the same sample errors drawn from the standard normal distribution \(N(\mu = 0, \sigma^2 = 1)\). We also set the random number seed for reproducibility.

set.seed(314)
e <- rnorm(100)

a. AR(1) model

We generate data for the AR(1) model with the parameter \(\phi\) ranging from -1 to 1:

\[y_t = \phi y_{t-1} + \epsilon_t\]

phi <- seq(-1, 1, 0.2)
y <- ts(matrix(0, nrow = 100, ncol = length(phi)))
colnames(y) <- paste0("phi = ", phi)
# loop over phi, then time
for (j in 1:length(phi)) {
  for (i in 2:100) {
    y[i, j] <- phi[j] * y[i-1, j] + e[i] 
  }
}
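
Incidentally, for the stationary cases (\(|\phi| < 1\)) the same kind of series could be generated with the arima.sim function; the manual loop above is used because arima.sim rejects the boundary values \(\phi = \pm 1\). A minimal sketch (the trajectory will not match the loop exactly, since arima.sim uses its own burn-in and innovations):

# sketch only: arima.sim() handles stationary AR coefficients
set.seed(314)
autoplot(arima.sim(model = list(ar = 0.6), n = 100)) + 
  labs(title = "AR(1) with phi = 0.6 via arima.sim()")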

b. AR(1) plot

The plot below displays the time series generated from the AR(1) model for \(\phi \in \{-1, -0.8, -0.6, \cdots, 0.6, 0.8, 1\}\), while holding the sample errors \(\epsilon_t\) constant across scenarios.

Some observations:

  • For \(\phi = -1\), the time series appears unstable, growing in magnitude over time with rapid oscillations
  • As \(\phi\) increases from -0.8 to -0.2, the magnitude of the time series seems to shrink, and the oscillations appear to dampen
  • For \(\phi = 0\), the time series is white noise (\(y_t = \epsilon_t\))
  • As \(\phi\) increases from 0.2 to 0.8, the curve appears to smooth out, with fewer sudden changes of direction; also the magnitude of the time series appears to grow
  • For \(\phi = 1\), the time series is a random walk (\(y_t = y_{t-1} + \epsilon_t\)), which exhibits long periods of positive and negative trends with larger magnitude.
autoplot(y, facets=TRUE) + 
  labs(title = "AR(1) model for phi from -1 to +1")

c. MA(1) model

We generate data for the MA(1) model with the parameter \(\theta\) ranging from -1 to 1:

\[y_t = \epsilon_t + \theta \epsilon_{t-1}\]

theta <- seq(-1, 1, 0.2)
y <- ts(matrix(0, nrow = 100, ncol = length(theta)))
colnames(y) <- paste0("theta=", theta)
# loop over theta, then time
for (j in 1:length(theta)) {
  for (i in 2:100) {
    y[i, j] <- e[i] + theta[j] * e[i-1] 
  }
}

d. MA(1) plot

The plot below displays the time series generated from the MA(1) model for \(\theta \in \{-1, -0.8, -0.6, \cdots, 0.6, 0.8, 1\}\), while holding the sample errors \(\epsilon_t\) constant across scenarios.

Some observations:

  • As \(\theta\) increases from -1 to 0, the curve appears to smooth out as sudden changes of direction seem to dampen
  • For \(\theta = 0\), the time series is white noise (\(y_t = \epsilon_t\)) and is identical to the case of \(\phi = 0\) in the AR(1) model above
  • As \(\theta\) increases from 0 to 1, the magnitude of the time series increases, and trend lines become more pronounced while sudden changes of direction appear to dampen.
autoplot(y, facets=TRUE) + 
  labs(title = "MA(1) model for theta from -1 to +1")

e. ARMA(1,1) model

We generate data for the ARMA(1,1) model with the parameters \(\phi_1 = 0.6\) and \(\theta_1 = 0.6\):

\[y_t = \phi_1 y_{t-1} + \theta_1 \epsilon_{t-1} + \epsilon_t\]

phi <- 0.6
theta <- 0.6
y1 <- ts(numeric(100))
for (i in 2:100) { 
  y1[i] <- phi * y1[i-1] + theta * e[i-1] + e[i] 
}

f. AR(2) model

We generate data for the AR(2) model with the parameters \(\phi_1 = -0.8\) and \(\phi_2 = 0.3\):

\[y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \epsilon_t\]

phi1 <- -0.8
phi2 <- 0.3
y2 <- ts(numeric(100))
for (i in 3:100) { 
  y2[i] <- phi1 * y2[i-1] + phi2 * y2[i-2] + e[i] 
}

g. Time plot

The plot below displays the time series generated from the ARMA(1,1) model (with \(\phi_1 = 0.6\) and \(\theta_1 = 0.6\)) and the AR(2) model (with \(\phi_1 = -0.8\) and \(\phi_2 = 0.3\)), using the same sample errors \(\epsilon_t \sim N(\mu = 0, \sigma^2 = 1)\) in both models.

Some observations:

  • The ARMA(1,1) time series almost appears to be a real-life time series with mean reversion; i.e., there are periods where cyclical patterns are evident, but trends tend to revert to the mean over time.
  • The AR(2) time series is clearly unstable: rapid oscillations occur and the amplitude grows over time without limit (a root check confirming this is sketched after the plot code below).
cbind("ARMA(1,1)" = y1, "AR(2)" = y2) %>% 
  autoplot(facets = TRUE) + 
  labs(title = "ARMA(1,1) and AR(2) models")

Exercise 8.7: wmurders time series

This is an annual time series of the number of women murdered each year in the U.S. (per 100,000 standard population), from 1950 through 2004.

autoplot(wmurders)

a. ARIMA(p,d,q) model

Based on the time plot above and using the ndiffs function, we apply the following steps to the data:

  • Box-Cox transformation: \(\lambda = -0.10\)
  • Second-order differencing

The resulting time series appears stationary.

cat("Box-Cox transformation: lambda =", BoxCox.lambda(wmurders))
## Box-Cox transformation: lambda = -0.09529835
tstransf <- BoxCox(wmurders, lambda = "auto")
cat("Number of differences needed: ", ndiffs(tstransf)) 
## Number of differences needed:  2
tstransf %>% diff() %>% diff() %>%
  ggtsdisplay()

Based on the spikes in the ACF plot at lags 1 and 2 and in the PACF plot at lags 1 and 2, we consider the following ARIMA models:

  • ARIMA(2,2,0)
  • ARIMA(0,2,2)
  • ARIMA(1,2,1)

b. Constant

From the text, adding a constant to the model induces a straight-line trend in the long-run forecasts when \(d = 1\) and a quadratic trend when \(d = 2\). Since our candidate models all have \(d = 2\) and the data give no reason to expect a quadratic trend, we omit the constant.
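
To see where the quadratic trend comes from when \(d = 2\) (a quick illustration of my own, ignoring the ARMA terms and the errors):

\[(1 - B)^2 y_t = c \quad \implies \quad y_t = 2 y_{t-1} - y_{t-2} + c,\]

and the particular solution of this recursion grows as \(\tfrac{c}{2} t^2\), i.e., a quadratic in \(t\).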

c. Backshift notation

Using backshift notation, the general non-seasonal ARIMA model ARIMA(p,d,q) can be written (from equation 8.4 in the text) as:

\[\left(1 - \phi_1 B - \cdots - \phi_p B^p \right) (1 - B)^d y_t = c + \left(1 + \theta_1 B + \cdots + \theta_q B^q \right) \epsilon_t\]

Since we omit the constant \(c\), we express our three candidate models as:

  • \(\left(1 - \phi_1 B - \phi_2 B^2 \right) (1 - B)^2 y_t = \epsilon_t\)
  • \((1 - B)^2 y_t = \left(1 + \theta_1 B + \theta_2 B^2 \right) \epsilon_t\)
  • \(\left(1 - \phi_1 B \right) (1 - B)^2 y_t = \left(1 + \theta_1 B \right) \epsilon_t\)

d. Model fitting

We try fitting the three candidate models and examine their residuals. We also consider two models using the auto.arima function.

  • Fit1: ARIMA(2,2,0)

The residuals of this model are unsatisfactory, as the Ljung-Box test rejects the hypothesis that they resemble white noise (p = 0.021).

(fit1 <- Arima(wmurders, 
               lambda = "auto", 
               order = c(2,2,0), 
               include.constant = FALSE))
## Series: wmurders 
## ARIMA(2,2,0) 
## Box Cox transformation: lambda= -0.09529835 
## 
## Coefficients:
##           ar1      ar2
##       -0.8786  -0.2906
## s.e.   0.1330   0.1341
## 
## sigma^2 estimated as 0.003242:  log likelihood=77.31
## AIC=-148.61   AICc=-148.12   BIC=-142.7
checkresiduals(fit1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,2,0)
## Q* = 17.975, df = 8, p-value = 0.02141
## 
## Model df: 2.   Total lags used: 10
  • Fit2: ARIMA(0,2,2)

The residuals appear consistent with white noise (Ljung-Box p = 0.26).

(fit2 <- Arima(wmurders, 
               lambda = "auto", 
               order = c(0,2,2), 
               include.constant = FALSE))
## Series: wmurders 
## ARIMA(0,2,2) 
## Box Cox transformation: lambda= -0.09529835 
## 
## Coefficients:
##           ma1     ma2
##       -1.0375  0.1950
## s.e.   0.1246  0.1196
## 
## sigma^2 estimated as 0.002908:  log likelihood=79.85
## AIC=-153.69   AICc=-153.2   BIC=-147.78
checkresiduals(fit2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,2,2)
## Q* = 10.119, df = 8, p-value = 0.2568
## 
## Model df: 2.   Total lags used: 10
  • Fit3: ARIMA(1,2,1)

The residuals appear consistent with white noise (Ljung-Box p = 0.24).

(fit3 <- Arima(wmurders, 
               lambda = "auto", 
               order = c(1,2,1), 
               include.constant = FALSE))
## Series: wmurders 
## ARIMA(1,2,1) 
## Box Cox transformation: lambda= -0.09529835 
## 
## Coefficients:
##           ar1     ma1
##       -0.3006  -0.786
## s.e.   0.1529   0.119
## 
## sigma^2 estimated as 0.002851:  log likelihood=80.37
## AIC=-154.74   AICc=-154.25   BIC=-148.83
checkresiduals(fit3)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,2,1)
## Q* = 10.378, df = 8, p-value = 0.2395
## 
## Model df: 2.   Total lags used: 10
  • Fit4: model from auto.arima with quick search

Using auto.arima yields the ARIMA(1,2,1) model, which is the same as fit3.

# try auto method
(fit4 <- auto.arima(wmurders, 
                    lambda = "auto", 
                    seasonal = FALSE, 
                    allowdrift = FALSE, 
                    allowmean = FALSE))
## Series: wmurders 
## ARIMA(1,2,1) 
## Box Cox transformation: lambda= -0.09529835 
## 
## Coefficients:
##           ar1     ma1
##       -0.3006  -0.786
## s.e.   0.1529   0.119
## 
## sigma^2 estimated as 0.002851:  log likelihood=80.37
## AIC=-154.74   AICc=-154.25   BIC=-148.83
  • Fit5: model from auto.arima with long search

Here we set the arguments stepwise and approximation to FALSE. Even so, we still arrive at the ARIMA(1,2,1) model.

# try auto method with no shortcuts
(fit5 <- auto.arima(wmurders, 
                    lambda = "auto", 
                    seasonal = FALSE, 
                    allowdrift = FALSE, 
                    allowmean = FALSE, 
                    stepwise = FALSE, 
                    approximation = FALSE))
## Series: wmurders 
## ARIMA(1,2,1) 
## Box Cox transformation: lambda= -0.09529835 
## 
## Coefficients:
##           ar1     ma1
##       -0.3006  -0.786
## s.e.   0.1529   0.119
## 
## sigma^2 estimated as 0.002851:  log likelihood=80.37
## AIC=-154.74   AICc=-154.25   BIC=-148.83

Let’s compare model metrics (residual standard error \(\sigma\), AIC, AICc, and BIC) for our candidate models. Based on the metrics below, we choose fit3 = ARIMA(1,2,1) as the best model. The residual plots given above for fit3 appear satisfactory.

fitlist <- list(fit1, fit2, fit3)
metrics <- NULL 
for (j in 1:length(fitlist)) {
  temp <- fitlist[[j]]
  metrics <- rbind(metrics, 
                   c(sqrt(temp$sigma2),
                     temp$aic,
                     temp$aicc,
                     temp$bic))
}
row.names(metrics) <- paste0("fit", 1:length(fitlist)) 
metrics %>% kable(digits = c(5, 2, 2, 2), 
                  col.names = c("Sigma (RSE)", "AIC", 
                                "AICc", "BIC"), 
                  caption = "Comparison of model metrics")
Comparison of model metrics

       Sigma (RSE)      AIC     AICc      BIC
fit1       0.05694  -148.61  -148.12  -142.70
fit2       0.05392  -153.69  -153.20  -147.78
fit3       0.05339  -154.74  -154.25  -148.83

e. 3-year forecast

3-year forecasts from the ARIMA(1,2,1) model are given in the forecast output below.

fit3 %>% forecast(h = 3) 
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005       2.487322 2.309061 2.680766 2.220398 2.789794
## 2006       2.398638 2.169841 2.654125 2.058488 2.801324
## 2007       2.310805 2.027049 2.638650 1.892470 2.832570

To calculate the forecasts by hand, we write the expression for the ARIMA(1,2,1) model without constant as follows:

\[
\begin{aligned}
& \left(1 - \phi_1 B \right) (1 - B)^2 y_t = \left(1 + \theta_1 B \right) \epsilon_t \\
\implies\; & \left(1 - \phi_1 B \right) (1 - 2B + B^2)\, y_t = \left(1 + \theta_1 B \right) \epsilon_t \\
\implies\; & \left(1 - (\phi_1 + 2) B + (2 \phi_1 + 1) B^2 - \phi_1 B^3 \right) y_t = \left(1 + \theta_1 B \right) \epsilon_t \\
\implies\; & y_t = (\phi_1 + 2) y_{t-1} - (2 \phi_1 + 1) y_{t-2} + \phi_1 y_{t-3} + \epsilon_t + \theta_1 \epsilon_{t-1}
\end{aligned}
\]

Now the forecast as of time \(T\) for future periods \(T+h\) can be written as:

\[y_{T+h} = (\phi_1 + 2) y_{T+h-1} - (2 \phi_1 + 1) y_{T+h-2} + \phi_1 y_{T+h-3} + \epsilon_{T+h} + \theta_1 \epsilon_{T+h-1} \]

To use this expression for the point forecasts, we replace future observations with their forecasts (\(y_{T+h} = \hat{y}_{T+h}\)), set future errors to zero (\(\epsilon_{T+h} = 0\)), and replace past errors with model residuals (\(\epsilon_{T-j} = e_{T-j}\)). Now the first 3 point forecasts are given by:

\[
\begin{aligned}
\hat{y}_{T+1} &= (\phi_1 + 2) y_{T} - (2 \phi_1 + 1) y_{T-1} + \phi_1 y_{T-2} + \theta_1 e_{T} \\
\hat{y}_{T+2} &= (\phi_1 + 2) \hat{y}_{T+1} - (2 \phi_1 + 1) y_{T} + \phi_1 y_{T-1} \\
\hat{y}_{T+3} &= (\phi_1 + 2) \hat{y}_{T+2} - (2 \phi_1 + 1) \hat{y}_{T+1} + \phi_1 y_{T}
\end{aligned}
\]

We confirm below that we arrive at the same point forecasts using our manual calculation. Note that since we applied a Box-Cox transformation to the original time series, we need to transform the observations and inverse-transform the forecasts, as appropriate.

phi <- coef(fit3)[1]
theta <- coef(fit3)[2]
n <- length(wmurders)
lambda <- BoxCox.lambda(wmurders)
y0 <- BoxCox(wmurders, lambda)
y1 <- (phi + 2) * y0[n] - (2 * phi + 1) * y0[n-1] + phi * y0[n-2] + theta * resid(fit3)[n] 
y2 <- (phi + 2) * y1 - (2 * phi + 1) * y0[n] + phi * y0[n-1] 
y3 <- (phi + 2) * y2 - (2 * phi + 1) * y1 + phi * y0[n]
df <- data.frame(Year = 2005:2007, Forecast = InvBoxCox(c(y1, y2, y3), lambda))
df 
##   Year Forecast
## 1 2005 2.487322
## 2 2006 2.398638
## 3 2007 2.310805

f. Plot forecast

We plot the time series with the 3-year forecast and prediction intervals below.

fit3 %>% forecast(h = 3) %>% autoplot() + 
  labs(title = "wmurders time series with 3-year forecast", 
       subtitle = "Forecast from ARIMA(1,2,1)", 
       x = "Year", y = "") + 
  geom_vline(xintercept = 2005, lty = 3, lwd = 1) 

We also show a close-up of the forecast below.

fit3 %>% forecast(h = 3) %>% autoplot() + 
  labs(title = "Close-up of wmurders time series with 3-year forecast", 
       subtitle = "Forecast from ARIMA(1,2,1)", 
       x = "Year", y = "") + 
  geom_vline(xintercept = 2005, lty = 3, lwd = 1) + 
  coord_cartesian(xlim = c(1995, 2007), 
                  ylim = c(1.5, 4.0))

g. auto.arima function

Yes, the auto.arima function gives the same model as fit3 = ARIMA(1,2,1). It arrives at the same model whether using the default search method or using the long search method (with stepwise and approximation set to FALSE).