Packages:

library(fpp3)
library(RColorBrewer)
library(knitr)
library(pracma)
library(cowplot)

Exercise 9.1:

Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

The differences in these figures can be explained by the increasing sample size (n). When n = 36, the sample size is relatively small, so the absolute value of autocorrelation is higher in the leftmost figure than in the other figures. As n increases, the absolute value of the autocorrelation gets smaller and ultimately converges toward 0. All of these figures do indicate that the data are white noise, as even in the leftmost plot, there are no spikes outside the blue bounds of the ACF plot.

The critical values sit at different distances from zero because they depend on the sample size: for white noise, the sampling variance of each sample autocorrelation is approximately \(1/n\), so the bounds are drawn at \(\pm 1.96/\sqrt{n}\). When n is small (36), the estimated autocorrelations of even truly uncorrelated data vary widely around zero, so the bounds must be placed further from zero to avoid false positives. As n increases (360, 1,000), the estimates stabilize and the bounds tighten toward zero, so a smaller spike in absolute autocorrelation is enough to distinguish genuinely autocorrelated data from white noise.
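As a quick supplementary check (not taken from the figures themselves), the bounds for each sample size can be computed directly from the \(\pm 1.96/\sqrt{n}\) approximation:

# Approximate 95% bounds for the ACF of a white noise series of length n
wn_bounds <- tibble(n = c(36, 360, 1000),
                    bound = 1.96 / sqrt(n))
kable(wn_bounds, format = "simple", digits = 3)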

Exercise 9.2:

A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

data(gafa_stock)
keep <- c("Symbol", "Date", "Close")
amazon <- gafa_stock |>
    select(all_of(keep)) |>
    filter(Symbol == "AMZN") |>
    mutate(day = row_number()) |>
    update_tsibble(index = day, regular = TRUE)
p1 <- amazon |>
    autoplot(Close) +
    labs(title = "Amazon daily closing stock price")
p2a <- amazon |>
    ACF(Close) |>
    autoplot() +
    labs(subtitle = "ACF: Amazon daily closing stock price")
p2b <- amazon |>
    PACF(Close) |>
    autoplot() +
    labs(subtitle = "PACF: Amazon daily closing stock price")
p1

The time plot of the Amazon daily closing stock price demonstrates the first piece of evidence that the data are non-stationary: there is a clear upward trend visible in the data.

p2 <- plot_grid(p2a, p2b)
p2

The ACF plot of the Amazon daily closing stock price shows two more pieces of evidence indicating the data are non-stationary: the ACF decreases very slowly for this time series, and the value of \(r_1\) is large and positive.

Beyond confirming that \(r_1\) is large and positive (the PACF spike at lag 1 is close to one), it's unclear to us what additional evidence of non-stationarity the PACF provides. Our understanding is that the PACF is mainly useful for selecting values of p or q in an ARIMA model once the data are stationary (i.e. after any necessary differencing).
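As an optional check beyond what the exercise asks for, a unit-root test gives more formal evidence that the series should be differenced. A minimal sketch using the KPSS test and unitroot_ndiffs from feasts (loaded with fpp3):

amazon |>
    features(Close, unitroot_kpss)   # null hypothesis: stationary; a small p-value suggests differencing
amazon |>
    features(Close, unitroot_ndiffs) # suggested number of first differences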

Exercise 9.3:

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

keep <- c("Country", "Year", "GDP")
turkey_gdp <- global_economy |>
    select(all_of(keep)) |>
    filter(Country == "Turkey")
p3 <- turkey_gdp |>
    autoplot(GDP) +
    labs(title = "Annual Turkish GDP in USD")
p3

lam <- turkey_gdp |>
    features(GDP, features = guerrero) |>
    pull(lambda_guerrero)
lam <- round(lam, 2)

The ideal lambda proposed for a Box-Cox transformation is \(\approx 0.16\), so we will round down and use a log transformation here.
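For reference, the Box-Cox transformation used throughout this assignment (in its classical form for positive data) is \(w_t = \ln(y_t)\) when \(\lambda = 0\), and \(w_t = (y_t^{\lambda} - 1)/\lambda\) otherwise. A \(\lambda\) near 0 therefore justifies a plain log transformation, while \(\lambda = 0.25\) corresponds (up to scaling and shifting, which do not affect the shape) to a fourth-root transformation.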

turkey_gdp <- turkey_gdp |>
    mutate(Log_GDP = log(GDP))
p4 <- turkey_gdp |>
    autoplot(Log_GDP) +
    labs(title = "Annual Turkish log(GDP) in USD")
p4

There is no evidence of seasonality in the plot of log(GDP) (annual data cannot exhibit seasonality anyway), but we will confirm that the appropriate number of seasonal differences is zero. Then we will determine the appropriate number of first differences for the data.

nsdiffs <- turkey_gdp |>
    features(Log_GDP, unitroot_nsdiffs) |>
    select(-Country)
colnames(nsdiffs) <- "Appropriate N: Seasonal Differences"
ndiffs <- turkey_gdp |>
    features(Log_GDP, unitroot_ndiffs) |>
    select(-Country)
colnames(ndiffs) <- "Appropriate N: First Differences"
diffs <- cbind(nsdiffs, ndiffs)
kable(diffs, format = "simple")
Appropriate N: Seasonal Differences   Appropriate N: First Differences
------------------------------------  ---------------------------------
                                  0                                   1

The appropriate number of seasonal differences is in fact zero, and the appropriate number of first differences is one. Now we can difference the data and confirm we’ve made it stationary:
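For annual data, first differencing just means taking period-to-period changes of the logged series, \(y'_t = \ln(GDP_t) - \ln(GDP_{t-1})\), which is approximately the year-on-year growth rate of GDP.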

p5 <- turkey_gdp |>
    ACF(difference(Log_GDP)) |>
    autoplot() +
    labs(subtitle = "ACF: Annual Changes in Turkish log(GDP) in USD")
p5

The ACF of the differenced log(GDP) data looks just like that of a white noise series, as there are no spikes outside the blue bounds.

turkey_gdp |>
    mutate(diff_Log_GDP = difference(Log_GDP)) |>
    features(diff_Log_GDP, ljung_box, lag = 10)
## # A tibble: 1 × 3
##   Country lb_stat lb_pvalue
##   <fct>     <dbl>     <dbl>
## 1 Turkey     4.12     0.942

And the large p-value from the Ljung-Box test means we cannot distinguish the differenced log(GDP) series from white noise, which is consistent with it being stationary.
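For reference, the Ljung-Box statistic being tested here is \(Q^* = T(T+2)\sum_{k=1}^{\ell} r_k^2/(T-k)\), where \(T\) is the length of the series, \(\ell = 10\) is the number of lags used, and \(r_k\) is the lag-\(k\) sample autocorrelation. Large values of \(Q^*\) (small p-values) indicate autocorrelations that are too large to have come from white noise.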

data(aus_accommodation)
keep <- c("Date", "State", "Takings")
tasmania_takings <- aus_accommodation |>
    select(all_of(keep)) |>
    filter(State == "Tasmania")
p6 <- tasmania_takings |>
    autoplot(Takings) +
    labs(title = "Quarterly Takings in Tasmania")
p6

lam <- tasmania_takings |>
    features(Takings, features = guerrero) |>
    pull(lambda_guerrero)
lam <- round(lam, 3)

The ideal lambda proposed for a Box-Cox transformation is \(\approx 0.002\), so we will round down and use a log transformation again.

tasmania_takings <- tasmania_takings |>
    mutate(Log_Takings = log(Takings))
p7 <- tasmania_takings |>
    autoplot(Log_Takings) +
    labs(title = "Quarterly log(Takings) in Tasmania")
p7

There is strong evidence of seasonality in the plot of log(Takings), so we will determine the appropriate number of seasonal differences, then the appropriate number of first differences for the data.

nsdiffs <- tasmania_takings |>
    features(Log_Takings, unitroot_nsdiffs) |>
    select(-State)
colnames(nsdiffs) <- "Appropriate N: Seasonal Differences"
ndiffs <- tasmania_takings |>
    features(Log_Takings, unitroot_ndiffs) |>
    select(-State)
colnames(ndiffs) <- "Appropriate N: First Differences"
diffs <- cbind(nsdiffs, ndiffs)
kable(diffs, format = "simple")
Appropriate N: Seasonal Differences   Appropriate N: First Differences
------------------------------------  ---------------------------------
                                  1                                   1

The appropriate number of seasonal differences is one, and the appropriate number of first differences is one. Now we can doubly difference the data and confirm we’ve made it stationary:
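Here, doubly differencing means taking the lag-4 (annual) seasonal difference and then a first difference of the result: \(y''_t = (y_t - y_{t-4}) - (y_{t-1} - y_{t-5})\), where \(y_t = \ln(\text{Takings}_t)\).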

p8 <- tasmania_takings |>
    transmute(
        `Annual Change log(Takings)` = difference(Log_Takings, 4),
        `Doubly Diff. log(Takings)` = difference(difference(
            Log_Takings, 4), 1)) |>
    pivot_longer(-Date, names_to="Type", values_to="Log_Takings") |>
    mutate(Type = factor(Type, levels = c("Annual Change log(Takings)",
                                          "Doubly Diff. log(Takings)"))) |>
    ggplot(aes(x = Date, y = Log_Takings)) +
    geom_line() +
    facet_grid(vars(Type), scales = "free_y") +
    labs(title = "Tasmanian Takings", y = NULL)
p8 

The data now appear stationary.

data(souvenirs)
p9 <- souvenirs |>
    autoplot(Sales) +
    labs(title = "Monthly Souvenir Sales")
p9

lam <- souvenirs |>
    features(Sales, features = guerrero) |>
    pull(lambda_guerrero)
lam <- round(lam, 3)

The ideal lambda proposed for a Box-Cox transformation is again \(\approx 0.002\), so we will round down and use a log transformation yet again.

souvenirs <- souvenirs |>
    mutate(Log_Sales = log(Sales))
p10 <- souvenirs |>
    autoplot(Log_Sales) +
    labs(title = "Monthly Souvenir log(Sales)")
p10

There is strong evidence of seasonality in the plot of log(Sales), so we will determine the appropriate number of seasonal differences, then the appropriate number of first differences for the data.

nsdiffs <- souvenirs |>
    features(Log_Sales, unitroot_nsdiffs)
colnames(nsdiffs) <- "Appropriate N: Seasonal Differences"
ndiffs <- souvenirs |>
    features(Log_Sales, unitroot_ndiffs)
colnames(ndiffs) <- "Appropriate N: First Differences"
diffs <- cbind(nsdiffs, ndiffs)
kable(diffs, format = "simple")
Appropriate N: Seasonal Differences   Appropriate N: First Differences
------------------------------------  ---------------------------------
                                  1                                   1

The appropriate number of seasonal differences is one, and the appropriate number of first differences is one. Now we can doubly difference the data and confirm we’ve made it stationary:

p11 <- souvenirs |>
    transmute(
        `Annual Change log(Sales)` = difference(Log_Sales, 12),
        `Doubly Diff. log(Sales)` = difference(difference(
            Log_Sales, 12), 1)) |>
    pivot_longer(-Month, names_to="Type", values_to="Log_Sales") |>
    mutate(Type = factor(Type, levels = c("Annual Change log(Sales)",
                                          "Doubly Diff. log(Sales)"))) |>
    ggplot(aes(x = Month, y = Log_Sales)) +
    geom_line() +
    facet_grid(vars(Type), scales = "free_y") +
    labs(title = "Souvenir Sales", y = NULL)
p11

The data now appear stationary.

Exercise 9.5:

For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

data(aus_retail)
set.seed(1221)
remove <- c("State", "Industry", "Series ID")
my_series <- aus_retail |>
    filter(`Series ID` == sample(aus_retail$`Series ID`, 1)) |>
    select(-all_of(remove))
p12 <- my_series |>
    autoplot(Turnover)
p12

lam <- my_series |>
    features(Turnover, features = guerrero) |>
    pull(lambda_guerrero)
lam <- round(lam, 2)

The ideal lambda proposed for a Box-Cox transformation is \(\approx 0.26\), so we will round down and use a fourth root transformation.

my_series <- my_series |>
    mutate(Turnover_4thRt = nthroot(Turnover, 4))
p13 <- my_series |>
    autoplot(Turnover_4thRt) +
    labs(title = "My Series: Monthly Turnover (Fourth Root)")
p13

We will determine the appropriate number of seasonal differences for the transformed data, then the appropriate number of first differences for the transformed data.

nsdiffs <- my_series |>
    features(Turnover_4thRt, unitroot_nsdiffs)
colnames(nsdiffs) <- "Appropriate N: Seasonal Differences"
ndiffs <- my_series |>
    features(Turnover_4thRt, unitroot_ndiffs)
colnames(ndiffs) <- "Appropriate N: First Differences"
diffs <- cbind(nsdiffs, ndiffs)
kable(diffs, format = "simple")
Appropriate N: Seasonal Differences   Appropriate N: First Differences
------------------------------------  ---------------------------------
                                  1                                   1

The appropriate number of seasonal differences is one, and the appropriate number of first differences is one. Now we can doubly difference the data and confirm we’ve made it stationary:

p14 <- my_series |>
    transmute(
        `Annual Change Turnover^1/4` = difference(Turnover_4thRt, 12),
        `Doubly Diff. Turnover^1/4` = difference(difference(
            Turnover_4thRt, 12), 1)) |>
    pivot_longer(-Month, names_to="Type", values_to="Turnover_4thRt") |>
    mutate(Type = factor(Type, levels = c("Annual Change Turnover^1/4",
                                          "Doubly Diff. Turnover^1/4"))) |>
    ggplot(aes(x = Month, y = Turnover_4thRt)) +
    geom_line() +
    facet_grid(vars(Type), scales = "free_y") +
    labs(title = "My Series: Turnover", y = NULL)
p14

The data now appear stationary.

Exercise 9.6:

Simulate and plot some data from simple ARIMA models.

y <- numeric(100)
e <- rnorm(100)
for (i in 2:100){
    y[i] <- (0.6 * y[i-1]) + e[i]
}
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
p15 <- sim |>
    autoplot(y) +
    labs(title = "Sim 1:")
p15

phis <- seq(from = -0.9, to = 0.9, by = 0.1)
sim_df <- data.frame(idx = integer(0), phi_val = numeric(0), y = numeric(0))
for (j in seq_along(phis)){
    y <- numeric(100)
    e <- rnorm(100)
    phi <- phis[j]
    for (i in 2:100){
        y[i] <- (phi * y[i-1]) + e[i]
    }
    new_rows <- data.frame(idx = seq_len(100),
                           phi_val = rep(phi, 100),
                           y = y)
    sim_df <- rbind(sim_df, new_rows)
}
sim_ts <- sim_df |>
    as_tsibble(index = idx, key = phi_val)
p16 <- sim_ts |>
    autoplot(y) +
    facet_wrap(~ phi_val, ncol = 5) +
    theme_minimal() + 
    theme(legend.position = "none") +
    labs(title = expression(paste("Repeated Simulations with Various ", phi[1],
    " Values")))
p16

Changing \(\phi_1\) from \(-0.9\) to \(0.9\) in steps of \(0.1\) produces clearly different time series patterns. For \(-0.9 \le \phi_1 < 0\), the series oscillates rapidly around its mean, with the swings largest and most frequent at \(\phi_1 = -0.9\). For \(0 \le \phi_1 \le 0.9\), the sign-flipping oscillation disappears: at \(\phi_1 = 0\) the series is simply white noise, and as \(\phi_1\) increases toward \(0.9\) the series becomes more persistent, wandering further from the mean with longer runs above or below it. Because \(|\phi_1| < 1\) in every case, each series is stationary and the most recent observations carry more weight than observations from the more distant past.
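These patterns line up with the standard properties of a zero-mean stationary AR(1) process: \(\text{Var}(y_t) = \sigma^2/(1 - \phi_1^2)\) and \(\rho_k = \phi_1^k\). The variance grows as \(|\phi_1|\) approaches 1, which is why the swings are largest at \(\phi_1 = \pm 0.9\), and a negative \(\phi_1\) makes consecutive observations negatively correlated, producing the rapid back-and-forth oscillation.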

y <- numeric(100)
e <- rnorm(100)
for (i in 2:100){
    y[i] <- (0.6 * e[i-1]) + e[i]
}
sim2 <- tsibble(idx = seq_len(100), y = y, index = idx)
p17 <- sim2 |>
    autoplot(y) +
    labs(title = "Sim 2:")
p17

thetas <- seq(from = -0.9, to = 0.9, by = 0.1)
sim_df2 <- data.frame(idx = integer(0), theta_val = numeric(0), y = numeric(0))
for (j in seq_along(thetas)){
    y <- numeric(100)
    e <- rnorm(100)
    theta <- thetas[j]
    for (i in 2:100){
        y[i] <- (theta * e[i-1]) + e[i]
    }
    new_rows <- data.frame(idx = seq_len(100),
                           theta_val = rep(theta, 100),
                           y = y)
    sim_df2 <- rbind(sim_df2, new_rows)
}
sim_ts2 <- sim_df2 |>
    as_tsibble(index = idx, key = theta_val)
p18 <- sim_ts2 |>
    autoplot(y) +
    facet_wrap(~ theta_val, ncol = 5) +
    theme_minimal() + 
    theme(legend.position = "none") +
    labs(title = expression(paste("Repeated Simulations with Various ", theta[1],
    " Values")))
p18

Changing \(\theta_1\) from \(-0.9\) to \(0.9\) in steps of \(0.1\) again produces different patterns, though the differences are subtler than for the AR(1) simulations. The more negative \(\theta_1\) is, the more frequently the data oscillate back and forth around the mean, because consecutive observations share an error term that enters with opposite signs. Because \(|\theta_1| < 1\) in every case, the models are invertible, so in the equivalent AR(\(\infty\)) representation the most recent observations again carry more weight than observations from the more distant past.
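For comparison with the AR(1) case, the MA(1) process has \(\text{Var}(y_t) = \sigma^2(1 + \theta_1^2)\), \(\rho_1 = \theta_1/(1 + \theta_1^2)\), and \(\rho_k = 0\) for \(k > 1\). Whatever the value of \(\theta_1\), the dependence only reaches back one period, which is why the panels look broadly similar apart from the sign and strength of the lag-one correlation.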

y <- numeric(100)
e <- rnorm(100)
for (i in 2:100){
    y[i] <- (0.6 * y[i-1]) + (0.6 * e[i-1]) + e[i]
}
sim3 <- tsibble(idx = seq_len(100), y = y, index = idx)
y <- numeric(100)
e <- rnorm(100)
for (i in 3:100){
    y[i] <- (-0.8 * y[i-1]) + (0.3 * y[i-2]) + e[i]
}
sim4 <- tsibble(idx = seq_len(100), y = y, index = idx)
p19a <- sim3 |>
    autoplot(y) +
    labs(title = "Sim 3:")
p19b <- sim4 |>
    autoplot(y) +
    labs(title = "Sim 4:")
p19 <- plot_grid(p19a, p19b)
p19

The ARMA(1,1) model (Sim 3) is stationary: with \(\phi_1 = \theta_1 = 0.6\), both \(|\phi_1| < 1\) and \(|\theta_1| < 1\), so the series fluctuates around a fixed mean and the most recent observations carry the greatest weight. The AR(2) model (Sim 4) is non-stationary because its parameters violate the stationarity conditions for an AR(2) process (see the check below). Because \(\phi_1 < 0\), the series oscillates around the mean, and the swings grow larger and larger over time.
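For a stationary AR(2), the coefficients must satisfy \(\phi_1 + \phi_2 < 1\), \(\phi_2 - \phi_1 < 1\), and \(|\phi_2| < 1\). Checking Sim 4: \(\phi_1 + \phi_2 = -0.8 + 0.3 = -0.5 < 1\) and \(|\phi_2| = 0.3 < 1\), but \(\phi_2 - \phi_1 = 0.3 - (-0.8) = 1.1 > 1\), so the second condition fails and the process is explosive rather than stationary.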

Exercise 9.7:

Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

data(aus_airpassengers)
p20 <- aus_airpassengers |>
    autoplot(Passengers)
p20

lam <- aus_airpassengers |>
    features(Passengers, features = guerrero) |>
    pull(lambda_guerrero)
lam <- round(lam, 2)

With a proposed Box-Cox lambda of \(-0.18\), a log transformation would probably be appropriate, but we will skip it so the following questions are less complicated to answer as written.

cols <- c("stepwise", "search")
fit <- aus_airpassengers |>
    model(stepwise = ARIMA(Passengers),
          search = ARIMA(Passengers, stepwise=FALSE))
pivot_fit <- fit |>
    pivot_longer(cols = all_of(cols), names_to = "Model Name",
                 values_to = "Orders")
kable(pivot_fit, format = "simple")
Model Name   Orders
-----------  ----------------
stepwise     <ARIMA(0,2,1)>
search       <ARIMA(0,2,1)>
glance_fit <- glance(fit) |>
    arrange(AICc) |>
    select(.model:BIC)
kable(glance_fit, format = "simple")
.model        sigma2     log_lik        AIC       AICc        BIC
---------  ---------  ----------  ---------  ---------  ---------
stepwise    4.307764   -97.01896   198.0379   198.3236   201.6512
search      4.307764   -97.01896   198.0379   198.3236   201.6512

Both the default stepwise ARIMA() procedure and the more extensive search procedure selected the same model: ARIMA(0,2,1).

We check that the residuals look like white noise.

fit |>
    select(stepwise) |>
    gg_tsresiduals()

The residuals do in fact look like white noise.

We plot forecasts for the next 10 periods.

p22a <- fit |>
    forecast(h=10) |>
    filter(.model == "stepwise") |>
    autoplot(aus_airpassengers) +
    labs(subtitle = "AUS Air Passengers: ARIMA(0,2,1) Forecasts")
p22a

In terms of the backshift operator, the selected ARIMA(0,2,1) model can be written as

\((1 - B)^2y_t = (1 + \theta_1B)\epsilon_t\)
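Expanding the backshift terms gives the equivalent difference-equation form (a routine algebraic rewrite, not part of the original solution): \(y_t = 2y_{t-1} - y_{t-2} + \epsilon_t + \theta_1\epsilon_{t-1}\).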

fit <- aus_airpassengers |>
    model(arima010c = ARIMA(Passengers ~ 1 + pdq(0,1,0)))
p22b <- fit |>
    forecast(h=10) |>
    filter(.model == "arima010c") |>
    autoplot(aus_airpassengers) +
    labs(subtitle = "AUS Air Passengers: ARIMA(0,1,0) w/ Drift Forecasts")
p22b

The forecasts from the ARIMA(0,1,0) w/ drift model are very similar to those from the ARIMA(0,2,1) model, though slightly less optimistic. The prediction intervals from the ARIMA(0,2,1) model are wider than those from the ARIMA(0,1,0) w/ drift model.
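This makes sense from the form of the two models. The ARIMA(0,1,0) w/ drift model is a random walk with drift, \(y_t = c + y_{t-1} + \epsilon_t\), so its point forecasts follow a straight line with slope equal to the estimated drift \(c\), whereas the ARIMA(0,2,1) model (with \(d = 2\)) allows the implied trend itself to keep changing, which helps explain both the slightly more optimistic forecasts and the wider intervals.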

fit <- aus_airpassengers |>
    model(arima212c = ARIMA(Passengers ~ 1 + pdq(2, 1, 2)),
          arima212 = ARIMA(Passengers ~ 0 + pdq(2, 1, 2)),
          arima021c = ARIMA(Passengers ~ 1 + pdq(0, 2, 1)))
p23 <- fit |>
    forecast(h=10) |>
    filter(.model == "arima212c") |>
    autoplot(aus_airpassengers) +
    labs(subtitle = "AUS Air Passengers: ARIMA(2,1,2) w/ Drift Forecasts")
p23

The forecasts aren't very different from those of the earlier models, but the additional AR and MA terms introduce a slight wobble into both the point forecasts and the prediction intervals.

p24 <- fit |>
    forecast(h=10) |>
    filter(.model == "arima212") |>
    autoplot(aus_airpassengers) +
    labs(subtitle = "AUS Air Passengers: ARIMA(2,1,2) Forecasts")
p24

Removing the constant results in a NULL model: the estimation of an ARIMA(2,1,2) without a constant fails for these data, so no forecasts can be produced or plotted.

p25 <- fit |>
    forecast(h=10) |>
    filter(.model == "arima021c") |>
    autoplot(aus_airpassengers) +
    labs(subtitle = "AUS Air Passengers: ARIMA(0,2,1) w/ Drift Forecasts")
p25

The forecasts become more optimistic. We've been warned that including a constant when the order of differencing is \(\ge 2\) is dangerous when forecasting, because the constant induces a quadratic (or higher-order polynomial) trend in the long-run forecasts.

Exercise 9.8:

For the United States GDP series (from global_economy):

keep <- c("Country", "Year", "GDP")
us_gdp <- global_economy |>
    select(all_of(keep)) |>
    filter(Country == "United States")
p26 <- us_gdp |>
    autoplot(GDP) +
    labs(title = "Annual United States GDP in USD")
p26

lam <- us_gdp |>
    features(GDP, features = guerrero) |>
    pull(lambda_guerrero)
lam <- round(lam, 2)

With a proposed lambda of \(0.28\), we round down to \(0.25\), a fourth root transformation.

lam <- 0.25
fit1 <- us_gdp |>
    model(stepwise = ARIMA(box_cox(GDP, lam)),
          search = ARIMA(box_cox(GDP, lam), stepwise = FALSE))
pivot_fit <- fit1 |>
    pivot_longer(-Country, names_to = "Model Name",
                 values_to = "Orders")
kable(pivot_fit, format = "simple")
Country         Model Name   Orders
--------------  -----------  -------------------------
United States   stepwise     <ARIMA(1,1,0) w/ drift>
United States   search       <ARIMA(1,1,0) w/ drift>
glance_fit <- glance(fit1) |>
    arrange(AICc) |>
    select(.model:BIC)
kable(glance_fit, format = "simple")
.model        sigma2     log_lik        AIC       AICc        BIC
---------  ---------  ----------  ---------  ---------  ---------
stepwise    839.3346   -271.8562   549.7123   550.1652   555.8415
search      839.3346   -271.8562   549.7123   550.1652   555.8415
fit2 <- us_gdp |>
    model(arima011 = ARIMA(box_cox(GDP, lam) ~ 1 + pdq(0,1,1)),
          arima211 = ARIMA(box_cox(GDP, lam) ~ 1 + pdq(2,1,1)))
pivot_fit <- fit2 |>
    pivot_longer(-Country, names_to = "Model Name",
                 values_to = "Orders")
kable(pivot_fit, format = "simple")
Country         Model Name   Orders
--------------  -----------  -------------------------
United States   arima011     <ARIMA(0,1,1) w/ drift>
United States   arima211     <ARIMA(2,1,1) w/ drift>
glance_fit <- glance(fit2) |>
    arrange(AICc) |>
    select(.model:BIC)
kable(glance_fit, format = "simple")
.model        sigma2     log_lik        AIC       AICc        BIC
---------  ---------  ----------  ---------  ---------  ---------
arima011    876.2163   -273.0490   552.0979   552.5508   558.2271
arima211    861.4592   -271.5637   553.1274   554.3039   563.3427

The stepwise model produced by ARIMA() had the lowest AICc (identical to the model produced by the more exhaustive search procedure), so that model is the best. We check the residual diagnostics.

fit1 |>
    select(stepwise) |>
    gg_tsresiduals()

The residuals look like white noise.

p27 <- fit1 |>
    forecast(h=10) |>
    filter(.model == "stepwise") |>
    autoplot(us_gdp) +
    labs(subtitle = "US GDP: ARIMA(1,1,0) w/ Drift Forecasts")
p27

Yes, the forecasts look reasonable.

compare <- us_gdp |> 
    slice(-n()) |>
    stretch_tsibble(.init = 10) |>
    model(ETS(GDP),
          ARIMA(box_cox(GDP, lam))) |>
    forecast(h = 1) |>
    accuracy(us_gdp) |>
    select(.model, RMSE:MAPE)
kable(compare, format = "simple")
.model                              RMSE            MAE         MPE       MAPE
-------------------------  -------------  -------------  ----------  ---------
ARIMA(box_cox(GDP, lam))    180880178014   114151663572   0.0725540   1.679234
ETS(GDP)                    190604101250   126812323851   0.5935103   1.731149
fit3 <- us_gdp |>
    model(ETS = ETS(GDP))
p28 <- fit3 |>
    forecast(h=10) |>
    filter(.model == "ETS") |>
    autoplot(us_gdp) +
    labs(subtitle = "US GDP: ETS Forecasts")
p28

ETS() is less accurate by comparison on the cross-validated error measures. Its forecasts are less optimistic and its prediction intervals are much wider, although ARIMA prediction intervals tend to be too narrow in any case, since they do not account for all sources of uncertainty (such as parameter and model uncertainty).