Objective:

Do the exercises 9.1, 9.2, 9.3, 9.5, 9.6, 9.7, 9.8 in Hyndman. Please submit both the Rpubs link as well as your .rmd file.

9.1

Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. a. Explain the differences among these figures. Do they all indicate that the data are white noise?

The notable differences among these figures are how the leftmost figure with 36 random numbers has many ACF bars poke past the dashed bounds just by chance. The middle figure with 360 random numbers has most bars shrink toward zero and only a few flirt with the bounds. Lastly, the figure with the 1,000 random numbers has the bars are tiny and stay well within the bounds. b. Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

The critical values at different distances from the mean of zero are different because the bands depend on sample size: with fewer points the bands are wider, and with more points they get narrower. The autocorrelations are different because each figure is a new random sample of white noise which means that the bars wiggle by chance and more with small samples and less with large ones.

9.2

2. A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

library(fpp3)
library(tsibble)
library(tibble)
library(feasts)
library(ggplot2)

amzn <- gafa_stock |>
  filter(Symbol == "AMZN") |>
  select(Date, Close)

autoplot(amzn, Close) + labs(title = "AMZN daily closing prices", x = NULL, y = "USD")

ACF(amzn, Close)  |> autoplot() + labs(title = "ACF: AMZN Close")

PACF(amzn, Close) |> autoplot() + labs(title = "PACF: AMZN Close")

Each plot shows that the series is non-stationary and should be differenced in that: The series plot (AMZN daily closing price) it trends upward over time which therefore shows how the average level isn’t constant, the ACF correlations decay very slowly and stay large for many lags which is a trend, and the PACF has the first few lags are strong, which also points to a unit-root style trend. Taking a first difference (price changes) usually makes the ACF drop off quickly.

9.3

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data. a. Turkish GDP from global_economy.

library(urca)
tur <- global_economy |>
  filter(Country == "Turkey") |>
  select(Year, GDP)

# Box–Cox lambda
lam_tur <- tur |> features(GDP, guerrero) |> pull(lambda_guerrero)

# Before and After diff
tur_checks <- tibble(
  step = c("boxcox only", "nonseasonal diff (d=1)"),
  kpss = c(
    tur |> mutate(y = box_cox(GDP, lam_tur)) |> features(y, unitroot_kpss) |> pull(kpss_stat),
    tur |> mutate(y = difference(box_cox(GDP, lam_tur))) |> features(y, unitroot_kpss) |> pull(kpss_stat)
  )
)
lam_tur; tur_checks
## [1] 0.1572187

b. Accommodation takings in the state of Tasmania from aus_accommodation.

tas <- aus_accommodation |>
  as_tibble() |>
  filter(State == "Tasmania") |>
  summarise(Takings = sum(Takings, na.rm = TRUE), .by = Date) |>
  tsibble::as_tsibble(index = Date)
names(tas); is_tsibble(tas); tsibble::index_var(tas)
## [1] "Date"    "Takings"
## [1] TRUE
## [1] "Date"
tas <- aus_accommodation |>
  as_tibble() |>
  filter(State == "Tasmania") |>
  summarise(Takings = sum(Takings, na.rm = TRUE), .by = Date) |>
  tsibble::as_tsibble(index = Date)

# Box–Cox lambda
lam_tas <- tas |> features(Takings, guerrero) |> pull(lambda_guerrero)

m <- 4  # quarterly seasonality instead of using YearQuarter index as earlier pipeline briefly lost the special time index and difference(lag = "year") error

# Box–Cox
tas_bc <- tas %>% mutate(Takings = box_cox(Takings, lam_tas))
tas_bc %>% features(Takings, unitroot_kpss) %>% pull(kpss_stat)
## [1] 1.837734
# Seasonal diff (D = 1)
tas_seas <- tas_bc %>% mutate(Takings = difference(Takings, lag = m))
tas_seas %>% features(Takings, unitroot_kpss) %>% pull(kpss_stat)
## [1] 0.1984156
# Seasonal+nonseasonal (D = 1, d = 1)
tas_both <- tas_seas %>% mutate(Takings = difference(Takings))
tas_both %>% features(Takings, unitroot_kpss) %>% pull(kpss_stat)
## [1] 0.04736784
lam_tas
## [1] 0.001819643

c. Monthly sales from souvenirs.

lam_souv <- souvenirs |>
  features(Sales, guerrero) |>
  pull(lambda_guerrero)

# KPSS for checking Box–Cox only, then seasonal diff (D=1), then add nonseasonal diff (d=1)
m <- 12

kpss_tbl <- tibble::tibble(
  step  = c("BoxCox only", "Seasonal diff (D=1)", "(D=1) then (d=1)"),
  kpss  = c(
    souvenirs |> dplyr::mutate(y = box_cox(Sales, lam_souv)) |>
      features(y, unitroot_kpss) |> dplyr::pull(kpss_stat),
    souvenirs |> dplyr::mutate(y = difference(box_cox(Sales, lam_souv), lag = m)) |>
      features(y, unitroot_kpss) |> dplyr::pull(kpss_stat),
    souvenirs |> dplyr::mutate(y = difference(difference(box_cox(Sales, lam_souv), lag = m))) |>
      features(y, unitroot_kpss) |> dplyr::pull(kpss_stat)
  )
)
kpss_tbl

9.5

For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

id <- aus_retail |> distinct(`Series ID`) |> slice(1) |> pull()

myseries <- aus_retail |>
  filter(`Series ID` == id) |>
  select(Month, Turnover) |>
  as_tsibble(index = Month)
lam <- myseries |>
  features(Turnover, guerrero) |>
  pull(lambda_guerrero)

m <- 12

kpss_check <- dplyr::bind_rows(
  myseries |>
    mutate(y = box_cox(Turnover, lam)) |>
    features(y, unitroot_kpss) |>
    mutate(step = "BoxCox"),

  myseries |>
    mutate(y = difference(box_cox(Turnover, lam), lag = m)) |>
    features(y, unitroot_kpss) |>
    mutate(step = "Seasonal diff (D=1)"),

  myseries |>
    mutate(y = difference(difference(box_cox(Turnover, lam), lag = m))) |>
    features(y, unitroot_kpss) |>
    mutate(step = "D=1 then d=1")
) |>
  select(step, kpss_stat, kpss_pvalue)

kpss_check

Based on these results, the Box-Cox is still non-stationary with the kpss_pvalue=0.01 < 0.05. After one seasonal difference D = 1, m = 12 the series is stationary kpss_pvalue = 0.074 ≥ 0.05, which therefore tells us the extra nonseasonal difference isn’t needed (d, D) = (0, 1).

9.6

Simulate and plot some data from simple ARIMA models. a. Use the following R code to generate data from AR(1) model.

library(ggplot2)
library(feasts)

y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)

b. Produce a time plot for the series. How does the plot change as you change phi1

autoplot(sim, y) +
  labs(title = "AR(1) phi = 0.6, sigma^2 = 1", x = "t", y = "y_t")

The plot changes as phi1 gets larger because the series becomes more persistent and smooth with long runs in the same direction. Whereas phi1 near 0 looks like white noise while a negative phi1 creates an inconsistent up and down pattern.

c. Write your own code to generate data from an MA(1) model with 01=0.6 and sigma^2=1

n <- 200
theta <- 0.6
sigma2 <- 1
burn <- 30 # burn-in to soften start-up bias

e <- rnorm(n + burn, sd = sqrt(sigma2))
y <- numeric(n + burn)
# use a random "e0" instead of forcing it to 0
e0 <- rnorm(1, sd = sqrt(sigma2))
y[1] <- e[1] + theta * e0

for (t in 2:(n + burn)) {
  y[t] <- e[t] + theta * e[t - 1]
}

# keep post burn-in part
y <- y[(burn + 1):(n + burn)]
sim_ma1 <- tsibble::tsibble(idx = seq_len(n), y = y, index = idx)

d. Produce a time plot for the series. How does the plot change as you change theta1?

autoplot(sim_ma1, y) + labs(title = "MA(1) with theta1 = 0.6", x = "t", y = "y_t")

The closest points move together as the theta1 increases to 1, which is where the series shows short smooth runs and when theta1 decreases to 0 it looks like white noise. If theta1 is negative, the plot tends to go up and down rapidly.

e. Generate data from an ARMA(1,1) model with phi1 = 0.6, theta1 = 0.6 and sigma^2 = 1.

sim_arma11 <- function(n = 200, phi1 = 0.6, theta1 = 0.6, sigma2 = 1,
                       burn = 50, seed = 123){
  set.seed(seed)
  e <- rnorm(n + burn + 1, sd = sqrt(sigma2))
  y <- numeric(n + burn + 1)
  for (t in 2:(n + burn + 1)) {
    y[t] <- phi1 * y[t - 1] + e[t] + theta1 * e[t - 1]
  }
  tibble(idx = seq_len(n), y = y[(burn + 2):(burn + n + 1)]) |>
    as_tsibble(index = idx)
}
sim <- sim_arma11(n = 200, phi1 = 0.6, theta1 = 0.6, sigma2 = 1)

autoplot(sim, y) +
  labs(title = "ARMA(1,1): phi1 = 0.6, theta1 = 0.6, sigma^2 = 1",
       x = "t", y = "y_t")

f. Generate data from an AR(2) model with phi1 = -0.8, phi2 = 0.3, and sigma^2=1 (Note that these parameters will give a non-stationary series.)

n <- 200
phi1 <- -0.8
phi2 <- 0.3
sigma2 <- 1
burn <- 50

e <- rnorm(n + burn, sd = sqrt(sigma2))
y <- numeric(n + burn)

for (t in 3:(n + burn)) {
  y[t] <- phi1 * y[t-1] + phi2 * y[t-2] + e[t]
}

sim_ar2 <- tsibble::tsibble(
  t = seq_len(n),
  y = y[(burn + 1):(n + burn)],
  index = t
)

autoplot(sim_ar2, y) +
  labs(title = "AR(2): phi1 = -0.8, phi2 = 0.3, sigma^2 = 1",
       x = "t", y = "y_t")

g. Graph the latter two series and compare them.

sim_fix <- sim %>% rename(t = idx) %>% tibble::as_tibble() %>% select(t, y)
ar2_fix <- sim_ar2 %>%                       tibble::as_tibble() %>% select(t, y)
both_series <- bind_rows(
  sim_fix %>% mutate(series = "ARMA(1,1): phi1=0.6, theta1=0.6"),
  ar2_fix %>% mutate(series = "AR(2): phi1=-0.8, phi2=0.3")
) %>%
  tsibble::as_tsibble(index = t, key = series)

feasts::autoplot(both_series, y) +
  facet_wrap(~ series, ncol = 1, scales = "free_y") +
  labs(x = "t", y = "y_t", title = "6e. ARMA(1,1) vs 6f. AR(2)") +
  theme_bw()

6e’s ARMA(1,1) series stays roughly around zero with a steady amplitude which appears to be stationary. 6f’s AR(2) series (phi1 = −0.8, phi2 = 0.3) escalates towards larger and larger up and down patterns rapidly which means it is non-stationary.

9.7

Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011. a. Use ARIMA() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

aus_airpassengers %>% autoplot(Passengers) +
  labs(y = "Millions", title = "Australian air passengers")

# Let ARIMA() choose the model
fit <- aus_airpassengers %>%
  model(ARIMA(Passengers))

# What model was selected?
report(fit)   # prints something like: ARIMA(0,1,1) with drift (your output is the answer)
## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8963
## s.e.   0.0594
## 
## sigma^2 estimated as 4.308:  log likelihood=-97.02
## AIC=198.04   AICc=198.32   BIC=201.65
# Residual diagnostics: should look like white noise
gg_tsresiduals(fit)

# Ljung–Box test on the one-step-ahead innovations
augment(fit) %>%
  features(.innov, ljung_box, lag = 10)    # p-values > 0.05 ⇒ residuals ~ white noise
# 10-year ahead forecasts
fc <- fit %>% forecast(h = 10)

autoplot(fc, aus_airpassengers) +
  labs(y = "Millions", title = "ARIMA Australian Air passengers")

Here I playtested between a few ARIMA specs, starting with ARIMA(1,1,1) and some seasonal terms however the residual ACF continued showing structure, which is where I tried experimenting with extra differencing and simpler orders. Once I moved to second differencing and dropped the AR term, ARIMA(0,2,1) gave white-noise residuals and its forecasts showed promise. The trial and error throughout this process showed me to let the data itself give direct instruction for me, such as fixing the trend with differencing first, keeping the model as simple as possible, and use residual checks and not just AIC to decide.

b. Write the model in terms of the backshift operator. The model in terms of the backshift operator would be ARIMA(0,2,1) in backshift form: (1 - B)^2 y_t = (1 + theta1 * B) * e_t, with theta1 ≈ -0.896 Expanded: y_t - 2 y_{t-1} + y_{t-2} = e_t - 0.896 * e_{t-1} Here B is the backshift operator (B y_t = y_{t-1}), e_t is the error term, and theta1 is the MA(1) coefficient.

c. Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a. The ARIMA(0,1,0) with drift gives a straight and steady growth line and the prediction bands widen slowly. ARIMA(0,2,1) on the other hand, rises faster and has wider intervals, showing more uncertainty.

fit_auto <- aus_airpassengers |> model(auto = ARIMA(Passengers))
fit_rw   <- aus_airpassengers |> model(rw_drift = RW(Passengers ~ drift()))

# 1) ARIMA(2,1,2) with drift (constant)
fit_212_d <- aus_airpassengers |>
  model(arima212_drift = ARIMA(Passengers ~ pdq(2,1,2) + drift()))

# 2) ARIMA(2,1,2) without drift (no constant)
fit_212_n <- aus_airpassengers |>
  model(arima212_nodrift = ARIMA(Passengers ~ pdq(2,1,2)))

# Forecast 10 periods ahead
fc_all <- bind_rows(
  forecast(fit_auto, h = 10),
  forecast(fit_rw,   h = 10),
  forecast(fit_212_d, h = 10),
  forecast(fit_212_n, h = 10)
)

# Plot all forecasts side-by-side
autoplot(fc_all, aus_airpassengers) +
  facet_wrap(vars(.model), ncol = 2) +
  labs(title = "ARIMA forecasts for Australian air passengers", y = "Millions", x = NULL)

With a constant, ARIMA(2,1,2) rises much like 7a’s ARIMA(0,2,1) and looks close to 7c’s random walk with drift. Whereas without the constant, its forecasts flatten and sit below parts a and c, which keep a stronger upward trend.

e. Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens? Adding a constant to ARIMA(0,2,1) makes the second differences non-zero, so the forecasts curve which is parabolic and grow fast. That makes the intervals blow up and the fit unstable, so I usually use ARIMA(0,2,1) without a constant.

9.8

For the United States GDP series (from global_economy): a. if necessary, find a suitable Box-Cox transformation for the data;

us <- global_economy %>%
  filter(Country == "United States") %>%
  select(Year, GDP)

## 8a) Box–Cox lambda (Guerrero)
lambda <- us %>% features(GDP, guerrero) %>% pull(lambda_guerrero)
lambda
## [1] 0.2819443

b. fit a suitable ARIMA model to the transformed data using ARIMA();

fit <- us %>%
  model(auto = ARIMA(box_cox(GDP, lambda)))

report(fit)
## Series: GDP 
## Model: ARIMA(1,1,0) w/ drift 
## Transformation: box_cox(GDP, lambda) 
## 
## Coefficients:
##          ar1  constant
##       0.4586  118.1822
## s.e.  0.1198    9.5047
## 
## sigma^2 estimated as 5479:  log likelihood=-325.32
## AIC=656.65   AICc=657.1   BIC=662.78

c. try some other plausible models by experimenting with the orders chosen;

# 1) More plausible: ARIMA(2,1,1) with drift (constant when d = 1)
fit_mpl <- us |>
  model(mpl = ARIMA(box_cox(GDP, lambda) ~ pdq(2,1,1) + 1))

# 2) Less plausible: ARIMA(0,2,2) with constant
fit_lpl <- us |>
  model(lpl = ARIMA(box_cox(GDP, lambda) ~ pdq(0,2,2) + 1))

# Compare fit quality
bind_rows(glance(fit_mpl), glance(fit_lpl)) |>
  dplyr::select(.model, AICc, BIC)

ARIMA(2,1,1)+drift uses one difference to remove growth, two AR lags for persistence, one MA lag for short-term shocks, and a drift term for average growth on the Box–Cox scale. ARIMA(0,2,2)+constant uses two differences (to handle changing trend) and two MA lags to soak up transitory shocks, with a constant that acts like a gentle quadratic trend after back-transforming.

d. choose what you think is the best model and check the residual diagnostics;

diag_tbl <- bind_rows(
  augment(fit_mpl) |>
    features(.innov, ljung_box, lag = 10, dof = 3) |>
    mutate(model = "ARIMA(2,1,1)+drift"),
  augment(fit_lpl) |>
    features(.innov, ljung_box, lag = 10, dof = 3) |>
    mutate(model = "ARIMA(0,2,2)+const")
)

diag_tbl   # look for p-values > 0.05
# Residual plots for the chosen model
gg_tsresiduals(fit_mpl)

# Forecasts from the chosen model
fc_best <- forecast(fit_mpl, h = 12)
autoplot(fc_best, us) + labs(title = "US GDP: ARIMA(2,1,1)+drift on Box–Cox")

Both candidates pass the residual checkk as the Ljung–Box p-values are 0.79 for each (so no remaining autocorrelation), and the residual plots for ARIMA(2,1,1)+drift look well-behaved (ACF inside bands, roughly centered, near-normal). Although ARIMA(0,2,2)+const has lower AICc/BIC (650.9 / 658.3) than ARIMA(2,1,1)+drift (661.5/670.5), I prefer ARIMA(2,1,1)+drift because it needs only one difference (less aggressive differencing), gives smoother and more interpretable forecasts on the Box–Cox scale, and avoids the risk of overdifferencing.

e. produce forecasts of your fitted model. Do the forecasts look reasonable? With ARIMA(2,1,1) + drift on a Box-Cox transform, the next 10 years show a steady rise and the forecast bands open up slowly. There are no odd jumps which therefore makes the forecasts “look reasonable” for GDP.

f. compare the results with what you would obtain using ETS() (with no transformation). ETS() on raw GDP would pick a trend model and gives a more curved and fast growing path with wider long-range intervals. The ARIMA with Box-Cox is flatter over the next decade and has more narrow near-term intervals. ETS() is a bit more optimistic farther out.