1 9.1

1.1 (a) Explain the differences among these ACFs. Do they all indicate white noise?

Figure 9.32 shows sample ACFs from white noise, but the sample size changes (36, 360, 1000).
With only 36 observations, the ACF estimates bounce around more, so a few spikes can look “large” just by chance.
As the sample size increases (360, then 1000), the ACF estimates get closer to 0 at all lags and look much calmer.

Yes — all three are consistent with white noise: most autocorrelations are near 0 and any spikes are random sampling variation.

1.2 (b) Why are the critical values different? Why are the autocorrelations different?

The dashed lines are approximate 95% limits, about ± 1.96 / √n.
So with small n (36) the limits are wider; with larger n (360, 1000) they shrink toward 0.

The ACF bars differ because they are sample estimates. Even when the true autocorrelations are 0 (white noise), the estimates vary randomly, and that variability decreases as n increases.

2 9.2 Amazon stock in gafa_stock

amazon <- gafa_stock |>
  filter(Symbol == "AMZN") |>
  select(Date, Close)

autoplot(amazon, Close) +
  labs(title = "AMZN daily closing price", y = "Close")

amazon |> ACF(Close) |> autoplot() + labs(title = "ACF of AMZN Close")

amazon |> PACF(Close) |> autoplot() + labs(title = "PACF of AMZN Close")

Why non-stationary? The time plot shows a changing level (trend) and long swings.
The ACF decays very slowly (stays high for many lags), which is typical of a non-stationary series.
The PACF has a large spike at lag 1 and then stays important for a while, also suggesting differencing is needed.

3 9.3 Box-Cox and differencing for stationarity

We’ll use a Box–Cox transformation suggestion (BoxCox.lambda) and differencing orders using ADF-based ndiffs() (to avoid the KPSS/urca dependency).

get_lambda <- function(x){
  x <- as.numeric(x)
  x <- x[is.finite(x)]
  if (any(x <= 0)) return(NA_real_)
  forecast::BoxCox.lambda(x, method = "guerrero")
}

get_d <- function(x){
  x <- as.numeric(x)
  x <- x[is.finite(x)]
  forecast::ndiffs(x, test = "adf")
}

get_D <- function(x, m){
  x <- as.numeric(x)
  x <- x[is.finite(x)]
  forecast::nsdiffs(x, m = m)
}

3.1 (a) Turkish GDP from global_economy

turkey_gdp <- global_economy |>
  filter(Country == "Turkey") |>
  select(Year, GDP)

lambda_turkey <- get_lambda(turkey_gdp$GDP)
d_turkey      <- get_d(turkey_gdp$GDP)

lambda_turkey
## [1] 0.1571804
d_turkey
## [1] 1

Answer (Turkey GDP): A log-like Box–Cox (λ near 0) is usually appropriate for GDP, and 1 non-seasonal difference is typically enough.

3.2 (b) Tasmania accommodation takings from aus_accommodation

Some environments store the index column as Date (not Month).
The code below renames the first column to Date so it will knit no matter what it’s called.

tas_raw <- aus_accommodation |>
  filter(State == "Tasmania") |>
  as_tibble()

idx_name <- names(tas_raw)[1]
tas <- tas_raw |>
  rename(Date = all_of(idx_name)) |>
  select(Date, Takings)

lambda_tas <- get_lambda(tas$Takings)
d_tas      <- get_d(tas$Takings)
D_tas      <- get_D(tas$Takings, m = 12)

lambda_tas
## [1] -0.005076712
d_tas
## [1] 0
D_tas
## [1] 1

Answer (Tasmania takings): A Box–Cox transform is often helpful, and you typically need one seasonal difference (D = 1) plus possibly one regular difference (d = 1).

3.3 (c) Monthly sales from souvenirs

lambda_souv <- get_lambda(souvenirs$Sales)
d_souv      <- get_d(souvenirs$Sales)
D_souv      <- get_D(souvenirs$Sales, m = 12)

lambda_souv
## [1] -0.2444328
d_souv
## [1] 1
D_souv
## [1] 1

Answer (Souvenirs): Strong seasonality → D = 1; trend → often d = 1. A log-like Box–Cox (λ near 0) is commonly reasonable.

4 9.5 Retail data differencing order (Exercise 7, Section 2.10)

If you don’t remember your earlier retail series, here is a standard choice using the aus_retail dataset.

retail_series <- aus_retail |>
  filter(State == "New South Wales", Industry == "Department stores") |>
  select(Month, Turnover)

autoplot(retail_series, Turnover) +
  labs(title = "Retail turnover (NSW, Department stores)")

lambda_retail <- get_lambda(retail_series$Turnover)
d_retail      <- get_d(retail_series$Turnover)
D_retail      <- get_D(retail_series$Turnover, m = 12)

lambda_retail
## [1] -0.9999242
d_retail
## [1] 0
D_retail
## [1] 1

Answer: Typically D = 1 (seasonal) and d = 1 (trend). A Box–Cox transform is often helpful for variance stabilization.

5 9.6 Simulate and plot ARIMA series

5.1 (a)-(b) AR(1) with phi1 = 0.6

set.seed(123)
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100){
  y[i] <- 0.6*y[i-1] + e[i]
}
sim_ar1 <- tsibble(idx = 1:100, y = y, index = idx)
autoplot(sim_ar1, y) + labs(title = "Simulated AR(1): phi=0.6", y = "y")

What happens as phi changes? Larger φ makes the series smoother and more persistent; smaller φ makes it look noisier.

5.2 (c)-(d) MA(1) with theta1 = 0.6

set.seed(123)
e <- rnorm(101)
y <- numeric(100)
for(i in 1:100){
  y[i] <- e[i+1] + 0.6*e[i]
}
sim_ma1 <- tsibble(idx = 1:100, y = y, index = idx)
autoplot(sim_ma1, y) + labs(title = "Simulated MA(1): theta=0.6", y = "y")

What happens as theta changes? Larger |θ| increases short-run dependence; the series can look “chunkier” but without long persistence.

5.3 (e) ARMA(1,1) with phi1=0.6, theta1=0.6

set.seed(123)
sim_arma11 <- arima.sim(list(ar = 0.6, ma = 0.6), n = 200)
autoplot(tsibble(idx = 1:200, y = as.numeric(sim_arma11), index = idx), y) +
  labs(title = "Simulated ARMA(1,1): phi=0.6, theta=0.6")

5.4 (f)-(g) AR(2) with phi1=-0.8, phi2=0.3 and compare with ARMA(1,1)

set.seed(123)

# NOTE: The book points out these AR(2) parameters produce a non-stationary series.
# arima.sim() checks stationarity and will error, so we simulate it "by hand".

n <- 200
e <- rnorm(n, mean = 0, sd = 1)
sim_ar2 <- numeric(n)
sim_ar2[1] <- 0
sim_ar2[2] <- 0

for (t in 3:n) {
  sim_ar2[t] <- (-0.8) * sim_ar2[t - 1] + (0.3) * sim_ar2[t - 2] + e[t]
}

p1 <- autoplot(tsibble(idx = 1:n, y = sim_ar2, index = idx), y) +
  labs(title = "Simulated AR(2): phi1=-0.8, phi2=0.3 (non-stationary example)")

p2 <- autoplot(tsibble(idx = 1:n, y = as.numeric(sim_arma11), index = idx), y) +
  labs(title = "Simulated ARMA(1,1): phi=0.6, theta=0.6")

p1

p2

Comparison: The AR(2) with a negative φ1 often shows more oscillation (alternating behavior) than the ARMA(1,1).

6 9.7 aus_airpassengers ARIMA comparisons

air <- aus_airpassengers

fit_auto <- air |> model(ARIMA(Passengers))
report(fit_auto)
## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8963
## s.e.   0.0594
## 
## sigma^2 estimated as 4.308:  log likelihood=-97.02
## AIC=198.04   AICc=198.32   BIC=201.65
fit_auto |> gg_tsresiduals()

fc_auto <- fit_auto |> forecast(h = 10)
autoplot(air, Passengers) +
  autolayer(fc_auto, level = 95) +
  labs(title = "Auto ARIMA forecasts (10 periods)")

6.1 (b) Backshift operator form

Use the ARIMA orders reported above. For example, ARIMA(p,d,q) corresponds to
\[ \phi(B)(1-B)^d y_t = c + \theta(B)\varepsilon_t. \]

6.2 (c) Compare ARIMA(0,1,0) with drift

fit_rw_drift <- air |> model(ARIMA(Passengers ~ pdq(0,1,0) + drift()))
fc_rw_drift  <- fit_rw_drift |> forecast(h = 10)

autoplot(air, Passengers) +
  autolayer(fc_auto, series = "auto", alpha = 0.7) +
  autolayer(fc_rw_drift, series = "RW drift", alpha = 0.7) +
  labs(title = "Forecast comparison: auto ARIMA vs RW with drift")

6.3 (d)-(e) ARIMA(2,1,2) with drift and without constant; ARIMA(0,2,1) with constant

fit_212_drift <- air |> model(ARIMA(Passengers ~ pdq(2,1,2) + drift()))
fit_212_nocon <- air |> model(ARIMA(Passengers ~ pdq(2,1,2)))
fit_021_const <- air |> model(ARIMA(Passengers ~ pdq(0,2,1) + constant()))

fc_212_drift <- fit_212_drift |> forecast(h = 10)
fc_212_nocon <- fit_212_nocon |> forecast(h = 10)
fc_021_const <- fit_021_const |> forecast(h = 10)

autoplot(air, Passengers) +
  autolayer(fc_212_drift, series = "ARIMA(2,1,2)+drift", alpha = 0.7) +
  autolayer(fc_212_nocon, series = "ARIMA(2,1,2) no const", alpha = 0.7) +
  autolayer(fc_021_const, series = "ARIMA(0,2,1)+const", alpha = 0.7) +
  labs(title = "More ARIMA forecast comparisons")

7 9.8 US GDP: Box–Cox + ARIMA + compare with ETS

us <- global_economy |>
  filter(Country == "United States") |>
  select(Year, GDP)

lambda_us <- get_lambda(us$GDP)
lambda_us
## [1] 0.2819714
# Transform (if lambda is NA, we skip)
us <- us |> mutate(GDP_bc = if (is.na(lambda_us)) GDP else forecast::BoxCox(GDP, lambda_us))

fit_us_auto <- us |> model(ARIMA(GDP_bc))
report(fit_us_auto)
## Series: GDP_bc 
## Model: ARIMA(1,1,0) w/ drift 
## 
## Coefficients:
##          ar1  constant
##       0.4586  118.2764
## s.e.  0.1198    9.5123
## 
## sigma^2 estimated as 5488:  log likelihood=-325.37
## AIC=656.74   AICc=657.19   BIC=662.87
fit_us_auto |> gg_tsresiduals()

fc_us_auto <- fit_us_auto |> forecast(h = 10)
autoplot(us, GDP_bc) +
  autolayer(fc_us_auto, level = 95) +
  labs(title = "US GDP forecasts (ARIMA on Box–Cox scale)", y = "Box–Cox(GDP)")

# ETS on original scale
fit_us_ets <- us |> model(ETS(GDP))
fc_us_ets <- fit_us_ets |> forecast(h = 10)

autoplot(us, GDP) +
  autolayer(fc_us_ets, level = 95) +
  labs(title = "US GDP forecasts (ETS on original scale)", y = "GDP")

Discussion: If the Box–Cox ARIMA forecasts look smoother and the residuals look more like white noise, it’s usually preferred. ETS is a helpful baseline; often it gives similar medium-term forecasts but may differ in uncertainty and trend handling.