library(fpp3)
── Attaching packages ─────────────────────────── fpp3 1.0.1 ──
✔ tibble      3.2.1     ✔ tsibble     1.1.6
✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
✔ tidyr       1.3.1     ✔ feasts      0.4.1
✔ lubridate   1.9.4     ✔ fable       0.4.1
✔ ggplot2     3.5.1     
── Conflicts ──────────────────────────────── fpp3_conflicts ──
✖ lubridate::date()    masks base::date()
✖ dplyr::filter()      masks stats::filter()
✖ tsibble::intersect() masks base::intersect()
✖ tsibble::interval()  masks lubridate::interval()
✖ dplyr::lag()         masks stats::lag()
✖ tsibble::setdiff()   masks base::setdiff()
✖ tsibble::union()     masks base::union()

Section 9.11 Problem #1

The Figure below shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

a. Explain the differences among these figures. Do they all indicate that the data are white noise?

Figure: Left: ACF for a white noise series of 36 numbers. Middle: ACF for a white noise series of 360 numbers. Right: ACF for a white noise series of 1,000 numbers.

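  • The figures differ in their critical values (blue dashed lines): the bounds narrow as the sample size grows from 36 to 360 to 1,000, and the spikes shrink with them.
  • All three figures indicate white noise, since the autocorrelations stay within these bounds.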

b. Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

  • The critical values are at different distances from zero because they depend on the number of observations: the bounds are \(\pm 1.96/\sqrt{T}\), where \(T\) is the length of the series, so they shrink as \(T\) grows.
  • The autocorrelations differ between figures because each one is a sample estimate computed from a different finite random sample. The estimates vary randomly around zero, and the more observations there are, the less noise appears in them (smaller spikes).
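
As a worked check, assuming the usual 95% bounds of \(\pm 1.96/\sqrt{T}\) for a white noise series of length \(T\), the three sample sizes give approximately:

\[
\frac{1.96}{\sqrt{36}} \approx 0.327, \qquad
\frac{1.96}{\sqrt{360}} \approx 0.103, \qquad
\frac{1.96}{\sqrt{1000}} \approx 0.062
\]

A spike of 0.2 would therefore be unremarkable for 36 observations but well outside the bounds for 360 or 1,000.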

Section 9.11 Problem #3

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

a. Turkish GDP from global_economy.

turkey <- global_economy |> filter(Country == "Turkey")
turkey |> autoplot(GDP)


turkey |> autoplot(log(GDP))


turkey |> autoplot(log(GDP) |> difference())


turkey |> features(GDP, guerrero)
  • Logs and differences make the data appear stationary.
  • Using a Box-Cox transformation with \(\lambda\) between 0 and 0.2 would also have worked well.
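
The Box-Cox alternative mentioned above can be sketched as follows. This is a minimal sketch, not the only reasonable workflow: the names `lambda` and `bc_gdp` are introduced here for illustration, and `unitroot_ndiffs` is used only as a test-based cross-check on the order of differencing.

```r
library(fpp3)

turkey <- global_economy |> filter(Country == "Turkey")

# Guerrero estimate of the Box-Cox parameter (`lambda` is an illustrative name)
lambda <- turkey |>
  features(GDP, guerrero) |>
  pull(lambda_guerrero)

# First difference of the Box-Cox transformed series
turkey |> autoplot(box_cox(GDP, lambda) |> difference())

# KPSS-based suggestion for the number of first differences
turkey |>
  mutate(bc_gdp = box_cox(GDP, lambda)) |>
  features(bc_gdp, unitroot_ndiffs)
```

If the differenced plot looks similar to the one obtained with logs, the two transformations lead to the same conclusion.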

b. Accommodation takings in the state of Tasmania from aus_accommodation.

tas <- aus_accommodation |> filter(State == "Tasmania")
tas |> autoplot(Takings)


tas |> autoplot(log(Takings))

tas |> autoplot(log(Takings) |> difference(lag = 4))

tas |> autoplot(log(Takings) |> difference(lag = 4) |> difference())

tas |> features(Takings, guerrero)
  • Logs followed by seasonal and first differences make the data appear stationary.
  • The automatically selected Box-Cox \(\lambda\) value is very close to zero, confirming the choice of using logs.
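
The visual choice of one seasonal and one first difference can be cross-checked with the unit root features in feasts. This is a sketch under the assumption that logs are used; the column name `log_takings` and the objects `nsd` and `nd` are introduced here for illustration.

```r
library(fpp3)

tas <- aus_accommodation |>
  filter(State == "Tasmania") |>
  mutate(log_takings = log(Takings))

# Suggested number of seasonal differences (based on seasonal strength)
nsd <- tas |> features(log_takings, unitroot_nsdiffs)
nsd

# Suggested number of first differences after one seasonal difference
nd <- tas |> features(difference(log_takings, lag = 4), unitroot_ndiffs)
nd
```

These features only suggest an order of differencing; the plots above remain the primary evidence.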

c. Monthly sales from souvenirs.

souvenirs |> autoplot(Sales)


souvenirs |> autoplot(log(Sales))


souvenirs |> autoplot(log(Sales) |> difference(lag = 12))


souvenirs |> autoplot(log(Sales) |> difference(lag = 12) |> difference())


souvenirs |> features(Sales, guerrero)
  • Logs followed by seasonal and first differences make the data appear stationary.
  • The automatically selected Box-Cox \(\lambda\) value is very close to zero, confirming the choice of using logs.
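
A KPSS test is one way to check the claim that the doubly differenced log series appears stationary. The sketch below assumes the same transformation as in the plots; `dd_log_sales` and `kpss` are illustrative names.

```r
library(fpp3)

# KPSS test on the log series after seasonal and first differencing.
# The null hypothesis is that the series is stationary.
kpss <- souvenirs |>
  mutate(dd_log_sales = log(Sales) |> difference(lag = 12) |> difference()) |>
  features(dd_log_sales, unitroot_kpss)
kpss
```

A p-value above 0.05 would be consistent with stationarity; a small p-value would suggest that more differencing is needed.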

Section 9.11 Problem #7

Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

a. Use ARIMA() to find an appropriate ARIMA model. What model was selected? Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

aus_airpassengers |> autoplot(Passengers)


fit <- aus_airpassengers |>
  model(arima = ARIMA(Passengers))
report(fit)
Series: Passengers 
Model: ARIMA(0,2,1) 

Coefficients:
          ma1
      -0.8963
s.e.   0.0594

sigma^2 estimated as 4.308:  log likelihood=-97.02
AIC=198.04   AICc=198.32   BIC=201.65
fit |> gg_tsresiduals()


fit |> forecast(h = 10) |> autoplot(aus_airpassengers)
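
The residual plots can be complemented with a portmanteau test. The following is a sketch only: `lag = 10` is a common choice for non-seasonal data, `dof = 1` assumes the selected model has a single estimated coefficient (ma1), and `lb` is an illustrative name.

```r
library(fpp3)

fit <- aus_airpassengers |>
  model(arima = ARIMA(Passengers))

# Ljung-Box test on the innovation residuals.
# dof = 1 assumes one estimated ARMA coefficient, as in ARIMA(0,2,1).
lb <- augment(fit) |>
  features(.innov, ljung_box, lag = 10, dof = 1)
lb
```

A large p-value means the residuals are not distinguishable from white noise.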

b. Write the model in terms of the backshift operator.


\((1-B)^2 y_t = (1+ \theta B) \epsilon_t\)

where \(\epsilon_t \sim N(0, \sigma^2)\), \(\theta = -0.90\), and \(\sigma^2 = 4.31\)
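
For reference, expanding the backshift operator gives the model as a difference equation, using the rounded estimate \(\theta = -0.90\):

\[
(1-B)^2 y_t = y_t - 2y_{t-1} + y_{t-2} = \epsilon_t + \theta \epsilon_{t-1}
\]

\[
y_t = 2y_{t-1} - y_{t-2} + \epsilon_t - 0.90\,\epsilon_{t-1}
\]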

c. Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.

aus_airpassengers |>
  model(arima = ARIMA(Passengers ~ 1 + pdq(0,1,0))) |>
  forecast(h = 10) |>
  autoplot(aus_airpassengers)

  • Both sets of forecasts contain increasing trends, but the ARIMA(0,2,1) model has an implicit trend due to the double differencing, while the ARIMA(0,1,0) model with drift models the trend directly via the drift coefficient.
  • The prediction intervals are narrower for the model with fewer differences, ARIMA(0,1,0) with drift.
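
To compare the two models directly, both can be fitted in one call and their interval widths tabulated. This is a sketch: the model names `arima021` and `arima010_drift` and the objects `fc` and `widths` are introduced here for illustration, and the first model is specified explicitly (no constant) instead of relying on automatic selection.

```r
library(fpp3)

fc <- aus_airpassengers |>
  model(
    arima021 = ARIMA(Passengers ~ 0 + pdq(0, 2, 1)),
    arima010_drift = ARIMA(Passengers ~ 1 + pdq(0, 1, 0))
  ) |>
  forecast(h = 10)

# Overlay both sets of forecasts on the data
fc |> autoplot(aus_airpassengers, level = 95)

# Width of the 95% prediction interval at each horizon
widths <- fc |>
  hilo(level = 95) |>
  as_tibble() |>
  mutate(width = `95%`$upper - `95%`$lower) |>
  select(.model, Year, width)
widths
```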

d. Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.

aus_airpassengers |>
  model(arima = ARIMA(Passengers ~ 1 + pdq(2,1,2))) |>
  forecast(h = 10) |>
  autoplot(aus_airpassengers)


aus_airpassengers |>
  model(arima = ARIMA(Passengers ~ 0 + pdq(2,1,2)))
Warning: 1 error encountered for arima
[1] non-stationary AR part from CSS
  • There is little difference between ARIMA(2,1,2) with drift and ARIMA(0,1,0) with drift.
  • Removing the constant causes an error ("non-stationary AR part from CSS"): estimation fails, so no model is fitted and no forecasts can be produced.
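
The error message refers to the conditional sum of squares (CSS) step that is used by default to obtain starting values. One thing to try, offered as an untested sketch, is to request full maximum likelihood estimation by passing `method = "ML"`. Whether this converges to a usable model for this series is an assumption that has not been verified here; `fit_ml` is an illustrative name.

```r
library(fpp3)

# Skip the CSS initialisation by asking for maximum likelihood estimation.
# This may still fail or give a poorly behaved model; inspect the result.
fit_ml <- aus_airpassengers |>
  model(arima = ARIMA(Passengers ~ 0 + pdq(2, 1, 2), method = "ML"))
report(fit_ml)
```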

e. Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?

aus_airpassengers |>
  model(arima = ARIMA(Passengers ~ 1 + pdq(0,2,1))) |>
  forecast(h = 10) |>
  autoplot(aus_airpassengers)
Warning: Model specification induces a quadratic or higher order polynomial trend. 
This is generally discouraged, consider removing the constant or reducing the number of differences.

The forecasts now follow a quadratic trend, and fable warns that a specification inducing a quadratic or higher-order polynomial trend is generally discouraged.
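
To see where the quadratic trend comes from, write the model with a constant \(c\) (a symbol introduced here for illustration):

\[
(1-B)^2 y_t = c + (1 + \theta B)\,\epsilon_t
\]

Ignoring the error terms, the second difference of \(y_t\) equals \(c\) at every step, which is satisfied by

\[
y_t = \beta_0 + \beta_1 t + \frac{c}{2}\, t^2
\]

since \(\frac{c}{2}\left[t^2 - 2(t-1)^2 + (t-2)^2\right] = c\). A non-zero constant combined with \(d = 2\) therefore makes the long-run forecasts curve quadratically.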

