Do exercises 8.1, 8.5, 8.6, 8.7, 8.8, 8.9 in Hyndman. Please submit both the link to your Rpubs and the .pdf file with your run code

8.1 Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

A. Explain the differences among these figures. Do they all indicate that the data are white noise?

In all three plots the data come from white noise, so in theory every autocorrelation other than lag 0 is exactly zero. With a finite sample, however, the estimated autocorrelations always show some random variation. With 36 points the ACF looks jumpy and some bars are fairly large just by chance; with 360 points the bars sit closer to zero; and with 1,000 points they are closer still, a narrow band of "fuzz" around zero. Despite looking different, all three plots are consistent with white noise: there is no clear pattern (no gradual decay, no repeating structure), only sampling noise.

B. Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

(Figure 8.31. Left: ACF for a white noise series of 36 numbers. Middle: ACF for a white noise series of 360 numbers. Right: ACF for a white noise series of 1,000 numbers.)

The horizontal dashed lines are the critical values, approximately ±1.96/√T, used to judge whether a bar is unusually large. They depend on the number of observations T: with more data the sample autocorrelations are estimated more precisely, so the limits move closer to zero. That is why the bands are widest for 36 points and narrowest for 1,000 points. The autocorrelations themselves differ across the plots because each comes from a different random sample of white noise: the true value at every nonzero lag is zero, but the estimates bounce around zero, and they bounce more when there are fewer observations.
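A quick numerical check of those bounds (a minimal sketch; it just evaluates ±1.96/√T for the three sample sizes):

# 95% critical values used in the ACF plots: +/- 1.96 / sqrt(T)
round(1.96 / sqrt(c(36, 360, 1000)), 3)
# approximately 0.327, 0.103 and 0.062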

8.5 For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

library(forecast)
library(ggplot2)

data_folder <- "/Users/joe/Documents/00_Obsidian_Vault/Workbook/03_Education/01_CUNY_SPS/Data_624_Predicitive_Analytics/Homework_5_120725/data_folder"
retail_df   <- read.csv(file.path(data_folder, "retail.csv"), stringsAsFactors = FALSE)

series_col  <- names(retail_df)[2]
series_raw  <- retail_df[[series_col]]
series_num  <- as.numeric(gsub(",", "", series_raw))

retail_ts <- ts(series_num, frequency = 12, start = c(1982, 4))

autoplot(retail_ts)

lambda       <- BoxCox.lambda(na.omit(retail_ts))
retail_trans <- BoxCox(retail_ts, lambda)

ndiffs(na.omit(retail_trans))
## [1] 1
nsdiffs(na.omit(retail_trans))
## [1] 1
autoplot(retail_trans) + ggtitle("Transformed retail series")

ndiffs(retail_trans)   # non-seasonal differences (d)
## [1] 1
nsdiffs(retail_trans)  # seasonal differences (D)
## [1] 1
retail_diff_seasonal <- diff(retail_trans, lag = 12,
                             differences = nsdiffs(retail_trans))

retail_diff_final <- diff(retail_diff_seasonal,
                          differences = ndiffs(retail_trans))

autoplot(retail_diff_final) + ggtitle("Differenced retail series")

ggAcf(retail_diff_final)

After applying a Box–Cox transformation to the retail series, the ndiffs() and nsdiffs() functions indicate that the appropriate orders of differencing are d = 1 (one non-seasonal difference) and D = 1 (one seasonal difference with period 12), which yields a roughly stationary series.

8.6 Use R to simulate and plot some data from simple ARIMA models.

A. Use the following R code to generate data from an AR(1) model with ϕ₁ = 0.6 and σ² = 1. The process starts with y₁ = 0.

y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]

B. Produce a time plot for the series. How does the plot change as you change ϕ1?

library(ggplot2)
library(forecast)

set.seed(123)

phis <- c(0.6, 0, 0.3, 0.9)
n <- 100

sim_ar1 <- function(phi, n) {
  y <- ts(numeric(n))
  e <- rnorm(n)
  for (i in 2:n) {
    y[i] <- phi * y[i - 1] + e[i]
  }
  y
}

ys <- lapply(phis, sim_ar1, n = n)
names(ys) <- paste0("phi = ", phis)

y_all <- do.call(cbind, ys)
colnames(y_all) <- names(ys)
y_all <- ts(y_all)

autoplot(y_all, facets = TRUE)

The time plot for the AR(1) series with ϕ₁ = 0.6 shows clear persistence: values tend to move in the same direction for a while, so the series has smooth runs rather than jumping randomly up and down like white noise. Shocks die out gradually instead of disappearing immediately.

As I change ϕ₁, the plot changes in how “persistent” the series looks. When ϕ₁ is close to 0, the series behaves almost like white noise with little visible structure. As ϕ₁ increases toward 1, the series becomes much more persistent and slowly drifting, with long stretches above or below the mean. If ϕ₁ were negative, the plot would show more alternation, with values tending to flip sign from one time point to the next.
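For a quick look at the negative case (reusing the sim_ar1() helper and n defined above; the exact path depends on the seed):

set.seed(123)
autoplot(sim_ar1(-0.8, n)) +
  ggtitle("AR(1) with phi = -0.8: successive values tend to alternate in sign")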

C. Write your own code to generate data from an MA(1) model with θ₁ = 0.6 and σ² = 1.

set.seed(123)

n <- 100
e <- rnorm(n, mean = 0, sd = 1)
y <- ts(numeric(n))

y[1] <- e[1]
for (i in 2:n) {
  y[i] <- e[i] + 0.6 * e[i - 1]
}

y
## Time Series:
## Start = 1 
## End = 100 
## Frequency = 1 
##   [1] -0.56047565 -0.56646288  1.42060182  1.00573338  0.17159277  1.79263763
##   [7]  1.48995520 -0.98851151 -1.44588959 -0.85777368  0.95668462  1.09426291
##  [13]  0.61665975  0.35114559 -0.48943151  1.45340846  1.56999836 -1.66790687
##  [19] -0.47861439 -0.05197787 -1.35149855 -0.85866914 -1.15678940 -1.34449390
##  [25] -1.06237401 -2.06171687 -0.17422894  0.65604534 -1.04611307  0.57093276
##  [31]  1.17875317 -0.03919295  0.71808277  1.41520888  1.34846117  1.18158890
##  [37]  0.96710181  0.27043888 -0.34310969 -0.56404860 -0.92298958 -0.62474147
##  [43] -1.39014672  1.40971815  2.50933558 -0.39833138 -1.07674999 -0.70838625
##  [49]  0.49997191  0.38461000  0.20329707  0.12344435 -0.05999851  1.34288001
##  [55]  0.59539038  1.38100801 -0.63887044 -0.34463793  0.47462249  0.29025412
##  [61]  0.50920442 -0.27453976 -0.63460146 -1.21849981 -1.68293646 -0.33954609
##  [67]  0.63032696  0.32193009  0.95407000  2.60344517  0.73901965 -2.60378758
##  [73] -0.37976280 -0.10575765 -1.11352907  0.61276620  0.33056981 -1.39158152
##  [79] -0.55112715 -0.03010927 -0.07757063  0.38873891 -0.13949179  0.42198053
##  [85]  0.16613937  0.19949003  1.29590819  1.09328490 -0.06482269  0.95324867
##  [91]  1.68278843  1.14449927  0.56776991 -0.48466703  0.98390880  0.21613188
##  [97]  1.82717724  2.84501042  0.68386602 -1.16784112

D. Produce a time plot for the series. How does the plot change as you change θ1?

library(ggplot2)
library(forecast)

set.seed(123)

n      <- 100
thetas <- c(0, 0.3, 0.6, 0.9, -0.6)

sim_ma1 <- function(theta, n) {
  e <- rnorm(n)
  y <- numeric(n)
  y[1] <- e[1]
  for (i in 2:n) {
    y[i] <- e[i] + theta * e[i - 1]
  }
  ts(y)
}

ys <- lapply(thetas, sim_ma1, n = n)
names(ys) <- paste0("theta = ", thetas)

y_mat <- do.call(cbind, ys)
colnames(y_mat) <- names(ys)
y_mat <- ts(y_mat)

autoplot(y_mat, facets = TRUE)

For the MA(1) series with θ₁ = 0.6, the plot still looks like noise, but nearby points tend to move together a bit, so you see small “clumps” rather than completely independent jumps. As θ₁ moves closer to 0, the series looks more like pure white noise; as |θ₁| gets larger, the short-range dependence between neighboring observations becomes more noticeable (and with negative θ₁, you’d see more of an alternating up-down pattern).

E. Generate data from an ARMA(1,1) model with ϕ₁ = 0.6, θ₁ = 0.6 and σ² = 1.

set.seed(123)

y <- arima.sim(
  model = list(ar = 0.6, ma = 0.6),
  n     = 100,
  sd    = 1
)

y <- ts(y)
y
## Time Series:
## Start = 1 
## End = 100 
## Frequency = 1 
##   [1]  0.19189858  1.56854760  2.51112692 -0.16123072 -0.57535282 -0.39718956
##   [7] -1.58981229 -1.81255651 -2.24432330 -2.69108788 -2.67702673 -3.66793291
##  [13] -2.37498869 -0.76894787 -1.50748179 -0.33355631  0.97861939  0.54797868
##  [19]  1.04686998  2.04333087  2.57445970  2.72626472  2.60286064  1.83215526
##  [25]  0.75618347 -0.11033852 -0.98919269 -1.21825708 -2.12110097  0.13705757
##  [31]  2.59157012  1.15661069 -0.38278357 -0.93805640 -0.06286193  0.34689285
##  [37]  0.41143278  0.37030402  0.16218390  1.44019035  1.45950460  2.25671077
##  [43]  0.71515602  0.08445568  0.52529590  0.60543166  0.87246342  0.24893829
##  [49] -0.48523848 -1.50964290 -2.58872220 -1.89277941 -0.50534068  0.01872568
##  [55]  0.96530541  3.18262841  2.64859669 -1.01462956 -0.98854054 -0.69888197
##  [61] -1.53285826 -0.30694875  0.14640056 -1.30374118 -1.33337185 -0.83013239
##  [67] -0.57565006  0.04334887 -0.11348247  0.35389105  0.37847400  0.42657443
##  [73]  1.55185285  2.02439661  1.14981527  1.64313783  2.66867113  2.74570195
##  [79]  2.21519108  0.84444761  1.49057737  1.11047830  2.49346422  4.34108896
##  [85]  3.28851939  0.80527052 -0.84309679 -0.67521830 -0.49769264 -0.79417331
##  [91] -1.63664811 -1.59798773 -1.77071374 -3.20131286 -3.30177940 -1.29020694
##  [97] -0.79807316 -0.21608775 -1.38275677 -1.85594565

F. Generate data from an AR(2) model with ϕ₁ = −0.8, ϕ₂ = 0.3 and σ² = 1. (Note that these parameters will give a non-stationary series.)

set.seed(123)

n <- 100
e <- rnorm(n, mean = 0, sd = 1)
y_ar2 <- numeric(n)

y_ar2[1] <- 0
y_ar2[2] <- 0

for (i in 3:n) {
  y_ar2[i] <- -0.8 * y_ar2[i - 1] + 0.3 * y_ar2[i - 2] + e[i]
}

y_ar2 <- ts(y_ar2)

G. Graph the latter two series and compare them.

library(ggplot2)
library(forecast)

set.seed(123)

y_arma11 <- ts(arima.sim(
  model = list(ar = 0.6, ma = 0.6),
  n     = 100,
  sd    = 1
))

set.seed(123)

n <- 100
e <- rnorm(n, mean = 0, sd = 1)
y_ar2 <- numeric(n)

y_ar2[1] <- 0
y_ar2[2] <- 0

for (i in 3:n) {
  y_ar2[i] <- -0.8 * y_ar2[i - 1] + 0.3 * y_ar2[i - 2] + e[i]
}

y_ar2 <- ts(y_ar2)

series <- ts.union(ARMA_1_1 = y_arma11, AR2 = y_ar2)

autoplot(series, facets = TRUE) +
  ggtitle("ARMA(1,1) vs AR(2) series")

The ARMA(1,1) series fluctuates around a stable mean with moderate, smooth persistence and roughly constant variance, so it looks stationary. In contrast, the AR(2) series with ϕ₁ = −0.8 and ϕ₂ = 0.3 oscillates from one period to the next, and the swings grow in amplitude over time rather than settling down, which is exactly the non-stationary behaviour the exercise warns about.

8.7 Consider wmurders, the number of women murdered each year (per 100,000 standard population) in the United States.

A. By studying appropriate graphs of the series in R, find an appropriate ARIMA(p,d,q) model for these data.

library(fpp2)

autoplot(wmurders)

tsdisplay(wmurders)

# Try differencing
wmurders_d1 <- diff(wmurders)
tsdisplay(wmurders_d1)

wmurders_d2 <- diff(wmurders, differences = 2)
tsdisplay(wmurders_d2)

# Let auto.arima suggest a model
fit <- auto.arima(wmurders)
fit
## Series: wmurders 
## ARIMA(1,2,1) 
## 
## Coefficients:
##           ar1      ma1
##       -0.2434  -0.8261
## s.e.   0.1553   0.1143
## 
## sigma^2 = 0.04632:  log likelihood = 6.44
## AIC=-6.88   AICc=-6.39   BIC=-0.97

B. Should you include a constant in the model? Explain.

No, you shouldn’t include a constant in this model.

With ARIMA(1,2,1), adding a constant corresponds to adding a quadratic deterministic trend to the original wmurders series. The data show no sign of such a trend (the twice-differenced series fluctuates around zero), and extrapolating a quadratic trend would be dangerous for forecasting. Consistent with this, Arima() in the forecast package omits the constant by default once the data are differenced, so the model is fitted without one.

C. Write this model in terms of the backshift operator.

The ARIMA(1,2,1) model without a constant can be written as

\[ (1 - \phi_1 B)(1 - B)^2 y_t = (1 + \theta_1 B)\varepsilon_t. \]
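Expanding \((1 - \phi_1 B)(1 - B)^2\) makes the recursion explicit, which is the form used for the hand check in part E:

\[ y_t = (2 + \phi_1)\,y_{t-1} - (1 + 2\phi_1)\,y_{t-2} + \phi_1\,y_{t-3} + \varepsilon_t + \theta_1\,\varepsilon_{t-1}. \]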

D. Fit the model using R and examine the residuals. Is the model satisfactory?

library(forecast)
library(fpp2)

fit_wm <- Arima(wmurders, order = c(1, 2, 1), include.constant = FALSE)
summary(fit_wm)
## Series: wmurders 
## ARIMA(1,2,1) 
## 
## Coefficients:
##           ar1      ma1
##       -0.2434  -0.8261
## s.e.   0.1553   0.1143
## 
## sigma^2 = 0.04632:  log likelihood = 6.44
## AIC=-6.88   AICc=-6.39   BIC=-0.97
## 
## Training set error measures:
##                       ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.01065956 0.2072523 0.1528734 -0.2149476 4.335214 0.9400996
##                    ACF1
## Training set 0.02176343
checkresiduals(fit_wm)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,2,1)
## Q* = 12.419, df = 8, p-value = 0.1335
## 
## Model df: 2.   Total lags used: 10

The ARIMA(1,2,1) model without a constant has residuals that look like white noise, and the Ljung–Box test is not significant, so there’s no strong evidence of remaining structure. Therefore, the model is satisfactory.

E. Forecast three times ahead. Check your forecasts by hand to make sure that you know how they have been calculated.

library(forecast)
library(fpp2)

fit_wm <- Arima(wmurders, order = c(1, 2, 1), include.constant = FALSE)
fit_wm
## Series: wmurders 
## ARIMA(1,2,1) 
## 
## Coefficients:
##           ar1      ma1
##       -0.2434  -0.8261
## s.e.   0.1553   0.1143
## 
## sigma^2 = 0.04632:  log likelihood = 6.44
## AIC=-6.88   AICc=-6.39   BIC=-0.97
fc <- forecast(fit_wm, h = 3)
fc
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005       2.470660 2.194836 2.746484 2.048824 2.892496
## 2006       2.363106 1.986351 2.739862 1.786908 2.939304
## 2007       2.252833 1.765391 2.740276 1.507354 2.998313
autoplot(fc)

forecast(fit_wm, h = 3) gives the three point forecasts

\[ \hat y_{T+1\mid T},\ \hat y_{T+2\mid T},\ \hat y_{T+3\mid T}. \]

To check them by hand, use the expanded recursion from part C, plug in the last three observations and the last in-sample residual, and set all future errors to zero; the values computed this way should match the three R forecasts up to rounding.
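A minimal sketch of that hand check (assumes fit_wm and fc from above; the last in-sample residual stands in for ε_T):

phi   <- unname(coef(fit_wm)["ar1"])
theta <- unname(coef(fit_wm)["ma1"])

y   <- as.numeric(wmurders)
T   <- length(y)
e_T <- as.numeric(residuals(fit_wm))[T]   # last in-sample residual

# y_t = (2 + phi) y_{t-1} - (1 + 2 phi) y_{t-2} + phi y_{t-3} + e_t + theta e_{t-1},
# with all future errors set to zero
h1 <- (2 + phi) * y[T] - (1 + 2 * phi) * y[T - 1] + phi * y[T - 2] + theta * e_T
h2 <- (2 + phi) * h1   - (1 + 2 * phi) * y[T]     + phi * y[T - 1]
h3 <- (2 + phi) * h2   - (1 + 2 * phi) * h1       + phi * y[T]

cbind(by_hand = c(h1, h2, h3), from_R = as.numeric(fc$mean))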

F. Create a plot of the series with forecasts and prediction intervals for the next three periods shown.

fit_wm <- Arima(wmurders, order = c(1, 2, 1), include.constant = FALSE)
fc_wm  <- forecast(fit_wm, h = 3)

autoplot(fc_wm) +
  ggtitle("wmurders: ARIMA(1,2,1) forecast for 3 years ahead")

G. Does auto.arima() give the same model you have chosen? If not, which model do you think is better?

Yes. For wmurders, auto.arima() also selects an ARIMA(1,2,1) model (without a constant), which matches the model we chose by inspecting the plots and differencing. Since both the graphical analysis and auto.arima() agree, this ARIMA(1,2,1) model is reasonable and I would keep it.

8.8 Consider austa, the total international visitors to Australia (in millions) for the period 1980-2015.

A. Use auto.arima() to find an appropriate ARIMA model. What model was selected? Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

fit_austa <- auto.arima(austa)
fit_austa
## Series: austa 
## ARIMA(0,1,1) with drift 
## 
## Coefficients:
##          ma1   drift
##       0.3006  0.1735
## s.e.  0.1647  0.0390
## 
## sigma^2 = 0.03376:  log likelihood = 10.62
## AIC=-15.24   AICc=-14.46   BIC=-10.57
checkresiduals(fit_austa)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 6, p-value = 0.8905
## 
## Model df: 1.   Total lags used: 7
fc_austa <- forecast(fit_austa, h = 10)
autoplot(fc_austa) +
  ggtitle("austa: 10-year forecasts from auto.arima model")

auto.arima(austa) selects an ARIMA(0,1,1) model with drift. The residuals look like white noise (no significant spikes in the ACF and a Ljung–Box p-value of 0.89), so the model is adequate. The 10-year forecasts continue the series' upward trend, with prediction intervals that gradually widen over time.

B. Plot forecasts from an ARIMA(0,1,1) model with no drift and compare these to part a. Remove the MA term and plot again.

The ARIMA(0,1,1) model with no drift gives forecasts that are essentially flat at about the last observed value, with prediction intervals widening over time; it does not capture the strong upward trend in the data, so its forecasts sit noticeably below those of the ARIMA(0,1,1) with drift from part A.

When the MA term is also removed, leaving ARIMA(0,1,0) (a random walk), the forecasts become exactly a flat line at the last value, with wide, symmetric intervals. Both no-drift models miss the trend and under-forecast, so the ARIMA(0,1,1) with drift from part A clearly provides more realistic forecasts for austa.
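A minimal sketch of this comparison (assumes austa from fpp2 is loaded):

fc_011 <- forecast(Arima(austa, order = c(0, 1, 1), include.drift = FALSE), h = 10)
fc_010 <- forecast(Arima(austa, order = c(0, 1, 0), include.drift = FALSE), h = 10)

autoplot(fc_011) + ggtitle("austa: ARIMA(0,1,1), no drift")
autoplot(fc_010) + ggtitle("austa: ARIMA(0,1,0), no drift")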

C. Plot forecasts from an ARIMA(2,1,3) model with drift. Remove the constant and see what happens.

With drift included, the ARIMA(2,1,3) forecasts continue the upward trend much as in part A, though they can be somewhat more flexible or erratic because the model is quite rich (possible overfitting).

When you remove the constant (no drift), the same ARIMA(2,1,3) structure now produces forecasts that are much flatter – they no longer capture the strong upward trend and tend to level off around the last observed values, while the prediction intervals still widen over time. This shows that the drift term is what drives the long-run upward trend in the forecasts.
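A sketch of this part (assumption: if the default CSS starting values complain about a non-stationary AR part for the no-drift fit, passing method = "ML" avoids that; this is not shown in the output above):

fit_213_drift <- Arima(austa, order = c(2, 1, 3), include.drift = TRUE)
autoplot(forecast(fit_213_drift, h = 10)) +
  ggtitle("austa: ARIMA(2,1,3) with drift")

fit_213_nodrift <- Arima(austa, order = c(2, 1, 3), include.drift = FALSE, method = "ML")
autoplot(forecast(fit_213_nodrift, h = 10)) +
  ggtitle("austa: ARIMA(2,1,3), no drift")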

D. Plot forecasts from an ARIMA(0,0,1) model with a constant. Remove the MA term and plot again.

For an ARIMA(0,0,1) with a constant, the model is an MA(1) process around a nonzero mean. The first forecast makes a small adjustment based on the last error, and from the second step onward the forecasts flatten out at the estimated mean; the prediction intervals are symmetric and stay roughly the same width over time.

When you remove the MA term and fit ARIMA(0,0,0) with a constant, you get pure iid noise around the same mean. The forecasts are again a flat line at the mean with similar prediction intervals. So in both cases the forecast paths look very similar: a horizontal line at the estimated mean with no trend.
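A sketch of this part (include.constant = TRUE with d = 0 simply fits a nonzero mean):

fit_001 <- Arima(austa, order = c(0, 0, 1), include.constant = TRUE)
autoplot(forecast(fit_001, h = 10)) + ggtitle("austa: ARIMA(0,0,1) with constant")

fit_000 <- Arima(austa, order = c(0, 0, 0), include.constant = TRUE)
autoplot(forecast(fit_000, h = 10)) + ggtitle("austa: ARIMA(0,0,0) with constant")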

E. Plot forecasts from an ARIMA(0,2,1) model with no constant.

For an ARIMA(0,2,1) with no constant, the forecasts for austa continue the recent trend in a roughly linear fashion, but the prediction intervals fan out very quickly, becoming much wider than those from the ARIMA(0,1,1) with drift model. So you get an upward-sloping forecast line with rapidly growing uncertainty as the horizon increases.
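And a sketch for this part:

fit_021 <- Arima(austa, order = c(0, 2, 1), include.constant = FALSE)
autoplot(forecast(fit_021, h = 10)) +
  ggtitle("austa: ARIMA(0,2,1), no constant")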

8.9 For the usgdp series:

A. if necessary, find a suitable Box-Cox transformation for the data;

autoplot(usgdp)

lambda <- BoxCox.lambda(usgdp)
lambda
## [1] 0.366352
usgdp_trans <- BoxCox(usgdp, lambda)
autoplot(usgdp_trans)

B. fit a suitable ARIMA model to the transformed data using auto.arima();

fit_usgdp <- auto.arima(usgdp_trans)
fit_usgdp
## Series: usgdp_trans 
## ARIMA(2,1,0) with drift 
## 
## Coefficients:
##          ar1     ar2   drift
##       0.2795  0.1208  0.1829
## s.e.  0.0647  0.0648  0.0202
## 
## sigma^2 = 0.03518:  log likelihood = 61.56
## AIC=-115.11   AICc=-114.94   BIC=-101.26

C. try some other plausible models by experimenting with the orders chosen;

lambda <- BoxCox.lambda(usgdp)
usgdp_trans <- BoxCox(usgdp, lambda)

fit_usgdp     <- auto.arima(usgdp_trans)
fit_usgdp_011 <- Arima(usgdp_trans, order = c(0, 1, 1), include.drift = TRUE)
fit_usgdp_111 <- Arima(usgdp_trans, order = c(1, 1, 1), include.drift = TRUE)
fit_usgdp_210 <- Arima(usgdp_trans, order = c(2, 1, 0), include.drift = TRUE)
fit_usgdp_012 <- Arima(usgdp_trans, order = c(0, 1, 2), include.drift = TRUE)

AIC(fit_usgdp, fit_usgdp_011, fit_usgdp_111, fit_usgdp_210, fit_usgdp_012)
##               df       AIC
## fit_usgdp      4 -115.1110
## fit_usgdp_011  3 -107.1586
## fit_usgdp_111  4 -113.6843
## fit_usgdp_210  4 -115.1110
## fit_usgdp_012  4 -115.8866

D. choose what you think is the best model and check the residual diagnostics;

best_fit <- fit_usgdp_111   # ARIMA(1,1,1) with drift; see the note on the AIC comparison below

summary(best_fit)
## Series: usgdp_trans 
## ARIMA(1,1,1) with drift 
## 
## Coefficients:
##          ar1      ma1   drift
##       0.5100  -0.2094  0.1830
## s.e.  0.1275   0.1384  0.0196
## 
## sigma^2 = 0.0354:  log likelihood = 60.84
## AIC=-113.68   AICc=-113.51   BIC=-99.83
## 
## Training set error measures:
##                        ME      RMSE       MAE          MPE      MAPE      MASE
## Training set 0.0006115127 0.1865553 0.1419865 -0.004819579 0.2646665 0.1792208
##                     ACF1
## Training set -0.01559607
checkresiduals(best_fit)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,1,1) with drift
## Q* = 8.0672, df = 6, p-value = 0.2332
## 
## Model df: 2.   Total lags used: 8

Setting aside ARIMA(0,1,1), the candidate models have AIC values within a couple of units of one another, with ARIMA(0,1,2) with drift nominally lowest. I kept the ARIMA(1,1,1) with drift model examined above: its residual diagnostics show no significant remaining autocorrelation (Ljung–Box p = 0.23) and roughly constant variance, so it appears adequate.

E. produce forecasts of your fitted model. Do the forecasts look reasonable?

The forecasts from the selected ARIMA model show a smooth continuation of the historical growth pattern in US GDP, with prediction intervals that widen gradually over the horizon. They look reasonable and consistent with the past behavior of the series.
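A minimal sketch of those forecasts on the Box–Cox scale used for fitting (assumes best_fit from part D):

usgdp_fc_trans <- forecast(best_fit, h = 20)
autoplot(usgdp_fc_trans) +
  ggtitle("usgdp (Box-Cox scale): ARIMA(1,1,1) with drift forecasts")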

F. compare the results with what you would obtain using ets() (with no transformation).

lambda       <- BoxCox.lambda(usgdp)
usgdp_trans  <- BoxCox(usgdp, lambda)
usgdp_fc_ar  <- forecast(best_fit, h = 10)

usgdp_fc_ar_orig <- usgdp_fc_ar
usgdp_fc_ar_orig$mean  <- InvBoxCox(usgdp_fc_ar$mean,  lambda)
usgdp_fc_ar_orig$lower <- InvBoxCox(usgdp_fc_ar$lower, lambda)
usgdp_fc_ar_orig$upper <- InvBoxCox(usgdp_fc_ar$upper, lambda)

fit_ets   <- ets(usgdp)
fit_ets
## ETS(A,A,N) 
## 
## Call:
## ets(y = usgdp)
## 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 0.278 
## 
##   Initial states:
##     l = 1557.4589 
##     b = 18.6862 
## 
##   sigma:  41.8895
## 
##      AIC     AICc      BIC 
## 3072.303 3072.562 3089.643
usgdp_fc_ets <- forecast(fit_ets, h = 10)

autoplot(usgdp_fc_ets) + ggtitle("ETS forecasts (no transformation)")

The ETS model fitted to the original usgdp also gives smooth, upward forecasts with widening intervals, similar in overall shape to the ARIMA forecasts. However, ETS tends to produce slightly smoother, more purely trend-driven forecasts, while the Box-Cox ARIMA model can show a bit more short-term adjustment because it’s based on the transformed, more nearly stationary series. Overall, both are reasonable; ETS is simpler, ARIMA is more tied to the underlying dynamics.
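To compare the two sets of point forecasts visually on the original scale, a sketch that reuses the back-transformed ARIMA forecasts built above:

autoplot(usgdp) +
  autolayer(usgdp_fc_ar_orig$mean, series = "ARIMA (Box-Cox), back-transformed") +
  autolayer(usgdp_fc_ets$mean, series = "ETS") +
  ylab("US GDP") +
  ggtitle("usgdp: ARIMA vs ETS point forecasts")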