library(fpp3)
library(RColorBrewer)
library(knitr)
library(pracma)
library(cowplot)
Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.
The differences in these figures can be explained by the increasing sample size (n). When n = 36, the sample size is relatively small, so the absolute value of autocorrelation is higher in the leftmost figure than in the other figures. As n increases, the absolute value of the autocorrelation gets smaller and ultimately converges toward 0. All of these figures do indicate that the data are white noise, as even in the leftmost plot, there are no spikes outside the blue bounds of the ACF plot.
The critical values sit at different distances from zero because they are computed as \(\pm 1.96/\sqrt{n}\). When n is relatively small, i.e. 36, the sample autocorrelations of even uncorrelated data vary more around zero than when n is relatively large, i.e. 360 or 1,000: each autocorrelation is an average, and with a small n the relatively small denominator leaves more variation in the estimate. As n increases, the larger denominator stabilises the estimates and the critical values shrink toward zero. The blue bounds of the ACF plot are therefore wider for small n than for large n, meaning a spike must be larger in absolute value before we can distinguish genuinely autocorrelated data from white noise.
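Since the bounds are \(\pm 1.96/\sqrt{n}\), we can compute them directly for the three sample sizes (a quick supplementary check, not shown in the original figure): roughly \(\pm 0.33\), \(\pm 0.10\) and \(\pm 0.06\).
n <- c(36, 360, 1000)
# 95% critical values for the autocorrelations of white noise
round(1.96 / sqrt(n), 3)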
A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
data(gafa_stock)
keep <- c("Symbol", "Date", "Close")
amazon <- gafa_stock |>
select(all_of(keep)) |>
filter(Symbol == "AMZN") |>
mutate(day = row_number()) |>
update_tsibble(index = day, regular = TRUE)
p1 <- amazon |>
autoplot(Close) +
labs(title = "Amazon daily closing stock price")
p2a <- amazon |>
ACF(Close) |>
autoplot() +
labs(subtitle = "ACF: Amazon daily closing stock price")
p2b <- amazon |>
PACF(Close) |>
autoplot() +
labs(subtitle = "PACF: Amazon daily closing stock price")
p1
The time plot of the Amazon daily closing stock price demonstrates the first piece of evidence that the data are non-stationary: there is a clear upward trend visible in the data.
p2 <- plot_grid(p2a, p2b)
p2
The ACF plot of the Amazon daily closing stock price shows two more pieces of evidence indicating the data are non-stationary: the ACF decreases very slowly for this time series, and the value of \(r_1\) is large and positive.
The PACF adds little beyond this: its one notable feature is a single spike at lag 1 that is very close to one, which again reflects the large, positive \(r_1\) and is consistent with a random walk that should be differenced. Otherwise, the PACF is mainly useful for choosing the orders p and q of an ARIMA model once the data have been made stationary (i.e. after differencing).
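As an additional, optional check (not asked for in the exercise), unit-root tests point the same way: the KPSS test should reject stationarity for the closing prices, and unitroot_ndiffs should suggest taking one difference.
amazon |>
features(Close, list(unitroot_kpss, unitroot_ndiffs))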
For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
keep <- c("Country", "Year", "GDP")
turkey_gdp <- global_economy |>
select(all_of(keep)) |>
filter(Country == "Turkey")
p3 <- turkey_gdp |>
autoplot(GDP) +
labs(title = "Annual Turkish GDP in USD")
p3
lam <- turkey_gdp |>
features(GDP, features = guerrero) |>
pull(lambda_guerrero)
lam <- round(lam, 2)
The ideal lambda proposed for a Box-Cox transformation is \(\approx 0.16\), so we will round down and use a log transformation here.
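For reference, the Box-Cox transformation used here is \(w_t = \log(y_t)\) when \(\lambda = 0\) and \(w_t = (y_t^{\lambda} - 1)/\lambda\) otherwise (for a strictly positive series such as GDP), so rounding \(\lambda\) down to zero amounts to taking logs.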
turkey_gdp <- turkey_gdp |>
mutate(Log_GDP = log(GDP))
p4 <- turkey_gdp |>
autoplot(Log_GDP) +
labs(title = "Annual Turkish log(GDP) in USD")
p4
The data are annual, so there can be no seasonality in the plot of log(GDP), but we will confirm the appropriate number of seasonal differences is zero. Then we will determine the appropriate number of first differences for the data.
nsdiffs <- turkey_gdp |>
features(Log_GDP, unitroot_nsdiffs) |>
select(-Country)
colnames(nsdiffs) <- "Appropriate N: Seasonal Differences"
ndiffs <- turkey_gdp |>
features(Log_GDP, unitroot_ndiffs) |>
select(-Country)
colnames(ndiffs) <- "Appropriate N: First Differences"
diffs <- cbind(nsdiffs, ndiffs)
kable(diffs, format = "simple")
| Appropriate N: Seasonal Differences | Appropriate N: First Differences |
|---|---|
| 0 | 1 |
The appropriate number of seasonal differences is in fact zero, and the appropriate number of first differences is one. Now we can difference the data and confirm we’ve made it stationary:
p5 <- turkey_gdp |>
ACF(difference(Log_GDP)) |>
autoplot() +
labs(subtitle = "ACF: Annual Changes in Turkish log(GDP) in USD")
p5
The ACF of the differenced log(GDP) data looks just like that of a white noise series, as there are no spikes outside the blue bounds.
turkey_gdp |>
mutate(diff_Log_GDP = difference(Log_GDP)) |>
features(diff_Log_GDP, ljung_box, lag = 10)
## # A tibble: 1 × 3
## Country lb_stat lb_pvalue
## <fct> <dbl> <dbl>
## 1 Turkey 4.12 0.942
The large p-value from the Ljung-Box test indicates no significant remaining autocorrelation, so the differenced log(GDP) series is indistinguishable from white noise.
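Because the Ljung-Box test addresses autocorrelation rather than stationarity directly, we can also run a KPSS unit-root test on the differenced series as an optional extra check; a large reported p-value would give no evidence against stationarity.
turkey_gdp |>
mutate(diff_Log_GDP = difference(Log_GDP)) |>
features(diff_Log_GDP, unitroot_kpss)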
data(aus_accommodation)
keep <- c("Date", "State", "Takings")
tasmania_takings <- aus_accommodation |>
select(all_of(keep)) |>
filter(State == "Tasmania")
p6 <- tasmania_takings |>
autoplot(Takings) +
labs(title = "Quarterly Takings in Tasmania")
p6
lam <- tasmania_takings |>
features(Takings, features = guerrero) |>
pull(lambda_guerrero)
lam <- round(lam, 3)
The ideal lambda proposed for a Box-Cox transformation is \(\approx 0.002\), so we will round down and use a log transformation again.
tasmania_takings <- tasmania_takings |>
mutate(Log_Takings = log(Takings))
p7 <- tasmania_takings |>
autoplot(Log_Takings) +
labs(title = "Quarterly log(Takings) in Tasmania")
p7
There is strong evidence of seasonality in the plot of log(Takings), so we will determine the appropriate number of seasonal differences, then the appropriate number of first differences for the data.
nsdiffs <- tasmania_takings |>
features(Log_Takings, unitroot_nsdiffs) |>
select(-State)
colnames(nsdiffs) <- "Appropriate N: Seasonal Differences"
ndiffs <- tasmania_takings |>
features(Log_Takings, unitroot_ndiffs) |>
select(-State)
colnames(ndiffs) <- "Appropriate N: First Differences"
diffs <- cbind(nsdiffs, ndiffs)
kable(diffs, format = "simple")
| Appropriate N: Seasonal Differences | Appropriate N: First Differences |
|---|---|
| 1 | 1 |
The appropriate number of seasonal differences is one, and the appropriate number of first differences is one. Now we can doubly difference the data and confirm we’ve made it stationary:
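In backshift notation, the double difference applied below is \((1 - B)(1 - B^{4})\log(y_t)\): a first difference of the seasonal (lag-4) difference, since the data are quarterly.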
p8 <- tasmania_takings |>
transmute(
`Annual Change log(Takings)` = difference(Log_Takings, 4),
`Doubly Diff. log(Takings)` = difference(difference(
Log_Takings, 4), 1)) |>
pivot_longer(-c(Date, State), names_to="Type", values_to="Log_Takings") |>
mutate(Type = factor(Type, levels = c("Annual Change log(Takings)",
"Doubly Diff. log(Takings)"))) |>
ggplot(aes(x = Date, y = Log_Takings)) +
geom_line() +
facet_grid(vars(Type), scales = "free_y") +
labs(title = "Tasmanian Takings", y = NULL)
p8
The data now appear stationary.
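As an optional confirmation, we can check that no further first differencing is suggested for the doubly differenced series; unitroot_ndiffs should return zero here.
tasmania_takings |>
mutate(dd_Log_Takings = difference(difference(Log_Takings, 4), 1)) |>
features(dd_Log_Takings, unitroot_ndiffs)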
data(souvenirs)
p9 <- souvenirs |>
autoplot(Sales) +
labs(title = "Monthly Souvenir Sales")
p9
lam <- souvenirs |>
features(Sales, features = guerrero) |>
pull(lambda_guerrero)
lam <- round(lam, 3)
The ideal lambda proposed for a Box-Cox transformation is again \(\approx 0.002\), so we will round down and use a log transformation yet again.
souvenirs <- souvenirs |>
mutate(Log_Sales = log(Sales))
p10 <- souvenirs |>
autoplot(Log_Sales) +
labs(title = "Monthly Souvenir log(Sales)")
p10
There is strong evidence of seasonality in the plot of log(Sales), so we will determine the appropriate number of seasonal differences, then the appropriate number of first differences for the data.
nsdiffs <- souvenirs |>
features(Log_Sales, unitroot_nsdiffs)
colnames(nsdiffs) <- "Appropriate N: Seasonal Differences"
ndiffs <- souvenirs |>
features(Log_Sales, unitroot_ndiffs)
colnames(ndiffs) <- "Appropriate N: First Differences"
diffs <- cbind(nsdiffs, ndiffs)
kable(diffs, format = "simple")
| Appropriate N: Seasonal Differences | Appropriate N: First Differences |
|---|---|
| 1 | 1 |
The appropriate number of seasonal differences is one, and the appropriate number of first differences is one. Now we can doubly difference the data and confirm we’ve made it stationary:
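Analogously to the Tasmanian series, the double difference applied below can be written as \((1 - B)(1 - B^{12})\log(y_t)\), the seasonal period now being 12 because the data are monthly.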
p11 <- souvenirs |>
transmute(
`Annual Change log(Sales)` = difference(Log_Sales, 12),
`Doubly Diff. log(Sales)` = difference(difference(
Log_Sales, 12), 1)) |>
pivot_longer(-Month, names_to="Type", values_to="Log_Sales") |>
mutate(Type = factor(Type, levels = c("Annual Change log(Sales)",
"Doubly Diff. log(Sales)"))) |>
ggplot(aes(x = Month, y = Log_Sales)) +
geom_line() +
facet_grid(vars(Type), scales = "free_y") +
labs(title = "Souvenir Sales", y = NULL)
p11
The data now appear stationary.
For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
data(aus_retail)
set.seed(1221)
remove <- c("State", "Industry", "Series ID")
my_series <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`, 1)) |>
select(-all_of(remove))
p12 <- my_series |>
autoplot(Turnover) +
labs(title = "My Series: Monthly Turnover")
p12
lam <- my_series |>
features(Turnover, features = guerrero) |>
pull(lambda_guerrero)
lam <- round(lam, 2)
The ideal lambda proposed for a Box-Cox transformation is \(\approx 0.26\), so we will round down and use a fourth root transformation.
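Note that the Box-Cox transformation with \(\lambda = 0.25\) is \((y^{0.25} - 1)/0.25 = 4(y^{1/4} - 1)\), a linear rescaling of the fourth root, so using pracma's nthroot() below leads to the same differencing decisions as box_cox() would.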
my_series <- my_series |>
mutate(Turnover_4thRt = nthroot(Turnover, 4))
p13 <- my_series |>
autoplot(Turnover_4thRt) +
labs(title = "My Series: Monthly Turnover (Fourth Root)")
p13
We will determine the appropriate number of seasonal differences for the transformed data, then the appropriate number of first differences for the transformed data.
nsdiffs <- my_series |>
features(Turnover_4thRt, unitroot_nsdiffs)
colnames(nsdiffs) <- "Appropriate N: Seasonal Differences"
ndiffs <- my_series |>
features(Turnover_4thRt, unitroot_ndiffs)
colnames(ndiffs) <- "Appropriate N: First Differences"
diffs <- cbind(nsdiffs, ndiffs)
kable(diffs, format = "simple")
| Appropriate N: Seasonal Differences | Appropriate N: First Differences |
|---|---|
| 1 | 1 |
The appropriate number of seasonal differences is one, and the appropriate number of first differences is one. Now we can doubly difference the data and confirm we’ve made it stationary:
p14 <- my_series |>
transmute(
`Annual Change Turnover^1/4` = difference(Turnover_4thRt, 12),
`Doubly Diff. Turnover^1/4` = difference(difference(
Turnover_4thRt, 12), 1)) |>
pivot_longer(-Month, names_to="Type", values_to="Turnover_4thRt") |>
mutate(Type = factor(Type, levels = c("Annual Change Turnover^1/4",
"Doubly Diff. Turnover^1/4"))) |>
ggplot(aes(x = Month, y = Turnover_4thRt)) +
geom_line() +
facet_grid(vars(Type), scales = "free_y") +
labs(title = "My Series: Turnover", y = NULL)
p14
The data now appear stationary.
Simulate and plot some data from simple ARIMA models.
y <- numeric(100)
e <- rnorm(100)
# simulate an AR(1) process: y_t = 0.6 * y_{t-1} + e_t
for (i in 2:100){
y[i] <- (0.6 * y[i-1]) + e[i]
}
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
p15 <- sim |>
autoplot(y) +
labs(title = "Sim 1: AR(1) with phi1 = 0.6")
p15
phis <- seq(from = -0.9, to = 0.9, by = 0.1)
sim_df <- data.frame()
for (j in seq_along(phis)){
phi <- phis[j]
y <- numeric(100)
e <- rnorm(100)
# simulate an AR(1) process with this phi: y_t = phi * y_{t-1} + e_t
for (i in 2:100){
y[i] <- (phi * y[i-1]) + e[i]
}
new_rows <- data.frame(idx = seq_len(100), phi_val = phi, y = y)
sim_df <- rbind(sim_df, new_rows)
}
sim_ts <- sim_df |>
as_tsibble(index = idx, key = phi_val)
p16 <- sim_ts |>
autoplot(y) +
facet_wrap(~ phi_val, ncol = 5) +
theme_minimal() +
theme(legend.position = "none") +
labs(title = expression(paste("Repeated Simulations with Various ", phi[1],
" Values")))
p16
Changing \(\phi_1\) from \(-0.9\) to \(0.9\) in steps of \(0.1\) produces quite different time series patterns. For \(-0.9 \le \phi_1 < 0\), the series oscillate around the mean, and the swings are largest and most frequent when \(\phi_1 = -0.9\). For \(0 \le \phi_1 \le 0.9\), the oscillation disappears: the series become smoother and more persistent, wandering away from the mean for longer stretches as \(\phi_1\) approaches \(0.9\) (at \(\phi_1 = 0\) the series is simply white noise). Because \(|\phi_1| < 1\) for every value considered here, each series is stationary, and more recent observations carry more weight than observations from the distant past.
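These patterns follow from the theoretical autocorrelation function of an AR(1) process, \(\rho_k = \phi_1^{k}\): a negative \(\phi_1\) makes the autocorrelations alternate in sign, producing the rapid oscillation, while a \(\phi_1\) close to one gives positive, slowly decaying autocorrelations and hence smooth, wandering series.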
y <- numeric(100)
e <- rnorm(100)
# simulate an MA(1) process: y_t = 0.6 * e_{t-1} + e_t
for (i in 2:100){
y[i] <- (0.6 * e[i-1]) + e[i]
}
sim2 <- tsibble(idx = seq_len(100), y = y, index = idx)
p17 <- sim2 |>
autoplot(y) +
labs(title = "Sim 2: MA(1) with theta1 = 0.6")
p17
thetas <- seq(from = -0.9, to = 0.9, by = 0.1)
sim_df2 <- data.frame()
for (j in seq_along(thetas)){
theta <- thetas[j]
y <- numeric(100)
e <- rnorm(100)
# simulate an MA(1) process with this theta: y_t = theta * e_{t-1} + e_t
for (i in 2:100){
y[i] <- (theta * e[i-1]) + e[i]
}
new_rows <- data.frame(idx = seq_len(100), theta_val = theta, y = y)
sim_df2 <- rbind(sim_df2, new_rows)
}
sim_ts2 <- sim_df2 |>
as_tsibble(index = idx, key = theta_val)
p18 <- sim_ts2 |>
autoplot(y) +
facet_wrap(~ theta_val, ncol = 5) +
theme_minimal() +
theme(legend.position = "none") +
labs(title = expression(paste("Repeated Simulations with Various ", theta[1],
" Values")))
p18
Changing \(\theta_1\) from \(-0.9\) to \(0.9\) in steps of \(0.1\) also changes the appearance of the series, though less dramatically than changing \(\phi_1\) did. The more negative \(\theta_1\) is, the more frequently the data oscillate back and forth around the mean; positive values give slightly smoother series. Because \(|\theta_1| < 1\) for every value considered here, each MA(1) process is invertible, so when it is written in AR(\(\infty\)) form the most recent observations again carry more weight than observations from the distant past.
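For an MA(1) process the theoretical autocorrelations are \(\rho_1 = \theta_1/(1 + \theta_1^{2})\) and \(\rho_k = 0\) for \(k > 1\), so the effect of \(\theta_1\) is confined to adjacent observations, which is why the differences between panels are milder than in the AR(1) simulations.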
y <- numeric(100)
e <- rnorm(100)
# simulate an ARMA(1,1) process: y_t = 0.6 * y_{t-1} + 0.6 * e_{t-1} + e_t
for (i in 2:100){
y[i] <- (0.6 * y[i-1]) + (0.6 * e[i-1]) + e[i]
}
sim3 <- tsibble(idx = seq_len(100), y = y, index = idx)
y <- numeric(100)
e <- rnorm(100)
# simulate an AR(2) process: y_t = -0.8 * y_{t-1} + 0.3 * y_{t-2} + e_t
for (i in 3:100){
y[i] <- (-0.8 * y[i-1]) + (0.3 * y[i-2]) + e[i]
}
sim4 <- tsibble(idx = seq_len(100), y = y, index = idx)
p19a <- sim3 |>
autoplot(y) +
labs(title = "Sim 3: ARMA(1,1), phi1 = theta1 = 0.6")
p19b <- sim4 |>
autoplot(y) +
labs(title = "Sim 4: AR(2), phi1 = -0.8, phi2 = 0.3")
p19 <- plot_grid(p19a, p19b)
p19
The ARMA(1,1) model (Sim 3) is stationary: \(|\phi_1| = 0.6 < 1\), so the series fluctuates around a constant mean and more recent observations carry more weight than observations from the distant past. The AR(2) model (Sim 4) is non-stationary: its coefficients violate the stationarity conditions for an AR(2) process. Because \(\phi_1 < 0\) we see oscillation around the mean, and the swings back and forth grow larger and larger.
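As a quick check of the non-stationarity claim (an optional calculation, not part of the original answer), the stationarity conditions for an AR(2) require \(\phi_2 - \phi_1 < 1\), but here \(0.3 - (-0.8) = 1.1 > 1\). Equivalently, one root of the characteristic polynomial \(1 - \phi_1 z - \phi_2 z^{2}\) lies inside the unit circle:
# moduli of the roots of 1 + 0.8 z - 0.3 z^2; stationarity requires all > 1
Mod(polyroot(c(1, 0.8, -0.3)))
The moduli are roughly 0.93 and 3.59, so one root lies inside the unit circle and the process is non-stationary.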
Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.
data(aus_airpassengers)
p20 <- aus_airpassengers |>
autoplot(Passengers)
p20
lam <- aus_airpassengers |>
features(Passengers, features = guerrero) |>
pull(lambda_guerrero)
lam <- round(lam, 2)
The proposed lambda for a Box-Cox transformation is \(\approx -0.18\), so a log transformation would probably be appropriate, but we will skip it so the following questions are less complicated to answer as written.
cols <- c("stepwise", "search")
fit <- aus_airpassengers |>
model(stepwise = ARIMA(Passengers),
search = ARIMA(Passengers, stepwise=FALSE))
pivot_fit <- fit |>
pivot_longer(cols = all_of(cols), names_to = "Model Name",
values_to = "Orders")
kable(pivot_fit, format = "simple")
| Model Name | Orders |
|---|---|
| stepwise | <ARIMA(0,2,1)> |
| search | <ARIMA(0,2,1)> |
glance_fit <- glance(fit) |>
arrange(AICc) |>
select(.model:BIC)
kable(glance_fit, format = "simple")
| .model | sigma2 | log_lik | AIC | AICc | BIC |
|---|---|---|---|---|---|
| stepwise | 4.307764 | -97.01896 | 198.0379 | 198.3236 | 201.6512 |
| search | 4.307764 | -97.01896 | 198.0379 | 198.3236 | 201.6512 |
Both the default stepwise ARIMA() procedure and the more extensive search procedure selected the same model: ARIMA(0,2,1).
We check that the residuals look like white noise.
fit |>
select(stepwise) |>
gg_tsresiduals()
The residuals do in fact look like white noise.
We plot forecasts for the next 10 periods.
p22a <- fit |>
forecast(h=10) |>
filter(.model == "stepwise") |>
autoplot(aus_airpassengers) +
labs(subtitle = "AUS Air Passengers: ARIMA(0,2,1) Forecasts")
p22a
In terms of the backshift operator, the selected ARIMA(0,2,1) model can be written as \((1 - B)^2y_t = (1 + \theta_1B)\epsilon_t\).
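Expanding the backshift polynomials (a small check of the algebra), the same model can be written as \(y_t = 2y_{t-1} - y_{t-2} + \epsilon_t + \theta_1\epsilon_{t-1}\).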
fit <- aus_airpassengers |>
model(arima010c = ARIMA(Passengers ~ 1 + pdq(0,1,0)))
p22b <- fit |>
forecast(h=10) |>
filter(.model == "arima010c") |>
autoplot(aus_airpassengers) +
labs(subtitle = "AUS Air Passengers: ARIMA(0,1,0) w/ Drift Forecasts")
p22b
The forecasts from the ARIMA(0,1,0) with drift model are very similar to those from the ARIMA(0,2,1) model, though slightly less optimistic, because the drift term extrapolates the average historical growth rather than the steeper recent growth. The prediction intervals from the ARIMA(0,2,1) model are wider than those from the ARIMA(0,1,0) with drift model.
fit <- aus_airpassengers |>
model(arima212c = ARIMA(Passengers ~ 1 + pdq(2, 1, 2)),
arima212 = ARIMA(Passengers ~ 0 + pdq(2, 1, 2)),
arima021c = ARIMA(Passengers ~ 1 + pdq(0, 2, 1)))
p23 <- fit |>
forecast(h=10) |>
filter(.model == "arima212c") |>
autoplot(aus_airpassengers) +
labs(subtitle = "AUS Air Passengers: ARIMA(2,1,2) w/ Drift Forecasts")
p23
The ARIMA(2,1,2) with drift forecasts aren't very different from those of the simpler models, but the extra AR and MA terms introduce a slight wobble into the first few point forecasts and their prediction intervals.
p24 <- fit |>
forecast(h=10) |>
filter(.model == "arima212") |>
autoplot(aus_airpassengers) +
labs(subtitle = "AUS Air Passengers: ARIMA(2,1,2) Forecasts")
p24
Removing the constant results in a NULL model: ARIMA() is unable to estimate an ARIMA(2,1,2) without a constant for these data, so no forecasts can be produced or plotted.
p25 <- fit |>
forecast(h=10) |>
filter(.model == "arima021c") |>
autoplot(aus_airpassengers) +
labs(subtitle = "AUS Air Passengers: ARIMA(0,2,1) w/ Drift Forecasts")
p25
The forecasts become more optimistic: when a constant is included with two orders of differencing, the long-run forecasts follow a quadratic trend. This is why we've been warned that including a constant when the order of differencing is \(\ge 2\) is dangerous when forecasting.
For the United States GDP series (from global_economy):
keep <- c("Country", "Year", "GDP")
us_gdp <- global_economy |>
select(all_of(keep)) |>
filter(Country == "United States")
p26 <- us_gdp |>
autoplot(GDP) +
labs(title = "Annual United States GDP in USD")
p26
lam <- us_gdp |>
features(GDP, features = guerrero) |>
pull(lambda_guerrero)
lam <- round(lam, 2)
With a proposed lambda of \(0.28\), we round down to \(0.25\), a fourth root transformation.
lam <- 0.25
fit1 <- us_gdp |>
model(stepwise = ARIMA(box_cox(GDP, lam)),
search = ARIMA(box_cox(GDP, lam), stepwise = FALSE))
pivot_fit <- fit1 |>
pivot_longer(-Country, names_to = "Model Name",
values_to = "Orders")
kable(pivot_fit, format = "simple")
| Country | Model Name | Orders |
|---|---|---|
| United States | stepwise | <ARIMA(1,1,0) w/ drift> |
| United States | search | <ARIMA(1,1,0) w/ drift> |
glance_fit <- glance(fit1) |>
arrange(AICc) |>
select(.model:BIC)
kable(glance_fit, format = "simple")
| .model | sigma2 | log_lik | AIC | AICc | BIC |
|---|---|---|---|---|---|
| stepwise | 839.3346 | -271.8562 | 549.7123 | 550.1652 | 555.8415 |
| search | 839.3346 | -271.8562 | 549.7123 | 550.1652 | 555.8415 |
fit2 <- us_gdp |>
model(arima011 = ARIMA(box_cox(GDP, lam) ~ 1 + pdq(0,1,1)),
arima211 = ARIMA(box_cox(GDP, lam) ~ 1 + pdq(2,1,1)))
pivot_fit <- fit2 |>
pivot_longer(-Country, names_to = "Model Name",
values_to = "Orders")
kable(pivot_fit, format = "simple")
| Country | Model Name | Orders |
|---|---|---|
| United States | arima011 | <ARIMA(0,1,1) w/ drift> |
| United States | arima211 | <ARIMA(2,1,1) w/ drift> |
glance_fit <- glance(fit2) |>
arrange(AICc) |>
select(.model:BIC)
kable(glance_fit, format = "simple")
| .model | sigma2 | log_lik | AIC | AICc | BIC |
|---|---|---|---|---|---|
| arima011 | 876.2163 | -273.0490 | 552.0979 | 552.5508 | 558.2271 |
| arima211 | 861.4592 | -271.5637 | 553.1274 | 554.3039 | 563.3427 |
The stepwise model produced by ARIMA() had the lowest AICc (identical to the model produced by the more exhaustive search procedure), so that model is the best. We check the residual diagnostics.
fit1 |>
select(stepwise) |>
gg_tsresiduals()
The residuals look like white noise.
p27 <- fit1 |>
forecast(h=10) |>
filter(.model == "stepwise") |>
autoplot(us_gdp) +
labs(subtitle = "US GDP: ARIMA(1,1,0) w/ Drift Forecasts")
p27
Yes, the forecasts look reasonable.
compare <- us_gdp |>
slice(-n()) |>
stretch_tsibble(.init = 10) |>
model(ETS(GDP),
ARIMA(box_cox(GDP, lam))) |>
forecast(h = 1) |>
accuracy(us_gdp) |>
select(.model, RMSE:MAPE)
kable(compare, format = "simple")
| .model | RMSE | MAE | MPE | MAPE |
|---|---|---|---|---|
| ARIMA(box_cox(GDP, lam)) | 180880178014 | 114151663572 | 0.0725540 | 1.679234 |
| ETS(GDP) | 190604101250 | 126812323851 | 0.5935103 | 1.731149 |
fit3 <- us_gdp |>
model(ETS = ETS(GDP))
p28 <- fit3 |>
forecast(h=10) |>
filter(.model == "ETS") |>
autoplot(us_gdp) +
labs(subtitle = "US GDP: ETS Forecasts")
p28
ETS() is less accurate by comparison, with higher RMSE and MAE in the cross-validation above. Its forecasts are less optimistic and its prediction intervals are much wider, although ARIMA prediction intervals tend to be too narrow in any case because they do not account for all sources of uncertainty.