Explain the differences in Figure 9.32. Do they all indicate that the data are white noise?
Answer: The plots differ by the magnitude of the critical values (dotted blue lines) as well as the magnitude of the spikes. Specifically, these magnitudes decrease as the number of random numbers increases.
Yes, all the ACF plots suggest the data are white noise because the ACF values drop to near zero quickly and none of the spikes exceed the critical values. In addition, there is no obvious pattern in the spikes.
Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?
Answer: The critical values decrease as the number of random numbers increases because the critical values for a 95% confidence interval in an ACF plot correspond to \(\pm 1.96 / \sqrt{T}\), where \(T\) is the number of observations. This behavior underscores the importance of analyzing large(r) samples when making statistical inferences.
The autocorrelations are different in each plot because the numbers are random (technically, pseudo-random), so each plot corresponds to a different set of data, which have different autocorrelations.
Plot the daily closing prices for Amazon stock (contained in
gafa_stock), along with the ACF and PACF. Explain how each
plot shows that the series is non-stationary and should be
differenced.
Answer:
The time series plot (top) shows an upward trend, which by definition of stationarity means that the data are not stationary.
The ACF plot (lower left) shows that ACF values decrease slowly as lag increases, but remain highly positive, which indicates that the data are non-stationary and need differencing.
The partial ACF plot (lower right) has a large signiicant spike at lag=1 while spikes at greater lags are much smaller and mostly non-significant. This suggests that an AR(1) model of the data would be suitable; however, the data would first need to be made stationary (eg, by differencing).
gafa_stock %>%
filter(Symbol == 'AMZN') %>%
gg_tsdisplay(Close, plot_type = 'partial') +
ggtitle('Amazon stock closing price on irregular trading days, 2014-2019')
Consistent with these inferences of non-stationarity and need for differencing, the KPSS unit root test shows that first-order differencing is sufficient to achieve stationarity.
# KPSS test
gafa_stock %>%
filter(Symbol == 'AMZN') %>%
features(Close, unitroot_ndiffs) %>%
pull(ndiffs)
## [1] 1
For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
global_economyAnswer: A Box-Cox transformation with \(\lambda = 0.15\) and first-order differencing is needed to obtain stationary data.
Explanation:
The time series plot (top) shows that the GDP of Turkey has an upward trend, so the untransformed time series is not stationary.
The variance does not appear to be constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.
The ACF plot (lower left) shows that ACF values decrease slowly as lag increases, which confirms the non-stationarity and need for differencing.
The partial ACF plot (lower right) has a large significant spike at lag=1 while spikes at greater lags are much smaller and are non-significant. This suggests that an AR(1) model of the data would be suitable; however, the data would need to be differenced to make it stationary.
global_economy %>%
filter(Country == 'Turkey') %>%
gg_tsdisplay(GDP, plot_type = 'partial') +
ggtitle('GDP of Turkey, 1960-2017')
The optimal value of lambda for Box-Cox transformation is 0.15, which suggests that a power transformation could normalize the data and stabilize its variance.
# Calculate lambda
lambda <- global_economy %>%
filter(Country == 'Turkey') %>%
features(GDP, features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.1572187
The KPSS unit root test indicates that first-order differencing of the transformed data is sufficient to achieve stationarity.
global_economy2 <- global_economy %>%
filter(Country == 'Turkey') %>%
mutate(
transformed_GDP = box_cox(GDP, lambda)
)
global_economy2 %>%
features(transformed_GDP, unitroot_ndiffs) %>%
pull(ndiffs)
## [1] 1
After transformation and differencing, the data are closer to stationarity. Specifically, the time series plot no longer has a trend, and the ACF plot does not have any pattern or significant spikes that exceed the critical values.
# Differencing creates NAs at beginning of series; I suppressed the warnings about
# missing values
global_economy2 %>%
mutate(
fd = difference(transformed_GDP)
) %>%
gg_tsdisplay(fd, plot_type='partial') +
labs(
y = 'FD(BoxCox(GDP))',
title = 'Differenced transformed GDP data',
subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
'$\\lambda$=', round(lambda, 2))),
caption = 'FD, first difference.') +
theme(
plot.caption = element_text(color = "darkgray", hjust = 0)
)
aus_accomodationAnswer: A Box-Cox transformation with \(\lambda = 0\) and first-order differencing is needed to make the data more stationary.
Explanation:
The time series plot (top) shows that the accommodations data have seasonality (quarterly period) and an upward trend, so the untransformed time series is not stationary.
It can also be seen that the variance is not constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.
The ACF plot (lower left) exhibits a sinusoidal pattern, which confirms the seasonality and non-stationarity.
The partial ACF plot (lower right) has large significant spikes up to lag=6, and then decrease and remain below the critical values thereafter. This supports using a higher order autoregressive model of the data; however, the data would need to be differenced to make it stationary.
aus_accommodation %>%
filter(State == 'Tasmania') %>%
gg_tsdisplay(Takings, plot_type = 'partial') +
ggtitle('Tourist Accommodations in Tasmania Australia, 1998-2016')
The optimal value of lambda for Box-Cox transformation is near zero, which suggests a log transformation could normalize the data and stabilize its variance.
# Calculate lambda
lambda <- aus_accommodation %>%
filter(State == 'Tasmania') %>%
features(Takings, features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.001819643
The KPSS unit root test indicates that first-order differencing of the transformed data is sufficient to achieve stationarity.
aus_accommodation2 <- aus_accommodation %>%
filter(State == 'Tasmania') %>%
mutate(
transformed_takings = box_cox(Takings, lambda)
)
aus_accommodation2 %>%
features(transformed_takings, unitroot_nsdiffs) %>%
pull(nsdiffs)
## [1] 1
After transformation and differencing, the data are closer to stationarity. The time series plot no longer exhibits seasonality or a trend. The ACF values drop below the critical values more rapidly, which is characteristic of stationarity. However, there still appears to be a sinusoidal pattern, so I am not sure whether the differenced, transformed data are completely stationary.
aus_accommodation2 %>%
mutate(
# lag = 4 since data are seasonal with quarterly period
fd = difference(transformed_takings, lag = 4)
) %>%
gg_tsdisplay(fd, plot_type='partial') +
labs(
y = 'FD(BoxCox(Takings))',
title = 'Differenced transformed accommodations data',
subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
'$\\lambda$$\\approx$', round(lambda, 2))),
caption = 'FD, first difference.') +
theme(
plot.caption = element_text(color = "darkgray", hjust = 0)
)
souvenirsAnswer: A Box-Cox transformation with \(\lambda = 0\) and first-order differencing is needed to obtain stationary data.
Explanation:
The time series plot (top) shows that souvenir sales trended upward and that fluctuations in sales vary with the level of the series. The seasonality period was unclear to me in this plot, so I looked at the subseries plots, which more clearly show that the seasonality has a monthly period. Together, these observations indicate that the untransformed time series is not stationary.
It can also be seen that the variance is not constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.
The ACF plot (lower left) has a sinusoidal pattern, which confirms the seasonality and nonstationarity.
The partial ACF plot (lower right) has large significant spikes up to lag=13. This suggests using a higher order autoregressive model of the data; however, the data would need to be differenced to make it stationary.
souvenirs %>%
gg_tsdisplay(Sales, plot_type = 'partial') +
ggtitle('Souvenir Sales, 1987-1993')
ggtime::gg_subseries(souvenirs, Sales)
The optimal value of lambda for Box-Cox transformation is near zero, which suggests a log transformation could normalize the data and stabilize its variance.
# Calculate lambda
lambda <- souvenirs %>%
features(Sales, features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.002118221
The KPSS unit root test indicates that first-order differencing of the transformed data is sufficient to achieve stationarity.
souvenirs2 <- souvenirs %>%
mutate(
transformed_sales = box_cox(Sales, lambda)
)
souvenirs2 %>%
features(transformed_sales, unitroot_nsdiffs) %>%
pull(nsdiffs)
## [1] 1
After transformation and differencing, the data are more stationary. The time series plot no longer exhibits seasonality or a trend. The absence of a sinusoidal pattern is consistent with the removal of seasonality, and ACF values approach zero more rapidly, which is characteristic of stationarity.
souvenirs2 %>%
mutate(
# lag = 12 since data are seasonal with monthly period
fd = difference(transformed_sales, lag = 12)
) %>%
gg_tsdisplay(fd, plot_type='partial') +
labs(
y = 'FD(BoxCox(Sales))',
title = 'Differenced transformed sales data',
subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
'$\\lambda$$\\approx$', round(lambda, 2))),
caption = 'FD, first difference.') +
theme(
plot.caption = element_text(color = "darkgray", hjust = 0)
)
For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
Answer: A Box-Cox transformation with \(\lambda = 0\) and first-order differencing is needed to obtain stationary data.
Explanation:
The time series plot (top) shows that retail turnover trended upward. In addition, the subseries plot shows the seasonality has a monthly period. Together, the trend and seasonality indicate that the untransformed time series is not stationary.
The variance does not appear to be constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.
The ACF plot (lower left) has a sinusoidal pattern, which confirms the seasonality and nonstationarity.
The partial ACF plot (lower right) has large significant spikes up to lag=13. This suggests using a higher order autoregressive model of the data; however, the data would need to be differenced to make it stationary.
# Generate the same series used previously
set.seed(144)
retail_ts <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID` , 1))
retail_ts %>%
# Due to overplotting, I only visualized the series after 2010
filter(year(Month) >= 2010) %>%
gg_tsdisplay(Turnover, plot_type = 'partial') +
labs(
y = 'Turnover (AUD, million)',
title = 'Retail turnover in Australia, 2010-2018'
)
ggtime::gg_subseries(retail_ts, Turnover)
The optimal value of lambda for Box-Cox transformation is 0.22, which suggests that a power transformation could normalize the data and stabilize its variance.
# Calculate lambda
lambda <- retail_ts %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.2229869
The KPSS unit root test indicates that first-order differencing of the transformed data is sufficient to achieve stationarity.
retail_ts <- retail_ts %>%
mutate(
transformed_turnover = box_cox(Turnover, lambda)
)
retail_ts %>%
features(transformed_turnover, unitroot_nsdiffs) %>%
pull(nsdiffs)
## [1] 1
After transformation and differencing, the data are closer to stationarity. The time series plot no longer exhibits seasonality or a trend. The absence of a sinusoidal pattern is consistent with the removal of seasonality, and ACF values approach zero more rapidly, which is characteristic of stationary data.
retail_ts %>%
mutate(
# lag = 12 since data are seasonal with monthly period
fd = difference(transformed_turnover, lag = 12)
) %>%
# Due to overplotting, I only visualized the series after 2010
filter(year(Month) >= 2010) %>%
gg_tsdisplay(fd, plot_type='partial') +
labs(
y = 'FD(BoxCox(Turnover))',
title = 'Differenced transformed retail turnover data',
subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
'$\\lambda$$\\approx$', round(lambda, 2))),
caption = 'FD, first difference.') +
theme(
plot.caption = element_text(color = "darkgray", hjust = 0)
)
Use the following R code to generate data from an AR(1) model with \(\phi_1 = 0.6\) and \(\sigma^2 = 1\). The process starts with \(y_1 = 0\).
# To streamline code for part (b), I created a function using the code provided
# in the textbook. The function generates data from an AR(1) model with the specified
# value for phi and returns the data as a tsibble.
# I also defined a seed value for reproducibility.
generate_AR1_model <- function(phi) {
set.seed(100)
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100) {
y[i] <- phi * y[i-1] + e[i]
}
data_ts <- tsibble(idx = seq_len(100), y = y, index = idx)
return(data_ts)
}
sim <- generate_AR1_model(phi = 0.6)
Produce a time plot for the series. How does the plot change as you change \(\phi_1\)?
Answer: The facet plot below shows the effect of five different values of \(\phi\) on the AR(1) model. When \(\phi = -1\), the time series exhibits an oscillating behavior (ie, values change from positive to negative and vice versa at every index), and the magnitude of these changes increase with the level of the series. As a result, the time series is not stationary.
As \(\phi\) increases to 0, this behavior becomes less extreme and the series looks more like stationary white noise (ie, zero mean, constant variance, no pattern). However, as \(\phi\) approaches 1, the time series loses stationarity, as shown by the shift of the mean away from zero and non-constant variance.
# I generated data for 4 additional values of phi and then
# combined all 5 datasets into a single tibble with the
# phi value as a label (factor)
sim_AR1_tbl <- sim %>%
mutate(
phi = 'phi = 0.6'
) %>%
as_tibble()
sim2_AR1_tbl <- generate_AR1_model(phi = -1) %>%
mutate(
phi = 'phi = -1'
) %>%
as_tibble()
sim3_AR1_tbl <- generate_AR1_model(phi = -0.6) %>%
mutate(
phi = 'phi = -0.6'
) %>%
as_tibble()
sim4_AR1_tbl <- generate_AR1_model(phi = 0) %>%
mutate(
phi = 'phi = 0'
) %>%
as_tibble()
sim5_AR1_tbl <- generate_AR1_model(phi = 1) %>%
mutate(
phi = 'phi = 1'
) %>%
as_tibble()
sim_AR1_combined <- bind_rows(sim_AR1_tbl, sim2_AR1_tbl, sim3_AR1_tbl,
sim4_AR1_tbl, sim5_AR1_tbl)
sim_AR1_combined$phi <- factor(sim_AR1_combined$phi,
levels = c('phi = -1', 'phi = -0.6', 'phi = 0',
'phi = 0.6', 'phi = 1'))
# Facet plot
ggplot(sim_AR1_combined, aes(x = idx, y = y)) +
geom_line() +
facet_wrap(~ phi, scales = 'free_y', ncol = 3, nrow = 2) +
ggtitle(latex2exp::TeX('Effect of different values of $\\phi_1$ on AR(1) model')) +
theme(
axis.title = element_text(face = "bold")
)
Write your own code to generate data from an MA(1) model with \(\theta_1 = 0.6\) and \(\sigma^2 = 1\).
# Analogous to part (a)
generate_MA1_model <- function(theta) {
set.seed(100)
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100) {
y[i] <- theta * e[i-1] + e[i]
}
data_ts <- tsibble(idx = seq_len(100), y = y, index = idx)
return(data_ts)
}
Produce a time plot for the series. How does the plot change as you change \(\theta_1\)?
Answer: The facet plot below shows the effect of five different values of \(\theta\) on the MA(1) model. In general, \(\theta\) appears to affect the intensity of oscillatory behavior, such that the series is more “compressed” (ie, more rapid changes) when \(\theta < 0\) than when \(\theta > 0\). When \(\theta = 0\), the MA(1) time series appears to be most like white noise.
sim1_MA1_tbl <- generate_MA1_model(theta = -1) %>%
mutate(
theta = 'theta = -1'
) %>%
as_tibble()
sim2_MA1_tbl <- generate_MA1_model(theta = -0.6) %>%
mutate(
theta = 'theta = -0.6'
) %>%
as_tibble()
sim3_MA1_tbl <- generate_MA1_model(theta = 0) %>%
mutate(
theta = 'theta = 0'
) %>%
as_tibble()
sim4_MA1_tbl <- generate_MA1_model(theta = 0.6) %>%
mutate(
theta = 'theta = 0.6'
) %>%
as_tibble()
sim5_MA1_tbl <- generate_MA1_model(theta = 1) %>%
mutate(
theta = 'theta = 1'
) %>%
as_tibble()
sim_MA1_combined <- bind_rows(sim1_MA1_tbl, sim2_MA1_tbl, sim3_MA1_tbl,
sim4_MA1_tbl, sim5_MA1_tbl)
sim_MA1_combined$theta <- factor(sim_MA1_combined$theta,
levels = c('theta = -1', 'theta = -0.6', 'theta = 0',
'theta = 0.6', 'theta = 1'))
# Facet plot
ggplot(sim_MA1_combined, aes(x = idx, y = y)) +
geom_line() +
facet_wrap(~ theta, scales = 'free_y', ncol = 3, nrow = 2) +
ggtitle(latex2exp::TeX('Effect of different values of $\\theta_1$ on MA(1) model')) +
theme(
axis.title = element_text(face = "bold")
)
Generate data from an ARMA(1, 1) model with \(\phi_1 = 0.6\), \(\theta_1 = 0.6\), and \(\sigma^2 = 1\).
set.seed(100)
y <- numeric(100)
e <- rnorm(100)
for (i in 2:100) {
y[i] <- 0.6 * e[i-1] + 0.6 * y[i-1] + e[i]
}
ARMA1_1 <- tsibble(idx = seq_len(100), y = y, index = idx)
Generate data from an AR(2) model with \(\phi_1 = -0.8\), \(\phi_2 = 0.3\), and \(\sigma^2 = 1\). (Note that these parameters will give a non-stationary series).
set.seed(100)
y <- numeric(100)
e <- rnorm(100)
for (i in 3:100) {
y[i] <- -0.8 * y[i-1] + 0.3 * y[i-2] + e[i]
}
AR2 <- tsibble(idx = seq_len(100), y = y, index = idx)
Graph the latter two series and compare them.
Answer: The behavior of the ARMA(1, 1) model is more stable over time than the AR(2) model, which exhibits increasingly extreme oscillations over time. This is not surprising because the ARMA(1, 1) model, which is a linear combination of the AR(1) model with \(\phi_1 = 0.6\) (see plot in part (b) of this exercise) and the MA(1) model with \(\theta_1 = 0.6\) (see plot in part (d) of this exercise), includes a moving average. In contrast, the AR(2) model only accounts for the past two observations (one of which is weighted more than twice as much as the other and with opposite sign, which exacerbates the oscillatory behavior).
autoplot(ARMA1_1, y) +
ggtitle(latex2exp::TeX('ARMA(1,1) model with $\\phi_1 = 0.6$ and $\\theta_1 = 0.6$'))
autoplot(AR2, y) +
ggtitle(latex2exp::TeX('AR(2) model with $\\phi_1 = -0.8$ and $\\phi_2 = 0.3$'))
Consider aus_airpassengers , the total number of
passengers (in millions) from Australian air carriers for the period
1970-2011.
Use ARIMA() to find an appropriate ARIMA model. What
model was selected. Check that the residuals look like white noise. Plot
forecasts for the next 10 periods.
Answer: The best fit was an ARIMA(0,2,1) model, which has no autoregressive parts (\(p=0\)), second-degree differencing (\(d=2\)), and a first-order moving average part (\(q=1\)).
# Fit the data using ARIMA
passengers_fit <- aus_airpassengers %>%
# aus_airpassengers includes data from 1970 to 2016,
# so the lower bound does not need to be specified
filter(Year < 2012) %>%
model(ARIMA(Passengers))
report(passengers_fit)
## Series: Passengers
## Model: ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8756
## s.e. 0.0722
##
## sigma^2 estimated as 4.671: log likelihood=-87.8
## AIC=179.61 AICc=179.93 BIC=182.99
The plots below suggest that the residuals are white noise.
The time series plot (top) shows that the residuals fluctuate around zero with mostly similar variance and no apparent pattern.
The autocorrelation plot (lower left) does not have any significant spikes.
The histogram (lower right) shows that the distribution of the residuals is close to normal, and the mean is near zero.
# Residual plot
passengers_fit %>%
gg_tsresiduals() +
ggtitle('Residuals from ARIMA(0,2,1) model of aus_airpassengers')
The residual plots also agreed with the Ljung-Box tests with \(l=10\) (for non-seasonal data). The P-value was not significant, so the null hypothesis that the data are white noise was not rejected.
passengers_fit %>%
augment() %>%
features(.innov, ljung_box, lag = 10)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 ARIMA(Passengers) 6.00 0.815
Together, the above results suggest that the ARIMA(0,2,1) model is appropriate, so we can proceed with forecasting.
# Forecast next 10 periods
plot9_7a <- passengers_fit %>%
forecast(h = 10) %>%
autoplot(aus_airpassengers) +
scale_y_continuous(limits = c(0, 120), breaks = seq(0, 120, by = 40)) +
guides(
y = guide_axis(minor.ticks = TRUE)
) +
labs(
y = 'Passengers (millions)',
title = 'Air Passengers, 1970-2011 + Forecast',
subtitle = 'ARIMA(0,2,1) model'
) +
theme_classic() +
theme(
axis.title = element_text(face = 'bold', size = 8),
plot.title = element_text(face = 'bold', size = 10),
plot.subtitle = element_text(size = 8)
)
plot9_7a
Write the model in terms of the backshift operator.
Answer: Using the general equation 9.2 in the textbook with the values of \(p\), \(d\), \(q\), and \(\theta_1\) from part (a), the model can be written as \((1 - B)^2y_t = (1 - 0.876B)\epsilon_t\), where \(By_t = y_{t-1}\).
Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.
Answer: The forecasts from the ARIMA(0,1,0) model with drift are more conservative (ie, lower forecasted values and smaller slope/rate of increase) than those from the ARIMA(0,2,1) model.
# Fit the data using ARIMA(0,1,0) with drift
passengers_fit2 <- aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ pdq(0, 1, 0)))
report(passengers_fit2)
## Series: Passengers
## Model: ARIMA(0,1,0) w/ drift
##
## Coefficients:
## constant
## 1.3669
## s.e. 0.3319
##
## sigma^2 estimated as 4.629: log likelihood=-89.08
## AIC=182.17 AICc=182.48 BIC=185.59
The residual plot and Ljung-Box test indicate the residuals from the ARIMA(0,1,0) model with drift are white noise.
# Residual plot
passengers_fit2 %>%
gg_tsresiduals() +
ggtitle('Residuals from ARIMA(0,1,0) model of aus_airpassengers')
# Ljung-Box test of residuals
passengers_fit2 %>%
augment() %>%
features(.innov, ljung_box, lag = 10)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 ARIMA(Passengers ~ pdq(0, 1, 0)) 4.92 0.897
The ARIMA(0,1,0) model with drift is appropriate, so we can proceed with forecasting and compare the result to the forecast from the ARIMA(0,2,1) model.
# Forecast next 10 periods
plot9_7c <- passengers_fit2 %>%
forecast(h = 10) %>%
autoplot(aus_airpassengers) +
scale_y_continuous(limits = c(0, 120), breaks = seq(0, 120, by = 40)) +
guides(
y = guide_axis(minor.ticks = TRUE)
) +
labs(
y = 'Passengers (millions)',
title = 'Air Passengers, 1970-2011 + Forecast',
subtitle = 'ARIMA(0,1,0) model with drift'
) +
theme_classic() +
theme(
axis.title = element_text(face = 'bold', size = 8),
plot.title = element_text(face = 'bold', size = 10),
plot.subtitle = element_text(size = 8)
)
# Compare forecasts
cowplot::plot_grid(plot9_7a, plot9_7c)
Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.
Answer: The forecasts from the ARIMA(2,1,2) model with drift are similar to the forecasts from the ARIMA(0,2,1) model (in terms of the magnitude and slope of forecasted values); however, the prediction intervals appear to be a little smaller. The forecasts from the ARIMA(0,1,0) model with drift are the most conservative.
It is not possible to create an ARIMA(2,1,2) model with drift without
the constant term, which effectively removes the differencing. This
prevents the data from being made stationary, so the autoregressive part
of the model cannot converge. As a result, the ARIMA()
function produces an error.
# Fit the data using ARIMA(2,1,2) with drift
passengers_fit3 <- aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ pdq(2, 1, 2)))
report(passengers_fit3)
## Series: Passengers
## Model: ARIMA(2,1,2) w/ drift
##
## Coefficients:
## ar1 ar2 ma1 ma2 constant
## 1.4700 -0.5106 -1.5740 0.6780 0.0646
## s.e. 0.3792 0.3565 0.3091 0.2684 0.0293
##
## sigma^2 estimated as 4.747: log likelihood=-87.74
## AIC=187.47 AICc=189.94 BIC=197.75
The residual plot and Ljung-Box test are consistent with the residuals from the ARIMA(2,1,2) model being white noise.
# Residual plot
passengers_fit3 %>%
gg_tsresiduals() +
ggtitle('ARIMA(2,1,2) model residuals for aus_airpassengers')
# Ljung-Box test of residuals
passengers_fit3 %>%
augment() %>%
features(.innov, ljung_box, lag = 10)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 ARIMA(Passengers ~ pdq(2, 1, 2)) 2.75 0.987
The ARIMA(2,1,2) model with drift is appropriate, so we can proceed with forecasting and compare the result to the forecast from the ARIMA(0,2,1) and ARIMA(0,1,0) models.
# Forecast next 10 periods
plot9_7d <- passengers_fit3 %>%
forecast(h = 10) %>%
autoplot(aus_airpassengers) +
scale_y_continuous(limits = c(0, 120), breaks = seq(0, 120, by = 40)) +
guides(
y = guide_axis(minor.ticks = TRUE)
) +
labs(
y = 'Passengers (millions)',
title = '', # Using title from part (a) for cowplot
subtitle = 'ARIMA(2,1,2) model with drift'
) +
theme_classic() +
theme(
axis.title = element_text(face = 'bold', size = 8),
plot.title = element_text(face = 'bold', size = 10),
plot.subtitle = element_text(size = 8)
)
# Compare forecasts
# Remove plot title from part (c)
plot9_7c <- plot9_7c + ggtitle('')
# Arrange plot grid
cowplot::plot_grid(plot9_7a, NULL, plot9_7c, plot9_7d, ncol = 2)
Removing the constant term results in an error, which prevents the model from being fitted.
passengers_fit3b <- aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ 0 + pdq(2, 1, 2)))
## Warning: 1 error encountered for ARIMA(Passengers ~ 0 + pdq(2, 1, 2))
## [1] non-stationary AR part from CSS
Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?
Answer: Including a constant in the ARIMA(0,2,1) model results in a warning that it introduces a higher-order polynomial trend, which is discouraged. The facet plot below shows that forecasts from this model are larger and have a larger slope (rate of increase) than forecasts from the ARIMA(0,2,1) model (without a constant). The forecasts are also more aggressive than the ARIMA(0,1,0) and ARIMA(2,1,2) models with drift. The rapid increase in forecast values from the ARIMA(0,2,1) model with drift is consistent with the underlying behavior of a higher-order polynomial, which deviates from the expected trend of the historical data.
# Fit the data using ARIMA(0,2,1) with drift
passengers_fit4 <- aus_airpassengers %>%
filter(Year < 2012) %>%
model(ARIMA(Passengers ~ 1 + pdq(0, 2, 1)))
## Warning: Model specification induces a quadratic or higher order polynomial trend. This
## is generally discouraged, consider removing the constant or reducing the number
## of differences.
report(passengers_fit4)
## Series: Passengers
## Model: ARIMA(0,2,1) w/ poly
##
## Coefficients:
## ma1 constant
## -1.0000 0.0721
## s.e. 0.0709 0.0260
##
## sigma^2 estimated as 4.086: log likelihood=-85.74
## AIC=177.48 AICc=178.15 BIC=182.55
The residual plot and Ljung-Box test are consistent with the residuals from the ARIMA(0, 2, 1) model being white noise, so we can proceed with forecasting and compare the result to the forecasts from the previous models.
# Residual plot
passengers_fit4 %>%
gg_tsresiduals() +
ggtitle('Residuals from ARIMA(0,2,1) model with drift for aus_airpassengers')
# Ljung-Box test of residuals
passengers_fit4 %>%
augment() %>%
features(.innov, ljung_box, lag = 10)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 ARIMA(Passengers ~ 1 + pdq(0, 2, 1)) 6.19 0.799
# Forecast next 10 periods
plot9_7e <- passengers_fit4 %>%
forecast(h = 10) %>%
autoplot(aus_airpassengers) +
scale_y_continuous(limits = c(0, 120), breaks = seq(0, 120, by = 40)) +
guides(
y = guide_axis(minor.ticks = TRUE)
) +
labs(
y = 'Passengers (millions)',
title = 'Air Passengers, 1970-2011 + Forecast',
subtitle = 'ARIMA(2,1,2) model with drift'
) +
theme_classic() +
theme(
axis.title = element_text(face = 'bold', size = 8),
plot.title = element_text(face = 'bold', size = 10),
plot.subtitle = element_text(size = 8)
)
# Compare forecasts
# Remove plot title from part (a)
plot9_7a <- plot9_7a + ggtitle('')
# Arrange plots
cowplot::plot_grid(plot9_7e, plot9_7a, plot9_7c, plot9_7d, ncol = 2)
For the United States GDP series (from
global_economy ):
a. If necessary, find a suitable Box-Cox transformation for the data
b. Fit a suitable ARIMA model to the transformed data using
ARIMA()
c. Try some other plausible models by experimenting with the orders chosen
d. Choose what you think is the best model and check the residual diagnostics
e. Produce forecasts of your fitted model. Do the forecasts look reasonable?
f. Compare the results with what you would obtain using
ETS() (with no transformation)
# Select and transform data of interest
us_gdp <- global_economy %>%
filter(Country == 'United States') %>%
mutate(
# I analyzed GDP per capita to remove population increase as a potential confounder
GDP_per_capita = GDP / Population
) %>%
select(GDP_per_capita)
The time series plot (top) shows that US GDP per capita trended upward from 1960 to 2017, which means the time series is not stationary.
The variance does not appear to be constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.
The ACF plot (lower left) shows that ACF values decrease slowly as lag increases, which confirms the non-stationarity and suggests a need for differencing. The slow decay suggests that a moving average term is not needed to model the data.
The partial ACF plot (lower right) has a large significant spike at lag=1 followed by a sharp drop to near zero, which suggests an AR(1) model; however, the data would first need to be differenced to make it stationary.
us_gdp %>%
gg_tsdisplay(GDP_per_capita, plot_type = 'partial') +
ylab('GDP (USD)') +
ggtitle('United States GDP per capita, 1960-2017')
The Guerrero method indicated that a Box-Cox transformation with \(\lambda = 0.39\) would help stabilize the variance.
# Determine optimal value of lambda for Box-Cox transformation
# Note that all values are positive, which satisfies the requirement for Box-Cox
lambda <- us_gdp %>%
features(GDP_per_capita, features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.3929072
# Apply Box-Cox transformation
us_gdp <- us_gdp %>%
mutate(
GDP_per_capita_transformed = box_cox(GDP_per_capita, lambda)
)
The KPSS unit root test indicated that first-order differencing of the transformed data is sufficient to achieve stationarity.
us_gdp %>%
features(GDP_per_capita_transformed, unitroot_ndiffs) %>%
pull(ndiffs)
## [1] 1
The plots below confirm that first-order differencing makes the data more stationary. Specifically, the time series plot no longer has a trend, and all but one spike in the ACF plot are non-significant.
us_gdp %>%
gg_tsdisplay(difference(GDP_per_capita_transformed), plot_type = 'partial') +
labs(
y = 'FD(BoxCox(GDP))',
title = 'Differenced transformed GDP data',
subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
'$\\lambda$=', round(lambda, 2))),
caption = 'FD, first difference; GDP, gross domestic product per capita.') +
theme(
plot.caption = element_text(color = "darkgray", hjust = 0)
)
The results above suggest an ARIMA(1,1,0) model. In addition, I evaluated two automated model selections using the default stepwise procedure as well as a broader search of the order space.
gdp_fit <- us_gdp %>%
model(arima110 = ARIMA(GDP_per_capita_transformed ~ pdq(1, 1, 0)),
stepwise = ARIMA(GDP_per_capita_transformed),
search = ARIMA(GDP_per_capita_transformed, stepwise = FALSE))
All three models had identical AICc values, indicating that the ARIMA(1,1,0) model is likely the best fit of the data.
glance(gdp_fit) %>%
arrange(AICc) %>%
select(.model:BIC)
## # A tibble: 3 × 6
## .model sigma2 log_lik AIC AICc BIC
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arima110 0.927 -77.8 162. 162. 168.
## 2 stepwise 0.927 -77.8 162. 162. 168.
## 3 search 0.927 -77.8 162. 162. 168.
The plots below suggest that the residuals from the ARIMA(1,1,0) model are white noise.
The time series plot (top) shows that the residuals fluctuate around zero with mostly similar variance and no apparent pattern.
The autocorrelation plot (lower left) does not have any significant spikes.
The histogram (lower right) shows that the distribution of the residuals is close to normal, and the mean is near zero.
gdp_fit %>%
select(arima110) %>%
gg_tsresiduals() +
ggtitle('Residuals from ARIMA(1,1,0) model of US GDP')
In addition, the Ljung-Box portmanteau test did not have a significant P value (P > 0.05), so the null hypothesis that the data are white noise was not rejected.
# Ljung-Box test of residuals
gdp_fit %>%
select(arima110) %>%
augment() %>%
# degrees of freedom from ARIMA(p,d,q) = h-(p+q)
# where h is the number of lags
# lag=10 for non-seasonal data, so dof = 10-(1+0) = 9
features(.innov, ljung_box, lag = 10, dof = 9)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 arima110 3.58 0.0586
These results suggest that the ARIMA(1,1,0) model is appropriate, so we can proceed with forecasting.
The forecast for the next 10 years is shown below. Based on the historical trend, the forecast looks reasonable.
gdp_fit %>%
select(arima110) %>%
# Forecoast next 10 years
forecast(h = 10) %>%
autoplot(us_gdp) +
labs(
y = 'BoxCox(GDP)',
title = 'US GDP per capita, 1960-2017 + Forecast',
subtitle = 'ARIMA(1,1,0) model'
) +
scale_x_continuous(limits = c(1960, 2030), breaks = seq(1960, 2030, by = 10)) +
scale_y_continuous(limits = c(50, 250), breaks = seq(50, 250, by = 50)) +
theme_classic() +
theme(
axis.title = element_text(face = 'bold'),
plot.title = element_text(face = 'bold')
)
The ARIMA(1,1,0) and ETS models have similar performance for forecasting US GDP, as shown in the forecast plot below. Overall, the ARIMA(1,1,0) model has lower cross-validated performance metrics (eg, RMSE). However, the 95% prediction interval for the first forecast of the ETS model is slightly smaller than that for the ARIMA model.
# Fit the models
# I used transformed data for both models; otherwise, transformation would be a
# potential confounding variable
gdp_fit2 <- us_gdp %>%
model(
ets = ETS(GDP_per_capita_transformed),
arima110 = ARIMA(GDP_per_capita_transformed ~ pdq(1, 1, 0))
)
# Generate forecasts for 10 years in future
h <- 10
gdp_forecast2 <- gdp_fit2 %>%
forecast(h = h)
# Plot forecast with historical data
gdp_forecast2 %>%
autoplot(us_gdp, level = NULL) +
autolayer(
.vars = GDP_per_capita_transformed,
filter_index(us_gdp, '2018' ~ .)
) +
# Only plot subset of data so the difference in forecasts is visible
coord_cartesian(xlim = c(2000, 2030), ylim = c(150, 220)) +
labs(
y = 'BoxCox(GDP)',
title = 'US GDP per capita, 2000-2017 + 10-Year Forecast',
color = 'model'
) +
theme_classic() +
theme(
axis.title = element_text(face = 'bold'),
plot.title = element_text(face = 'bold')
)
# Compare ETS vs ARIMA models using cross-validated performance measures
accuracy(gdp_fit2) %>%
select(`.model`, RMSE:MAPE)
## # A tibble: 2 × 5
## .model RMSE MAE MPE MAPE
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 ets 0.995 0.743 0.0927 0.633
## 2 arima110 0.938 0.699 -0.0115 0.623
# Compare 95% prediction intervals for the first forecast for each model
gdp_forecast2 %>%
mutate(
hilo_interval95 = hilo(GDP_per_capita_transformed, 95),
hilo_lower95 = hilo_interval95$lower,
hilo_upper95 = hilo_interval95$upper
) %>%
as.data.frame() %>%
# h is length of the forecast, so h+1 is the row number for the
# first forecast of the second model
slice(1, h+1) %>%
select(model = `.model`, lower = hilo_lower95, upper = hilo_upper95)
## model lower upper
## 1 ets 187.8378 193.9749
## 2 arima110 189.2593 193.0332
pander::pander(sessionInfo())
R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
locale: en_US.UTF-8||en_US.UTF-8||en_US.UTF-8||C||en_US.UTF-8||en_US.UTF-8
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: pander(v.0.6.6), cowplot(v.1.2.0), distributional(v.0.6.0), ggtime(v.0.1.0), latex2exp(v.0.9.8), fable(v.0.5.0), feasts(v.0.4.2), fabletools(v.0.5.1), fpp3(v.1.0.2), tsibbledata(v.0.4.1), tsibble(v.1.1.6), lubridate(v.1.9.4), forcats(v.1.0.1), stringr(v.1.6.0), dplyr(v.1.1.4), purrr(v.1.2.1), readr(v.2.1.6), tidyr(v.1.3.2), tibble(v.3.3.1), ggplot2(v.4.0.1) and tidyverse(v.2.0.0)
loaded via a namespace (and not attached): ggdist(v.3.3.3), utf8(v.1.2.6), rappdirs(v.0.3.4), sass(v.0.4.10), generics(v.0.1.4), anytime(v.0.3.12), lattice(v.0.22-7), stringi(v.1.8.7), hms(v.1.1.4), digest(v.0.6.39), magrittr(v.2.0.4), evaluate(v.1.0.5), grid(v.4.5.2), timechange(v.0.3.0), RColorBrewer(v.1.1-3), fastmap(v.1.2.0), jsonlite(v.2.0.0), scales(v.1.4.0), jquerylib(v.0.1.4), cli(v.3.6.5), crayon(v.1.5.3), rlang(v.1.1.7), ellipsis(v.0.3.2), withr(v.3.0.2), cachem(v.1.1.0), yaml(v.2.3.12), otel(v.0.2.0), tools(v.4.5.2), tzdb(v.0.5.0), vctrs(v.0.7.1), R6(v.2.6.1), lifecycle(v.1.0.5), urca(v.1.3-4), pkgconfig(v.2.0.3), progressr(v.0.18.0), pillar(v.1.11.1), bslib(v.0.10.0), gtable(v.0.3.6), glue(v.1.8.0), Rcpp(v.1.1.1), xfun(v.0.56), tidyselect(v.1.2.1), rstudioapi(v.0.18.0), knitr(v.1.51), farver(v.2.1.2), nlme(v.3.1-168), htmltools(v.0.5.9), labeling(v.0.4.3), rmarkdown(v.2.30), compiler(v.4.5.2) and S7(v.0.2.1)