Exercise 9.1

Figure 9.32
Figure 9.32

a.

Explain the differences in Figure 9.32. Do they all indicate that the data are white noise?

Answer: The plots differ by the magnitude of the critical values (dotted blue lines) as well as the magnitude of the spikes. Specifically, these magnitudes decrease as the number of random numbers increases.

Yes, all the ACF plots suggest the data are white noise because the ACF values drop to near zero quickly and none of the spikes exceed the critical values. In addition, there is no obvious pattern in the spikes.

b.

Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

Answer: The critical values decrease as the number of random numbers increases because the critical values for a 95% confidence interval in an ACF plot correspond to \(\pm 1.96 / \sqrt{T}\), where \(T\) is the number of observations. This behavior underscores the importance of analyzing large(r) samples when making statistical inferences.

The autocorrelations are different in each plot because the numbers are random (technically, pseudo-random), so each plot corresponds to a different set of data, which have different autocorrelations.

Exercise 9.2

Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

Answer:

  • The time series plot (top) shows an upward trend, which by definition of stationarity means that the data are not stationary.

  • The ACF plot (lower left) shows that ACF values decrease slowly as lag increases, but remain highly positive, which indicates that the data are non-stationary and need differencing.

  • The partial ACF plot (lower right) has a large signiicant spike at lag=1 while spikes at greater lags are much smaller and mostly non-significant. This suggests that an AR(1) model of the data would be suitable; however, the data would first need to be made stationary (eg, by differencing).

gafa_stock %>%
  filter(Symbol == 'AMZN') %>%
  gg_tsdisplay(Close, plot_type = 'partial') +
    ggtitle('Amazon stock closing price on irregular trading days, 2014-2019')

Consistent with these inferences of non-stationarity and need for differencing, the KPSS unit root test shows that first-order differencing is sufficient to achieve stationarity.

# KPSS test
gafa_stock %>%
  filter(Symbol == 'AMZN') %>%
  features(Close, unitroot_ndiffs) %>%
  pull(ndiffs)
## [1] 1

Exercise 9.3

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

a. Turkish GDP from global_economy

Answer: A Box-Cox transformation with \(\lambda = 0.15\) and first-order differencing is needed to obtain stationary data.

Explanation:

  • The time series plot (top) shows that the GDP of Turkey has an upward trend, so the untransformed time series is not stationary.

    The variance does not appear to be constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.

  • The ACF plot (lower left) shows that ACF values decrease slowly as lag increases, which confirms the non-stationarity and need for differencing.

  • The partial ACF plot (lower right) has a large significant spike at lag=1 while spikes at greater lags are much smaller and are non-significant. This suggests that an AR(1) model of the data would be suitable; however, the data would need to be differenced to make it stationary.

global_economy %>%
  filter(Country == 'Turkey') %>%
  gg_tsdisplay(GDP, plot_type = 'partial') +
    ggtitle('GDP of Turkey, 1960-2017')

The optimal value of lambda for Box-Cox transformation is 0.15, which suggests that a power transformation could normalize the data and stabilize its variance.

# Calculate lambda
lambda <- global_economy %>%
  filter(Country == 'Turkey') %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

lambda
## [1] 0.1572187

The KPSS unit root test indicates that first-order differencing of the transformed data is sufficient to achieve stationarity.

global_economy2 <- global_economy %>%
  filter(Country == 'Turkey') %>%
  mutate(
    transformed_GDP = box_cox(GDP, lambda)
  )

global_economy2 %>%
  features(transformed_GDP, unitroot_ndiffs) %>%
  pull(ndiffs)
## [1] 1

After transformation and differencing, the data are closer to stationarity. Specifically, the time series plot no longer has a trend, and the ACF plot does not have any pattern or significant spikes that exceed the critical values.

# Differencing creates NAs at beginning of series; I suppressed the warnings about
# missing values
global_economy2 %>%
  mutate(
    fd = difference(transformed_GDP)
  ) %>%
  gg_tsdisplay(fd, plot_type='partial') +  
    labs(
      y = 'FD(BoxCox(GDP))',
      title = 'Differenced transformed GDP data',
      subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
                                    '$\\lambda$=', round(lambda, 2))),
      caption = 'FD, first difference.') +
    theme(
      plot.caption = element_text(color = "darkgray", hjust = 0)
    )

b. Accommodations in Tasmania from aus_accomodation

Answer: A Box-Cox transformation with \(\lambda = 0\) and first-order differencing is needed to make the data more stationary.

Explanation:

  • The time series plot (top) shows that the accommodations data have seasonality (quarterly period) and an upward trend, so the untransformed time series is not stationary.

    It can also be seen that the variance is not constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.

  • The ACF plot (lower left) exhibits a sinusoidal pattern, which confirms the seasonality and non-stationarity.

  • The partial ACF plot (lower right) has large significant spikes up to lag=6, and then decrease and remain below the critical values thereafter. This supports using a higher order autoregressive model of the data; however, the data would need to be differenced to make it stationary.

aus_accommodation %>%
  filter(State == 'Tasmania') %>%
  gg_tsdisplay(Takings, plot_type = 'partial') +
    ggtitle('Tourist Accommodations in Tasmania Australia, 1998-2016')

The optimal value of lambda for Box-Cox transformation is near zero, which suggests a log transformation could normalize the data and stabilize its variance.

# Calculate lambda
lambda <- aus_accommodation %>%
  filter(State == 'Tasmania') %>%
  features(Takings, features = guerrero) %>%
  pull(lambda_guerrero)

lambda
## [1] 0.001819643

The KPSS unit root test indicates that first-order differencing of the transformed data is sufficient to achieve stationarity.

aus_accommodation2 <- aus_accommodation %>%
  filter(State == 'Tasmania') %>%
  mutate(
    transformed_takings = box_cox(Takings, lambda)
  )

aus_accommodation2 %>%
  features(transformed_takings, unitroot_nsdiffs) %>%
  pull(nsdiffs)
## [1] 1

After transformation and differencing, the data are closer to stationarity. The time series plot no longer exhibits seasonality or a trend. The ACF values drop below the critical values more rapidly, which is characteristic of stationarity. However, there still appears to be a sinusoidal pattern, so I am not sure whether the differenced, transformed data are completely stationary.

aus_accommodation2 %>%
  mutate(
    # lag = 4 since data are seasonal with quarterly period
    fd = difference(transformed_takings, lag = 4)
  ) %>%
  gg_tsdisplay(fd, plot_type='partial') +
    labs(
      y = 'FD(BoxCox(Takings))',
      title = 'Differenced transformed accommodations data',
      subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
                                    '$\\lambda$$\\approx$', round(lambda, 2))),
      caption = 'FD, first difference.') +
    theme(
      plot.caption = element_text(color = "darkgray", hjust = 0)
    )

c. Monthly sales from souvenirs

Answer: A Box-Cox transformation with \(\lambda = 0\) and first-order differencing is needed to obtain stationary data.

Explanation:

  • The time series plot (top) shows that souvenir sales trended upward and that fluctuations in sales vary with the level of the series. The seasonality period was unclear to me in this plot, so I looked at the subseries plots, which more clearly show that the seasonality has a monthly period. Together, these observations indicate that the untransformed time series is not stationary.

    It can also be seen that the variance is not constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.

  • The ACF plot (lower left) has a sinusoidal pattern, which confirms the seasonality and nonstationarity.

  • The partial ACF plot (lower right) has large significant spikes up to lag=13. This suggests using a higher order autoregressive model of the data; however, the data would need to be differenced to make it stationary.

souvenirs %>%
  gg_tsdisplay(Sales, plot_type = 'partial') +
    ggtitle('Souvenir Sales, 1987-1993')

ggtime::gg_subseries(souvenirs, Sales) 

The optimal value of lambda for Box-Cox transformation is near zero, which suggests a log transformation could normalize the data and stabilize its variance.

# Calculate lambda
lambda <- souvenirs %>%
  features(Sales, features = guerrero) %>%
  pull(lambda_guerrero)

lambda
## [1] 0.002118221

The KPSS unit root test indicates that first-order differencing of the transformed data is sufficient to achieve stationarity.

souvenirs2 <- souvenirs %>%
  mutate(
    transformed_sales = box_cox(Sales, lambda)
  )

souvenirs2 %>%
  features(transformed_sales, unitroot_nsdiffs) %>%
  pull(nsdiffs)
## [1] 1

After transformation and differencing, the data are more stationary. The time series plot no longer exhibits seasonality or a trend. The absence of a sinusoidal pattern is consistent with the removal of seasonality, and ACF values approach zero more rapidly, which is characteristic of stationarity.

souvenirs2 %>%
  mutate(
    # lag = 12 since data are seasonal with monthly period
    fd = difference(transformed_sales, lag = 12)
  ) %>%
  gg_tsdisplay(fd, plot_type='partial') +
    labs(
      y = 'FD(BoxCox(Sales))',
      title = 'Differenced transformed sales data',
      subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
                                    '$\\lambda$$\\approx$', round(lambda, 2))),
      caption = 'FD, first difference.') +
    theme(
      plot.caption = element_text(color = "darkgray", hjust = 0)
    )

Exercise 9.5

For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

Answer: A Box-Cox transformation with \(\lambda = 0\) and first-order differencing is needed to obtain stationary data.

Explanation:

  • The time series plot (top) shows that retail turnover trended upward. In addition, the subseries plot shows the seasonality has a monthly period. Together, the trend and seasonality indicate that the untransformed time series is not stationary.

    The variance does not appear to be constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.

  • The ACF plot (lower left) has a sinusoidal pattern, which confirms the seasonality and nonstationarity.

  • The partial ACF plot (lower right) has large significant spikes up to lag=13. This suggests using a higher order autoregressive model of the data; however, the data would need to be differenced to make it stationary.

# Generate the same series used previously
set.seed(144)

retail_ts <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID` , 1))
retail_ts %>%
  # Due to overplotting, I only visualized the series after 2010
  filter(year(Month) >= 2010) %>%
  gg_tsdisplay(Turnover, plot_type = 'partial') +
    labs(
      y = 'Turnover (AUD, million)',
      title = 'Retail turnover in Australia, 2010-2018'
    )

ggtime::gg_subseries(retail_ts, Turnover)

The optimal value of lambda for Box-Cox transformation is 0.22, which suggests that a power transformation could normalize the data and stabilize its variance.

# Calculate lambda
lambda <- retail_ts %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

lambda
## [1] 0.2229869

The KPSS unit root test indicates that first-order differencing of the transformed data is sufficient to achieve stationarity.

retail_ts <- retail_ts %>%
  mutate(
    transformed_turnover = box_cox(Turnover, lambda)
  )

retail_ts %>%
  features(transformed_turnover, unitroot_nsdiffs) %>%
  pull(nsdiffs)
## [1] 1

After transformation and differencing, the data are closer to stationarity. The time series plot no longer exhibits seasonality or a trend. The absence of a sinusoidal pattern is consistent with the removal of seasonality, and ACF values approach zero more rapidly, which is characteristic of stationary data.

retail_ts %>%
  mutate(
    # lag = 12 since data are seasonal with monthly period
    fd = difference(transformed_turnover, lag = 12)
  ) %>%
  # Due to overplotting, I only visualized the series after 2010  
  filter(year(Month) >= 2010) %>%
  gg_tsdisplay(fd, plot_type='partial') +
    labs(
      y = 'FD(BoxCox(Turnover))',
      title = 'Differenced transformed retail turnover data',
      subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
                                    '$\\lambda$$\\approx$', round(lambda, 2))),
      caption = 'FD, first difference.') +
    theme(
      plot.caption = element_text(color = "darkgray", hjust = 0)
    )

Exercise 9.6

a.

Use the following R code to generate data from an AR(1) model with \(\phi_1 = 0.6\) and \(\sigma^2 = 1\). The process starts with \(y_1 = 0\).

# To streamline code for part (b), I created a function using the code provided
# in the textbook. The function generates data from an AR(1) model with the specified 
# value for phi and returns the data as a tsibble.
# I also defined a seed value for reproducibility.

generate_AR1_model <- function(phi) {
  set.seed(100)
  y <- numeric(100)
  e <- rnorm(100)

  for(i in 2:100) {
    y[i] <- phi * y[i-1] + e[i]
  }
  
  data_ts <- tsibble(idx = seq_len(100), y = y, index = idx)
  return(data_ts)
}
sim <- generate_AR1_model(phi = 0.6)

b.

Produce a time plot for the series. How does the plot change as you change \(\phi_1\)?

Answer: The facet plot below shows the effect of five different values of \(\phi\) on the AR(1) model. When \(\phi = -1\), the time series exhibits an oscillating behavior (ie, values change from positive to negative and vice versa at every index), and the magnitude of these changes increase with the level of the series. As a result, the time series is not stationary.

As \(\phi\) increases to 0, this behavior becomes less extreme and the series looks more like stationary white noise (ie, zero mean, constant variance, no pattern). However, as \(\phi\) approaches 1, the time series loses stationarity, as shown by the shift of the mean away from zero and non-constant variance.

# I generated data for 4 additional values of phi and then
# combined all 5 datasets into a single tibble with the
# phi value as a label (factor)
sim_AR1_tbl <- sim %>%
  mutate(
    phi = 'phi = 0.6'
  ) %>%
  as_tibble()

sim2_AR1_tbl <- generate_AR1_model(phi = -1) %>%
  mutate(
    phi = 'phi = -1'
  ) %>%
  as_tibble()

sim3_AR1_tbl <- generate_AR1_model(phi = -0.6) %>%
  mutate(
    phi = 'phi = -0.6'
  ) %>%
  as_tibble()

sim4_AR1_tbl <- generate_AR1_model(phi = 0) %>%
  mutate(
    phi = 'phi = 0'
  ) %>%
  as_tibble()

sim5_AR1_tbl <- generate_AR1_model(phi = 1) %>%
  mutate(
    phi = 'phi = 1'
  ) %>%
  as_tibble()

sim_AR1_combined <- bind_rows(sim_AR1_tbl, sim2_AR1_tbl, sim3_AR1_tbl, 
                              sim4_AR1_tbl, sim5_AR1_tbl)

sim_AR1_combined$phi <- factor(sim_AR1_combined$phi,
                           levels = c('phi = -1', 'phi = -0.6', 'phi = 0',
                                      'phi = 0.6', 'phi = 1'))
# Facet plot
ggplot(sim_AR1_combined, aes(x = idx, y = y)) +
  geom_line() +
  facet_wrap(~ phi, scales = 'free_y', ncol = 3, nrow = 2) + 
  ggtitle(latex2exp::TeX('Effect of different values of $\\phi_1$ on AR(1) model')) +
  theme(
    axis.title = element_text(face = "bold")
  )

c. 

Write your own code to generate data from an MA(1) model with \(\theta_1 = 0.6\) and \(\sigma^2 = 1\).

# Analogous to part (a)
generate_MA1_model <- function(theta) {
  set.seed(100)
  y <- numeric(100)
  e <- rnorm(100)

  for(i in 2:100) {
    y[i] <- theta * e[i-1] + e[i]
  }
  
  data_ts <- tsibble(idx = seq_len(100), y = y, index = idx)
  return(data_ts)
}

d. 

Produce a time plot for the series. How does the plot change as you change \(\theta_1\)?

Answer: The facet plot below shows the effect of five different values of \(\theta\) on the MA(1) model. In general, \(\theta\) appears to affect the intensity of oscillatory behavior, such that the series is more “compressed” (ie, more rapid changes) when \(\theta < 0\) than when \(\theta > 0\). When \(\theta = 0\), the MA(1) time series appears to be most like white noise.

sim1_MA1_tbl <- generate_MA1_model(theta = -1) %>%
  mutate(
    theta = 'theta = -1'
  ) %>%
  as_tibble()

sim2_MA1_tbl <- generate_MA1_model(theta = -0.6) %>%
  mutate(
    theta = 'theta = -0.6'
  ) %>%
  as_tibble()

sim3_MA1_tbl <- generate_MA1_model(theta = 0) %>%
  mutate(
    theta = 'theta = 0'
  ) %>%
  as_tibble()

sim4_MA1_tbl <- generate_MA1_model(theta = 0.6) %>%
  mutate(
    theta = 'theta = 0.6'
  ) %>%
  as_tibble()

sim5_MA1_tbl <- generate_MA1_model(theta = 1) %>%
  mutate(
    theta = 'theta = 1'
  ) %>%
  as_tibble()

sim_MA1_combined <- bind_rows(sim1_MA1_tbl, sim2_MA1_tbl, sim3_MA1_tbl, 
                              sim4_MA1_tbl, sim5_MA1_tbl)

sim_MA1_combined$theta <- factor(sim_MA1_combined$theta,
                            levels = c('theta = -1', 'theta = -0.6', 'theta = 0',
                                       'theta = 0.6', 'theta = 1'))
# Facet plot
ggplot(sim_MA1_combined, aes(x = idx, y = y)) +
  geom_line() +
  facet_wrap(~ theta, scales = 'free_y', ncol = 3, nrow = 2) + 
  ggtitle(latex2exp::TeX('Effect of different values of $\\theta_1$ on MA(1) model')) +
  theme(
    axis.title = element_text(face = "bold")
  )

e.

Generate data from an ARMA(1, 1) model with \(\phi_1 = 0.6\), \(\theta_1 = 0.6\), and \(\sigma^2 = 1\).

set.seed(100)
y <- numeric(100)
e <- rnorm(100)

for (i in 2:100) {
  y[i] <- 0.6 * e[i-1] + 0.6 * y[i-1] + e[i]
}

ARMA1_1 <- tsibble(idx = seq_len(100), y = y, index = idx)

f. 

Generate data from an AR(2) model with \(\phi_1 = -0.8\), \(\phi_2 = 0.3\), and \(\sigma^2 = 1\). (Note that these parameters will give a non-stationary series).

set.seed(100)
y <- numeric(100)
e <- rnorm(100)

for (i in 3:100) {
  y[i] <- -0.8 * y[i-1] + 0.3 * y[i-2] + e[i]
}

AR2 <- tsibble(idx = seq_len(100), y = y, index = idx)

g.

Graph the latter two series and compare them.

Answer: The behavior of the ARMA(1, 1) model is more stable over time than the AR(2) model, which exhibits increasingly extreme oscillations over time. This is not surprising because the ARMA(1, 1) model, which is a linear combination of the AR(1) model with \(\phi_1 = 0.6\) (see plot in part (b) of this exercise) and the MA(1) model with \(\theta_1 = 0.6\) (see plot in part (d) of this exercise), includes a moving average. In contrast, the AR(2) model only accounts for the past two observations (one of which is weighted more than twice as much as the other and with opposite sign, which exacerbates the oscillatory behavior).

autoplot(ARMA1_1, y) +
  ggtitle(latex2exp::TeX('ARMA(1,1) model with $\\phi_1 = 0.6$ and $\\theta_1 = 0.6$'))

autoplot(AR2, y) +
  ggtitle(latex2exp::TeX('AR(2) model with $\\phi_1 = -0.8$ and $\\phi_2 = 0.3$'))

Exercise 9.7

Consider aus_airpassengers , the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

a.

Use ARIMA() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

Answer: The best fit was an ARIMA(0,2,1) model, which has no autoregressive parts (\(p=0\)), second-degree differencing (\(d=2\)), and a first-order moving average part (\(q=1\)).

# Fit the data using ARIMA
passengers_fit <- aus_airpassengers %>%
  # aus_airpassengers includes data from 1970 to 2016,
  # so the lower bound does not need to be specified
  filter(Year < 2012) %>%
  model(ARIMA(Passengers))

report(passengers_fit)
## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8756
## s.e.   0.0722
## 
## sigma^2 estimated as 4.671:  log likelihood=-87.8
## AIC=179.61   AICc=179.93   BIC=182.99

The plots below suggest that the residuals are white noise.

  • The time series plot (top) shows that the residuals fluctuate around zero with mostly similar variance and no apparent pattern.

  • The autocorrelation plot (lower left) does not have any significant spikes.

  • The histogram (lower right) shows that the distribution of the residuals is close to normal, and the mean is near zero.

# Residual plot
passengers_fit %>%
  gg_tsresiduals() +
  ggtitle('Residuals from ARIMA(0,2,1) model of aus_airpassengers')

The residual plots also agreed with the Ljung-Box tests with \(l=10\) (for non-seasonal data). The P-value was not significant, so the null hypothesis that the data are white noise was not rejected.

passengers_fit %>%
  augment() %>%
  features(.innov, ljung_box, lag = 10)
## # A tibble: 1 × 3
##   .model            lb_stat lb_pvalue
##   <chr>               <dbl>     <dbl>
## 1 ARIMA(Passengers)    6.00     0.815

Together, the above results suggest that the ARIMA(0,2,1) model is appropriate, so we can proceed with forecasting.

# Forecast next 10 periods
plot9_7a <- passengers_fit %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers) +
    scale_y_continuous(limits = c(0, 120), breaks = seq(0, 120, by = 40)) + 
    guides(
      y = guide_axis(minor.ticks = TRUE)
    ) +
    labs(
      y = 'Passengers (millions)',
      title = 'Air Passengers, 1970-2011 + Forecast',
      subtitle = 'ARIMA(0,2,1) model'
    ) +
    theme_classic() +
    theme(
      axis.title = element_text(face = 'bold', size = 8),
      plot.title = element_text(face = 'bold', size = 10),
      plot.subtitle = element_text(size = 8)
    )

plot9_7a

b.

Write the model in terms of the backshift operator.

Answer: Using the general equation 9.2 in the textbook with the values of \(p\), \(d\), \(q\), and \(\theta_1\) from part (a), the model can be written as \((1 - B)^2y_t = (1 - 0.876B)\epsilon_t\), where \(By_t = y_{t-1}\).

c. 

Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.

Answer: The forecasts from the ARIMA(0,1,0) model with drift are more conservative (ie, lower forecasted values and smaller slope/rate of increase) than those from the ARIMA(0,2,1) model.

# Fit the data using ARIMA(0,1,0) with drift
passengers_fit2 <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ pdq(0, 1, 0)))

report(passengers_fit2)
## Series: Passengers 
## Model: ARIMA(0,1,0) w/ drift 
## 
## Coefficients:
##       constant
##         1.3669
## s.e.    0.3319
## 
## sigma^2 estimated as 4.629:  log likelihood=-89.08
## AIC=182.17   AICc=182.48   BIC=185.59

The residual plot and Ljung-Box test indicate the residuals from the ARIMA(0,1,0) model with drift are white noise.

# Residual plot
passengers_fit2 %>%
  gg_tsresiduals() +
  ggtitle('Residuals from ARIMA(0,1,0) model of aus_airpassengers')

# Ljung-Box test of residuals
passengers_fit2 %>%
  augment() %>%
  features(.innov, ljung_box, lag = 10)
## # A tibble: 1 × 3
##   .model                           lb_stat lb_pvalue
##   <chr>                              <dbl>     <dbl>
## 1 ARIMA(Passengers ~ pdq(0, 1, 0))    4.92     0.897

The ARIMA(0,1,0) model with drift is appropriate, so we can proceed with forecasting and compare the result to the forecast from the ARIMA(0,2,1) model.

# Forecast next 10 periods
plot9_7c <- passengers_fit2 %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers) +
    scale_y_continuous(limits = c(0, 120), breaks = seq(0, 120, by = 40)) + 
    guides(
      y = guide_axis(minor.ticks = TRUE)
    ) +
    labs(
      y = 'Passengers (millions)',
      title = 'Air Passengers, 1970-2011 + Forecast',
      subtitle = 'ARIMA(0,1,0) model with drift'
    ) +
    theme_classic() +
    theme(
      axis.title = element_text(face = 'bold', size = 8),
      plot.title = element_text(face = 'bold', size = 10),
      plot.subtitle = element_text(size = 8)
    )

# Compare forecasts
cowplot::plot_grid(plot9_7a, plot9_7c)

d. 

Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.

Answer: The forecasts from the ARIMA(2,1,2) model with drift are similar to the forecasts from the ARIMA(0,2,1) model (in terms of the magnitude and slope of forecasted values); however, the prediction intervals appear to be a little smaller. The forecasts from the ARIMA(0,1,0) model with drift are the most conservative.

It is not possible to create an ARIMA(2,1,2) model with drift without the constant term, which effectively removes the differencing. This prevents the data from being made stationary, so the autoregressive part of the model cannot converge. As a result, the ARIMA() function produces an error.

# Fit the data using ARIMA(2,1,2) with drift
passengers_fit3 <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ pdq(2, 1, 2)))

report(passengers_fit3)
## Series: Passengers 
## Model: ARIMA(2,1,2) w/ drift 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2  constant
##       1.4700  -0.5106  -1.5740  0.6780    0.0646
## s.e.  0.3792   0.3565   0.3091  0.2684    0.0293
## 
## sigma^2 estimated as 4.747:  log likelihood=-87.74
## AIC=187.47   AICc=189.94   BIC=197.75

The residual plot and Ljung-Box test are consistent with the residuals from the ARIMA(2,1,2) model being white noise.

# Residual plot
passengers_fit3 %>%
  gg_tsresiduals() +
  ggtitle('ARIMA(2,1,2) model residuals for aus_airpassengers')

# Ljung-Box test of residuals
passengers_fit3 %>%
  augment() %>%
  features(.innov, ljung_box, lag = 10)
## # A tibble: 1 × 3
##   .model                           lb_stat lb_pvalue
##   <chr>                              <dbl>     <dbl>
## 1 ARIMA(Passengers ~ pdq(2, 1, 2))    2.75     0.987

The ARIMA(2,1,2) model with drift is appropriate, so we can proceed with forecasting and compare the result to the forecast from the ARIMA(0,2,1) and ARIMA(0,1,0) models.

# Forecast next 10 periods
plot9_7d <- passengers_fit3 %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers) +
    scale_y_continuous(limits = c(0, 120), breaks = seq(0, 120, by = 40)) + 
    guides(
      y = guide_axis(minor.ticks = TRUE)
    ) +
    labs(
      y = 'Passengers (millions)',
      title = '',  # Using title from part (a) for cowplot
      subtitle = 'ARIMA(2,1,2) model with drift'
    ) +
    theme_classic() +
    theme(
      axis.title = element_text(face = 'bold', size = 8),
      plot.title = element_text(face = 'bold', size = 10),
      plot.subtitle = element_text(size = 8)
    )

# Compare forecasts
# Remove plot title from part (c)
plot9_7c <- plot9_7c + ggtitle('')

# Arrange plot grid
cowplot::plot_grid(plot9_7a, NULL, plot9_7c, plot9_7d, ncol = 2)

Removing the constant term results in an error, which prevents the model from being fitted.

passengers_fit3b <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ 0 + pdq(2, 1, 2)))
## Warning: 1 error encountered for ARIMA(Passengers ~ 0 + pdq(2, 1, 2))
## [1] non-stationary AR part from CSS

e.

Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?

Answer: Including a constant in the ARIMA(0,2,1) model results in a warning that it introduces a higher-order polynomial trend, which is discouraged. The facet plot below shows that forecasts from this model are larger and have a larger slope (rate of increase) than forecasts from the ARIMA(0,2,1) model (without a constant). The forecasts are also more aggressive than the ARIMA(0,1,0) and ARIMA(2,1,2) models with drift. The rapid increase in forecast values from the ARIMA(0,2,1) model with drift is consistent with the underlying behavior of a higher-order polynomial, which deviates from the expected trend of the historical data.

# Fit the data using ARIMA(0,2,1) with drift
passengers_fit4 <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ 1 + pdq(0, 2, 1)))
## Warning: Model specification induces a quadratic or higher order polynomial trend.  This
## is generally discouraged, consider removing the constant or reducing the number
## of differences.
report(passengers_fit4)
## Series: Passengers 
## Model: ARIMA(0,2,1) w/ poly 
## 
## Coefficients:
##           ma1  constant
##       -1.0000    0.0721
## s.e.   0.0709    0.0260
## 
## sigma^2 estimated as 4.086:  log likelihood=-85.74
## AIC=177.48   AICc=178.15   BIC=182.55

The residual plot and Ljung-Box test are consistent with the residuals from the ARIMA(0, 2, 1) model being white noise, so we can proceed with forecasting and compare the result to the forecasts from the previous models.

# Residual plot
passengers_fit4 %>%
  gg_tsresiduals() +
    ggtitle('Residuals from ARIMA(0,2,1) model with drift for aus_airpassengers')

# Ljung-Box test of residuals
passengers_fit4 %>%
  augment() %>%
  features(.innov, ljung_box, lag = 10)
## # A tibble: 1 × 3
##   .model                               lb_stat lb_pvalue
##   <chr>                                  <dbl>     <dbl>
## 1 ARIMA(Passengers ~ 1 + pdq(0, 2, 1))    6.19     0.799
# Forecast next 10 periods
plot9_7e <- passengers_fit4 %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers) +
    scale_y_continuous(limits = c(0, 120), breaks = seq(0, 120, by = 40)) + 
    guides(
      y = guide_axis(minor.ticks = TRUE)
    ) +
    labs(
      y = 'Passengers (millions)',
      title = 'Air Passengers, 1970-2011 + Forecast',
      subtitle = 'ARIMA(2,1,2) model with drift'
    ) +
    theme_classic() +
    theme(
      axis.title = element_text(face = 'bold', size = 8),
      plot.title = element_text(face = 'bold', size = 10),
      plot.subtitle = element_text(size = 8)
    )

# Compare forecasts
# Remove plot title from part (a)
plot9_7a <- plot9_7a + ggtitle('')

# Arrange plots
cowplot::plot_grid(plot9_7e, plot9_7a, plot9_7c, plot9_7d, ncol = 2)

Exercise 9.8

For the United States GDP series (from global_economy ):

a. If necessary, find a suitable Box-Cox transformation for the data

b. Fit a suitable ARIMA model to the transformed data using ARIMA()

c. Try some other plausible models by experimenting with the orders chosen

d. Choose what you think is the best model and check the residual diagnostics

e. Produce forecasts of your fitted model. Do the forecasts look reasonable?

f. Compare the results with what you would obtain using ETS() (with no transformation)

# Select and transform data of interest
us_gdp <- global_economy %>%
  filter(Country == 'United States') %>%
  mutate(
    # I analyzed GDP per capita to remove population increase as a potential confounder
    GDP_per_capita = GDP / Population
  ) %>%
  select(GDP_per_capita)

Time series and ACF plots

  • The time series plot (top) shows that US GDP per capita trended upward from 1960 to 2017, which means the time series is not stationary.

    The variance does not appear to be constant, which suggests the need for transformation. The data are all positive, so Box-Cox transformation can be applied.

  • The ACF plot (lower left) shows that ACF values decrease slowly as lag increases, which confirms the non-stationarity and suggests a need for differencing. The slow decay suggests that a moving average term is not needed to model the data.

  • The partial ACF plot (lower right) has a large significant spike at lag=1 followed by a sharp drop to near zero, which suggests an AR(1) model; however, the data would first need to be differenced to make it stationary.

us_gdp %>%
  gg_tsdisplay(GDP_per_capita, plot_type = 'partial') +
    ylab('GDP (USD)') +
    ggtitle('United States GDP per capita, 1960-2017')

Box-Cox transformation

The Guerrero method indicated that a Box-Cox transformation with \(\lambda = 0.39\) would help stabilize the variance.

# Determine optimal value of lambda for Box-Cox transformation
# Note that all values are positive, which satisfies the requirement for Box-Cox
lambda <- us_gdp %>%
  features(GDP_per_capita, features = guerrero) %>%
  pull(lambda_guerrero)

lambda
## [1] 0.3929072
# Apply Box-Cox transformation
us_gdp <- us_gdp %>%
  mutate(
    GDP_per_capita_transformed = box_cox(GDP_per_capita, lambda)
  )

Differencing

The KPSS unit root test indicated that first-order differencing of the transformed data is sufficient to achieve stationarity.

us_gdp %>%
  features(GDP_per_capita_transformed, unitroot_ndiffs) %>%
  pull(ndiffs)
## [1] 1

The plots below confirm that first-order differencing makes the data more stationary. Specifically, the time series plot no longer has a trend, and all but one spike in the ACF plot are non-significant.

us_gdp %>%
  gg_tsdisplay(difference(GDP_per_capita_transformed), plot_type = 'partial') +
    labs(
      y = 'FD(BoxCox(GDP))',
      title = 'Differenced transformed GDP data',
      subtitle = latex2exp::TeX(paste0('Box-Cox transformation with ',
                                    '$\\lambda$=', round(lambda, 2))),
      caption = 'FD, first difference; GDP, gross domestic product per capita.') +
    theme(
      plot.caption = element_text(color = "darkgray", hjust = 0)
    )

Model evaluation

The results above suggest an ARIMA(1,1,0) model. In addition, I evaluated two automated model selections using the default stepwise procedure as well as a broader search of the order space.

gdp_fit <- us_gdp %>%
  model(arima110 = ARIMA(GDP_per_capita_transformed ~ pdq(1, 1, 0)),
        stepwise = ARIMA(GDP_per_capita_transformed),
        search = ARIMA(GDP_per_capita_transformed, stepwise = FALSE))

All three models had identical AICc values, indicating that the ARIMA(1,1,0) model is likely the best fit of the data.

glance(gdp_fit) %>%
  arrange(AICc) %>%
  select(.model:BIC)
## # A tibble: 3 × 6
##   .model   sigma2 log_lik   AIC  AICc   BIC
##   <chr>     <dbl>   <dbl> <dbl> <dbl> <dbl>
## 1 arima110  0.927   -77.8  162.  162.  168.
## 2 stepwise  0.927   -77.8  162.  162.  168.
## 3 search    0.927   -77.8  162.  162.  168.

Residual diagnostics

The plots below suggest that the residuals from the ARIMA(1,1,0) model are white noise.

  • The time series plot (top) shows that the residuals fluctuate around zero with mostly similar variance and no apparent pattern.

  • The autocorrelation plot (lower left) does not have any significant spikes.

  • The histogram (lower right) shows that the distribution of the residuals is close to normal, and the mean is near zero.

gdp_fit %>%
  select(arima110) %>%
  gg_tsresiduals() +
    ggtitle('Residuals from ARIMA(1,1,0) model of US GDP')

In addition, the Ljung-Box portmanteau test did not have a significant P value (P > 0.05), so the null hypothesis that the data are white noise was not rejected.

# Ljung-Box test of residuals
gdp_fit %>%
  select(arima110) %>%
  augment() %>%
  # degrees of freedom from ARIMA(p,d,q) = h-(p+q)
  # where h is the number of lags
  # lag=10 for non-seasonal data, so dof = 10-(1+0) = 9
  features(.innov, ljung_box, lag = 10, dof = 9)
## # A tibble: 1 × 3
##   .model   lb_stat lb_pvalue
##   <chr>      <dbl>     <dbl>
## 1 arima110    3.58    0.0586

These results suggest that the ARIMA(1,1,0) model is appropriate, so we can proceed with forecasting.

Forecast

The forecast for the next 10 years is shown below. Based on the historical trend, the forecast looks reasonable.

gdp_fit %>%
  select(arima110) %>%
  # Forecoast next 10 years
  forecast(h = 10) %>%
  autoplot(us_gdp) +
    labs(
      y = 'BoxCox(GDP)',
      title = 'US GDP per capita, 1960-2017 + Forecast',
      subtitle = 'ARIMA(1,1,0) model'
    ) +
    scale_x_continuous(limits = c(1960, 2030), breaks = seq(1960, 2030, by = 10)) + 
    scale_y_continuous(limits = c(50, 250), breaks = seq(50, 250, by = 50)) + 
    theme_classic() +
    theme(
      axis.title = element_text(face = 'bold'),
      plot.title = element_text(face = 'bold')
    )

Comparison with ETS model

The ARIMA(1,1,0) and ETS models have similar performance for forecasting US GDP, as shown in the forecast plot below. Overall, the ARIMA(1,1,0) model has lower cross-validated performance metrics (eg, RMSE). However, the 95% prediction interval for the first forecast of the ETS model is slightly smaller than that for the ARIMA model.

# Fit the models
# I used transformed data for both models; otherwise, transformation would be a
# potential confounding variable
gdp_fit2 <- us_gdp %>%
  model(
    ets = ETS(GDP_per_capita_transformed),
    arima110 = ARIMA(GDP_per_capita_transformed ~ pdq(1, 1, 0))
  )
# Generate forecasts for 10 years in future
h <- 10
gdp_forecast2 <- gdp_fit2 %>%
  forecast(h = h)

# Plot forecast with historical data
gdp_forecast2 %>%
  autoplot(us_gdp, level = NULL) +
  autolayer(
    .vars = GDP_per_capita_transformed,
    filter_index(us_gdp, '2018' ~ .)
  ) +
  # Only plot subset of data so the difference in forecasts is visible
  coord_cartesian(xlim = c(2000, 2030), ylim = c(150, 220)) +
  labs(
    y = 'BoxCox(GDP)',
    title = 'US GDP per capita, 2000-2017 + 10-Year Forecast',
    color = 'model'
  ) +
  theme_classic() +
  theme(
    axis.title = element_text(face = 'bold'),
    plot.title = element_text(face = 'bold')
  )

# Compare ETS vs ARIMA models using cross-validated performance measures
accuracy(gdp_fit2) %>%
  select(`.model`, RMSE:MAPE)
## # A tibble: 2 × 5
##   .model    RMSE   MAE     MPE  MAPE
##   <chr>    <dbl> <dbl>   <dbl> <dbl>
## 1 ets      0.995 0.743  0.0927 0.633
## 2 arima110 0.938 0.699 -0.0115 0.623
# Compare 95% prediction intervals for the first forecast for each model
gdp_forecast2 %>%
  mutate(
    hilo_interval95 = hilo(GDP_per_capita_transformed, 95),
    hilo_lower95 = hilo_interval95$lower,
    hilo_upper95 = hilo_interval95$upper
  ) %>%
  as.data.frame() %>%
  # h is length of the forecast, so h+1 is the row number for the 
  # first forecast of the second model
  slice(1, h+1) %>%
  select(model = `.model`, lower = hilo_lower95, upper = hilo_upper95)
##      model    lower    upper
## 1      ets 187.8378 193.9749
## 2 arima110 189.2593 193.0332

Session Details

pander::pander(sessionInfo())

R version 4.5.2 (2025-10-31)

Platform: aarch64-apple-darwin20

locale: en_US.UTF-8||en_US.UTF-8||en_US.UTF-8||C||en_US.UTF-8||en_US.UTF-8

attached base packages: stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: pander(v.0.6.6), cowplot(v.1.2.0), distributional(v.0.6.0), ggtime(v.0.1.0), latex2exp(v.0.9.8), fable(v.0.5.0), feasts(v.0.4.2), fabletools(v.0.5.1), fpp3(v.1.0.2), tsibbledata(v.0.4.1), tsibble(v.1.1.6), lubridate(v.1.9.4), forcats(v.1.0.1), stringr(v.1.6.0), dplyr(v.1.1.4), purrr(v.1.2.1), readr(v.2.1.6), tidyr(v.1.3.2), tibble(v.3.3.1), ggplot2(v.4.0.1) and tidyverse(v.2.0.0)

loaded via a namespace (and not attached): ggdist(v.3.3.3), utf8(v.1.2.6), rappdirs(v.0.3.4), sass(v.0.4.10), generics(v.0.1.4), anytime(v.0.3.12), lattice(v.0.22-7), stringi(v.1.8.7), hms(v.1.1.4), digest(v.0.6.39), magrittr(v.2.0.4), evaluate(v.1.0.5), grid(v.4.5.2), timechange(v.0.3.0), RColorBrewer(v.1.1-3), fastmap(v.1.2.0), jsonlite(v.2.0.0), scales(v.1.4.0), jquerylib(v.0.1.4), cli(v.3.6.5), crayon(v.1.5.3), rlang(v.1.1.7), ellipsis(v.0.3.2), withr(v.3.0.2), cachem(v.1.1.0), yaml(v.2.3.12), otel(v.0.2.0), tools(v.4.5.2), tzdb(v.0.5.0), vctrs(v.0.7.1), R6(v.2.6.1), lifecycle(v.1.0.5), urca(v.1.3-4), pkgconfig(v.2.0.3), progressr(v.0.18.0), pillar(v.1.11.1), bslib(v.0.10.0), gtable(v.0.3.6), glue(v.1.8.0), Rcpp(v.1.1.1), xfun(v.0.56), tidyselect(v.1.2.1), rstudioapi(v.0.18.0), knitr(v.1.51), farver(v.2.1.2), nlme(v.3.1-168), htmltools(v.0.5.9), labeling(v.0.4.3), rmarkdown(v.2.30), compiler(v.4.5.2) and S7(v.0.2.1)