Homework 6 - ARIMA

Imports

library(gridExtra)
library(ggplot2)
library(cowplot)
options(scipen=10000)
library(mlbench)
library(tidyverse)
library(corrplot)
library(AppliedPredictiveModeling)
library(caret)
library(DataExplorer)
library(kableExtra)
library(mice)
library(fpp3)

9.1

Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

Explain the differences among these figures. Do they all indicate that the data are white noise?

Figure 9.32: Left: ACF for a white noise series of 36 numbers. Middle: ACF for a white noise series of 360 numbers. Right: ACF for a white noise series of 1,000 numbers.

For the three ACF examples, the plots suggest that the structure of the time series closely resembles white noise, meaning that any autocorrelations observed at various lags are likely due to random chance rather than any significant relationship between the values. This indicates that the series lacks meaningful patterns, such as trends or seasonality. The most noticeable difference across the plots is the narrowing of the 95% confidence intervals, which occurs because larger sample sizes provide more precise estimates of autocorrelation. As the number of data points increases, the intervals shrink, further reinforcing that the observed correlations are likely insignificant.

Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

The critical values (confidence intervals) in the ACF plots vary because they are determined by the sample size of each time series. Larger sample sizes result in narrower confidence intervals since more data provides more accurate and reliable estimates of the autocorrelations. In smaller samples, the intervals are wider due to greater uncertainty in the estimates.

Although all three time series represent white noise, the observed autocorrelations differ slightly due to the random nature of white noise. In theory, autocorrelations for white noise should be zero at all lags, but in practice, random fluctuations cause small variations. These variations are more noticeable in smaller samples, where fewer data points lead to less stable estimates. As the sample size increases, the autocorrelations tend to converge more closely around zero, providing a clearer representation of the underlying randomness.

9.2

A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

amazon_stock <-  gafa_stock |> filter(Symbol == "AMZN") |> select(Close)

plot1 <- amazon_stock |> autoplot(Close)

plot2 <- amazon_stock |> ACF(Close) |> autoplot()

plot3 <- amazon_stock |> PACF(Close) |> autoplot()



plot_grid(plot1,plot2,plot3)

The closing price time series for Amazon stock indicates non-stationarity due to the presence of an upward trend, along with changing mean and variance over time. A stationary series requires constant mean and variance, but in the case of stock prices, especially for growing companies like Amazon, an upward trend typically suggests that these properties evolve, confirming non-stationarity.

The ACF plot further supports this claim, as all lags fall outside the upper bound of the 95% confidence interval, with the autocorrelation decaying linearly from the first lag. This pattern, where autocorrelations remain high and slowly decrease, indicates a strong dependence on previous values, which is characteristic of a non-stationary series with a trend. The fact that the autocorrelation decays slowly confirms that past values influence the current values significantly, consistent with trending behavior.

The PACF plot, showing a correlation of 1 at lag 1, strongly suggests that the current value is almost entirely determined by the previous value, which is typical for non-stationary processes. This is indicative of an autoregressive process, likely an AR(1), where the series has a strong correlation at the first lag but weaker correlations at higher lags. The strong correlation at lag 1 reflects the impact of the immediate prior value, while the diminishing influence at further lags suggests that the effect of past values fades quickly. This pattern is commonly observed in time series that follow an AR(1) process and are dominated by trends.

9.3

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

Turkish GDP from global_economy.

turkey_eco <- global_economy |> filter(Country == "Turkey") |> select(GDP)

turkey_eco_plot1 <- turkey_eco |> autoplot(GDP) + ggtitle("Time series of Turkey GDP")

turkey_eco_plot2 <- turkey_eco |> ACF(GDP) |> autoplot() + ggtitle("ACF of Turkey GDP")

turkey_eco_plot3 <- turkey_eco |> PACF(GDP) |> autoplot()  + ggtitle("PACF of Turkey GDP")

plot_grid(turkey_eco_plot1,turkey_eco_plot2,turkey_eco_plot3,ncol = 1)

Transformed and Differenced Time Series

Apply boxcox transformation.

turkey_lambda <- turkey_eco |> features(features = guerrero) |> pull(lambda_guerrero)
turkey_lambda

## [1] 0.1572187

turkey_eco["GDP_TR"] <- box_cox(turkey_eco$GDP,turkey_lambda)
 turkey_eco |> autoplot(GDP_TR) + ggtitle("Time series of Turkey GDP (Transformed)")

The optimal lambda is 0.1572187.

Find the number of differences required to make the data stationary.

turkey_eco |>
  features(GDP_TR, unitroot_ndiffs)

## # A tibble: 1 × 1
##   ndiffs
##    <int>
## 1      1

One difference is required to make the data stationary.

Plots

turkey_eco_plot1_dff <- turkey_eco |> autoplot(difference(GDP_TR)) + ggtitle("Stationary Time series of Turkey GDP (1 difference with lambda = 0.1572187)")

turkey_eco_acf_diff <- turkey_eco |> ACF(difference(GDP_TR)) |> autoplot() + ggtitle("ACF of Turkey GDP (1 difference with lambda = 0.1572187)")

turkey_eco_pacf_diff <- turkey_eco |> PACF(difference(GDP_TR)) |> autoplot() +ggtitle("PACF of Turkey GDP (1 difference with lambda = 0.1572187)")

plot_grid(turkey_eco_plot1_dff, turkey_eco_acf_diff,turkey_eco_pacf_diff,ncol = 1)

The GDP time series, which has been Box-Cox transformed and differenced once, shows clear signs of stationarity. The transformed series is centered around zero with relatively constant variability, as observed in the graphs. Both the ACF and PACF plots suggest that the data resembles white noise, indicating that there is no significant autocorrelation at different lags. This conclusion is further supported by the KPSS test, where a p-value of 0.1 suggests that the null hypothesis of stationarity cannot be rejected, confirming that the data is stationary after the applied transformations.

turkey_eco |>
  features(difference(GDP_TR), unitroot_kpss)

## # A tibble: 1 × 2
##   kpss_stat kpss_pvalue
##       <dbl>       <dbl>
## 1    0.0889         0.1

Accommodation takings in the state of Tasmania from aus_accommodation.

tasmania_takings <- aus_accommodation |> filter(State == "Tasmania") |> select(Takings)

tasmania_tk_plot1 <- tasmania_takings |> autoplot(Takings) + ggtitle("Time series of Tasmania Takings")

tasmania_tk_plot2 <- tasmania_takings |> ACF(Takings) |> autoplot() + ggtitle("ACF of Tasmania Takings")

tasmania_tk_plot3 <- tasmania_takings |> PACF(Takings) |> autoplot()  + ggtitle("PACF of Tasmania Takings")

plot_grid(tasmania_tk_plot1,tasmania_tk_plot2,tasmania_tk_plot3,ncol = 1)

Transformed and Differenced Time Series

Apply boxcox transformation.

tasmania_lambda <- tasmania_takings |> features(features = guerrero) |> pull(lambda_guerrero)
tasmania_lambda

## [1] 0.001819643

tasmania_takings["Takings_TR"] <- box_cox(tasmania_takings$Takings,tasmania_lambda)

 tasmania_takings |> autoplot(Takings_TR) + ggtitle("Time series of Tasmania Takings (Transformed)")

The optimal lambda is 0.001819643.

Find the number of differences required to make the data stationary.

tasmania_takings |>
  features(Takings_TR, unitroot_ndiffs)

## # A tibble: 1 × 1
##   ndiffs
##    <int>
## 1      1

tasmania_takings |>
  features(Takings_TR, unitroot_nsdiffs)

## # A tibble: 1 × 1
##   nsdiffs
##     <int>
## 1       1

One seasonal difference of 4 for quarterly data and a first order difference are required to make the data stationary.

Plots

tasmania_takings["takings_adjusted"] <- difference(tasmania_takings$Takings_TR, 4)
tasmania_takings["takings_adjusted"] <- difference(tasmania_takings$takings_adjusted, 1)


tasmania_tk_plot1_dff <- tasmania_takings |> autoplot(takings_adjusted) + ggtitle("Stationary Time series of the Transformed Tasmania Takings (1 seasonal difference + 1 first order difference)")

tasmania_tk_acf_diff <- tasmania_takings |> ACF(takings_adjusted) |> autoplot() + ggtitle("ACF of Tasmania Takings (1 seasonal difference + 1  first order difference)")

tasmania_tk_pacf_diff <- tasmania_takings |> PACF(takings_adjusted) |> autoplot() +ggtitle("PACF of Tasmania Takings  (1 seasonal difference + 1  first order difference)")

plot_grid(tasmania_tk_plot1_dff, tasmania_tk_acf_diff,tasmania_tk_pacf_diff,ncol = 1)

The Tasmania Takings time series, which has undergone a Box-Cox transformation, first-order differencing, and seasonal differencing for quarterly data, shows clear signs of stationarity. The transformed series is centered around zero with relatively stable variability, as observed in the graphs. While the ACF and PACF plots display a few minor autocorrelations slightly exceeding the confidence intervals, these small deviations are likely due to random noise rather than significant autocorrelation. Overall, the series resembles white noise, indicating no meaningful patterns in the autocorrelations and supporting the stationarity of the data.

This conclusion is further supported by the KPSS test, where a p-value of 0.1 suggests that the null hypothesis of stationarity cannot be rejected, confirming that the data is stationary after the applied transformations.

tasmania_takings |>
  features(difference(takings_adjusted), unitroot_kpss)

## # A tibble: 1 × 2
##   kpss_stat kpss_pvalue
##       <dbl>       <dbl>
## 1    0.0381         0.1

Monthly sales from souvenirs.

souvenirs_ms_plot1 <- souvenirs |> autoplot(Sales) + ggtitle("Time series of Souvenirs Montly Sales")

souvenirs_ms_plot2 <- souvenirs |> ACF(Sales) |> autoplot() + ggtitle("ACF of Souvenirs Montly Sales")

souvenirs_ms_plot3 <- souvenirs |> PACF(Sales) |> autoplot()  + ggtitle("PACF of Souvenirs Montly Sales")

plot_grid(souvenirs_ms_plot1,souvenirs_ms_plot2,souvenirs_ms_plot3,ncol = 1)

Transformed and Differenced Time Series

Apply boxcox transformation.

souvenirs_ms_lambda <- souvenirs |> features(Sales,features = guerrero) |> pull(lambda_guerrero)
tasmania_lambda

## [1] 0.001819643

souvenirs_ms <- souvenirs

souvenirs_ms["Sales_TR"] <- box_cox(souvenirs$Sales,souvenirs_ms_lambda)

souvenirs_ms |> autoplot(Sales_TR) + ggtitle("Time series of Souvenirs Sales (Transformed with lambda = 0.001819643)")

The optimal lambda is 0.001819643.

Find the number of differences required to make the data stationary.

souvenirs_ms |>
  features(Sales_TR, unitroot_ndiffs)

## # A tibble: 1 × 1
##   ndiffs
##    <int>
## 1      1

souvenirs_ms |>
  features(Sales_TR, unitroot_nsdiffs)

## # A tibble: 1 × 1
##   nsdiffs
##     <int>
## 1       1

One seasonal difference of 12 for monthly data and a first order difference are required to make the data stationary.

Plots

souvenirs_ms["Sales_Adjusted"] <- difference(souvenirs_ms$Sales_TR, 12)
souvenirs_ms["Sales_Adjusted"] <- difference(souvenirs_ms$Sales_Adjusted, 1)


souvenirs_ms

## # A tsibble: 84 x 4 [1M]
##       Month Sales Sales_TR Sales_Adjusted
##       <mth> <dbl>    <dbl>          <dbl>
##  1 1987 Jan 1665.     7.48             NA
##  2 1987 Feb 2398.     7.85             NA
##  3 1987 Mar 2841.     8.02             NA
##  4 1987 Apr 3547.     8.25             NA
##  5 1987 May 3753.     8.30             NA
##  6 1987 Jun 3715.     8.29             NA
##  7 1987 Jul 4350.     8.45             NA
##  8 1987 Aug 3566.     8.25             NA
##  9 1987 Sep 5022.     8.60             NA
## 10 1987 Oct 6423.     8.85             NA
## # ℹ 74 more rows

souvenirs_ms_plot1_dff <- souvenirs_ms |> autoplot(Sales_Adjusted) + ggtitle("Stationary Time series of the Transformed Souvenir Sales (1 Seasonal Difference + 1 First Order Difference)")

souvenirs_ms_acf_diff <- souvenirs_ms |> ACF(Sales_Adjusted) |> autoplot() + ggtitle("ACF of Souvenir Sales (1 Seasonal Difference + 1  First Order Difference)")

souvenirs_ms_pacf_diff <- souvenirs_ms |> PACF(Sales_Adjusted) |> autoplot() +ggtitle("PACF of Souvenir Sales (1 Seasonal Difference + 1  First Order Difference)")

plot_grid(souvenirs_ms_plot1_dff, souvenirs_ms_acf_diff,souvenirs_ms_pacf_diff,ncol = 1)

The Souvenir sales time series, which has undergone a Box-Cox transformation, first-order differencing, and seasonal differencing for monthly data, shows some signs of stationarity. The transformed series is centered around zero with relatively stable variability, as observed in the graphs. However, the ACF plot reveals a significant negative autocorrelation outside the lower bound of the confidence interval, suggesting that the data may be over-differenced or that a structural pattern, such as an MA(1) component, may be present. While the PACF shows no clear patterns, the negative autocorrelation likely indicates the need for an MA term to capture this behavior, rather than assuming the deviations are purely random noise. Overall, the series does not fully resemble white noise, and further model adjustments may be needed to account for the negative autocorrelation.

The KPSS test shows a p-value of 0.1, which suggests that the null hypothesis of stationarity cannot be rejected, confirming that the data is stationary after the applied transformations and differentiation.

souvenirs_ms |>
  features(difference(Sales_Adjusted), unitroot_kpss)

## # A tibble: 1 × 2
##   kpss_stat kpss_pvalue
##       <dbl>       <dbl>
## 1    0.0494         0.1

9.5

For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

set.seed(12345678)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

Select Turnover Data

turnover_data <- myseries |> select(Turnover)

turnover_data |> autoplot()

## Plot variable not specified, automatically selected `.vars = Turnover`

turnover_data_plot1 <- turnover_data |> autoplot(Turnover) + ggtitle("Time series of Austraila Retail Turnover")

turnover_data_plot2 <- turnover_data |> ACF(Turnover) |> autoplot() + ggtitle("ACF of Austraila Retail Turnover")

turnover_data_plot3 <- turnover_data |> PACF(Turnover) |> autoplot()  + ggtitle("PACF of Austraila Retail Turnover")

plot_grid(turnover_data_plot1,turnover_data_plot2,turnover_data_plot3,ncol = 1)

Transformed and Differenced Time Series

Apply boxcox transformation.

turnover_data_lambda <- turnover_data |> features(Turnover,features = guerrero) |> pull(lambda_guerrero)
turnover_data_lambda

## [1] 0.08303631

turnover_data["Turnover_TR"] <- box_cox(turnover_data$Turnover,turnover_data_lambda)

 turnover_data |> autoplot(Turnover_TR) + ggtitle("Time series of Austraila Retail Turnover (Transformed)")

The optimal lambda is 0.08303631.

Find the number of differences required to make the data stationary.

turnover_data |>
  features(Turnover, unitroot_ndiffs)

## # A tibble: 1 × 1
##   ndiffs
##    <int>
## 1      1

turnover_data |>
  features(Turnover, unitroot_nsdiffs)

## # A tibble: 1 × 1
##   nsdiffs
##     <int>
## 1       1

One seasonal difference of 12 for monthly data and a first order difference are required to make the data stationary.

Plots

turnover_data["Turnover_Adjusted"] <- difference(turnover_data$Turnover, 12)
turnover_data["Turnover_Adjusted"] <- difference(turnover_data$Turnover_Adjusted, 1)


turnover_ad_plot1_dff <- turnover_data |> autoplot(Turnover_Adjusted) + ggtitle("Stationary Time series of the Transformed Austraila Retail Turnover (1 seasonal difference + 1 first order difference)")

turnover_ad_acf_diff <- turnover_data |> ACF(Turnover_Adjusted) |> autoplot() + ggtitle("ACF of Austraila Retail Turnover (1 seasonal difference + 1  first order difference)")

turnover_ad_pacf_diff <- turnover_data |> PACF(Turnover_Adjusted) |> autoplot() +ggtitle("PACF of Austraila Retail Turnover (1 seasonal difference + 1  first order difference)")

plot_grid(turnover_ad_plot1_dff, turnover_ad_acf_diff,turnover_ad_pacf_diff,ncol = 1)

The Australian Turnover time series, after a Box-Cox transformation, first-order differencing, and seasonal differencing for monthly data, shows some stationarity. The series is centered around zero with stable variability, but the ACF and PACF reveal negative autocorrelations at lag 1 and lag 12, with lag 1 showing a stronger autocorrelation of -0.04. This suggests the need for a non-seasonal MA(1) term to account for the short-term negative autocorrelation. The patterns at these lags indicate further model refinement is needed to better capture the series’ structure and achieve white noise behavior.

turnover_data |>
  features(difference(Turnover_Adjusted), unitroot_kpss)

## # A tibble: 1 × 2
##   kpss_stat kpss_pvalue
##       <dbl>       <dbl>
## 1    0.0260         0.1

9.6

Simulate and plot some data from simple ARIMA models.

1) Use the following R code to generate data from an AR(1) model with Phi = 0.6 and variance = 1. The process starts with y1= 0.

y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)

Produce a time plot for the series. How does the plot change as you modify the parameter phi1?

sim |> autoplot(y)

set.seed(123)
y2 <- numeric(100)
y3 <- numeric(100)
e2 <- rnorm(100)
e3 <- rnorm(100)

for(i in 2:100){
    y2[i] <- 0.1*y2[i-1] + e2[i]
    y3[i] <- 0.9*y3[i-1] + e3[i]
    
}

sim <- tsibble(idx = seq_len(100), y = y, y2 = y2, y3 = y3, index = idx)

plot1 <- sim |> autoplot(y2) + labs( title = "Phi = 0.1")
plot2 <- sim |> autoplot(y) + labs( title = "Phi = 0.6")
plot3 <- sim |> autoplot(y3) + labs( title = "Phi = 0.9")

grid.arrange(plot1, plot2, plot3, nrow = 2)

As you modify the parameter Phi1 in an AR(1) model, the time plot changes from erratic to smoother and more persistent. With low Phi1 (e.g., 0.1), the series shows weak dependence on past values, resulting in random fluctuations and no clear pattern. As Phi1 increases to moderate levels (e.g., 0.6), the series becomes more dependent on previous values, leading to smoother transitions and emerging trends. At higher Phi1 (e.g., 0.9), the series exhibits strong persistence, with long-term trends and gradual shifts, creating a stable, drift-like behavior in the plot.

3) Write your own code to generate data from a moving average model of order 1 (MA(1)) with the parameter theta1 equal to 0.6 and the variance sigma squared equal to 1.

set.seed(123)
y4 <- numeric(100)
e4 <- rnorm(100)

for(i in 2:100)
  y4[i] <- 0.6*e4[i-1] + e4[i]
  
sim2 <- tsibble(idx = seq_len(100), y4 = y4, index = idx)

4) Produce a time plot for the series. How does the plot change as you modify the parameter theta1?

set.seed(123)
y5 <- numeric(100)
y6 <- numeric(100)
e5 <- rnorm(100)
e6 <- rnorm(100)

for(i in 2:100){
  y5[i] <- 0.1*e5[i-1] + e5[i]
  y6[i] <- 0.9*e6[i-1] + e6[i]
}
sim2 <- tsibble(idx = seq_len(100), y4 = y4, y5 = y5, y6 = y6, index = idx)

plot4 <- sim2 |> autoplot(y5) + labs( title = "Theta = 0.1")
plot5 <- sim2 |> autoplot(y4) + labs( title = "Theta = 0.6")
plot6 <- sim2 |> autoplot(y6) + labs( title = "Theta = 0.9")

grid.arrange(plot4, plot5, plot6, nrow = 2)

As you modify the parameter Theta1 in a Moving Average (MA) model, the time plot changes significantly, reflecting the influence of past errors on the series. With a low Theta1 (e.g., 0.1), the series exhibits weak dependence on previous errors, resulting in rapid, random fluctuations and a plot that resembles white noise. As Theta1 increases to a moderate level (e.g., 0.6), the series becomes smoother, with more gradual transitions, as past errors have a greater influence on current values. When Theta1 is close to 1 (e.g., 0.9), the series displays strong autocorrelation, leading to longer-lasting trends and a plot that is more persistent and stable, with fewer sharp changes. Overall, as Theta1 increases, the time series transitions from random and erratic to smoother and more trend-driven.

5) Generate data from an autoregressive moving average model of order 1,1 (ARMA(1,1)) with the parameters phi1 equal to 0.6, theta1 equal to 0.6, and the variance sigma squared equal to 1.

set.seed(123)
y7 <- (numeric(100))
e7 <- rnorm(100)

for(i in 2:100)
  y7[i] <- 0.6*y7[i-1] + 0.6*e7[i-1] + e7[i] 

sim3 <- tsibble(idx = seq_len(100), y7 = y7, index = idx)

6) Generate data from an autoregressive model of order 2 (AR(2)) with the parameters phi1 equal to -0.8, phi2 equal to 0.3, and the variance sigma squared equal to 1. (Note that these parameters will give a non-stationary series.)

set.seed(123)
y8 <- (numeric(100))
e8 <- rnorm(100)

for(i in 3:100)
  y8[i] <- -0.8*y8[i-1] + 0.3*y8[i-2] + e8[i]

sim4 <- tsibble(idx = seq_len(100), y8 = y8, index = idx)

7) Graph the latter two series and compare them.

plot7 <- sim3 |> autoplot(y7) + labs( title = "ARMA(1,1) Phi = 0.6 and Theta = 0.6")
plot8 <- sim4 |> autoplot(y8) + labs( title = "AR(2) Phi1 = -0.8 and Phi2 = 0.3")

grid.arrange(plot7, plot8, nrow = 2)

In your ARIMA ARMA(1,1) model with Phi = 0.6 and Theta = 0.6, the absence of a clear trend or seasonality suggests that the series is driven more by short-term dynamics without long-term directional movement or repeating patterns. The influence of past values (Phi) and past errors (Theta) balances out, leading to a relatively stable and fluctuating series without strong directional movement over time.

On the other hand, the AR(2) model with Phi1 = -0.8 and Phi2 = 0.3 shows a different behavior. Initially, the series may appear flat or stable, but over time, variability increases in a multiplicative manner. This suggests that the model is likely capturing some instability or non-stationarity, where the effects of the autoregressive terms grow over time, amplifying the variability and leading to more pronounced fluctuations as the series evolves. The combination of negative and positive Phi values in this AR(2) model could be contributing to this complex behavior, with an initial damping effect followed by increasing volatility.

9.7

Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

aus_airpassengers |> autoplot(Passengers)

a) Use ARIMA() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

arima_fit <- aus_airpassengers |>
  model(ARIMA(Passengers))

report(arima_fit)

## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8963
## s.e.   0.0594
## 
## sigma^2 estimated as 4.308:  log likelihood=-97.02
## AIC=198.04   AICc=198.32   BIC=201.65

Residuals

arima_fit |> gg_tsresiduals()

Forecast

arima_fit |> forecast(h=10) |>
  autoplot(aus_airpassengers)

b) Write the model in terms of the backshift operator.

(1 - B)^2 * Y_t = (1 - 0.8963 * B) * e_t

Where:

(1 - B)^2 represents the second-order differencing (since the differencing parameter d = 2). Y_t is the original time series (Passengers). B is the backshift operator, meaning B * Y_t = Y_(t-1). 0.8963 is the moving average (MA) coefficient for the MA(1) term. e_t is the error term (white noise).

c) Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.

arima_fit2 <- aus_airpassengers |>
  model(ARIMA(Passengers ~ pdq(0,1,0)))

report(arima_fit2)

## Series: Passengers 
## Model: ARIMA(0,1,0) w/ drift 
## 
## Coefficients:
##       constant
##         1.4191
## s.e.    0.3014
## 
## sigma^2 estimated as 4.271:  log likelihood=-98.16
## AIC=200.31   AICc=200.59   BIC=203.97

arima_fit2 |> forecast(h=10) |>
  autoplot(aus_airpassengers)

When comparing ARIMA(0,1,0) and ARIMA(0,2,1), both models produce the same straight diagonal prediction line, but their prediction intervals differ. In ARIMA(0,1,0), the intervals widen slightly with a subtle curvature, and the lower bounds expand more than the upper bounds, indicating growing uncertainty over time. In contrast, ARIMA(0,2,1) shows a more linear and symmetric expansion of intervals, reflecting more consistent and balanced predictions. The key insight is that ARIMA(0,2,1) provides more stable and predictable uncertainty over time, while ARIMA(0,1,0) suggests increasing unpredictability, especially at longer forecast horizons.

d) Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.

arima_fit3 <- aus_airpassengers |>
  model(ARIMA(Passengers ~ 1 +pdq(2,1,2)))

report(arima_fit3)

## Series: Passengers 
## Model: ARIMA(2,1,2) w/ drift 
## 
## Coefficients:
##           ar1      ar2     ma1     ma2  constant
##       -0.5518  -0.7327  0.5895  1.0000    3.2216
## s.e.   0.1684   0.1224  0.0916  0.1008    0.7224
## 
## sigma^2 estimated as 4.031:  log likelihood=-96.23
## AIC=204.46   AICc=206.61   BIC=215.43

arima_fit3 |> forecast(h=10) |>
  autoplot(aus_airpassengers)

The ARIMA(Passengers ~ 1 + pdq(2,1,2)) model is more complex compared to ARIMA(0,1,0) and ARIMA(0,2,1) due to its AR(2) and MA(2) components, which account for both lagged values and past errors. This complexity results in a forecast line with slight undulations, reflecting how the model responds to more detailed patterns in the data. In contrast, the ARIMA(0,1,0) prediction line remains mostly straight but with slight curvature in its prediction intervals, which widen more gradually and curve as the forecast horizon increases. The lower bounds tend to widen a bit more than the upper bounds, indicating uneven uncertainty.

On the other hand, ARIMA(0,2,1) has prediction intervals that are linear and perfectly symmetrical, expanding in a consistent, straight manner. In comparison, the ARIMA(2,1,2) model produces fluctuating confidence intervals that show slight ups and downs, reflecting the more dynamic interactions between its AR and MA terms, capturing greater variability in the data.

Removing the constant

arima_fit3_nc <- aus_airpassengers |>
  model(ARIMA(Passengers ~ pdq(2,1,2)))

## Warning: It looks like you're trying to fully specify your ARIMA model but have not said if a constant should be included.
## You can include a constant using `ARIMA(y~1)` to the formula or exclude it by adding `ARIMA(y~0)`.

## Warning: 1 error encountered for ARIMA(Passengers ~ pdq(2, 1, 2))
## [1] Could not find an appropriate ARIMA model.
## This is likely because automatic selection does not select models with characteristic roots that may be numerically unstable.
## For more details, refer to https://otexts.com/fpp3/arima-r.html#plotting-the-characteristic-roots

report(arima_fit3_nc)

## Series: Passengers 
## Model: NULL model 
## NULL model

When we remove the constant we get an warning indicating that we haven’t explicitly indicated whether to include a constant (drift) in the model. This can lead to issues, especially when the model encounters numerically unstable characteristic roots in the autoregressive or moving average components. These roots must satisfy stationarity conditions to ensure the model can be fit properly. Without specifying a constant, the model might face instability, particularly with more complex AR and MA terms.

e) Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?

arima_fit4 <- aus_airpassengers |>
  model(ARIMA(Passengers ~ 1 + pdq(0,2,1)))

## Warning: Model specification induces a quadratic or higher order polynomial trend. 
## This is generally discouraged, consider removing the constant or reducing the number of differences.

report(arima_fit4)

## Series: Passengers 
## Model: ARIMA(0,2,1) w/ poly 
## 
## Coefficients:
##           ma1  constant
##       -1.0000    0.0571
## s.e.   0.0585    0.0213
## 
## sigma^2 estimated as 3.855:  log likelihood=-95.1
## AIC=196.21   AICc=196.79   BIC=201.63

The warning for ARIMA(0,2,1) with a constant occurs because the model, with two levels of differencing and a constant, induces a quadratic trend in the data, meaning the series will grow or decline at an accelerating rate over time. This is generally discouraged as it can lead to unrealistic forecasts. To resolve this, we can either remove the constant to avoid the quadratic trend or reduce the number of differences to prevent this higher-order growth.

9.8

For the United States GDP series (from global_economy):

usa <- global_economy |>
  filter(Country == "United States")

usa |> autoplot(GDP)

a)if necessary, find a suitable Box-Cox transformation for the data.

Lambda Value

lambda_us <- usa |>
  features(GDP,features=guerrero) |>
  pull(lambda_guerrero)
lambda_us

## [1] 0.2819443

Transformed Data

usa |> autoplot(box_cox(GDP, lambda_us))

b)fit a suitable ARIMA model to the transformed data using ARIMA()

fit_us <- usa |>
  model(ARIMA(box_cox(GDP, lambda_us)))

report(fit_us)

## Series: GDP 
## Model: ARIMA(1,1,0) w/ drift 
## Transformation: box_cox(GDP, lambda_us) 
## 
## Coefficients:
##          ar1  constant
##       0.4586  118.1822
## s.e.  0.1198    9.5047
## 
## sigma^2 estimated as 5479:  log likelihood=-325.32
## AIC=656.65   AICc=657.1   BIC=662.78

c)try some other plausible models by experimenting with the orders chosen

alternate_models <- usa |>
  model(
    "ARIMA(2,1,2)" = ARIMA(box_cox(GDP, lambda_us) ~ 1 + pdq(2,1,2)),
    "ARIMA(0,1,0)" = ARIMA(box_cox(GDP, lambda_us) ~ 1 + pdq(0,1,0)),
    "ARIMA(1,1,0)" = ARIMA(box_cox(GDP, lambda_us) ~ 1 + pdq(1,1,0))       
      )

d)choose what you think is the best model and check the residual diagnostics

Among the three models—ARIMA(1,1,0), ARIMA(2,1,2), and ARIMA(0,1,0)—the ARIMA(1,1,0) model stands out as the most optimal based on several key statistical criteria. It achieves the lowest AIC (657) and AICc (657), both of which suggest that it offers the best balance between model accuracy and complexity. These measures indicate that the ARIMA(1,1,0) model captures the underlying patterns in the data more effectively without introducing unnecessary parameters. Furthermore, its BIC (663) is also the lowest, reinforcing the idea that this model is the simplest and most parsimonious option. Although its sigma2 is slightly higher than that of the ARIMA(0,1,0) model, the overall fit, reflected in the AIC and BIC values, makes ARIMA(1,1,0) the superior model. It combines simplicity with performance, making it the most appropriate choice for this analysis.

fit_us %>% gg_tsresiduals()

For the ARIMA(1,1,0) model, the residuals show right skewness in the histogram, likely indicating the presence of outliers that are affecting the model’s accuracy. While the ACF shows white noise, meaning the model has captured the time dependencies well, the innovation residuals display significant troughs, suggesting occasional large deviations that the model isn’t fully accounting for. In summary, although the model effectively handles autocorrelation, the skewness and troughs in the residuals point to potential outliers that could be addressed to improve the model’s performance.

e)produce forecasts of your fitted model. Do the forecasts look reasonable?

fit_us %>% 
  forecast(h=10) %>%
  autoplot(usa)

f)compare the results with what you would obtain using ETS() (with no transformation).

fit_ets <- usa %>%
  model(ETS(GDP))

fit_ets %>%
  forecast(h=10) %>%
  autoplot(usa)

report(fit_ets)

## Series: GDP 
## Model: ETS(M,A,N) 
##   Smoothing parameters:
##     alpha = 0.9990876 
##     beta  = 0.5011949 
## 
##   Initial states:
##          l[0]        b[0]
##  448093333334 64917355687
## 
##   sigma^2:  0.0007
## 
##      AIC     AICc      BIC 
## 3190.787 3191.941 3201.089

When comparing the ETS(M,A,N) model to the ARIMA(1,1,0) model, a clear distinction arises in terms of model fit and complexity. The ETS model, which uses multiplicative errors, an additive trend, and no seasonality, results in significantly higher AIC (3190.787), AICc (3191.941), and BIC (3201.089) values compared to the ARIMA model’s AIC (657) and BIC (663). These higher values indicate that while the ETS model may be capturing more intricate details, it is not as efficient or optimal as ARIMA(1,1,0) in terms of balancing model fit and complexity. The sigma^2 of 0.0007 in the ETS model shows that it handles error variance well, but the much higher information criteria suggest that it could be overfitting the data or introducing unnecessary complexity.

On the other hand, the ARIMA(1,1,0) model presents a more straightforward and efficient solution, as evidenced by its much lower AIC and BIC. The ARIMA model’s simplicity, without the need for intricate smoothing parameters or error structures, offers a stable fit with minimal complexity. While the ETS model’s smoothing parameters—alpha near 1 and beta around 0.5—suggest that it emphasizes recent observations and captures trends dynamically, the ARIMA model still outperforms it overall. The ARIMA(1,1,0) model strikes a better balance between fit and simplicity, making it the more appropriate choice for the dataset based on these selection criteria.

Homework 6 - ARIMA

Lewris Mota Sanchez

2024-10-19

Imports

9.1

9.2

9.3

9.5

9.6

9.7

9.8