1) Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

a) Explain the differences among these figures. Do they all indicate that the data are white noise?

Answer:

All three figures are consistent with white noise: the autocorrelation function (ACF) bars fall within the dashed lines marking the critical values, so none of the autocorrelations are statistically significant. With fewer observations the ACF bars are taller, and as the sample size grows the sample autocorrelations shrink towards zero. Because the data are randomly generated, they are independent and identically distributed (i.i.d.), so the true autocorrelation is zero at every lag; larger samples simply estimate that zero more precisely, which is exactly the pattern the figures show.
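Simulating white noise directly reproduces this behaviour (a minimal sketch, assuming the fpp3 package is loaded as in the rest of these answers; increase n and the bars shrink):

set.seed(1)
n <- 360
tsibble(idx = seq_len(n), y = rnorm(n), index = idx) %>%
  ACF(y) %>%
  autoplot() +
  labs(title = paste("ACF of", n, "white noise observations"))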

b) Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

Answer:

The dashed lines are placed at ±1.96/√N, centred at zero, so the critical values shrink as the sample size N grows. Small samples are more likely to exhibit sizeable autocorrelation purely by chance, so the bounds must be wider: a larger sample autocorrelation is required in a small dataset before we can reject the null hypothesis that the observed autocorrelation arose by chance. The autocorrelations themselves differ across the figures because each is an estimate from a finite sample of white noise; with more data the estimates cluster more tightly around zero.
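As a quick check, the bounds for the three sample sizes can be computed directly:

# critical-value bounds +/- 1.96/sqrt(N) for each figure's sample size
N <- c(36, 360, 1000)
round(1.96 / sqrt(N), 3)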


2) A classic example of a non-stationary series is stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

Answer:

Let’s filter out and plot the Amazon stock series:

gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  gg_tsdisplay(Close, plot_type='partial') +
  labs(title = "Amazon Closing Stock Price")

gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  features(Close, unitroot_ndiffs) 
## # A tibble: 1 × 2
##   Symbol ndiffs
##   <chr>   <int>
## 1 AMZN        1

The time series has a strong upward trend, indicating non-stationarity. The ACF confirms this: it decays very slowly from a large positive lag-1 autocorrelation (r1), whereas the ACF of a stationary series drops to zero quickly. The PACF shows a single large spike at lag 1, also consistent with a series that should be differenced to stabilise the mean. The KPSS-based unit-root test (unitroot_ndiffs) indicates that the Amazon closing price requires a single difference to achieve stationarity. Let's carry out the differencing and see how it looks:

gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  autoplot(difference(Close)) +
  labs(title = "Amazon Closing Stock Price")

After differencing, the series looks stationary: it fluctuates around a constant mean with no trend. Since there is no seasonality in the series, a single non-seasonal difference does the job.
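As a sanity check, we can run a KPSS test directly on the differenced series; a large p-value means stationarity is not rejected:

# KPSS test on the differenced closing price; kpss_pvalue >= 0.05 means we
# cannot reject the null hypothesis of stationarity
gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  features(difference(Close), unitroot_kpss)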


3) For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

a) Turkish GDP from global_economy.

Answer:

# plot
global_economy %>%
  filter(Country == "Turkey") %>%
  gg_tsdisplay(GDP, plot_type='partial') +
  labs(title = "Non-transformed Turkish GDP")

# calculate lambda
lambda <- global_economy %>%
  filter(Country == "Turkey") %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

# unit root test
global_economy %>%
  filter(Country == "Turkey") %>%
  features(box_cox(GDP,lambda), unitroot_ndiffs)
## # A tibble: 1 × 2
##   Country ndiffs
##   <fct>    <int>
## 1 Turkey       1
global_economy %>%
  filter(Country == "Turkey") %>%
  gg_tsdisplay(difference(box_cox(GDP,lambda)), plot_type='partial') +
  labs(title = latex2exp::TeX(paste0("Differenced Turkish GDP with $\\lambda$ = ",
         round(lambda,2))))

Turkish GDP is non-seasonal and non-stationary. The KPSS-based unit-root test recommends a single difference, and after differencing the Box-Cox-transformed series once, the plots look stationary.

b) Accommodation takings in the state of Tasmania from aus_accommodation.

# plot
aus_accommodation %>%
  filter(State == "Tasmania") %>%
  gg_tsdisplay(Takings, plot_type='partial') +
  labs(title = "Non-transformed Tasmania Accomodation Takings")

# calculate lambda
lambda <- aus_accommodation %>%
  filter(State == "Tasmania") %>%
  features(Takings, features = guerrero) %>%
  pull(lambda_guerrero)

# seasonal differencing test (nsdiffs)
aus_accommodation %>%
  filter(State == "Tasmania") %>%
  features(box_cox(Takings,lambda), unitroot_nsdiffs) 
## # A tibble: 1 × 2
##   State    nsdiffs
##   <chr>      <int>
## 1 Tasmania       1
# transformed plot
aus_accommodation %>%
  filter(State == "Tasmania") %>%
  gg_tsdisplay(difference(box_cox(Takings,lambda), 4), plot_type='partial') +
  labs(title = latex2exp::TeX(paste0("Differenced Tasmania Accomodation Takings with $\\lambda$ = ",
         round(lambda,2))))

Accommodation takings in Tasmania are seasonal and non-stationary. The unitroot_nsdiffs feature suggests a single seasonal difference (lag 4 for quarterly data), and the seasonally differenced series looks stationary in the plots: the ACF decays rapidly and the PACF cuts off after the first lag. That said, adding a further first difference appears to centre the data around zero even more effectively, as sketched below.
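A sketch of that extra first difference on top of the seasonal one:

# seasonal (lag-4) difference followed by a first difference
aus_accommodation %>%
  filter(State == "Tasmania") %>%
  gg_tsdisplay(difference(difference(box_cox(Takings, lambda), 4)),
               plot_type = 'partial') +
  labs(title = "Doubly differenced Tasmania Accommodation Takings")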

c) Monthly sales from souvenirs.

# plot
souvenirs %>%
  gg_tsdisplay(Sales, plot_type='partial', lag = 36) +
  labs(title = "Non-transformed Monthly Souvenir Sales")

# calculate lambda
lambda <- souvenirs %>%
  features(Sales, features = guerrero) %>%
  pull(lambda_guerrero)

# unit root test
souvenirs %>%
  features(box_cox(Sales,lambda), unitroot_nsdiffs) 
## # A tibble: 1 × 1
##   nsdiffs
##     <int>
## 1       1
souvenirs %>%
  gg_tsdisplay(difference(box_cox(Sales,lambda), 12), plot_type='partial', lag = 36) +
  labs(title = latex2exp::TeX(paste0("Differenced Monthly Souvenir Sales with $\\lambda$ = ",
         round(lambda,2))))

Monthly souvenir sales are seasonal and non-stationary. The unitroot_nsdiffs feature suggests a single seasonal difference (lag 12 for monthly data), and the lambda value close to 0 indicates that a natural log transformation would serve equally well. After the seasonal difference the data look stationary: the ACF decays rapidly and the PACF cuts off after the initial lag.
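Since lambda is close to 0, a log transformation gives a nearly identical result; a quick sketch:

# using log() in place of box_cox(); with lambda near 0 the two almost coincide
souvenirs %>%
  gg_tsdisplay(difference(log(Sales), 12), plot_type = 'partial', lag = 36) +
  labs(title = "Seasonally differenced log Souvenir Sales")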


5) For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

Answer:

Let’s plot the data first from exercise 7 in section 2.10:

set.seed(1234)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1)) 

# plot
myseries %>%
  gg_tsdisplay(Turnover, plot_type='partial', lag = 36) +
  labs(title = "Non-transformed Retail Turnover")

The variance is not constant throughout the series, so a transformation is clearly needed, and the trend and seasonality confirm that the series is non-stationary. Let's find an appropriate lambda for the Box-Cox transformation and then run the differencing tests to determine how many differences are needed to make the series stationary.

# lambda calculation
lambda <- myseries %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)
print(lambda)
## [1] 0.3196221

The lambda value for the Box-Cox transformation is 0.3196. Now let's carry out the seasonal differencing test.

# seasonal differencing test (nsdiffs)
myseries %>%
  features(box_cox(Turnover, lambda), unitroot_nsdiffs) 
## # A tibble: 1 × 3
##   State    Industry                                      nsdiffs
##   <chr>    <chr>                                           <int>
## 1 Tasmania Cafes, restaurants and takeaway food services       1

According to unitroot_nsdiffs, a single seasonal difference will convert our non-stationary series into a stationary one.

Let's apply both the transformation and the difference and plot the series again:

myseries %>%
  gg_tsdisplay(difference(box_cox(Turnover,lambda), 12), plot_type='partial', lag = 36) +
  labs(title = latex2exp::TeX(paste0("Differenced Tasmania Accomodation Takings with $\\lambda$ = ",
         round(lambda,2))))

After the transformation and one seasonal difference, the data appear stationary: the series is centred around zero and the ACF and PACF no longer show the slow decay of a trending series. A further check for any remaining first difference is sketched below.
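To confirm that no further first difference is required after the seasonal one, we can run unitroot_ndiffs on the seasonally differenced series; an ndiffs of 0 would confirm it:

# first-difference check after the seasonal difference
myseries %>%
  features(difference(box_cox(Turnover, lambda), 12), unitroot_ndiffs)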


6) Simulate and plot some data from simple ARIMA models.

a) Use the following R code to generate data from an AR(1) model with \(ϕ_1\) = 0.6 and \(σ^2\) = 1. The process starts with \(y_1\) = 0.

y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)

b) Produce a time plot for the series. How does the plot change as you change \(ϕ_1\)?

sim %>%
  autoplot(y) +
  labs(title = latex2exp::TeX(" AR(1) model with $\\phi_i$ = 0.6"))

# phi_1 = 1
for(i in 2:100)
  y[i] <- 1*y[i-1] + e[i]

tsibble(idx = seq_len(100), y = y, index = idx) %>%
  autoplot(y) +
  labs(title = latex2exp::TeX(" AR(1) model with $\\phi_i$ = 1"))

# phi_1 = 0
for(i in 2:100)
  y[i] <- 0*y[i-1] + e[i]

tsibble(idx = seq_len(100), y = y, index = idx) %>%
  autoplot(y) +
  labs(title = latex2exp::TeX(" AR(1) model with $\\phi_i$ = 0"))

# phi_1 = -0.6
for(i in 2:100)
  y[i] <- -0.6*y[i-1] + e[i]

tsibble(idx = seq_len(100), y = y, index = idx) %>%
  autoplot(y) +
  labs(title = latex2exp::TeX(" AR(1) model with $\\phi_i$ = -0.6"))

Altering \(ϕ_1\) changes the character of the series. Setting \(ϕ_1 = 0\) yields white noise, while \(ϕ_1 = 1\) produces a random walk. Negative values of \(ϕ_1\) make the series oscillate around its mean. As \(|ϕ_1|\) approaches 1 the variation grows, since a stationary AR(1) has variance \(σ^2/(1-ϕ_1^2)\), so the spikes in the series become more pronounced.
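A quick calculation of that theoretical variance for a few values of \(ϕ_1\) (with \(σ^2 = 1\)):

# Var(y_t) = sigma^2 / (1 - phi^2) for a stationary AR(1); grows as |phi| -> 1
phi <- c(-0.6, 0, 0.6, 0.9)
round(1 / (1 - phi^2), 2)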

c) Write your own code to generate data from an MA(1) model with \(θ_1\) = 0.6 and \(σ^2\) = 1.

y <- numeric(100)  # reset y from the AR(1) runs above
y[1] <- e[1]       # no e[0] is available, so start from e[1]
for(i in 2:100)
  y[i] <- e[i] + 0.6 * e[i-1]

sim2 <- tsibble(idx = seq_len(100), y = y, index = idx)

d) Produce a time plot for the series. How does the plot change as you change \(θ_1\)?

sim2 %>%
  autoplot(y) +
  labs(title = latex2exp::TeX(" MA(1) model with $\\theta_i$ = 0.6"))

# theta_1 = 1
for(i in 2:100)
  y[i] <- e[i] + 1 * e[i-1]

tsibble(idx = seq_len(100), y = y, index = idx) %>%
  autoplot(y) +
  labs(title = latex2exp::TeX(" MA(1) model with $\\theta_1$ = 1"))

# theta_1 = 0
for(i in 2:100)
  y[i] <- e[i] + 0 * e[i-1]

tsibble(idx = seq_len(100), y = y, index = idx) %>%
  autoplot(y) +
  labs(title = latex2exp::TeX(" MA(1) model with $\\theta_1$ = 0"))

# theta_1 = -0.6
for(i in 2:100)
  y[i] <- e[i] + -0.6 * e[i-1]

tsibble(idx = seq_len(100), y = y, index = idx) %>%
  autoplot(y) +
  labs(title = latex2exp::TeX(" MA(1) model with $\\theta_1$ = -0.6"))

Changing \(θ_1\) has a milder effect than changing \(ϕ_1\): an MA(1) process is stationary for any \(θ_1\), with variance \(σ^2(1+θ_1^2)\) and only the lag-1 autocorrelation non-zero. Positive \(θ_1\) makes adjacent observations more alike, giving a slightly smoother plot; negative \(θ_1\) makes consecutive values alternate more, so the spikes appear more frequent; and \(θ_1 = 0\) is simply white noise. In all cases the series stays centred on its mean with broadly similar magnitude.

e) Generate data from an ARMA(1,1) model with \(ϕ_1\) = 0.6, \(θ_1\) = 0.6 and \(σ^2\) = 1.

set.seed(1234)
y <- numeric(100)
e <- rnorm(100)  # regenerate the errors so that set.seed() actually takes effect

for(i in 2:100)
  y[i] <- e[i] + 0.6 * e[i-1] + 0.6 * y[i-1]

arma1_1 <- tsibble(idx = seq_len(100), y = y, index = idx)

f) Generate data from an AR(2) model with \(ϕ_1\) = −0.8, \(ϕ_2\) = 0.3 and \(σ^2\) = 1. (Note that these parameters will give a non-stationary series.)

set.seed(1234)
y <- numeric(100)
e <- rnorm(100)  # regenerate the errors so that set.seed() actually takes effect

for(i in 3:100)
  y[i] <- -0.8 * y[i-1] + 0.3 * y[i-2] + e[i]

ar2 <- tsibble(idx = seq_len(100), y = y, index = idx)

g) Graph the latter two series and compare them.

arma1_1 %>%
  gg_tsdisplay(y, plot_type='partial') +
  labs(title = latex2exp::TeX("ARMA(1,1) model with $\\phi_1 = 0.6$, $\\theta_1 = 0.6$, and $\\sigma^2 = 1$"))

ar2 %>%
  gg_tsdisplay(y, plot_type='partial') +
  labs(title = latex2exp::TeX("AR(2) model with $\\phi_1 = -0.8$, $\\phi_2 = 0.3$, and $\\sigma^2 = 1$"))

The ARMA(1,1) series looks stationary: it varies randomly around its mean, the ACF decays rapidly, and the PACF cuts off after the first lag. The AR(2) series is non-stationary: it oscillates around the mean with amplitude growing roughly exponentially over time, its ACF oscillates, and its PACF shows a large negative spike at lag 1. The difference arises because the AR(2) parameters violate the stationarity constraints \(ϕ_1 + ϕ_2 < 1\), \(ϕ_2 - ϕ_1 < 1\) and \(|ϕ_2| < 1\): here \(ϕ_2 - ϕ_1 = 0.3 - (-0.8) = 1.1 > 1\).
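A quick way to verify which constraint fails (plain base R):

# AR(2) stationarity conditions: phi1 + phi2 < 1, phi2 - phi1 < 1, |phi2| < 1
phi1 <- -0.8
phi2 <- 0.3
c(phi1 + phi2 < 1, phi2 - phi1 < 1, abs(phi2) < 1)  # the FALSE marks the violation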


7) Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

a) Use ARIMA() to find an appropriate ARIMA model. What model was selected? Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

Answer:

fit <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers)) 

report(fit)
## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8756
## s.e.   0.0722
## 
## sigma^2 estimated as 4.671:  log likelihood=-87.8
## AIC=179.61   AICc=179.93   BIC=182.99

fable's automatic ARIMA() selection chose an ARIMA(0,2,1). Now let's check the model's forecasts and residuals.

fit %>% 
  forecast(h=10) %>%
  autoplot(aus_airpassengers) +
  labs(title = "Australian Aircraft Passengers with ARIMA(0,2,1)", y = "Passengers (in millions)")

And the residuals:

fit %>% 
  gg_tsresiduals() +
  labs(title = "Residuals for ARIMA(0,2,1)")

The residuals resemble white noise: no spikes fall outside the bounds on the ACF plot, and the histogram is close to a normal distribution.

b) Write the model in terms of the backshift operator.

With the estimated coefficient \(\theta_1 = -0.88\), the ARIMA(0,2,1) model is

\((1-B)^2 y_t = (1 + \theta_1 B)\varepsilon_t\)

i.e.

\((1-B)^2 y_t = (1-0.88B)\varepsilon_t\)

c) Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.

fit2 <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ 1 + pdq(0,1,0)))  # the 1 forces the drift term

fit2 %>% 
  forecast(h=10) %>%
  autoplot(aus_airpassengers) +
  labs(title = "Australian Aircraft Passengers with ARIMA(0,1,0)", y = "Passengers (in millions)")

fit2 %>% 
  gg_tsresiduals() +
  labs(title = "Residuals for ARIMA(0,1,0)")

The ARIMA(0,2,1) forecasts from part a) sit above the actual values observed after 2011, while the ARIMA(0,1,0) with drift forecasts sit below them, and their slope is more gradual.
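To compare the two directly, both models can be fitted side by side and their forecasts overlaid (the model names here are just arbitrary labels):

aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(
    arima021 = ARIMA(Passengers ~ pdq(0,2,1)),
    arima010_drift = ARIMA(Passengers ~ 1 + pdq(0,1,0))
  ) %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers, level = NULL)  # level = NULL hides the intervals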

d) Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.

fit3 <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ 1 + pdq(2,1,2)))  # the 1 forces the drift term

fit3 %>% 
  forecast(h=10) %>%
  autoplot(aus_airpassengers) +
  labs(title = "Australian Aircraft Passengers with ARIMA(2,1,2)", y = "Passengers (in millions)")

fit3 %>% 
  gg_tsresiduals() +
  labs(title = "Residuals for ARIMA(2,1,2)")

#removing constant
fit4 <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ 0 + pdq(2,1,2)))
## Warning: 1 error encountered for ARIMA(Passengers ~ 0 + pdq(2, 1, 2))
## [1] non-stationary AR part from CSS

The forecasts closely resemble a straight line, similar to part c), and the residuals again look like white noise. Removing the constant causes estimation to fail ("non-stationary AR part from CSS"), producing a null model. Without a constant, the long-term forecasts would follow a polynomial trend of order d − 1 = 0, i.e. they would flatten out to a constant level rather than continue the upward drift.

e) Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?

fit5 <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ 1 + pdq(0,2,1)))
## Warning: Model specification induces a quadratic or higher order polynomial trend. 
## This is generally discouraged, consider removing the constant or reducing the number of differences.

The forecast slope becomes steeper, and a warning is issued that the model specification induces a quadratic (or higher-order) polynomial trend, recommending that the constant be removed or the number of differences reduced.

fit5 %>% 
  forecast(h=10) %>%
  autoplot(aus_airpassengers) +
  labs(title = "Australian Aircraft Passengers with ARIMA(0,2,1) with constant", y = "Passengers (in millions)")

fit5 %>% 
  gg_tsresiduals() +
  labs(title = "Residuals for ARIMA(0,2,1) with constant")


8) For the United States GDP series (from global_economy):

a) if necessary, find a suitable Box-Cox transformation for the data;

us_gdp <- global_economy %>%
  filter(Code == 'USA') %>%
  select(Country, GDP)
  
us_gdp %>%
  autoplot(GDP)

lambda_gdp <- us_gdp %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)
lambda_gdp
## [1] 0.2819443
us_gdp <- us_gdp %>%
  mutate(GDP = box_cox(GDP, lambda_gdp))

b) fit a suitable ARIMA model to the transformed data using ARIMA();

fit6 <- us_gdp %>%
  model(
    arima = ARIMA(GDP, stepwise = FALSE, approx = FALSE)
  )
report(fit6)
## Series: GDP 
## Model: ARIMA(1,1,0) w/ drift 
## 
## Coefficients:
##          ar1  constant
##       0.4586  118.1822
## s.e.  0.1198    9.5047
## 
## sigma^2 estimated as 5479:  log likelihood=-325.32
## AIC=656.65   AICc=657.1   BIC=662.78

c) try some other plausible models by experimenting with the orders chosen;

We can look at the ACF and PACF plots of the differenced series to select candidate models:

us_gdp %>%
  gg_tsdisplay(difference(GDP), plot_type='partial') +
  labs(title = "Transformed United States GDP")

From the ACF plot, the last significant spike is at lag 1, suggesting an MA model such as ARIMA(0,1,1). Let's also try other variations by manually changing the pdq order.

usa_fit <- us_gdp %>%
  model(arima011=ARIMA(GDP ~ pdq(0,1,1)),
    arima110 = ARIMA(GDP ~ pdq(1,1,0)),
    arima120 = ARIMA(GDP ~ pdq(1,2,0)),
    arima210 = ARIMA(GDP ~ pdq(2,1,0)),
    arima212 = ARIMA(GDP ~ pdq(2,1,2)),
    arima111 = ARIMA(GDP ~ pdq(1,1,1)))

glance(usa_fit) %>% arrange(AICc) %>% select(.model:BIC)
## # A tibble: 6 × 6
##   .model   sigma2 log_lik   AIC  AICc   BIC
##   <chr>     <dbl>   <dbl> <dbl> <dbl> <dbl>
## 1 arima120  6780.   -326.  656.  656.  660.
## 2 arima110  5479.   -325.  657.  657.  663.
## 3 arima011  5689.   -326.  659.  659.  665.
## 4 arima111  5580.   -325.  659.  659.  667.
## 5 arima210  5580.   -325.  659.  659.  667.
## 6 arima212  5734.   -325.  662.  664.  674.

The models have similar AICc values, with ARIMA(1,2,0) the lowest, closely followed by ARIMA(1,1,0). (Strictly, AICc values are only comparable between models with the same order of differencing, so the d = 2 model should be judged against the d = 1 models by other means, such as residual diagnostics.)

d) choose what you think is the best model and check the residual diagnostics;

usa_fit %>%
  select(arima120) %>%
  gg_tsresiduals() +
  ggtitle("Residuals for ARIMA(1,2,0)")

Since ARIMA(1,2,0) has the lowest AICc value, that model was chosen. Its residuals look like white noise.

e) produce forecasts of your fitted model. Do the forecasts look reasonable?

usa_fit %>%
  forecast(h=5) %>%
  filter(.model=='arima120') %>%
  autoplot(us_gdp)

The forecasts seem reasonable as the trend is continuing at a similar slope.

f) compare the results with what you would obtain using ETS() (with no transformation).

fit_ets <- us_gdp %>%
  model(ETS(GDP))

report(fit_ets)
## Series: GDP 
## Model: ETS(M,A,N) 
##   Smoothing parameters:
##     alpha = 0.9998985 
##     beta  = 0.3731217 
## 
##   Initial states:
##      l[0]     b[0]
##  7045.398 198.9275
## 
##   sigma^2:  0
## 
##      AIC     AICc      BIC 
## 744.1762 745.3300 754.4784
fit_ets %>%
  forecast(h=5) %>%
  autoplot(us_gdp)
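One caveat: us_gdp was overwritten with the Box-Cox-transformed GDP back in part a), so the ETS model above is actually fitted to the transformed series; its AIC is also not comparable to the ARIMA models' in any case, since information criteria cannot be compared across model classes. For a genuine no-transformation comparison, refit on the original data; a sketch:

# refit ETS on the untransformed GDP series for a fair "no transformation" comparison
fit_ets_raw <- global_economy %>%
  filter(Code == "USA") %>%
  model(ETS(GDP))

fit_ets_raw %>%
  forecast(h = 5) %>%
  autoplot(global_economy %>% filter(Code == "USA"))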