library(fpp3)
library(ggplot2)
library(ggfortify)

9.1

Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

Explain the differences among these figures. Do they all indicate that the data are white noise?

All of the figures have spikes that seem to be touching the interval lines. Collectively however, they do not make up 5% of the values, thus all three figures contain white noise data.

Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

The critical values are at different distances due to the random manner of choosing the numbers.

9.2

A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

amzn <- gafa_stock %>% filter(Symbol == 'AMZN')
amzn %>%
  gg_tsdisplay(Close, plot_type='partial')

The time series shows an increasing trend that makes the series non-stationary. The ACF plot has a large positive r1 and is slowly decreasing. The PACF plot shows a daily cyclic pattern with the highest spike being at 1.

9.3

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

Turkish GDP from global_economy.

# Transformation lambda
turk <- global_economy %>%
  filter(Country == "Turkey") %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

# Display
global_economy %>%
  filter(Country == "Turkey") %>%
  gg_tsdisplay(GDP, plot_type='partial')

# Num Differentiation
global_economy %>%
  filter(Country == "Turkey") %>%
  features(box_cox(GDP,turk), unitroot_ndiffs)
## # A tibble: 1 x 2
##   Country ndiffs
##   <fct>    <int>
## 1 Turkey       1
# Differentiate and display
global_economy %>%
  filter(Country == "Turkey") %>%
  gg_tsdisplay(difference(box_cox(GDP,turk)), plot_type='partial')

The Turkish GDP series is non-stationary and non-seasonal. A Box-Cox transformation with the guerrero feature was applied to the Turkish GDP series. The KPSS test indicated that one difference was required to obtain stationary data.

Accommodation takings in the state of Tasmania from aus_accommodation.

# Transformation lambda
tas <-aus_accommodation %>%
  filter(State == "Tasmania") %>%
  features(Takings, features = guerrero) %>%
  pull(lambda_guerrero)

# Display
aus_accommodation %>%
  filter(State == "Tasmania") %>%
  gg_tsdisplay(Takings, plot_type='partial')

# Num Differentiation
aus_accommodation %>%
  filter(State == "Tasmania") %>%
  features(box_cox(Takings,tas), unitroot_nsdiffs) 
## # A tibble: 1 x 2
##   State    nsdiffs
##   <chr>      <int>
## 1 Tasmania       1
# Differentiate and display
aus_accommodation %>%
  filter(State == "Tasmania") %>%
  gg_tsdisplay(difference(box_cox(Takings,tas), 4), plot_type='partial')

The Tasmania Accommodation series is non-stationary and seasonal. A Box-Cox transformation with the guerrero feature was applied to the Tasmania Accommodation series. The KPSS test indicated that one seasonal difference was required to obtain stationary data.

Monthly sales from souvenirs.

# Transformation lambda
souv <- souvenirs %>%
  features(Sales, features = guerrero) %>%
  pull(lambda_guerrero)

# Display
souvenirs %>%
  gg_tsdisplay(Sales, plot_type='partial') +
  labs(title = "Non-transformed Monthly Souvenir Sales")

# Num Differentiation
souvenirs %>%
  features(box_cox(Sales,souv), unitroot_nsdiffs) 
## # A tibble: 1 x 1
##   nsdiffs
##     <int>
## 1       1
# Differentiate and display
souvenirs %>%
  gg_tsdisplay(difference(box_cox(Sales,souv), 12), plot_type='partial')

The Monthly Sales series is non-stationary and seasonal. A Box-Cox transformation with the guerrero feature was applied to the Monthly Sales series. The KPSS test indicated that one seasonal difference was required to obtain stationary data.

9.5

For your retail data (from Exercise 8 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

set.seed(1234)
myseries <- aus_retail %>%
  filter(`Series ID` == 'A3349791W')

s_lam <- myseries %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

myseries %>%
  gg_tsdisplay(Turnover, plot_type='partial')

myseries %>%
  features(box_cox(Turnover, s_lam), unitroot_nsdiffs) 
## # A tibble: 1 x 3
##   State           Industry                           nsdiffs
##   <chr>           <chr>                                <int>
## 1 New South Wales Other recreational goods retailing       1
myseries %>%
  gg_tsdisplay(difference(box_cox(Turnover,s_lam),12), plot_type='partial')

The ‘A3349791W’ series from aus_retail is non-stationary and seasonal. The KPSS test performed on the Box-Cox transformed (guerrero feature) series indicated one seasonal difference was required to obtain stationary data.

9.6

Simulate and plot some data from simple ARIMA models.

Use the following R code to generate data from an AR(1) model with
ϕ1=0.6 and σ2=1. The process starts with y1=0.

y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
ar1 <- function(phi)
{
  set.seed(123)
  y <- ts(numeric(100))
  e <- rnorm(100)
  
  for(i in 2:100)
    y[i] <- phi*y[i-1] + e[i]
  return (y)
}

Produce a time plot for the series. How does the plot change as you change ϕ1?

ar1(1) %>% autoplot()

ar1(0.6) %>% autoplot()

ar1(0) %>% autoplot()

ar1(-1) %>% autoplot()

As ϕ1 decreases we see more oscillation and more spikes in the data.

Write your own code to generate data from an MA(1) model with θ1=0.6 and σ2=1.

ma1 <- function(theta) 
{
  set.seed(456)
  y <- ts(numeric(100))
  e <- rnorm(100)
  
  for(i in 2:100)
    y[i] <- theta*y[i-1] + e[i]
  return (y)
}

Produce a time plot for the series. How does the plot change as you change θ1?

ma1(1) %>% autoplot()

ma1(0.6) %>% autoplot()

ma1(0) %>% autoplot()

ma1(-1) %>% autoplot()

As θ1 decreases we see more oscillation and the spkies increasing.

Generate data from an ARMA(1,1) model with ϕ1=0.6, θ1=0.6 and σ2=1.

arma <- function(phi, theta)
{
  set.seed(111)
  y <- ts(numeric(100))
  e <- rnorm(100)
  
  for(i in 2:100)
    y[i] <- phi*y[i-1] + theta*e[i-1] + e[i]
  return (y)
}

Generate data from an AR(2) model with ϕ1=−0.8, ϕ2=0.3 and σ2=1. (Note that these parameters will give a non-stationary series.)

ar2 <- function(phi1, phi2)
{
  set.seed(222)
  y <- ts(numeric(100))
  e <- rnorm(100)
  
  for(i in 3:100)
    y[i] <- phi1*y[i-1] + phi2*y[i-2] + e[i]
  return (y)
}

Graph the latter two series and compare them.

arma(0.6,0.6) %>% autoplot()

ar2(-0.8,0.2) %>% autoplot()

The ARMA(1,1) model looks to be stationary. The AR(2) model however is non-stationary with osillation and increasing variance.

9.7

Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

Use ARIMA() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

aus_air <- aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers)) 

report(aus_air)
## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8756
## s.e.   0.0722
## 
## sigma^2 estimated as 4.671:  log likelihood=-87.8
## AIC=179.61   AICc=179.93   BIC=182.99
aus_air %>% 
  forecast(h=10) %>%
  autoplot(aus_airpassengers)

aus_air %>% 
  gg_tsresiduals()

The ARMA(0,2,1) model was used. The acf plot shows the residuals look like white noise.

Write the model in terms of the backshift operator.

\(y_{t} = -0.88\varepsilon_{t-1}+\varepsilon{t}\)

\((1-B)^2y_{t} = (1 - 0.88B)\varepsilon_{t}\)

Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.

aus_air2 <-aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ pdq(0,1,0)))

aus_air2 %>% 
  forecast(h=10) %>%
  autoplot(aus_airpassengers)

aus_air2 %>% 
  gg_tsresiduals()

The model in part a predicted higher values, while this model predicted lower values.

Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.

aus_air3 <-aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ pdq(2,1,2)))

aus_air3 %>% 
  forecast(h=10) %>%
  autoplot(aus_airpassengers)

aus_air3 %>% 
  gg_tsresiduals()

The model in part a predicted higher values, and the model in part b predicted lower values. This model however predicted values closer to the actual values.

aus_air4 <-aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ 0 + pdq(2,1,2)))

report(aus_air4)
## Series: Passengers 
## Model: NULL model 
## NULL model

Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?

aus_air5 <-aus_airpassengers %>%
  filter(Year < 2012) %>%
  model(ARIMA(Passengers ~ pdq(0,2,1)))

aus_air5 %>% 
  forecast(h=10) %>%
  autoplot(aus_airpassengers)

aus_air5 %>% 
  gg_tsresiduals()

The slope of becomes steeper.

9.8

For the United States GDP series (from global_economy):

if necessary, find a suitable Box-Cox transformation for the data;

global_economy %>%
  filter(Code == "USA") %>%
  gg_tsdisplay(GDP, plot_type='partial')

us_lam <-global_economy %>%
  filter(Code == "USA") %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

alam <- myseries %>% features(Turnover,features=guerrero) %>% pull(lambda_guerrero)

The US GDP series is non-stationary, thus a Box-Cox transformation with the guerrero feature was applied to make the data more stationary.

fit a suitable ARIMA model to the transformed data using ARIMA();

us_gdp <- global_economy %>%
  filter(Code == "USA") %>%
  model(ARIMA(box_cox(GDP, us_lam))) 

report(us_gdp)
## Series: GDP 
## Model: ARIMA(1,1,0) w/ drift 
## Transformation: box_cox(GDP, us_lam) 
## 
## Coefficients:
##          ar1  constant
##       0.4586  118.1822
## s.e.  0.1198    9.5047
## 
## sigma^2 estimated as 5479:  log likelihood=-325.32
## AIC=656.65   AICc=657.1   BIC=662.78

The ARIMA(1,1,0) model with drift was used to fit the data.

try some other plausible models by experimenting with the orders chosen;

global_economy %>%
  filter(Code == "USA") %>%
  gg_tsdisplay(box_cox(GDP,us_lam), plot_type='partial')

global_economy %>%
  filter(Code == "USA") %>%
  features(box_cox(GDP,us_lam), unitroot_ndiffs) 
## # A tibble: 1 x 2
##   Country       ndiffs
##   <fct>          <int>
## 1 United States      1

The KPSS test indicates that only one difference is needed to make the data stationary.

other <- global_economy %>%
  filter(Code == "USA") %>%
  model(arima_110 = ARIMA(box_cox(GDP,us_lam) ~ pdq(1,1,0)),
        arima_111 = ARIMA(box_cox(GDP,us_lam) ~ pdq(1,1,1)),
        arima_120 = ARIMA(box_cox(GDP,us_lam) ~ pdq(1,2,0)),
        arima_210 = ARIMA(box_cox(GDP,us_lam) ~ pdq(2,1,0)),
        arima_212 = ARIMA(box_cox(GDP,us_lam) ~ pdq(2,1,2)))

glance(other) %>% arrange(AICc) %>% select(.model:BIC)
## # A tibble: 5 x 6
##   .model    sigma2 log_lik   AIC  AICc   BIC
##   <chr>      <dbl>   <dbl> <dbl> <dbl> <dbl>
## 1 arima_120  6780.   -326.  656.  656.  660.
## 2 arima_110  5479.   -325.  657.  657.  663.
## 3 arima_111  5580.   -325.  659.  659.  667.
## 4 arima_210  5580.   -325.  659.  659.  667.
## 5 arima_212  5734.   -325.  662.  664.  674.

choose what you think is the best model and check the residual diagnostics;

The ARIMA(1,2,0) model is chosen because it has the best AICc value.

best <- other %>% select(arima_120) 
best %>%  gg_tsresiduals() 

produce forecasts of your fitted model. Do the forecasts look reasonable?

other %>% forecast(h=10) %>%
  filter(.model=='arima_120') %>%
  autoplot(global_economy)

The forecasts produced by the ARIMA(1,2,0) model seem to be reasonable and follow the series trend.

compare the results with what you would obtain using ETS() (with no transformation).

ets <- global_economy %>%
  filter(Code == "USA") %>%
  model(ETS(GDP))

report(ets)
## Series: GDP 
## Model: ETS(M,A,N) 
##   Smoothing parameters:
##     alpha = 0.9990876 
##     beta  = 0.5011949 
## 
##   Initial states:
##          l[0]        b[0]
##  448093333334 64917355687
## 
##   sigma^2:  7e-04
## 
##      AIC     AICc      BIC 
## 3190.787 3191.941 3201.089

ETS has a significantly larger AICc value than the ARIMA(1,2,0) model, thus suggest it performs worse.