DATA624: Homework 6

library(fpp3)
library(ggpubr)

Task

Do the exercises 9.1, 9.2, 9.3, 9.5, 9.6, 9.7, 9.8 in Hyndman. Please submit both the Rpubs link as well as your .rmd file.

Exercises

9.1

Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

  1. Explain the differences among these figures. Do they all indicate that the data are white noise?

  2. Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

Part A

The difference among these figures is the sample size. The critical-value bounds move closer to zero as the sample size increases, and the individual spikes also shrink. All three figures indicate white noise, since the autocorrelations remain within (or very close to) the bounds of each ACF plot.
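
A minimal sketch reproducing ACF plots of this kind (the random draws, and therefore the individual spikes, will differ from those in Figure 9.32):

set.seed(42)
wn36   <- tsibble(idx = 1:36,   y = rnorm(36),   index = idx)
wn360  <- tsibble(idx = 1:360,  y = rnorm(360),  index = idx)
wn1000 <- tsibble(idx = 1:1000, y = rnorm(1000), index = idx)

ggarrange(autoplot(ACF(wn36, y)) + ggtitle("T = 36"),
          autoplot(ACF(wn360, y)) + ggtitle("T = 360"),
          autoplot(ACF(wn1000, y)) + ggtitle("T = 1000"),
          nrow = 1, ncol = 3)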

Part B

The critical values are \(\pm1.96 / \sqrt{T}\), where \(T\) is the length of the time series. Since \(T\) is the sample size, larger samples pull the critical values closer to zero. The autocorrelations differ across the figures because each one is estimated from a different random sample; for white noise the sample autocorrelations are themselves random, and they shrink toward zero as \(T\) grows.
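
The bounds can be computed directly from this formula; a quick sketch for the three sample sizes:

tibble(T = c(36, 360, 1000),
       bound = 1.96 / sqrt(T))
# bounds of roughly +/- 0.33, 0.10, and 0.06 respectively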

9.2

A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

gafa_stock <- gafa_stock

q2 <- gafa_stock %>%
  filter(Symbol == "AMZN")

gg_tsdisplay(q2, y = Close, plot_type = "partial") +
  ggtitle("Amazon Closing Price, ACF, and PACF")
## Warning: Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.
## Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.

print("Unit Root Test Results:")
## [1] "Unit Root Test Results:"
q2 %>%
  features(Close, unitroot_kpss)
## # A tibble: 1 x 3
##   Symbol kpss_stat kpss_pvalue
##   <chr>      <dbl>       <dbl>
## 1 AMZN        14.0        0.01

Amazon Closing price data is non-stationary and should be differenced because:

  • The daily closing price plot shows a clear upward trend.
  • The ACF decays very slowly, decreasing only slightly from one lag to the next; for a stationary series the ACF would drop to near zero quickly.
  • The PACF has a first lag very close to 1, another indication of non-stationarity.
  • The KPSS unit root test rejects stationarity (the reported p-value of 0.01 is less than 0.05), suggesting that differencing is required (a quick check of the differenced series is sketched just below).
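
A quick verification (not part of the original output): difference the closing price once and re-run the KPSS test; a p-value above 0.05 would indicate the differenced series is stationary.

q2 %>%
  features(difference(Close), unitroot_kpss)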

9.3

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

  1. Turkish GDP from global_economy.

  2. Accommodation takings in the state of Tasmania from aus_accommodation.

  3. Monthly sales from souvenirs.

Part A

The appropriate Box-Cox transformation for Turkish GDP uses \(\lambda = 0.1572\). Only one order of differencing of the transformed data is needed to obtain stationary data.

global_economy <- global_economy

q3a <- global_economy %>%
  filter(Country == "Turkey")
lambda <- q3a %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

paste0("Lambda value: ", round(lambda, 4))
## [1] "Lambda value: 0.1572"
q3a %>%
  features(box_cox(GDP, lambda), unitroot_ndiffs)
## # A tibble: 1 x 2
##   Country ndiffs
##   <fct>    <int>
## 1 Turkey       1
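
As a visual check (not shown in the original), the transformed and differenced series can be inspected; a minimal sketch:

q3a %>%
  gg_tsdisplay(difference(box_cox(GDP, lambda)), plot_type = "partial") +
  ggtitle("Turkish GDP: Box-Cox Transformed and Differenced")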

Part B

The appropriate Box-Cox transformation for accommodation takings in Tasmania uses \(\lambda = 0.0018\), essentially a log transformation since \(\lambda \approx 0\). Only one order of differencing of the transformed data is needed to obtain stationary data.

aus_accommodation <- aus_accommodation

q3b <- aus_accommodation %>%
  filter(State == "Tasmania")
lambda <- q3b %>%
  features(Takings, features = guerrero) %>%
  pull(lambda_guerrero)

paste0("Lambda value: ", round(lambda, 4))
## [1] "Lambda value: 0.0018"
q3b %>%
  features(box_cox(Takings, lambda), unitroot_ndiffs)
## # A tibble: 1 x 2
##   State    ndiffs
##   <chr>     <int>
## 1 Tasmania      1

Part C

The appropriate Box-Cox transformation for the monthly souvenir sales uses \(\lambda = 0.0021\), again essentially a log transformation. Only one order of differencing of the transformed data is needed to obtain stationary data.

souvenirs <- souvenirs

q3c <- souvenirs
lambda <- q3c %>%
  features(Sales, features = guerrero) %>%
  pull(lambda_guerrero)

paste0("Lambda value: ", round(lambda, 4))
## [1] "Lambda value: 0.0021"
q3c %>%
  features(box_cox(Sales, lambda), unitroot_ndiffs)
## # A tibble: 1 x 1
##   ndiffs
##    <int>
## 1      1

9.5

For your retail data (from Exercise 8 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

The retail data need to be differenced more than once to achieve stationarity: one seasonal difference followed by one first (standard) difference. After differencing twice, the KPSS test statistic is small (0.02) and the p-value is greater than 0.05, so we fail to reject the null hypothesis that the differenced data are stationary. It should be noted that a fair amount of autocorrelation and partial autocorrelation remains.

set.seed(15)

myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

myseries %>%
  gg_tsdisplay(Turnover, plot_type = "partial") +
  ggtitle("Original Data")

myseries %>%
  features(Turnover, unitroot_ndiffs)
## # A tibble: 1 x 3
##   State           Industry                                                ndiffs
##   <chr>           <chr>                                                    <int>
## 1 New South Wales Furniture, floor coverings, houseware and textile good~      1
myseries %>%
  features(Turnover, unitroot_nsdiffs)
## # A tibble: 1 x 3
##   State           Industry                                               nsdiffs
##   <chr>           <chr>                                                    <int>
## 1 New South Wales Furniture, floor coverings, houseware and textile goo~       1
myseries %>%
  features(difference(difference(Turnover, lag = 12)), unitroot_kpss)
## # A tibble: 1 x 4
##   State           Industry                                 kpss_stat kpss_pvalue
##   <chr>           <chr>                                        <dbl>       <dbl>
## 1 New South Wales Furniture, floor coverings, houseware a~    0.0214         0.1
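
Note: the plot below applies box_cox() with the lambda value carried over from exercise 9.3; a value computed for this retail series itself is more appropriate. A minimal sketch:

# Guerrero lambda computed for this retail series (used by box_cox() below)
lambda <- myseries %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)
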
myseries %>%
  gg_tsdisplay(difference(difference(box_cox(Turnover, lambda), lag = 12)), plot_type='partial') +
  ggtitle("Differenced Data")
## Warning: Removed 13 row(s) containing missing values (geom_path).
## Warning: Removed 13 rows containing missing values (geom_point).

9.6

Simulate and plot some data from simple ARIMA models.

  1. Use the following R code to generate data from an AR(1) model with \(\phi_1 = 0.6\) and \(\sigma^2 = 1\). The process starts with \(y_1 = 0\).

y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)

  2. Produce a time plot for the series. How does the plot change as you change \(\phi_1\)?

  3. Write your own code to generate data from an MA(1) model with \(\theta_1 = 0.6\) and \(\sigma^2 = 1\).

  4. Produce a time plot for the series. How does the plot change as you change \(\theta_1\)?

  5. Generate data from an ARMA(1,1) model with \(\phi_1 = 0.6\), \(\theta_1 = 0.6\), and \(\sigma^2 = 1\).

  6. Generate data from an AR(2) model with \(\phi_1 = -0.8\), \(\phi_2 = 0.3\), and \(\sigma^2 = 1\). (Note that these parameters will give a non-stationary series.)

  7. Graph the latter two series and compare them.

Part A

set.seed(15)

y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)

head(sim)
## # A tsibble: 6 x 2 [1]
##     idx      y
##   <int>  <dbl>
## 1     1  0    
## 2     2  1.83 
## 3     3  0.759
## 4     4  1.35 
## 5     5  1.30 
## 6     6 -0.476

Part B

The plot changes significantly as \(\phi_1\) is adjusted. Small values of \(\phi_1\) produce a series that changes direction frequently and looks close to white noise, while values near 1 produce smoother, random-walk-like wandering. Moderate to small \(\phi_1\) values keep the series centered around 0, whereas values with \(|\phi_1| > 1\) do not: the series becomes non-stationary and grows roughly exponentially. Negative values of \(\phi_1\) make the series oscillate around zero much more rapidly than positive values.

for(i in 2:100){
  y[i] <- 0.01*y[i-1] + e[i]
}
small_phi <- tsibble(idx = seq_len(100), y = y, index = idx)

for(i in 2:100){
  y[i] <- 0.99*y[i-1] + e[i]
}
large_phi <- tsibble(idx = seq_len(100), y = y, index = idx)

for(i in 2:100){
  y[i] <- 1.2*y[i-1] + e[i]
}
positive_phi <- tsibble(idx = seq_len(100), y = y, index = idx)

for(i in 2:100){
  y[i] <- -1.2*y[i-1] + e[i]
}
negative_phi <- tsibble(idx = seq_len(100), y = y, index = idx)

for(i in 2:100){
  y[i] <- -0.6*y[i-1] + e[i]
}
negative_phi2 <- tsibble(idx = seq_len(100), y = y, index = idx)
p6 <- sim %>%
  autoplot(y) +
  ggtitle("Phi = 0.6")

p1 <- small_phi %>%
  autoplot(y) +
  ggtitle("Phi = 0.01")

p2 <- large_phi %>%
  autoplot(y) +
  ggtitle("Phi = 0.99")

p3 <- positive_phi %>%
  autoplot(y) +
  ggtitle("Phi = 1.2")

p4 <- negative_phi %>%
  autoplot(y) +
  ggtitle("Phi = -1.2")

p5 <- negative_phi2 %>%
  autoplot(y) +
  ggtitle("Phi = -0.6")

ggarrange(p1, p2,
          p3, p4,
          p6, p5,
          nrow = 3,
          ncol = 2)

Part C

theta <- 0.6

for(i in 2:100){
  y[i] <- e[i] + theta * e[i-1]
}

MA <- tsibble(idx = seq_len(100), y = y, index = idx)

head(MA)
## # A tsibble: 6 x 2 [1]
##     idx      y
##   <int>  <dbl>
## 1     1  0    
## 2     2  1.99 
## 3     3  0.759
## 4     4  0.693
## 5     5  1.03 
## 6     6 -0.963

Part D

Changing \(\theta_1\) does not drastically change the shape of the series, which remains stationary for any value of \(\theta_1\). Negative values of \(\theta_1\) make consecutive observations negatively correlated, producing a more jagged, spiky series; values near zero look like white noise; large positive values of \(\theta_1\) give a smoother outcome.

MA %>%
  autoplot(y) +
  ggtitle("MA(1) model: Theta = 0.6")

for(i in 2:100){
  y[i] <- e[i] + 0.01 * e[i-1]
}
small_theta <- tsibble(idx = seq_len(100), y = y, index = idx)

for(i in 2:100){
  y[i] <- e[i] + 0.99 * e[i-1]
}
large_theta <- tsibble(idx = seq_len(100), y = y, index = idx)

for(i in 2:100){
  y[i] <- e[i] + 1.5 * e[i-1]
}
positive_theta <- tsibble(idx = seq_len(100), y = y, index = idx)

for(i in 2:100){
  y[i] <- e[i] + -1.5 * e[i-1]
}
negative_theta <- tsibble(idx = seq_len(100), y = y, index = idx)

for(i in 2:100){
  y[i] <- e[i] + -0.6 * e[i-1]
}
negative_theta2 <- tsibble(idx = seq_len(100), y = y, index = idx)
p6 <- MA %>%
  autoplot(y) +
  ggtitle("Theta = 0.6")

p1 <- small_theta %>%
  autoplot(y) +
  ggtitle("Theta = 0.01")

p2 <- large_theta %>%
  autoplot(y) +
  ggtitle("Theta = 0.99")

p3 <- positive_theta %>%
  autoplot(y) +
  ggtitle("Theta = 1.5")

p4 <- negative_theta %>%
  autoplot(y) +
  ggtitle("Theta = -1.5")

p5 <- negative_theta2 %>%
  autoplot(y) +
  ggtitle("Theta = -0.6")

ggarrange(p1, p2,
          p3, p4,
          p6, p5,
          nrow = 3,
          ncol = 2)

Part E

for(i in 2:100){
  y[i] <- 0.6*y[i-1] + 0.6*e[i-1] + e[i]
}

arma_E <- tsibble(idx = seq_len(100), y = y, index = idx)

head(arma_E)
## # A tsibble: 6 x 2 [1]
##     idx     y
##   <int> <dbl>
## 1     1 0    
## 2     2 1.99 
## 3     3 1.95 
## 4     4 1.86 
## 5     5 2.14 
## 6     6 0.324

Part F

for(i in 3:100){
  y[i] <- -0.8*y[i-1] + 0.3*y[i-2] + e[i]
}

ar_F <- tsibble(idx = seq_len(100), y = y, index = idx)

head(ar_F)
## # A tsibble: 6 x 2 [1]
##     idx     y
##   <int> <dbl>
## 1     1  0   
## 2     2  1.99
## 3     3 -1.93
## 4     4  3.04
## 5     5 -2.52
## 6     6  1.67

Part G

The ARMA(1, 1) series appears to be stationary: it varies around 0 with roughly constant fluctuations, and both its ACF and PACF decay with each lag until they fall within the bounds. The AR(2) series oscillates around 0 but grows progressively larger as the series progresses, so it is clearly non-stationary. Its ACF alternates between positive and negative values and decreases only slowly, while its PACF drops rapidly after the first lag. The non-stationarity can also be confirmed directly from the AR(2) coefficients, as sketched after the plots below.

arma_E %>%
  gg_tsdisplay(y, plot_type = "partial") +
  ggtitle("ARMA(1, 1) Model")

ar_F %>%
  gg_tsdisplay(y, plot_type = "partial") +
  ggtitle("AR(2) Model")

9.7

Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

  1. Use ARIMA() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

  2. Write the model in terms of the backshift operator.

  3. Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.

  4. Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.

  5. Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?

Part A

The selected ARIMA model is ARIMA(0, 2, 1), and the residuals do appear to be white noise (a Ljung-Box check is sketched after the residual plot).

fit <- aus_airpassengers %>%
  model(
    ARIMA(Passengers)
  )

report(fit)
## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8963
## s.e.   0.0594
## 
## sigma^2 estimated as 4.308:  log likelihood=-97.02
## AIC=198.04   AICc=198.32   BIC=201.65
fit %>%
  gg_tsresiduals() +
  ggtitle("ARIMA(0, 2, 1): Residuals")

fit %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers) +
  ggtitle("Australian Airpassengers Forecast")

Part B

Backshift operator form:

\[ (1 - B)^2 y_t = (1 - 0.8963B)\,\epsilon_t \]
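
Expanding the backshift polynomial gives the equivalent difference-equation form:

\[ y_t = 2y_{t-1} - y_{t-2} + \epsilon_t - 0.8963\,\epsilon_{t-1} \]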

Part C

Part A’s forecast is consistently higher than Part C’s because it has a steeper slope: the ARIMA(0, 2, 1) model extrapolates the recent, steeper trend, while the ARIMA(0, 1, 0) with drift grows at the average historical rate. Part C’s residuals also appear to be white noise.

fit2 <- aus_airpassengers %>%
  model(
    ARIMA(Passengers ~ pdq(0, 1, 0))
  )

fit_vis <- aus_airpassengers %>%
  model(
    PartA = ARIMA(Passengers),
    PartC = ARIMA(Passengers ~ pdq(0, 1, 0))
  )
fit2 %>%
  gg_tsresiduals() +
  ggtitle("ARIMA(0, 1, 0): Residuals")

fit_vis %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers) +
  ggtitle("Australian Airpassengers Forecast")

Part D

The ARIMA(2, 1, 2) model with a constant produces forecasts very similar to the ARIMA(0, 1, 0) model but is a slightly worse fit overall (higher AICc despite the extra parameters). The best-fitting model is still the Part A model, ARIMA(0, 2, 1). Removing the constant produces an error ("non-stationary AR part from CSS") and no model is fitted.

fit3 <- aus_airpassengers %>%
  model(
    ARIMA(Passengers ~ 1 + pdq(2, 1, 2))
  )

fit4 <- aus_airpassengers %>%
  model(
    ARIMA(Passengers ~ 0 + pdq(2, 1, 2))
  )
## Warning: 1 error encountered for ARIMA(Passengers ~ 0 + pdq(2, 1, 2))
## [1] non-stationary AR part from CSS
fit_vis <- aus_airpassengers %>%
  model(
    PartA = ARIMA(Passengers),
    PartC = ARIMA(Passengers ~ pdq(0, 1, 0)),
    PartD_Constant = ARIMA(Passengers ~ 1 + pdq(2, 1, 2)),
    PartD_NoConstant = ARIMA(Passengers ~ 0 + pdq(2, 1, 2))
  )
## Warning: 1 error encountered for PartD_NoConstant
## [1] non-stationary AR part from CSS
fit3 %>%
  gg_tsresiduals() +
  ggtitle("ARIMA(2, 1, 2) With Constant: Residuals")

fit4 %>%
  gg_tsresiduals() +
  ggtitle("ARIMA(2, 1, 2) Without Constant: Residuals")
## Error in na.contiguous.default(as.ts(x)): all times contain an NA

fit_vis %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers) +
  ggtitle("Australian Airpassengers Forecast")
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning: Removed 10 row(s) containing missing values (geom_path).

report(fit_vis)
## Warning in report.mdl_df(fit_vis): Model reporting is only supported for
## individual models, so a glance will be shown. To see the report for a specific
## model, use `select()` and `filter()` to identify a single model.
## # A tibble: 3 x 8
##   .model         sigma2 log_lik   AIC  AICc   BIC ar_roots  ma_roots 
##   <chr>           <dbl>   <dbl> <dbl> <dbl> <dbl> <list>    <list>   
## 1 PartA            4.31   -97.0  198.  198.  202. <cpl [0]> <cpl [1]>
## 2 PartC            4.27   -98.2  200.  201.  204. <cpl [0]> <cpl [0]>
## 3 PartD_Constant   4.03   -96.2  204.  207.  215. <cpl [2]> <cpl [2]>

Part E

The ARIMA(0, 2, 1) model with a constant produces forecasts that curve upward rather than following a straight line, exactly as the warning below cautions.
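
In backshift notation (with \(c\) the constant and \(\theta_1\) the MA coefficient), the specified model is

\[ (1 - B)^2 y_t = c + (1 + \theta_1 B)\,\epsilon_t \]

With \(d = 2\) and \(c \neq 0\), the long-term forecast function is a polynomial of order two in the forecast horizon, i.e. a quadratic trend.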

fit <- aus_airpassengers %>%
  model(
    ARIMA(Passengers ~ 1 + pdq(0, 2, 1))
  )
## Warning: Model specification induces a quadratic or higher order polynomial trend. 
## This is generally discouraged, consider removing the constant or reducing the number of differences.
fit %>%
  gg_tsresiduals() +
  ggtitle("ARIMA(0, 2, 1) with a Constant: Residuals")

fit %>%
  forecast(h = 10) %>%
  autoplot(aus_airpassengers) +
  ggtitle("Australian Airpassengers Forecast")

9.8

For the United States GDP series (from global_economy):

  1. if necessary, find a suitable Box-Cox transformation for the data;

  2. fit a suitable ARIMA model to the transformed data using ARIMA();

  3. try some other plausible models by experimenting with the orders chosen;

  4. choose what you think is the best model and check the residual diagnostics;

  5. produce forecasts of your fitted model. Do the forecasts look reasonable?

  6. compare the results with what you would obtain using ETS() (with no transformation).

Part A

There are no large changes in the variance of this time series, so a Box-Cox transformation is not necessary (a Guerrero check is sketched after the plot).

us.data <- global_economy %>%
  filter(Code == "USA")

us.data %>%
  gg_tsdisplay(GDP, plot_type = 'partial')
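
For completeness, the Guerrero method can be consulted as a cross-check (not done in the original); a minimal sketch:

# Guerrero lambda for US GDP; a value near 1 would confirm that
# little transformation is needed
us.data %>%
  features(GDP, features = guerrero)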

Part B

A suitable ARIMA model for this series, as selected automatically by ARIMA(), is ARIMA(0, 2, 2).

fit <- us.data %>%
  model(
    ARIMA(GDP)
  )

report(fit)
## Series: GDP 
## Model: ARIMA(0,2,2) 
## 
## Coefficients:
##           ma1      ma2
##       -0.4206  -0.3048
## s.e.   0.1197   0.1078
## 
## sigma^2 estimated as 2.615e+22:  log likelihood=-1524.08
## AIC=3054.15   AICc=3054.61   BIC=3060.23
fit %>%
  forecast(h = 10) %>%
  autoplot(us.data) +
  ggtitle("US GDP Forecast")

Part C

The best model is still the automatically selected model from Part B, ARIMA(0, 2, 2), which has the lowest AIC, AICc, and BIC.

fit <- us.data %>%
  model(
   PartB = ARIMA(GDP),
   PartC1 = ARIMA(GDP ~ pdq(0, 2, 1)),
   PartC2 = ARIMA(GDP ~ pdq(1, 2, 0))
  )

glance(fit)
## # A tibble: 3 x 9
##   Country       .model  sigma2 log_lik   AIC  AICc   BIC ar_roots  ma_roots 
##   <fct>         <chr>    <dbl>   <dbl> <dbl> <dbl> <dbl> <list>    <list>   
## 1 United States PartB  2.61e22  -1524. 3054. 3055. 3060. <cpl [0]> <cpl [2]>
## 2 United States PartC1 2.92e22  -1528. 3059. 3059. 3063. <cpl [0]> <cpl [1]>
## 3 United States PartC2 3.06e22  -1529. 3061. 3062. 3065. <cpl [1]> <cpl [0]>
fit %>%
  forecast(h = 10) %>%
  autoplot(us.data) +
  ggtitle("US GDP Forecast")

Part D

As mentioned above, the best model was the one from Part B, ARIMA(0, 2, 2). The residuals contain an outlier or two but are otherwise distributed around zero, and the ACF plot confirms that they resemble white noise.

fit <- us.data %>%
  model(
    ARIMA(GDP)
  )

fit %>%
  gg_tsresiduals() +
  ggtitle("ARIMA(0, 2, 2) Residual Diagnostic Plots")

Part E

The forecasts look reasonable.

fit %>%
  forecast(h = 10) %>%
  autoplot(us.data) +
  ggtitle("US GDP Forecast")

Part F

The ARIMA model reports better fit metrics than the ETS model (although information criteria are not strictly comparable across the two model classes), so the preferred model for this series is ARIMA(0, 2, 2). Its forecast is slightly less steep than the ETS forecast, and the ETS model has a much wider prediction interval than the ARIMA model. A train/test accuracy comparison is sketched after the forecast plot.

fit_F <- us.data %>%
  model(
    ARIMA = ARIMA(GDP),
    ETS = ETS(GDP)
        )

glance(fit_F)
## # A tibble: 2 x 12
##   Country   .model   sigma2 log_lik   AIC  AICc   BIC ar_roots ma_roots      MSE
##   <fct>     <chr>     <dbl>   <dbl> <dbl> <dbl> <dbl> <list>   <list>      <dbl>
## 1 United S~ ARIMA  2.61e+22  -1524. 3054. 3055. 3060. <cpl [0~ <cpl [2~ NA      
## 2 United S~ ETS    6.78e- 4  -1590. 3191. 3192. 3201. <NULL>   <NULL>    2.79e22
## # ... with 2 more variables: AMSE <dbl>, MAE <dbl>
fit_F %>%
  forecast(h = 10) %>%
  autoplot(us.data) +
  ggtitle("US GDP Forecast")