Homework 6

9.1) Explain the differences among these figures. Do they all indicate that the data are white noise?

The figures differ in the width of the bounded area, the lags at which the larger spikes occur, and the size of each spike. The bounded area narrows as the number of random numbers in the series increases, because the bounds are approximately ±1.96/√T for a series of length T. All of the figures indicate that the data are white noise, since the spikes remain within the bounded area (blue lines).

Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

The critical values are at different distances from zero because they depend on the length of the series: they are approximately ±1.96/√T, so they move closer to zero as the series length T increases. The autocorrelations differ between figures because each one is a sample estimate computed from a different set of random numbers. For white noise the true autocorrelations are all zero, and the sample values scatter around zero with a spread that shrinks as the series gets longer.
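
The effect of series length on the bounds can be checked with a short base R sketch. The three series lengths are assumed for illustration and are not stated in the answer above.

```r
# Series lengths are assumed for illustration only.
series_lengths <- c(36, 360, 1000)

# Approximate 95% bounds for the sample ACF of white noise: +/- 1.96 / sqrt(T)
critical_values <- qnorm(0.975) / sqrt(series_lengths)
names(critical_values) <- series_lengths
round(critical_values, 3)

# Largest absolute sample autocorrelation (lags 1 to 10) for simulated white noise
set.seed(1)
max_abs_acf <- sapply(series_lengths, function(n) {
  r <- acf(rnorm(n), lag.max = 10, plot = FALSE)$acf[-1]  # drop lag 0
  max(abs(r))
})
round(max_abs_acf, 3)
```

The bounds shrink as the length grows, and the simulated autocorrelations change with every new draw of random numbers even though the underlying process is the same.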

9.2) A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

## Warning: Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.
## Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.

Because the time series has a noticeable trend, we can conclude it is non-stationary. The ACF decays slowly rather than dropping quickly to zero, and the PACF has a first-lag spike close to 1; both support this conclusion. According to the KPSS-based test below, the Amazon closing price needs to be differenced once to become stationary.

## # A tibble: 1 × 2
##   Symbol ndiffs
##   <chr>   <int>
## 1 AMZN        1
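
The same ACF behaviour can be reproduced with a simulated random walk, which here stands in for a trending price series. This is only a sketch in base R; it does not use the AMZN data.

```r
set.seed(123)

# A simulated random walk (not the AMZN data) as a stand-in for a price series
price <- cumsum(rnorm(500))

# Sample autocorrelations at lags 1 to 20, before and after one difference
acf_level <- acf(price, lag.max = 20, plot = FALSE)$acf[-1]
acf_diff  <- acf(diff(price), lag.max = 20, plot = FALSE)$acf[-1]

round(acf_level[1:5], 2)  # stays close to 1 and decays slowly
round(acf_diff[1:5], 2)   # close to 0 after differencing
```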

9.3) For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

## # A tibble: 1 × 2
##   Country ndiffs
##   <fct>    <int>
## 1 Turkey       1

## Warning: Removed 1 row containing missing values (`geom_line()`).

## Warning: Removed 1 rows containing missing values (`geom_point()`).

## # A tibble: 1 × 2
##   State    ndiffs
##   <chr>     <int>
## 1 Tasmania      1

## Warning: Removed 4 rows containing missing values (`geom_line()`).

## Warning: Removed 4 rows containing missing values (`geom_point()`).

## # A tibble: 1 × 1
##   ndiffs
##    <int>
## 1      1

## Warning: Removed 12 rows containing missing values (`geom_line()`).

## Warning: Removed 12 rows containing missing values (`geom_point()`).

9.5) For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

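set.seed(12345678)
# Pick one retail series at random
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))

# Time plot with ACF and PACF of the raw series
myseries %>%
  gg_tsdisplay(Turnover, plot_type = 'partial') +
  labs(title = 'Retail Turnover')

# Box-Cox parameter chosen by the Guerrero method
lambda <- myseries %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

# Number of seasonal differences needed after the transformation
myseries %>%
  features(box_cox(Turnover, lambda), unitroot_nsdiffs)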

## # A tibble: 1 × 3
##   State              Industry                                            nsdiffs
##   <chr>              <chr>                                                 <int>
## 1 Northern Territory Clothing, footwear and personal accessory retailing       1

# Seasonal difference (lag 12) of the Box-Cox transformed series
myseries %>%
  gg_tsdisplay(difference(box_cox(Turnover, lambda), 12), plot_type = 'partial', lag_max = 30) +
  labs(title = 'Transformed Retail Turnover')

## Warning: Removed 12 rows containing missing values (`geom_line()`).

## Warning: Removed 12 rows containing missing values (`geom_point()`).
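
The 12 removed rows come from the seasonal difference, which has no value for the first 12 months. A minimal base R sketch on a hypothetical monthly series (not the retail data) shows the idea:

```r
# Hypothetical monthly series: linear trend plus a repeating 12-month pattern
months  <- 1:48
pattern <- rep(c(5, 3, 1, 0, -2, -4, -5, -3, -1, 0, 2, 4), times = 4)
x <- 0.5 * months + pattern

# Seasonal difference: x[t] - x[t - 12]
seasonal_diff <- diff(x, lag = 12)

length(x) - length(seasonal_diff)  # observations lost to the seasonal difference
range(seasonal_diff)               # the seasonal pattern cancels out
```

Base `diff()` drops the first 12 observations, while `difference()` from tsibble keeps them as `NA`, which is why the plots report removed rows.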

9.6) Simulate and plot some data from simple ARIMA models.

# Simulate 100 observations from an AR(1) model with coefficient 0.6
y <- numeric(100)
e <- rnorm(100)
for (i in 2:100) {
  y[i] <- 0.6 * y[i - 1] + e[i]
}
sim <- tsibble(idx = seq_len(100), y = y, index = idx)

Produce a time plot for the series. How does the plot change?

Changing the AR coefficient (0.6 in the code above) produces different patterns in the time series. With a coefficient of 1 the series looks like a random walk, and with a coefficient of 0 it is white noise. With a negative coefficient the series oscillates around the mean.

Write your own code to generate data from an MA(1) model. Produce a time plot for the series. How does the plot change?

As the value of theta decreases, the series stays closer to the mean and spikes occur more often.
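
A minimal sketch of the MA(1) simulation, following the same pattern as the AR(1) code. The value of `theta` and the seed are assumptions chosen for illustration.

```r
set.seed(2024)
n     <- 100
theta <- 0.6          # assumed MA coefficient; change it to see the effect
e     <- rnorm(n)     # white noise errors

# MA(1): y[t] = e[t] + theta * e[t - 1]
y <- numeric(n)
y[1] <- e[1]
for (i in 2:n) {
  y[i] <- e[i] + theta * e[i - 1]
}

# Theoretical lag-1 autocorrelation of an MA(1) process
theta / (1 + theta^2)
```

The vector `y` can then be wrapped in a tsibble and plotted in the same way as the AR(1) series.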

Generate data from an ARMA(1,1) model. Generate data from an AR(2) model. Graph the latter two series and compare them.

The ARMA(1,1) model is stationary, while the AR(2) model is not. For the ARMA(1,1) series, the ACF decreases quickly without sudden spikes and the PACF cuts off after the initial lag. The AR(2) series remains around the mean, its PACF shows no significant values, and its ACF alternates between positive and negative values.
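
Stationarity of an AR(2) model can also be checked from its coefficients: the model is stationary when every root of 1 - phi1*z - phi2*z^2 lies outside the unit circle. The helper name and the coefficient values below are assumptions for illustration, since the simulation code is not shown here.

```r
# Hypothetical helper: TRUE when all roots of 1 - phi1*z - phi2*z^2
# lie outside the unit circle
ar2_is_stationary <- function(phi1, phi2) {
  roots <- polyroot(c(1, -phi1, -phi2))
  all(Mod(roots) > 1)
}

ar2_is_stationary(0.5, 0.2)    # satisfies the stationarity conditions
ar2_is_stationary(-0.8, 0.3)   # assumed coefficients; phi2 - phi1 exceeds 1
```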

9.7) Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.

Use ARIMA() to find an appropriate ARIMA model. What model was selected? Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

The suggested model was ARIMA(0,2,1). The residuals show signs of white noise.

## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8756
## s.e.   0.0722
## 
## sigma^2 estimated as 4.671:  log likelihood=-87.8
## AIC=179.61   AICc=179.93   BIC=182.99
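
As a sanity check, the information criteria can be reproduced approximately from the reported log likelihood. This sketch assumes 42 annual observations (1970-2011) less the two lost to differencing, and counts two parameters (ma1 and sigma^2). Small discrepancies arise because the log likelihood is printed to only one decimal place.

```r
log_lik <- -87.8     # log likelihood as printed in the model report
k       <- 2         # parameters counted: ma1 and sigma^2
n_obs   <- 42 - 2    # assumed: 42 annual observations, two lost to differencing

aic  <- -2 * log_lik + 2 * k
aicc <- aic + 2 * k * (k + 1) / (n_obs - k - 1)
bic  <- -2 * log_lik + k * log(n_obs)

round(c(AIC = aic, AICc = aicc, BIC = bic), 2)
```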

Write the model in terms of the backshift operator.

(1 - B)^2 y_t = (1 - 0.8756B) ε_t
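
Expanding the backshift form gives the recursion y_t = 2y_{t-1} - y_{t-2} + ε_t - 0.8756 ε_{t-1}. The sketch below simulates from that recursion with made-up errors (not the passenger data) and checks that the second difference of the result is the MA(1) term.

```r
set.seed(42)
n     <- 50
theta <- -0.8756      # ma1 coefficient from the fitted model
e     <- rnorm(n)     # simulated errors, for illustration only

# Recursion implied by (1 - B)^2 y_t = (1 + theta * B) e_t
y <- numeric(n)
for (t in 3:n) {
  y[t] <- 2 * y[t - 1] - y[t - 2] + e[t] + theta * e[t - 1]
}

# The second difference of y should equal the MA(1) part
second_diff <- diff(y, differences = 2)
ma_part     <- e[3:n] + theta * e[2:(n - 1)]
all.equal(second_diff, ma_part)
```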

Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.

The ARIMA(0,1,0) model with drift produces forecasts with a slightly lower slope than part a, and they do not start from the same level.

Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.

The forecast slope is similar to part a, but the forecasts are shifted below the observed data. When the constant is removed, the model cannot be estimated: R reports the error “non-stationary AR part from CSS”.

## Warning: 1 error encountered for ARIMA(Passengers ~ 0 + pdq(2, 1, 2))
## [1] non-stationary AR part from CSS

Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?

R warns that including a constant with two differences induces a quadratic trend, and suggests removing the constant or reducing the number of differences. The forecasts look similar to part a and fit well.

## Warning: Model specification induces a quadratic or higher order polynomial trend. 
## This is generally discouraged, consider removing the constant or reducing the number of differences.
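
The reason for the warning can be seen by ignoring the noise: if the second difference of a series equals a constant, then undoing the two differences yields a quadratic in time. The constant below is hypothetical.

```r
c0 <- 0.5    # hypothetical constant
n  <- 20

# With no noise, (1 - B)^2 y_t = c0 at every time point
second_diff <- rep(c0, n)

# Undo the two differences by cumulative summation
y  <- cumsum(cumsum(second_diff))
tt <- seq_len(n)

# The result is exactly quadratic in time: c0 * t * (t + 1) / 2
all.equal(y, c0 * tt * (tt + 1) / 2)
```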

9.8) For the United States GDP series (from global_economy):

If necessary, find a suitable Box-Cox transformation for the data.

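GDP is not stationary: it has an increasing trend. A Box-Cox transformation is recommended to stabilise the variance as the level rises; the trend itself is then handled by differencing in the ARIMA model.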
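
For reference, a minimal sketch of the Box-Cox transformation for positive data. The function name is hypothetical and the sample values are made up; the fitted model uses the `box_cox()` function named in the model output.

```r
# Hypothetical helper for positive data:
# log(y) when lambda is 0, otherwise (y^lambda - 1) / lambda
box_cox_sketch <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

y <- c(1, 10, 100, 1000)   # made-up values that grow by a constant factor

diff(box_cox_sketch(y, lambda = 0))   # equal steps on the log scale
diff(box_cox_sketch(y, lambda = 1))   # lambda = 1 only shifts the data by 1
```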

Fit a suitable ARIMA model to the transformed data using ARIMA().

## Series: GDP 
## Model: ARIMA(1,1,0) w/ drift 
## Transformation: box_cox(GDP, lambda) 
## 
## Coefficients:
##          ar1  constant
##       0.4586  118.1822
## s.e.  0.1198    9.5047
## 
## sigma^2 estimated as 5479:  log likelihood=-325.32
## AIC=656.65   AICc=657.1   BIC=662.78

Try some other plausible models by experimenting with the orders chosen. Choose what you think is the best model and check the residual diagnostics.

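ARIMA(1,2,1) has the lowest AIC (648) and AICc (649) of the candidates, and its residuals resemble white noise. Note that information criteria are strictly comparable only between models with the same order of differencing, so the comparison with the d = 1 models should be treated with caution.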

## Warning in report.mdl_df(gdp_models): Model reporting is only supported for
## individual models, so a glance will be shown. To see the report for a specific
## model, use `select()` and `filter()` to identify a single model.

## # A tibble: 6 × 9
##   Country       .model   sigma2 log_lik   AIC  AICc   BIC ar_roots  ma_roots 
##   <fct>         <chr>     <dbl>   <dbl> <dbl> <dbl> <dbl> <list>    <list>   
## 1 United States arima110  5479.   -325.  657.  657.  663. <cpl [1]> <cpl [0]>
## 2 United States arima111  5580.   -325.  659.  659.  667. <cpl [1]> <cpl [1]>
## 3 United States arima120  6780.   -326.  656.  656.  660. <cpl [1]> <cpl [0]>
## 4 United States arima121  5761.   -321.  648.  649.  655. <cpl [1]> <cpl [1]>
## 5 United States arima210  5580.   -325.  659.  659.  667. <cpl [2]> <cpl [0]>
## 6 United States arima212  5734.   -325.  662.  664.  674. <cpl [2]> <cpl [2]>

Produce forecasts of your fitted model. Do the forecasts look reasonable?

The forecasts seem reasonable, as they continue the trend at the same slope.

Compare the results with what you would obtain using ETS() (with no transformation).

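The ETS model reports a much larger AIC (3190.787, versus roughly 650 for the ARIMA models), but these values cannot be compared directly: the two models belong to different classes, and the ARIMA models were fitted to Box-Cox transformed data while the ETS model was fitted to the original data. The AIC therefore does not show which model would forecast better; that would require comparing forecast accuracy on held-out data.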

## Series: GDP 
## Model: ETS(M,A,N) 
##   Smoothing parameters:
##     alpha = 0.9990876 
##     beta  = 0.5011949 
## 
##   Initial states:
##          l[0]        b[0]
##  448093333334 64917355687
## 
##   sigma^2:  7e-04
## 
##      AIC     AICc      BIC 
## 3190.787 3191.941 3201.089