library(forecast)
library(fpp2)
library(magrittr)
library(seasonal)
library(ggplot2)
library(gridExtra)

7.8 Exercises

Consider the pigs series — the number of pigs slaughtered in Victoria each month.

Use the ses() function in R to find the optimal values of α and ℓ0 , and generate forecasts for the next four months.

plot(pigs)

fc<- ses(pigs, h=4)
round(accuracy(fc),2)

##                  ME    RMSE     MAE   MPE MAPE MASE ACF1
## Training set 385.87 10253.6 7961.38 -0.92 9.27  0.8 0.01

pigs.train <- window(pigs, end = c(1995,4))

pigs.test <- window(pigs, start = c(1995,5))


alpha <- seq(.01, .99, by = .01)
RMSE <- NA
for(i in seq_along(alpha)) {
  fit <- ses(pigs.train, alpha = alpha[i], h = 4)
  RMSE[i] <- accuracy(fit, pigs.test)[2,2]
}

fit_p<- ses(pigs.train, alpha=0.89, h=4)

The min value of alpha with min RMSE is 0.89.

Compute a 95% prediction interval for the first forecast using
^y ± 1.96 s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

Write your own function to implement simple exponential smoothing.

The function should take arguments y (the time series), alpha (the smoothing parameter α ) and level (the initial level ℓ0 ). It should return the forecast of the next observation in the series. Does it give the same forecast as ses()?

Modify your function from the previous exercise to return the sum of squared errors rather than the forecast of the next observation.

Then use the optim() function to find the optimal values of
α and
ℓ 0 . Do you get the same values as the ses() function?

Combine your previous two functions to produce a function which both finds the optimal values

of α and ℓ0 , and produces a forecast of the next observation in the series.

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

Plot the series and discuss the main features of the data.
use the ses() function to forecast each series, and plot the forecasts.
Compute the RMSE values for the training data in each case.

autoplot(books, facets = TRUE)+
  ggtitle("Daily Book Sales")

Total sales show an increasing trend. For majority of the time the sales of one seem to rise when the other falls.

#Paper

paper <- books[,1] #paperbacks only

fit1 <- ses(paper, alpha=0.89, h=4)

autoplot(fit1)

accuracy(fit1)

##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 1.702475 44.04445 36.76217 -2.733058 20.64711 0.9270461 -0.4565379

# Hardcover
 
hard<- books[,2]

fit2<- ses(hard, alpha=.8, h=4)

autoplot(fit2)

accuracy(fit2)

##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 4.778564 35.42193 28.27426 0.5621838 14.28275 0.8435737 -0.4800128

RMSE Paperback = 44.044 RMSE Hardcover = 35.422

We will continue with the daily sales of paperback and hardcover books in data set books.

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
Compare the forecasts for the two series using both methods. Which do you think is best?
Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

paper <- books[,1] #paperbacks only

fit3 <- holt(paper, h=4)

autoplot(fit3)

accuracy(fit3) #RMSE = 31.13692

##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
##                    ACF1
## Training set -0.1750792

# Hardcover
 
hard<- books[,2]

fit4<- holt(hard, h=4)

autoplot(fit4)

accuracy(fit4) #RMSE = 27.194

##                      ME     RMSE      MAE       MPE    MAPE      MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
##                     ACF1
## Training set -0.03245186

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?

autoplot(eggs)

holt_eggs <- holt(eggs, h = 100)
e1<- autoplot(holt_eggs) +   autolayer(holt_eggs$fitted)

holt_damped_eggs <- holt(eggs, damped = TRUE, h = 100)
e2<- autoplot(holt_damped_eggs) +   autolayer(holt_damped_eggs$fitted)


holt_BoxCox_eggs <- holt(eggs, lambda = BoxCox.lambda(eggs), h = 100)
e3<- autoplot(holt_BoxCox_eggs) +   autolayer(holt_BoxCox_eggs$fitted)

holt_BoxCox_damped_eggs <- holt(eggs,damped = TRUE, lambda = BoxCox.lambda(eggs), h = 100)
e4<- autoplot(holt_BoxCox_damped_eggs) +   autolayer(holt_BoxCox_damped_eggs$fitted)

grid.arrange(e1, e2, e3,e4, nrow = 2)

sqrt(mean(holt_eggs$residuals^2))

## [1] 26.58219

#[1] 26.58219

sqrt(mean(holt_damped_eggs$residuals^2))

## [1] 26.54019

#[1] 26.54019

sqrt(mean(holt_BoxCox_eggs$residuals^2))

## [1] 1.032217

#[1] 1.032217

sqrt(mean(holt_BoxCox_damped_eggs$residuals^2))

## [1] 1.039187

#[1] 1.039187

Recall your retail time series data (from Exercise 3 in Section 2.10).

Why is multiplicative seasonality necessary for this series?

library(readxl)
retaildata <- readxl::read_excel("C:/Users/hunai/Documents/retail.xlsx", skip=1)

myts <- ts(retaildata[,"A3349399C"],  frequency=12, start=c(1982,4))

autoplot(myts)

ggseasonplot(myts)

ggsubseriesplot(myts)

gglagplot(myts)

ggAcf(myts)

Seasonality variations in the time series are changing with increase in time. Multiplicative seasonality is the best approach because additive method can handle constant seasonal variations only.

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

ets_retail <- hw(myts, seasonal = "multiplicative")
ets_retaild <- hw(myts,seasonal = "multiplicative",damped = TRUE)

autoplot(myts) + autolayer(ets_retail, series='Retail Data Multiplicative', PI=FALSE)  +
    autolayer(ets_retaild, series='Retail Data Damped', PI=FALSE)

error_ets_retail <- tsCV(  myts, hw, h = 1, seasonal = "multiplicative")
error_ets_retaild <- tsCV( myts, hw, h = 1, seasonal = "multiplicative", damped = TRUE)
sqrt(mean(error_ets_retail^2, na.rm = TRUE)) #14.1892

## [1] 14.18924

sqrt(mean(error_ets_retaild^2, na.rm = TRUE)) #14.467

## [1] 14.4676

The RMSE of both methods are very close. We prefer the damping method as that will control the trend from rising too much.

Check that the residuals from the best method look like white noise.

checkresiduals(ets_retaild)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 91.149, df = 7, p-value < 2.2e-16
## 
## Model df: 17.   Total lags used: 24

checkresiduals(ets_retail)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 67.421, df = 8, p-value = 1.6e-11
## 
## Model df: 16.   Total lags used: 24

Neither method displays as “white noise” completely, but the Ljung-Box test shows significance for both models.

Now find the test set RMSE, while training the model to the end of 2010. ?

myts_train <- window(myts, end = c(2010, 12))
myts_test <- window(myts, start = 2011)

#Holt-Winters’ method with damped option.
retaild_train <- hw(myts_train, h = 36, seasonal = "multiplicative", damped = TRUE)
autoplot(retaild_train)

accuracy(retaild_train, myts_test)

##                      ME     RMSE       MAE       MPE     MAPE      MASE
## Training set  0.9068587 12.22244  9.146994 0.2077715 4.469245 0.5514028
## Test set     23.7963469 45.12278 31.653071 5.2110681 7.648041 1.9081232
##                    ACF1 Theil's U
## Training set 0.06226556        NA
## Test set     0.81794642 0.5652086

The test RMSE is 45.123, which is higher compared to the training set RMSE.

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

fc_stl_ets_retail_train <- myts_train %>% stlm( s.window = 13, robust = TRUE, method = "ets", lambda = BoxCox.lambda(myts_train) ) %>%
forecast( h = 36, lambda = BoxCox.lambda(myts_train) )

autoplot(fc_stl_ets_retail_train)

accuracy(fc_stl_ets_retail_train, myts_test)

##                       ME     RMSE       MAE        MPE     MAPE      MASE
## Training set -0.03080916 11.26884  7.988279 -0.1302718 3.874566 0.4815527
## Test set     -0.80317765 26.06297 19.951653 -1.0220778 5.199036 1.2027336
##                    ACF1 Theil's U
## Training set 0.07981647        NA
## Test set     0.79863957 0.3413503

#Forecast w/o transfomation
fc_stl_ets_retail_train_without_tr <- myts_train %>% stlm( s.window = 13, robust = TRUE,  method = "ets" ) %>% forecast(h = 36)

autoplot(fc_stl_ets_retail_train_without_tr)

accuracy(fc_stl_ets_retail_train_without_tr,  myts_test)

##                      ME     RMSE       MAE        MPE     MAPE      MASE
## Training set -0.1167899 13.25297  8.917642 -0.3358157 4.330561 0.5375769
## Test set     -0.1768534 38.43752 30.155581 -1.6648872 7.753192 1.8178509
##                   ACF1 Theil's U
## Training set 0.1194750        NA
## Test set     0.6013115 0.4986773

For this exercise use data set ukcars, the quarterly UK passenger vehicle production data from 1977Q1–2005Q1.

Plot the data and describe the main features of the series.

autoplot(ukcars)

Decompose the series using STL and obtain the seasonally adjusted data.

ukcars %>%
  stl(t.window=13, s.window="periodic", robust=TRUE) -> fit_cars

autoplot(ukcars, series="Data") +
  autolayer(trendcycle(fit_cars), series="Trend") +
  autolayer(seasadj(fit_cars), series="Seasonally Adjusted") +
  xlab("Year") + ylab("Vehicle Production") +
    scale_colour_manual(values=c("gray","blue","red"),
             breaks=c("Data","Seasonally Adjusted","Trend"))

Forecast the next two years of the series using an additive damped trend method applied to the seasonally adjusted data. (This can be done in one step using stlf() with arguments etsmodel=“AAN”, damped=TRUE.)

fit_cars1<- fit_cars %>% seasadj() %>%
  stlf(etsmodel="AAN", damped=TRUE)

  autoplot(fit_cars1)

Forecast the next two years of the series using Holt’s linear method applied to the seasonally adjusted data (as before but with damped=FALSE).

fit_cars2<- fit_cars %>% seasadj() %>%
  stlf(etsmodel="AAN", damped=FALSE)

  autoplot(fit_cars2)

Now use ets() to choose a seasonal model for the data.

fit_cars3<- ets(ukcars, model="ZZZ",damped=NULL)

summary(fit_cars3)

## ETS(A,N,A) 
## 
## Call:
##  ets(y = ukcars, model = "ZZZ", damped = NULL) 
## 
##   Smoothing parameters:
##     alpha = 0.6199 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 314.2568 
##     s = -1.7579 -44.9601 21.1956 25.5223
## 
##   sigma:  25.9302
## 
##      AIC     AICc      BIC 
## 1277.752 1278.819 1296.844 
## 
## Training set error measures:
##                    ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 1.313884 25.23244 20.17907 -0.1570979 6.629003 0.6576259
##                    ACF1
## Training set 0.02573334

  autoplot(fit_cars3)

The ETS model chose an “ANA” model.

Compare the RMSE of the ETS model with the RMSE of the models you obtained using STL decompositions. Which gives the better in-sample fits?

accuracy(fit_cars1)

##                    ME     RMSE      MAE        MPE    MAPE     MASE       ACF1
## Training set 1.551267 23.32113 18.48987 0.07690594 5.95317 0.602576 0.02262669

summary(fit_cars1)

## 
## Forecast method: STL +  ETS(A,Ad,N)
## 
## Model Information:
## ETS(A,Ad,N) 
## 
## Call:
##  ets(y = na.interp(x), model = etsmodel, damped = TRUE, allow.multiplicative.trend = allow.multiplicative.trend) 
## 
##   Smoothing parameters:
##     alpha = 0.7037 
##     beta  = 1e-04 
##     phi   = 0.9243 
## 
##   Initial states:
##     l = 341.6239 
##     b = -5.0799 
## 
##   sigma:  23.8549
## 
##      AIC     AICc      BIC 
## 1257.950 1258.743 1274.314 
## 
## Error measures:
##                    ME     RMSE      MAE        MPE    MAPE     MASE       ACF1
## Training set 1.551267 23.32113 18.48987 0.07690594 5.95317 0.602576 0.02262669
## 
## Forecasts:
##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       392.8649 362.2937 423.4362 346.1103 439.6196
## 2005 Q3       408.1368 370.7524 445.5211 350.9624 465.3112
## 2005 Q4       402.7433 359.6076 445.8790 336.7730 468.7137
## 2006 Q1       408.9103 360.7036 457.1171 335.1845 482.6362
## 2006 Q2       392.8611 340.0672 445.6550 312.1198 473.6023
## 2006 Q3       408.1332 351.1193 465.1471 320.9380 495.3284
## 2006 Q4       402.7401 341.7970 463.6831 309.5357 495.9444
## 2007 Q1       408.9073 344.2729 473.5417 310.0576 507.7570

accuracy(fit_cars2)

##                      ME   RMSE     MAE        MPE     MAPE      MASE       ACF1
## Training set -0.3412727 23.295 18.1605 -0.5509137 5.882519 0.5918418 0.02103582

summary(fit_cars2)

## 
## Forecast method: STL +  ETS(A,A,N)
## 
## Model Information:
## ETS(A,A,N) 
## 
## Call:
##  ets(y = na.interp(x), model = etsmodel, damped = FALSE, allow.multiplicative.trend = allow.multiplicative.trend) 
## 
##   Smoothing parameters:
##     alpha = 0.7248 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 328.1086 
##     b = 0.9202 
## 
##   sigma:  23.7186
## 
##      AIC     AICc      BIC 
## 1255.697 1256.258 1269.334 
## 
## Error measures:
##                      ME   RMSE     MAE        MPE     MAPE      MASE       ACF1
## Training set -0.3412727 23.295 18.1605 -0.5509137 5.882519 0.5918418 0.02103582
## 
## Forecasts:
##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005 Q2       394.0276 363.6310 424.4242 347.5400 440.5152
## 2005 Q3       410.2168 372.6732 447.7604 352.7988 467.6348
## 2005 Q4       405.7407 362.2064 449.2749 339.1608 472.3206
## 2006 Q1       412.8249 364.0287 461.6211 338.1976 487.4523
## 2006 Q2       397.6928 344.1480 451.2376 315.8031 479.5825
## 2006 Q3       413.8820 355.9756 471.7884 325.3218 502.4423
## 2006 Q4       409.4059 347.4431 471.3688 314.6419 504.1699
## 2007 Q1       416.4902 350.7196 482.2607 315.9027 517.0776

accuracy(fit_cars3)

##                    ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 1.313884 25.23244 20.17907 -0.1570979 6.629003 0.6576259
##                    ACF1
## Training set 0.02573334

summary(fit_cars3)

## ETS(A,N,A) 
## 
## Call:
##  ets(y = ukcars, model = "ZZZ", damped = NULL) 
## 
##   Smoothing parameters:
##     alpha = 0.6199 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 314.2568 
##     s = -1.7579 -44.9601 21.1956 25.5223
## 
##   sigma:  25.9302
## 
##      AIC     AICc      BIC 
## 1277.752 1278.819 1296.844 
## 
## Training set error measures:
##                    ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 1.313884 25.23244 20.17907 -0.1570979 6.629003 0.6576259
##                    ACF1
## Training set 0.02573334

The second model using a damping factor for the ETS “AAN” model gives the lowest RMSE, as well as the lowest AICc.

Compare the forecasts from the three approaches? Which seems most reasonable? g. Check the residuals of your preferred model.

checkresiduals(fit_cars2)

## 
##  Ljung-Box test
## 
## data:  Residuals from STL +  ETS(A,A,N)
## Q* = 22.061, df = 4, p-value = 0.0001949
## 
## Model df: 4.   Total lags used: 8

For this exercise use data set visitors, the monthly Australian short-term overseas visitors data, May 1985–April 2005.

Make a time plot of your data and describe the main features of the series.

autoplot(visitors)

b. Split your data into a training set and a test set comprising the last two years of available data. Forecast the test set using Holt-Winters’ multiplicative method.

vis_train<- window(visitors, end = c(2003,12))

fcast<- hw(vis_train, seasonal = "multiplicative")
autoplot(fcast)

c. Why is multiplicative seasonality necessary here?

The seasonality is variable over the time series, which an additive model would not be able to capture.

Forecast the two-year test set using each of the following methods: an ETS model;

vis_fit1<- ets(vis_train, model= "ZZZ")

fcast1<- forecast(vis_fit1, h=24)

autoplot(fcast1)

checkresiduals(vis_fit1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,Ad,M)
## Q* = 22.713, df = 7, p-value = 0.001912
## 
## Model df: 17.   Total lags used: 24

an additive ETS model applied to a Box-Cox transformed series;

vis_fit2<- ets(vis_train, model = "ZZZ", lambda = BoxCox.lambda(vis_train), additive.only = TRUE)

fcast2<- forecast(vis_fit2, h=24)
       
autoplot(fcast2)

checkresiduals(vis_fit2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,Ad,A)
## Q* = 15.475, df = 7, p-value = 0.03037
## 
## Model df: 17.   Total lags used: 24

a seasonal naïve method;

fcast3<- snaive(vis_train, h=24)

autoplot(fcast3)

checkresiduals(fcast3)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 293.54, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24

an STL decomposition applied to the Box-Cox transformed data followed by an ETS model applied to the seasonally adjusted (transformed) data.

fcast4 <- stlf(vis_train, t.window = 13, s.window = "periodic", h = 24, robust = TRUE, method = "ets", lambda = BoxCox.lambda(vis_train))

autoplot(fcast4)

checkresiduals(fcast4)

## 
##  Ljung-Box test
## 
## data:  Residuals from STL +  ETS(A,A,N)
## Q* = 23.978, df = 20, p-value = 0.2434
## 
## Model df: 4.   Total lags used: 24

e Which method gives the best forecasts? Does it pass the residual tests?

accuracy(fcast1) #RMSE = 14.96

##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.232439 14.95912 10.80756 0.2153496 4.044462 0.4138209
##                      ACF1
## Training set -0.002560158

vis_fit1$aicc # AICC = 2414.43

## [1] 2414.43

accuracy(fcast2) #RMSE = 15.515

##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.168056 15.51547 11.22989 0.1124176 4.046645 0.4299918
##                     ACF1
## Training set -0.02809955

vis_fit2$aicc # AICC = 600.958

## [1] 600.9575

accuracy(fcast3) # RMSE = 31.247

##                    ME     RMSE      MAE      MPE     MAPE MASE      ACF1
## Training set 16.59292 31.24792 26.11651 6.836282 10.18912    1 0.6514168

accuracy(fcast4) # RMSE = 15.394

##                       ME     RMSE      MAE         MPE     MAPE      MASE
## Training set -0.02532068 15.39434 11.08552 -0.09928427 4.107846 0.4244642
##                     ACF1
## Training set -0.02566624

Compare the same four methods using time series cross-validation with the tsCV() function instead of using a training and test set. Do you come to the same conclusions?

The fets() function below returns ETS forecasts.

fets <- function(y, h) {
  forecast(ets(y), h = h)
}

Apply tsCV() for a forecast horizon of h= 4 , for both ETS and seasonal naïve methods to the qcement data, (Hint: use the newly created fets() and the existing snaive() functions as your forecast function arguments.) Compute the MSE of the resulting 4 -step-ahead errors. (Hint: make sure you remove missing values.) Why are there missing values? Comment on which forecasts are more accurate. Is this what you expected?

8.11 Exercises

Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. Explain the differences among these figures. Do they all indicate that the data are white noise?

All three figures indicate that the data are white noise, since the ACF bars are all within the dash lines which indicate critical values for the ACF to be considered statistically significance. For data with smaller number of samples, the ACF bars are taller than the data with larger number of samples. Since the data are randomly generated, they are independently and identically distributed, and therefore should have autocorrelation of zero for all of its lags. This is demonstrated by the figures - as more samples are generated, the autocorrelations tend toward zero.
Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

These dash lines are estimated using ±1.96/N−−√ with zero center. Mathematically, as N gets bigger, the absolute value of critical value become smaller. Statistically, this means that it is “easier” for smaller data set to exhibit correlation by chance than larger data set. So to compensate, the absolute value of critical values are larger for smaller data set - you need to have higher autocorrelation to reject the null hypothesis that the autocorrelation is zero (i.e. the autocorrelation observed are due to chance only).

A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose).

Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

ggtsdisplay(ibmclose)

The textbook says that time series with trends or seasonality are not stationary. In the price plot above, there are apparent trends throughout the time. Therefore it is not stationary.

For a stationary data, the ACF usually drops very quickly to zero. For non-stationary data, the ACF decrease slowly and smoothly. In the ACF plot above, it is apparent that the ACF drop offs very slowly. Therefore it is not stationary.

For the PACF, the 1st lag always equal to ACF’s 1st lag, as can be seen in the figure above. The 1st lag PACF is very high here, and all other PACF are close to zero. This suggests that the data is non-stationary, since the 1st PACF is extremely high (close to one).

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.

usnetelec

autoplot(usnetelec)

bc.lambda <- BoxCox.lambda(usnetelec) #bc.lambda = 0.51677
ndiffs(usnetelec) #1

## [1] 1

d_usnet <- usnetelec %>% BoxCox(bc.lambda) %>% diff(1)
ggtsdisplay(d_usnet)

enplanements

autoplot(enplanements)

bc.lambda <- BoxCox.lambda(enplanements) #bc.lambda =  -0.2269461
ndiffs(enplanements) #1

## [1] 1

nsdiffs(enplanements) #1

## [1] 1

d_enplane <- enplanements %>% BoxCox(bc.lambda) %>% diff(1)
ggtsdisplay(d_enplane)

visitors

For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

ts_retail<- myts
autoplot(ts_retail)

bc.lambda <- BoxCox.lambda(ts_retail) #bc.lambda = 0.02074707
autoplot(BoxCox(ts_retail, bc.lambda))

ndiffs(ts_retail) #1

## [1] 1

nsdiffs(ts_retail) #1

## [1] 1

d_retail <- ts_retail %>% BoxCox(bc.lambda) %>% diff(1) %>% diff()

cbind("Retail Sales" = ts_retail,
      "Transformed" = BoxCox(ts_retail, bc.lambda),
      "Seasonally\n differenced transformed" =
        diff(BoxCox(ts_retail, bc.lambda)),
      "Doubly\n differenced logs" =
        diff(diff(BoxCox(ts_retail, bc.lambda)),1)) %>%
  autoplot(facets=TRUE) +
    xlab("Year") + ylab("") +
    ggtitle("Australian Retail Sales")

Consider austa, the total international visitors to Australia (in millions) for the period 1980-2015.

Use auto.arima() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

aust_ar<- auto.arima(austa)

summary(aust_ar)

## Series: austa 
## ARIMA(0,1,1) with drift 
## 
## Coefficients:
##          ma1   drift
##       0.3006  0.1735
## s.e.  0.1647  0.0390
## 
## sigma^2 estimated as 0.03376:  log likelihood=10.62
## AIC=-15.24   AICc=-14.46   BIC=-10.57
## 
## Training set error measures:
##                        ME      RMSE       MAE       MPE     MAPE      MASE
## Training set 0.0008313383 0.1759116 0.1520309 -1.069983 5.513269 0.7461559
##                      ACF1
## Training set -0.000571993

fcast_ar<- forecast(aust_ar, h= 10)

autoplot(fcast_ar)

Auto.arima chose a (0,1,1) model: autoregressive of order 0, 1 degree of differencing, moving average of order 1.

 checkresiduals(aust_ar)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5, p-value = 0.8067
## 
## Model df: 2.   Total lags used: 7

Plot forecasts from an ARIMA(0,1,1) model with no drift and compare these to part a. Remove the MA term and plot again.

#Arima No Drift

aust_ar1<- Arima(austa, order=c(0,1,1), include.drift = FALSE)

summary(aust_ar1)

## Series: austa 
## ARIMA(0,1,1) 
## 
## Coefficients:
##          ma1
##       0.4936
## s.e.  0.1265
## 
## sigma^2 estimated as 0.04833:  log likelihood=3.73
## AIC=-3.45   AICc=-3.08   BIC=-0.34
## 
## Training set error measures:
##                     ME      RMSE       MAE      MPE     MAPE      MASE
## Training set 0.1150344 0.2136411 0.1714887 3.817629 5.649995 0.8416532
##                    ACF1
## Training set -0.1892256

fcast_ar1<- forecast(aust_ar1, h= 10)

autoplot(fcast_ar1)

#Removing MA

aust_ar2<- Arima(austa, order=c(0,1,0), include.drift = FALSE)

summary(aust_ar2)

## Series: austa 
## ARIMA(0,1,0) 
## 
## sigma^2 estimated as 0.06423:  log likelihood=-1.62
## AIC=5.24   AICc=5.36   BIC=6.8
## 
## Training set error measures:
##                     ME      RMSE       MAE      MPE     MAPE      MASE
## Training set 0.1674969 0.2498996 0.1981155 5.471861 6.479997 0.9723354
##                   ACF1
## Training set 0.2583422

fcast_ar2<- forecast(aust_ar2, h= 10)

autoplot(fcast_ar2)

Plot forecasts from an ARIMA(2,1,3) model with drift. Remove the constant and see what happens.

#Arima 2,1,3

aust_ar3<- Arima(austa, order=c(2,1,3), include.drift = TRUE)

summary(aust_ar3)

## Series: austa 
## ARIMA(2,1,3) with drift 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2     ma3   drift
##       1.7648  -0.8513  -1.7684  0.5571  0.2224  0.1725
## s.e.  0.1136   0.1182   0.2433  0.4351  0.2150  0.0051
## 
## sigma^2 estimated as 0.0276:  log likelihood=13.72
## AIC=-13.44   AICc=-9.29   BIC=-2.55
## 
## Training set error measures:
##                        ME      RMSE      MAE       MPE     MAPE      MASE
## Training set -0.004129818 0.1491068 0.114857 -1.203567 4.326425 0.5637093
##                     ACF1
## Training set -0.01736817

#Series: austa 
#ARIMA(2,1,3) with drift 

#Coefficients:
#        ar1      ar2      ma1     ma2     ma3   drift
#      1.7648  -0.8513  -1.7684  0.5571  0.2224  0.1725
#s.e.  0.1136   0.1182   0.2433  0.4351  0.2150  0.0051

#sigma^2 estimated as 0.0276:  log likelihood=13.72
#AIC=-13.44   AICc=-9.29   BIC=-2.55

#Training set error measures:
#                       ME      RMSE      MAE       MPE     MAPE      MASE        ACF1
#Training set -0.004129818 0.1491068 0.114857 -1.203567 4.326425 0.5637093 -0.01736817

fcast_ar3<- forecast(aust_ar3, h= 10)

autoplot(fcast_ar3)

#Removing constant

aust_ar4<- Arima(austa, order=c(2,0,3), include.constant = FALSE)

summary(aust_ar4)

## Series: austa 
## ARIMA(2,0,3) with zero mean 
## 
## Coefficients:
##           ar1     ar2     ma1     ma2     ma3
##       -0.0020  0.9969  1.6680  0.7627  0.0894
## s.e.   0.0045  0.0045  0.3682  0.2389  0.3726
## 
## sigma^2 estimated as 0.04839:  log likelihood=1.98
## AIC=8.04   AICc=10.94   BIC=17.54
## 
## Training set error measures:
##                    ME      RMSE       MAE      MPE     MAPE      MASE      ACF1
## Training set 0.105478 0.2041388 0.1649614 3.554934 5.477898 0.8096177 -0.237856

#Series: austa 
#ARIMA(2,0,3) with zero mean 

#Coefficients:
#          ar1     ar2     ma1     ma2     ma3
#      -0.0020  0.9969  1.6680  0.7627  0.0894
#s.e.   0.0045  0.0045  0.3682  0.2389  0.3726

#sigma^2 estimated as 0.04839:  log likelihood=1.98
#AIC=8.04   AICc=10.94   BIC=17.54

#Training set error measures:
#                  ME      RMSE       MAE      MPE     MAPE      MASE      ACF1
#Training set 0.105478 0.2041388 0.1649614 3.554934 5.477898 0.8096177 -0.237856

fcast_ar4<- forecast(aust_ar4, h= 10)
autoplot(fcast_ar4)

Plot forecasts from an ARIMA(0,2,1) model with no constant.

aust_ar5<- Arima(austa, order = c(0,2,1))

summary(aust_ar5)

## Series: austa 
## ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.7263
## s.e.   0.2277
## 
## sigma^2 estimated as 0.03969:  log likelihood=6.74
## AIC=-9.48   AICc=-9.09   BIC=-6.43
## 
## Training set error measures:
##                      ME      RMSE       MAE      MPE     MAPE      MASE
## Training set 0.02907781 0.1907537 0.1567877 1.216128 4.984081 0.7695018
##                   ACF1
## Training set 0.1284994

#Series: austa 
#ARIMA(0,2,1) 

#Coefficients:
#          ma1
#      -0.7263
#s.e.   0.2277

#sigma^2 estimated as 0.03969:  log likelihood=6.74
#AIC=-9.48   AICc=-9.09   BIC=-6.43

#Training set error measures:
#                     ME      RMSE       MAE      MPE     MAPE      MASE      ACF1
#Training set 0.02907781 0.1907537 0.1567877 1.216128 4.984081 0.7695018 0.1284994
#
fcast_ar5<- forecast(aust_ar5, h= 10)
autoplot(fcast_ar4)

For the usgdp series:

if necessary, find a suitable Box-Cox transformation for the data; fit a suitable ARIMA model to the transformed data using auto.arima();

ggtsdisplay(usgdp)

lam<- BoxCox.lambda(usgdp) #lam = 0.366352

gdp_tr<- BoxCox(usgdp, lambda= lam)

ggtsdisplay(gdp_tr)

gdp_ar<- auto.arima(gdp_tr)

summary(gdp_ar)

## Series: gdp_tr 
## ARIMA(2,1,0) with drift 
## 
## Coefficients:
##          ar1     ar2   drift
##       0.2795  0.1208  0.1829
## s.e.  0.0647  0.0648  0.0202
## 
## sigma^2 estimated as 0.03518:  log likelihood=61.56
## AIC=-115.11   AICc=-114.94   BIC=-101.26
## 
## Training set error measures:
##                        ME      RMSE       MAE          MPE     MAPE      MASE
## Training set 0.0006791305 0.1859861 0.1417489 -0.004468837 0.264737 0.1789208
##                     ACF1
## Training set 0.009570738

fcast_g1<- forecast(gdp_ar, h=24)

autoplot(fcast_g1)

checkresiduals(fcast_g1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,1,0) with drift
## Q* = 6.5772, df = 5, p-value = 0.254
## 
## Model df: 3.   Total lags used: 8

The auto.arima function chose a model with: autoregressive of order 2, 1 differencing, and MA of order 0.

try some other plausible models by experimenting with the orders chosen;

gdp_ar2<- Arima(gdp_tr, order = c(2,1,1), include.drift = TRUE)

fcast_g2<- forecast(gdp_ar2, h= 24)

autoplot(fcast_g2)

checkresiduals(fcast_g2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,1,1) with drift
## Q* = 5.4997, df = 4, p-value = 0.2398
## 
## Model df: 4.   Total lags used: 8

gdp_ar3<- Arima(gdp_tr, order = c(3,1,1), method = "ML")

fcast_g3<- forecast(gdp_ar3, h= 24)

autoplot(fcast_g3)

checkresiduals(fcast_g3)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(3,1,1)
## Q* = 7.1674, df = 4, p-value = 0.1273
## 
## Model df: 4.   Total lags used: 8

choose what you think is the best model and check the residual diagnostics;

Auto Arima provided the best fit, and the results did not improve with changing the order.

produce forecasts of your fitted model. Do the forecasts look reasonable? compare the results with what you would obtain using ets() (with no transformation).

ZEngineer ADEC 7460 HW Wk 2

Zille Engineer

July 20, 2020

7.8 Exercises

8.11 Exercises