Forecasting: Principles and Practice.

Instructions

From the book Forecasting: Principles and Practice by Hyndman, R. & Athanasopoulos, G.

Please submit exercises 3.1, 3.2, 3.3 and 3.8 from the Hyndman online forecasting book. Please submit both your RPubs link and the .Rmd file with your code.

Exercises

3.1

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.

  • usnetelec
  • usgdp
  • mcopper
  • enplanements

BoxCox.lambda(usnetelec)

BoxCox.lambda(usgdp)

BoxCox.lambda(mcopper)

BoxCox.lambda(enplanements)
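For reference, the transformation that `BoxCox.lambda()` tunes can be written out by hand. The following is a minimal base R sketch, not the forecast package's implementation: `box_cox` and `inv_box_cox` are hypothetical helper names, the data are assumed to be strictly positive, and the built-in `AirPassengers` series stands in for the four series above so that the code runs without extra packages.

```r
# Box-Cox transformation written out by hand (illustrative helper, assumes y > 0)
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Inverse transformation, used to bring values back to the original scale
inv_box_cox <- function(w, lambda) {
  if (abs(lambda) < 1e-8) exp(w) else (lambda * w + 1)^(1 / lambda)
}

y <- AirPassengers              # stand-in series (assumption); replace with usnetelec etc.
w <- box_cox(y, lambda = 0)     # lambda = 0 is the log transformation
y_back <- inv_box_cox(w, lambda = 0)

# The round trip should recover the original values up to floating-point error
print(all.equal(as.numeric(y), as.numeric(y_back)))
```

With \(\lambda\) = 1 the series is only shifted down by one, so its shape is unchanged; values of \(\lambda\) below 1 compress large observations more than small ones.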

3.2

Why is a Box-Cox transformation unhelpful for the cangas data?

BoxCox.lambda(cangas)

A good value of \(\lambda\) is one which makes the size of the seasonal variation about the same across the whole series, as that makes the forecasting model simpler. In this case, \(\lambda\) = 0.577 does not work well: the pattern of seasonal variation in the transformed series looks essentially the same as in the original, so the transformation does not make the series any easier to explain or model. A Box-Cox transformation can only even out variation that grows or shrinks steadily with the level of the series, which does not appear to be the case here.
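One way to check a claim like this numerically is to compare the within-year spread before and after the transformation. The sketch below is only an illustration under stated assumptions: `spread_ratio` and `box_cox` are hypothetical helpers, the built-in `AirPassengers` series stands in for `cangas` (which needs the fpp2 package), \(\lambda\) = 0.577 is the value quoted above, and the log case is included purely for comparison.

```r
# Ratio of the largest to the smallest within-year standard deviation.
# A ratio near 1 means the seasonal variation is roughly constant over time.
spread_ratio <- function(y) {
  yearly_sd <- aggregate(y, nfrequency = 1, FUN = sd)
  max(yearly_sd) / min(yearly_sd)
}

# Hand-written Box-Cox for positive data (illustrative helper)
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

y <- AirPassengers  # stand-in (assumption); use cangas once fpp2 is loaded

ratios <- c(
  raw          = spread_ratio(y),
  lambda_0.577 = spread_ratio(box_cox(y, 0.577)),
  log          = spread_ratio(box_cox(y, 0))
)
print(round(ratios, 2))
```

If the ratio stays well above 1 whichever \(\lambda\) is tried, that suggests the transformation is not stabilising the variance.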

3.3

What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?

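retaildata <- readxl::read_excel("retail.xlsx", skip = 1)  # first row of the file is a description line
myts <- ts(retaildata[, "A3349627V"], frequency = 12, start = c(1982, 4))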

My previously selected time series was A3349627V, which represents turnover in liquor retailing in New South Wales.

I would select \(\lambda\) = -0.058. This value makes the size of the seasonal variation about the same across the whole series, which makes the forecasting model simpler. Because it is close to zero, the transformation behaves almost like a log transformation.

3.8

For your retail time series (from Exercise 3 in Section 2.10):

For this exercise I carry out the procedure twice: first on the raw data, with no transformation, and then with the Box-Cox transformation applied.

RAW DATA: myts

a)

Split the data into two parts using:

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)
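Before plotting, the split can also be sanity-checked by looking at the lengths and boundary dates of the two parts. This is a minimal base R sketch; `AirPassengers` stands in for `myts`, and the split year 1958 is a hypothetical choice adapted to that stand-in series.

```r
y <- AirPassengers  # stand-in for myts (assumption): monthly, Jan 1949 to Dec 1960

# Same pattern as above, with dates adapted to the stand-in series
train <- window(y, end = c(1958, 12))
test  <- window(y, start = 1959)

# The two parts should be contiguous and together cover the whole series
print(c(train = length(train), test = length(test), total = length(y)))
print(end(train))    # last training period
print(start(test))   # first test period
```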

b)

Check that your data have been split appropriately by producing the following plot.

autoplot(myts) +
   autolayer(myts.train, series="Training") +
   autolayer(myts.test, series="Test")

c)

Calculate forecasts using snaive applied to myts.train.

fc <- snaive(myts.train)
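To make explicit what the seasonal naive method does, the point forecasts can be reproduced by hand: each future month is forecast by the last observed value for the same month. The sketch below uses base R only; `seasonal_naive` is a hypothetical helper, not the forecast package's code, and `AirPassengers` stands in for `myts`.

```r
# Seasonal naive point forecasts: repeat the last observed seasonal cycle.
# (Illustrative helper; forecast::snaive also returns prediction intervals.)
seasonal_naive <- function(train, h) {
  m <- frequency(train)
  last_cycle <- tail(as.numeric(train), m)
  ts(rep(last_cycle, length.out = h),
     start = tsp(train)[2] + 1 / m, frequency = m)
}

y <- AirPassengers                      # stand-in for myts (assumption)
train <- window(y, end = c(1958, 12))
fc <- seasonal_naive(train, h = 24)     # two seasonal cycles ahead

print(fc)
```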

d)

Compare the accuracy of your forecasts against the actual values stored in myts.test.

accuracy(fc,myts.test)
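Some of the test-set measures that `accuracy()` reports can be computed by hand, which helps when interpreting them. The following base R sketch is an illustration under assumptions: `AirPassengers` stands in for `myts`, the forecasts are hand-rolled seasonal naive values, and MASE is scaled by the in-sample seasonal naive MAE.

```r
y <- AirPassengers                         # stand-in for myts (assumption)
train <- window(y, end = c(1958, 12))
test  <- window(y, start = 1959)
m <- frequency(train)

# Seasonal naive point forecasts: last observed year, repeated over the test period
fc <- rep(tail(as.numeric(train), m), length.out = length(test))
err <- as.numeric(test) - fc               # forecast errors on the test set

# In-sample MAE of the seasonal naive method, used to scale MASE
scale_mae <- mean(abs(diff(as.numeric(train), lag = m)))

measures <- c(
  ME   = mean(err),
  RMSE = sqrt(mean(err^2)),
  MAE  = mean(abs(err)),
  MAPE = 100 * mean(abs(err / as.numeric(test))),
  MASE = mean(abs(err)) / scale_mae
)
print(round(measures, 2))
```

A MASE above 1 on the test set means the forecasts are, on average, worse than the one-step seasonal naive errors observed in the training data.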

e)

Check the residuals.

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 591.71, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24

Do the residuals appear to be uncorrelated and normally distributed?

The residuals are clearly autocorrelated: the Ljung-Box test rejects the hypothesis of no autocorrelation (Q* = 591.71, df = 24, p-value < 2.2e-16). Their distribution looks approximately normal, but it is not centred at zero.
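The Ljung-Box part of `checkresiduals()` can be approximated with base R's `Box.test()`. In this sketch the seasonal naive residuals are computed as seasonal differences of the training data, and `AirPassengers` again stands in for `myts`, so the statistic will not match the output above.

```r
y <- AirPassengers                       # stand-in for myts (assumption)
train <- window(y, end = c(1958, 12))

# Residuals of the seasonal naive method: each value minus the value one year earlier
res <- diff(train, lag = frequency(train))

# Ljung-Box test with 24 lags and no fitted parameters, as in the output above
lb <- Box.test(res, lag = 24, type = "Ljung-Box", fitdf = 0)
print(lb)

# A mean far from zero indicates biased forecasts (typical when the series trends)
print(mean(res))
```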

f)

How sensitive are the accuracy measures to the training/test split?

The accuracy measures do not appear to be very sensitive to the split. I believe this is because the residual mean is not zero; in addition, some autocorrelation remains in the residuals, as seen in the lag plot.
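Sensitivity to the split can be examined directly by repeating the exercise for several split points and comparing the test-set measures. The sketch below is one possible way to do this in base R; `split_accuracy` is a hypothetical helper, `AirPassengers` stands in for `myts`, and the candidate years are arbitrary choices for that stand-in series.

```r
y <- AirPassengers   # stand-in for myts (assumption)
m <- frequency(y)

# Test-set RMSE and MAPE of seasonal naive forecasts for a given last training year
split_accuracy <- function(last_train_year) {
  train <- window(y, end = c(last_train_year, m))
  test  <- window(y, start = last_train_year + 1)
  fc  <- rep(tail(as.numeric(train), m), length.out = length(test))
  err <- as.numeric(test) - fc
  c(train_end = last_train_year,
    RMSE = sqrt(mean(err^2)),
    MAPE = 100 * mean(abs(err / as.numeric(test))))
}

# Hypothetical candidate split years for the stand-in series
results <- t(sapply(1955:1959, split_accuracy))
print(round(results, 2))
```

Note that an earlier split also means a longer test period, so the measures reflect both a shorter training set and a longer forecast horizon.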

BOX-COX TRANSFORMATION: myts

a)

Split the data into two parts using:

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)

b)

Check that your data have been split appropriately by producing the following plot.

autoplot(myts) +
   autolayer(myts.train, series="Training") +
   autolayer(myts.test, series="Test")

c)

Calculate forecasts using snaive applied to myts.train.

fc <- snaive(myts.train)
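The listing above is the same as in the raw-data section, so it does not show where the transformation enters. The sketch below is one plausible reading, not necessarily what was run here: transform the training data, apply the seasonal naive rule on the transformed scale, and back-transform. With the forecast package this would presumably correspond to passing the `lambda` argument to `snaive()`. The helpers `box_cox` and `inv_box_cox` are hypothetical, `AirPassengers` stands in for `myts`, and \(\lambda\) = -0.058 is the value chosen in 3.3.

```r
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}
inv_box_cox <- function(w, lambda) {
  if (abs(lambda) < 1e-8) exp(w) else (lambda * w + 1)^(1 / lambda)
}

y <- AirPassengers                       # stand-in for myts (assumption)
train <- window(y, end = c(1958, 12))
m <- frequency(train)
lambda <- -0.058                         # value selected in 3.3

# Seasonal naive on the transformed scale, then back-transform
w_train <- box_cox(train, lambda)
fc_w  <- rep(tail(as.numeric(w_train), m), length.out = 2 * m)
fc_bc <- inv_box_cox(fc_w, lambda)

# Seasonal naive directly on the raw scale, for comparison
fc_raw <- rep(tail(as.numeric(train), m), length.out = 2 * m)

# Compare the two sets of point forecasts
print(all.equal(fc_raw, fc_bc))
```

Because the seasonal naive method only copies past observations, transforming and back-transforming leaves the point forecasts unchanged in this sketch. What changes is the scale of the residuals, which would be consistent with the different Q* statistic reported below.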

d)

Compare the accuracy of your forecasts against the actual values stored in myts.test.

accuracy(fc,myts.test)

e)

Check the residuals.

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 669.61, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24

Do the residuals appear to be uncorrelated and normally distributed?

The residuals are again clearly autocorrelated: the Ljung-Box test rejects the hypothesis of no autocorrelation (Q* = 669.61, df = 24, p-value < 2.2e-16). Their distribution looks approximately normal, but it is not centred at zero.
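The two visual judgements, "approximately normal" and "not centred at zero", can be backed up with simple tests on the transformed-scale residuals. This is a rough base R sketch under assumptions: `box_cox` is a hypothetical helper, `AirPassengers` stands in for `myts`, and \(\lambda\) = -0.058 is the value from 3.3. The t-test assumes independent observations, which autocorrelated residuals violate, so its p-value is only a rough guide.

```r
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

y <- AirPassengers                       # stand-in for myts (assumption)
train <- window(y, end = c(1958, 12))
lambda <- -0.058                         # value selected in 3.3

# Seasonal naive residuals on the Box-Cox scale
res <- diff(box_cox(train, lambda), lag = frequency(train))

# Is the mean different from zero? (rough guide only, see the caveat above)
mean_test <- t.test(as.numeric(res), mu = 0)
print(mean_test$estimate)
print(mean_test$p.value)

# Is the shape compatible with a normal distribution?
print(shapiro.test(as.numeric(res))$p.value)
```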

f)

How sensitive are the accuracy measures to the training/test split?

As with the raw data, the accuracy measures do not appear to be very sensitive to the split. I believe this is because the residual mean is not zero; in addition, some autocorrelation remains in the residuals, as seen in the lag plot.

References

Hyndman, R. & Athanasopoulos, G. 2019. Forecasting: Principles and Practice. Australia: Monash University. https://otexts.com/fpp2/.

R Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.