require(fpp2)
## Loading required package: fpp2
## Loading required package: ggplot2
## Loading required package: forecast
## Loading required package: fma
## Loading required package: expsmooth

3.1 For the following series, find an appropriate Box-Cox transformation in order to stabilize the variance

Since the forecast package provides a built-in function, BoxCox.lambda(), to find a good lambda value to stabilize the variance, we will go with that for each series.

usnetelec

l1 <- BoxCox.lambda(usnetelec)
autoplot(BoxCox(usnetelec, l1)) + 
  ggtitle("usnetelec Box-Cox")

autoplot(usnetelec) +
  ggtitle("usnetelec raw")

usgdp

l2 <- BoxCox.lambda(usgdp)
autoplot(BoxCox(usgdp, l2)) +
  ggtitle("usgdp Box-Cox")

autoplot(usgdp) +
  ggtitle("usgdp raw")

mcopper

l3 <- BoxCox.lambda(mcopper)
autoplot(BoxCox(mcopper, l3)) +
  ggtitle("mcopper Box-Cox")

autoplot(mcopper) +
  ggtitle("mcopper raw")

enplanements

l4 <- BoxCox.lambda(enplanements)
autoplot(BoxCox(enplanements, l4)) +
  ggtitle("enplanement Box-Cox")

autoplot(enplanements) +
  ggtitle("enplanements raw")

For usnetelec and usgdp there does not seem to be much seasonal variation to smooth over, although the Box-Cox transformation has rescaled the values onto a smaller range, which may still help in prediction. The biggest shift is in the mcopper dataset, where the Box-Cox transformation damps the huge spike in the more recent entries.

3.2 Why is a Box-Cox transformation unhelpful for the cangas data?

l5 <- BoxCox.lambda(cangas)
autoplot(BoxCox(cangas, l5)) +
  ggtitle("cangas Box-Cox")

autoplot(cangas) +
  ggtitle("cangas raw")

The Box-Cox transformation has little visible effect on cangas. The reason is that the seasonal variation grows through the middle of the series and then shrinks again, while a Box-Cox transformation can only compensate for variation that changes monotonically with the level of the series, so no single lambda works. We saw transformations with similarly modest effects for usnetelec and usgdp in the previous exercise, and even enplanements to a certain extent. Even Hyndman and Athanasopoulos note that a transformation isn’t always needed.
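
One way to see this changing seasonal pattern (an optional check) is a subseries plot, which splits cangas by month:

# Each panel shows one month across the years; the spread widens in the middle
# of the sample and then narrows again, which no single Box-Cox lambda can flatten.
ggsubseriesplot(cangas) +
  ggtitle("cangas subseries by month")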

3.3 What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?

From Exercise 2.3, we know we have to read in the file; more importantly, the file is a data frame holding many time series as columns, so we need to pick one in particular to use. For simplicity’s sake, let us choose the first one.

retaildata <- readxl::read_excel("retail.xlsx", skip = 1)

myts <- ts(retaildata[, "A3349335T"],
           frequency = 12, start = c(1982, 4))
l6 <- BoxCox.lambda(myts)
autoplot(BoxCox(myts, l6)) +
  ggtitle("retail Box-Cox")

autoplot(myts) +
  ggtitle("retail raw")

We see a much more drastic transformation here than in our other uses of BoxCox. The upward trend is more linear rather than an accelerating rise, and the seasonal peaks are now roughly constant in size across the series instead of growing with the level.

3.8 For your retail time series (from Exercise 3 in Section 2.10):

a. Split the data into two parts using

myts.train <- window(myts, end = c(2010, 12))
myts.test <- window(myts, start = 2011)

b. Check that your data have been split appropriately by producing the following plot:

autoplot(myts) +
  autolayer(myts.train, series = "Training") +
  autolayer(myts.test, series = "Test")
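
A quick numeric check (optional, beyond the plot) that the split covers the whole series with no gap or overlap:

# The two pieces should together account for every observation.
length(myts.train) + length(myts.test) == length(myts)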

c. Calculate forecasts using snaive applied to myts.train.

fc <- snaive(myts.train)
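
One caveat worth noting: snaive() defaults to a horizon of h = 2 * frequency, i.e. 24 months here, which may be shorter than the test window. A hypothetical variant that forecasts the full test length instead:

# fc_full is an assumption for illustration; it forecasts as many months
# ahead as the test set contains, rather than the default 24.
fc_full <- snaive(myts.train, h = length(myts.test))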

d. Compare the accuracy of your forecasts against the actual values stored in myts.test.

accuracy(fc, myts.test)
##                    ME      RMSE       MAE      MPE     MAPE     MASE
## Training set 61.56787  72.20702  61.68438 6.388722 6.404105 1.000000
## Test set     97.44583 109.62545 100.02917 4.629852 4.751209 1.621629
##                   ACF1 Theil's U
## Training set 0.6018274        NA
## Test set     0.2686595 0.9036205

e. Check the residuals.

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 812.76, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24

Do the residuals appear to be uncorrelated and normally distributed?

No, the residuals do not appear to be normally distributed: the histogram shows a rather large right skew. They are also clearly correlated: far more than 5% of the spikes in the ACF plot lie outside the significance bounds, and the Ljung-Box p-value is essentially zero.
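
For context, the dashed significance bounds in the ACF panel sit at roughly ±1.96/sqrt(T); a quick computation, taking T as the number of non-missing training residuals:

# Approximate 95% bounds for the residual ACF spikes.
1.96 / sqrt(sum(!is.na(residuals(fc))))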

f. How sensitive are the accuracy measures to the training/test split?

To me, the only way to see the sensitivity of the accuracy measures is to attempt multiple train/test splits. Let’s do one with a larger test split and one with a smaller test split.

myts.train2 <- window(myts, end = c(2005, 12))
myts.test2 <- window(myts, start = 2006)

myts.train3 <- window(myts, end = c(2012, 12))
myts.test3 <- window(myts, start = 2013)
fc2 <- snaive(myts.train2)
fc3 <- snaive(myts.train3)

accuracy(fc, myts.test)
##                    ME      RMSE       MAE      MPE     MAPE     MASE
## Training set 61.56787  72.20702  61.68438 6.388722 6.404105 1.000000
## Test set     97.44583 109.62545 100.02917 4.629852 4.751209 1.621629
##                   ACF1 Theil's U
## Training set 0.6018274        NA
## Test set     0.2686595 0.9036205
accuracy(fc2, myts.test2)
##                     ME      RMSE       MAE      MPE     MAPE     MASE
## Training set  53.64432  61.52865  53.70586 6.624671 6.638775 1.000000
## Test set     144.62083 162.88771 144.62083 8.390952 8.390952 2.692831
##                   ACF1 Theil's U
## Training set 0.4783456        NA
## Test set     0.7699170  1.629926
accuracy(fc3, myts.test3)
##                    ME     RMSE      MAE      MPE     MAPE     MASE
## Training set 61.43585 72.16410 61.82577 6.150356 6.177893 1.000000
## Test set     74.47500 86.11787 74.47500 3.250322 3.250322 1.204595
##                    ACF1 Theil's U
## Training set  0.5721717        NA
## Test set     -0.1109222 0.5204259
autoplot(myts) +
  autolayer(fc, series = "Default SN", PI = FALSE) +
  autolayer(fc2, series = "Large Split SN", PI = FALSE) +
  autolayer(fc3, series = "Small Split SN", PI = FALSE) +
  ggtitle("Retail Split Difference") + 
  guides(colour = guide_legend(title = "Forecasts"))

So it looks as though the split with the smallest test set is the best predictor (test MASE 1.20), followed by the original (1.62), and then the largest test split (2.69), which shows that the accuracy measures are quite sensitive to the train/test split. This makes intuitive sense: the more data available for training and the shorter the prediction window, the more accurate the forecasts should be.
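
As a final aside (not part of the exercise), a less split-dependent way to measure accuracy is rolling-origin cross-validation with tsCV(), which averages forecast errors over many origins instead of relying on one fixed split. A minimal sketch:

# Refit the seasonal naive forecast at every origin, up to 12 months ahead,
# and summarize all the resulting errors with one overall RMSE.
e <- tsCV(myts, snaive, h = 12)
sqrt(mean(e^2, na.rm = TRUE))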