Assignment: Exercises 3.1, 3.2, 3.3 and 3.8 from the HA textbook
library(fpp2)
## Loading required package: ggplot2
## Loading required package: forecast
## Loading required package: fma
## Loading required package: expsmooth
3.1 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.
The optimal Box-Cox transformation for usnetelec uses a lambda value of 0.52. According to HA, this is is the lambda that makes the size of the seasonal variation about the same across the whole series. You can see the original series along with the transformed series in the below plot. The transformed series appears to be smoother.
help(usnetelec)
lambda1 <- BoxCox.lambda(usnetelec)
usnetelecBoxcox <- BoxCox(usnetelec, lambda1)
usnetelecComb <- cbind(usnetelec, usnetelecBoxcox)
autoplot(usnetelecComb, facet = TRUE) + ggtitle("US Net Electricity Generation 1949-2003")
The lambda value for usgdp’s Box-Cox transformation is 0.37, and again, this transformed series appears more condensed and smoother compared to the original series.
help(usgdp)
lambda2 <- BoxCox.lambda(usgdp)
usgdpBoxcox <- BoxCox(usgdp, lambda2)
usgdpComb <- cbind(usgdp, usgdpBoxcox)
autoplot(usgdpComb, facet = TRUE) + ggtitle("Quarterly US GDP 1947Q1-2006Q1")
The lambda for mcopper’s Box-Cox transformation is 0.19, and this transformation namely makes the spike at its end seem less drastic than in the original series.
help(mcopper)
lambda3 <- BoxCox.lambda(mcopper)
mcopperBoxcox <- BoxCox(mcopper, lambda3)
mcopperComb <- cbind(mcopper, mcopperBoxcox)
autoplot(mcopperComb, facet = TRUE) + ggtitle("Monthly Copper Prices")
The lambda for enplanements’s Box-Cox transformation is -0.23, and this transformation makes the beginning of the series more volatile while making the end of the series less volatile compared to the original series.
help(enplanements)
lambda4 <- BoxCox.lambda(enplanements)
enplanementsBoxcox <- BoxCox(enplanements, lambda4)
enplanementsComb <- cbind(enplanements, enplanementsBoxcox)
autoplot(enplanementsComb, facet = TRUE) + ggtitle("Monthly US Domestic Enplanements")
3.2 Why is a Box-Cox transformation unhelpful for the cangas data?
I would say that the Box-Cox transformation is unhelpful for this data as the series does not appear to change much post transformation. This is especially true when we look at the scales of the original and transformed series side by side–they are not different by much compared to what we see with the series that we previously transformed.
help(cangas)
lambda5 <- BoxCox.lambda(cangas)
cangasBoxcox <- BoxCox(cangas, lambda5)
cangasComb <- cbind(cangas, cangasBoxcox)
autoplot(cangasComb, facet = TRUE) + ggtitle("Monthly Canadian Gas Production January 1960-February 2005")
3.3 What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?
First, we retrive the series that we explored in the previous assignment.
retaildata <- readxl::read_excel("retail.xlsx", skip=1) #The second argument (skip=1) is required because the Excel sheet has two header rows.
myts <- ts(retaildata[,"A3349415T"],
frequency=12, start=c(1982,4))
We can then see how this series appears after a Box-Cox transformation that uses a lambda of -0.24. The heightened peaks due to seasonal effects in the original series are more stabilized in the transformed series.
lambda6 <- BoxCox.lambda(myts)
mytsBoxcox <- BoxCox(myts, lambda6)
mytsComb <- cbind(myts, mytsBoxcox)
autoplot(mytsComb, facet = TRUE) + ggtitle("Australian Retail")
3.8 For your retail time series (from Exercise 3 in Section 2.10):
myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)
autoplot(myts) +
autolayer(myts.train, series="Training") +
autolayer(myts.test, series="Test")
fc <- snaive(myts.train)
d.Compare the accuracy of your forecasts against the actual values stored in myts.test.
accuracy(fc,myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set 2.746847 7.205710 5.513814 5.520797 9.100298 1.000000
## Test set -4.233333 7.933999 6.316667 -4.839500 7.470770 1.145608
## ACF1 Theil's U
## Training set 0.6556452 NA
## Test set 0.2333239 0.3695156
The mean of the residuals is close to zero and there appears to be little to no no correlation in the residual series. They also appear to be nearly normally distributed when we look at the histogram.
checkresiduals(fc)
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 545.84, df = 24, p-value < 2.2e-16
##
## Model df: 0. Total lags used: 24
We can see this sensitivity by trying out different splits and seeing how the accuracy changes. The splits we used train the model on data through 2000, 2005, or 2010 and the differing accuracy results ultimately show that the measures are very sensitive to the training/test split. I think that we would see larger discrepancies for some other series that have worse outcomes in their residuals.
myts.train2 <- window(myts, end=c(2000,12))
myts.test2 <- window(myts, start=2001)
myts.train3 <- window(myts, end=c(2005,12))
myts.test3 <- window(myts, start=2006)
fc2 <- snaive(myts.train2)
fc3 <- snaive(myts.train3)
autoplot(myts) +
autolayer(fc2, series="Trained through 2000", PI = FALSE) +
autolayer(fc3, series="Trained through 2005", PI = FALSE) +
autolayer(fc, series="Trained through 2010", PI = FALSE)
accuracy(fc,myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set 2.746847 7.205710 5.513814 5.520797 9.100298 1.000000
## Test set -4.233333 7.933999 6.316667 -4.839500 7.470770 1.145608
## ACF1 Theil's U
## Training set 0.6556452 NA
## Test set 0.2333239 0.3695156
accuracy(fc2,myts.test2)
## ME RMSE MAE MPE MAPE MASE
## Training set 3.641784 5.939179 4.452113 8.193854 9.852485 1.000000
## Test set 5.700000 10.905389 7.533333 5.351448 7.958069 1.692081
## ACF1 Theil's U
## Training set 0.6399205 NA
## Test set 0.3470475 0.5373456
accuracy(fc3,myts.test3)
## ME RMSE MAE MPE MAPE MASE
## Training set 3.814286 6.795320 5.100000 7.359787 9.354952 1.0000000
## Test set 1.891667 5.621239 4.658333 1.274826 4.306579 0.9133987
## ACF1 Theil's U
## Training set 0.5669672 NA
## Test set 0.5768809 0.2278897