Data624 The forecaster’s toolbox Assignment2

Chapter 3:

suppressMessages(suppressWarnings(library(fpp2)))
suppressMessages(suppressWarnings(library(readxl)))

3.1 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.

usnetelec:

lmd = BoxCox.lambda(usnetelec)
lmd

## [1] 0.5167714

usnetelec.trans = BoxCox(usnetelec,lmd)
combined = cbind(usnetelec,usnetelec.trans)
autoplot(combined, facets=TRUE) + xlab("Year") + ggtitle("usnetelec")
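For reference, the transformation that BoxCox() applies can be sketched by hand. This is a minimal sketch assuming the standard one-parameter Box-Cox definition for positive data (log when lambda is 0, otherwise (y^lambda - 1)/lambda). The names box_cox_manual and inv_box_cox_manual are hypothetical helpers, not functions from the forecast package, and the input values are made up.

```r
# Hypothetical helper: one-parameter Box-Cox transform for positive data.
box_cox_manual <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Hypothetical helper: inverse transform, to return to the original scale.
inv_box_cox_manual <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

y <- c(100, 400, 900, 1600)           # made-up positive values
w <- box_cox_manual(y, lambda = 0.5)  # equals 2 * (sqrt(y) - 1)
w
y_back <- inv_box_cox_manual(w, lambda = 0.5)  # recovers y
```

With lambda = 0.5 the transform is a shifted and scaled square root, which compresses large values more than small ones. That is why it can even out variation that grows with the level of the series.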

usgdp:

lmd = BoxCox.lambda(usgdp)
lmd

## [1] 0.366352

usgdp.trans = BoxCox(usgdp,lmd)
combined = cbind(usgdp,usgdp.trans)
autoplot(combined, facets=TRUE) + xlab("Year") + ggtitle("usgdp")

mcopper:

lmd = BoxCox.lambda(mcopper)
lmd

## [1] 0.1919047

mcopper.trans = BoxCox(mcopper,lmd)
combined = cbind(mcopper,mcopper.trans)
autoplot(combined, facets=TRUE) + xlab("Year") + ggtitle("mcopper")

enplanements:

lmd = BoxCox.lambda(enplanements)
lmd

## [1] -0.2269461

enplanements.trans = BoxCox(enplanements,lmd)
combined = cbind(enplanements,enplanements.trans)
autoplot(combined, facets=TRUE) + xlab("Year") + ggtitle("enplanements")

3.2 Why is a Box-Cox transformation unhelpful for the cangas data?

To answer this, let's first plot the cangas data alongside its Box-Cox transformed version.

cangas:

lmd = BoxCox.lambda(cangas)
lmd

## [1] 0.5767759

cangas.trans = BoxCox(cangas,lmd)
combined = cbind(cangas,cangas.trans)
autoplot(combined, facets=TRUE) + xlab("Year") + ggtitle("cangas")

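This time series is monthly Canadian gas production, in billions of cubic metres, January 1960 - February 2005. The transformed series looks almost the same as the original: the seasonal variation is not made any more uniform. In cangas the variation does not grow steadily with the level of the series (it is largest in the middle of the sample), so a single power transformation cannot stabilise it, and a Box-Cox transformation is not helpful here.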
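The point can be illustrated with simulated data. The sketch below does not use cangas: it builds a made-up series whose level keeps rising while the seasonal amplitude is largest in the middle block, then compares the spread per block before and after a log transformation (Box-Cox with lambda = 0). All values and variable names are assumptions for illustration.

```r
# Simulated data (not cangas): three 24-month blocks with rising level,
# but seasonal amplitude that is largest in the middle block.
level  <- rep(c(100, 200, 300), each = 24)
amp    <- rep(c(5, 60, 15), each = 24)
season <- sin(2 * pi * (1:72) / 12)
y      <- level + amp * season
block  <- rep(1:3, each = 24)

sd_raw <- tapply(y, block, sd)       # spread per block, original scale
sd_log <- tapply(log(y), block, sd)  # spread per block after a log transform

# Spread relative to the calmest block, before and after transforming
round(sd_raw / min(sd_raw), 1)
round(sd_log / min(sd_log), 1)
```

In this simulation the middle block remains by far the most variable after the transformation, because the variation is not a monotonic function of the level.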

3.3 What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?

Retail data:

retaildata <- readxl::read_excel("C:/Users/rites/Documents/GitHub/Data624_Assignment1/retail.xlsx", skip=1)

## readxl works best with a newer version of the tibble package.
## You currently have tibble v1.4.2.
## Falling back to column name repair from tibble <= v1.4.2.
## Message displays once per session.

myts <- ts(retaildata[,"A3349873A"],frequency=12, start=c(1982,4))

lmd = BoxCox.lambda(myts)
lmd

## [1] 0.1276369

myts.trans = BoxCox(myts,lmd)
combined = cbind(myts,myts.trans)
autoplot(combined, facets=TRUE) + xlab("Year") + ggtitle("myts")

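BoxCox.lambda() suggests lambda = 0.1276369, so I would use a Box-Cox transformation with that value. Because it is close to 0, the result behaves much like a log transformation.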

3.8 For your retail time series (from Exercise 3 in Section 2.10):

a. Split the data into two parts using

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)

b. Check that your data have been split appropriately by producing the following plot.

autoplot(myts) +
  autolayer(myts.train, series="Training") +
  autolayer(myts.test, series="Test")

c. Calculate forecasts using snaive applied to myts.train.

fc <- snaive(myts.train)
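The seasonal naive method sets each forecast equal to the last observed value from the same season. A minimal base R sketch of the point forecasts follows. The name snaive_manual is a hypothetical helper, not part of the forecast package, the quarterly data are made up, and the sketch assumes the training series covers at least one full seasonal cycle.

```r
# Hypothetical helper: seasonal naive point forecasts.
# Each forecast repeats the last observed value from the same season.
snaive_manual <- function(train, h) {
  m <- frequency(train)                     # seasonal period, e.g. 12 for monthly data
  last_cycle <- tail(as.numeric(train), m)  # final full seasonal cycle
  fc <- rep(last_cycle, length.out = h)     # recycle it over the horizon
  # Forecasts start one period after the end of the training data
  ts(fc, start = tsp(train)[2] + 1 / m, frequency = m)
}

# Made-up quarterly series ending in Q4 2001
toy <- ts(c(10, 20, 30, 40, 12, 22, 32, 42), start = c(2000, 1), frequency = 4)
toy_fc <- snaive_manual(toy, h = 6)
toy_fc
```

The forecasts simply cycle through the final year of the training data, so any trend in the series is ignored.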

d. Compare the accuracy of your forecasts against the actual values stored in myts.test.

accuracy(fc,myts.test)

##                     ME     RMSE      MAE       MPE      MAPE     MASE
## Training set  7.772973 20.24576 15.95676  4.702754  8.109777 1.000000
## Test set     55.300000 71.44309 55.78333 14.900996 15.082019 3.495907
##                   ACF1 Theil's U
## Training set 0.7385090        NA
## Test set     0.5315239  1.297866
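The main columns of this table can be reproduced by hand. The sketch below assumes the usual definitions of ME, RMSE, MAE, MAPE and MASE, with MASE scaled by the in-sample seasonal naive MAE. The name accuracy_manual is a hypothetical helper and the numbers are made up.

```r
# Hypothetical helper: error measures for forecasts `fc` against `actual`,
# with MASE scaled by the in-sample seasonal naive MAE of `train`.
accuracy_manual <- function(fc, actual, train, m) {
  e <- actual - fc
  n <- length(train)
  # In-sample seasonal naive errors: y[t] - y[t - m]
  scale <- mean(abs(train[(m + 1):n] - train[1:(n - m)]))
  c(ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MAPE = mean(abs(100 * e / actual)),
    MASE = mean(abs(e)) / scale)
}

train  <- c(10, 20, 30, 40, 12, 22, 32, 42)  # made-up data, period m = 4
actual <- c(16, 24, 36, 44)
fc     <- c(12, 22, 32, 42)                  # seasonal naive forecasts
acc <- accuracy_manual(fc, actual, train, m = 4)
acc
```

Under this definition the training-set MASE of a seasonal naive forecast is exactly 1. In the table above, the test-set MASE of 3.495907 is the test MAE (55.78333) divided by the training MAE (15.95676).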

e. Check the residuals.

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 624.45, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24

Do the residuals appear to be uncorrelated and normally distributed?

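The residuals are correlated with each other: the Ljung-Box test (Q* = 624.45, p-value < 2.2e-16) rejects the hypothesis of no autocorrelation. They also do not appear to be normally distributed.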
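The Ljung-Box statistic reported by checkresiduals() can also be obtained with Box.test() from base R. The sketch below uses simulated, strongly autocorrelated data (not the retail residuals) and checks the result against the textbook formula Q* = n(n+2) * sum(r_k^2 / (n-k)). The lag of 24 mirrors the output above, and all variable names are assumptions.

```r
set.seed(1)  # made-up data for illustration
noise <- rnorm(200)
# AR(1)-style series with strong positive autocorrelation
correlated <- as.numeric(stats::filter(noise, filter = 0.8, method = "recursive"))

lb <- Box.test(correlated, lag = 24, type = "Ljung-Box")
lb

# The same statistic computed by hand from the sample autocorrelations
n <- length(correlated)
r <- acf(correlated, lag.max = 24, plot = FALSE)$acf[-1]  # drop lag 0
q_manual <- n * (n + 2) * sum(r^2 / (n - seq_along(r)))
q_manual
```

A small p-value means the autocorrelations are jointly too large to be consistent with white noise, which is the situation seen in the seasonal naive residuals.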

f. How sensitive are the accuracy measures to the training/test split?

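One rough gauge of sensitivity is the ratio of the test set error to the training set error. For this split the test RMSE (71.44) is roughly 3.5 times the training RMSE (20.25), and the MASE rises from 1.00 to 3.50, so the accuracy measures look very sensitive to the training/test split.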
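A more direct check is to repeat the evaluation with several split points and see how much the test-set error moves. The sketch below does this in base R for a simulated monthly series (trend plus seasonality plus noise), not the retail data. The name snaive_rmse is a hypothetical helper and the split points are arbitrary.

```r
# Hypothetical helper: test-set RMSE of seasonal naive forecasts
# when the first `n_train` observations are used for training.
snaive_rmse <- function(y, n_train, m) {
  train <- y[1:n_train]
  test  <- y[(n_train + 1):length(y)]
  fc <- rep(tail(train, m), length.out = length(test))
  sqrt(mean((test - fc)^2))
}

set.seed(42)  # made-up monthly series: trend + seasonality + noise
n <- 240
y <- 100 + 0.5 * (1:n) + 10 * sin(2 * pi * (1:n) / 12) + rnorm(n, sd = 3)

# Try several split points and compare the resulting test RMSE
splits <- c(180, 192, 204, 216)
rmse_by_split <- sapply(splits, function(k) snaive_rmse(y, k, m = 12))
data.frame(n_train = splits, test_rmse = rmse_by_split)
```

For a trending series like this one, the test RMSE grows as the test period gets longer, because the seasonal naive forecasts stay flat from year to year while the data keep rising.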

Data624_Assignment2

Ritesh Lohiya

February 14, 2019
