#install.packages("GGally")
library(GGally)

## Loading required package: ggplot2

library(fpp2)

## Loading required package: forecast

## Loading required package: fma

## 
## Attaching package: 'fma'

## The following object is masked from 'package:GGally':
## 
##     pigs

## Loading required package: expsmooth

library(readxl)

Questions HA 3.1-3.3, 3.8

Question 1: 3.1

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance: usnetelec, usgdp, mcopper, enplanements.

Solution: A good value of λ is one which makes the size of the seasonal variation about the same across the whole series, as that makes the forecasting model simpler.
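To make the role of λ concrete, here is a minimal base-R sketch of the Box-Cox transform for positive data. The helper `box_cox()` and the toy series are made up for illustration; they are not the `BoxCox()` function from the forecast package, and the series is not one of the data sets in this question.

```r
# Box-Cox transform written out by hand for positive data (base R only).
# box_cox() is a hypothetical helper used only for this illustration.
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) {
    log(y)                     # limiting case: lambda = 0 is the natural log
  } else {
    (y^lambda - 1) / lambda    # power transform for lambda != 0
  }
}

# Made-up series: the level doubles each year and the seasonal swing
# is proportional to the level (multiplicative seasonality).
idx    <- 1:48
level  <- rep(c(10, 20, 40, 80), each = 12)
season <- 1 + 0.2 * sin(2 * pi * idx / 12)
y      <- level * season

# Ratio of the last year's spread to the first year's spread,
# before and after transforming with lambda = 0.
spread    <- function(x) diff(range(x))
ratio_raw <- spread(y[37:48]) / spread(y[1:12])
y_log     <- box_cox(y, 0)
ratio_log <- spread(y_log[37:48]) / spread(y_log[1:12])
print(c(raw = ratio_raw, log = ratio_log))
```

In this toy case the seasonal spread grows eightfold on the original scale but is the same in every year after the transformation, which is the behaviour a good λ is meant to produce.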

usnetelec

(lambda <- BoxCox.lambda(usnetelec))

## [1] 0.5167714

autoplot(BoxCox(usnetelec,lambda))

autoplot(usnetelec)

usgdp

(lambda <- BoxCox.lambda(usgdp))

## [1] 0.366352

autoplot(usgdp)

autoplot(BoxCox(usgdp,lambda))

mcopper

(lambda <- BoxCox.lambda(mcopper))

## [1] 0.1919047

autoplot(mcopper)

autoplot(BoxCox(mcopper,lambda))

enplanements

(lambda <- BoxCox.lambda(enplanements))

## [1] -0.2269461

autoplot(BoxCox(enplanements,lambda))

autoplot(enplanements)

Question 2

Why is a Box-Cox transformation unhelpful for the cangas data?

(lambda <- BoxCox.lambda(cangas))

## [1] 0.5767759

autoplot(BoxCox(cangas,lambda))

autoplot(cangas)

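The Box-Cox transformation is unhelpful here because the variance of the series does not change steadily with its level, so a single power transformation cannot even it out. The transformed series still shows clearly non-constant variance, especially towards the end.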

Question 3

What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?

retaildata <- readxl::read_excel("C:/Users/Mezu/Documents/retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"],
  frequency=12, start=c(1982,4))
autoplot(myts)

(lambda <- BoxCox.lambda(myts))

## [1] 0.1276369

autoplot(BoxCox(myts,lambda))

I will use lambda = 0.128 (0.1276369 rounded). This helps to even out the variance.

Question 3.8

For your retail time series (from Exercise 3 in Section 2.10): Split the data into two parts using

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)

Check that your data have been split appropriately by producing the following plot.

autoplot(myts) +
  autolayer(myts.train, series="Training") +
  autolayer(myts.test, series="Test")

Calculate forecasts using snaive applied to myts.train.

fc <- snaive(myts.train)
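As a rough illustration of what the seasonal naive method does, the base-R sketch below builds the point forecasts by hand: each forecast is the last observed value from the same season. The helper `snaive_by_hand()` and the toy series are hypothetical and only mirror the point forecasts, not the prediction intervals that `snaive()` also returns.

```r
# Seasonal naive point forecasts by hand (base R only).
# snaive_by_hand() is a hypothetical helper used only for this illustration.
snaive_by_hand <- function(y, h) {
  m <- frequency(y)                      # seasonal period, 12 for monthly data
  last_cycle <- tail(as.numeric(y), m)   # the final m observations
  rep(last_cycle, length.out = h)        # repeat them h steps ahead
}

# Made-up monthly series covering three years.
y <- ts(c(1:12, 11:22, 21:32), frequency = 12, start = c(2008, 1))

# Two years ahead: the last observed year, repeated twice.
fc_manual <- snaive_by_hand(y, h = 24)
print(fc_manual)
```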

Compare the accuracy of your forecasts against the actual values stored in myts.test.

accuracy(fc,myts.test)

##                     ME     RMSE      MAE       MPE      MAPE     MASE
## Training set  7.772973 20.24576 15.95676  4.702754  8.109777 1.000000
## Test set     55.300000 71.44309 55.78333 14.900996 15.082019 3.495907
##                   ACF1 Theil's U
## Training set 0.7385090        NA
## Test set     0.5315239  1.297866
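The sketch below shows, on a tiny made-up example, how several of the measures in this table can be computed by hand. The helper `accuracy_by_hand()` is hypothetical; it assumes MASE is scaled by the in-sample mean absolute error of the seasonal naive method, which is consistent with the training-set MASE of exactly 1 reported above.

```r
# A few accuracy measures by hand (base R only).
# accuracy_by_hand() is a hypothetical helper used only for this illustration.
accuracy_by_hand <- function(actual, forecast, train, m) {
  e <- actual - forecast                          # forecast errors
  scale <- mean(abs(diff(train, lag = m)))        # in-sample MAE of seasonal naive
  c(ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MAPE = mean(abs(100 * e / actual)),
    MASE = mean(abs(e)) / scale)
}

# Made-up data with seasonal period m = 2.
train    <- c(10, 20, 12, 22, 14, 24)
forecast <- c(14, 24)     # seasonal naive: repeat the last two observations
actual   <- c(17, 28)

acc <- accuracy_by_hand(actual, forecast, train, m = 2)
print(round(acc, 3))
```

A MASE above 1 on the test set means the out-of-sample errors are larger than the average in-sample seasonal naive error.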

Check the residuals.

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 624.45, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24
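For reference, the Ljung-Box statistic can be reproduced in base R with `Box.test()` from the stats package. The sketch below uses simulated data (a random walk, which is strongly autocorrelated), not the residuals from this exercise, and also computes Q* directly from the sample autocorrelations.

```r
# Ljung-Box test on simulated data (base R only).
set.seed(1)
x <- cumsum(rnorm(200))    # made-up random walk, strongly autocorrelated
n <- length(x)
lags <- 24                 # same number of lags as in the output above

# Built-in test from the stats package.
lb <- Box.test(x, lag = lags, type = "Ljung-Box")

# The same statistic by hand: Q* = n (n + 2) * sum_k r_k^2 / (n - k).
r <- acf(x, lag.max = lags, plot = FALSE)$acf[-1]   # drop the lag-0 term
q_manual <- n * (n + 2) * sum(r^2 / (n - seq_len(lags)))

print(c(Q_builtin = unname(lb$statistic), Q_manual = q_manual))
print(lb$p.value)
```

A very small p-value, as in the output above, is evidence against the residuals being uncorrelated.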

Do the residuals appear to be uncorrelated and normally distributed?

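From the plots, the residuals appear to be approximately normally distributed. However, they are not uncorrelated: the Ljung-Box test gives Q* = 624.45 with a p-value below 2.2e-16, which indicates significant autocorrelation.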

How sensitive are the accuracy measures to the training/test split?

The time plot of the residuals shows that their variation follows a pattern and is not constant. The same is true of the ACF plot of the residuals. However, the distribution of the residuals is approximately normal. There is likely some bias, since the mean of the residuals is not close to zero (the training-set ME is 7.77). The point forecasts might not be good, but prediction intervals that assume normally distributed residuals should be reasonable.
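One way to explore the sensitivity question directly is to repeat the split at several cut-off years and compare the test-set error each time. The sketch below does this in base R with the built-in AirPassengers series as a stand-in, since the retail file is not available here; the cut-off years and the two-year test window are assumptions made for illustration, and the numbers it prints say nothing about the retail series.

```r
# Test-set RMSE of seasonal naive forecasts for several training cut-offs.
# AirPassengers (monthly, 1949-1960) is a stand-in for the retail series.
y <- AirPassengers
m <- frequency(y)
end_years <- 1955:1958     # assumed cut-offs, chosen only for illustration

split_rmse <- sapply(end_years, function(end_year) {
  train <- window(y, end = c(end_year, 12))
  test  <- window(y, start = c(end_year + 1, 1), end = c(end_year + 2, 12))
  # Seasonal naive: repeat the last observed year over the test period.
  fc <- rep(tail(as.numeric(train), m), length.out = length(test))
  sqrt(mean((as.numeric(test) - fc)^2))
})
names(split_rmse) <- end_years
print(round(split_rmse, 2))
```

If the RMSE values differ a lot across cut-offs, the accuracy measures are sensitive to where the data are split.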

Homework2

Nnaemezue Obi-Eyisi

February 12, 2019
