Question 3.1

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance:

usnetelec
usgdp
mcopper
enplanements

library(fpp3)

## ── Attaching packages ──────────────────────────────── fpp3 0.3 ──

## ✓ tibble      3.0.3     ✓ tsibble     0.9.2
## ✓ dplyr       1.0.2     ✓ tsibbledata 0.2.0
## ✓ tidyr       1.1.2     ✓ feasts      0.1.5
## ✓ lubridate   1.7.9     ✓ fable       0.2.1
## ✓ ggplot2     3.3.2

## ── Conflicts ─────────────────────────────────── fpp3_conflicts ──
## x lubridate::date()   masks base::date()
## x dplyr::filter()     masks stats::filter()
## x tsibble::interval() masks lubridate::interval()
## x dplyr::lag()        masks stats::lag()

library(fpp2)

## Loading required package: forecast

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

## Loading required package: fma

## Loading required package: expsmooth

usnetelec

autoplot(usnetelec)

lambda <- BoxCox.lambda(usnetelec)
lambda  #0.5167714

## [1] 0.5167714

autoplot(BoxCox(usnetelec,lambda))

usgdp

autoplot(usgdp)

lambda <- BoxCox.lambda(usgdp)
lambda  #0.366352

## [1] 0.366352

autoplot(BoxCox(usgdp,lambda))

mcopper

autoplot(mcopper)

lambda <- BoxCox.lambda(mcopper)
lambda  #0.1919047

## [1] 0.1919047

autoplot(BoxCox(mcopper,lambda))

enplanements

autoplot(enplanements)

lambda <- BoxCox.lambda(enplanements)
lambda  # -0.2269461

## [1] -0.2269461

autoplot(BoxCox(enplanements,lambda))
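For reference, the transformation applied above can be written out directly for strictly positive data. The following is a minimal base-R sketch of the Box-Cox formula and its inverse, not the forecast package's own implementation; `box_cox` and `inv_box_cox` are hypothetical helper names introduced only for illustration.

```r
# Box-Cox transformation for strictly positive data (hypothetical helper,
# written out for illustration; the analysis itself uses forecast::BoxCox()).
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) {
    log(y)                     # limiting case as lambda -> 0
  } else {
    (y^lambda - 1) / lambda    # power transformation
  }
}

# Inverse transformation, mapping values back to the original scale.
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) {
    exp(w)
  } else {
    (lambda * w + 1)^(1 / lambda)
  }
}

y <- c(10, 100, 1000)
w <- box_cox(y, lambda = 0.5)
all.equal(inv_box_cox(w, lambda = 0.5), y)   # round trip back to y
```

A λ near 1 leaves the shape of the series essentially unchanged, while a λ near 0 behaves like a log transformation, which is why the smaller estimates above compress the larger values more strongly.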

Question 3.2

Why is a Box-Cox transformation unhelpful for the cangas data?

autoplot(cangas)

lambda <- BoxCox.lambda(cangas)
lambda  # 0.5767759

## [1] 0.5767759

autoplot(BoxCox(cangas,lambda))

The seasonal variation in the cangas series is not made uniform by the Box-Cox transformation. A good value of λ is one that makes the size of the seasonal variation about the same across the whole series, and the transformed cangas plot (λ ≈ 0.577) does not show this. Applying the Box-Cox transformation therefore does not improve the time series.
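One rough way to put a number on "about the same size across the whole series" is to compare the spread of the values within each calendar year before and after transforming. The sketch below uses only base R; `yearly_spread` and `spread_ratio` are hypothetical helpers, the within-year standard deviation is only a crude proxy for seasonal variation (it also picks up trend within the year), and the built-in AirPassengers series is used as a stand-in so the code runs without extra packages.

```r
# Standard deviation of the values within each calendar year, as a rough
# proxy for the size of the seasonal variation (hypothetical helper).
yearly_spread <- function(y) {
  yr <- floor(as.numeric(time(y)) + 1e-8)   # calendar year of each observation
  tapply(as.numeric(y), yr, sd)
}

# Ratio of the largest to the smallest yearly spread: values closer to 1
# mean the variation is more even across the series.
spread_ratio <- function(y) {
  s <- yearly_spread(y)
  s <- s[!is.na(s)]                          # drop years with one observation
  max(s) / min(s)
}

# Stand-in series (assumption): AirPassengers, raw and log-transformed.
spread_ratio(AirPassengers)
spread_ratio(log(AirPassengers))
```

With the fpp2 packages loaded, the same comparison could be made between `cangas` and `BoxCox(cangas, lambda)`; a ratio that stays well above 1 after the transformation would be consistent with the conclusion above.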

Question 3.3

What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?

# retaildata is assumed to have been read in already, as in Exercise 3 of Section 2.10
myts <- ts(retaildata[,"A3349413L"],
  frequency=12, start=c(1982,4))

autoplot(myts)

lambda <- BoxCox.lambda(myts)
lambda  # 0.1606171

## [1] 0.1606171

autoplot(BoxCox(myts,lambda))

Question 3.8

For your retail time series (from Exercise 3 in Section 2.10):

Split the data into two parts using

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)

Check that your data have been split appropriately by producing the following plot.

autoplot(myts) +
  autolayer(myts.train, series="Training") +
  autolayer(myts.test, series="Test")

Calculate forecasts using snaive applied to myts.train.

fc <- snaive(myts.train)

Compare the accuracy of your forecasts against the actual values stored in myts.test.

accuracy(fc,myts.test)

##                     ME     RMSE      MAE       MPE      MAPE     MASE      ACF1
## Training set  6.702703 18.68344 14.29249  4.111808 10.115102 1.000000 0.7640962
## Test set     -6.304167 24.55469 18.66250 -2.918671  7.691228 1.305755 0.6891488
##              Theil's U
## Training set        NA
## Test set      1.019696
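To make the columns of this table concrete, the scale-dependent and percentage measures can be computed by hand from the actual values and the point forecasts. This is a minimal sketch using the usual definitions; `accuracy_measures` is a hypothetical helper and the numbers passed to it are toy values, not the retail data.

```r
# Forecast accuracy measures from actual values and point forecasts
# (hypothetical helper using the usual ME/RMSE/MAE/MPE/MAPE definitions).
accuracy_measures <- function(actual, forecast) {
  e  <- actual - forecast      # forecast errors
  pe <- 100 * e / actual       # percentage errors
  c(ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MPE  = mean(pe),
    MAPE = mean(abs(pe)))
}

# Toy numbers, purely illustrative
accuracy_measures(actual = c(100, 200, 400), forecast = c(110, 190, 380))
```

With the objects from this exercise, the corresponding inputs would be the values of myts.test and the point forecasts in fc$mean, restricted to the months both cover.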

Check the residuals. Do the residuals appear to be uncorrelated and normally distributed?

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 749.15, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24
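The Q* statistic reported here can be reproduced by hand, which helps in reading the test. The sketch below is a base-R illustration under two assumptions: the built-in AirPassengers series stands in for the retail series, and `ljung_box_q` is a hypothetical helper. Seasonal naive residuals are simply the seasonal differences of the series, and the statistic is compared against `Box.test()` from the stats package.

```r
# Seasonal naive residuals are y[t] - y[t - m]; diff() with lag = m gives
# exactly these. AirPassengers is a stand-in series (assumption).
y   <- AirPassengers
m   <- frequency(y)
res <- as.numeric(diff(y, lag = m))

# Ljung-Box statistic: Q* = n (n + 2) * sum_{k = 1..h} r_k^2 / (n - k),
# where r_k is the lag-k sample autocorrelation (hypothetical helper).
ljung_box_q <- function(x, h) {
  n <- length(x)
  r <- acf(x, lag.max = h, plot = FALSE)$acf[-1]   # drop lag 0
  n * (n + 2) * sum(r^2 / (n - seq_len(h)))
}

h <- 2 * m                                  # 24 lags for monthly data
q_manual  <- ljung_box_q(res, h)
q_builtin <- Box.test(res, lag = h, type = "Ljung-Box")

q_manual
q_builtin
```

A large Q* with a very small p-value, as in the output above, is evidence against the hypothesis that the residuals are uncorrelated.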

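The residuals do not appear to be uncorrelated: the Ljung-Box test strongly rejects the hypothesis of no autocorrelation (Q* = 749.15, p-value < 2.2e-16), and the lag-1 autocorrelation of the training residuals reported by accuracy() is 0.76. Per the histogram, they also do not appear to be normally distributed.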

A good forecasting method will yield residuals with the following properties:

The residuals are uncorrelated. If there are correlations between residuals, then there is information left in the residuals which should be used in computing forecasts.

The residuals have zero mean. If the residuals have a mean other than zero, then the forecasts are biased.

How sensitive are the accuracy measures to the training/test split?

myts2 <- window(myts, start=2000, end=c(2010,12))
mytsfc1 <- meanf(myts2, h=40)
mytsfc2 <- rwf(myts2, h=40)
mytsfc3 <- rwf(myts2, drift=TRUE, h=40)

autoplot(window(myts, end=c(2010,12))) +
  autolayer(mytsfc1, PI=FALSE, series="Mean") +
  autolayer(mytsfc2, PI=FALSE, series="Naïve") +
  autolayer(mytsfc3, PI=FALSE, series="Drift") +
  guides(colour=guide_legend(title="Forecasts"))
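One direct way to probe sensitivity to the split is to move the end of the training period and recompute a test-set accuracy measure each time. The following is a minimal base-R sketch of that idea, not the analysis performed above: `snaive_test_rmse` is a hypothetical helper, it assumes the training period ends in the last period of a calendar year, and the built-in AirPassengers series stands in for myts.

```r
# Test-set RMSE of a seasonal naive forecast for a given last training year
# (hypothetical helper, base R only).
snaive_test_rmse <- function(y, last_train_year, h = 24) {
  m     <- frequency(y)
  train <- window(y, end = c(last_train_year, m))
  test  <- window(y, start = c(last_train_year + 1, 1))
  h     <- min(h, length(test))
  # Seasonal naive: repeat the last m training observations
  fc <- rep(tail(as.numeric(train), m), length.out = h)
  sqrt(mean((as.numeric(test)[seq_len(h)] - fc)^2))
}

# AirPassengers (1949-1960) stands in for myts here (assumption)
split_years   <- 1954:1958
rmse_by_split <- sapply(split_years,
                        function(yr) snaive_test_rmse(AirPassengers, yr))
names(rmse_by_split) <- split_years
rmse_by_split
```

If the RMSE values differ substantially from one split year to the next, the accuracy measures are sensitive to where the series is cut, and a single training/test split should be interpreted with caution.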

DATA624 HW2

Christina Kasman

9/9/2020