Assignment 2

Do exercises 3.1, 3.2, 3.3 and 3.8 from the online Hyndman book. Please include your Rpubs link along with your .rmd file.

3.1

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.

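library(fpp2)       # loads forecast, ggplot2 and the example datasets
library(gridExtra)  # grid.arrange()

# Report the estimated Box-Cox lambda for a series and plot the series
# before and after the transformation.
exercise1 <- function(data, name) {
  lambda <- BoxCox.lambda(data)  # estimate lambda once and reuse it

  return1 <- paste0("The boxcox transformation lambda value for the ", name,
                    " dataset is       ", round(lambda, 2))

  # Original series
  p1 <- autoplot(data) +
    theme_bw() +
    labs(title = name, y = name)

  # Box-Cox transformed series
  p2 <- autoplot(BoxCox(data, lambda = lambda)) +
    theme_bw() +
    labs(title = paste0("BoxCox transformed ", name), y = name)

  # Stack the two plots vertically
  return2 <- grid.arrange(p1, p2, nrow = 2)

  return(list(return1, return2))
}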
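For reference, the transformation that BoxCox applies to a positive series can be sketched in base R. This is a minimal illustration, not the forecast package's source; the names box_cox_sketch and inv_box_cox_sketch are hypothetical and the input values are made up.

```r
# Box-Cox transformation for a positive series (illustrative sketch).
box_cox_sketch <- function(x, lambda) {
  if (lambda == 0) {
    log(x)                      # limiting case as lambda approaches 0
  } else {
    (x^lambda - 1) / lambda     # power transformation, shifted and scaled
  }
}

# Inverse transformation, to return to the original scale.
inv_box_cox_sketch <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

x <- c(10, 100, 1000)                       # made-up positive values
w <- box_cox_sketch(x, lambda = 0.5)        # transformed values
round_trip <- inv_box_cox_sketch(w, lambda = 0.5)
```

A lambda near 1 leaves the shape of the series essentially unchanged, while a lambda near 0 behaves like a log transformation, which is why the smaller estimates below compress large values more strongly.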

exercise1(usnetelec,"usnetelec")

## [[1]]
## [1] "The boxcox transformation lambda value for the usnetelec dataset is       0.52"
## 
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]

exercise1(usgdp,"usgdp")

## [[1]]
## [1] "The boxcox transformation lambda value for the usgdp dataset is       0.37"
## 
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]

exercise1(mcopper,"mcopper")

## [[1]]
## [1] "The boxcox transformation lambda value for the mcopper dataset is       0.19"
## 
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]

exercise1(enplanements,"enplanements")

## [[1]]
## [1] "The boxcox transformation lambda value for the enplanements dataset is       -0.23"
## 
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]

3.2

Why is a Box-Cox transformation unhelpful for the cangas data?

As shown below, a Box-Cox transformation is unhelpful for the cangas data because the magnitude of the seasonal variation does not change steadily over time. From about 1980 to 1990 the magnitude increases, but it dampens thereafter. A Box-Cox transformation applies a single power to the whole series, so one lambda cannot stabilise both of these changes in variance.

exercise1(cangas,"cangas")

## [[1]]
## [1] "The boxcox transformation lambda value for the cangas dataset is       0.58"
## 
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]
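One way to check the changing seasonal magnitude numerically is to compute the within-year range of the series. The following is a minimal base-R sketch; the helper name yearly_range is hypothetical, and the built-in AirPassengers series stands in for cangas so that the block runs without extra packages.

```r
# Within-year range (max minus min) of a regular time series, one value per year.
yearly_range <- function(x) {
  m <- frequency(x)
  # Calendar year of each observation, derived from the start year and period.
  year <- start(x)[1] + (seq_along(x) + start(x)[2] - 2) %/% m
  tapply(as.numeric(x), year, function(v) diff(range(v)))
}

# AirPassengers is used only as a stand-in dataset.
r <- yearly_range(AirPassengers)
print(r)
# With fpp2 loaded, the same check would be: yearly_range(cangas)
```

If the yearly ranges first grow and then shrink, as the plot suggests for cangas, no single power transformation can be expected to equalise them.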

3.3

What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?

As shown below, a lambda of 0.13 could be used to transform the Australian retail sales data.

library(httr)
url1<-"https://otexts.com/fpp2/extrafiles/retail.xlsx"
GET(url1, write_disk(tf <- tempfile(fileext = ".xlsx")))

## Response [https://otexts.com/fpp2/extrafiles/retail.xlsx]
##   Date: 2021-02-20 02:56
##   Status: 200
##   Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
##   Size: 639 kB
## <ON DISK>  C:\Users\REGIST~1\AppData\Local\Temp\RtmpSQXLVH\fileb5041ca262ef.xlsx

retaildata  <- readxl::read_excel(tf, skip = 1)


AustrailianSales <- ts(retaildata[,"A3349873A"],
  frequency=12, start=c(1982,4))


exercise1(AustrailianSales,"AustrailianSales")

## [[1]]
## [1] "The boxcox transformation lambda value for the AustrailianSales dataset is       0.13"
## 
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]

3.8

a. Split the data into two parts using

myts <- AustrailianSales
myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)

b. Check that your data have been split appropriately by producing the following plot.

As shown below, the training and test portions of the Australian retail sales data have been split correctly for the forecasting model.

autoplot(myts) +
  autolayer(myts.train, series="Training") +
  autolayer(myts.test, series="Test")

c. Calculate forecasts using snaive applied to myts.train

The snaive function applies the seasonal naive method, which sets each forecast equal to the last observed value from the same season (here, the same month of the previous year).

fc <- snaive(myts.train)
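The point forecasts of the seasonal naive method can be sketched in base R as follows. This is an illustration under stated assumptions, not the forecast package's implementation: the name snaive_sketch is hypothetical, prediction intervals are omitted, and the built-in AirPassengers series stands in for the retail data.

```r
# Seasonal naive point forecasts: repeat the last observed seasonal cycle.
snaive_sketch <- function(x, h = 2 * frequency(x)) {
  m <- frequency(x)
  last_cycle <- as.numeric(tail(x, m))   # the last m observations
  rep(last_cycle, length.out = h)        # forecasts for h steps ahead
}

# AirPassengers (built into R) is used only as a stand-in series.
fc_sketch <- snaive_sketch(AirPassengers, h = 24)
print(fc_sketch)
```

Each forecast month simply copies the same month from the final year of the training data, so the method reproduces the seasonal pattern but adds no trend.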

d. Compare the accuracy of your forecasts against the actual values stored in myts.test.

The forecasts have an RMSE of about 20.2 on the training set and 71.4 on the test set, so they are considerably less accurate out of sample.

accuracy(fc,myts.test)

##                     ME     RMSE      MAE       MPE      MAPE     MASE      ACF1
## Training set  7.772973 20.24576 15.95676  4.702754  8.109777 1.000000 0.7385090
## Test set     55.300000 71.44309 55.78333 14.900996 15.082019 3.495907 0.5315239
##              Theil's U
## Training set        NA
## Test set      1.297866
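The main measures in the table can be reproduced by hand. The sketch below assumes the usual definitions, with MASE scaled by the in-sample seasonal naive mean absolute error; the function name accuracy_sketch and the small input vectors are hypothetical and chosen only to make the arithmetic easy to follow.

```r
# Point-forecast accuracy measures for forecasts against actual values.
# `train` and the seasonal period `m` are needed only for the MASE scaling.
accuracy_sketch <- function(actual, forecast, train, m) {
  e <- actual - forecast
  scale <- mean(abs(diff(train, lag = m)))   # in-sample seasonal naive MAE
  c(ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MAPE = mean(abs(100 * e / actual)),
    MASE = mean(abs(e)) / scale)
}

# Made-up numbers: errors are 1, 0 and -2.
acc <- accuracy_sketch(actual = c(3, 5, 7), forecast = c(2, 5, 9),
                       train = c(1, 2, 3, 4, 5, 6), m = 2)
print(acc)
```

Under this definition the seasonal naive method scores a MASE of exactly 1 on its own training data, which matches the table above. The test-set MASE of about 3.5 means the test errors are roughly three and a half times the typical in-sample error.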

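e. Check the residuals.

The residuals do not satisfy the usual requirements for a good forecasting method. The Ljung-Box test below (Q* = 624.45, df = 24, p-value < 2.2e-16) rejects the hypothesis that the residuals are uncorrelated, and the accuracy table above shows a lag-1 autocorrelation of about 0.74 and a mean error of about 7.8 on the training set, so the residuals are neither white noise nor centred on zero.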

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 624.45, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24
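The Ljung-Box part of this check can be approximated in base R with Box.test. The sketch below assumes that the seasonal naive residuals equal the seasonal differences of the series, and it uses the built-in AirPassengers series as a stand-in for the retail data; it does not reproduce the plots that checkresiduals draws.

```r
# Residuals of a seasonal naive fit are the seasonal differences y[t] - y[t - m].
x <- AirPassengers                  # stand-in series built into R
res <- diff(x, lag = frequency(x))

# Ljung-Box test over 24 lags; no parameters were estimated, so fitdf = 0.
lb <- Box.test(res, lag = 24, type = "Ljung-Box", fitdf = 0)
print(lb)
```

A very small p-value, as in the output above, indicates that the residuals still contain autocorrelation that the method has not captured.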

f. How sensitive are the accuracy measures to the training/test split?

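The accuracy measures are sensitive to the training/test split. The seasonal naive method repeats the last observed value from the same season, so it captures the seasonal pattern but not the trend. Its test-set errors therefore depend on how much the series changes after the split point and on how long the test period is.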
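One way to examine this sensitivity is to repeat the evaluation for several split points. The sketch below is a base-R illustration with a hypothetical helper, split_rmse, that recomputes the seasonal naive test-set RMSE for a given last training year; the built-in AirPassengers series stands in for the retail data, and the chosen years are arbitrary.

```r
# Test-set RMSE of seasonal naive forecasts for a given last training year.
split_rmse <- function(x, last_train_year) {
  m <- frequency(x)
  train <- window(x, end = c(last_train_year, m))
  test  <- window(x, start = c(last_train_year + 1, 1))
  # Repeat the last training cycle over the whole test period.
  fc <- rep(as.numeric(tail(train, m)), length.out = length(test))
  sqrt(mean((as.numeric(test) - fc)^2))
}

# AirPassengers (1949 to 1960) is used only as a stand-in series.
years <- 1955:1958
rmse_by_split <- sapply(years, split_rmse, x = AirPassengers)
names(rmse_by_split) <- years
print(round(rmse_by_split, 1))
```

Comparing the values across split years gives a rough sense of how much the reported accuracy depends on where the series is cut.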