Do exercises 3.1, 3.2, 3.3 and 3.8 from the online Hyndman book. Please include your Rpubs link along with your .rmd file.
For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.
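For reference, the transformation that BoxCox() applies to a positive series can be written as below. This is the standard one-parameter form; the symbol $w_t$ for the transformed series is a notational assumption and does not come from the exercise text.

```latex
w_t =
\begin{cases}
  \log(y_t), & \lambda = 0, \\[4pt]
  \dfrac{y_t^{\lambda} - 1}{\lambda}, & \lambda \neq 0.
\end{cases}
```

With $\lambda = 1$ the series is only shifted down by one, so its shape is unchanged; smaller values of $\lambda$ compress large observations more strongly.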
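library(fpp2)       # assumed: attaches forecast, ggplot2 and the example datasets
library(gridExtra)  # provides grid.arrange()

# Report the lambda chosen by BoxCox.lambda() and plot the series
# before and after the Box-Cox transformation.
exercise1 <- function(data, name) {
  lambda <- BoxCox.lambda(data)  # estimate lambda once and reuse it
  return1 <- paste0("The boxcox transformation lambda value for the ", name,
                    " dataset is ", round(lambda, 2))
  p1 <- autoplot(data) +
    theme_bw() +
    labs(title = name, y = name)
  p2 <- autoplot(BoxCox(data, lambda = lambda)) +
    theme_bw() +
    labs(title = paste0("BoxCox transformed ", name), y = name)
  return2 <- grid.arrange(p1, p2, nrow = 2)  # original above, transformed below
  return(list(return1, return2))
}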
exercise1(usnetelec,"usnetelec")
## [[1]]
## [1] "The boxcox transformation lambda value for the usnetelec dataset is 0.52"
##
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]
exercise1(usgdp,"usgdp")
## [[1]]
## [1] "The boxcox transformation lambda value for the usgdp dataset is 0.37"
##
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]
exercise1(mcopper,"mcopper")
## [[1]]
## [1] "The boxcox transformation lambda value for the mcopper dataset is 0.19"
##
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]
exercise1(enplanements,"enplanements")
## [[1]]
## [1] "The boxcox transformation lambda value for the enplanements dataset is -0.23"
##
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]
Why is a Box-Cox transformation unhelpful for the cangas data?
As shown below, a Box-Cox transformation is unhelpful for the cangas dataset because the magnitude of the seasonal variation does not change in a consistent direction. From 1980 to 1990 the magnitude increases, but it dampens thereafter. A Box-Cox transformation applies a single power (lambda) to the whole series, so it cannot stabilise a variance that first grows and then shrinks.
exercise1(cangas,"cangas")
## [[1]]
## [1] "The boxcox transformation lambda value for the cangas dataset is 0.58"
##
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]
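The argument about cangas can be illustrated with a small base R sketch. The series below is synthetic (an assumption made purely for illustration, not the cangas data): its seasonal amplitude grows, peaks and then shrinks while the level keeps rising. The helper names `box_cox` and `spread_ratio` are hypothetical.

```r
# Synthetic monthly series (an illustrative assumption, NOT the cangas data):
# the level rises steadily while the seasonal amplitude grows, peaks, and shrinks.
n_years <- 30
t <- seq_len(12 * n_years)
year <- rep(seq_len(n_years), each = 12)
amplitude <- 1 + 4 * exp(-((year - 15) / 5)^2)
y <- 20 + 0.05 * t + amplitude * sin(2 * pi * t / 12)

# One-parameter Box-Cox transformation for a positive series.
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Ratio of the largest to the smallest within-year standard deviation;
# a value near 1 would mean the seasonal spread has been stabilised.
spread_ratio <- function(x) {
  yearly_sd <- tapply(x, year, sd)
  max(yearly_sd) / min(yearly_sd)
}

lambdas <- c(-1, 0, 0.5, 1)
ratios <- sapply(lambdas, function(l) spread_ratio(box_cox(y, l)))
names(ratios) <- lambdas
round(ratios, 2)
```

For this synthetic series, none of the lambdas tried brings the ratio close to 1, which is the behaviour described above for cangas.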
What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?
As shown below, a lambda of 0.13 could be used to transform the Australian retail sales data.
library(httr)
url1<-"https://otexts.com/fpp2/extrafiles/retail.xlsx"
GET(url1, write_disk(tf <- tempfile(fileext = ".xlsx")))
## Response [https://otexts.com/fpp2/extrafiles/retail.xlsx]
## Date: 2021-02-20 02:56
## Status: 200
## Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
## Size: 639 kB
## <ON DISK> C:\Users\REGIST~1\AppData\Local\Temp\RtmpSQXLVH\fileb5041ca262ef.xlsx
retaildata <- readxl::read_excel(tf, skip = 1)
AustrailianSales <- ts(retaildata[,"A3349873A"],
frequency=12, start=c(1982,4))
exercise1(AustrailianSales,"AustrailianSales")
## [[1]]
## [1] "The boxcox transformation lambda value for the AustrailianSales dataset is 0.13"
##
## [[2]]
## TableGrob (2 x 1) "arrange": 2 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (2-2,1-1) arrange gtable[layout]
a. Split the data into two parts using
myts <- AustrailianSales
myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)
As shown below, the training and test portions of the Australian retail sales data have been split correctly for the forecasting model.
autoplot(myts) +
autolayer(myts.train, series="Training") +
autolayer(myts.test, series="Test")
Using the function snaive, we apply the seasonal naive forecasting method, which forecasts each future month with the last observed value from the same month of the previous year.
fc <- snaive(myts.train)
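To make explicit what the seasonal naive method does, here is a minimal base R sketch. It uses the built-in AirPassengers series as a stand-in (an assumption, since the retail spreadsheet may not be at hand), and `snaive_by_hand` is a hypothetical helper, not a function from the forecast package.

```r
# Seasonal naive forecast written out by hand: repeat the last full
# season of the training data for as many steps as requested.
snaive_by_hand <- function(train, h) {
  m <- frequency(train)                        # seasonal period, 12 for monthly data
  last_season <- tail(as.numeric(train), m)    # last observed year
  point_forecast <- rep(last_season, length.out = h)
  ts(point_forecast, start = tsp(train)[2] + 1 / m, frequency = m)
}

# Stand-in data (assumption): hold out the last two years of AirPassengers.
train <- window(AirPassengers, end = c(1958, 12))
fc_hand <- snaive_by_hand(train, h = 24)
fc_hand
```

Every forecast year is a copy of 1958 here, which is why the method follows the seasonal pattern but ignores any trend.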
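The accuracy output shows an RMSE of about 20.2 on the training set and 71.4 on the test set, and a test-set MASE of about 3.5, so the forecasts are far less accurate out of sample than in sample.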
accuracy(fc,myts.test)
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 7.772973 20.24576 15.95676 4.702754 8.109777 1.000000 0.7385090
## Test set 55.300000 71.44309 55.78333 14.900996 15.082019 3.495907 0.5315239
## Theil's U
## Training set NA
## Test set 1.297866
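The error measures in this table can be written out in a few lines of base R. The sketch below assumes that MASE for a seasonal series is scaled by the in-sample MAE of the seasonal naive method, which is consistent with the training-set MASE of exactly 1 above; the function names are hypothetical.

```r
# Error measures written out by hand (base R only).
rmse <- function(actual, forecast) sqrt(mean((actual - forecast)^2))
mae  <- function(actual, forecast) mean(abs(actual - forecast))

# Scaling factor for MASE: in-sample MAE of the seasonal naive method,
# i.e. the mean absolute difference between values m periods apart.
seasonal_naive_scale <- function(train, m) {
  mean(abs(diff(as.numeric(train), lag = m)))
}

# Cross-check using the figures printed by accuracy():
mae_train <- 15.95676   # Training set MAE
mae_test  <- 55.78333   # Test set MAE
mase_test <- mae_test / mae_train
round(mase_test, 4)
```

Dividing the test MAE by the training MAE reproduces the reported test-set MASE to rounding, so a value of about 3.5 means the out-of-sample errors are roughly three and a half times the in-sample errors.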
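The residuals do not satisfy the usual requirements for a good forecasting method. The Ljung-Box test below strongly rejects the hypothesis that they are uncorrelated (Q* = 624.45, df = 24, p-value < 2.2e-16), the lag-1 autocorrelation of the training errors is 0.74, and their mean (ME = 7.77) is not zero, so the forecasts are biased. Normality of the errors cannot be assumed either and should be judged from the histogram that checkresiduals draws.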
checkresiduals(fc)
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 624.45, df = 24, p-value < 2.2e-16
##
## Model df: 0. Total lags used: 24
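The same kind of test can be run directly with Box.test from base R. The sketch below uses the built-in AirPassengers series as a stand-in (an assumption), and takes the lag-12 differences as the seasonal naive residuals; it is not a reproduction of the retail result above.

```r
# Stand-in data (an assumption): the built-in AirPassengers series.
# Residuals of a seasonal naive fit are the lag-12 differences of the series.
res <- diff(AirPassengers, lag = 12)

# Ljung-Box test over the first 24 lags; the null hypothesis is that
# the residuals are uncorrelated (white noise).
lb <- Box.test(res, lag = 24, type = "Ljung-Box")
lb$statistic   # Q* statistic
lb$p.value     # a small p-value is evidence of remaining autocorrelation
```

A small p-value, as in the retail output above, means the residuals still contain structure that the method has not captured.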
The accuracy measures are very sensitive to the training/test split. The seasonal naive method does not model the trend and simply repeats the last observed year of the training set, so the forecast errors depend heavily on where the training set ends.
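One way to check this sensitivity is to move the split point and recompute the test RMSE each time. The sketch below does this in base R on the built-in AirPassengers series (a stand-in chosen as an assumption); `snaive_rmse` is a hypothetical helper.

```r
# Test-set RMSE of a seasonal naive forecast for a given training end year.
snaive_rmse <- function(y, train_end_year, h = 12) {
  m <- frequency(y)
  train <- window(y, end = c(train_end_year, m))
  test <- window(y, start = c(train_end_year + 1, 1),
                 end = c(train_end_year + 1, h))
  # Forecast = last observed season of the training data, repeated.
  point_forecast <- rep(tail(as.numeric(train), m), length.out = length(test))
  sqrt(mean((as.numeric(test) - point_forecast)^2))
}

# Stand-in data (assumption): AirPassengers, one-year test sets.
split_years <- 1952:1959
rmse_by_split <- sapply(split_years, function(yr) snaive_rmse(AirPassengers, yr))
names(rmse_by_split) <- split_years
round(rmse_by_split, 1)
```

If the RMSE values differ noticeably from one split year to the next, the accuracy measures are sensitive to the choice of split; the same loop could be applied to myts.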