Loading Library
load("workspace.RData")
## Registered S3 methods overwritten by 'ggplot2':
##   method         from
##   [.quosures     rlang
##   c.quosures     rlang
##   print.quosures rlang
## Registered S3 method overwritten by 'xts':
##   method     from
##   as.zoo.xts zoo
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## Registered S3 methods overwritten by 'forecast':
##   method             from
##   fitted.fracdiff    fracdiff
##   residuals.fracdiff fracdiff
library(fpp2)
## Warning: package 'fpp2' was built under R version 3.6.1
## Loading required package: ggplot2
## Loading required package: forecast
## Warning: package 'forecast' was built under R version 3.6.1
## Loading required package: fma
## Warning: package 'fma' was built under R version 3.6.1
## Loading required package: expsmooth
## Warning: package 'expsmooth' was built under R version 3.6.1
Converting to a time series
df <- read.csv("Breakfast.csv", header = TRUE)
cbf.ts <- ts(df[,"Continental.B.F"], frequency = 7)   # daily data with a weekly (7-day) seasonal cycle
str(cbf.ts)
## Time-Series [1:115] from 1 to 17.3: 25 25 25 35 41 30 40 40 40 40 ...
Visualizing
autoplot(cbf.ts)+
ggtitle("Number of People Ordering Continental BF") +
xlab("Week") +
ylab("Number of People")
Calculating the ACF and PACF
acf(cbf.ts)
pacf(cbf.ts)
Based on the ACF and PACF plots, the series appears to follow an MA(1) process.
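As a quick sanity check (a sketch; output not shown here), an MA(1) model could be fitted directly with Arima() from the forecast package:
fit_ma1 <- Arima(cbf.ts, order = c(0, 0, 1))   # candidate MA(1) model
summary(fit_ma1)                               # estimated coefficient and in-sample accuracy
checkresiduals(fit_ma1)                        # residuals should resemble white noise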
The following models are fitted and compared on a hold-out test set:
train <- window(cbf.ts, end = c(12, 5))   # training set: up to day 5 of week 12
h <- length(cbf.ts) - length(train)       # forecast horizon = size of the hold-out test set
ETS <- forecast(ets(train), h = h)
ARIMA <- forecast(auto.arima(train, lambda = 0, biasadj = TRUE), h = h)
STL <- stlf(train, lambda = 0, h = h, biasadj = TRUE)
NNAR <- forecast(nnetar(train), h = h)
TBATS <- forecast(tbats(train, biasadj = TRUE), h = h)
Combination <- (ETS[["mean"]] + ARIMA[["mean"]] +
  STL[["mean"]] + NNAR[["mean"]] + TBATS[["mean"]]) / 5   # simple average of the point forecasts
autoplot(cbf.ts) +
autolayer(ETS, series="ETS", PI=FALSE) +
autolayer(ARIMA, series="ARIMA", PI=FALSE) +
autolayer(STL, series="STL", PI=FALSE) +
autolayer(NNAR, series="NNAR", PI=FALSE) +
autolayer(TBATS, series="TBATS", PI=FALSE) +
autolayer(Combination, series="Combination") +
xlab("week") + ylab("Number of People Ordering Continental Breakfast") +
ggtitle("Number of People Ordering Continental Breakfast")
c(ETS = accuracy(ETS, cbf.ts)["Test set","MAPE"],
ARIMA = accuracy(ARIMA, cbf.ts)["Test set","MAPE"],
`STL-ETS` = accuracy(STL, cbf.ts)["Test set","MAPE"],
NNAR = accuracy(NNAR, cbf.ts)["Test set","MAPE"],
TBATS = accuracy(TBATS, cbf.ts)["Test set","MAPE"],
Combination =
accuracy(Combination, cbf.ts)["Test set","MAPE"])
## ETS ARIMA STL-ETS NNAR TBATS Combination
## 10.853219 9.739436 12.364599 10.220758 12.131538 10.714648
As can be seen, the ARIMA model performs best, with a test-set MAPE of about 9.74%.
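Before settling on ARIMA, its residuals could also be inspected for remaining autocorrelation (a sketch; output not shown here):
ARIMA[["model"]]        # the specification selected by auto.arima()
checkresiduals(ARIMA)   # residual ACF and Ljung-Box test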
Checking the demand series for stationarity.
library(urca)
## Warning: package 'urca' was built under R version 3.6.1
Let's apply the KPSS unit root test to see whether differencing is required.
cbf.ts %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 0.1962
##
## Critical value for a significance level of:
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
The test statistic is 0.1962, which is below all of the critical values (e.g. 0.463 at the 5% level), so we fail to reject the null hypothesis of stationarity: the series is stationary and no differencing is required.
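The same conclusion can be cross-checked with ndiffs() and nsdiffs() from the forecast package (a sketch; the returned values are not verified here):
ndiffs(cbf.ts)    # number of first differences suggested (expected to be 0 for a stationary series)
nsdiffs(cbf.ts)   # number of seasonal differences suggested for the weekly period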
save.image("workspace.RData")