For this assignment, I chose the Australian electricity production series (the Electricity column of aus_production). I will build three models, ARIMA, ETS, and a neural network autoregression (NNETAR), and compare their forecasts.
library(fpp3)
## ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──
## ✓ tibble 3.1.4 ✓ tsibble 1.1.1
## ✓ dplyr 1.0.7 ✓ tsibbledata 0.4.0
## ✓ tidyr 1.1.3 ✓ feasts 0.2.2
## ✓ lubridate 1.8.0 ✓ fable 0.3.1
## ✓ ggplot2 3.3.5
## Warning: package 'tsibbledata' was built under R version 4.1.2
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## x lubridate::date() masks base::date()
## x dplyr::filter() masks stats::filter()
## x tsibble::intersect() masks base::intersect()
## x tsibble::interval() masks lubridate::interval()
## x dplyr::lag() masks stats::lag()
## x tsibble::setdiff() masks base::setdiff()
## x tsibble::union() masks base::union()
library(forecast)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(fpp2)
## ── Attaching packages ────────────────────────────────────────────── fpp2 2.4 ──
## ✓ fma 2.4 ✓ expsmooth 2.3
##
##
## Attaching package: 'fpp2'
## The following object is masked from 'package:fpp3':
##
## insurance
options(scipen=999)
The data were split into a training and test set, reserving the final 20% of observations for testing.
library(dplyr)
electric<-aus_production %>%
select(c('Quarter', 'Electricity'))
training<-electric[1:174,]
test<-electric[175:218, ]
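The hard-coded row numbers above can also be derived from the length of the series, which keeps the split correct if the data change. The following is a minimal base R sketch; `split_index()` is a hypothetical helper introduced here for illustration (it is not part of fpp3), and 218 is the number of rows the code above indexes.

```r
# Hypothetical helper: row ranges for a chronological train/test split.
split_index <- function(n, test_frac = 0.2) {
  n_train <- floor(n * (1 - test_frac))   # earliest observations go to training
  list(train = seq_len(n_train),
       test  = seq(n_train + 1, n))       # most recent observations are held out
}

idx <- split_index(218)
range(idx$train)    # 1 174
range(idx$test)     # 175 218
length(idx$test)    # 44
```

With these indices, `electric[idx$train, ]` and `electric[idx$test, ]` would reproduce the same training and test sets, and the test set holds 44 quarters.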
Below is the plot of the data as well as the ACF and PACF curves. Electricity production follows an upward trend with a clear seasonal pattern. In the ACF plot, we can see a clear decrease at lag 1.
autoplot(electric)
## Plot variable not specified, automatically selected `.vars = Electricity`
autoplot(acf(electric))
autoplot(pacf(electric))
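Because the data are stored as a tsibble, the same diagnostics can also be produced with the feasts functions that fpp3 loads. This is a sketch of an alternative to the calls above, not the code that produced the plots in this report; the `lag_max = 20` setting is an arbitrary choice for illustration.

```r
library(fpp3)

electric <- aus_production %>% select(Quarter, Electricity)

# Autocorrelations returned as a table; autoplot() draws the correlogram
electric_acf <- electric %>% ACF(Electricity, lag_max = 20)
autoplot(electric_acf)

# Partial autocorrelations, drawn the same way
electric %>% PACF(Electricity, lag_max = 20) %>% autoplot()

# Series, ACF and PACF together in a single figure
electric %>% gg_tsdisplay(Electricity, plot_type = "partial")
```

Naming the column explicitly ensures the correlations are computed on Electricity alone rather than on every column of the table.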
## Building Models

For each of the models, I used the automatic specification selector to
choose the appropriate model. The selected specifications can be seen
below.
e_arima<- training %>%
model(ARIMA(Electricity))
report(e_arima)
## Series: Electricity
## Model: ARIMA(1,1,1)(1,1,2)[4]
##
## Coefficients:
## ar1 ma1 sar1 sma1 sma2
## 0.1409 -0.4765 0.9118 -1.7516 0.8269
## s.e. 0.2065 0.1860 0.0470 0.0690 0.0646
##
## sigma^2 estimated as 194578: log likelihood=-1270.05
## AIC=2552.1 AICc=2552.62 BIC=2570.88
e_ets<- training %>%
model(ETS(Electricity))
report(e_ets)
## Series: Electricity
## Model: ETS(M,A,M)
## Smoothing parameters:
## alpha = 0.5529462
## beta = 0.04635571
## gamma = 0.3404324
##
## Initial states:
## l[0] b[0] s[0] s[-1] s[-2] s[-3]
## 4159.474 106.133 0.9625613 1.081404 1.026568 0.929467
##
## sigma^2: 0.0003
##
## AIC AICc BIC
## 2923.277 2924.374 2951.708
e_nn <- training %>%
model(NNETAR(sqrt(Electricity)))
report(e_nn)
## Series: Electricity
## Model: NNAR(1,1,2)[4]
## Transformation: sqrt(Electricity)
##
## Average of 20 networks, each of which is
## a 2-2-1 network with 9 weights
## options were - linear output units
##
## sigma^2 estimated as 2.631
I then built forecasts for 43 quarters ahead, which covers all but the last of the 44 observations held out in the test set (rows 175 to 218).
e_predict_arima <- forecast(e_arima,h=43)
e_predict_ets<- forecast(e_ets,h= 43)
e_predict_nn <- forecast(e_nn, h=43)
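The horizon can also be taken from the test set itself, so that it always matches the holdout. The sketch below is an assumption-laden illustration rather than the code used for the results in this report: it refits only the ETS model to stay quick, and passes the test set through the `new_data` argument of `forecast()`.

```r
library(fpp3)

electric <- aus_production %>% select(Quarter, Electricity)
training <- electric[1:174, ]
test     <- electric[175:218, ]

# ETS only, to keep the sketch quick; ARIMA() and NNETAR() are fitted the same way
fit <- training %>% model(ets = ETS(Electricity))

# new_data = test asks for forecasts at exactly the quarters in the test set
fc <- fit %>% forecast(new_data = test)
nrow(fc)
```

Writing `h = nrow(test)` would have the same effect for a single contiguous holdout like this one.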
Below are plots of the predicted values for each model against the full dataset, and against the test set alone for a more granular view. All three models capture the seasonality well, but the neural net forecasts tend to run low.
autoplot(e_predict_arima, electric)+
labs(title="ARIMA Prediction")
autoplot(e_predict_arima, test)+
labs(title="ARIMA Prediction")
autoplot(e_predict_ets, electric)+
labs(title="ETS Prediction")
autoplot(e_predict_ets, test)+
labs(title="ETS Prediction")
autoplot(e_predict_nn, electric)+
labs(title="Neural Net Prediction")
autoplot(e_predict_nn, test)+
labs(title="Neural Net Prediction")
When we compare key statistics, we have an obvious winner. The ARIMA
model has the lowest RMSE (1229.6) and MAE (934.5). Its mean percentage
error is also quite low at about 0.5, as is its mean absolute percentage
error at about 1.7. This indicates much better accuracy on the test set
than the other two models, whose MAPE values are about 2.5 (ETS) and 6.0
(neural net). The ARIMA model also has the lowest lag-1 autocorrelation
of forecast errors (ACF1 of 0.18, against 0.47 for ETS and 0.59 for the
neural net), so its errors carry the least leftover structure. For the
neural net, ME equals MAE, which means every one of its forecasts fell
below the actual value. The ARIMA model is the clear choice.
acc1=accuracy(e_predict_arima, test)
acc2=accuracy(e_predict_ets, test)
acc3=accuracy(e_predict_nn, test)
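To make the columns of the accuracy tables concrete, the point-forecast measures can be computed by hand from the actual and forecast values. The sketch below uses base R only; `point_accuracy()` is a hypothetical helper written for illustration, and the numbers fed to it are made up rather than taken from the electricity data.

```r
# Hypothetical helper mirroring the columns of the accuracy tables.
point_accuracy <- function(actual, forecast) {
  e  <- actual - forecast          # positive error means the forecast was too low
  pe <- 100 * e / actual           # percentage errors
  c(ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MPE  = mean(pe),
    MAPE = mean(abs(pe)),
    ACF1 = acf(e, plot = FALSE)$acf[2])   # lag-1 autocorrelation of the errors
}

# Made-up numbers, for illustration only
actual   <- c(100, 110, 120, 130)
forecast <- c( 98, 112, 115, 131)
res <- point_accuracy(actual, forecast)
round(res, 3)
```

Because the error is defined as actual minus forecast, a positive ME or MPE indicates forecasts that are too low on average, and ME equals MAE only when every error is non-negative.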
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
acc1%>%kbl(caption="ARIMA")%>%kable_classic(html_font="Cambria")
| .model | .type | ME | RMSE | MAE | MPE | MAPE | MASE | RMSSE | ACF1 |
|---|---|---|---|---|---|---|---|---|---|
| ARIMA(Electricity) | Test | 293.7814 | 1229.619 | 934.5054 | 0.5120887 | 1.673898 | NaN | NaN | 0.1821686 |
acc2%>%kbl(caption="ETS")%>%kable_classic(html_font="Cambria")
| .model | .type | ME | RMSE | MAE | MPE | MAPE | MASE | RMSSE | ACF1 |
|---|---|---|---|---|---|---|---|---|---|
| ETS(Electricity) | Test | -1116.364 | 1876.313 | 1392.391 | -1.991293 | 2.504426 | NaN | NaN | 0.4733031 |
acc3%>%kbl(caption="Neural Net")%>%kable_classic(html_font="Cambria")
| .model | .type | ME | RMSE | MAE | MPE | MAPE | MASE | RMSSE | ACF1 |
|---|---|---|---|---|---|---|---|---|---|
| NNETAR(sqrt(Electricity)) | Test | 3396.958 | 4099.54 | 3396.958 | 5.995831 | 5.995831 | NaN | NaN | 0.5921725 |
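MASE and RMSSE are NaN in the tables above because both measures scale the forecast errors by in-sample errors from the training period, and the test set alone contains no training observations. Passing the full series to `accuracy()` should let it compute them. The sketch below illustrates this with the ETS model only (an assumption made for speed); it is not the code behind the tables in this report.

```r
library(fpp3)

electric <- aus_production %>% select(Quarter, Electricity)
training <- electric[1:174, ]

fit <- training %>% model(ets = ETS(Electricity))   # ETS only, for speed
fc  <- fit %>% forecast(h = 44)

# Full series: the forecast quarters are scored, earlier rows supply the scaling
acc <- accuracy(fc, electric)
acc %>% select(.model, RMSE, MAE, MAPE, MASE, RMSSE)
```

A MASE below 1 would mean the forecasts beat the in-sample seasonal naive benchmark on average, which gives a scale-free way to compare the three models.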