Time Series Assignment

For this assignment, I use the Electricity series from the aus_production dataset (quarterly Australian electricity production). I build three models (ARIMA, ETS, and a neural network autoregression) and compare their forecasts.

library(fpp3)
## ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──
## ✓ tibble      3.1.4     ✓ tsibble     1.1.1
## ✓ dplyr       1.0.7     ✓ tsibbledata 0.4.0
## ✓ tidyr       1.1.3     ✓ feasts      0.2.2
## ✓ lubridate   1.8.0     ✓ fable       0.3.1
## ✓ ggplot2     3.3.5
## Warning: package 'tsibbledata' was built under R version 4.1.2
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## x lubridate::date()    masks base::date()
## x dplyr::filter()      masks stats::filter()
## x tsibble::intersect() masks base::intersect()
## x tsibble::interval()  masks lubridate::interval()
## x dplyr::lag()         masks stats::lag()
## x tsibble::setdiff()   masks base::setdiff()
## x tsibble::union()     masks base::union()
library(forecast)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(fpp2)
## ── Attaching packages ────────────────────────────────────────────── fpp2 2.4 ──
## ✓ fma       2.4     ✓ expsmooth 2.3
## 
## 
## Attaching package: 'fpp2'
## The following object is masked from 'package:fpp3':
## 
##     insurance
options(scipen=999)

Prepare Data

The data were split into a training set and a test set: the first 174 quarters are used for training and the final 44 (about 20% of the 218 observations) are reserved for testing.

library(dplyr)
electric<-aus_production %>%
  select(c('Quarter', 'Electricity'))

training<-electric[1:174,]
test<-electric[175:218, ]
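The hard-coded row numbers above can also be derived from the split fraction. The following is a minimal sketch, assuming an 80/20 split and the 218 rows implied by the indices above. The names n_obs, test_frac, n_train, train_idx and test_idx are illustrative and not part of the original code.

```r
# Sketch: derive split indices from a fraction instead of hard-coding them.
# n_obs = 218 is taken from the row indices used above; with the real data
# it could be replaced by nrow(electric).
n_obs     <- 218
test_frac <- 0.20                             # assumed share reserved for testing

n_train   <- floor((1 - test_frac) * n_obs)   # number of training rows
train_idx <- seq_len(n_train)                 # first n_train rows
test_idx  <- seq(n_train + 1, n_obs)          # remaining rows

length(test_idx)                              # number of held-out quarters
```

With these indices, electric[train_idx, ] and electric[test_idx, ] would reproduce the split used in this report.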

Below are a plot of the data and its ACF and PACF plots. Electricity production follows an upward trend with a clear seasonal pattern. From the ACF plot, we can see a clear decrease at lag 1.

autoplot(electric)
## Plot variable not specified, automatically selected `.vars = Electricity`

autoplot(acf(electric))

autoplot(pacf(electric))
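The same diagnostics can be produced with the feasts functions that fpp3 attaches, which return the correlations as tidy tables before plotting. This is a sketch rather than the code used for the plots above. The choice lag_max = 12 (three years of quarters) and the names acf_tbl and pacf_tbl are assumptions made for illustration.

```r
library(fpp3)

electric <- aus_production %>% select(Quarter, Electricity)

# Autocorrelations and partial autocorrelations as tidy tables.
# lag_max = 12 is an arbitrary choice covering three years of quarters.
acf_tbl  <- electric %>% ACF(Electricity, lag_max = 12)
pacf_tbl <- electric %>% PACF(Electricity, lag_max = 12)

# Each table has one row per lag and can be plotted directly.
autoplot(acf_tbl)
autoplot(pacf_tbl)
```

For a quarterly series with seasonality, the lags at multiples of 4 are the ones to look at when judging the seasonal component.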

Building Models

For each of the models, I used the automatic specification selector to choose the appropriate model. Specifications can be seen below.

e_arima<-  training %>% 
  model(ARIMA(Electricity))
report(e_arima)
## Series: Electricity 
## Model: ARIMA(1,1,1)(1,1,2)[4] 
## 
## Coefficients:
##          ar1      ma1    sar1     sma1    sma2
##       0.1409  -0.4765  0.9118  -1.7516  0.8269
## s.e.  0.2065   0.1860  0.0470   0.0690  0.0646
## 
## sigma^2 estimated as 194578:  log likelihood=-1270.05
## AIC=2552.1   AICc=2552.62   BIC=2570.88
e_ets<-  training %>% 
  model(ETS(Electricity))
report(e_ets)
## Series: Electricity 
## Model: ETS(M,A,M) 
##   Smoothing parameters:
##     alpha = 0.5529462 
##     beta  = 0.04635571 
##     gamma = 0.3404324 
## 
##   Initial states:
##      l[0]    b[0]      s[0]    s[-1]    s[-2]    s[-3]
##  4159.474 106.133 0.9625613 1.081404 1.026568 0.929467
## 
##   sigma^2:  0.0003
## 
##      AIC     AICc      BIC 
## 2923.277 2924.374 2951.708
e_nn <-  training %>% 
  model(NNETAR(sqrt(Electricity)))
report(e_nn)
## Series: Electricity 
## Model: NNAR(1,1,2)[4] 
## Transformation: sqrt(Electricity) 
## 
## Average of 20 networks, each of which is
## a 2-2-1 network with 9 weights
## options were - linear output units 
## 
## sigma^2 estimated as 2.631

I then generated forecasts for the next 43 quarters (h = 43), which covers all but the last of the 44 held-out test observations.

e_predict_arima <- forecast(e_arima,h=43)
e_predict_ets<- forecast(e_ets,h= 43)
e_predict_nn <- forecast(e_nn, h=43)
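The test set defined earlier (rows 175 to 218) has 44 rows, so a fixed h can drift out of step with the holdout. One way to keep them aligned is to pass the test set as new_data. The sketch below assumes this is acceptable for the comparison. It fits ETS only because it is quick, and fit_sketch and fc_sketch are illustrative names.

```r
library(fpp3)

electric <- aus_production %>% select(Quarter, Electricity)
training <- electric[1:174, ]
test     <- electric[175:218, ]

# ETS is used here only because it fits quickly; the same call
# should work for the ARIMA and NNETAR fits.
fit_sketch <- training %>% model(ETS(Electricity))

# new_data = test asks for one forecast per row of the test set,
# so the horizon always matches the holdout length.
fc_sketch <- forecast(fit_sketch, new_data = test)

nrow(fc_sketch)   # one row per test quarter
nrow(test)
```

Forecasting all 44 quarters would change the accuracy figures slightly compared with the h = 43 results reported in this document.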

Below are plots of the predicted values for each model against the full dataset, followed by the predictions against the test set alone for a more granular view. All three models pick up the seasonality well, but the neural net forecasts run somewhat low.

autoplot(e_predict_arima, electric)+
  labs(title="ARIMA Prediction")

autoplot(e_predict_arima, test)+
  labs(title="ARIMA Prediction")

autoplot(e_predict_ets, electric)+
  labs(title="ETS Prediction")

autoplot(e_predict_ets, test)+
  labs(title="ETS Prediction")

autoplot(e_predict_nn, electric)+
  labs(title="Neural Net Prediction")

autoplot(e_predict_nn, test)+
  labs(title="Neural Net Prediction")

When we compare key statistics, we have an obvious winner. The ARIMA model has the lowest RMSE (1229.6) and MAE (934.5). Its mean percentage error (MPE) is also quite low at about 0.5%, as is its mean absolute percentage error (MAPE) at about 1.7%. This suggests good forecast accuracy on the test set, much better than the other two models. The lag-1 autocorrelation of the forecast errors (ACF1) is also lowest for ARIMA at 0.18, compared with 0.47 for ETS and 0.59 for the neural net, so the neural net does not outperform ARIMA on any measure in the tables below. The ARIMA model is therefore the clear choice.
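To make the measures concrete, here is a minimal base R sketch of how they are computed from actual and predicted values. The function name forecast_errors and the toy numbers are hypothetical and are not taken from the electricity data.

```r
# Sketch of the error measures reported by accuracy(), written in base R.
# Errors are defined as actual minus forecast, so a positive ME means
# the forecasts were too low on average.
forecast_errors <- function(actual, predicted) {
  e  <- actual - predicted
  pe <- 100 * e / actual          # percentage errors
  c(
    ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MPE  = mean(pe),
    MAPE = mean(abs(pe))
  )
}

# Toy numbers, not taken from the electricity data
actual    <- c(100, 200, 400)
predicted <- c(110, 190, 400)
forecast_errors(actual, predicted)
```

Because MPE keeps the sign of each error, over- and under-forecasts can cancel out. MAPE does not allow that cancellation, which is why it is the safer of the two for ranking models.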

acc1=accuracy(e_predict_arima, test)
acc2=accuracy(e_predict_ets, test)
acc3=accuracy(e_predict_nn, test)
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows
acc1%>%kbl(caption="ARIMA")%>%kable_classic(html_font="Cambria")
ARIMA
.model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
ARIMA(Electricity) Test 293.7814 1229.619 934.5054 0.5120887 1.673898 NaN NaN 0.1821686
acc2%>%kbl(caption="ETS")%>%kable_classic(html_font="Cambria")
ETS
.model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
ETS(Electricity) Test -1116.364 1876.313 1392.391 -1.991293 2.504426 NaN NaN 0.4733031
acc3%>%kbl(caption="Neural Net")%>%kable_classic(html_font="Cambria")
Neural Net
.model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
NNETAR(sqrt(Electricity)) Test 3396.958 4099.54 3396.958 5.995831 5.995831 NaN NaN 0.5921725
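The MASE and RMSSE columns above are NaN. These two measures are scaled by errors from the training period, and accuracy() was given only the test set. As far as I understand the fabletools documentation, passing the full series (training plus test) lets them be computed. The sketch below also fits the models in a single model() call so the results land in one table. The names arima, ets, fits, fc and acc_all are illustrative, and NNETAR(sqrt(Electricity)) could be added as a third entry in the same way.

```r
library(fpp3)

electric <- aus_production %>% select(Quarter, Electricity)
training <- electric[1:174, ]
test     <- electric[175:218, ]

# Fit several models in one call; the names arima/ets are illustrative.
fits <- training %>%
  model(
    arima = ARIMA(Electricity),
    ets   = ETS(Electricity)
  )

# One forecast per row of the test set, for every model.
fc <- forecast(fits, new_data = test)

# Passing the full series gives accuracy() the training data it needs
# to scale MASE and RMSSE.
acc_all <- accuracy(fc, electric)
acc_all %>% select(.model, RMSE, MAE, MAPE, MASE, RMSSE)
```

This sketch forecasts all 44 test quarters, so its figures would not match the h = 43 tables above exactly.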
