Time Series Assignment

For this assignment, I chose the quarterly Australian electricity production series from the aus_production dataset. I will build three models (ARIMA, ETS, and a neural network autoregression) and compare their forecasts on a holdout set.

library(fpp3)

## ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──

## ✓ tibble      3.1.4     ✓ tsibble     1.1.1
## ✓ dplyr       1.0.7     ✓ tsibbledata 0.4.0
## ✓ tidyr       1.1.3     ✓ feasts      0.2.2
## ✓ lubridate   1.8.0     ✓ fable       0.3.1
## ✓ ggplot2     3.3.5

## Warning: package 'tsibbledata' was built under R version 4.1.2

## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## x lubridate::date()    masks base::date()
## x dplyr::filter()      masks stats::filter()
## x tsibble::intersect() masks base::intersect()
## x tsibble::interval()  masks lubridate::interval()
## x dplyr::lag()         masks stats::lag()
## x tsibble::setdiff()   masks base::setdiff()
## x tsibble::union()     masks base::union()

library(forecast)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

library(fpp2)

## ── Attaching packages ────────────────────────────────────────────── fpp2 2.4 ──

## ✓ fma       2.4     ✓ expsmooth 2.3

## 
## Attaching package: 'fpp2'

## The following object is masked from 'package:fpp3':
## 
##     insurance

options(scipen=999)

Prepare Data

The data were split into training and test sets: the first 174 quarters for training and the final 44 quarters (about 20% of the 218 observations) for testing.

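library(dplyr)
# Keep only the quarterly index and the electricity series
electric <- aus_production %>%
  select(Quarter, Electricity)

# Rows 1 to 174 for training, rows 175 to 218 (44 quarters) for testing
training <- electric[1:174, ]
test <- electric[175:218, ]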
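
The row indices above can also be derived from the holdout share rather than typed in by hand. The following is a minimal sketch in base R, assuming a 20% holdout and the 218 rows implied by the indices above; the names `n_obs`, `test_share`, `train_rows`, and `test_rows` are illustrative and not part of the original analysis.

```r
# Sketch: derive the split point from a proportion instead of hard-coding it.
n_obs <- 218L        # number of rows in the electricity series
test_share <- 0.20   # assumed holdout share

# Round the training size down so the test set gets the remainder
n_train <- floor((1 - test_share) * n_obs)

train_rows <- seq_len(n_train)          # 1, 2, ..., n_train
test_rows <- seq(n_train + 1L, n_obs)   # n_train + 1, ..., n_obs

c(train = length(train_rows), test = length(test_rows))
```

These vectors could then be used as `electric[train_rows, ]` and `electric[test_rows, ]`, which reproduces the 174/44 split used here.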

Below is a plot of the data along with the ACF and PACF plots. Electricity production follows an upward trend with a clear seasonal pattern. From the ACF plot, we can see a clear decrease at lag 1.

autoplot(electric)

## Plot variable not specified, automatically selected `.vars = Electricity`

autoplot(acf(electric))

autoplot(pacf(electric))
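
The same diagnostics can be produced with the tsibble-native functions that fpp3 attaches through the feasts package. This is a sketch of that alternative, not the code used for the plots in this report; the choice of `lag_max = 16` (four years of quarters) is an assumption made for illustration.

```r
library(fpp3)

# Keep the quarterly index and the electricity series
electric <- aus_production %>% select(Quarter, Electricity)

# ACF() and PACF() from feasts return tables that autoplot() can draw
acf_tbl <- electric %>% ACF(Electricity, lag_max = 16)
pacf_tbl <- electric %>% PACF(Electricity, lag_max = 16)

autoplot(acf_tbl)
autoplot(pacf_tbl)
```

Because the results are ordinary tables, the autocorrelation at each lag can also be inspected directly instead of being read off the plot.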

Building Models

For each of the models, I used the automatic specification selector to choose the appropriate model. Specifications can be seen below.

e_arima<-  training %>% 
  model(ARIMA(Electricity))
report(e_arima)

## Series: Electricity 
## Model: ARIMA(1,1,1)(1,1,2)[4] 
## 
## Coefficients:
##          ar1      ma1    sar1     sma1    sma2
##       0.1409  -0.4765  0.9118  -1.7516  0.8269
## s.e.  0.2065   0.1860  0.0470   0.0690  0.0646
## 
## sigma^2 estimated as 194578:  log likelihood=-1270.05
## AIC=2552.1   AICc=2552.62   BIC=2570.88

e_ets<-  training %>% 
  model(ETS(Electricity))
report(e_ets)

## Series: Electricity 
## Model: ETS(M,A,M) 
##   Smoothing parameters:
##     alpha = 0.5529462 
##     beta  = 0.04635571 
##     gamma = 0.3404324 
## 
##   Initial states:
##      l[0]    b[0]      s[0]    s[-1]    s[-2]    s[-3]
##  4159.474 106.133 0.9625613 1.081404 1.026568 0.929467
## 
##   sigma^2:  0.0003
## 
##      AIC     AICc      BIC 
## 2923.277 2924.374 2951.708

e_nn <-  training %>% 
  model(NNETAR(sqrt(Electricity)))
report(e_nn)

## Series: Electricity 
## Model: NNAR(1,1,2)[4] 
## Transformation: sqrt(Electricity) 
## 
## Average of 20 networks, each of which is
## a 2-2-1 network with 9 weights
## options were - linear output units 
## 
## sigma^2 estimated as 2.631

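I then built forecasts for the next 43 quarters to compare against the holdout. Note that the test set holds 44 observations (rows 175 to 218), so the final quarter of the test set is not forecast.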
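
One way to keep the forecast horizon in step with the holdout is to take it from the test set itself. The sketch below shows two options using fable's `forecast()`; the seasonal naive model (`SNAIVE`) is only a quick stand-in chosen for illustration, and the names `fit`, `fc_h`, and `fc_new` are hypothetical.

```r
library(fpp3)

electric <- aus_production %>% select(Quarter, Electricity)
training <- electric[1:174, ]
test <- electric[175:218, ]

# SNAIVE is a stand-in model for this sketch, not one of the report's models
fit <- training %>% model(snaive = SNAIVE(Electricity))

# Option 1: take the horizon from the number of rows in the test set
fc_h <- forecast(fit, h = nrow(test))

# Option 2: forecast exactly the quarters that appear in the test set
fc_new <- forecast(fit, new_data = test)

nrow(fc_h)
nrow(fc_new)
```

Either option avoids a mismatch between a hand-typed horizon and the actual size of the holdout.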

e_predict_arima <- forecast(e_arima,h=43)
e_predict_ets<- forecast(e_ets,h= 43)
e_predict_nn <- forecast(e_nn, h=43)

Below are plots of the predicted values for each model against the full dataset, and against the test set alone for a more granular view. We can see that all the models do a pretty good job picking up seasonality, but that the neural net forecasts seem to run a bit low.

autoplot(e_predict_arima, electric)+
  labs(title="ARIMA Prediction")

autoplot(e_predict_arima, test)+
  labs(title="ARIMA Prediction")

autoplot(e_predict_ets, electric)+
  labs(title="ETS Prediction")

autoplot(e_predict_ets, test)+
  labs(title="ETS Prediction")

autoplot(e_predict_nn, electric)+
  labs(title="Neural Net Prediction")

autoplot(e_predict_nn, test)+
  labs(title="Neural Net Prediction")

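When we compare the test-set accuracy statistics, we have an obvious winner. The ARIMA model has the lowest RMSE (1229.6) and MAE (934.5). Its mean percentage error is also quite low at about 0.5%, as is its mean absolute percentage error at about 1.7%, compared with a MAPE of about 2.5% for ETS and 6.0% for the neural net. This suggests good forecast accuracy, much better than the other two models. The ARIMA forecast errors also show the least lag-1 autocorrelation (ACF1 of 0.18, against 0.47 for ETS and 0.59 for the neural net). For the neural net, ME and MAE are identical, which means every one of its forecasts fell below the actual value. MASE and RMSSE are NaN because accuracy() was given only the test set, so it had no training data from which to compute the scaling. Taken together, the ARIMA model is the clear choice.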
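
For reference, the unscaled measures reported in the tables are commonly defined as below. The notation is mine rather than the report's: $y_t$ is the actual value, $\hat{y}_t$ the forecast, and $T$ the number of forecast periods being scored.

```latex
\begin{aligned}
e_t &= y_t - \hat{y}_t \\
\mathrm{ME} &= \frac{1}{T}\sum_{t=1}^{T} e_t \\
\mathrm{RMSE} &= \sqrt{\frac{1}{T}\sum_{t=1}^{T} e_t^{2}} \\
\mathrm{MAE} &= \frac{1}{T}\sum_{t=1}^{T} \lvert e_t \rvert \\
\mathrm{MPE} &= \frac{100}{T}\sum_{t=1}^{T} \frac{e_t}{y_t} \\
\mathrm{MAPE} &= \frac{100}{T}\sum_{t=1}^{T} \left\lvert \frac{e_t}{y_t} \right\rvert
\end{aligned}
```

Under this sign convention, a positive ME means the forecasts sit below the actual values on average, and ME equals MAE only when no forecast exceeds its actual value.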

acc1=accuracy(e_predict_arima, test)
acc2=accuracy(e_predict_ets, test)
acc3=accuracy(e_predict_nn, test)
library(kableExtra)

## 
## Attaching package: 'kableExtra'

## The following object is masked from 'package:dplyr':
## 
##     group_rows

acc1%>%kbl(caption="ARIMA")%>%kable_classic(html_font="Cambria")

ARIMA
.model	.type	ME	RMSE	MAE	MPE	MAPE	MASE	RMSSE	ACF1
ARIMA(Electricity)	Test	293.7814	1229.619	934.5054	0.5120887	1.673898	NaN	NaN	0.1821686

acc2%>%kbl(caption="ETS")%>%kable_classic(html_font="Cambria")

ETS
.model	.type	ME	RMSE	MAE	MPE	MAPE	MASE	RMSSE	ACF1
ETS(Electricity)	Test	-1116.364	1876.313	1392.391	-1.991293	2.504426	NaN	NaN	0.4733031

acc3%>%kbl(caption="Neural Net")%>%kable_classic(html_font="Cambria")

Neural Net
.model	.type	ME	RMSE	MAE	MPE	MAPE	MASE	RMSSE	ACF1
NNETAR(sqrt(Electricity))	Test	3396.958	4099.54	3396.958	5.995831	5.995831	NaN	NaN	0.5921725
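
The NaN entries for MASE and RMSSE arise because the scaled measures need the training data to compute their scaling factor. A minimal sketch of one way to obtain them is to pass the full series to `accuracy()` instead of the test set alone. The seasonal naive model is again only a stand-in, and the names `fit`, `fc`, and `acc_full` are hypothetical; the numbers it produces are not comparable to the tables above.

```r
library(fpp3)

electric <- aus_production %>% select(Quarter, Electricity)
training <- electric[1:174, ]

# SNAIVE is a stand-in model for this sketch, not one of the report's models
fit <- training %>% model(snaive = SNAIVE(Electricity))
fc <- forecast(fit, h = 44)

# Passing the full series gives accuracy() both the training period
# (for the scaling) and the test period (for the forecast errors)
acc_full <- accuracy(fc, electric)

acc_full %>% select(.model, .type, RMSE, MAE, MASE, RMSSE)
```

Scaled measures are useful here because they express the error relative to a naive benchmark on the training data, which makes models easier to compare across series with different units.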


Assignment 3

Lucas Kaplan

2022-04-28
