Installing Packages and Data

## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ readr     2.1.4
## ✔ ggplot2   3.4.2     ✔ stringr   1.5.0
## ✔ lubridate 1.9.2     ✔ tibble    3.2.1
## ✔ purrr     1.0.1     ✔ tidyr     1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## 
## Attaching package: 'tsibble'
## 
## 
## The following object is masked from 'package:lubridate':
## 
##     interval
## 
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, union
## 
## 
## ── Attaching packages ────────────────────────────────────────────── fpp3 0.5 ──
## 
## ✔ tsibbledata 0.4.1     ✔ fable       0.3.3
## ✔ feasts      0.3.1     ✔ fabletools  0.3.3
## 
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()
## 
## 
## Attaching package: 'magrittr'
## 
## 
## The following object is masked from 'package:purrr':
## 
##     set_names
## 
## 
## The following object is masked from 'package:tidyr':
## 
##     extract
## 
## 
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo 
## 
## 
## Attaching package: 'forecast'
## 
## 
## The following object is masked from 'package:fabletools':
## 
##     accuracy

Data Overview

An overview of the data is given below. The series is strongly seasonal: temperature peaks in the summer months, with the highest values in July, and reaches its troughs in the winter months. The lag plots also suggest that the series is non-stationary, which is not surprising given such pronounced seasonality.
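The seasonality and non-stationarity described above can also be checked numerically. The following is a minimal sketch, not the code used for this report. It assumes the fpp3 packages are installed, and the object `temps` is a hypothetical, simulated stand-in for the real data, which is not reproduced here.

```r
library(fpp3)

# Hypothetical stand-in for the real data: 20 years of simulated monthly
# temperatures with a July peak (NOT the series analysed in this report).
set.seed(1)
n <- 240
m <- 0:(n - 1)
temps <- tsibble(
  Month = yearmonth(seq(as.Date("2000-01-01"), by = "month", length.out = n)),
  AverageTemperature = 7 + 12 * sin(2 * pi * (m - 3) / 12) + rnorm(n),
  index = Month
)

# Strength of seasonality from an STL decomposition (0 = none, 1 = strong).
stl_feats <- temps |> features(AverageTemperature, feat_stl)
stl_feats$seasonal_strength_year

# Autocorrelations at lags 1 to 24. A strongly seasonal monthly series shows
# large positive values at lag 12 and large negative values at lag 6.
acf_tbl <- temps |> ACF(AverageTemperature, lag_max = 24)
acf_tbl$acf[c(6, 12)]

# Suggested number of seasonal differences before modelling.
nsd <- temps |> features(AverageTemperature, unitroot_nsdiffs)
nsd
```

A seasonal strength close to 1 and autocorrelations that do not die out at seasonal lags are both consistent with the visual impression from the lag plots.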

Splitting the Data into Training and Testing Datasets

The data are then split into a training set and a test set, with roughly four years held out for testing.
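One way such a split could be done is sketched below. This is an illustration under assumptions, not the report's own code: `temps` is a simulated placeholder series, and the 48-month test length (`n_test`) is taken from the "~4 years" mentioned later in the report.

```r
library(fpp3)

# Hypothetical simulated monthly series standing in for the real data.
set.seed(1)
n <- 240
m <- 0:(n - 1)
temps <- tsibble(
  Month = yearmonth(seq(as.Date("2000-01-01"), by = "month", length.out = n)),
  AverageTemperature = 7 + 12 * sin(2 * pi * (m - 3) / 12) + rnorm(n),
  index = Month
)

# Hold out the final 48 months (about 4 years) for testing.
# A time series must be split by time order, never by random sampling.
n_test <- 48
train <- temps |> slice(1:(n - n_test))
test  <- temps |> slice((n - n_test + 1):n)

range(train$Month)
range(test$Month)
```

Splitting by position keeps every test observation strictly later than every training observation, so the test set mimics a genuine forecasting task.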

Fitting Models

The neural network model is then fitted using the NNETAR() function from the fable package (loaded with fpp3). A report of the fitted model, fit, shows that the selected model is an NNAR(28,1,14)[12]: an average of 20 networks, each of which is a 28-14-1 network with 421 weights. The residual variance (sigma^2) is estimated as 1.04. Looking at the forecasting capability of this model, we can see that it forecasts the test period with good accuracy.

## Series: AverageTemperature 
## Model: NNAR(28,1,14)[12] 
## 
## Average of 20 networks, each of which is
## a 28-14-1 network with 421 weights
## options were - linear output units 
## 
## sigma^2 estimated as 1.04
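The weight count in the report can be verified by hand. The short sketch below assumes a standard single-hidden-layer feed-forward network with a bias term on every hidden node and on the output node. The variable names are illustrative only.

```r
# Architecture reported above: 28 inputs, 14 hidden nodes, 1 output.
n_inputs <- 28
n_hidden <- 14
n_output <- 1

# Each hidden node has one weight per input plus a bias;
# each output node has one weight per hidden node plus a bias.
hidden_weights <- (n_inputs + 1) * n_hidden   # 29 * 14 = 406
output_weights <- (n_hidden + 1) * n_output   # 15 * 1  = 15

n_weights <- hidden_weights + output_weights
n_weights  # 421, matching the report
```

With 421 parameters per network, the model is far more heavily parameterised than the ETS or ARIMA alternatives, which is worth keeping in mind when comparing in-sample fit.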

ETS and ARIMA

As a reference point for the performance of the neural network model, we compare it with ETS and ARIMA models. The neural network performs very well relative to both the ARIMA and ETS models when forecasting the ~4 years of test data.

## Series: AverageTemperature 
## Model: ETS(A,N,A) 
##   Smoothing parameters:
##     alpha = 0.0106137 
##     gamma = 0.0001000128 
## 
##   Initial states:
##      l[0]      s[0]     s[-1]    s[-2]   s[-3]    s[-4]    s[-5]    s[-6]
##  7.120089 -11.14303 -4.318293 1.797407 8.03873 12.09695 13.22261 10.47946
##     s[-7]     s[-8]     s[-9]    s[-10]    s[-11]
##  5.441562 -1.103636 -7.710703 -13.12944 -13.67161
## 
##   sigma^2:  3.2522
## 
##      AIC     AICc      BIC 
## 9495.252 9495.674 9570.991

## Series: AverageTemperature 
## Model: ARIMA(3,0,0)(2,1,0)[12] 
## 
## Coefficients:
##          ar1     ar2     ar3     sar1     sar2
##       0.1631  0.0158  0.0355  -0.6571  -0.3473
## s.e.  0.0296  0.0300  0.0298   0.0280   0.0281
## 
## sigma^2 estimated as 4.298:  log likelihood=-2449.41
## AIC=4910.82   AICc=4910.9   BIC=4941.06
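A comparison of this kind could be set up as in the sketch below. It is a sketch under assumptions rather than the report's actual code: the data are simulated (`temps` is hypothetical), the models use automatic specifications, and `times = 100` is chosen only to keep the simulation-based neural network forecasts fast. Automatic ARIMA selection may also require the urca package to be installed. `fabletools::accuracy()` is called with its namespace because, as the startup messages show, loading forecast masks `accuracy()`.

```r
library(fpp3)

# Hypothetical simulated monthly series standing in for the real data.
set.seed(1)
n <- 240
m <- 0:(n - 1)
temps <- tsibble(
  Month = yearmonth(seq(as.Date("2000-01-01"), by = "month", length.out = n)),
  AverageTemperature = 7 + 12 * sin(2 * pi * (m - 3) / 12) + rnorm(n),
  index = Month
)

# Train on everything except the last 48 months.
n_test <- 48
train <- temps |> slice(1:(n - n_test))

# Fit the three model classes with automatic specification.
fit <- train |>
  model(
    nn    = NNETAR(AverageTemperature),
    ets   = ETS(AverageTemperature),
    arima = ARIMA(AverageTemperature)
  )

# Forecast the held-out period and score against the full series.
fc  <- fit |> forecast(h = n_test, times = 100)
acc <- fabletools::accuracy(fc, temps)
acc |> select(.model, RMSE, MAE)
```

Comparing RMSE and MAE on the held-out period is a fairer basis than the in-sample sigma^2 values in the reports, since a heavily parameterised network can fit the training data closely without forecasting better.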
