Today we talked about ARIMA models. ARIMA stands for autoregressive integrated moving average.

The autoregressive part (p) is the number of previous values of \(y_t\), or past observations, included in the model.

The integrated part (d) is the degree of differencing. We want our time series to be stationary, meaning its mean and variance stay constant over time. If it is not, we take differences between consecutive observations, and d counts how many times we do so.
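To see what differencing does, here is a minimal sketch in base R. It uses a simulated random walk, not the gas data, and the names `eps`, `walk`, and `d1` are made up for illustration.

```r
# Simulated data for illustration only; this is not the gas series.
set.seed(42)
eps  <- rnorm(200)    # white-noise shocks
walk <- cumsum(eps)   # random walk: its level wanders, so it is not stationary
d1   <- diff(walk)    # first difference, i.e. the d = 1 case

length(walk) - length(d1)   # differencing once drops one observation
all.equal(d1, eps[-1])      # the differences recover the original shocks
```

Differencing the random walk gives back the stationary shocks it was built from, which is the idea behind choosing d = 1.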

The moving average part (q) is the number of previous error terms \(\varepsilon_t\) included in the model.
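Putting the three parts together, and writing \(y'_t\) for the series after it has been differenced d times, an ARIMA(p, d, q) model can be sketched as below. The symbols \(c\), \(\phi_i\), and \(\theta_j\) for the constant and the coefficients are notation I am assuming here, following the plus-sign convention that R uses for the moving average terms.

\[
y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}
\]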

I am going to show an example using the gas dataset, which has the weekly gasoline spot price in cents per gallon from 2000 to mid-2010.

Libraries we may need.

library(quantmod)
## Warning: package 'quantmod' was built under R version 3.4.4
## Loading required package: xts
## Warning: package 'xts' was built under R version 3.4.4
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.4.3
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: TTR
## Warning: package 'TTR' was built under R version 3.4.3
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(tseries)
## Warning: package 'tseries' was built under R version 3.4.4
library(timeSeries)
## Warning: package 'timeSeries' was built under R version 3.4.4
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 3.4.3
## 
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
## 
##     time<-
library(forecast)
## Warning: package 'forecast' was built under R version 3.4.4
library(xts)

First I will plot my data. There might be seasonal variation but we do not know how to deal with that just yet so we won’t worry about it right now.

library(astsa)
## Warning: package 'astsa' was built under R version 3.4.3
## 
## Attaching package: 'astsa'
## The following object is masked from 'package:forecast':
## 
##     gas
data(gas)
plot(gas)

Next I will use `auto.arima` to make an ARIMA model.

modgas <- auto.arima(gas)
modgas
## Series: gas 
## ARIMA(0,1,3) 
## 
## Coefficients:
##          ma1     ma2     ma3
##       0.0510  0.0647  0.1447
## s.e.  0.0426  0.0401  0.0426
## 
## sigma^2 estimated as 73.32:  log likelihood=-1938.65
## AIC=3885.29   AICc=3885.37   BIC=3902.49

Here we can see that the ARIMA model that was created has p = 0, d = 1, and q = 3. When `auto.arima` picks a seasonal model it prints a second set of orders after these; we have not covered those yet, and none appear here.
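If we already know the order we want, we can also fit it by hand instead of letting the search choose. This is a minimal sketch using base R's `arima` on a simulated series; the series `sim` and its coefficient values are made up for illustration and are not the gas data.

```r
set.seed(123)
# Hypothetical ARIMA(0,1,3) series; the ma values are illustrative only.
sim <- arima.sim(model = list(order = c(0, 1, 3), ma = c(0.3, 0.2, 0.4)),
                 n = 500)

# Fit with the order fixed at p = 0, d = 1, q = 3.
fit <- arima(sim, order = c(0, 1, 3))

coef(fit)   # one estimate per moving average term: ma1, ma2, ma3
```

The coefficient names match what the printed model shows, one per moving average lag.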

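ma1, ma2, and ma3 are the coefficients for our epsilon (moving average) terms, and s.e. gives their standard errors. A coefficient named sma would belong to a seasonal moving average term; there are none in this model.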
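Written out with the coefficients from the output above, the fitted model looks roughly like this. It is a sketch that assumes the sign convention of R's `arima`, where moving average terms enter with plus signs, and it has no constant because none was reported.

\[
y_t - y_{t-1} = \varepsilon_t + 0.0510\,\varepsilon_{t-1} + 0.0647\,\varepsilon_{t-2} + 0.1447\,\varepsilon_{t-3}
\]

The left side is the first difference (d = 1), and the error terms \(\varepsilon_t\) have estimated variance 73.32.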

Next I will try to forecast future gas prices using the `forecast` function. Its arguments are our model and h, the number of periods we are predicting into the future. Then we will plot our forecast.

mycast <- forecast(modgas,h=20)
mycast
##          Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
## 2010.481       189.3976 178.4237 200.3715 172.61446 206.1807
## 2010.500       187.5015 171.5810 203.4219 163.15329 211.8496
## 2010.519       184.9345 164.8505 205.0184 154.21872 215.6502
## 2010.538       184.9345 160.5483 209.3207 147.63901 222.2299
## 2010.558       184.9345 156.8986 212.9703 142.05738 227.8116
## 2010.577       184.9345 153.6722 216.1967 137.12297 232.7460
## 2010.596       184.9345 150.7489 219.1200 132.65222 237.2167
## 2010.615       184.9345 148.0567 221.8123 128.53476 241.3342
## 2010.635       184.9345 145.5480 224.3209 124.69809 245.1709
## 2010.654       184.9345 143.1898 226.6791 121.09157 248.7774
## 2010.673       184.9345 140.9579 228.9110 117.67817 252.1908
## 2010.692       184.9345 138.8340 231.0350 114.42983 255.4391
## 2010.712       184.9345 136.8036 233.0653 111.32470 258.5443
## 2010.731       184.9345 134.8555 235.0134 108.34536 261.5236
## 2010.750       184.9345 132.9804 236.8885 105.47765 264.3913
## 2010.769       184.9345 131.1707 238.6982 102.70989 267.1591
## 2010.788       184.9345 129.4199 240.4490 100.03232 269.8366
## 2010.808       184.9345 127.7227 242.1462  97.43665 272.4323
## 2010.827       184.9345 126.0744 243.7945  94.91579 274.9532
## 2010.846       184.9345 124.4710 245.3979  92.46362 277.4053
plot(mycast)

If we run our forecast, it will give us the point estimates as well as 80% and 95% prediction intervals.
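As a check on where the interval numbers come from, the first row of the table can be reproduced by hand. This sketch assumes normally distributed errors and uses the rounded values printed above (sigma^2 = 73.32 and the first point forecast 189.3976), so it only matches to a few decimals. The variable names are my own.

```r
sigma2 <- 73.32      # error variance from the model output
point  <- 189.3976   # first point forecast from the table
se1    <- sqrt(sigma2)   # one-step-ahead forecast standard error

# 80% interval uses the 90th percentile of the normal, 95% uses the 97.5th.
lo80 <- point - qnorm(0.90)  * se1
hi80 <- point + qnorm(0.90)  * se1
lo95 <- point - qnorm(0.975) * se1
hi95 <- point + qnorm(0.975) * se1

round(c(lo80, hi80, lo95, hi95), 2)
```

These agree with the Lo 80, Hi 80, Lo 95, and Hi 95 columns for the first forecast period up to rounding.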

When we plot our forecast, the blue line is our forecasted points! The darker inner band is the 80% prediction interval and the lighter outer band is the wider 95% prediction interval.
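In the forecast table the point forecast stops changing at the third step (184.9345) while the intervals keep getting wider. That is what I would expect from a model with d = 1 and q = 3: once the three past error terms have dropped out, the model has nothing left to adjust the level with. Here is a minimal sketch of the same behavior using base R's `arima` and `predict` on a simulated series (again, `sim` and its coefficients are made up, not the gas data).

```r
set.seed(123)
# Hypothetical ARIMA(0,1,3) series; the ma values are illustrative only.
sim <- arima.sim(model = list(order = c(0, 1, 3), ma = c(0.3, 0.2, 0.4)),
                 n = 500)
fit <- arima(sim, order = c(0, 1, 3))

fc   <- predict(fit, n.ahead = 20)   # list with point forecasts and std. errors
pred <- as.numeric(fc$pred)
se   <- as.numeric(fc$se)

diff(range(pred[3:20]))   # essentially zero: forecasts are flat from step 3 on
all(diff(se) > 0)         # standard errors grow with the horizon
```

The growing standard errors are why the shaded bands fan out in the plot.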

Overview

ARIMA models are an important type of model, so it is good to have gained some background today. It will be useful to learn more about how these models work and to go more in depth with what we can do with them. I think they were especially interesting today because we were able to forecast into the future. I am interested to see how we can account for seasonal variation in these models.