Concepts
Today we discussed ARIMA models, which are used to make predictions for a time series. An ARIMA model combines three components of time series modeling. The autoregressive (AR) component uses past data points to predict current and future data points. This is similar to fitting a linear regression model with past data points as the predictors. The integrated (I) component applies differencing, which models the differences between consecutive measurements instead of the measurements themselves. Differencing removes trend, so the differenced series is approximately stationary with a roughly constant mean. Finally, the moving average (MA) component works much like the autoregressive component, except that the predictors are the error terms from previous data points rather than the data points themselves.
ARIMA models are among the most widely used techniques for forecasting a time series.
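As a rough illustration of the differencing and autoregressive ideas, the sketch below uses only base R. The series is simulated with arima.sim rather than taken from these notes, and the names y, x, n and phi_hat are made up for this example. It shows that diff() returns the differences between consecutive measurements, and that regressing a series on its own previous value gives a slope close to the autoregressive coefficient.

```r
# Differencing: diff() returns consecutive differences.
y <- c(2, 5, 4, 8)
d <- diff(y)            # 5-2, 4-5, 8-4
print(d)

# Autoregression viewed as a regression on the previous value.
# Assumption: a simulated AR(1) series with true coefficient 0.6.
set.seed(1)
n <- 2000
x <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))

# Pair each value x[t] with the value before it, x[t-1].
current  <- x[-1]
previous <- x[-n]
fit <- lm(current ~ previous)
phi_hat <- unname(coef(fit)["previous"])
print(phi_hat)          # should land near 0.6
```

The regression slope is only an approximation of what a dedicated ARIMA fit would estimate, but it makes the "past values as predictors" idea concrete.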
Model Creating
For our example we will use the data set globtemp from the R package astsa. It contains global mean land-ocean temperature deviations in degrees Celsius for the years 1880-2015. After loading astsa we also load the packages quantmod, tseries, timeSeries, forecast and xts. The ARIMA commands used here, auto.arima and forecast, come from the forecast package. The first command we will use is auto.arima(dataset). It automatically selects the orders of the autoregressive, differencing and moving average parts of the model and estimates the coefficients we use to write out the model equation.
library(astsa)
library(quantmod)
library(timeSeries)
library(tseries)
library(forecast)
library(xts)
data(globtemp)
mod <- auto.arima(globtemp)
mod
## Series: globtemp
## ARIMA(1,1,1) with drift
##
## Coefficients:
##          ar1      ma1   drift
##       0.3549  -0.7663  0.0072
## s.e.  0.1314   0.0874  0.0032
##
## sigma^2 estimated as 0.01011: log likelihood=119.88
## AIC=-231.76 AICc=-231.46 BIC=-220.14
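To give a feel for what an automatic order search does, here is a minimal sketch that fits a small grid of candidate models and compares their AIC values. It is not the actual auto.arima algorithm, which is more elaborate. It uses arima from base R's stats package on a simulated series, and the names y_sim, candidates, results and best are hypothetical.

```r
# Assumption: simulated ARIMA(1,1,1) data, not the globtemp series.
set.seed(42)
y_sim <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.35, ma = -0.75),
                   n = 200)

# Candidate (p, q) pairs, with the differencing order fixed at d = 1.
candidates <- expand.grid(p = 0:1, q = 0:1)

# Fit each candidate and record its AIC (NA if a fit fails).
candidates$aic <- mapply(function(p, q) {
  tryCatch(arima(y_sim, order = c(p, 1, q))$aic,
           error = function(e) NA_real_)
}, candidates$p, candidates$q)

# Sort so that the model with the smallest AIC comes first.
results <- candidates[order(candidates$aic), ]
print(results)
best <- results[1, ]
```

A lower AIC indicates a better trade-off between fit and number of parameters, which is the same kind of criterion reported in the model output above.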
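The model output is ARIMA(1,1,1) with drift, so we use a model with autoregressive order 1, differencing order 1 and moving average order 1. Because of the differencing, the equation describes the year-to-year changes $w_t = y_t - y_{t-1}$ rather than $y_t$ itself: $w_t - 0.0072 = 0.3549(w_{t-1} - 0.0072) + \epsilon_t - 0.7663\epsilon_{t-1}$. The drift is the average change per time step, so our drift of 0.0072 tells us the series has an upward trend of about 0.0072 degrees per year.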
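To show how the fitted coefficients turn into a prediction, here is a small sketch of a one-step-ahead forecast for an ARIMA(1,1,1) model with drift, worked out on the differences and then converted back to the original scale. The coefficients are copied from the output above. The function name one_step and the input values (the last two observations and the last error) are made up for illustration and are not taken from globtemp.

```r
# Coefficients copied from the auto.arima output.
ar1   <- 0.3549
ma1   <- -0.7663
drift <- 0.0072

# One-step-ahead forecast for ARIMA(1,1,1) with drift.
#   y_last, y_prev: the two most recent observations
#   e_last:         the most recent error (residual)
one_step <- function(y_last, y_prev, e_last) {
  w_last <- y_last - y_prev                                # latest difference
  w_next <- drift + ar1 * (w_last - drift) + ma1 * e_last  # forecast difference
  y_last + w_next                                          # undo the differencing
}

# Hypothetical inputs, chosen only to illustrate the arithmetic.
pred <- one_step(y_last = 0.80, y_prev = 0.70, e_last = 0.05)
print(pred)
```

The future error term is set to its expected value of zero, which is why only the most recent error appears in the calculation.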
Forecasting
Using our model we can start making predictions for our time series. We use the command forecast(model, h), where h is the number of time units into the future that we want to forecast. I am going to forecast 6 years into the future, so we are predicting the mean land-ocean temperature deviations for 2016-2021.
forecasting <- forecast(mod, h=6)
forecasting
##      Point Forecast     Lo 80     Hi 80     Lo 95    Hi 95
## 2016      0.8031838 0.6743294 0.9320383 0.6061180 1.000250
## 2017      0.7841177 0.6346035 0.9336320 0.5554555 1.012780
## 2018      0.7819961 0.6219774 0.9420147 0.5372687 1.026723
## 2019      0.7858872 0.6181356 0.9536388 0.5293333 1.042441
## 2020      0.7919120 0.6174348 0.9663893 0.5250721 1.058752
## 2021      0.7986940 0.6179620 0.9794260 0.5222882 1.075100
The forecast output gives the point prediction together with its 80% and 95% prediction intervals for each year. We can see the point predictions and their prediction intervals on a plot of the time series using the command plot(forecast.output).
plot(forecasting)
The black line represents the data points we have already observed and the blue line represents our point predictions. The dark blue shaded area shows the 80% prediction interval and the light blue shaded area shows the 95% prediction interval for our point predictions.
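The interval columns in the forecast table can be checked by hand for the first forecast year. This is a minimal sketch, assuming the usual normal-error approximation in which the one-step-ahead forecast standard error is the square root of sigma^2 from the model output. The variable names are made up for this example, and the inputs are copied from the output above.

```r
point_2016 <- 0.8031838   # 2016 point forecast from the forecast table
sigma2     <- 0.01011     # sigma^2 from the model output (rounded)
se <- sqrt(sigma2)        # one-step-ahead forecast standard error

z80 <- qnorm(0.90)        # an 80% interval leaves 10% in each tail
z95 <- qnorm(0.975)       # a 95% interval leaves 2.5% in each tail

# Interval = point forecast plus or minus z times the standard error.
interval80 <- point_2016 + c(-1, 1) * z80 * se
interval95 <- point_2016 + c(-1, 1) * z95 * se
print(round(interval80, 4))
print(round(interval95, 4))
```

The results agree with the 2016 row of the table up to rounding of sigma^2. For later years the intervals are wider because forecast errors accumulate, so this shortcut only applies to the first step.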
Continuing
ARIMA models are important because they allow us to make predictions for a time series. They are a core tool in the analysis of time series. This topic also builds on our understanding of moving average techniques.