ARIMA

Knudson

Stat 333

Today in class we learned about ARIMA AR stands for autoregressive MA stands for moving average and I stands for integrated

This model is used to forecast for the future. Throughout this Learning Log, I will adress more statistical terms and functions. #

##install.packages(c("quantmod", "tseries", "timeSeries", "forecast", "xts"))
library(quantmod)
## Warning: package 'quantmod' was built under R version 3.4.4
## Loading required package: xts
## Warning: package 'xts' was built under R version 3.4.4
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.4.4
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: TTR
## Warning: package 'TTR' was built under R version 3.4.3
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(timeSeries)
## Warning: package 'timeSeries' was built under R version 3.4.4
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 3.4.4
## 
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
## 
##     time<-
library(forecast)
## Warning: package 'forecast' was built under R version 3.4.4
library(xts)

Autoregressive Models

Let’s look at 2 ar(1) models

par(mfrow=c(2,1))

phi = .9

plot(arima.sim(list(order=c(1,0,0),ar=.9) , n=200), ylab = "y", main = (expression(AR(1) ~~~ phi == +.9)))
abline(0,0)

phi = -.9

plot(arima.sim(list(order=c(1,0,0),ar= -.9) , n=200), ylab = "y", main = (expression(AR(1) ~~~ phi == -.9)))
abline(0,0)

Right now, we are simulating random points.

Let’s fit a model using simulated ar(1) data

set.seed(2115)
fakedata <- arima.sim(list(order=c(1,0,0),ar=.9) , n=200)
plot(fakedata)

mod1 <- auto.arima(fakedata)
mod1 
## Series: fakedata 
## ARIMA(1,0,0) with zero mean 
## 
## Coefficients:
##          ar1
##       0.9266
## s.e.  0.0285
## 
## sigma^2 estimated as 1.004:  log likelihood=-284.66
## AIC=573.32   AICc=573.38   BIC=579.92

We see the output from the data It tells us that p is equal to .1 the mean is 0

Then we see coefficients - ar1 parameter is estimated to be .9266 the standard error is .0285

The model it is estimating would be

\(y_{t} = .966*(\)y_{t-1}$) + \(y_{b}\) end \(e_{t}\) ~ N(0,1.004)

P is the number of previous y observations that we depend on

Now let’s forecast using the model we just created

1 day into the future

mycast1 <- forecast(mod1, h=1) # includes prediction intervals
plot(mycast1) 

plot(forecast(mod1, h=10)) # 10 days into the future

Lower and upper bounds for the prediction intervals

Moving Average Models

par(mfrow = c(2,1))
plot(arima.sim(list(order=c(0,0,1), ma=.9), n = 200) , ylab = "y", main = (expression(MA(1) ~~~ theta ==  +.9)))
plot(arima.sim(list(order=c(0,0,1), ma=-.9), n = 200) , ylab = "y", main = (expression(MA(1) ~~~ theta ==  -.9)))

Let’s fit a model using simulated MA(1) data

set.seed(2.718)
ma.data <- arima.sim(list(order=c(0,0,1), ma=.9), n = 200)
mod2 <- auto.arima(ma.data)
mod2
## Series: ma.data 
## ARIMA(1,0,2) with zero mean 
## 
## Coefficients:
##           ar1    ma1     ma2
##       -0.8354  1.723  0.7496
## s.e.   0.2317  0.237  0.2105
## 
## sigma^2 estimated as 1.143:  log likelihood=-296.35
## AIC=600.71   AICc=600.91   BIC=613.9

The seed number is just random (makes sure everyone in the class gets the same random number) We are simulating data here. q is our moving average number, so here we are setting q = 1.

Simulated data does not necessarily mean that the data will look the same as the data it was simulated from.

We can see from the output, we have an ARIMA model with p = 1 and q = 2… This is something that can happen.

Now let’s forecast using the model we just created

plot(forecast(mod2, h=1)) # 1 day into the future

forecast(mod2, h=1)
##     Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
## 201      0.3159447 -1.054301 1.686191 -1.779666 2.411555
plot(forecast(mod2, h=10)) # 10 days into the future

The blue dot is for the point prediction. The blue around it is the prediction interval. The first graph is 1 day into the future and the second one is ten days into the future.

Autoregressive Moving Average Models (ARMA)

Let’s fit a model using simulated MA(1) data

set.seed(1234)
arma.data <- arima.sim(list(order=c(1,0,1), ma=.9, ar = .9), n = 200)
mod3 <- auto.arima(arma.data)
mod3
## Series: arma.data 
## ARIMA(4,0,4) with non-zero mean 
## 
## Coefficients:
##          ar1      ar2     ar3      ar4     ma1     ma2      ma3      ma4
##       1.1663  -0.5789  0.8841  -0.5949  0.7172  0.0696  -0.4784  -0.6420
## s.e.  0.3502   0.5705  0.3427   0.1475  0.3347  0.2136   0.2604   0.1482
##        mean
##       1.989
## s.e.  0.372
## 
## sigma^2 estimated as 0.9662:  log likelihood=-278.04
## AIC=576.09   AICc=577.25   BIC=609.07

Simulate some data p = 1 q= 1

Simulated 200 observations from this

Very complicated output

Sample size of 200 may not be big enough

Now let’s forecast using the model we just created

plot(forecast(mod3, h=1)) # 1 day into the future

forecast(mod3, h=1)
##     Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
## 201       2.299482 1.039752 3.559213 0.3728907 4.226074
plot(forecast(mod3, h=10)) # 10 days into the future

Use the forecase command and say how far out we want to forecast.

Autoregressive Integrated Moving Average Models (ARIMA)

Start with a basic (kind of) example

plot(WWWusage)

mod4 <- auto.arima(WWWusage)
mod4
## Series: WWWusage 
## ARIMA(1,1,1) 
## 
## Coefficients:
##          ar1     ma1
##       0.6504  0.5256
## s.e.  0.0842  0.0896
## 
## sigma^2 estimated as 9.995:  log likelihood=-254.15
## AIC=514.3   AICc=514.55   BIC=522.08
mycast <- forecast(mod4,h=20)
plot(mycast)

Time series of the number of users connected to the internet every minute Basically number of users over time

auto.arima(time series) is how you fit the model For this data, we get ARIMA(1,1,1)

In this example, we are going 20 time periods out.

Money time

Pull data from Yahoo finance

getSymbols('TECHM.NS', from='2012-01-01', to='2015-01-01')
## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
## 
## This message is shown once per session and may be disabled by setting 
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
## 
## WARNING: There have been significant changes to Yahoo Finance data.
## Please see the Warning section of '?getSymbols.yahoo' for details.
## 
## This message is shown once per session and may be disabled by setting
## options("getSymbols.yahoo.warning"=FALSE).
## Warning: TECHM.NS contains missing values. Some functions will not work if
## objects contain missing values in the middle of the series. Consider using
## na.omit(), na.approx(), na.fill(), etc to remove or replace them.
## [1] "TECHM.NS"

Select the relevant close price series

stock_prices <- TECHM.NS[,4]

mod5 <- auto.arima(stock_prices)
mycast5 <- forecast(mod5, h=10)
plot(mycast5)

mod6 <- auto.arima(log(stock_prices))
mycast6 <- forecast(mod6, h=10)
plot(mycast6)

par(mfrow=c(2,1))
plot(mycast5)
plot(mycast6)

We see an upward trend which does make sense.

Here in this example, we are forecasting 10 days out.

The blue is the forecast.

YOUR TURN

Fit ARIMA models with various data sets and do prediction

data("AirPassengers")
 data(globtemp)
## Warning in data(globtemp): data set 'globtemp' not found
data(jj)
## Warning in data(jj): data set 'jj' not found

Auto Regressive Model

I am going to work with these libraries above. I am going to use this data to create a model…

##myMod <- arima(globtemp, c = (1,0,0))
##myMod

Now, I am going to use my model to forecast into the future.

##mycast1 <- forecast(myMod, h=1) # includes prediction intervals
##plot(mycast1) 

##plot(forecast(myMod, h=10)) # 10 days into the future

We can see the blue on the plot are the forecasted data points. The first one is for the next day and the next plot is the forecasted data for the next 10 days.

Moving Average Models

##mod2 <- arima(globtemp , order = c(0,0,1))
##mod2

Here is my moving average model.

##plot(forecast(mod2, h=1)) # 1 day into the future
##forecast(mod2, h=1)
##plot(forecast(mod2, h=10)) # 10 days into the future

ARIMA Model

##mod2 <- arima(globtemp , order = c(1,1,1))
##mod2

Here is my moving average model.

##plot(forecast(mod2, h=1)) # 1 day into the future
##forecast(mod2, h=1)
##plot(forecast(mod2, h=10)) # 10 days into the future