Example

Here is an example of how the likelihood ratio test works. I am going to use the wineind data set to show we can determine which model would be better to use. First, we need to call the following packages:

library(forecast)

## Warning in as.POSIXlt.POSIXct(Sys.time()): unknown timezone 'zone/tz/2018c.
## 1.0/zoneinfo/America/Chicago'

library(quantmod)

## Loading required package: xts

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

## Loading required package: TTR

## Version 0.4-0 included new data defaults. See ?getSymbols.

library(tseries)
library(timeSeries)

## Loading required package: timeDate

## 
## Attaching package: 'timeSeries'

## The following object is masked from 'package:zoo':
## 
##     time<-

library(forecast)
library(xts)

data(wineind)
plot(wineind)

There appears to be a seasonal trend, but there isn’t an apparent overall trend. Now we want to determine what the time span of the seasonal repeating pattern is. We can look at the head of the data set to find this.

head(wineind)

##        Jan   Feb   Mar   Apr   May   Jun
## 1980 15136 16733 20016 17708 18019 19227

The data is collected by month, so S=12.

We can create a an ARIMA model using the auto.arima function on the dataset.

mod1<-auto.arima(wineind)
mod1

## Series: wineind 
## ARIMA(1,1,2)(0,1,1)[12] 
## 
## Coefficients:
##          ar1      ma1     ma2     sma1
##       0.4299  -1.4673  0.5339  -0.6600
## s.e.  0.2984   0.2658  0.2340   0.0799
## 
## sigma^2 estimated as 5399312:  log likelihood=-1497.05
## AIC=3004.1   AICc=3004.48   BIC=3019.57

The resulting model is ARIMA (1,1,2) x (0,1,1) 12. Let’s make another model so that we can compare the two and see which one is better overall. For example, let’s see what happens when we increase the order of the autoregressive component from 1 to 2.

mod2<-arima(wineind, order = c(2,1,2), season = list( order = c( 0,1,1), period=12))
mod2

## 
## Call:
## arima(x = wineind, order = c(2, 1, 2), seasonal = list(order = c(0, 1, 1), period = 12))
## 
## Coefficients:
##           ar1      ar2     ma1      ma2     sma1
##       -1.0640  -0.1981  0.1179  -0.8688  -0.6608
## s.e.   0.0849   0.0876  0.0716   0.0656   0.0827
## 
## sigma^2 estimated as 4995989:  log likelihood = -1494.52,  aic = 3001.05

To get the degrees of freedom for this test, we want to take the difference between the number of parameters for each model. We can do this by finding the length of the coefficient vectors for each model.

(mydf <- length(coef(mod2)) - length(coef(mod1)) )

## [1] 1

Now we want to find the test statistic by using the logLik command

(teststat<- -2*as.numeric(logLik(mod1) - logLik(mod2)))

## [1] 5.053129

This test statistic is going to follow a chi squared distribution, so we want to use the pchisq function in order to give us a p-value.

pchisq(teststat,df=mydf,lower.tail=FALSE)

## [1] 0.0245815

The p-value is pretty small, with a value of 0.0245815. This means that the more complicated model is better, and therefore we want to use the second model created, mod2 instead of the model created with the auto.arima function.

Learning Log 24

Kristen Rutschke

5/1/2018

Likelihood Ratio Test

Example