In addition to methods of model comparison learned previously in this course, such as the F-test, AIC, and the Wald test, one can use the Likeihood Ratio Test to compare models. For this test, the two models being compared need to be nested. The LRT copares the fit of two models using likelihoods.
library(forecast)
## Warning: package 'forecast' was built under R version 3.4.4
library(quantmod)
## Warning: package 'quantmod' was built under R version 3.4.4
## Loading required package: xts
## Warning: package 'xts' was built under R version 3.4.4
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.4.4
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: TTR
## Warning: package 'TTR' was built under R version 3.4.3
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(tseries)
## Warning: package 'tseries' was built under R version 3.4.4
library(timeSeries)
## Warning: package 'timeSeries' was built under R version 3.4.4
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 3.4.4
##
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
##
## time<-
library(forecast)
library(xts)
data("wineind")
To use the likelihood ratio test, we need 2 nested models. For the simpler one, we will use the one given by the auto.arima command:
mod1 <- auto.arima(wineind)
mod1
## Series: wineind
## ARIMA(1,1,2)(0,1,1)[12]
##
## Coefficients:
## ar1 ma1 ma2 sma1
## 0.4299 -1.4673 0.5339 -0.6600
## s.e. 0.2984 0.2658 0.2340 0.0799
##
## sigma^2 estimated as 5399312: log likelihood=-1497.05
## AIC=3004.1 AICc=3004.48 BIC=3019.57
For the second, more complicated model, we will increase the moving average for the seasonal component by one. Hopefully, we will find that this extra term is insignificant.
mod2 <- arima(wineind, order = c(1,1,2), season = list( order = c( 0,1,2), period=12))
mod2
##
## Call:
## arima(x = wineind, order = c(1, 1, 2), seasonal = list(order = c(0, 1, 2), period = 12))
##
## Coefficients:
## ar1 ma1 ma2 sma1 sma2
## -0.8967 0.0884 -0.8975 -0.5840 -0.0935
## s.e. 0.0458 0.0634 0.0586 0.0847 0.0791
##
## sigma^2 estimated as 5132423: log likelihood = -1496.28, aic = 3004.56
To find the degrees of freedom, we take the difference in the number of parameters for each model.
mydf <- length(coef(mod2)) - length(coef(mod1))
Next, we can calculate the test stat using the loglik command to find the loglikelihood.
(teststat<- -2*as.numeric(logLik(mod1) - logLik(mod2)))
## [1] 1.545086
The test stat follows a chi squared distribution, so we can find the p-value in the following wasy
pchisq(teststat, df = mydf, lower.tail = FALSE)
## [1] 0.2138624
The p-value is large, so the additional moving average for the seasonal component is insignificant. Thus, it is not needed and we should stick with the simpler model.