library(forecast)
## Warning: package 'forecast' was built under R version 3.4.4
library(quantmod)
## Warning: package 'quantmod' was built under R version 3.4.4
## Loading required package: xts
## Warning: package 'xts' was built under R version 3.4.4
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.4.4
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: TTR
## Warning: package 'TTR' was built under R version 3.4.3
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(tseries)
## Warning: package 'tseries' was built under R version 3.4.4
library(timeSeries)
## Warning: package 'timeSeries' was built under R version 3.4.4
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 3.4.4
##
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
##
## time<-
library(forecast)
library(xts)
Today we discussed Wilks test that is used to compare two nested models. To do this we need to follow three main steps: 1. fit the two models 2. find the logLiklihood of each model and then find the teststat 3. Use pchisq(teststat) to determine which model to use
In this example I will be using the souvenir data from my previous learning log. We can plot this data as a reminder of what we were using.
souvenir <- scan("http://robjhyndman.com/tsdldata/data/fancy.dat")
souvenir <- ts(souvenir, frequency=12, start=c(1987,1))
plot.ts(souvenir)
Now, we can set our mod1 to the determined model from the previous log entry.
mod1 <- arima(log(souvenir), order = c(1,1,0), season = list( order = c( 0,1,1), period=12))
Next, we can take the logLiklihood which we will use later.
logLik(mod1)
## 'log Lik.' 20.49459 (df=3)
Now, for our second mod we will use 2s instead of 1s for the p,q,P, and Q spots.
mod2 <- arima(log(souvenir), order = c(2,1,0), season = list( order = c( 0,1,2), period=12))
Again, we can take the logLiklihood of our mod2 to compare to our mod1.
logLik(mod2)
## 'log Lik.' 20.79346 (df=5)
For our teststat, we will take the difference of the logliklihoods of the two models and multiply it by two.
teststat <- 2*as.numeric(logLik(mod2)-logLik(mod1))
teststat
## [1] 0.5977317
Now that we have our teststat, we can use pchisq to determine our pvalue. For this test our null hypothesis is that we don’t need p=2 and Q=2.
pchisq(teststat, df=2, lower.tail = FALSE)
## [1] 0.7416589
As you can see, our pvalue is high, thus we can reject the null hypothesis and say that we do not need the more complicated model and instead we should stick with the simpler model. If our pvalue was small we would accept the null hypothesis and change to the more complicated model.
In this example we decided to stick with our simpler model because it was not significantly better to use the more complicated model.