Over the course of the semester, we have learned various methods for model comparison: the F-test, the Wald test, AIC comparison, and the adjusted \(R^2\) method. Today, we learned about another method for model comparison: the Likelihood Ratio Test, also known as the Wilks test.
The Likelihood Ratio Test (LRT) compares the fit of two nested models using their likelihoods. Recall that a model is nested in another model when it is a special case of that model. For the LRT, we compare a simpler model with a more complicated model, and we need evidence to justify using the more complicated one.
The test statistic for the LRT is \(-2[\log L(\text{simpler model}) - \log L(\text{more complicated model})]\), where \(\log L\) denotes the log likelihood. This test statistic follows a chi-square distribution, so it must be a positive number (the more complicated model always fits at least as well, so the difference in brackets is never positive). The degrees of freedom for the chi-square distribution equals the difference in the number of parameters between the two models.
We can think of the LRT as a hypothesis test. The null hypothesis states that the simpler model is adequate. The alternative hypothesis states that there is sufficient evidence that the more complicated model is better than the simpler model. This makes sense because if the more complicated model is better, the test statistic will be large, which means the p-value will be small. Therefore, if the p-value is small, we reject the null hypothesis in favor of the alternative and conclude that the more complicated model is better than the simpler model. Conversely, if the bigger model is not significantly better than the simpler one, we will have a small test statistic and a large p-value, so we do not reject the null hypothesis and we simply use the simpler model.
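To see the mechanics before we touch real data, here is a minimal sketch in R; the log-likelihood values below are made up purely for illustration:
# Hypothetical log likelihoods, for illustration only
ll_simple <- -1510.2   # simpler model
ll_complex <- -1505.8  # more complicated model
df_diff <- 2           # difference in the number of parameters
teststat <- -2 * (ll_simple - ll_complex)           # equals 8.8
pchisq(teststat, df = df_diff, lower.tail = FALSE)  # about 0.012: reject the null here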
We can use R to compare two models with the LRT. See the example below.
For this document we will be working with the wineind data, which includes information about the total wine sales by Australian winemakers, in bottles.
First let’s attach the following libraries that we will need for functions later in this document.
library(quantmod)
library(tseries)
library(timeSeries)
library(forecast)
library(xts)
Now let’s become more familiar with our data. First, load the data. Then use the head command to see whether the data are collected annually, quarterly, or monthly. Finally, plot the data to see whether the series looks stationary.
data(wineind)
head(wineind)
## Jan Feb Mar Apr May Jun
## 1980 15136 16733 20016 17708 18019 19227
plot(wineind)
The output from the head command tells us that our data are collected monthly. Therefore, we know that S=12 for the ARIMA models that we will create later. The plot also shows a strong seasonal pattern, so the series does not look stationary as it stands; the differencing terms (d and D) inside the ARIMA models below will handle this, which lets us keep working with the raw series.
As mentioned, the LRT compares two nested models, so we will need to create two models to compare. We will fit seasonal ARIMA models to our data; since the data are collected monthly, S=12 for each model we create.
For this document we will create two models to compare. Before we create our models, though, let’s use the auto.arima command to view the best time series model for our data. We will not use the model that auto.arima gives us in our LRT, because we already know that it is the best model for our data. However, it gives us a baseline for creating our two comparison models: we will use the same values of \(d\) and \(D\) in our models, but change the values of \(p\), \(q\), \(P\), and \(Q\).
auto.arima(wineind)
## Series: wineind
## ARIMA(1,1,2)(0,1,1)[12]
##
## Coefficients:
## ar1 ma1 ma2 sma1
## 0.4299 -1.4673 0.5339 -0.6600
## s.e. 0.2984 0.2658 0.2340 0.0799
##
## sigma^2 estimated as 5399312: log likelihood=-1497.05
## AIC=3004.1 AICc=3004.48 BIC=3019.57
From our output we see that the best model has the following values: p=1, d=1, q=2, P=0, D=1, Q=1, and S=12.
Now, let’s use this information to create two models. We will name our models mod1 and mod2.
mod1 <- arima(wineind, order = c(1, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
mod1
##
## Call:
## arima(x = wineind, order = c(1, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
##
## Coefficients:
## ar1 ma1 sma1
## -0.1078 -0.8850 -0.6435
## s.e. 0.0853 0.0346 0.0831
##
## sigma^2 estimated as 5363686: log likelihood = -1498.37, aic = 3004.75
mod2 <- arima(wineind, order = c(2, 1, 2), seasonal = list(order = c(0, 1, 1), period = 12))
mod2
##
## Call:
## arima(x = wineind, order = c(2, 1, 2), seasonal = list(order = c(0, 1, 1), period = 12))
##
## Coefficients:
## ar1 ar2 ma1 ma2 sma1
## -1.0640 -0.1981 0.1179 -0.8688 -0.6608
## s.e. 0.0849 0.0876 0.0716 0.0656 0.0827
##
## sigma^2 estimated as 4995989: log likelihood = -1494.52, aic = 3001.05
From the output for each of our models we can confirm that mod1 and mod2 are nested. We confirm this by checking that each coefficient of the simpler model is included in the more complicated model: mod1 contains ar1, ma1, and sma1, all of which appear in mod2, so mod1 is the simpler model and mod2 is the more complicated one.
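One quick way to run this check in R (a convenience check; it works here because both models use the same differencing) is to compare the coefficient names of the two fitted models:
names(coef(mod1))
## [1] "ar1"  "ma1"  "sma1"
names(coef(mod2))
## [1] "ar1"  "ar2"  "ma1"  "ma2"  "sma1"
all(names(coef(mod1)) %in% names(coef(mod2)))
## [1] TRUE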
Now that we’ve confirmed that our models are nested, we can use the LRT to see which model is a better fit for our data.
We will use a significance level of \({\alpha} = 0.05\).
First, let’s determine the degrees of freedom for the chi-square distribution of our test statistic. Since our more complicated model has five parameters and our simpler model has three, our degrees of freedom is 5 - 3 = 2. We can check this with R: passing the output of the coef command into the length command counts the parameters in each model, and our degrees of freedom is the difference of these two counts.
mydf <- length(coef(mod2)) - length(coef(mod1))
mydf
## [1] 2
R confirms that our chi-square distribution has 2 degrees of freedom.
Now, let’s create our test statistic. Remember, the test statistic equals \(-2[\log L(\text{simpler model}) - \log L(\text{more complicated model})]\). Let’s plug the necessary pieces into our R code. The logLik command gives us the log likelihood (along with its degrees of freedom) for each of our models, and wrapping the difference in as.numeric strips those attributes, leaving the plain numeric value.
teststat<- -2*as.numeric(logLik(mod1) - logLik(mod2))
teststat
## [1] 7.695973
From our output we see that our test statistic is 7.695973. This is a positive number, which is what we need for the chi-square distribution, as previously mentioned.
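As a sanity check, plugging the rounded log likelihoods from the model printouts into the formula gives \(-2[(-1498.37) - (-1494.52)] = 7.70\), which matches our computed statistic up to rounding.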
Next, let’s calculate our p-value using the pchisq function. Enter the test statistic, the degrees of freedom for our distribution, and the argument lower.tail=FALSE. We use this last argument because our p-value is the area in the upper tail of the distribution.
pchisq(teststat, df = mydf, lower.tail = FALSE)
## [1] 0.02132263
The p-value for our test statistic is 0.02132263. This is less than our significance level of 0.05, so we reject the null hypothesis in favor of the alternative. Notice, however, that if we had chosen a smaller significance level, such as \({\alpha} = 0.01\), we would not reject the null hypothesis. In our case, though, we can conclude that there is sufficient evidence that the more complicated model is better than the simpler model.
It’s important to note that this test does not prove that the winning model is the best possible model for the data. It only tells us which of the two models we are comparing fits better.
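To recap, the steps above can be bundled into a small helper function. This is just a sketch based on what we did in this document (lrt_arima is a name we are inventing here, not a built-in R function):
lrt_arima <- function(simple, complex) {
  # degrees of freedom: difference in the number of parameters
  mydf <- length(coef(complex)) - length(coef(simple))
  # LRT statistic: -2 * [loglik(simpler) - loglik(more complicated)]
  teststat <- -2 * as.numeric(logLik(simple) - logLik(complex))
  # p-value from the upper tail of the chi-square distribution
  pval <- pchisq(teststat, df = mydf, lower.tail = FALSE)
  c(statistic = teststat, df = mydf, p.value = pval)
}
lrt_arima(mod1, mod2)  # reproduces the statistic and p-value computed above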