Today we covered the Likelihood Ratio Test, another method we can use to compare models. This test uses the log likelihood as a measure of model fit. Previously we covered the F-test, T-test, and AIC. The likelihood ratio test is sometimes also called the Wilks test.
The likelihood ratio test is similar to the F-test in that we can test groups of predictors. We need to have nested models to use the likelihood ratio test, just like we needed with the F-test. The test statistic is calculated by -2(Log_likelihood(model1) - Log_likelihood(model2)) and follows a chi-square distribution with degrees of freedom equal to the difference in the number of parameters between the two models. The test statistic must be greater than 0; if we switch the order of the log likelihoods when calculating our test statistic, the absolute value of the test statistic does not change.
If the p-value of this test is small, then we can conclude the more complicated model has a better fit. However, if we get a large p-value, then the test tells us the simpler model fits better. It’s important to note that the likelihood ratio test tells us which of the two models we are comparing is the better model; there can still be better models for our data. Model comparison is not limited to linear regression. We can use the likelihood ratio test to compare two time series models.
library(forecast)
## Warning: package 'forecast' was built under R version 3.4.4
library(quantmod)
## Warning: package 'quantmod' was built under R version 3.4.4
## Loading required package: xts
## Warning: package 'xts' was built under R version 3.4.4
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.4.4
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: TTR
## Warning: package 'TTR' was built under R version 3.4.3
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(tseries)
## Warning: package 'tseries' was built under R version 3.4.4
library(timeSeries)
## Warning: package 'timeSeries' was built under R version 3.4.4
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 3.4.4
##
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
##
## time<-
library(forecast)
library(xts)
births <- scan("http://robjhyndman.com/tsdldata/data/nybirths.dat")
births <- births[-c(1:24)]
births <- ts(births, frequency=12, start=c(1946,1))
SimplerMod <- arima(births, order = c(1,0,1), season = list( order = c( 1,1,1), period=12))
Complicated <- arima(births, order = c(2,0,2), season = list( order = c( 1,1,1), period=12))
We have two models we want to compare, but first let’s make sure they are nested models. We should verify that one of the models has every variable of the other one.
SimplerMod
##
## Call:
## arima(x = births, order = c(1, 0, 1), seasonal = list(order = c(1, 1, 1), period = 12))
##
## Coefficients:
## ar1 ma1 sar1 sma1
## 0.9884 -0.1564 -0.2199 -0.8820
## s.e. 0.0287 0.1747 0.1004 0.1498
##
## sigma^2 estimated as 0.3924: log likelihood = -136.81, aic = 283.61
Complicated
##
## Call:
## arima(x = births, order = c(2, 0, 2), seasonal = list(order = c(1, 1, 1), period = 12))
##
## Coefficients:
## ar1 ar2 ma1 ma2 sar1 sma1
## 1.2898 -0.2902 -0.5257 -0.2915 -0.2450 -0.8684
## s.e. 0.1738 0.1735 0.1660 0.1090 0.1029 0.1309
##
## sigma^2 estimated as 0.3295: log likelihood = -125.31, aic = 264.62
Here we can see the simpler model is a subset of the complex model; the complex model has a second component for both moving average and autoregression, while all other components are shared with the simpler model. Now we can use the likelihood ratio test. We can use the logLik function to compute the log likelihoods for both models.
testStat <- -2*(logLik(SimplerMod) - logLik(Complicated))
testStat
## 'log Lik.' 22.98939 (df=5)
Our test statistic is 22.98939.
Warning: the output shows degrees of freedom as 5, but this is NOT the degrees of freedom for our test statistic’s distribution. To get the distribution’s degrees of freedom, we look at how many parameters are in each model and take the difference.
Our complex model has 6 parameters, while our simpler model has 4. Thus, the degrees of freedom is 2.
We have our test statistic and we know it follows a chi-squared distribution with 2 degrees of freedom. All that remains is to calculate the p-value using the chisq function we used a long time ago. Recall that the arguments are the test stat, degrees of freedom as df=, and lower.tail = FALSE to compute the upper tail probability.
pchisq(22.98939, df = 2, lower.tail = FALSE)
## [1] 1.018398e-05
Our p-value is very small, so we can conclude the complex time series model for births in New York has better fit. Now let’s try comparing two models for the airmiles time series, which measures the number of revenue passenger miles flown each year from 1937 to 1960. There is no seasonality to this time series.
SimpleMiles <- arima(airmiles, order = c(0,1,1))
ComplexMiles <- arima(airmiles, order = c(0, 2, 2))
MileTestStat <- -2*(logLik(SimpleMiles) - logLik(ComplexMiles))
MileTestStat
## 'log Lik.' 32.54132 (df=2)
Our test stat is 32.54132. Since we have 4 parameters in the complex model and 2 parameters in the simpler model, our test stat follows a chi square distribution with 2 degrees of freedom.
pchisq(32.54132, df = 2, lower.tail = FALSE)
## [1] 8.585036e-08
The resulting p-value is small, so our complicated model in this case, with p, d, and q as 0, 2, and 2 respectively, is the better model. As noted before, our test result is only telling us this is the better model from the two models we were comparing. Let’s compare this to the model created by auto.arima.
autoMiles <- auto.arima(airmiles)
autoMiles
## Series: airmiles
## ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.7031
## s.e. 0.1273
##
## sigma^2 estimated as 1234546: log likelihood=-185.33
## AIC=374.67 AICc=375.3 BIC=376.85
The automated model is nested in our complex model, so we can perform the likelihood ratio test.
Miles2Stat <- -2*(logLik(autoMiles) - logLik(ComplexMiles))
Miles2Stat
## 'log Lik.' 0.8224925 (df=2)
Our test stat is 0.8224925 and follows the chi square distribution with 1 degree of freedom. Calculating the p-value:
pchisq(0.8224925, df = 1, lower.tail = FALSE)
## [1] 0.3644524
We see a large p-value, so we conclude the simpler model (i.e. the auto.arima model) in this comparison is the better of the two. We could not know this from the first test we did for the airmiles dataset. The process of finding the best model for the data we have still requires some trial and error.