Likelihood Ratio Test (Wilks Test)

In Tuesday’s class we started talking about Likelihood Ratio Tests. In this test, we compare two nested models in terms of model fit. A nested model means that one model is a special case of another; for example, \({y} = {\beta_0}\) is a nested model of \({y} = {\beta_0} + {\beta_1}x\).

To get our test statistic for thsi test, we take the likelihood of one model and divide it by the likelihood of the other, and then take the log of it. In this test statistic, the likelihood of the complicated model should be larger than the likelihood of the simpler model.

The distribution of this test stat is Chi Squared. The degrees of freedom we use in this test is the difference in the number of parameters of the two models.

To obtain our p-value, we take area to the right of our test statistic because the p-value in this distribution is the probability of getting test statistic as extreme or more extreme than ours.

The steps for this test are:

Fit model
Loglikelihood to get test stat
Obtain p-value with pchisq(teststat)

If we get a test stat close to 0, this means that our complicated and simpler model are pretty similar, so we should stick with the simpler model.

Example

As an example, we will use the iris dataset. We create two models with sepal length as the response, but one model will just have petal length as the predictor and the other will have petal length and species.

data(iris)
attach(iris)
mymod <- lm(Sepal.Length ~ Petal.Length)
mymod2 <- lm(Sepal.Length ~ Petal.Length + Species)
summary(mymod2)

## 
## Call:
## lm(formula = Sepal.Length ~ Petal.Length + Species)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.75310 -0.23142 -0.00081  0.23085  1.03100 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        3.68353    0.10610  34.719  < 2e-16 ***
## Petal.Length       0.90456    0.06479  13.962  < 2e-16 ***
## Speciesversicolor -1.60097    0.19347  -8.275 7.37e-14 ***
## Speciesvirginica  -2.11767    0.27346  -7.744 1.48e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.338 on 146 degrees of freedom
## Multiple R-squared:  0.8367, Adjusted R-squared:  0.8334 
## F-statistic: 249.4 on 3 and 146 DF,  p-value: < 2.2e-16

anova(mymod, mymod2)

## Analysis of Variance Table
## 
## Model 1: Sepal.Length ~ Petal.Length
## Model 2: Sepal.Length ~ Petal.Length + Species
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    148 24.525                                  
## 2    146 16.682  2    7.8434 34.323 6.053e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now that we’ve looked a bit at the comparison of the two models with the anova function, we obtain the log likelihood of each model using the logLik() function:

logLik(mymod)

## 'log Lik.' -77.02021 (df=3)

logLik(mymod2)

## 'log Lik.' -48.11637 (df=5)

We can see that the log likelihood for our simpler model is about -77.02 and the log likelihood of the more complex model is about -48.17.

We can now obtain our test stat to run our test. For this test, our null hypothesis is that we don’t need species in our model. Our alternative hypothesis is that we do need species in our model.

In this case, there are 2 parameters in our simpler model and 4 in our complex, so we have 2 degrees of freedom.

We obtain our test stat with the following equation:

teststat <- -2*(as.numeric(logLik(mymod))-as.numeric(logLik(mymod2)))
teststat

## [1] 57.80768

We now obtain our p-value:

pchisq(teststat, df = 2, lower.tail = FALSE)

## [1] 2.800409e-13

Because we get such a small p-value, we reject the null in favor of the alternative. We have really strong evidence that even after having petal length in the model, we should still have species in the model.

Time Series Example

We will now run the same type of test on the time series models we created in Learning Log 23. We follow all the same steps as above but with the time series models:

library(forecast)

## Warning: package 'forecast' was built under R version 3.4.4

data(wineind)
mod1 <- arima(wineind, order = c(0,0,0), season = list( order = c( 1,1,1), period=12))
mod2 <- arima(wineind, order = c(0,0,0), season = list( order = c( 0,1,1), period=12))
logLik(mod1)

## 'log Lik.' -1520.259 (df=3)

logLik(mod2)

## 'log Lik.' -1520.341 (df=2)

teststat <- 2*(as.numeric(logLik(mod1))-as.numeric(logLik(mod2)))
teststat

## [1] 0.1644043

pchisq(teststat,  df = 1, lower.tail = FALSE)

## [1] 0.6851336

Because we have a very high p-value, our test is telling us to stick with a simpler model. That is, we should stick with our model where P = 0, D = 1, Q = 1, versus the model where P = 1, D = 1, Q =1.

If I were to interpret these results to the wine salesman, I would tell them that this model definitely has a seasonal component. We can use our model to predict into the future:

par(mfrow=c(2,1))
forecast(mod2, h = 12)

##          Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
## Sep 1994       23614.94 20335.48 26894.39 18599.448 28630.42
## Oct 1994       27679.00 24399.54 30958.45 22663.509 32694.48
## Nov 1994       32225.64 28946.19 35505.09 27210.152 37241.13
## Dec 1994       37271.66 33992.21 40551.11 32256.171 42287.15
## Jan 1995       14803.95 11524.49 18083.40  9788.458 19819.43
## Feb 1995       21919.62 18640.17 25199.07 16904.134 26935.11
## Mar 1995       23741.35 20461.90 27020.81 18725.868 28756.84
## Apr 1995       26136.40 22856.95 29415.85 21120.910 31151.88
## May 1995       24132.74 20853.28 27412.19 19117.248 29148.22
## Jun 1995       26534.20 23254.75 29813.65 21518.712 31549.69
## Jul 1995       29636.48 26357.02 32915.93 24620.989 34651.96
## Aug 1995       25267.08 21987.63 28546.53 20251.594 30282.57

Looking at this prediction table, we could give a wine salesman some advice about the future. For example, for August of 1995 (a year beyond this dataset), our suggestion would be to stock enough wine for about 25267 sales. To be safe, we would say that we are 80% confident that the total wine sales would be between 21987.63 and 28546.53, so the winesalesman would potentially need to stock even more than our point prediction.

Learning Log Day 24

Cara Christianson

May 1, 2018

Likelihood Ratio Test (Wilks Test)

Example

Time Series Example