In class on Tuesday, we went over how to do some tests that we didn’t cover throughout the rest of the class, especially using the Likelihood Ratio test.
First we worked in the context of linear regression, using the petal length data.
simpler <- lm(Petal.Length ~ Sepal.Length, data =iris)
biggermod <- lm(Petal.Length ~ Sepal.Length + Species, data =iris)
summary(biggermod) #why can't we use this to decide?
##
## Call:
## lm(formula = Petal.Length ~ Sepal.Length + Species, data = iris)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.76390 -0.17875 0.00716 0.17461 0.79954
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.70234 0.23013 -7.397 1.01e-11 ***
## Sepal.Length 0.63211 0.04527 13.962 < 2e-16 ***
## Speciesversicolor 2.21014 0.07047 31.362 < 2e-16 ***
## Speciesvirginica 3.09000 0.09123 33.870 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2826 on 146 degrees of freedom
## Multiple R-squared: 0.9749, Adjusted R-squared: 0.9744
## F-statistic: 1890 on 3 and 146 DF, p-value: < 2.2e-16
We can’t use the individual t-tests to decide whether or not to include species in our data because species is in some way included in the intercept.
# Could use the F-test to decide
anova(biggermod, simpler)
## Analysis of Variance Table
##
## Model 1: Petal.Length ~ Sepal.Length + Species
## Model 2: Petal.Length ~ Sepal.Length
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 146 11.657
## 2 148 111.459 -2 -99.802 624.99 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The partial f-test returns another super low p-value. I consider this method trustworthy, but let’s see what the Wilk’s test does.
# Or could use the LRT (Wilks test)
logLik(simpler)
## 'log Lik.' -190.5675 (df=3)
logLik(biggermod)
## 'log Lik.' -21.23709 (df=5)
mydf <- length(coef(biggermod)) - length(coef(simpler))
The logLik teststatistic is larger for the more complicated model. Should we use the more complicated model?
(teststat<- -2*as.numeric(logLik(simpler) - logLik(biggermod)))
## [1] 338.6608
pchisq(teststat, df = mydf, lower.tail = FALSE)
## [1] 2.888934e-74
#reject the null: the more complicated model (biggmod) fits better
The low p-val says that we should reject the null hypothesis and thus the simpler model. The bigger model fits better.
This Wilk’s test can be used in time series context too.
#install.packages("forecast")
#install.packages("uantmod")
#install.packages("tseries")
#install.packages("timeSeries")
#install.packages("xts")
library(forecast)
library(quantmod)
## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: TTR
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(tseries)
library(timeSeries)
## Loading required package: timeDate
##
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
##
## time<-
library(xts)
#### NY births data
births <- scan("http://robjhyndman.com/tsdldata/data/nybirths.dat")
births <- births[-c(1:24)]
births <- ts(births, frequency=12, start=c(1946,1))
# In TSDecomp.R we decided to use an ARIMA model with
# p>=1, d=0, q>=1 (nonseasonal)
# P>=1, D>=1, Q>=1 (seasonal)
# Fit models
mod1 <- arima(births, order = c(1,0,1), season = list( order = c( 1,1,1), period=12))
mod2 <- arima(births, order = c(2,0,2), season = list( order = c( 1,1,1), period=12))
Using the results from the last Learning Log, we created a bigger model to compare the built model too. We should expect to see that our model is better because we did a bunch of fun stuff to get to that model.
mod1
##
## Call:
## arima(x = births, order = c(1, 0, 1), seasonal = list(order = c(1, 1, 1), period = 12))
##
## Coefficients:
## ar1 ma1 sar1 sma1
## 0.9884 -0.1564 -0.2199 -0.8820
## s.e. 0.0287 0.1747 0.1004 0.1498
##
## sigma^2 estimated as 0.3924: log likelihood = -136.81, aic = 283.61
mod2
##
## Call:
## arima(x = births, order = c(2, 0, 2), seasonal = list(order = c(1, 1, 1), period = 12))
##
## Coefficients:
## ar1 ar2 ma1 ma2 sar1 sma1
## 1.2898 -0.2902 -0.5257 -0.2915 -0.2450 -0.8684
## s.e. 0.1738 0.1735 0.1660 0.1090 0.1029 0.1309
##
## sigma^2 estimated as 0.3295: log likelihood = -125.31, aic = 264.62
The AIC is minimized in the more complicated model (by more than 10!).
mydf <- length(coef(mod2)) - length(coef(mod1))
(teststat<- -2*as.numeric(logLik(mod1) - logLik(mod2)))
## [1] 22.98939
pchisq(teststat, df = mydf, lower.tail = FALSE)
## [1] 1.018396e-05
#reject the null: the more complicated model (mod2) fits better
Well, the results of our last Learning Log were beaten by the (2, 0, 2) ARIMA model that we just created today.
Let’s try this with new data!
data("AirPassengers")
AirPassengers <- AirPassengers[-c(1:24)]
AirPassengers <- ts(AirPassengers, frequency=12, start=c(1949,1))
Mod1 <- arima(AirPassengers, order = c(1,0,1), season = list( order = c( 1,1,1), period=12))
Mod2 <- arima(AirPassengers, order = c(2,0,2), season = list( order = c( 1,1,1), period=12))
## Warning in arima(AirPassengers, order = c(2, 0, 2), season = list(order =
## c(1, : possible convergence problem: optim gave code = 1
Mod1
##
## Call:
## arima(x = AirPassengers, order = c(1, 0, 1), seasonal = list(order = c(1, 1,
## 1), period = 12))
##
## Coefficients:
## ar1 ma1 sar1 sma1
## 0.9743 -0.3106 -0.9824 0.9394
## s.e. 0.0196 0.0987 0.0905 0.1742
##
## sigma^2 estimated as 145.8: log likelihood = -425.03, aic = 860.06
Mod2
##
## Call:
## arima(x = AirPassengers, order = c(2, 0, 2), seasonal = list(order = c(1, 1,
## 1), period = 12))
##
## Coefficients:
## ar1 ar2 ma1 ma2 sar1 sma1
## 0.5986 0.3592 0.0613 -0.0331 -0.9899 0.9556
## s.e. 0.5354 0.5226 0.5336 0.2159 0.0790 0.1859
##
## sigma^2 estimated as 144.3: log likelihood = -424.78, aic = 863.56
The AIC is minimized in the simpler model, which is perfect because we prefer the simpler model.
Mydf <- length(coef(Mod2)) - length(coef(Mod1))
(teststat2<- -2*as.numeric(logLik(Mod1) - logLik(Mod2)))
## [1] 0.4919102
pchisq(teststat2, df = Mydf, lower.tail = FALSE)
## [1] 0.7819573
We do not reject the null hypothesis because the test stat is small and the p-val is large.