Today in class we covered Likelihood Ratio Tests. This is a method of comparison for two models where one is nested in the other. A small pvalue in the LRT means that we have enough evidence to use the more complicated model. We can look at an example to demonstrate the LRT.
library(forecast)
library(quantmod)
library(tseries)
library(timeSeries)
library(forecast)
library(xts)
data("wineind")
plot.ts(wineind)
First, we need to load in our necessary packages to use the functions we want. Then we plotted the time series data to get a general look at the trends in our data. We can see that there is a general trend and a seasonal trend.
Next, we need to make our first model, which I did in my last learning log. We will compare the model i created by hand and the auto.arima function model. Let’s look at the model I made by hand first.
winemod<- arima(wineind, order = c(0,0,1), season = list( order = c( 1,1,1), period=12))
winemod
##
## Call:
## arima(x = wineind, order = c(0, 0, 1), seasonal = list(order = c(1, 1, 1), period = 12))
##
## Coefficients:
## ma1 sar1 sma1
## 0.1837 0.4262 -0.7814
## s.e. 0.0802 0.1814 0.1625
##
## sigma^2 estimated as 6261431: log likelihood = -1518.02, aic = 3044.03
summary(winemod)
##
## Call:
## arima(x = wineind, order = c(0, 0, 1), seasonal = list(order = c(1, 1, 1), period = 12))
##
## Coefficients:
## ma1 sar1 sma1
## 0.1837 0.4262 -0.7814
## s.e. 0.0802 0.1814 0.1625
##
## sigma^2 estimated as 6261431: log likelihood = -1518.02, aic = 3044.03
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 604.2352 2415.481 1782.329 1.761336 7.03485 0.3687278
## ACF1
## Training set -0.06496288
Now let’s look at the model we get using the auto.arima function.
winemod2<- auto.arima(wineind)
winemod2
## Series: wineind
## ARIMA(1,1,2)(0,1,1)[12]
##
## Coefficients:
## ar1 ma1 ma2 sma1
## 0.4299 -1.4673 0.5339 -0.6600
## s.e. 0.2984 0.2658 0.2340 0.0799
##
## sigma^2 estimated as 5399312: log likelihood=-1497.05
## AIC=3004.1 AICc=3004.48 BIC=3019.57
summary(winemod2)
## Series: wineind
## ARIMA(1,1,2)(0,1,1)[12]
##
## Coefficients:
## ar1 ma1 ma2 sma1
## 0.4299 -1.4673 0.5339 -0.6600
## s.e. 0.2984 0.2658 0.2340 0.0799
##
## sigma^2 estimated as 5399312: log likelihood=-1497.05
## AIC=3004.1 AICc=3004.48 BIC=3019.57
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -170.7415 2208.571 1619.979 -1.403046 6.55824 0.8234203
## ACF1
## Training set 0.01025396
Once we have the two models we want to compare, we need to get our test stat and our degrees of freedom.
mydf <- length(coef(winemod2)) - length(coef(winemod))
mydf
## [1] 1
(teststat<- -2*as.numeric(logLik(winemod) - logLik(winemod2)))
## [1] 41.93028
Once we have both of these we can use the pchisq function to get our pvalue.
pchisq(teststat, df=1, lower.tail=FALSE)
## [1] 9.45866e-11
This small pvalue means that we reject the null and winemod2 is a better fit than winemod. So, the auto.arima function created a better model than the one I created by hand. Now that we have a better model, we can use it to forecast.
If we want to get a prediction for September of 1994
forecast(winemod2, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1994 25248.61 22270.74 28226.49 20694.35 29802.88
So it we can estimate that in September 1994, Australia will sell 25,48.6 bottles of wine. If we wanted to look at the general trend for the next 10 months. To be safe, a wine sellers in Australia should stock between 20,694 and 29,803 bottles of wine.
plot(forecast(winemod2, h=10))
We can see that in January, wine sellers could expect to see a drastic drop in wine sales.