Learning Log 23

Today in class we talked about seasonal ARIMA. Seasonal ARIMA is ARIMA with both a seasonal component and a non-seasonal component. It is written (p,d,q)x(P,D,Q)_S, where S is the length of the seasonal cycle (for example, S=12 for monthly data with a yearly pattern), the uppercase P, D, Q are the seasonal orders, and the lowercase p, d, q are the non-seasonal orders. When doing seasonal ARIMA there are 5 steps.

1. Plot the data to see seasonal trends.
2. Use the diff function to identify D and d.
3. Examine the ACF and PACF to choose preliminary values for P, p, Q, q.
4. Estimate the model.
5. Run diagnostics and hypothesis tests, and use AIC to compare different models.
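
The AIC comparison in the last step can be sketched as a small helper. This is only an illustration: compare_sarima and its arguments are hypothetical names that were not part of the class material, and it relies only on base R's arima(). A specification that fails to fit is reported with an AIC of NA rather than stopping the comparison.

```r
# Hypothetical helper (not from class): fit several seasonal ARIMA
# specifications to the same series and rank them by AIC (lower is better).
compare_sarima <- function(y, specs, period = 12) {
  aics <- vapply(specs, function(s) {
    fit <- tryCatch(
      arima(y, order = s$order,
            seasonal = list(order = s$seasonal, period = period)),
      error = function(e) NULL   # a failed fit is recorded as NA below
    )
    if (is.null(fit)) NA_real_ else fit$aic
  }, numeric(1))
  out <- data.frame(model = names(specs), aic = unname(aics))
  out[order(out$aic), ]          # smallest AIC first, NA last
}
```

Once the logged wine data is available, a call such as compare_sarima(log(wineind), list(m1 = list(order = c(0,0,1), seasonal = c(1,1,1)), m2 = list(order = c(0,0,1), seasonal = c(0,1,1)))) would list the candidates from lowest to highest AIC. AIC values are only comparable between models fitted with the same d and D.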

library(forecast)    # provides the wineind data set
library(quantmod)
library(tseries)     # provides adf.test()
library(timeSeries)
library(xts)
data("wineind")      # monthly Australian wine sales
plot.ts(log(wineind))

First, I log-transformed the data because this gave the series a more constant variance.

Looking at the first graph, we can see the seasonal trends. If we look at the second graph we can see the non-seasonal trend of the data.

diffseasonal<-diff(log(wineind),lag=12)
plot.ts(diffseasonal)

adf.test(diffseasonal)

## Warning in adf.test(diffseasonal): p-value smaller than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  diffseasonal
## Dickey-Fuller = -4.5177, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary

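The first graph shows the seasonal differences plotted over time. Here we want to see a roughly constant mean and variance. The second output is the ADF test, whose null hypothesis is that the series is non-stationary (has a unit root). With a p-value of at most .01 (the warning says the true p-value is smaller than the printed one), we reject the null and conclude that the seasonally differenced data is stationary. This means that once we account for the seasonal variation, no further non-seasonal differencing is needed. Because we used the diff function once, at lag 12, our D=1 and d=0.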
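
To make it concrete what diff(..., lag=12) does, here is a minimal sketch on a made-up series (the numbers in pattern are invented for illustration and are not the wine data). Each value has the value from 12 months earlier subtracted from it, so a repeating yearly pattern cancels out and only the year-over-year change is left.

```r
# Toy monthly series (invented numbers): a fixed yearly pattern
# plus a level that rises by 1 each year, over 3 years.
pattern <- c(5, 3, 4, 6, 8, 9, 12, 11, 7, 6, 4, 10)
y <- ts(rep(pattern, 3) + rep(0:2, each = 12), frequency = 12)

# Seasonal difference: y[t] - y[t - 12]
d_seasonal <- diff(y, lag = 12)

# The same thing computed by hand
manual <- y[13:36] - y[1:24]

all(as.numeric(d_seasonal) == manual)  # the two computations agree
length(d_seasonal)                     # the first 12 observations are lost
```

In this toy case every seasonal difference equals 1, because the yearly pattern is removed and only the yearly increase remains.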

acf(diffseasonal)

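The ACF suggests our MA orders. There is one spike at a low lag, so our q=1. There is one spike at lag 1 and one at almost lag 2 (R labels the lag axis in years for monthly data, so these are the seasonal lags near 12 and 24 months), so Q>=1.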

pacf(diffseasonal)

The PACF suggests our AR orders. There are no spikes at low lags, so our p=0. There is one spike at lag 1 (12 months, since the lag axis is in years) and one at almost lag 2, so P>=1.
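
Reading the seasonal lags off these plots depends on how R labels the lag axis. The sketch below uses simulated noise (not the wine data) purely to show that for a monthly ts object, acf() reports lags in years, so lag 1 on the plot is 12 months.

```r
set.seed(1)
# 10 years of simulated monthly white noise, only used to inspect lag units
x <- ts(rnorm(120), frequency = 12)

a <- acf(x, lag.max = 24, plot = FALSE)

lags_years  <- as.numeric(a$lag)                # 0, 1/12, 2/12, ..., 2
lags_months <- round(lags_years * frequency(x)) # 0, 1, 2, ..., 24

head(cbind(lags_years, lags_months))
```

So a spike at 1 on the horizontal axis is the first seasonal lag, while spikes just to the right of 0 are the low, non-seasonal lags.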

winemod <- arima(log(wineind), order = c(0,0,1), seasonal = list(order = c(1,1,1), period = 12))
winemod

## 
## Call:
## arima(x = log(wineind), order = c(0, 0, 1), seasonal = list(order = c(1, 1, 
##     1), period = 12))
## 
## Coefficients:
##          ma1    sar1     sma1
##       0.2117  0.4461  -0.7912
## s.e.  0.0788  0.1743   0.1553
## 
## sigma^2 estimated as 0.009787:  log likelihood = 144.66,  aic = -281.32
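
Written out with the backshift operator B (where B y_t = y_{t-1}), the fitted model can be sketched as below. This is my own reconstruction, assuming the sign convention of R's arima(), in which MA terms enter with a plus sign; some textbooks use the opposite sign. The symbols y_t (for log(wineind)) and \varepsilon_t (for the white-noise error) are notation introduced for this sketch.

```latex
% General form with seasonal period S:
\phi(B)\,\Phi(B^{S})\,(1-B)^{d}\,(1-B^{S})^{D}\,y_t
  = \theta(B)\,\Theta(B^{S})\,\varepsilon_t

% With p=0, d=0, q=1, P=1, D=1, Q=1, S=12 and the estimates above:
(1 - 0.4461\,B^{12})\,(1 - B^{12})\,y_t
  = (1 + 0.2117\,B)\,(1 - 0.7912\,B^{12})\,\varepsilon_t
```

The factor (1 - B^{12}) is the seasonal difference taken earlier with diff(..., lag=12), and there is no (1 - B) factor because d=0.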

summary(winemod)

## 
## Call:
## arima(x = log(wineind), order = c(0, 0, 1), seasonal = list(order = c(1, 1, 
##     1), period = 12))
## 
## Coefficients:
##          ma1    sar1     sma1
##       0.2117  0.4461  -0.7912
## s.e.  0.0788  0.1743   0.1553
## 
## sigma^2 estimated as 0.009787:  log likelihood = 144.66,  aic = -281.32
## 
## Training set error measures:
##                      ME       RMSE        MAE       MPE      MAPE
## Training set 0.02488139 0.09553026 0.07183153 0.2399747 0.7098799
##                   MASE        ACF1
## Training set 0.3793825 -0.06824652

Box.test(winemod$residuals, type = "Ljung")

## 
##  Box-Ljung test
## 
## data:  winemod$residuals
## X-squared = 0.83379, df = 1, p-value = 0.3612

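First, we look at each coefficient divided by its standard error. Since the absolute value of every ratio is greater than 2, the coefficients are significantly different from zero and worth keeping. The Box-Ljung test's null hypothesis is that the residuals are uncorrelated. Since the test yielded a high p-value (0.3612), we fail to reject the null, so there is no evidence of leftover autocorrelation and we say that our model is adequate.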
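
The coefficient check can be reproduced by hand. The sketch below simply copies the estimates and standard errors from the printed output; the variable names est, se and ratio are my own.

```r
# Estimates and standard errors copied from the arima() output above
est <- c(ma1 = 0.2117, sar1 = 0.4461, sma1 = -0.7912)
se  <- c(ma1 = 0.0788, sar1 = 0.1743, sma1 = 0.1553)

# Rough t-ratios: estimate divided by its standard error
ratio <- est / se
round(ratio, 2)

# Rule of thumb: |ratio| > 2 suggests the coefficient differs from zero
abs(ratio) > 2
```

The ratios come out to about 2.69, 2.56 and -5.09, so all three pass the rule of thumb. With the fitted object at hand, the same ratios should be obtainable as winemod$coef / sqrt(diag(winemod$var.coef)).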

Kristan Miarka

April 26, 2018