INTRODUCTION

Construct the time series with log changes in Real Gross Domestic Product. As in the example discussed in class and the lecture slides, split the sample into two parts: first one up to 2008Q4, second one from 2009Q1 onward. Plot the ACF and the PACF for yt in the first subsample. Use auto.arima with ic=aicc and stepwise=FALSE to find the best model. Check the estimated model for adequacy - plot inverted AR na MA roots to check stationarity and invertibility, diagnose residuals. Use the estimated model with forecast to generate 1 to 32 step ahead forecast for the second subsample, 2009Q1-2016Q4. Afterwards use the rolling scheme to generate a sequence of 1 period ahead forecasts. Create a plot showing the multistep forecasts and the 1 step ahead rolling forecasts, similar to the one in lec05slides.pdf on the last page. Use accuracy to evaluate the out of sample accuracy of the two sets of forecasts and compare them with the accuracy of the AR(1) model discussed in class and in the lecture slides.

IMPORTING DATA

library(Quandl)

Quandl.api_key("KmxULt3z1Vz1neVxGioB")
rGDP <- Quandl("FRED/GDPC1", type="zoo")
dlrGDP <- diff(log(rGDP))

DATA ANALYSIS

  1. Consider the quarterly Real Gross Domestic Product.

A. I graph the time series data without transformation

plot(rGDP, ylab = 'Real GDP', xlab = 'Year')

The graph has a steady increasing trend, thus we are going to take the difference in logs of the data

B. I construct the time series using:

\[ y_t = \Delta log{y_t} = log{GDP_t} - log{GDP_{t-1}} \]
where \(GDP_{t}\) is the original quarterly Real Gross Domestic Product.

Plot the Data with transformation

dlrGDP <- diff(log(rGDP))
plot(dlrGDP,xlab="Time", ylab="yt")

Now, the time series is stationary,

Split Data

dlrGDPp1 <- window(dlrGDP, end="2008 Q4")
dlrGDPp2 <- window(dlrGDP, start="2009 Q1")

Now, we then split the data into two sub-samples.First one is for estimation sample, second is for prediction sample.

Use ACF and PACF to identify suitable AR and/or MA model(s)

library(forecast)
par(mfrow=c(2,1), cex=1, mar=c(3,4,3,3))
Acf(dlrGDP, type='correlation', lag=36)
Acf(dlrGDP, type='partial', lag=36)



In the results, the first and second lag in ACF and the second and first and twelfth lag in PACF are significant.

## Auto.Arima

Now we are going to use the auto.arima function from the forecast package to suggest the appropriate (best) model.

m1 <- auto.arima(dlrGDPp1, ic = 'aicc', stepwise = FALSE)
m1
## Series: dlrGDPp1 
## ARIMA(2,0,2) with non-zero mean 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2  intercept
##       1.2808  -0.6706  -0.9728  0.4962     0.0081
## s.e.  0.1719   0.1646   0.2083  0.1792     0.0008
## 
## sigma^2 estimated as 8.294e-05:  log likelihood=812.48
## AIC=-1612.96   AICc=-1612.61   BIC=-1591.91

ARMA(2,2) is suggested.

Ljung-Box Statistic and Maximum Likelihood

tsdiag(m1, gof.lag = 20)

We check the residuals of the model. As a result, we conclude that the residuals are statistically uncorrelated and also statistically non-different from zero. The result of ACF has no statistically significant coefficients. The p-values are high.

Check staionarity and invertibility

We check Stationarity and Invertibility. When the inverse of the characteristic roots of the model are in the unit circle, it show that the model is stationarity and invertibility.

plot.Arima(m1) 

All of the characteristic roots are in the unit circle. Thus, we can presume that the model is stationary and invertible.

Forecasting

Now we will construct forecasts using the ARMA(2,2) model. We will use the following two separate forecasting methods. (1) Multi-step forecast, and (2) Rolling Scheme forecast.

  1. Multi-step forecast
m1.multistep.forecast <- forecast(m1, length(dlrGDPp2))
  1. Rolling Scheme forecast
m1.rolling.forecast <- zoo()
firstQ <- 1947.25
lastQ <- 2008.75
for(i in 1:length(dlrGDPp2)) {
    temp <- window(dlrGDP, start = firstQ + (i-1)/4, end = lastQ + (i-1) / 4)
    m1.update <- arima(temp, order = c(2,0,2))
    m1.rolling.forecast <- c(m1.rolling.forecast, forecast(m1.update, 1)$mean)
}

m1.rolling.forecast <- as.ts(m1.rolling.forecast)
  1. Show the both results
plot(m1.multistep.forecast, type="o", pch=16, xlim=c(2005,2016), ylim=c(-0.03,0.03),
     main="ARMA(2,2) Model Forecasting: Multistep vs Rolling Scheme")
lines(m1.multistep.forecast$mean, type="p", pch=16, lty="dashed", col="blue")
lines(dlrGDP, type="o", pch=16, lty="dashed")
lines(m1.rolling.forecast, type="o", pch=16, lty="dashed",col="red")
legend("topleft", c("multistep", "roll","Real GDP"), pch = c(16,16),col=c("blue","red","black"),lty="dashed")

We can see the multi-step forecast is more stable than the rolling scheme forecast.

Checking Accuracy

Now we are going to check the accuracies of the model by using the accuracy function from the forecast package.

  1. Accuracy of the ARMA(2,2) forecasting model
accuracy(f = m1.multistep.forecast$mean, x =  dlrGDPp2)
##                    ME        RMSE         MAE       MPE     MAPE
## Test set -0.003506586 0.005294129 0.004239407 -159.3768 234.5007
accuracy(m1.rolling.forecast, dlrGDPp2)
##                    ME        RMSE         MAE       MPE    MAPE
## Test set -0.002202307 0.004997696 0.003901092 -96.81412 164.633

Rolling scheme forecasting method has less error than the multi-step forecating method. As a result, we can conclude the rolling scheme forecasting method has more accuracy than the multi-step forecasting method for the forecasting.

  1. Comparing ARMA(2,2) model with the accuracy of the AR(1) model

We now are going to compare ARMA(2,2) model with the accuracy of the AR(1) model

AR1<-arima(dlrGDP, order=c(1,0,0))
AR1.multistep.forecast <- forecast(AR1, length(dlrGDPp2))
AR1.rolling.forecast <- zoo()
firstQ <- 1947.25
lastQ <- 2008.75
for(i in 1:length(dlrGDPp2)) {
    temp <- window(dlrGDP, start = firstQ + (i-1)/4, end = lastQ + (i-1) / 4)
    AR1.update <- arima(temp, order = c(2,0,2))
    AR1.rolling.forecast <- c(AR1.rolling.forecast, forecast(AR1.update, 1)$mean)
}

AR1.rolling.forecast <- as.ts(m1.rolling.forecast)


accuracy(f = m1.multistep.forecast$mean, x = dlrGDPp2)
##                    ME        RMSE         MAE       MPE     MAPE
## Test set -0.003506586 0.005294129 0.004239407 -159.3768 234.5007
accuracy(m1.rolling.forecast, dlrGDPp2)
##                    ME        RMSE         MAE       MPE    MAPE
## Test set -0.002202307 0.004997696 0.003901092 -96.81412 164.633
accuracy(f = AR1.multistep.forecast$mean, x =  dlrGDPp2)
##                    ME        RMSE         MAE       MPE     MAPE
## Test set -0.003221717 0.005894379 0.004261119 -125.9801 226.3405
accuracy(AR1.rolling.forecast, dlrGDPp2)
##                    ME        RMSE         MAE       MPE    MAPE
## Test set -0.002202307 0.004997696 0.003901092 -96.81412 164.633

As a result, we can conclude that the both accuracy(Multi-Step Forecast Accuracy, Rolling Scheme Forecast Accuracy) of AR(1) model are better than those of ARMA(2,2).

Conclusion

In short, we can conclude AR(1) model is the best model when we try to get the forecasting. This result is surprised because when we use auto.arima fuction, ARMA(2,2) model is the best model. We think this result is from the step, “Split Data”.