Econ 5316 Time Series Econometrics (Spring 2017) / Homework 3 Problem 3 (Feb. 14)
Keunyoung (Kay) Kim

Introduction

First, consider quarterly real GDP (Gross Domestic Product) and construct the time series of log changes in real GDP.
Second, find the best model and check its adequacy: check stationarity and invertibility, and diagnose the residuals.
Third, forecast with the model: multistep forecasts vs. one-step-ahead rolling forecasts.
Finally, evaluate the forecasts and compare them with the accuracy of the AR(1) model.


Data

Data for quarterly real GDP are imported from the Quandl website (Quandl code: FRED/GDPC1).

library(Quandl)
Quandl.api_key("XzdSwkDsE98Mxj3ixQzG")
rGDP <- Quandl("FRED/GDPC1", type="zoo" )
str(rGDP)
## 'zooreg' series from 1947 Q1 to 2016 Q4
##   Data: num [1:280] 1934 1932 1930 1961 1990 ...
##   Index: Class 'yearqtr'  num [1:280] 1947 1947 1948 1948 1948 ...
##   Frequency: 4

The plot of real GDP shows an upward-sloping trend.

plot(rGDP, xlab="Year", ylab="Real GDP",  main="The Graph of Real GDP")

Construct the time series of log changes in real GDP (\(y_t\)) and denote the variable for \(y_t\) by dlrGDP.

\[ y_t = \Delta \log GDP_{t} = \log GDP_{t} - \log GDP_{t-1} \]

dlrGDP <- diff(log(rGDP))
plot(dlrGDP,  xlab="Year", ylab="log difference in Real GDP",  main="Log Differences in Real GDP")

The plot shows that the volatility of real GDP decreases over time, and we need to check stationarity later.
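
As a quick, optional preliminary check of this point (a minimal sketch assuming the tseries package is installed; the model-based stationarity check via the characteristic roots comes later), an augmented Dickey-Fuller test can be applied to the log differences:

library(tseries)
# ADF test on the log differences of real GDP (H0: unit root);
# a small p-value is consistent with stationarity of dlrGDP
adf.test(as.numeric(dlrGDP))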


Methodology

Split the sample into two parts: up to 2008 Q4 vs. from 2009 Q1

first.q <- 1947.25
last.q <- 2008.75
dlrGDP.p1 <- window(dlrGDP, end=last.q)
dlrGDP.p2 <- window(dlrGDP, start=last.q+0.25)

# We can also split the sample period using the following method.
# dlrGDP.p1 <- window(dlrGDP, end="2008 Q4")
# dlrGDP.p2 <- window(dlrGDP, start="2009 Q1")

Plot the ACF and PACF for \(y_t\) in the first subsample and find the best model

We use the ACF and PACF to identify a suitable time series model. If the ACF dies out slowly and the PACF drops to zero after lag p, the series suggests an AR(p) model. If the ACF drops to zero after lag q and the PACF dies out slowly, it suggests an MA(q) model.
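
As a quick illustration of these cut-off rules (a simulated sketch, not the GDP data), the sample ACF and PACF of a simulated AR(1) and a simulated MA(2) display the patterns described above:

set.seed(42)
sim.ar1 <- arima.sim(model=list(ar=0.7), n=500)           # AR(1): PACF cuts off after lag 1
sim.ma2 <- arima.sim(model=list(ma=c(0.5, 0.3)), n=500)   # MA(2): ACF cuts off after lag 2
par(mfrow=c(2, 2))
acf(sim.ar1, main="ACF, simulated AR(1)")
pacf(sim.ar1, main="PACF, simulated AR(1)")
acf(sim.ma2, main="ACF, simulated MA(2)")
pacf(sim.ma2, main="PACF, simulated MA(2)")
par(mfrow=c(1, 1))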

library(forecast)
Acf(dlrGDP.p1, type="correlation", lag=48, main="ACF for  Real GDP")

Acf(dlrGDP.p1, type="partial", lag=48, main="PACF for  Real GDP")

In the figures above, however, neither the ACF nor the PACF drops to zero abruptly. Both show an oscillating pattern, and the ACF displays a slight sine-wave shape. Based on the PACF, an AR(1) is a candidate model; based on the ACF, an MA(2) is also a candidate. We may also need to consider an ARMA model. Here we use auto.arima to find the best model:

auto.arima returns the best ARIMA model according to either the AIC, AICc, or BIC value. The relevant arguments of auto.arima are as follows:
– stationary : If TRUE, restricts search to stationary models.
– seasonal : If FALSE, restricts search to non-seasonal models.
– ic : Information criterion to be used in model selection.
– stepwise : If TRUE, will do stepwise selection (faster). Otherwise, it searches over all models.

We go over two cases: i) non-seasonal and ii) seasonal.

a1 <- auto.arima(dlrGDP.p1, seasonal = FALSE, stationary = TRUE, stepwise = FALSE, ic ="aicc")
a1
## Series: dlrGDP.p1 
## ARIMA(2,0,2) with non-zero mean 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2  intercept
##       1.2808  -0.6706  -0.9728  0.4962     0.0081
## s.e.  0.1719   0.1646   0.2083  0.1792     0.0008
## 
## sigma^2 estimated as 8.294e-05:  log likelihood=812.48
## AIC=-1612.96   AICc=-1612.61   BIC=-1591.91
a2 <- auto.arima(dlrGDP.p1, seasonal = TRUE, stationary = TRUE, stepwise = FALSE, ic ="aicc")
a2
## Series: dlrGDP.p1 
## ARIMA(2,0,2) with non-zero mean 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2  intercept
##       1.2808  -0.6706  -0.9728  0.4962     0.0081
## s.e.  0.1719   0.1646   0.2083  0.1792     0.0008
## 
## sigma^2 estimated as 8.294e-05:  log likelihood=812.48
## AIC=-1612.96   AICc=-1612.61   BIC=-1591.91

From the results above, both cases give exactly the same result: an ARMA(2,2) with non-zero mean is the best model.

\[ y_t = \phi_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} \]
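
Note that the coefficient labelled intercept in the auto.arima output is the estimated process mean \(\mu\) rather than \(\phi_0\); for a stationary ARMA(2,2) the two are related by \(\phi_0 = \mu (1 - \phi_1 - \phi_2)\). A minimal sketch mapping the estimated coefficients to the equation above:

coef(a1)                              # ar1, ar2, ma1, ma2 and the estimated mean (labelled "intercept")
mu <- coef(a1)["intercept"]           # the reported "intercept" is the mean mu
phi0 <- unname(mu * (1 - coef(a1)["ar1"] - coef(a1)["ar2"]))   # implied intercept phi_0
phi0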

Check the estimated model for adequacy

Plot the inverted AR and MA roots to check stationarity and invertibility. A stationary (causal) ARMA process can be represented as a convergent infinite-order moving average process; an invertible process can be represented as a convergent infinite-order autoregressive process. A causal and invertible model has all roots of its AR and MA polynomials outside the unit circle; equivalently, the inverse roots lie inside the unit circle.

library(forecast)
arma22 <- Arima(dlrGDP.p1, order=c(2, 0, 2))
plot(arma22)   # plots the inverse AR and MA roots of the fitted ARMA(2,2)

In the figure above, both the inverse AR roots and the inverse MA roots lie inside the unit circle. Thus the estimated model is stationary and invertible.
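
The same conclusion can be verified numerically; a minimal sketch that computes the roots of the fitted AR and MA polynomials with polyroot and checks that their moduli exceed one:

ar.roots <- polyroot(c(1, -coef(arma22)[c("ar1", "ar2")]))   # roots of 1 - phi1*z - phi2*z^2
ma.roots <- polyroot(c(1,  coef(arma22)[c("ma1", "ma2")]))   # roots of 1 + theta1*z + theta2*z^2
Mod(ar.roots)   # all moduli > 1  => stationary (causal)
Mod(ma.roots)   # all moduli > 1  => invertible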

Diagnose residuals

The results below plot the standardized residuals, the autocorrelation function of the residuals, and the p-values of the Ljung-Box statistic for lags up to 36.

The residuals look stationary except in the early 1950s, late 1950s, late 1970s, and early 1980s. The p-values of the Ljung-Box Q statistics show that we cannot reject the null hypothesis of no autocorrelation in the residuals.

tsdiag(arma22, gof.lag=36)
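
The Ljung-Box p-values reported by tsdiag are computed without adjusting the degrees of freedom for the estimated ARMA coefficients. As a complementary check (a minimal sketch), the test can be run directly with the adjustment fitdf = p + q = 4:

# Ljung-Box test on the ARMA(2,2) residuals, with degrees of freedom
# adjusted for the four estimated ARMA coefficients
Box.test(residuals(arma22), lag=12, type="Ljung-Box", fitdf=4)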


Evaluate the out-of-sample accuracy and compare it with the accuracy of the AR(1) model

Use the estimated model with forecast to generate 1- to 32-step-ahead forecasts for the second subsample.

1. The multistep forecasts using ARMA(2,2)

library(forecast)
arma22 <- Arima(dlrGDP.p1, order=c(2, 0, 2))
arma22.f <- forecast(arma22, length(dlrGDP.p2))

plot(arma22.f, type="o", pch=16, xlim=c(2005, 2016), ylim=c(-0.03, 0.04), main="ARMA(2,2) Model Multistep Forecasts")
lines(arma22.f$mean, type="p", pch=16, lty="dashed", col="blue")
lines(dlrGDP.p2, type="o", pch=16, lty="dotted")

2. The one-step rolling forecasts

library(zoo)                                       # for the empty zoo object that collects the forecasts
rol.f <- zoo()
for(i in 1:length(dlrGDP.p2))
{
  y <- window(dlrGDP, end = last.q + (i-1)/4)      # expand the estimation sample by one quarter
  rol.updt <- arima(y, order=c(2,0,2))             # re-estimate the ARMA(2,2) on the expanded sample
  rol.f <- c(rol.f, forecast(rol.updt, 1)$mean)    # store the one-step-ahead forecast
}
rol.f <- as.ts(rol.f)

plot(rol.f, type="o", pch=15, xlim=c(2005, 2016), ylim=c(-0.03, 0.04), main="One Step Rolling Scheme Forecasts")
lines(rol.f, type="p", pch=15, lty="solid", col="red")
lines(dlrGDP, type="o", pch=16, lty="dotted")
lines(dlrGDP.p1, type="o", pch=16, lty="solid")

3. Comparison: Multistep forecast using ARMA(2,2) vs. One step rolling scheme forecast

plot(arma22.f, type="o", pch=16, xlim=c(2005, 2016), ylim=c(-0.03, 0.03), main="Multistep Forecasts,ARMA(2,2) vs. One step Rolling Scheme Forecast")
lines(arma22.f$mean, type="p", pch=16, lty="dashed", col="blue")
lines(rol.f, type="b", pch=15, lty="solid", lwd=2, col="red")
lines(dlrGDP, type="o", pch=16, lty="dotted")

4. Comparison: Multistep forecasts using AR(1) vs. Multistep forecasts using ARMA(2,2)

ar1 <- Arima(dlrGDP.p1, order=c(1, 0, 0))
ar1.f <- forecast(ar1, length(dlrGDP.p2))

plot(ar1.f, type="o", pch=16, xlim=c(2005, 2016), ylim=c(-0.03, 0.04), main="Multistep Forecasts,ARMA(2,2) vs. Multistep Forecasts,AR(1)")
lines(ar1.f$mean, type="p", pch=15, lty="dashed", col="green")
lines(arma22.f$mean, type="p", pch=16, lty="dashed", col="blue")
lines(dlrGDP, type="o", pch=16, lty="dotted")

Compare the accuracy of the ARMA(2,2) multistep and rolling-scheme forecasts with that of the AR(1) model

accuracy(arma22.f$mean, dlrGDP.p2)
##                    ME        RMSE         MAE       MPE     MAPE
## Test set -0.003506586 0.005294129 0.004239407 -159.3768 234.5007
accuracy(rol.f, dlrGDP.p2)
##                    ME        RMSE         MAE       MPE    MAPE
## Test set -0.002300581 0.005038677 0.003921268 -99.50147 166.961
accuracy(ar1.f$mean, dlrGDP.p2)
##                    ME        RMSE         MAE       MPE     MAPE
## Test set -0.003124662 0.005080373 0.004035371 -145.1532 227.4407
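
The conclusions below also cite a one-step rolling AR(1) RMSE taken from the lecture slides. A sketch of how that benchmark could be reproduced with the same rolling scheme used above (results not shown here):

rol.ar1 <- zoo()
for(i in 1:length(dlrGDP.p2))
{
  y <- window(dlrGDP, end = last.q + (i-1)/4)      # expand the estimation sample by one quarter
  fit <- arima(y, order=c(1,0,0))                  # re-estimate the AR(1)
  rol.ar1 <- c(rol.ar1, forecast(fit, 1)$mean)     # store the one-step-ahead forecast
}
rol.ar1 <- as.ts(rol.ar1)
accuracy(rol.ar1, dlrGDP.p2)                       # RMSE comparable to the values above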

Conclusions

  1. \(RMSE_{ARMA(2,2)} = 0.005294129\) > \(RMSE_{Rolling,\,ARMA(2,2)} = 0.005038677\)
  2. \(RMSE_{ARMA(2,2)} = 0.005294129\) > \(RMSE_{AR(1)} = 0.005080373\)
  3. \(RMSE_{Rolling,\,ARMA(2,2)} = 0.005038677\) > \(RMSE_{Rolling,\,AR(1)} = 0.004835928\) (from the lecture slides)

In conclusion, the one-step rolling scheme forecast based on the AR(1) model has the smallest RMSE and shows the best forecasting performance.