The goal of this assignment is to forecast log changes in real GDP. Section I discusses the data and how we transform the data to be forecastable. Section II then discusses the process of finding the best model. Section III then forecasts the model that we find in Section II using two forecasting methods. We compare both methods with one another and we also compare these methods with the AR(1) forecasting models that we derived in class. Section IV concludes.
We start by importing the raw data with the code used below. We also plot a time series of the raw data.
# Packages needed
library(Quandl) # For Quandl
library(forecast) # For Acf
# Import Data
Quandl.api_key("75HFthvbcxsrteNWupus")
raw.data <- Quandl("FRED/GDPC1", type = 'zoo')
plot(raw.data, main = "Time Series of Real GDP", ylab = 'Real GDP', xlab = 'Year')
From the graph we see that there is a clear linear trend upwards which implies that the time series is not stationary. Hence we fix this problem by taking the difference in logs of the data so that we have a time series of the log changes in Real Gross Domestic Product. Mathematically this looks like: \[y_t = \Delta \log(y_t) = \log (GDP_t)-\log(GDP_{t-1})\] We derive the above model with the following code. We include a plot of the new time series to visually see how we transformed the data.
# Construct Time Series Log Changes
log.diff.rGDP <- diff(log(raw.data)) # Full sample
plot(log.diff.rGDP)
To prevent data snooping we then split the data into two sub-samples. The first sub-sample is the data that we will use to estimate our model parameters. The second sub-sample will be used to test the forecasting ability and accuracy of our model. The code to split the data into these two resepctive samples is given below.
# Split Sample into Two Parts
log.diff.rGDP.part1 <- window(log.diff.rGDP, end = "2008 Q4") # Estimation sample
log.diff.rGDP.part2 <- window(log.diff.rGDP, start = "2009 Q1") # Prediction sample
Now that we have our time series we want to determine which type of model we should use. To do so we need to examine the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the transformed data. Theoretically there are three possible scenarios we can conclude from looking at the ACF and the PACF:
If the true model that produced the data is an auto-regressive model (AR) then the PACF should equal zero after lag p.
If the true model that produced the data is a moving average model (MA) then the ACF should equal zero after lag q.
If the true model that produced the data is an auto-regressive moving average model (ARMA) then the ACF and PACF will oscillate around zero, and not be particularly useful.
We now estimate the ACF and the PACF of the first sub-sample of the data. We calculate these functions up until the 20th lag (5 years). The code below shows how we produced these functions and the graphs follow:
# ACF and PACF (for the estimation sample)
ACF <- Acf(log.diff.rGDP.part1, type = 'correlation', lag = 20)
PACF <- Acf(log.diff.rGDP.part1, type = 'partial', lag = 20)
From the above graphs we can see that the ACF has statistically significant coefficients for the first two lags and possibly for the fifth, twelfth, and thiteenth lag as well. Similarly the PACF has statistically significant coefficients for the first lag and the twelfth lag. It is also good to note that it is difficult to determine if either the ACF or PACF converges to zero. Hence, we might expect that the best model will be a type of ARMA model.
We now employ the auto.arima
function from the forecast
package to suggest the appropriate (best) model.
The R code used to employ the auto.arima
function is given below:
# Auto.arima
Model <- auto.arima(log.diff.rGDP.part1, ic = 'aicc', stepwise = FALSE)
The results of this model are given below:
## Series: log.diff.rGDP.part1
## ARIMA(2,0,2) with non-zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2 intercept
## 1.2808 -0.6706 -0.9728 0.4962 0.0081
## s.e. 0.1719 0.1646 0.2083 0.1792 0.0008
##
## sigma^2 estimated as 8.294e-05: log likelihood=812.48
## AIC=-1612.96 AICc=-1612.61 BIC=-1591.91
The main takeaway is to see that the data is best described as having been produced by an ARMA(2,2) model. It is also important to see that the coefficeints are statistically different from zero.
We now want to examine if the model derived in part B is an adequate model; that is, we want to know if the model is stationary and if the model is invertible, and we want to diagnose the residuals of the model.
Recall that stationarity implies that the mean of a variable and the covariance between the variable and its lag are time invariant, that is, constant over time. If stationarity holds then many statistical properties that are employed under the assumption of independence can now be employed on dependent time series data. Invertibility implies that more recent observations have more weight in determining today’s observation than observations that are more distant in the past. Thus, for a model to be invertible then we need more recent information to be more relevant than past information.
We check the residuals of each model to determine if our models will be good forecasting models. We note that a good model will have residuals that are uncorrelated and an even better model will have uncorrelated residuals with a zero mean. If residuals are correlated then we say that there is still information in the residuals that could be used in our forecasting. If residuals do not have a zero mean then we say that our forecasts are biased.
To check these conditions we determine if the inverse of the characteristic roots of the model are unitary; that is, lie within the unit circle. If the inverse roots are unitary for both inverted AR and MA plots then we know that the model is both stationary and invertible, respectively.
The code used to plot the invertible characteristic roots is given below followed by the figures it produces:
plot.Arima(Model)
As we can see all of the characteristic roots lie within the unit circles and therefore we can assume that the model is both stationary and invertible.
To further check the model’s adequacy we will diagnose the residuals of the model, where higher p values indicate a better model.
tsdiag(Model, gof.lag = 20)
From the above graph we can conclude that the residuals are statistically uncorrelated and also statistically non-different from zero. The ACF has no statistically significant coefficients and the p-values are extermely high and therefore are not-statistically different from zero. Furthermore, the plot of the residuals fluctuate around zero with a mean of 2.410^{-5}, which is approximately zero.
Therefore, the model is both stationary and invertible, and has shown that it will be good for forecasting by it’s residual diagnostics. The section will thus use our model to forecast log changes in real GDP.
Since we have shown that the model is adequate then we will now go ahead and use the model derived in Section II to forecast “future” values of the data and compare them to the second sub-sample over the same window of time. In other words, we will check the out of sample accuracy of our model. We will do this by using two separate forecasting methods; namely, (1) Multi-step forecast, and (2) Rolling Scheme forecast. Once we have both forecasts we will compare them with one another as well as a simple AR(1) model that was previously derived in class.
- Multi-Step Forecast
The first forecasting method is a multi-step forecasting method. The code below is how we derived the multi-step forecast.
## Multistep Forecast
Model.multistep.forecast <- forecast(Model, length(log.diff.rGDP.part2))
- Rolling Scheme Forecast
The second forecasting method is a rolling scheme forecasting method. The code below is how we derived the rolling scheme forecast.
## Rolling Scheme Forecast
Model.rolling.forecast <- zoo()
first.quarter <- 1947.25
last.quarter <- 2008.75
for(i in 1:length(log.diff.rGDP.part2)) {
temp <- window(log.diff.rGDP, start = first.quarter + (i-1)/4, end = last.quarter + (i-1) / 4)
model.update <- arima(temp, order = c(2,0,2))
Model.rolling.forecast <- c(Model.rolling.forecast, forecast(model.update, 1)$mean)
}
Model.rolling.forecast <- as.ts(Model.rolling.forecast)
The first way we can compare the two forecasts is simply by plotting the forecasts against the real log change in real GDP data. Below is the graph of the two forecasts and the actual observed data.
As we can see the multi-step forecast appears to be slightly more smooth than the rolling scheme forecast; therefore, it appears from the graph that the rolling scheme model better forecasts actual log changes in real GDP since the actual values are more volatile than they are smooth. The best way to determine the accuracies of the model is to use the accuracy
command from the forecast
package.
We will now use the accuracy
function to compare the multi-step forecast and rolling scheme forecast of the ARMA(2,2) model. Furthermore, we will then compare the accuracy of these two forecating techniques with those of the AR(1) model that we derived in class.
To test the accuracies of the two models we use the following R code. The results are given below.
## Accuracy of ARMA(2,2) Models
accuracy(f = Model.multistep.forecast$mean, x = log.diff.rGDP.part2)
accuracy(Model.rolling.forecast, log.diff.rGDP.part2)
## ME RMSE MAE MPE MAPE
## Test set -0.0035 0.0053 0.0042 -159.3768 234.5007
## ME RMSE MAE MPE MAPE
## Test set -0.0022 0.005 0.0039 -96.8141 164.633
As predicted earlier from part B we see that the rolling scheme forecasting method has less error than the multi-step forecating method. Therefore, between these two methods, the rolling scheme forecasting method appears to have more accuracy than the multi-step forecasting method in predicting future log changes in real GDP.
We will now compare the ARMA(2,2) forecasts with the AR(1) forecasts by using the accuracy
function. The R code used to produce these accuracies is given below. The results follow.
## Accuracy of ARMA(2,2) Models
accuracy(f = Model.multistep.forecast$mean, x = log.diff.rGDP.part2)
accuracy(Model.rolling.forecast, log.diff.rGDP.part2)
## Accuracy of AR(1) Models
accuracy(f = AR1.multistep.forecast$mean, x = log.diff.rGDP.part2)
accuracy(AR1.rolling.forecast, log.diff.rGDP.part2)
The ARMA (2,2) multi-step forecast accuracy is given first followed by the AR(1) multi-step forecast accuracy.
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.0000 0.0090 0.0067 -55.8014 174.3994 0.6265 0.0106
## Test set -0.0035 0.0053 0.0042 -159.3768 234.5007 0.3983 NA
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.0000 0.0092 0.0068 -46.9308 168.4821 0.6359 -0.0305
## Test set -0.0031 0.0051 0.0040 -145.1532 227.4407 0.3791 NA
Surprisingly the multi-step AR(1) forecast is more accuracte than the ARMA(2,2) multi-step forecast. We can see this because the errors from the AR(1) multi-step forecast are smaller than the ARMA(2,2) multi-step forecast.
The ARMA (2,2) rolling scheme forecast accuracy is given first followed by the AR(1) rolling scheme forecast accuracy.
## ME RMSE MAE MPE MAPE
## Test set -0.0022 0.005 0.0039 -96.8141 164.633
## ME RMSE MAE MPE MAPE
## Test set -0.0018 0.0048 0.0039 -94.1665 169.6169
Again we see that the error terms of the AR(1) rolling scheme forecast are smaller than that of the ARMA(2,2) rolling scheme forecast. Hence, the AR(1) rolling scheme forecast is more accurate than the ARMA(2,2) rolling scheme forecast.
In conclusion, the ARMA(2,2) model seems to be the best fit for the in-sample data. In the out of sample data, the ARMA(2,2) rolling scheme forecasting model performed the best. However, the AR(1) model performed better than the ARMA(2,2) model in the out of sample test. In other words, the ARMA(2,2) model best modeled the in-sample data but the AR(1) model outperformed in its out of sample forecasting ability. This shows us the importance of having two sub samples over which to test our model. It also suggests that the ARMA(2,2) model overfit the in-sample data which is why the AR(1) model performed better in the out of sample forecasting tests.