Construct the time series with log changes in Real Gross Domestic Product. As in the example discussed in class and in the lecture slides, split the sample into two parts: the first one up to 2008Q4, the second one from 2009Q1 onward. Plot the ACF and the PACF for yt in the first subsample. Use auto.arima with ic=aicc and stepwise=FALSE to find the best model. Check the estimated model for adequacy: plot the inverted AR and MA roots to check stationarity and invertibility, and diagnose the residuals. Use the estimated model with forecast to generate 1 to 32 step ahead forecasts for the second subsample, 2009Q1-2016Q4. Afterwards, use the rolling scheme to generate a sequence of 1 period ahead forecasts. Create a plot showing the multistep forecasts and the 1 step ahead rolling forecasts, similar to the one on the last page of lec05slides.pdf. Use accuracy to evaluate the out of sample accuracy of the two sets of forecasts and compare them with the accuracy of the AR(1) model discussed in class and in the lecture slides.
library(Quandl)
Quandl.api_key("KmxULt3z1Vz1neVxGioB")
rGDP <- Quandl("FRED/GDPC1", type="zoo")
A. First, I plot the time series without any transformation.
plot(rGDP, ylab = 'Real GDP', xlab = 'Year')
The series shows a steady upward trend, so we take the first difference of the log of the data.
B. I construct the time series using:
\[
y_t = \Delta \log GDP_t = \log GDP_t - \log GDP_{t-1}
\]
where \(GDP_{t}\) is the original quarterly Real Gross Domestic Product.
dlrGDP <- diff(log(rGDP))
plot(dlrGDP,xlab="Time", ylab="yt")
After this transformation the series appears to be stationary: it fluctuates around a constant mean with no trend.
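As an optional cross-check of stationarity (not required by the assignment), an augmented Dickey-Fuller test can be run; this is a minimal sketch assuming the tseries package is installed:
library(tseries)
# augmented Dickey-Fuller test; a small p-value rejects the unit-root null,
# which is consistent with the log changes being stationary
adf.test(as.numeric(dlrGDP))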
dlrGDPp1 <- window(dlrGDP, end="2008 Q4")
dlrGDPp2 <- window(dlrGDP, start="2009 Q1")
We then split the data into two subsamples: the first one (up to 2008Q4) is the estimation sample, the second one (from 2009Q1 onward) is the prediction sample.
library(forecast)
par(mfrow=c(2,1), cex=1, mar=c(3,4,3,3))
Acf(dlrGDPp1, type='correlation', lag.max=36)
Acf(dlrGDPp1, type='partial', lag.max=36)
In the plots, the first and second lags of the ACF and the first, second, and twelfth lags of the PACF are statistically significant.
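For reference, the dashed significance bands drawn by Acf are approximately \(\pm 1.96/\sqrt{T}\), where \(T\) is the number of observations used; they can be computed directly (illustrative):
# approximate 95% band for the sample (partial) autocorrelations
1.96 / sqrt(length(dlrGDPp1))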
## Auto.Arima
Now we use the auto.arima function from the forecast package, with the AICc criterion and an exhaustive (non-stepwise) search, to select the best model for the estimation sample.
m1 <- auto.arima(dlrGDPp1, ic = 'aicc', stepwise = FALSE)
m1
## Series: dlrGDPp1
## ARIMA(2,0,2) with non-zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2 intercept
## 1.2808 -0.6706 -0.9728 0.4962 0.0081
## s.e. 0.1719 0.1646 0.2083 0.1792 0.0008
##
## sigma^2 estimated as 8.294e-05: log likelihood=812.48
## AIC=-1612.96 AICc=-1612.61 BIC=-1591.91
An ARMA(2,2) model with a non-zero mean is selected.
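Written out, the fitted model is (note that the "intercept" reported by arima is the mean \(\mu\) of the process, not a regression constant):
\[
(y_t - \mu) = 1.2808\,(y_{t-1} - \mu) - 0.6706\,(y_{t-2} - \mu) + \varepsilon_t - 0.9728\,\varepsilon_{t-1} + 0.4962\,\varepsilon_{t-2}, \qquad \mu = 0.0081 .
\]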
tsdiag(m1, gof.lag = 20)
We check the residuals of the model: the standardized residuals show no obvious pattern, the residual ACF has no statistically significant spikes, and the Ljung-Box p-values are all high. We therefore conclude that the residuals are uncorrelated and the model is adequate.
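The Ljung-Box test can also be run explicitly; a minimal sketch (the choices of lag = 20 and fitdf = 4, the number of estimated ARMA coefficients, are ours):
# Ljung-Box test on the residuals of the ARMA(2,2) model
Box.test(residuals(m1), lag = 20, type = "Ljung-Box", fitdf = 4)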
Next we check stationarity and invertibility: the model is stationary and invertible if the inverted AR and MA roots all lie inside the unit circle.
plot.Arima(m1)
All of the inverted roots lie inside the unit circle, so we can presume that the model is stationary and invertible.
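The moduli of the inverted roots can also be computed directly from the estimated coefficients; a small sketch, not part of the required output:
# roots of the AR polynomial 1 - ar1*z - ar2*z^2 and the MA polynomial 1 + ma1*z + ma2*z^2;
# stationarity/invertibility requires these roots outside the unit circle,
# i.e. their inverses (the plotted "inverted roots") inside it
ar.roots <- polyroot(c(1, -coef(m1)[c("ar1", "ar2")]))
ma.roots <- polyroot(c(1,  coef(m1)[c("ma1", "ma2")]))
Mod(1 / ar.roots)
Mod(1 / ma.roots)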
Now we construct forecasts from the ARMA(2,2) model using two approaches: (1) a multistep forecast for the whole prediction sample, and (2) a sequence of 1 step ahead forecasts generated with a rolling scheme.
# (1) multistep forecast: 1 to 32 steps ahead from the model estimated on the first subsample
m1.multistep.forecast <- forecast(m1, length(dlrGDPp2))

# (2) rolling scheme: re-estimate the ARMA(2,2) on a window of fixed length that is moved
# forward one quarter at a time, and keep only the 1 step ahead forecast from each window
m1.rolling.forecast <- zoo()
firstQ <- 1947.25   # 1947 Q2, first observation of dlrGDP
lastQ <- 2008.75    # 2008 Q4, last observation of the estimation sample
for(i in 1:length(dlrGDPp2)) {
  temp <- window(dlrGDP, start = firstQ + (i-1)/4, end = lastQ + (i-1)/4)
  m1.update <- arima(temp, order = c(2,0,2))
  m1.rolling.forecast <- c(m1.rolling.forecast, forecast(m1.update, 1)$mean)
}
m1.rolling.forecast <- as.ts(m1.rolling.forecast)
plot(m1.multistep.forecast, type="o", pch=16, xlim=c(2005,2016), ylim=c(-0.03,0.03),
main="ARMA(2,2) Model Forecasting: Multistep vs Rolling Scheme")
lines(m1.multistep.forecast$mean, type="p", pch=16, lty="dashed", col="blue")
lines(dlrGDP, type="o", pch=16, lty="dashed")
lines(m1.rolling.forecast, type="o", pch=16, lty="dashed",col="red")
legend("topleft", c("multistep", "roll","Real GDP"), pch = c(16,16),col=c("blue","red","black"),lty="dashed")
The multistep forecast is much smoother than the rolling scheme forecast: after a few quarters it settles down near the unconditional mean, while the rolling 1 step ahead forecasts respond to each new observation.
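This is expected for a stationary ARMA model, since the multistep point forecast converges to the unconditional mean as the horizon \(h\) grows:
\[
\lim_{h \to \infty} E_t\,[\,y_{t+h}\,] = \mu \approx 0.0081 .
\]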
Now we evaluate the out of sample accuracy of the two sets of forecasts using the accuracy function from the forecast package.
accuracy(f = m1.multistep.forecast$mean, x = dlrGDPp2)
## ME RMSE MAE MPE MAPE
## Test set -0.003506586 0.005294129 0.004239407 -159.3768 234.5007
accuracy(m1.rolling.forecast, dlrGDPp2)
## ME RMSE MAE MPE MAPE
## Test set -0.002202307 0.004997696 0.003901092 -96.81412 164.633
The rolling scheme forecasts have smaller errors (lower RMSE and MAE) than the multistep forecasts, so for this model the rolling 1 step ahead forecasts are more accurate out of sample.
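Going beyond what the assignment asks, this difference could be tested formally with a Diebold-Mariano test; a minimal sketch, assuming both forecast series are aligned with the evaluation sample:
# Diebold-Mariano test of equal predictive accuracy under squared-error loss
e.multi <- as.numeric(dlrGDPp2) - as.numeric(m1.multistep.forecast$mean)
e.roll  <- as.numeric(dlrGDPp2) - as.numeric(m1.rolling.forecast)
dm.test(e.multi, e.roll, h = 1, power = 2)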
We now compare the accuracy of the ARMA(2,2) model with that of the AR(1) model discussed in class.
# AR(1) benchmark, estimated on the same first subsample
AR1 <- arima(dlrGDPp1, order = c(1,0,0))
AR1.multistep.forecast <- forecast(AR1, length(dlrGDPp2))

# rolling scheme for the AR(1) model, analogous to the one used for the ARMA(2,2)
AR1.rolling.forecast <- zoo()
firstQ <- 1947.25
lastQ <- 2008.75
for(i in 1:length(dlrGDPp2)) {
  temp <- window(dlrGDP, start = firstQ + (i-1)/4, end = lastQ + (i-1)/4)
  AR1.update <- arima(temp, order = c(1,0,0))
  AR1.rolling.forecast <- c(AR1.rolling.forecast, forecast(AR1.update, 1)$mean)
}
AR1.rolling.forecast <- as.ts(AR1.rolling.forecast)
accuracy(f = m1.multistep.forecast$mean, x = dlrGDPp2)
## ME RMSE MAE MPE MAPE
## Test set -0.003506586 0.005294129 0.004239407 -159.3768 234.5007
accuracy(m1.rolling.forecast, dlrGDPp2)
## ME RMSE MAE MPE MAPE
## Test set -0.002202307 0.004997696 0.003901092 -96.81412 164.633
accuracy(f = AR1.multistep.forecast$mean, x = dlrGDPp2)
## ME RMSE MAE MPE MAPE
## Test set -0.003221717 0.005894379 0.004261119 -125.9801 226.3405
accuracy(AR1.rolling.forecast, dlrGDPp2)
## ME RMSE MAE MPE MAPE
## Test set -0.002202307 0.004997696 0.003901092 -96.81412 164.633
Comparing the accuracy measures, the ARMA(2,2) multistep forecast is slightly more accurate (lower RMSE and MAE) than the AR(1) multistep forecast, while the rolling 1 step ahead forecasts of the two models are essentially equally accurate; for both models the rolling scheme beats the multistep forecasts.
In short, the simple AR(1) model forecasts almost as accurately out of sample as the ARMA(2,2) model, even though auto.arima selects the ARMA(2,2) by AICc. This is not too surprising: the model is selected on the estimation sample ending in 2008Q4, while the evaluation sample 2009Q1-2016Q4 covers a rather different period, so a better in-sample fit does not necessarily translate into better out-of-sample forecasts.