Research question: Can I use time series ARIMA models to predict the next two quarters of GDP? The data used is a quarterly time series from 1950 quarter 1 to 2000 quarter 4.
library(tseries)   # provides kpss.test()
library(forecast)  # provides auto.arima() and forecast()
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
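# Index 1 corresponds to 1950 quarter 1 and index 204 to 2000 quarter 4.
# deltat is dropped because ts() ignores it when frequency is also supplied.
gdp_ts <- ts(data = USMacroG$gdp, start = 1, end = 204, frequency = 1)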
In the code above I turn the gdp variable used in the prediction into a time series whose index starts at 1, which corresponds to 1950 quarter 1, and ends at 204, which corresponds to 2000 quarter 4. Making it a time series ties each observation to a single point in time, namely its respective quarter.
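As an alternative to the 1 to 204 index, the series could be indexed by calendar quarter. The sketch below is only an illustration: `gdp_values` is a hypothetical placeholder vector standing in for the 204 gdp observations, and `gdp_cal` is a name introduced here.

```r
# Placeholder values standing in for the 204 quarterly gdp observations
# (hypothetical data; in the analysis this would be USMacroG$gdp).
gdp_values <- seq_len(204)

# start = c(1950, 1) means year 1950, quarter 1; frequency = 4 means quarterly.
gdp_cal <- ts(gdp_values, start = c(1950, 1), frequency = 4)

start(gdp_cal)      # first period: 1950, quarter 1
end(gdp_cal)        # last period: 2000, quarter 4
frequency(gdp_cal)  # 4 observations per year
```

With a quarterly frequency, auto.arima may also consider seasonal terms, so the selected model and the forecast labels could differ from the output shown in this report.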
kpss.test(gdp_ts)
## Warning in kpss.test(gdp_ts): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: gdp_ts
## KPSS Level = 4.063, Truncation lag parameter = 4, p-value = 0.01
The KPSS test has a null hypothesis \(H_0\) that the data is level stationary and an alternative \(H_a\) that the data is non stationary (has a unit root). The printed p value of .01 is an upper bound, since the warning says the true p value is smaller, so we reject the null hypothesis and conclude that the gdp variable is non stationary.
acf(gdp_ts)
difgdp<-auto.arima(gdp_ts, ic="aic", trace = TRUE)
##
## Fitting models using approximations to speed things up...
##
## ARIMA(2,2,2) : 2039.769
## ARIMA(0,2,0) : 2114.799
## ARIMA(1,2,0) : 2073.859
## ARIMA(0,2,1) : 2049.559
## ARIMA(1,2,2) : 2047.647
## ARIMA(2,2,1) : 2037.924
## ARIMA(1,2,1) : 2046.277
## ARIMA(2,2,0) : 2067.973
## ARIMA(3,2,1) : 2041.463
## ARIMA(3,2,0) : 2064.69
## ARIMA(3,2,2) : 2043.092
##
## Now re-fitting the best model(s) without approximations...
##
## ARIMA(2,2,1) : 2054.273
##
## Best model: ARIMA(2,2,1)
difgdp
## Series: gdp_ts
## ARIMA(2,2,1)
##
## Coefficients:
##          ar1     ar2      ma1
##       0.2633  0.1442  -0.9656
## s.e.  0.0725  0.0726   0.0195
##
## sigma^2 = 1477: log likelihood = -1023.14
## AIC=2054.27 AICc=2054.48 BIC=2067.51
acf(ts(difgdp$residuals))
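The first ACF graph shows the autocorrelation of gdp at lags 1 to 23, with lag 0 (always equal to 1) as the reference. For a stationary series we want all the lags starting at 1 to fall within the blue dashed guide lines, which mark the approximate 95% significance bounds.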
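The position of the blue guide lines can be worked out by hand. The sketch below assumes the default settings of acf() (a 95% level and white-noise bounds) and the 204 observations in gdp_ts; `n` and `bound` are names introduced here.

```r
n <- 204                          # number of quarterly observations in gdp_ts
bound <- qnorm(0.975) / sqrt(n)   # half-width of the 95% band drawn by acf()
round(bound, 3)                   # about 0.137
```

Under these assumptions, any autocorrelation larger than about 0.137 in absolute value falls outside the band.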
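Next we use the auto.arima function, which chooses the order of differencing needed to make the gdp data stationary (here d = 2) and fits an ARMA model to the differenced series, selecting the candidate with the lowest AIC.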
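The AIC and BIC printed for the ARIMA(2,2,1) model can be checked against its log likelihood. This is a sketch under two assumptions that are not stated in the output: the parameter count is 4 (ar1, ar2, ma1 plus the innovation variance), and the effective sample size is 202 (204 observations minus the 2 lost to differencing). The names `loglik`, `k` and `n_eff` are introduced here.

```r
loglik <- -1023.14   # log likelihood reported for ARIMA(2,2,1)
k <- 3 + 1           # assumed: ar1, ar2, ma1 plus the innovation variance
n_eff <- 204 - 2     # assumed: observations left after differencing twice

aic <- -2 * loglik + 2 * k
bic <- -2 * loglik + k * log(n_eff)
round(c(AIC = aic, BIC = bic), 2)
```

The results agree with the reported AIC=2054.27 and BIC=2067.51 up to rounding of the log likelihood.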
Using the residuals of the fitted model we create a new ACF graph to check the correlation of the lags. Looking at the new graph we can see that all the lags except lag 0 are now at or within the blue guide lines, which suggests the residuals behave like white noise.
Using the given ARIMA(2,2,1) model we can see that the theoretical model is \(\Delta^2 GDP_t = \phi_1 \Delta^2 GDP_{t-1} + \phi_2 \Delta^2 GDP_{t-2} + \theta_1 \epsilon_{t-1} + \epsilon_t\)
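The \(\Delta^2\) in the model denotes the second difference, \(\Delta^2 GDP_t = GDP_t - 2GDP_{t-1} + GDP_{t-2}\). A small sketch on arbitrary illustrative numbers (not gdp data) shows that this matches what diff() computes; `x`, `d2_builtin` and `d2_manual` are names introduced here.

```r
x <- c(5, 7, 12, 20, 31)   # arbitrary illustrative values, not gdp data

# Second difference computed by base R
d2_builtin <- diff(x, differences = 2)

# Manual second difference: x_t - 2 * x_{t-1} + x_{t-2}
idx <- 3:length(x)
d2_manual <- x[idx] - 2 * x[idx - 1] + x[idx - 2]

all.equal(d2_builtin, d2_manual)   # TRUE
```

Two observations are lost in the process, which is why a series of length 5 yields only 3 second differences.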
The fitted model with our variables and estimated coefficients is \(\Delta^2 GDP_t = 0.2633\,\Delta^2 GDP_{t-1} + 0.1442\,\Delta^2 GDP_{t-2} - 0.9656\,\epsilon_{t-1} + \epsilon_t\)
fore<- forecast(difgdp, h=2)
plot(fore)
fore
##     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 205       9357.475 9308.217 9406.733 9282.142 9432.809
## 206       9415.513 9334.817 9496.209 9292.099 9538.927
Using the fitted model we can create a forecast of the next 2 quarters. The graph shows the predicted values of gdp as the two blue dots. The table gives each point forecast together with 80% and 95% prediction intervals. For the first quarter the point forecast is 9357.475, with a 95% interval from 9282.142 to 9432.809. For the second quarter the point forecast is 9415.513, with a 95% interval from 9292.099 to 9538.927.
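The first-quarter intervals can be approximately reproduced from the model output. This sketch assumes normally distributed errors and takes the one-step forecast standard error to be the square root of the reported sigma^2 = 1477; `point`, `se`, `int80` and `int95` are names introduced here.

```r
point <- 9357.475   # one-step-ahead point forecast (period 205)
se <- sqrt(1477)    # assumed one-step standard error, from sigma^2 = 1477

# Interval = point forecast +/- normal quantile * standard error
int80 <- point + c(-1, 1) * qnorm(0.90)  * se
int95 <- point + c(-1, 1) * qnorm(0.975) * se
round(rbind(int80, int95), 1)
```

The results match the Lo/Hi columns for period 205 up to rounding of sigma^2. The second-quarter interval is wider because forecast uncertainty accumulates with the horizon, so it cannot be reproduced from sigma^2 alone.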
When comparing the forecast to the real values we can see that the model underestimates by roughly 1,100 to 1,200 billion in both forecasts. The real values for GDP in those quarters are 10470.231 and 10599.00, and both lie above the upper 95% bounds of 9432.809 and 9538.927.
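The size of the miss can be computed directly from the numbers quoted above. A minimal sketch, where `forecast_vals`, `actual_vals`, `errors` and `pct_errors` are names introduced here:

```r
forecast_vals <- c(9357.475, 9415.513)    # point forecasts for periods 205 and 206
actual_vals   <- c(10470.231, 10599.00)   # actual values quoted in the text

errors <- actual_vals - forecast_vals      # positive values mean underestimation
pct_errors <- 100 * errors / actual_vals   # error as a percentage of the actual value

round(errors, 1)       # 1112.8 1183.5
round(pct_errors, 1)   # 10.6 11.2
```

Both forecasts fall short of the quoted actual values by a little over 10%.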