The EU-15 GDP dataset was imported from the OECD website and covers quarterly GDP growth for EU countries from 1990 to 2017. It was then subset to the study variables. Some of the inspection steps below are commented out to keep the output compact. A summary and a plot of GDP growth were also produced.
EUGDP <- read.csv("/Users/oscarortmans/Desktop/Econometrics HW 2/GDP 1990-2017 Q.csv")
#sapply(EUGDP, class)#
#head(EUGDP)#
#str(EUGDP)#
ncol(EUGDP)
## [1] 19
#SELECT STUDY VARIABLES (COLUMNS 9, 10, 17)#
EUGDP <- EUGDP[c(9,10,17)]
names(EUGDP)
## [1] "TIME" "Period" "Value"
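The Ljung-Box test rejects the null hypothesis of no autocorrelation up to lag 5 (p-value < 2.2e-16), so GDP growth is strongly serially correlated. The test speaks to autocorrelation rather than stationarity, which is assessed from the time series plot.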
##
## Box-Ljung test
##
## data: gdp
## X-squared = 181.22, df = 5, p-value < 2.2e-16
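The call that produced this output is not shown. A minimal sketch of how such a test can be run with base R's `Box.test` follows; the series `gdp_sim` is simulated and only stands in for the real GDP growth data, and the AR coefficient of 0.9 is an illustrative assumption.

```r
# Simulated stand-in for the GDP growth series (assumption: the real
# data are not reproduced here; 0.9 is an illustrative AR coefficient).
set.seed(1)
gdp_sim <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

# Ljung-Box test with 5 lags.
# H0: no autocorrelation up to lag 5. A small p-value rejects H0.
lb <- Box.test(gdp_sim, lag = 5, type = "Ljung-Box")
print(lb)
```

A small p-value here means the series is autocorrelated; it does not say whether the series is stationary.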
The time series plot supports treating the series as stationary: it fluctuates around the horizontal reference line (abline) with a roughly constant mean and variance, and shows no drift over time.
The autocorrelation plot shows that GDP growth in the current quarter is significantly correlated with GDP growth in each of the previous five quarters. The series therefore exhibits memory for at least five quarters; beyond that, no meaningful pattern can be identified.
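As a rough illustration of how the significant lags can be read off numerically rather than from the plot, the sketch below computes the sample autocorrelations with `acf` and compares them with the approximate 95% white-noise band. The simulated series `gdp_sim` is an assumption standing in for the real data.

```r
# Simulated stand-in for the GDP growth series (assumption).
set.seed(1)
gdp_sim <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

# Sample autocorrelations up to lag 12, without drawing the plot
acf_out <- acf(gdp_sim, lag.max = 12, plot = FALSE)
rho <- as.numeric(acf_out$acf)[-1]          # drop lag 0, which is always 1

# Approximate 95% band under the white-noise null: +/- 1.96 / sqrt(n)
band <- qnorm(0.975) / sqrt(length(gdp_sim))
sig_lags <- which(abs(rho) > band)          # lags outside the band

print(round(rho, 3))
print(sig_lags)
```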
The data were split into a training and a validation dataset, with the last 20% of observations held out for validation. A first-order autoregressive model, AR(1), was then estimated by OLS. The estimated lag coefficient (0.908, highly significant) is consistent with the strong persistence seen in the ACF plot.
#Total Dataset#
nrow(EUGDP)
## [1] 112
#Select 80 % of Dataset#
(0.2*112)
## [1] 22.4
112-22
## [1] 90
#Training Dataset#
EUGDP1 <- EUGDP[c(1:90),]
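# gdp is a free-standing vector, so data=EUGDP1 does not restrict the sample:
# the 109 residual degrees of freedom below imply all 112 observations are used.
# 1:T-1 parses as (1:T)-1 = 0:(T-1); R drops index 0, so it selects 1:(T-1).
T=length(gdp)
ar1model <- lm(gdp[2:T]~gdp[1:T-1], data=EUGDP1)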
summary(ar1model)
##
## Call:
## lm(formula = gdp[2:T] ~ gdp[1:T - 1], data = EUGDP1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3881 -0.3056 0.0529 0.3209 2.7930
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.14703 0.09393 1.565 0.12
## gdp[1:T - 1] 0.90794 0.03938 23.058 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7025 on 109 degrees of freedom
## Multiple R-squared: 0.8299, Adjusted R-squared: 0.8283
## F-statistic: 531.7 on 1 and 109 DF, p-value: < 2.2e-16
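The 109 residual degrees of freedom reported above correspond to 111 usable observations, which suggests the regression ran on the full 112-quarter series rather than on the 90-row training set. A minimal sketch of an AR(1) fit restricted to the training rows is given below. The series `y` is simulated, and the names `n_train`, `lagged` and `ar1_fit` are hypothetical.

```r
# Simulated stand-in for the 112 quarterly observations (assumption).
set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

# Hold out the last 20% (22 observations), as in the text: 112 - 22 = 90
n_train <- length(y) - floor(0.2 * length(y))
train <- y[1:n_train]

# Build the lag explicitly so that lm() only sees training rows
lagged <- data.frame(y = train[-1], y_lag = train[-n_train])
ar1_fit <- lm(y ~ y_lag, data = lagged)

print(coef(ar1_fit))
print(df.residual(ar1_fit))   # 89 usable rows minus 2 parameters
```

Putting the dependent variable and its lag into one data frame makes the `data` argument effective and avoids relying on how `1:T-1` is parsed.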
The code provided by Prof. Jeroan Roumbouts was used to build an AR1 forecast function and the data structures it loops over to produce recursive forecasts. As in the previous question, the first 80% of the sample served as the training dataset. The parameter estimates from each forecasting step were stored for the forecast evaluation; they show no significant breaks. The mean forecast error and root mean squared error for each of the six horizons are reported below.
## [1] "Forecast horizons: "
## [1] 6
## [1] "Mean Error AR1: "
## [,1]
## [1,] -0.1149146
## [2,] -0.2369396
## [3,] -0.3627540
## [4,] -0.4938781
## [5,] -0.5941244
## [6,] -0.6785679
## [1] "Root Mean Squared Error AR1: "
## [,1]
## [1,] 0.2810474
## [2,] 0.4750649
## [3,] 0.6246785
## [4,] 0.7276392
## [5,] 0.8081066
## [6,] 0.8746451
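The forecasting function itself is not reproduced here. The following sketch shows one way a recursive AR(1) forecast evaluation of this kind can be set up. It is not the original code: the series `y` is simulated, the estimation window is assumed to expand by one observation per step, and all object names are hypothetical. Errors are defined as forecast minus outcome, matching the sign convention of the naive benchmark code further down.

```r
# Simulated stand-in for the 112 quarterly observations (assumption).
set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

hor    <- 6
n_in   <- length(y) - floor(0.2 * length(y))   # 90 training observations
n_eval <- length(y) - n_in - hor + 1           # origins with all 6 outcomes observed

fc  <- matrix(NA_real_, n_eval, hor)           # forecasts by origin and horizon
act <- matrix(NA_real_, n_eval, hor)           # matching realised values

for (t in seq_len(n_eval)) {
  est <- y[1:(n_in + t - 1)]                   # expanding estimation window
  n   <- length(est)
  b   <- unname(coef(lm(est[-1] ~ est[-n])))   # intercept and AR coefficient
  last <- est[n]
  for (h in seq_len(hor)) {
    last      <- b[1] + b[2] * last            # iterate the AR(1) one step ahead
    fc[t, h]  <- last
    act[t, h] <- y[n_in + t - 1 + h]
  }
}

err  <- fc - act                               # forecast minus outcome
me   <- colMeans(err)                          # mean error per horizon
rmse <- sqrt(colMeans(err^2))                  # root mean squared error per horizon
print(cbind(horizon = 1:hor, me = me, rmse = rmse))
```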
The same forecasting exercise was repeated with a naive benchmark: the mean of the last twelve observations of the 80% training dataset, as specified in the question. The mean was calculated over rows 78 to 90 of the EUGDP data; note that 78:90 spans thirteen rows, whereas the last twelve observations are rows 79 to 90.
The following lines of code were added to the loop produced by Prof. Jeroan Roumbouts:
#Mean Calculation:
#### mean_naive = mean(vy[c(78:90)]);
#Added Code to Loop:
#### vrmse_naive=array(0,dim=c(hor,1));
#### vmae_naive=array(0,dim=c(hor,1));
#### verror_naive=array(0,dim=c(Teval,1));
#### verror_naive[t,1]=(mean_naive-vyout[t+h-1]);
#### verror_sq_naive[t,1]=(mean_naive-vyout[t+h-1])^2;
#Forecast Evaluation:
#### print("Mean Error Naive: ");
#### print(vmae_naive);
#### print("Root Mean Squared Error Naive: ");
#### print(vrmse_naive);
mforecasts = Mean_forecast2(EUGDP$Value, 0.2, 6)
forecast_performance2(EUGDP$Value, 0.2, mforecasts)
## [1] "Mean Error Naive: "
## [,1]
## [1,] -1.292744
## [2,] -1.388053
## [3,] -1.497038
## [4,] -1.622966
## [5,] -1.721870
## [6,] -1.808788
## [1] "Root Mean Squared Error Naive: "
## [,1]
## [1,] 1.648059
## [2,] 1.679737
## [3,] 1.712091
## [4,] 1.744511
## [5,] 1.791931
## [6,] 1.842930
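The functions `Mean_forecast2` and `forecast_performance2` are not shown. A minimal sketch of the same idea, under stated assumptions, is given below: a constant forecast equal to the mean of the last twelve training observations, evaluated at each horizon. The series `y` is simulated and the object names are hypothetical.

```r
# Simulated stand-in for the 112 quarterly observations (assumption).
set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

hor    <- 6
n_in   <- 90                                   # training observations
n_eval <- length(y) - n_in - hor + 1           # same evaluation origins for every horizon

# Naive forecast: mean of the last 12 training observations (rows 79 to 90)
naive_fc <- mean(tail(y[1:n_in], 12))

me   <- numeric(hor)
rmse <- numeric(hor)
for (h in seq_len(hor)) {
  actual  <- y[(n_in + h):(n_in + h + n_eval - 1)]  # outcomes h steps after each origin
  err     <- naive_fc - actual                      # forecast minus outcome
  me[h]   <- mean(err)
  rmse[h] <- sqrt(mean(err^2))
}
print(cbind(horizon = 1:hor, me = me, rmse = rmse))
```

Using `tail(x, 12)` avoids hard-coding row numbers and guarantees that exactly twelve observations enter the mean.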
Comparing the two error tables shows that the recursive AR1 forecasts are far more accurate than the “naive” mean forecast at every horizon: the RMSE is 0.28 versus 1.65 at horizon 1 and 0.87 versus 1.84 at horizon 6.
| Horizon | Mean Error (AR1) | Mean Error (Naive) | RMSE (AR1) | RMSE (Naive) |
|---|---|---|---|---|
| 1 | -0.1149146 | -1.292744 | 0.2810474 | 1.648059 |
| 2 | -0.2369396 | -1.388053 | 0.4750649 | 1.679737 |
| 3 | -0.3627540 | -1.497038 | 0.6246785 | 1.712091 |
| 4 | -0.4938781 | -1.622966 | 0.7276392 | 1.744511 |
| 5 | -0.5941244 | -1.721870 | 0.8081066 | 1.791931 |
| 6 | -0.6785679 | -1.808788 | 0.8746451 | 1.842930 |
As in most forecasting exercises, forecast performance deteriorates markedly as the horizon lengthens. The ‘memory’ of the time series fades with distance from the last observation, so the forecast standard error grows. This is visible in the output below, where both the mean and the standard deviation of the forecast errors increase in absolute value with the horizon. In the graphs, the red lines also become longer at longer horizons, which visually indicates a larger forecast error.
## [1] "h=1"
## [1] "Mean Forecast Errors: "
## [1] -0.1149146
## [1] "Std Forecast Errors: "
## [1] 0.2625162
## [1] "h=2"
## [1] "Mean Forecast Errors: "
## [1] -0.2369396
## [1] "Std Forecast Errors: "
## [1] 0.4219284
## [1] "h=3"
## [1] "Mean Forecast Errors: "
## [1] -0.362754
## [1] "Std Forecast Errors: "
## [1] 0.5217711
## [1] "h=4"
## [1] "Mean Forecast Errors: "
## [1] -0.4938781
## [1] "Std Forecast Errors: "
## [1] 0.5490052
## [1] "h=5"
## [1] "Mean Forecast Errors: "
## [1] -0.5941244
## [1] "Std Forecast Errors: "
## [1] 0.5636512
## [1] "h=6"
## [1] "Mean Forecast Errors: "
## [1] -0.6785679
## [1] "Std Forecast Errors: "
## [1] 0.5688446
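The flattening of the standard deviation at longer horizons matches what AR(1) theory predicts. For y_t = c + phi * y_(t-1) + e_t with known parameters, the h-step forecast error is the sum of the last h shocks weighted by powers of phi, so its standard deviation is sigma * sqrt(1 + phi^2 + ... + phi^(2(h-1))). The sketch below evaluates this formula; the values phi = 0.9 and sigma = 1 are illustrative assumptions, not estimates from this report, and parameter estimation uncertainty is ignored.

```r
# Theoretical h-step forecast error standard deviation of an AR(1)
# with known parameters (hypothetical helper, not part of the original code).
ar1_fc_sd <- function(phi, sigma, h) {
  # cumulative sum of phi^(2j) for j = 0, ..., h-1, one entry per horizon
  sigma * sqrt(cumsum(phi^(2 * (0:(h - 1)))))
}

sd_h <- ar1_fc_sd(phi = 0.9, sigma = 1, h = 6)   # illustrative values
print(round(sd_h, 3))

# Upper limit as h grows: the unconditional standard deviation
print(1 / sqrt(1 - 0.9^2))
```

The standard deviation rises with each horizon but by ever smaller increments, and it is bounded by the unconditional standard deviation of the series.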
A loop based on Prof. Jeroan Roumbouts’ original code was written to extend the time scale beyond the sample. It estimates a new AR1 model from 2018 onwards. The parameters of the new AR1 model were plotted to check that they differ from those in question 3 and to confirm that the loop works as intended.
The forecasts from the new AR1 model were computed over 6 horizons and combined into a single time series plot. The plot can be seen below; the area beyond the dotted red line contains the forecasted values.
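AR1_forecast_final<-function(vy,hor)
{
  Tout=24;                 # number of out-of-sample quarters to generate
  Tin=length(vy);          # length of the estimation window
  mforecast=array(0,dim=c(Tout,hor));
  mparam=array(0,dim=c(Tout,2)); #2 parameters in AR1 model
  for(t in 1:Tout)
  {
    # rolling window of length Tin that ends at the latest value of vy,
    # including the pseudo-observations appended in earlier iterations
    vyestim=vy[t:(Tin+t-1)];
    Testim=length(vyestim);
    # OLS regression of y_t on y_(t-1)
    AR1_estim_final=lm(vyestim[2:Testim]~vyestim[1:(Testim-1)]);
    mparam[t,]=AR1_estim_final$coefficients;
    vyfor=c(1,vyestim[Testim]);
    for(h in 1:hor)
    {
      # iterate the AR1 equation: each forecast feeds the next horizon
      mforecast[t,h]=vyfor %*% mparam[t,];
      vyfor=c(1,mforecast[t,h]);
    }
    # append the mean of the hor forecasts as the next pseudo-observation
    new_value = mean(mforecast[t,]);
    vy = append(vy, new_value);
  }
  # parameters
  plot(mparam[,1], main="Intercept Parameter Estimates 2024");
  plot(mparam[,2], main="Coefficient Parameter Estimates 2024");
  # forecasts
  return(mforecast);
}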