Econometrics Homework 2

Oscar Ortmans & Sasha Bouloudnine

Question 1.

The EU 15 GDP dataset was imported from the OECD website and covers the EU countries’ quarterly GDP growth from 1990 to 2017. It was then subset to include only the study variables. Some of the code in these steps is commented out for ease of visual presentation. The GDP growth series was then summarised and presented visually.

EUGDP <- read.csv("/Users/oscarortmans/Desktop/Econometrics HW 2/GDP 1990-2017 Q.csv")

#sapply(EUGDP, class)#
#head(EUGDP)#
#str(EUGDP)#
ncol(EUGDP)

## [1] 19

#KEEP ONLY THE STUDY COLUMNS (9, 10, 17)#
EUGDP <- EUGDP[c(9,10,17)]

Study variables:

names(EUGDP)

## [1] "TIME"   "Period" "Value"

Descriptive statistics:

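The Ljung-Box test rejects the null hypothesis of no autocorrelation up to lag 5 (p-value < 2.2e-16), so GDP growth is strongly serially correlated. The test speaks to autocorrelation rather than to stationarity, which is assessed from the time series plot.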

## 
##  Box-Ljung test
## 
## data:  gdp
## X-squared = 181.22, df = 5, p-value < 2.2e-16
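The call that produced this output is not shown. A minimal sketch of how such a test can be run with `Box.test` from base R follows, assuming `lag = 5` (consistent with `df = 5` above). The series `gdp_sim` is a simulated stand-in, not the OECD data.

```r
# Simulated stand-in for the GDP growth series (an assumption, not the OECD data)
set.seed(1)
gdp_sim <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

# Ljung-Box test of H0: no autocorrelation up to lag 5
lb <- Box.test(gdp_sim, lag = 5, type = "Ljung-Box")
print(lb)
```

A small p-value here means the null of no autocorrelation is rejected; it does not by itself say anything about stationarity.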

Time Series:

The time series plot complements the Ljung-Box test: the series appears to fluctuate around the mean marked by the abline with a roughly constant variance, and there is no visible drift over time, which is consistent with a stationary series.

Autocorrelation:

The autocorrelation plot shows that GDP growth in the current period is significantly correlated with GDP growth in each of the previous five quarters. The series therefore exhibits memory for at least five quarters; beyond that, no meaningful pattern can be inferred.
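The significance of individual lags can also be read off numerically. The sketch below uses `acf` with `plot = FALSE` on a simulated stand-in series (`gdp_sim` is an assumption, not the OECD data) and compares each autocorrelation with the approximate 95% white-noise band of 1.96 / sqrt(n).

```r
# Simulated stand-in for the GDP growth series (an assumption)
set.seed(1)
gdp_sim <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

# Sample autocorrelations for lags 0 to 8, without drawing the plot
ac <- acf(gdp_sim, lag.max = 8, plot = FALSE)

# Approximate 95% band under the white-noise null
band <- 1.96 / sqrt(length(gdp_sim))

# The first element is lag 0 (always 1), so drop it
rho <- as.numeric(ac$acf)[-1]
sig <- abs(rho) > band
print(data.frame(lag = 1:8, acf = round(rho, 3), significant = sig))
```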

Question 2.

The data was split into a training and a validation dataset by holding out the last 20% of observations. A simple first-order autoregressive (AR1) model was estimated. The regression output confirms the ACF plot results: the coefficient on lagged GDP growth is 0.908 and highly significant.

Split Datasets:

#Total Dataset#
nrow(EUGDP)

## [1] 112

#Select 80 % of Dataset#
(0.2*112)

## [1] 22.4

112-22

## [1] 90

#Training Dataset#
EUGDP1 <- EUGDP[c(1:90),]
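The split point can also be computed rather than typed in by hand. The sketch below reproduces the 90/22 split from the row count; the vector `y` is a placeholder series used only for illustration.

```r
n <- 112                     # number of quarterly observations reported above
n_valid <- floor(0.2 * n)    # held-out observations (last 20%, rounded down)
n_train <- n - n_valid       # training observations

y <- seq_len(n)              # placeholder series (hypothetical)
train <- y[1:n_train]
valid <- y[(n_train + 1):n]

c(train = length(train), valid = length(valid))
```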

AR1 Model for GDP:

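T=length(gdp)
#gdp[1:T-1] parses as gdp[(1:T)-1]; R drops the zero index, so it equals gdp[1:(T-1)]#
#gdp is a standalone vector, so data=EUGDP1 does not restrict the sample: the 109 residual degrees of freedom below correspond to all 112 observations#
ar1model <- lm(gdp[2:T]~gdp[1:T-1], data=EUGDP1)
summary(ar1model)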

## 
## Call:
## lm(formula = gdp[2:T] ~ gdp[1:T - 1], data = EUGDP1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.3881 -0.3056  0.0529  0.3209  2.7930 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.14703    0.09393   1.565     0.12    
## gdp[1:T - 1]  0.90794    0.03938  23.058   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7025 on 109 degrees of freedom
## Multiple R-squared:  0.8299, Adjusted R-squared:  0.8283 
## F-statistic: 531.7 on 1 and 109 DF,  p-value: < 2.2e-16
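For comparison, here is a minimal sketch of an AR1 regression that is restricted to the first 90 observations by building the lagged pair in a data frame. The series is simulated and the names `ar1_df` and `y_lag` are illustrative assumptions, so the estimates will not match the output above.

```r
# Simulated stand-in for the GDP growth series (an assumption)
set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

train <- y[1:90]
n <- length(train)

# Parentheses matter: 1:n-1 parses as (1:n)-1, not 1:(n-1)
ar1_df <- data.frame(y = train[2:n], y_lag = train[1:(n - 1)])

# Because both variables live in ar1_df, the data argument controls the sample
fit <- lm(y ~ y_lag, data = ar1_df)
coef(fit)
df.residual(fit)   # (n - 1) observation pairs minus 2 parameters
```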

Questions 3 & 4.

The code provided by Prof. Jeroan Roumbouts was used to create a new AR1 forecast function that loops over the validation sample in a recursive forecasting manner. As in the question above, an 80% training dataset was used. The model parameters were stored at each step so that the forecasts could be produced and evaluated. No significant breaks in the parameter estimates are observed. The mean forecast error and root mean squared error for each horizon can be found below.

## [1] "Forecast horizons: "
## [1] 6
## [1] "Mean Error AR1: "
##            [,1]
## [1,] -0.1149146
## [2,] -0.2369396
## [3,] -0.3627540
## [4,] -0.4938781
## [5,] -0.5941244
## [6,] -0.6785679
## [1] "Root Mean Squared Error AR1: "
##           [,1]
## [1,] 0.2810474
## [2,] 0.4750649
## [3,] 0.6246785
## [4,] 0.7276392
## [5,] 0.8081066
## [6,] 0.8746451
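The course code itself is not reproduced here. The sketch below shows one way such an evaluation loop could look, assuming an expanding estimation sample and errors defined as forecast minus actual. The function name `ar1_recursive_eval` and the simulated series are hypothetical and are not the original implementation.

```r
# Hypothetical helper, not the course code
ar1_recursive_eval <- function(y, out_share = 0.2, hor = 6) {
  n <- length(y)
  n_out <- floor(out_share * n)
  n_in <- n - n_out
  fc <- matrix(NA_real_, n_out, hor)
  for (t in 1:n_out) {
    est <- y[1:(n_in + t - 1)]                 # expanding estimation sample
    m <- length(est)
    b <- unname(coef(lm(est[2:m] ~ est[1:(m - 1)])))
    last <- est[m]
    for (h in 1:hor) {                         # iterate the AR1 recursion
      last <- b[1] + b[2] * last
      fc[t, h] <- last
    }
  }
  me <- rmse <- numeric(hor)
  for (h in 1:hor) {
    k <- n_out - h + 1                         # forecasts with a realised value
    # the forecast made at origin t for horizon h targets y[n_in + t - 1 + h]
    err <- fc[1:k, h] - y[(n_in + h):n]
    me[h] <- mean(err)
    rmse[h] <- sqrt(mean(err^2))
  }
  data.frame(horizon = 1:hor, ME = me, RMSE = rmse)
}

# Usage on a simulated stand-in series (an assumption, not the OECD data)
set.seed(1)
y_sim <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))
res <- ar1_recursive_eval(y_sim)
print(res)
```

Longer horizons have fewer realised values to compare against, so each horizon is evaluated on a slightly smaller set of forecasts.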

Question 5.

The same forecasting exercise was undertaken using the mean of the last observations of the 80% training dataset as the forecast. The mean was calculated using rows 78 to 90 of the EUGDP dataset. The question specifies the last 12 observations; the range 78 to 90 contains 13 observations, and rows 79 to 90 would give exactly 12.

Added Code for Naive Model and Forecasting Evaluation of Naive Model:

The following lines of code were added to the loop produced by Prof. Jeroan Roumbouts:

#Mean Calculation:

####  mean_naive = mean(vy[c(78:90)]);

#Added Code to Loop:
  
####  vrmse_naive=array(0,dim=c(hor,1));
####  vmae_naive=array(0,dim=c(hor,1));
####  verror_naive=array(0,dim=c(Teval,1));
####  verror_sq_naive=array(0,dim=c(Teval,1));

####  verror_naive[t,1]=(mean_naive-vyout[t+h-1]);
####  verror_sq_naive[t,1]=(mean_naive-vyout[t+h-1])^2;


#Forecast Evaluation:

####  print("Mean Error Naive: ");
####  print(vmae_naive);
####  print("Root Mean Squared Error Naive: ");
####  print(vrmse_naive);

mforecasts = Mean_forecast2(EUGDP$Value, 0.2, 6)
forecast_performance2(EUGDP$Value, 0.2, mforecasts)

## [1] "Mean Error Naive: "
##           [,1]
## [1,] -1.292744
## [2,] -1.388053
## [3,] -1.497038
## [4,] -1.622966
## [5,] -1.721870
## [6,] -1.808788
## [1] "Root Mean Squared Error Naive: "
##          [,1]
## [1,] 1.648059
## [2,] 1.679737
## [3,] 1.712091
## [4,] 1.744511
## [5,] 1.791931
## [6,] 1.842930
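As a rough illustration of the naive benchmark, the sketch below uses `tail` to take exactly the last 12 training observations and evaluates the resulting constant forecast at each horizon, with errors defined as forecast minus actual as in the added code above. The series is simulated and the variable names are assumptions, so the numbers will differ from the output above.

```r
# Simulated stand-in for the GDP growth series (an assumption)
set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.9), n = 112))

n <- length(y)
n_in <- 90
hor <- 6

# Mean of exactly the last 12 training observations (rows 79 to 90)
last12 <- tail(y[1:n_in], 12)
mean_naive <- mean(last12)

me <- rmse <- numeric(hor)
for (h in 1:hor) {
  actual <- y[(n_in + h):n]      # realised values available for horizon h
  err <- mean_naive - actual     # constant forecast minus actual
  me[h] <- mean(err)
  rmse[h] <- sqrt(mean(err^2))
}
print(data.frame(horizon = 1:hor, ME = me, RMSE = rmse))
```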

Question 6.

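It is evident from the comparison of the two models’ error tables that the recursive AR1 forecasting model provides a far more accurate forecast of GDP growth than the “naive” model. This holds at every forecast horizon: at horizon 1, for example, the RMSE is 0.28 for the AR1 model against 1.65 for the naive model.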

Horizon	Mean.Error.AR1	Mean.Error.Naive	Root.Mean.Squared.Error.AR1	Root.Mean.Squared.Error.Naive
1	-0.1149146	-1.292744	0.2810474	1.648059
2	-0.2369396	-1.388053	0.4750649	1.679737
3	-0.3627540	-1.497038	0.6246785	1.712091
4	-0.4938781	-1.622966	0.7276392	1.744511
5	-0.5941244	-1.721870	0.8081066	1.791931
6	-0.6785679	-1.808788	0.8746451	1.842930
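One way to summarise the table is the ratio of the two RMSE columns, where values below 1 favour the AR1 model. The sketch below computes it from the figures reported in the table; the variable names are illustrative.

```r
# RMSE values copied from the table above
rmse_ar1   <- c(0.2810474, 0.4750649, 0.6246785, 0.7276392, 0.8081066, 0.8746451)
rmse_naive <- c(1.648059, 1.679737, 1.712091, 1.744511, 1.791931, 1.842930)

# Relative RMSE: AR1 divided by naive, one value per horizon
rel_rmse <- rmse_ar1 / rmse_naive
print(data.frame(horizon = 1:6, rel_rmse = round(rel_rmse, 3)))
```

The ratio rises from roughly 0.17 at horizon 1 to roughly 0.47 at horizon 6, so the advantage of the AR1 model narrows as the horizon grows but remains substantial.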

Question 7.

As is standard in forecasting exercises, forecast performance deteriorates as the forecast horizon lengthens. This is because the ‘memory’ in the time series fades and the forecast error variance grows with the horizon. This can be observed in our outputs: the standard deviation of the forecast errors rises from 0.26 at h=1 to 0.57 at h=6, and the mean error grows in magnitude from -0.11 to -0.68. Moreover, as the horizon increases the red lines in our graphs become longer, which visually indicates that the forecast error increases with the horizon.

## [1] "h=1"
## [1] "Mean Forecast Errors: "
## [1] -0.1149146
## [1] "Std Forecast Errors: "
## [1] 0.2625162

## [1] "h=2"
## [1] "Mean Forecast Errors: "
## [1] -0.2369396
## [1] "Std Forecast Errors: "
## [1] 0.4219284

## [1] "h=3"
## [1] "Mean Forecast Errors: "
## [1] -0.362754
## [1] "Std Forecast Errors: "
## [1] 0.5217711

## [1] "h=4"
## [1] "Mean Forecast Errors: "
## [1] -0.4938781
## [1] "Std Forecast Errors: "
## [1] 0.5490052

## [1] "h=5"
## [1] "Mean Forecast Errors: "
## [1] -0.5941244
## [1] "Std Forecast Errors: "
## [1] 0.5636512

## [1] "h=6"
## [1] "Mean Forecast Errors: "
## [1] -0.6785679
## [1] "Std Forecast Errors: "
## [1] 0.5688446
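The mean and standard deviation above tie back to the RMSE reported in Questions 3 and 4 through the identity RMSE^2 = ME^2 + ((n-1)/n) * SD^2, where SD is the sample standard deviation. The sketch below checks this with the reported figures, assuming that horizon h is evaluated on n = 22 - h + 1 forecasts (22 held-out observations); that count is inferred rather than stated in the outputs.

```r
# Figures copied from the outputs above
me   <- c(-0.1149146, -0.2369396, -0.3627540, -0.4938781, -0.5941244, -0.6785679)
sdev <- c(0.2625162, 0.4219284, 0.5217711, 0.5490052, 0.5636512, 0.5688446)
rmse <- c(0.2810474, 0.4750649, 0.6246785, 0.7276392, 0.8081066, 0.8746451)

# Assumed number of evaluated forecasts per horizon
n_h <- 22 - (1:6) + 1

# Squared bias plus error variance (rescaled from the n-1 to the n denominator)
rmse_check <- sqrt(me^2 + (n_h - 1) / n_h * sdev^2)
print(cbind(horizon = 1:6, rmse, rmse_check))
```

The decomposition shows that the RMSE grows with the horizon through both a larger bias and a larger spread of the errors.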

Question 8.

A loop was created using Prof. Jeroan Roumbouts’ original code in order to extend the time scale of the series beyond the sample. The code was used to estimate a new AR1 model and forecast from 2018 onwards. The parameters of the new AR1 model were plotted in order to check that they differ from those in Question 3 and to confirm that the loop works as intended.

The forecasts for the new AR1 model were taken over 6 horizons and combined onto a single time series plot. The forecast plot can be seen below, with the area beyond the dotted red line indicating forecasted values.

Modified Code for Forecasted GDP to 2024:

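AR1_forecast_final<-function(vy,hor)
{   
  T=length(vy);   # number of observed quarters
  Tout=24;        # number of quarters to extend the series by (6 years)
  Tin=T;
  mforecast=array(0,dim=c(Tout,hor));
  mparam=array(0,dim=c(Tout,2)); #2 parameters in AR1 model
  
  for(t in 1:Tout)
  {
    # vy[1:Tin+t-1] parses as vy[(1:Tin)+t-1], a rolling window of the last Tin values;
    # it is written out explicitly here without changing behaviour
    vyestim=vy[t:(Tin+t-1)];
    Testim=length(vyestim);
    # regress y_t on y_(t-1); the parentheses make the lag index explicit
    AR1_estim_final=lm(vyestim[2:Testim]~vyestim[1:(Testim-1)]);
    mparam[t,]=AR1_estim_final$coefficients;
    vyfor=c(1,vyestim[Testim]);
    
    # iterate the AR1 recursion to obtain forecasts for horizons 1 to hor
    for(h in 1:hor)
    {
      mforecast[t,h]=vyfor %*% mparam[t,];
      vyfor=c(1,mforecast[t,h]);
    }
    
    # extend the series with the mean of the hor forecasts before re-estimating
    new_value = mean(mforecast[t,]);
    vy = append(vy, new_value);
    
  }
  
  # parameters

  plot(mparam[,1], main="Intercept Parameter Estimates 2024");  
  plot(mparam[,2], main="Coefficient Parameter Estimates 2024");
  
  # forecasts
  return(mforecast);
}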
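The inner loop of the function iterates the AR1 recursion. For a stable AR1 model these iterated forecasts have a closed form and converge towards the long-run mean b0 / (1 - b1) as the horizon grows. The sketch below illustrates this with the coefficient estimates from the Question 2 output. The starting value `y_last` is hypothetical, and the coefficients re-estimated inside the loop will differ from these.

```r
# Estimates taken from the Question 2 regression output
b0 <- 0.14703
b1 <- 0.90794

y_last <- 2.5   # hypothetical last observed growth rate
hor <- 24

# Iterated forecasts: feed each forecast back into the AR1 equation
iter <- numeric(hor)
prev <- y_last
for (h in 1:hor) {
  prev <- b0 + b1 * prev
  iter[h] <- prev
}

# Closed form of the same h-step forecast
h <- 1:hor
closed <- b0 * (1 - b1^h) / (1 - b1) + b1^h * y_last

# Long-run mean towards which the forecasts converge
long_run <- b0 / (1 - b1)

print(data.frame(horizon = h, iterated = iter, closed_form = closed))
print(long_run)
```

With a coefficient this close to 1 the convergence is slow, so forecasts several years ahead still carry some weight from the last observed value.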