Econometrics: Methods and Applications by Erasmus University Rotterdam

Week 6 Assignment: Time Series

This document was created with R Markdown and then printed as a PDF for peer-graded evaluation.

Code chunks will not be echoed in the paper.


Data set

Consumer price index of the United States of America and of the Euro area (data obtained from the Reserve Bank of Australia):
- CPI_EUR: Consumer price index in the Euro area
- CPI_USA: Consumer price index in the United States of America
- LOGPEUR: logarithm of CPI_EUR
- LOGPUSA: logarithm of CPI_USA
- DPEUR: first difference of LOGPEUR, monthly inflation rate
- DPUSA: first difference of LOGPUSA, monthly inflation rate
- TREND: linear trend (value 1 in Jan 2000 to value 144 in Dec 2011)
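The derived variables in the list above can be built from the CPI levels along the following lines. This is only a sketch: the CPI series is simulated, and the data frame name `cpi` is an assumption, not the data set used in this report.

```r
# Hypothetical CPI levels standing in for the real data
# (144 months, Jan 2000 to Dec 2011).
set.seed(1)
n <- 144
cpi <- data.frame(CPI_EUR = 100 * exp(cumsum(rnorm(n, mean = 0.002, sd = 0.003))))

# Logarithm of the index.
cpi$LOGPEUR <- log(cpi$CPI_EUR)

# Monthly inflation: first difference of the log index.
# The first month has no predecessor, so its value is NA.
cpi$DPEUR <- c(NA, diff(cpi$LOGPEUR))

# Linear trend running from 1 to 144.
cpi$TREND <- seq_len(n)

head(cpi, 3)
```

The leading NA in the differenced series is why the regressions further on lose one more observation than the largest lag alone would suggest.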

Questions

(a) Make time series plots of the CPI of the Euro area and the USA, and also of their logarithm \(log(CPI)\) and of the two monthly inflation series \(DP = ∆log(CPI)\). What conclusions do you draw from these plots?

These plots suggest that:

  • US and Euro-area prices may be correlated.
  • The US CPI lies above the Euro-area CPI over most of the sample, and the gap seems to widen slightly over time.
  • Both indexes increase steadily over time, with very few exceptions (notably in 2008).
  • The upward trend of the indexes looks closer to logarithmic than to linear.
  • Both inflation series seem to be stationary.

(b) Perform the Augmented Dickey-Fuller (ADF) test for the two \(log(CPI)\) series. In the ADF test equation, include a constant (\(α\)), a deterministic trend term (\(βt\)), three lags of \(DP = ∆log(CPI)\) and, of course, the variable of interest \(log(CPI_{t−1})\). Report the coefficient of \(log(CPI_{t−1})\) and its standard error and t-value, and draw your conclusion.

## 
##  Augmented Dickey-Fuller Test
## 
## data:  df$LOGPEUR
## Dickey-Fuller = -2.8263, Lag order = 3, p-value = 0.2324
## alternative hypothesis: stationary
## 
##  Augmented Dickey-Fuller Test
## 
## data:  df$LOGPUSA
## Dickey-Fuller = -2.7345, Lag order = 3, p-value = 0.2706
## alternative hypothesis: stationary

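For both series, the ADF statistic (−2.8263 for LOGPEUR and −2.7345 for LOGPUSA) is greater than the 5% critical value of −3.5, and the p-values (0.2324 and 0.2706) are well above 0.05. The Dickey-Fuller statistic is the t-value of the coefficient of \(log(CPI_{t−1})\) in the test equation. Therefore, the null hypothesis of a unit root (non-stationarity) is not rejected for either series.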
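The test output reports only the statistic, so a sketch of the underlying test regression may help show where the coefficient, standard error and t-value of \(log(CPI_{t−1})\) come from. The series below is simulated (a random walk with drift), and the names `y` and `adf_data` are assumptions for illustration. The critical value of −3.5 is the one used in this report.

```r
# Hypothetical log(CPI) series: random walk with drift (has a unit root).
set.seed(42)
n <- 144
y <- log(100) + cumsum(rnorm(n, mean = 0.002, sd = 0.003))

k  <- 3               # number of lagged differences in the test equation
dy <- diff(y)         # dy[i] = y[i + 1] - y[i]
E  <- embed(dy, k + 1) # row j: dy[j + k], dy[j + k - 1], ..., dy[j]

adf_data <- data.frame(
  d_y    = E[, 1],               # dependent variable: current difference
  y_lag1 = y[(k + 1):(n - 1)],   # lagged level, the variable of interest
  trend  = (k + 2):n,            # deterministic trend
  d_lag1 = E[, 2],
  d_lag2 = E[, 3],
  d_lag3 = E[, 4]
)

# ADF test equation with constant, trend, lagged level and three lagged differences.
fit <- lm(d_y ~ trend + y_lag1 + d_lag1 + d_lag2 + d_lag3, data = adf_data)

# Coefficient, standard error and t-value of the lagged level.
rho <- coef(summary(fit))["y_lag1", c("Estimate", "Std. Error", "t value")]
print(rho)

# Reject the unit root only if the t-value is below the critical value.
reject_unit_root <- rho[["t value"]] < -3.5
```

The t-value must be compared with the Dickey-Fuller critical value rather than the usual Student-t value, because under the null hypothesis the statistic does not follow a t distribution.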


(c) As the two series of \(log(CPI)\) are not cointegrated (you need not check this), we continue by modelling the monthly inflation series \(DPEUR = ∆log(CPIEUR)\) for the Euro area. Determine the sample autocorrelations and the sample partial autocorrelations of this series to motivate the use of the following AR model: \(DPEUR_t = α + β_1DPEUR_{t−6} + β_2DPEUR_{t−12} + ε_t\). Estimate the parameters of this model (sample Jan 2000 - Dec 2010).

We compute the sample autocorrelations (AC) and the sample partial autocorrelations (PAC), and report the two lags with the largest partial autocorrelations:

lag   AC      PAC
12    0.554   0.398
6     0.403   0.374

Lags 6 and 12 have the largest partial autocorrelations (0.374 and 0.398), which motivates the proposed model with \(DPEUR_{t−6}\) and \(DPEUR_{t−12}\) as regressors.
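A table of this kind can be produced with the base R functions `acf` and `pacf`. The sketch below uses a simulated series with a lag-12 component, so its numbers will not match the ones reported here. The names `dp`, `tab` and `top2` are assumptions.

```r
# Hypothetical monthly inflation series with a seasonal (lag-12) AR component.
set.seed(7)
dp <- as.numeric(arima.sim(list(ar = c(rep(0, 11), 0.6)), n = 143, sd = 0.003))

# Sample autocorrelations (lag 0 dropped) and partial autocorrelations, lags 1 to 12.
ac  <- as.numeric(acf(dp, lag.max = 12, plot = FALSE)$acf)[-1]
pac <- as.numeric(pacf(dp, lag.max = 12, plot = FALSE)$acf)

tab <- data.frame(lag = 1:12, AC = round(ac, 3), PAC = round(pac, 3))

# The two lags with the largest partial autocorrelation in absolute value.
top2 <- tab[order(-abs(tab$PAC)), ][1:2, ]
print(top2)
```

Sorting by the absolute value keeps large negative partial autocorrelations in view as well.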

Estimating the AR model with lags 6 and 12 gives the following fit:

## 
## Time series regression with "ts" data:
## Start = 14, End = 132
## 
## Call:
## dynlm(formula = ts(DPEUR) ~ L(ts(DPEUR, 6)) + L(ts(DPEUR, 12)), 
##     data = df_train)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0103343 -0.0017369 -0.0000475  0.0015322  0.0080903 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      0.0003838  0.0002811   1.365   0.1749    
## L(ts(DPEUR, 6))  0.1887459  0.0772888   2.442   0.0161 *  
## L(ts(DPEUR, 12)) 0.5979841  0.0835544   7.157 8.05e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.002569 on 116 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.4232, Adjusted R-squared:  0.4132 
## F-statistic: 42.55 on 2 and 116 DF,  p-value: 1.381e-14
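In the call above, lag 6 appears to be reached through `ts(DPEUR, 6)`, whose second positional argument is the series start, combined with the one-period shift of `L()`. An equivalent and perhaps more explicit route is to build the lagged columns by hand and use `lm`. The sketch below does this on simulated data. The helper `lag_vec` and the column names are hypothetical.

```r
# Hypothetical helper: value of x observed k periods earlier (NA where unavailable).
lag_vec <- function(x, k) c(rep(NA, k), head(x, -k))

# Hypothetical inflation series; the first value is NA, as for a first difference.
set.seed(123)
n <- 144
dpeur <- c(NA, rnorm(n - 1, mean = 0.002, sd = 0.003))

d <- data.frame(
  DPEUR     = dpeur,
  DPEUR_L6  = lag_vec(dpeur, 6),
  DPEUR_L12 = lag_vec(dpeur, 12)
)

# Estimation sample: Jan 2000 to Dec 2010 (rows 1 to 132).
train <- d[1:132, ]

# lm drops the rows with missing lags, leaving observations 14 to 132.
ar_fit <- lm(DPEUR ~ DPEUR_L6 + DPEUR_L12, data = train)
print(coef(summary(ar_fit)))
```

With this construction the usable sample runs from observation 14 to 132, which matches the Start and End values and the 116 residual degrees of freedom in the output above.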

(d) Extend the AR model of part (c) by adding lagged values of monthly inflation in the USA at lags 1, 6, and 12. Check that the coefficient at lag 6 is not significant, and estimate the ADL model \(DPEUR_t = α + β_1DPEUR_{t−6} + β_2DPEUR_{t−12} + γ_1DPUSA_{t−1} + γ_2DPUSA_{t−12} + ε_t\) (sample Jan 2000 - Dec 2010).

We extend the AR model with US inflation at lags 1, 6 and 12:

## 
## Time series regression with "ts" data:
## Start = 14, End = 132
## 
## Call:
## dynlm(formula = ts(DPEUR) ~ L(ts(DPEUR, 6)) + L(ts(DPEUR, 12)) + 
##     L(ts(DPUSA)) + L(ts(DPUSA, 6)) + L(ts(DPUSA, 12)), data = df_train)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0065866 -0.0016535 -0.0000118  0.0012630  0.0082682 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       0.0004407  0.0002853   1.545    0.125    
## L(ts(DPEUR, 6))   0.2029891  0.0785520   2.584    0.011 *  
## L(ts(DPEUR, 12))  0.6367464  0.0874766   7.279 4.78e-11 ***
## L(ts(DPUSA))      0.2264287  0.0511286   4.429 2.20e-05 ***
## L(ts(DPUSA, 6))  -0.0560565  0.0547645  -1.024    0.308    
## L(ts(DPUSA, 12)) -0.2300418  0.0541695  -4.247 4.47e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.002272 on 113 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.5602, Adjusted R-squared:  0.5408 
## F-statistic: 28.79 on 5 and 113 DF,  p-value: < 2.2e-16

The coefficient of US inflation at lag 6 is not significant (t-value −1.024, p-value 0.308), so we drop it and estimate the restricted ADL model:

## 
## Time series regression with "ts" data:
## Start = 14, End = 132
## 
## Call:
## dynlm(formula = ts(DPEUR) ~ L(ts(DPEUR, 6)) + L(ts(DPEUR, 12)) + 
##     L(ts(DPUSA)) + L(ts(DPUSA, 12)), data = df_train)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0067809 -0.0016356  0.0000532  0.0013660  0.0082448 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       0.0003391  0.0002676   1.267   0.2076    
## L(ts(DPEUR, 6))   0.1687310  0.0710801   2.374   0.0193 *  
## L(ts(DPEUR, 12))  0.6551529  0.0856263   7.651 6.93e-12 ***
## L(ts(DPUSA))      0.2326460  0.0507772   4.582 1.19e-05 ***
## L(ts(DPUSA, 12)) -0.2264880  0.0540694  -4.189 5.55e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.002273 on 114 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.5561, Adjusted R-squared:  0.5406 
## F-statistic: 35.71 on 4 and 114 DF,  p-value: < 2.2e-16
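Dropping a single regressor on the basis of its t-value is equivalent to an F-test that compares the full and the restricted model. The sketch below illustrates this on simulated data, so the numbers are not those of this report. The helper `lag_vec` and the column names are hypothetical.

```r
# Hypothetical helper: value of x observed k periods earlier (NA where unavailable).
lag_vec <- function(x, k) c(rep(NA, k), head(x, -k))

# Hypothetical inflation series for the estimation sample (132 months).
set.seed(321)
n <- 132
dpeur <- c(NA, rnorm(n - 1, mean = 0.002, sd = 0.003))
dpusa <- c(NA, rnorm(n - 1, mean = 0.002, sd = 0.004))

d <- na.omit(data.frame(
  DPEUR   = dpeur,
  EUR_L6  = lag_vec(dpeur, 6),
  EUR_L12 = lag_vec(dpeur, 12),
  USA_L1  = lag_vec(dpusa, 1),
  USA_L6  = lag_vec(dpusa, 6),
  USA_L12 = lag_vec(dpusa, 12)
))

# Full ADL model and the restricted version without US inflation at lag 6.
full       <- lm(DPEUR ~ EUR_L6 + EUR_L12 + USA_L1 + USA_L6 + USA_L12, data = d)
restricted <- update(full, . ~ . - USA_L6)

# F-test of the single restriction; F equals the squared t-value of USA_L6.
cmp    <- anova(restricted, full)
t_usa6 <- coef(summary(full))["USA_L6", "t value"]
print(cmp)
```

Calling `na.omit` before fitting keeps both models on the same sample, which the comparison requires.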

(e) Use the models of parts (c) and (d) to make two series of 12 monthly inflation forecasts for 2011. At each month, you should use the data that are then available, for example, to forecast inflation for September 2011 you can use the data up to and including August 2011. However, do not re-estimate the model and use the coefficients as obtained in parts (c) and (d). For each of the two forecast series, compute the values of the root mean squared error (RMSE), mean absolute error (MAE), and the sum of the forecast errors (SUM). Finally, give your interpretation of the outcomes.

Model                   RMSE        MAE         SUM
AR model forecasting    0.0011367   0.0008655   0.0012774
ADL model forecasting   0.0009278   0.0007104   0.0006066

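We conclude that the ADL model forecasts better than the AR model: it has the lower value on all three measures (RMSE, MAE and SUM). The sum of the forecast errors is positive for both models, so the errors do not cancel out completely, but for the ADL model it is about half as large (0.0006 against 0.0013).

Both sets of forecasts are fairly accurate, as shown on the graph, but the ADL model, which adds US inflation at lags 1 and 12, is closer to the realised value in almost every month.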
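The three error measures can be computed as in the following sketch for the AR model. The coefficients are the estimates from part (c), but the inflation series is simulated, so the resulting numbers are not those in the table. The error is defined here as actual minus forecast, which is an assumption about the sign convention.

```r
# Coefficients of the AR model as estimated in part (c); they are not re-estimated.
coef_ar <- c(alpha = 0.0003838, b6 = 0.1887459, b12 = 0.5979841)

# Hypothetical monthly inflation series for Jan 2000 to Dec 2011 (144 months).
set.seed(2011)
dpeur <- c(NA, rnorm(143, mean = 0.002, sd = 0.003))

# Forecast months: Jan to Dec 2011 (observations 133 to 144).
h <- 133:144

# Each forecast uses only values observed at least 6 months earlier.
forecast_ar <- coef_ar[["alpha"]] +
  coef_ar[["b6"]]  * dpeur[h - 6] +
  coef_ar[["b12"]] * dpeur[h - 12]

# Forecast error, assumed here to be actual minus forecast.
err <- dpeur[h] - forecast_ar

metrics <- c(
  RMSE = sqrt(mean(err^2)),  # root mean squared error
  MAE  = mean(abs(err)),     # mean absolute error
  SUM  = sum(err)            # sum of forecast errors (sign shows the bias)
)
print(metrics)
```

RMSE and MAE measure the size of the errors, while SUM shows whether the errors lean in one direction.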