Abstract

In this project I select one pair of cointegrated time series out of 10 instruments, using correlation analysis and the Dickey-Fuller test. I then fit the most appropriate ARIMA and VAR models to both series and forecast each instrument for the next 10 periods. Based on the out-of-sample RMSE and MAPE error measures, ARIMA(1,1,1) turned out to be a slightly better forecasting method than VAR(3) for both instruments.

Introduction

The goal of this analysis is to compare out-of-sample ARIMA and VAR forecasts. The data consist of 10 time series, out of which we have to choose 2 that are cointegrated. For this purpose I first computed correlations to pick a potentially cointegrated pair of series. Then, to check whether they are really cointegrated, I ran the Dickey-Fuller test. After obtaining the pair of cointegrated series I fitted VAR and ARIMA models and produced forecasts.

EDA

At the beginning we should perform some exploratory data analysis. The table below shows descriptive statistics such as the minimum, mean, maximum and standard deviation of each series.
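As a hedged sketch, the table could be produced along these lines, assuming the 10 series are stored as columns of an xts object named train_xts (the name that appears in the regression output later on):

library(xts)

summary(coredata(train_xts))        # Min., quartiles, Mean, Max. per series
apply(coredata(train_xts), 2, sd)   # standard deviations (the Sd. row)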

Descriptive Statistics
         y1      y2      y3      y4      y5      y6      y7      y8      y9      y10
Min.     125.6   105.2    34.29  109.7   115.0   157.9    78.77   51.00  140.5   126.3
1st Qu.  146.0   149.3   122.62  136.4   156.0   168.5   133.66   83.41  161.0   143.1
Median   154.4   161.6   147.24  170.2   172.7   173.2   161.15  114.67  170.4   164.3
Mean     155.3   159.0   142.03  164.0   174.5   173.0   158.58  107.78  170.1   160.4
3rd Qu.  162.3   168.3   160.46  191.8   188.6   177.8   182.31  128.95  180.0   177.8
Max.     206.7   185.6   194.24  216.3   277.3   187.1   227.55  153.14  198.1   193.1
Sd.       16.9    14.8    29.6    31.8    33.8     7.3    34.3    28.1    14.7    19.9

Moreover, we can plot a histogram of each series.
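A minimal sketch of the histogram panel, under the same train_xts assumption:

par(mfrow = c(2, 5))   # 2 x 5 grid, one panel per series
for (i in 1:10) {
  hist(coredata(train_xts)[, i], main = paste0("y", i), xlab = "")
}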

We can see that some of them look quite similar, for example y1 and y5, y2 and y3, or y6 and y9. Maybe this indicates cointegration. Let's now move on to finding the cointegrated pair.

Finding pair of cointegrated time series

In order to find a cointegrated pair, the Dickey-Fuller test should be performed. A good way to narrow the 10 candidate series down to just 2 is to compute correlations on both their original and differenced values.
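A hedged sketch of this screening step, again assuming the series sit in train_xts:

cor(coredata(train_xts), method = "kendall")         # correlations on levels
cor(diff(coredata(train_xts)), method = "kendall")   # correlations on first differences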

Correlogram

Diff Correlogram

In the plots above we can see correlograms and Kendall correlation coefficients. Four pairs have fairly high correlation: y1 and y5, y2 and y3, y6 and y9, and y4 and y10. I chose the pair with the largest coefficient on the differenced series, which turned out to be y1 and y5.

We can also plot them. We can see that they co-move; the only difference is that the 5th series is more volatile than the 1st.

Run chart

Diff Run chart

A pair of series is cointegrated if both are integrated of the same order d and there exists a linear combination of them that is integrated of order d-b, with b > 0. To check whether our pair is really cointegrated, we run the Dickey-Fuller test on the residuals of a regression of one series on the other. The results below show that they are: non-stationarity of the residuals is strongly rejected, so the residuals are stationary, which means the 1st and 5th series are cointegrated.
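A minimal sketch of this check, using the regression that appears in the summary below; adf.test() from the tseries package is one common way to run the Dickey-Fuller test on the residuals (the report's own augmentation table may come from a different helper):

library(tseries)

coint_reg <- lm(train_xts[, 1] ~ train_xts[, 5])   # cointegrating regression
adf.test(residuals(coint_reg))                     # Dickey-Fuller test on residuals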

Regression summary

## 
## Call:
## lm(formula = train_xts[, 1] ~ train_xts[, 5])
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.142091 -0.028678  0.002071  0.028194  0.127817 
## 
## Coefficients:
##                   Estimate  Std. Error t value Pr(>|t|)    
## (Intercept)    68.00935261  0.01445347    4705   <2e-16 ***
## train_xts[, 5]  0.49995659  0.00008131    6149   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04671 on 288 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 3.781e+07 on 1 and 288 DF,  p-value: < 2.2e-16

Residuals Run chart

Table

Augmentations   ADF stat.   p-value (ADF)   B-G stat.   p-value (B-G)
0               -16.989     0.01            0.000       0.997
1               -11.707     0.01            0.001       0.979
2               -10.306     0.01            0.010       0.919
3                -9.644     0.01            0.000       0.991

In the regression summary tab we can see the results of regressing the 1st series on the 5th. Both coefficients are statistically significant. Next we have a run chart of the model residuals; they appear homoscedastic. In the table tab we see the Dickey-Fuller test. The last column, the p-value of the Breusch-Godfrey test, indicates that there is no autocorrelation in the residuals. This means we can trust the p-value of the Dickey-Fuller test, which tells us to reject the null hypothesis of non-stationarity. All of these results lead us to conclude that this pair is cointegrated.
Having chosen the pair, let's move on to finding the best ARIMA model.

Finding ARIMA models

To find the best ARIMA model I chose the auto.arima() approach. The auto.arima() function fits many candidate models; to choose among them I used three information criteria: AIC, AICc and BIC. Compared to AIC, BIC tends to choose a more parsimonious model. Let's now fit auto.arima() to the 1st series.
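A hedged sketch of this step, e.g. for the 1st series, assuming the forecast package and the train_xts object from before; the ic argument of auto.arima() switches the information criterion:

library(forecast)

fit_aic  <- auto.arima(train_xts[, 1], ic = "aic")
fit_aicc <- auto.arima(train_xts[, 1], ic = "aicc")
fit_bic  <- auto.arima(train_xts[, 1], ic = "bic")

summary(fit_aic)   # prints the model, coefficients and training-set errors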

1st series

auto.arima AIC

## Series: x 
## ARIMA(3,1,1) 
## 
## Coefficients:
##          ar1     ar2      ar3     ma1
##       0.4394  0.2319  -0.0309  0.8349
## s.e.  0.0821  0.1019   0.0713  0.0566
## 
## sigma^2 estimated as 0.856:  log likelihood=-386.58
## AIC=783.16   AICc=783.37   BIC=801.49
## 
## Training set error measures:
##                      ME    RMSE       MAE        MPE      MAPE      MASE        ACF1
## Training set 0.01688695 0.91719 0.7286551 0.01617509 0.4766854 0.5066435 0.002034327

auto.arima AICc

## Series: x 
## ARIMA(3,1,1) 
## 
## Coefficients:
##          ar1     ar2      ar3     ma1
##       0.4394  0.2319  -0.0309  0.8349
## s.e.  0.0821  0.1019   0.0713  0.0566
## 
## sigma^2 estimated as 0.856:  log likelihood=-386.58
## AIC=783.16   AICc=783.37   BIC=801.49
## 
## Training set error measures:
##                      ME    RMSE       MAE        MPE      MAPE      MASE        ACF1
## Training set 0.01688695 0.91719 0.7286551 0.01617509 0.4766854 0.5066435 0.002034327

auto.arima BIC

## Series: x 
## ARIMA(1,1,1) 
## 
## Coefficients:
##          ar1     ma1
##       0.6230  0.7064
## s.e.  0.0517  0.0574
## 
## sigma^2 estimated as 0.8692:  log likelihood=-389.75
## AIC=785.5   AICc=785.59   BIC=796.5
## 
## Training set error measures:
##                      ME      RMSE       MAE        MPE      MAPE      MASE       ACF1
## Training set 0.01795807 0.9274546 0.7335921 0.01598381 0.4797407 0.5100763 -0.0603988

In the results above we can see that auto.arima() using AIC and AICc chose the same model, ARIMA(3,1,1), while BIC chose ARIMA(1,1,1).
Let's inspect their residuals using the ACF plot and the Ljung-Box test.
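A minimal sketch of the diagnostics, reusing the hypothetical fitted objects from the sketch above; checkresiduals() plots the residuals, their ACF and histogram, and runs the Ljung-Box test:

checkresiduals(fit_aic)   # ARIMA(3,1,1)
checkresiduals(fit_bic)   # ARIMA(1,1,1)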

ARIMA(3,1,1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(3,1,1)
## Q* = 4.7608, df = 6, p-value = 0.5748
## 
## Model df: 4.   Total lags used: 10

ARIMA(1,1,1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,1,1)
## Q* = 11.684, df = 8, p-value = 0.1659
## 
## Model df: 2.   Total lags used: 10

We can see that both models are well specified. Looking at the ACF plots, there are no significant lags in either model; the residuals resemble white noise and are more or less normally distributed. For both models we cannot reject the null hypothesis of the Ljung-Box test, which states that there is no autocorrelation.

To choose the better of these two models I ran cross-validation on the training data, since our main goal is the best forecast. Cross-validation was performed using Prof. Hyndman's tsCV() function with a forecast horizon of 10.
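A hedged sketch of this step for the 1st series, assuming the series can be coerced to a regular ts object; tsCV() expects a forecast function and returns an error matrix (one column per horizon) that can be summarised into a single RMSE:

f311 <- function(x, h) forecast(Arima(x, order = c(3, 1, 1)), h = h)
f111 <- function(x, h) forecast(Arima(x, order = c(1, 1, 1)), h = h)

e311 <- tsCV(as.ts(train_xts[, 1]), f311, h = 10)
e111 <- tsCV(as.ts(train_xts[, 1]), f111, h = 10)

sqrt(mean(e311^2, na.rm = TRUE))   # CV RMSE of ARIMA(3,1,1)
sqrt(mean(e111^2, na.rm = TRUE))   # CV RMSE of ARIMA(1,1,1)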

Cross Validation
              RMSE
ARIMA(3,1,1)  8.058
ARIMA(1,1,1)  7.521

Based on the cross-validated RMSE, ARIMA(1,1,1) is slightly better.

5th series

The same procedure was performed for the 5th instrument.

auto.arima AIC

## Series: x 
## ARIMA(3,1,1) 
## 
## Coefficients:
##          ar1     ar2      ar3     ma1
##       0.4728  0.1885  -0.0093  0.7617
## s.e.  0.0936  0.1155   0.0735  0.0727
## 
## sigma^2 estimated as 3.563:  log likelihood=-592.54
## AIC=1195.08   AICc=1195.29   BIC=1213.41
## 
## Training set error measures:
##                      ME     RMSE      MAE        MPE      MAPE      MASE        ACF1
## Training set 0.03397884 1.871173 1.493078 0.03764282 0.8938419 0.5188787 0.002075807

auto.arima AICc

## Series: x 
## ARIMA(3,1,1) 
## 
## Coefficients:
##          ar1     ar2      ar3     ma1
##       0.4728  0.1885  -0.0093  0.7617
## s.e.  0.0936  0.1155   0.0735  0.0727
## 
## sigma^2 estimated as 3.563:  log likelihood=-592.54
## AIC=1195.08   AICc=1195.29   BIC=1213.41
## 
## Training set error measures:
##                      ME     RMSE      MAE        MPE      MAPE      MASE        ACF1
## Training set 0.03397884 1.871173 1.493078 0.03764282 0.8938419 0.5188787 0.002075807

auto.arima BIC

## Series: x 
## ARIMA(1,1,1) 
## 
## Coefficients:
##          ar1     ma1
##       0.6364  0.6457
## s.e.  0.0513  0.0589
## 
## sigma^2 estimated as 3.592:  log likelihood=-594.72
## AIC=1195.43   AICc=1195.52   BIC=1206.43
## 
## Training set error measures:
##                      ME     RMSE      MAE        MPE      MAPE      MASE       ACF1
## Training set 0.03574344 1.885456 1.498846 0.03576801 0.8964929 0.5208834 -0.0494498

The results are much the same: based on AIC and AICc, ARIMA(3,1,1) was best, while BIC chose ARIMA(1,1,1).

ARIMA(3,1,1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(3,1,1)
## Q* = 6.7261, df = 6, p-value = 0.3469
## 
## Model df: 4.   Total lags used: 10

ARIMA(1,1,1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,1,1)
## Q* = 12.368, df = 8, p-value = 0.1355
## 
## Model df: 2.   Total lags used: 10

Analysis of the residuals shows a significant spike at lag 8 for both models. Still, we cannot reject the Ljung-Box null hypothesis for either model, and the residuals look like white noise.
Let's once more pick the best model using cross-validation.

Cross Validation
              RMSE
ARIMA(3,1,1)  15.432
ARIMA(1,1,1)  15.028

Cross-validation once again indicates that ARIMA(1,1,1) is better for the 5th series.

Finding VAR model

To select the right VAR model I used the VARselect() function, which returns information criteria and the final prediction error for VAR(p) processes of sequentially increasing lag order. Different criteria point to different models, but I decided to choose the one proposed by the SC (Schwarz) criterion: SC tends to select more parsimonious models, which are usually better at forecasting. VAR(3) was chosen.
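A minimal sketch of the selection and the final fit, assuming the vars package and the two chosen columns of train_xts:

library(vars)

y_pair <- train_xts[, c(1, 5)]
VARselect(y_pair, lag.max = 7, type = "const")   # criteria for p = 1..7 (output below)

var3 <- VAR(y_pair, p = 3, type = "const")       # VAR(3), as suggested by SC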

## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      6      4      3      6 
## 
## $criteria
##                  1            2            3            4            5            6            7
## AIC(n) -3.49531880 -4.630297925 -4.717099193 -4.752609538 -4.745702309 -4.765363443 -4.748754805
## HQ(n)  -3.46432872 -4.578647794 -4.644789009 -4.659639301 -4.632072020 -4.631073101 -4.593804411
## SC(n)  -3.41803017 -4.501483547 -4.536759064 -4.520743657 -4.462310678 -4.430446060 -4.362311671
## FPE(n)  0.03033912  0.009751925  0.008941258  0.008629518  0.008689639  0.008520894  0.008664198

Forecasting

To evaluate the out-of-sample predictions I used RMSE and MAPE. In the table below we can see these statistics for the first series.
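A hedged sketch of the forecasting and evaluation step, reusing the hypothetical fit_bic (ARIMA(1,1,1)) and var3 objects from the sketches above, and assuming a hold-out object test_xts with the next 10 observations:

fc_arima <- forecast(fit_bic, h = 10)     # 10-step ARIMA forecast
fc_var   <- predict(var3, n.ahead = 10)   # 10-step VAR forecast

actual <- as.numeric(test_xts[1:10, 1])   # test_xts is an assumed hold-out set
pred_a <- as.numeric(fc_arima$mean)
pred_v <- fc_var$fcst[[1]][, "fcst"]      # point forecasts for the 1st series

rmse <- function(a, p) sqrt(mean((a - p)^2))
mape <- function(a, p) 100 * mean(abs((a - p) / a))

c(RMSE_ARIMA = rmse(actual, pred_a), RMSE_VAR = rmse(actual, pred_v))
c(MAPE_ARIMA = mape(actual, pred_a), MAPE_VAR = mape(actual, pred_v))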

1st

              MAPE     RMSE
VAR(3)        10.669   24.418
ARIMA(1,1,1)   9.716   22.326

ARIMA(1,1,1) produced a slightly better forecast: both RMSE and MAPE are lower for ARIMA. Let's look at the 5th series now.

5th

              MAPE     RMSE
VAR(3)        16.436   48.827
ARIMA(1,1,1)  14.883   44.425

The results are similar: ARIMA(1,1,1) is better on both RMSE and MAPE.
Evaluating forecasts based only on numbers can be hard and even misleading, so let's plot those forecasts against the original out-of-sample data.
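A minimal sketch of such a plot for the 1st series, reusing the objects from the forecasting sketch above:

plot(1:10, actual, type = "l", lwd = 2,
     ylim = range(actual, pred_a, pred_v),
     xlab = "forecast horizon", ylab = "y1",
     main = "Out-of-sample forecasts, 1st series")
lines(1:10, pred_a, col = "blue", lty = 2)   # ARIMA(1,1,1)
lines(1:10, pred_v, col = "red",  lty = 3)   # VAR(3)
legend("topleft", c("actual", "ARIMA(1,1,1)", "VAR(3)"),
       col = c("black", "blue", "red"), lty = 1:3)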

1st

5th

Looking at the forecast plots, we can see that both methods produced poor forecasts that are far from the original series. The test data rise sharply, while the forecasts flatten out quickly and show almost no growth. That is not a surprise: the point forecasts converge to the unconditional mean, so these models should not be used to forecast 10 periods ahead in this situation. An ARIMA model carries useful forecast information only for roughly the next max(p,q) periods; after that, the forecast converges to the unconditional mean.

Summary

The goal of this project was first to find a cointegrated pair of time series; this was easily done using correlation analysis and then confirmed with the Dickey-Fuller test. After that, VAR(3) and ARIMA(1,1,1) models were fitted and their forecasts for the next 10 observations were evaluated. The ARIMA model produced slightly better predictions, but compared against the test values both performed poorly. With these specific fitted models we should not forecast 10 periods ahead.