We read in the data obtained in Phase 1 and calculate simple returns.
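As a minimal illustration of what "simple returns" means here, the sketch below computes R_t = P_t / P_{t-1} - 1 in plain Python. The report itself was produced in R; the function name `simple_returns` and the price values are hypothetical and only serve the example.

```python
def simple_returns(prices):
    """Return the simple returns R_t = P_t / P_{t-1} - 1 for a price series."""
    if len(prices) < 2:
        return []
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


# Hypothetical closing prices, used only for illustration (not MSFT data).
prices = [100.0, 102.0, 99.96, 104.958]
returns = simple_returns(prices)
print([round(r, 4) for r in returns])
```

A series of n prices yields n - 1 returns, which is why the return series is one observation shorter than the price series.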
Plotting returns:
We split the data into an in-sample (training) set and an out-of-sample (testing) set.
We split the data the same way as in Phases 2 and 3.
## [1] "Number of observations in training set: 505 (82.11%)"
## [1] "Number of observations in testing set: 110 (17.89%)"
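The split is chronological: the first observations form the training set and the remaining ones the testing set, with no shuffling. A small Python sketch of that idea follows; the helper `chronological_split` and the placeholder series are assumptions for illustration, while the counts 505 and 110 are taken from the output above.

```python
def chronological_split(series, n_train):
    """Split a time-ordered series: first n_train points train, the rest test."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must be between 1 and len(series) - 1")
    return series[:n_train], series[n_train:]


# Placeholder series with the same length as the report's return series (505 + 110).
returns = [0.0] * 615
training_set, testing_set = chronological_split(returns, 505)

train_share = 100 * len(training_set) / len(returns)
test_share = 100 * len(testing_set) / len(returns)
print(f"training: {len(training_set)} ({train_share:.2f}%)")
print(f"testing: {len(testing_set)} ({test_share:.2f}%)")
```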
Looking at the ACF and PACF plots, we see that there is serial correlation.
We look at the regular, squared and absolute returns for up to 10 and 30 lags.
The regular ACF plot with 30 lags indicates serial correlation up to the 25th lag, but the squared returns show that we actually need to go up to the 30th lag.
The PACF plots show the same.
We use the Box-Ljung test to test for serial correlation in the returns.
Since the p-value is below 5% (p-value < 2.2e-16), we have strong evidence to reject the null hypothesis of no serial correlation, i.e. there is serial correlation.
This means that we will need an ARMA + GARCH model.
##
## Box-Ljung test
##
## data: training_set
## X-squared = 271.45, df = 30, p-value < 2.2e-16
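To make the test statistic less of a black box, here is a hedged Python sketch of the Ljung-Box statistic Q = n(n+2) Σ ρ_k² / (n-k), summed over lags k = 1..m. It is not the R routine that produced the output above; the simulated series, the function names and the seed are assumptions, and instead of a p-value the sketch compares Q with the 5% critical value of a chi-squared distribution with 30 degrees of freedom (about 43.77).

```python
import random


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic using lags 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


CHI2_CRIT_95_DF30 = 43.773  # 5% critical value, chi-squared with 30 d.o.f.

random.seed(0)
# Simulated data for illustration only: white noise and an AR(1) built from it.
noise = [random.gauss(0.0, 1.0) for _ in range(505)]
ar1 = [noise[0]]
for e in noise[1:]:
    ar1.append(0.6 * ar1[-1] + e)

q_noise = ljung_box_q(noise, 30)
q_ar1 = ljung_box_q(ar1, 30)
print(f"Q(30) white noise: {q_noise:.2f}")
print(f"Q(30) AR(1):       {q_ar1:.2f} (reject if > {CHI2_CRIT_95_DF30})")
```

The serially correlated series produces a Q far above the critical value, which is the same pattern as the X-squared = 271.45 reported for the training set.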
We need to check whether there is an ARCH effect in the data.
Since the expected return of MSFT is not zero (calculated in Phase 1), we adjust for that by subtracting the mean from the returns before squaring them.
Since the p-value is practically zero (< 2.2e-16), we reject the null hypothesis of no conditional heteroscedasticity (no ARCH effect).
This means that we have strong evidence of an ARCH effect.
##
## Box-Ljung test
##
## data: at^2
## X-squared = 593.01, df = 30, p-value < 2.2e-16
##
## Call:
## lm(formula = atsq ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.0068785 -0.0002686 -0.0000936 0.0000994 0.0132433
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.050e-04 6.543e-05 1.604 0.109386
## x1 6.564e-01 4.744e-02 13.836 < 2e-16 ***
## x2 1.040e-02 5.675e-02 0.183 0.854609
## x3 -1.654e-01 5.672e-02 -2.917 0.003715 **
## x4 1.695e-01 5.710e-02 2.969 0.003149 **
## x5 -4.795e-02 5.748e-02 -0.834 0.404587
## x6 2.789e-02 5.742e-02 0.486 0.627372
## x7 7.517e-02 5.742e-02 1.309 0.191151
## x8 -4.266e-02 5.750e-02 -0.742 0.458544
## x9 4.301e-02 5.753e-02 0.748 0.455027
## x10 1.390e-01 5.755e-02 2.416 0.016094 *
## x11 -6.316e-02 5.787e-02 -1.091 0.275647
## x12 3.464e-02 5.777e-02 0.600 0.549101
## x13 -8.333e-02 5.739e-02 -1.452 0.147172
## x14 -1.523e-02 5.680e-02 -0.268 0.788771
## x15 1.373e-01 5.681e-02 2.417 0.016036 *
## x16 1.004e-04 5.681e-02 0.002 0.998590
## x17 -1.947e-01 5.681e-02 -3.427 0.000666 ***
## x18 1.445e-01 5.742e-02 2.517 0.012193 *
## x19 -9.493e-02 5.780e-02 -1.642 0.101233
## x20 4.377e-02 5.790e-02 0.756 0.450059
## x21 -2.435e-02 5.754e-02 -0.423 0.672358
## x22 -1.432e-02 5.751e-02 -0.249 0.803490
## x23 2.610e-02 5.748e-02 0.454 0.649935
## x24 -3.067e-02 5.737e-02 -0.534 0.593264
## x25 7.369e-02 5.737e-02 1.285 0.199633
## x26 -9.916e-02 5.743e-02 -1.727 0.084919 .
## x27 9.089e-02 5.698e-02 1.595 0.111379
## x28 -3.698e-02 5.651e-02 -0.654 0.513189
## x29 -7.300e-03 5.653e-02 -0.129 0.897316
## x30 2.373e-02 4.726e-02 0.502 0.615828
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.001214 on 444 degrees of freedom
## Multiple R-squared: 0.4863, Adjusted R-squared: 0.4516
## F-statistic: 14.01 on 30 and 444 DF, p-value: < 2.2e-16
Since we found serial correlation, we use an ARMA(1, 1)-GARCH(1, 1) model with Student t-distribution.
Looking at the Standardised Residuals Tests, we see the following:
Since the p-values of the Ljung-Box tests on the standardized residuals are greater than 5%, there is no evidence of correlation in our residuals.
The same holds for the squared residuals, so there is no remaining dependence in the conditional variance.
The p-value for the LM Arch Test is 0.69 (not rejecting the null), which means that there is no additional ARCH effect that our model did not capture.
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~arma(1, 1) + garch(1, 1), data = training_set,
## cond.dist = "std", trace = F)
##
## Mean and Variance Equation:
## data ~ arma(1, 1) + garch(1, 1)
## <environment: 0x7ff7a8e1ed68>
## [data = training_set]
##
## Conditional Distribution:
## std
##
## Coefficient(s):
## mu ar1 ma1 omega alpha1 beta1
## 7.0173e-04 6.7945e-01 -8.2628e-01 9.0590e-06 1.8654e-01 7.9894e-01
## shape
## 6.2238e+00
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu 7.017e-04 2.459e-04 2.854 0.004319 **
## ar1 6.795e-01 9.993e-02 6.799 1.05e-11 ***
## ma1 -8.263e-01 7.698e-02 -10.734 < 2e-16 ***
## omega 9.059e-06 4.345e-06 2.085 0.037059 *
## alpha1 1.865e-01 4.618e-02 4.039 5.36e-05 ***
## beta1 7.989e-01 4.026e-02 19.845 < 2e-16 ***
## shape 6.224e+00 1.764e+00 3.529 0.000418 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 1383.943 normalized: 2.740481
##
## Description:
## Mon Jun 21 09:54:05 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 33.82925 4.508907e-08
## Shapiro-Wilk Test R W 0.9830147 1.259921e-05
## Ljung-Box Test R Q(10) 14.75566 0.1412272
## Ljung-Box Test R Q(15) 20.5683 0.1511985
## Ljung-Box Test R Q(20) 22.85278 0.2960811
## Ljung-Box Test R^2 Q(10) 8.821484 0.5491252
## Ljung-Box Test R^2 Q(15) 10.10979 0.8127778
## Ljung-Box Test R^2 Q(20) 11.66964 0.9269777
## LM Arch Test R TR^2 9.144168 0.6905698
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -5.453240 -5.394681 -5.453617 -5.430271
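To show what the fitted variance equation implies, the sketch below plugs the omega, alpha1 and beta1 estimates from the output above into the GARCH(1, 1) recursion σ²_t = ω + α₁ a²_{t-1} + β₁ σ²_{t-1}. This is a Python illustration under stated assumptions, not the fGarch code: the shock values are hypothetical, and starting the recursion at the unconditional variance is a choice made for the example.

```python
# Estimates copied from the ARMA(1, 1)-GARCH(1, 1) output above.
OMEGA, ALPHA1, BETA1 = 9.0590e-06, 1.8654e-01, 7.9894e-01


def garch11_variance_path(shocks, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion.

    The path starts at the unconditional variance omega / (1 - alpha - beta)
    and appends one updated variance per shock.
    """
    persistence = alpha + beta
    if persistence >= 1.0:
        raise ValueError("unconditional variance requires alpha + beta < 1")
    sigma2 = [omega / (1.0 - persistence)]
    for a in shocks:
        sigma2.append(omega + alpha * a * a + beta * sigma2[-1])
    return sigma2


persistence = ALPHA1 + BETA1
long_run_variance = OMEGA / (1.0 - persistence)

# Hypothetical mean-adjusted return shocks, for illustration only.
shocks = [0.01, -0.03, 0.005]
path = garch11_variance_path(shocks, OMEGA, ALPHA1, BETA1)

print(f"persistence alpha1 + beta1 = {persistence:.5f}")
print(f"long-run daily volatility  = {long_run_variance ** 0.5:.4f}")
print("conditional volatilities   =", [round(v ** 0.5, 4) for v in path])
```

With alpha1 + beta1 close to one, a large shock (here -0.03) raises the conditional variance and the effect decays only slowly, which is the volatility clustering the model is meant to capture.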
The last model gave us pretty good results, so let's try to increase the order of the model.
All of our parameters are significant at the 5% level except for alpha 1, which is only marginally significant (p-value 0.055), and alpha 2 (p-value 0.478).
Just like in the previous model, we can see that there is no correlation and no dependence in the conditional variance.
The p-value for the LM Arch Test is 0.73 (not rejecting the null), which means that there is no additional ARCH effect that our model did not capture. This value is even greater than in the previous model.
This might indicate that this model is better, but we will keep track of the AIC values and compare the models that way.
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~arma(1, 1) + garch(2, 1), data = training_set,
## cond.dist = "std", trace = F)
##
## Mean and Variance Equation:
## data ~ arma(1, 1) + garch(2, 1)
## <environment: 0x7ff78d47ba80>
## [data = training_set]
##
## Conditional Distribution:
## std
##
## Coefficient(s):
## mu ar1 ma1 omega alpha1 alpha2
## 0.00068305 0.68536958 -0.83340048 0.00001009 0.14379790 0.06496868
## beta1 shape
## 0.77600250 6.26818424
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu 6.831e-04 2.330e-04 2.932 0.003370 **
## ar1 6.854e-01 9.570e-02 7.161 7.99e-13 ***
## ma1 -8.334e-01 7.345e-02 -11.347 < 2e-16 ***
## omega 1.009e-05 5.000e-06 2.018 0.043581 *
## alpha1 1.438e-01 7.501e-02 1.917 0.055229 .
## alpha2 6.497e-02 9.147e-02 0.710 0.477529
## beta1 7.760e-01 5.327e-02 14.566 < 2e-16 ***
## shape 6.268e+00 1.792e+00 3.498 0.000468 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 1384.426 normalized: 2.741437
##
## Description:
## Mon Jun 21 09:54:06 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 33.43287 5.497227e-08
## Shapiro-Wilk Test R W 0.9831497 1.371396e-05
## Ljung-Box Test R Q(10) 14.67442 0.1443905
## Ljung-Box Test R Q(15) 20.45534 0.1551523
## Ljung-Box Test R Q(20) 22.61895 0.3078896
## Ljung-Box Test R^2 Q(10) 8.612195 0.5692598
## Ljung-Box Test R^2 Q(15) 9.580863 0.8452443
## Ljung-Box Test R^2 Q(20) 11.3483 0.936685
## LM Arch Test R TR^2 8.73106 0.7257136
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -5.451191 -5.384267 -5.451683 -5.424941
The previous model gave us pretty good results, so let's try the same order with a different distribution.
Looking at the Standardised Residuals Tests, we conclude the following:
The Jarque-Bera Test p-value is practically zero (5.032e-08), so the standardized residuals are not normally distributed.
There is no evidence of correlation in our residuals and no dependence in the conditional variance.
The p-value for the LM Arch Test is 0.68, so there is no additional ARCH effect that our model did not capture.
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~arma(1, 1) + garch(2, 1), data = training_set,
## cond.dist = "sstd", trace = F)
##
## Mean and Variance Equation:
## data ~ arma(1, 1) + garch(2, 1)
## <environment: 0x7ff78db83eb0>
## [data = training_set]
##
## Conditional Distribution:
## sstd
##
## Coefficient(s):
## mu ar1 ma1 omega alpha1 alpha2
## 0.00061444 0.68385472 -0.83579617 0.00001079 0.12527362 0.08521354
## beta1 skew shape
## 0.76959753 0.86590389 6.71037266
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu 6.144e-04 2.009e-04 3.059 0.002220 **
## ar1 6.839e-01 9.195e-02 7.437 1.03e-13 ***
## ma1 -8.358e-01 6.902e-02 -12.109 < 2e-16 ***
## omega 1.079e-05 4.944e-06 2.182 0.029084 *
## alpha1 1.253e-01 6.930e-02 1.808 0.070672 .
## alpha2 8.521e-02 8.665e-02 0.983 0.325386
## beta1 7.696e-01 5.202e-02 14.794 < 2e-16 ***
## skew 8.659e-01 5.618e-02 15.413 < 2e-16 ***
## shape 6.710e+00 2.027e+00 3.311 0.000931 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 1386.982 normalized: 2.746499
##
## Description:
## Mon Jun 21 09:54:07 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 33.60938 5.032875e-08
## Shapiro-Wilk Test R W 0.9829162 1.184532e-05
## Ljung-Box Test R Q(10) 14.93142 0.1345823
## Ljung-Box Test R Q(15) 20.53264 0.1524379
## Ljung-Box Test R Q(20) 22.63912 0.3068601
## Ljung-Box Test R^2 Q(10) 9.089247 0.5236552
## Ljung-Box Test R^2 Q(15) 9.879089 0.8272758
## Ljung-Box Test R^2 Q(20) 11.83192 0.9217395
## LM Arch Test R TR^2 9.264897 0.6801537
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -5.457355 -5.382066 -5.457975 -5.427824
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~arma(1, 1) + garch(2, 1), data = training_set,
## cond.dist = "ged", trace = F)
##
## Mean and Variance Equation:
## data ~ arma(1, 1) + garch(2, 1)
## <environment: 0x7ff7aaf12230>
## [data = training_set]
##
## Conditional Distribution:
## ged
##
## Coefficient(s):
## mu ar1 ma1 omega alpha1 alpha2
## 6.7615e-04 6.8279e-01 -8.3055e-01 1.0797e-05 1.4818e-01 6.0657e-02
## beta1 shape
## 7.6827e-01 1.4025e+00
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu 6.762e-04 1.814e-04 3.728 0.000193 ***
## ar1 6.828e-01 7.533e-02 9.064 < 2e-16 ***
## ma1 -8.305e-01 6.124e-02 -13.562 < 2e-16 ***
## omega 1.080e-05 5.081e-06 2.125 0.033596 *
## alpha1 1.482e-01 7.098e-02 2.088 0.036825 *
## alpha2 6.066e-02 8.636e-02 0.702 0.482424
## beta1 7.683e-01 5.323e-02 14.432 < 2e-16 ***
## shape 1.402e+00 1.206e-01 11.629 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 1384.262 normalized: 2.741112
##
## Description:
## Mon Jun 21 09:54:07 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 32.57534 8.440238e-08
## Shapiro-Wilk Test R W 0.98335 1.556233e-05
## Ljung-Box Test R Q(10) 14.7511 0.1414033
## Ljung-Box Test R Q(15) 20.61777 0.1494925
## Ljung-Box Test R Q(20) 22.85798 0.2958215
## Ljung-Box Test R^2 Q(10) 8.699573 0.5608328
## Ljung-Box Test R^2 Q(15) 9.613367 0.84333
## Ljung-Box Test R^2 Q(20) 11.47761 0.9328845
## LM Arch Test R TR^2 8.796358 0.7202188
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -5.450541 -5.383617 -5.451033 -5.424292
The exponential GARCH (EGARCH) model is another form of GARCH model that overcomes some deficiencies of the standard GARCH model: it can capture asymmetries, and it imposes fewer restrictions on the parameters of the model.
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : eGARCH(2,1)
## Mean Model : ARFIMA(1,0,1)
## Distribution : std
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.002133 0.000262 8.13284 0.000000
## ar1 0.696325 0.036141 19.26708 0.000000
## ma1 -0.835149 0.030195 -27.65864 0.000000
## omega -0.342423 0.149743 -2.28674 0.022211
## alpha1 -0.018614 0.072655 -0.25620 0.797794
## alpha2 -0.011812 0.073890 -0.15986 0.872989
## beta1 0.959055 0.018030 53.19312 0.000000
## gamma1 0.257589 0.124694 2.06576 0.038851
## gamma2 0.108704 0.135069 0.80480 0.420933
## shape 6.352813 1.821125 3.48840 0.000486
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.002133 0.000240 8.89910 0.000000
## ar1 0.696325 0.015976 43.58696 0.000000
## ma1 -0.835149 0.015498 -53.88634 0.000000
## omega -0.342423 0.122392 -2.79775 0.005146
## alpha1 -0.018614 0.081645 -0.22799 0.819653
## alpha2 -0.011812 0.082048 -0.14397 0.885527
## beta1 0.959055 0.014909 64.32551 0.000000
## gamma1 0.257589 0.151085 1.70493 0.088207
## gamma2 0.108704 0.161564 0.67282 0.501060
## shape 6.352813 1.589815 3.99594 0.000064
##
## LogLikelihood : 1383.244
##
## Information Criteria
## ------------------------------------
##
## Akaike -5.4386
## Bayes -5.3549
## Shibata -5.4394
## Hannan-Quinn -5.4058
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.3239 0.569276
## Lag[2*(p+q)+(p+q)-1][5] 4.9894 0.003433
## Lag[4*(p+q)+(p+q)-1][9] 8.1743 0.049741
## d.o.f=2
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.1371 0.7112
## Lag[2*(p+q)+(p+q)-1][8] 1.7968 0.8889
## Lag[4*(p+q)+(p+q)-1][14] 4.5276 0.8254
## d.o.f=3
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[4] 0.1148 0.500 2.000 0.7348
## ARCH Lag[6] 1.8721 1.461 1.711 0.5196
## ARCH Lag[8] 2.4053 2.368 1.583 0.6595
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.1104
## Individual Statistics:
## mu 0.10216
## ar1 0.02347
## ma1 0.02211
## omega 0.20590
## alpha1 0.11822
## alpha2 0.07415
## beta1 0.18502
## gamma1 0.09281
## gamma2 0.12143
## shape 0.15281
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 2.29 2.54 3.05
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.4548 0.6494
## Negative Sign Bias 0.2151 0.8298
## Positive Sign Bias 0.4098 0.6821
## Joint Effect 0.4375 0.9324
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 28.35 0.07698
## 2 30 33.51 0.25746
## 3 40 50.49 0.10295
## 4 50 61.04 0.11606
##
##
## Elapsed time : 0.2306859
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : fGARCH(2,1)
## fGARCH Sub-Model : GARCH
## Mean Model : ARFIMA(1,0,1)
## Distribution : std
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.002197 0.000311 7.06914 0.000000
## ar1 0.686639 0.110424 6.21821 0.000000
## ma1 -0.829487 0.084608 -9.80386 0.000000
## omega 0.000009 0.000006 1.53364 0.125119
## alpha1 0.185852 0.051548 3.60542 0.000312
## alpha2 0.003371 0.013528 0.24918 0.803224
## beta1 0.781031 0.071300 10.95411 0.000000
## shape 6.483214 2.066436 3.13739 0.001705
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.002197 0.000311 7.05301 0.000000
## ar1 0.686639 0.123220 5.57245 0.000000
## ma1 -0.829487 0.094692 -8.75981 0.000000
## omega 0.000009 0.000009 1.02605 0.304867
## alpha1 0.185852 0.073040 2.54452 0.010943
## alpha2 0.003371 0.013891 0.24266 0.808269
## beta1 0.781031 0.060986 12.80665 0.000000
## shape 6.483214 2.302972 2.81515 0.004875
##
## LogLikelihood : 1383.952
##
## Information Criteria
## ------------------------------------
##
## Akaike -5.4493
## Bayes -5.3824
## Shibata -5.4498
## Hannan-Quinn -5.4231
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.137 0.711309
## Lag[2*(p+q)+(p+q)-1][5] 4.901 0.004622
## Lag[4*(p+q)+(p+q)-1][9] 8.130 0.051666
## d.o.f=2
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.05118 0.8210
## Lag[2*(p+q)+(p+q)-1][8] 2.18498 0.8295
## Lag[4*(p+q)+(p+q)-1][14] 5.06911 0.7609
## d.o.f=3
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[4] 0.05846 0.500 2.000 0.8089
## ARCH Lag[6] 1.54203 1.461 1.711 0.5998
## ARCH Lag[8] 2.03951 2.368 1.583 0.7336
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 2.5116
## Individual Statistics:
## mu 0.11303
## ar1 0.02647
## ma1 0.02857
## omega 0.30597
## alpha1 0.21200
## alpha2 0.17454
## beta1 0.19165
## shape 0.14061
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.89 2.11 2.59
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.35189 0.7251
## Negative Sign Bias 0.09837 0.9217
## Positive Sign Bias 0.11232 0.9106
## Joint Effect 0.25044 0.9691
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 32.07 0.03070
## 2 30 27.46 0.54715
## 3 40 53.18 0.06458
## 4 50 58.07 0.17581
##
##
## Elapsed time : 0.4275339
The Integrated GARCH (IGARCH) model is a restricted version of the GARCH model in which the persistence parameters sum to one: \[ \sum_{i=1}^{p} \beta_{i} + \sum_{i=1}^{q} \alpha_{i} = 1 \]
##
## *---------------------------------*
## * GARCH Model Fit *
## *---------------------------------*
##
## Conditional Variance Dynamics
## -----------------------------------
## GARCH Model : iGARCH(2,1)
## Mean Model : ARFIMA(1,0,1)
## Distribution : std
##
## Optimal Parameters
## ------------------------------------
## Estimate Std. Error t value Pr(>|t|)
## mu 0.002202 0.000292 7.54328 0.000000
## ar1 0.692584 0.104186 6.64756 0.000000
## ma1 -0.837489 0.079399 -10.54790 0.000000
## omega 0.000009 0.000004 2.43006 0.015096
## alpha1 0.152971 0.079759 1.91792 0.055121
## alpha2 0.070082 0.092792 0.75526 0.450095
## beta1 0.776948 NA NA NA
## shape 6.079698 1.564976 3.88485 0.000102
##
## Robust Standard Errors:
## Estimate Std. Error t value Pr(>|t|)
## mu 0.002202 0.000291 7.56560 0.000000
## ar1 0.692584 0.124074 5.58205 0.000000
## ma1 -0.837489 0.098121 -8.53526 0.000000
## omega 0.000009 0.000003 2.80590 0.005018
## alpha1 0.152971 0.092254 1.65814 0.097289
## alpha2 0.070082 0.103313 0.67834 0.497553
## beta1 0.776948 NA NA NA
## shape 6.079698 1.441285 4.21825 0.000025
##
## LogLikelihood : 1384.019
##
## Information Criteria
## ------------------------------------
##
## Akaike -5.4535
## Bayes -5.3950
## Shibata -5.4539
## Hannan-Quinn -5.4306
##
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.09987 0.751983
## Lag[2*(p+q)+(p+q)-1][5] 4.87384 0.005065
## Lag[4*(p+q)+(p+q)-1][9] 8.03274 0.056110
## d.o.f=2
## H0 : No serial correlation
##
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
## statistic p-value
## Lag[1] 0.02197 0.8822
## Lag[2*(p+q)+(p+q)-1][8] 1.85547 0.8805
## Lag[4*(p+q)+(p+q)-1][14] 4.79125 0.7949
## d.o.f=3
##
## Weighted ARCH LM Tests
## ------------------------------------
## Statistic Shape Scale P-Value
## ARCH Lag[4] 0.01382 0.500 2.000 0.9064
## ARCH Lag[6] 1.38581 1.461 1.711 0.6406
## ARCH Lag[8] 1.97374 2.368 1.583 0.7469
##
## Nyblom stability test
## ------------------------------------
## Joint Statistic: 1.3259
## Individual Statistics:
## mu 0.11112
## ar1 0.02824
## ma1 0.03078
## omega 0.18152
## alpha1 0.12678
## alpha2 0.09033
## shape 0.13387
##
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic: 1.69 1.9 2.35
## Individual Statistic: 0.35 0.47 0.75
##
## Sign Bias Test
## ------------------------------------
## t-value prob sig
## Sign Bias 0.36609 0.7145
## Negative Sign Bias 0.01196 0.9905
## Positive Sign Bias 0.01530 0.9878
## Joint Effect 0.25201 0.9688
##
##
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
## group statistic p-value(g-1)
## 1 20 27.48 0.09406
## 2 30 32.68 0.29070
## 3 40 59.36 0.01937
## 4 50 70.35 0.02443
##
##
## Elapsed time : 0.08620811
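The unit-persistence restriction can be checked directly against the iGARCH(2,1) estimates above. The Python sketch below sums the reported alpha and beta coefficients and contrasts them with the unrestricted ARMA(1, 1)-GARCH(2, 1) Student t fit from earlier in the report. The dictionary and function names are hypothetical, and the remark about beta1 being implied by the constraint is an interpretation of the NA standard errors, not something the output states.

```python
# Estimates copied from the iGARCH(2,1) output above.
igarch = {"alpha1": 0.152971, "alpha2": 0.070082, "beta1": 0.776948}
# Estimates copied from the unrestricted ARMA(1, 1)-GARCH(2, 1) Student t fit.
garch21 = {"alpha1": 0.14379790, "alpha2": 0.06496868, "beta1": 0.77600250}


def persistence(params):
    """Sum of all ARCH (alpha) and GARCH (beta) coefficients."""
    return sum(params.values())


# Under the restriction, beta1 follows from the alphas (assumed reading of the
# NA standard errors reported for beta1).
implied_beta1 = 1.0 - igarch["alpha1"] - igarch["alpha2"]

print(f"IGARCH persistence:       {persistence(igarch):.6f}")
print(f"GARCH(2, 1) persistence:  {persistence(garch21):.6f}")
print(f"beta1 implied by alphas:  {implied_beta1:.6f}")
```

The restricted model sums to one up to rounding of the printed estimates, while the unrestricted fit stays slightly below one.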
Comparing the Akaike Information Criterion (AIC) of all our models, we see that the ARMA(1, 1)-GARCH(2, 1) model with skew Student t-distribution has the lowest AIC and is therefore the best one.
## Model AIC
## 1 GARCH(1, 1) Student t -5.452700
## 2 GARCH (2, 1) Student t -5.451191
## 3 GARCH(2, 1) skew Student -5.457355
## 4 EGARCH(2, 1) Student t -5.438600
## 5 FGARCH(2, 1) -5.449300
## 6 GARCH(2, 1) Generalized Error -5.450541
## 7 IGARCH(2, 1) Student t -5.453500
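The AIC values reported by the `garchFit` summaries appear to be normalized by the sample size, i.e. AIC = (-2 · logLik + 2 · k) / n with k estimated parameters and n = 505 training observations. The Python sketch below recomputes them from the log-likelihoods printed earlier and picks the minimum. The normalization is inferred from the printed numbers rather than taken from package documentation, and the dictionary layout is an assumption of this example.

```python
N_OBS = 505  # training-set size reported earlier

# (log-likelihood, number of estimated parameters) from the garchFit outputs above.
fits = {
    "GARCH(1, 1) Student t": (1383.943, 7),
    "GARCH(2, 1) Student t": (1384.426, 8),
    "GARCH(2, 1) skew Student t": (1386.982, 9),
    "GARCH(2, 1) Generalized Error": (1384.262, 8),
}


def normalized_aic(log_lik, n_params, n_obs):
    """AIC divided by the number of observations (lower is better)."""
    return (-2.0 * log_lik + 2.0 * n_params) / n_obs


aic = {name: normalized_aic(ll, k, N_OBS) for name, (ll, k) in fits.items()}
best = min(aic, key=aic.get)

for name, value in sorted(aic.items(), key=lambda item: item[1]):
    print(f"{name:32s} {value:.6f}")
print("lowest AIC:", best)
```

Because AIC adds a penalty of 2 per parameter, the skew Student t model wins only because its log-likelihood gain outweighs the extra skew parameter; the differences between models are small.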
Let’s take a closer look at the chosen model once again.
Even though this model has the lowest AIC, not all of its parameters are significant. Here alpha 2 is insignificant with a p-value of 0.325, and alpha 1 is only marginally significant (p-value 0.071).
Ljung-Box Test R Q(10) 14.93144 0.1345816 - no correlation
Ljung-Box Test R Q(15) 20.53265 0.1524374 - no correlation
Ljung-Box Test R Q(20) 22.63913 0.3068595 - no correlation
Ljung-Box Test R^2 Q(10) 9.089257 0.5236542 - no dependence in conditional variance
Ljung-Box Test R^2 Q(15) 9.8791 0.8272751 - no dependence in conditional variance
Ljung-Box Test R^2 Q(20) 11.83193 0.9217392 - no dependence in conditional variance
The LM Arch Test tells us that there is no additional ARCH effect that our model did not capture.
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~arma(1, 1) + garch(2, 1), data = training_set,
## cond.dist = "sstd", trace = F)
##
## Mean and Variance Equation:
## data ~ arma(1, 1) + garch(2, 1)
## <environment: 0x7ff7a9aa98d0>
## [data = training_set]
##
## Conditional Distribution:
## sstd
##
## Coefficient(s):
## mu ar1 ma1 omega alpha1 alpha2
## 0.00061444 0.68385472 -0.83579617 0.00001079 0.12527362 0.08521354
## beta1 skew shape
## 0.76959753 0.86590389 6.71037266
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu 6.144e-04 2.009e-04 3.059 0.002220 **
## ar1 6.839e-01 9.195e-02 7.437 1.03e-13 ***
## ma1 -8.358e-01 6.902e-02 -12.109 < 2e-16 ***
## omega 1.079e-05 4.944e-06 2.182 0.029084 *
## alpha1 1.253e-01 6.930e-02 1.808 0.070672 .
## alpha2 8.521e-02 8.665e-02 0.983 0.325386
## beta1 7.696e-01 5.202e-02 14.794 < 2e-16 ***
## skew 8.659e-01 5.618e-02 15.413 < 2e-16 ***
## shape 6.710e+00 2.027e+00 3.311 0.000931 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 1386.982 normalized: 2.746499
##
## Description:
## Mon Jun 21 09:54:09 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 33.60938 5.032875e-08
## Shapiro-Wilk Test R W 0.9829162 1.184532e-05
## Ljung-Box Test R Q(10) 14.93142 0.1345823
## Ljung-Box Test R Q(15) 20.53264 0.1524379
## Ljung-Box Test R Q(20) 22.63912 0.3068601
## Ljung-Box Test R^2 Q(10) 9.089247 0.5236552
## Ljung-Box Test R^2 Q(15) 9.879089 0.8272758
## Ljung-Box Test R^2 Q(20) 11.83192 0.9217395
## LM Arch Test R TR^2 9.264897 0.6801537
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -5.457355 -5.382066 -5.457975 -5.427824
By examining the statistics and the QQ plot of our chosen model, we see that even though we used the skew Student t-distribution, the standardized residuals are still not normally distributed: the skewness is still not zero.
The experiment with the Generalized Error Distribution didn't fix this issue.
## attilda
## nobs 505.000000
## NAs 0.000000
## Minimum -3.506913
## Maximum 3.389879
## 1. Quartile -0.557014
## 3. Quartile 0.581023
## Mean -0.012745
## Median 0.040364
## Sum -6.436045
## SE Mean 0.044357
## LCL Mean -0.099892
## UCL Mean 0.074403
## Variance 0.993609
## Stdev 0.996799
## Skewness -0.357820
## Kurtosis 1.024246
We first do a one-period-ahead forecast with the selected model, using the rolling forecast method.
Now let’s take a look at the metrics of our model on out-of-sample data.
## [1] "RMSE of selected model: ARMA(1, 1)-GARCH(2, 1) with skew student distribution 0.013672"
Now let's do a forecast with two additional models, EGARCH and IGARCH, and compare the results.
EGARCH:
## [1] "RMSE of EGARCH model 0.014105"
IGARCH:
## [1] "RMSE of IGARCH model 0.013615"
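For reference, the root-mean-square error used to score the forecasts is the square root of the mean squared difference between predictions and realized values. A small Python sketch, with a hypothetical `rmse` helper and made-up numbers rather than the report's forecasts:

```python
import math


def rmse(predictions, targets):
    """Root-mean-square error between two equally long sequences."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical one-step-ahead forecasts and realized returns (illustration only).
predicted = [0.001, -0.002, 0.003, 0.000]
realized = [0.010, -0.015, 0.004, -0.008]
print(f"RMSE: {rmse(predicted, realized):.6f}")
```

Because the errors are squared before averaging, a few large misses dominate the score, so models with similar RMSE can still differ in how their errors are distributed.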
The root-mean-square errors of all three models are pretty similar.
To further evaluate the models, we use the Diebold-Mariano test to determine whether the forecasts are significantly different.
Since the p-value for both model comparisons is practically zero (< 2.2e-16 and 1.55e-13), we reject the null hypothesis of equal forecast accuracy.
This tells us that the difference in the models' forecast performance is statistically significant, even though the RMSE values are close.
##
## Diebold-Mariano Test
##
## data: structure(abs(predictions_selected - target_values), class = "forecast")structure(abs(predictions_igarch - target_values), class = "forecast")
## DM = -12.492, Forecast horizon = 1, Loss function power = 2, p-value <
## 2.2e-16
## alternative hypothesis: two.sided
##
## Diebold-Mariano Test
##
## data: structure(abs(predictions_selected - target_values), class = "forecast")structure(abs(predictions_egarch - target_values), class = "forecast")
## DM = -8.4225, Forecast horizon = 1, Loss function power = 2, p-value =
## 1.55e-13
## alternative hypothesis: two.sided
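The logic of the Diebold-Mariano test can be sketched in a few lines of Python. The version below is a simplified textbook form for a one-step horizon: it takes the loss differential d_t = |e1_t|^power - |e2_t|^power, divides its mean by its standard error, and uses a normal approximation for the two-sided p-value. It is not the routine that produced the output above, which may apply a small-sample correction; the function name and the error series are hypothetical.

```python
import math


def diebold_mariano(errors_1, errors_2, power=2):
    """Simplified DM test for horizon 1 with a normal approximation.

    Returns (statistic, two-sided p-value). A negative statistic means the
    first forecast has the smaller average loss.
    """
    if len(errors_1) != len(errors_2) or len(errors_1) < 2:
        raise ValueError("need two error series of equal length >= 2")
    d = [abs(a) ** power - abs(b) ** power for a, b in zip(errors_1, errors_2)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / n
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance")
    statistic = mean_d / math.sqrt(var_d / n)
    p_value = math.erfc(abs(statistic) / math.sqrt(2.0))
    return statistic, p_value


# Hypothetical absolute forecast errors: the second model is 50% worse throughout.
errors_a = [0.010 + 0.001 * (i % 5) for i in range(110)]
errors_b = [1.5 * e for e in errors_a]

dm, p = diebold_mariano(errors_a, errors_b)
print(f"DM = {dm:.3f}, p-value = {p:.3g}")
```

The null hypothesis is that both forecasts have the same expected loss, so a small p-value indicates a significant difference in accuracy and the sign of the statistic indicates which forecast has the lower loss.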
Statistics and Financial Data Analysis
A work by: Nikola Krivacevic, Aleksandar Milinkovic and Milos Milunovic
Entire forecasting project on github
(https://github.com/mcf-long-short/statistics-stocks-forecasting)